Re: general PG network slowness (possible cure) (repost) - Mailing list pgsql-performance
From | Peter T. Breuer |
---|---|
Subject | Re: general PG network slowness (possible cure) (repost) |
Date | |
Msg-id | 200705251344.l4PDibQ00402@inv.it.uc3m.es Whole thread Raw |
In response to | Re: general PG network slowness (possible cure) (repost) (Richard Huxton <dev@archonet.com>) |
Responses |
Re: general PG network slowness (possible cure) (repost)
|
List | pgsql-performance |
"Also sprach Richard Huxton:" [Charset ISO-8859-1 unsupported, filtering to ASCII...] > Peter T. Breuer wrote: > > I set up pg to replace a plain gdbm database for my application. But > > even running to the same machine, via a unix socket > > > > * the pg database ran 100 times slower > > For what operations? Bulk reads? 19-way joins? The only operations being done are simple "find the row with this key", or "update the row with this key". That's all. The queries are not an issue (though why the PG thread choose to max out cpu when it gets the chance to do so through a unix socket, I don't know). > > Across the net it was > > > > * about 500 to 1000 times slower than local gdbm > > > > with no cpu use to speak of. > > Disk-intensive or memory intensive? There is no disk as such... it's running on a ramdisk at the server end. But assuming you mean i/o, i/o was completely stalled. Everything was idle, all waiting on the net. > > On a whim I mapped the network bandwidth per packet size with the NPtcp > > suite, and got surprising answers .. at 1500B, naturally, the bandwidth > > was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little > > local net. At 100B the bandwidth available was only 25Kb/s. At 10B, > > you might as well use tin cans and taut string instead. > > This sounds like you're testing a single connection. You would expect > "dead time" to dominate in that scenario. What happens when you have 50 Indeed, it is single, because that's my application. I don't have 50 simultaneous connections. The use of the database is as a permanent storage area for the results of previous analyses (static analysis of the linux kernel codes) from a single client. Multiple threads accessing at the same time might help keep the network drivers busier, which would help. They would always see their buffers filling at an even rate and be able to send out groups of packets at once. > simultaneous connections? Or do you think it's just packet overhead? It's not quite overhead in the sense of the logical layer. It's a physical layer thing. I replied in another mail on this thread, but in summary, tcp behaves badly with small packets on ethernet, even on a dedicated line (as this was). One needs to keep it on a tight rein. > > I also mapped the network flows using ntop, and yes, the average packet > > size for both gdbm and pg in one direction was only about 100B or > > so. That's it! Clearly there are a lot of short queries going out and > > the answers were none too big either ( I had a LIMIT 1 in all my PG > > queries). > > I'm not sure that 100B query-results are usually the bottleneck. > Why would you have LIMIT 1 on all your queries? Because there is always only one answer to the query, according to the logic. So I can always tell the database manager to stop looking after one, which will always help it. > > About 75% of traffic was in the 64-128B range while my application was > > running, with the peak bandwidth in that range being about 75-125Kb/s > > (and I do mean bits, not bytes). > > None of this sounds like typical database traffic to me. Yes, there are > lots of small result-sets, but there are also typically larger (several > kilobytes) to much larger (10s-100s KB). There's none here. > > Soooo ... I took a look at my implementation of remote gdbm, and did > > a very little work to aggregate outgoing transmissions together into > > lumps. Three lines added in two places. At the level of the protocol > > where I could tell how long the immediate conversation segment would be, > > I "corked" the tcp socket before starting the segment and "uncorked" it > > after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in > > linux). > > I'm a bit puzzled, because I'd have thought the standard Nagle algorithm > would manage this gracefully enough for short-query cases. There's no On the contrary, Nagle is also often wrong here because it will delay sending in order to accumulate more data into buffers when only a little has arrived, then give up when no more data arrives to be sent out, then send out the (short) packet anyway, late. There's no other traffic apart from my (single thread) application. What we want is to direct the sending exactly,n this situation saying when to not send, and when to send. Disable Nagle for a start, use async read (noblock), and sync write, with sends from the socket blocked from initiation of a message until the whole message is ready to be sent out. Sending the message piecemeal just hurts too. > way (that I know of) for a backend to handle more than one query at a time. That's not the scenario. > > Surprise, ... I got a speed up of hundreds of times. The same application > > that crawled under my original rgdbm implementation and under PG now > > maxed out the network bandwidth at close to a full 10Mb/s and 1200 > > pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz > > server. > > > > So > > > > * Is that what is holding up postgres over the net too? Lots of tiny > > packets? > > I'm not sure your setup is typical, interesting though the figures are. > Google a bit for pg_bench perhaps and see if you can reproduce the > effect with a more typical load. I'd be interested in being proved wrong. But the load is typical HERE. The application works well against gdbm and I was hoping to see speedup from using a _real_ full-fledged DB instead. Well, at least it's very helpful for debugging. > > And if so > > > > * can one fix it the way I fixed it for remote gdbm? > > > > The speedup was hundreds of times. Can someone point me at the relevant > > bits of pg code? A quick look seems to say that fe-*.c is > > interesting. I need to find where the actual read and write on the > > conn->sock is done. > > You'll want to look in backend/libpq and interfaces/libpq I think > (although I'm not a developer). I'll look around there. Specific directions are greatly appreciated. Thanks. Peter
pgsql-performance by date: