Speed dblink using alternate libpq tuple storage - Mailing list pgsql-hackers
From | Greg Smith |
---|---|
Subject | Speed dblink using alternate libpq tuple storage |
Date | |
Msg-id | 4F110AD3.2090607@2ndQuadrant.com Whole thread Raw |
In response to | Re: Allow substitute allocators for PGresult. (Kyotaro HORIGUCHI <horiguchi.kyotaro@oss.ntt.co.jp>) |
Responses |
Re: Speed dblink using alternate libpq tuple storage
|
List | pgsql-hackers |
One patch that fell off the truck during a turn in the November CommitFest was "Allow substitute allocators for PGresult". Re-reading the whole thing again, it actually turned into a rather different submission in the middle, and I know I didn't follow that shift correctly then. I'm replying to its thread but have changed the subject to reflect that change. From a procedural point of view, I don't feel right kicking this back to its author on a Friday night when the deadline for resubmitting it would be Sunday. Instead I've refreshed the patch myself and am adding it to the January CommitFest. The new patch is a single file; it's easy enough to split out the dblink changes if someone wants to work with the pieces separately. After my meta-review I think we should get another reviewer familiar with using dblink to look at this next. This is fundamentally a performance patch now. Some results and benchmarking code were submitted along with it; the other issues are moot if those aren't reproducible. The secondary goal for a new review here is to provide another opinion on the naming issues and abstraction concerns raised so far. To clear out the original line of thinking, this is not a replacement low-level storage allocator anymore. The idea of using such a mechanism to help catch memory leaks has also been dropped. Instead this adds and documents a new path for libpq callers to more directly receive tuples, for both improved speed and lower memory usage. dblink has been modified to use this new mechanism. Benchmarking by the author suggests no significant change in libpq speed when only that change was made, while the modified dblink using the new mechanism was significantly faster. It jumped from 332K tuples/sec to 450K, a 35% gain, and had a lower memory footprint too. Test methodology and those results are at http://archives.postgresql.org/pgsql-hackers/2011-12/msg00008.php Robert Haas did a quick code review of this already, it along with author response mixed in are at http://archives.postgresql.org/pgsql-hackers/2011-12/msg01149.php I see two areas of contention there: -There are several naming bits no one is happy with yet. Robert didn't like some of them, but neither did Kyotaro. I don't have an opinion myself. Is it the case that some changes to the existing code's terminology are what's actually needed to make this all better? Or is this just fundamentally warty and there's nothing to be done about it. Dunno. -There is an abstraction wrapper vs. coding convenience trade-off centering around PGresAttValue. It sounded to me like it raised always fun questions like "where's the right place for the line between lipq-fe.h and libpq-int.h to be?" dblink is pretty popular, and this is a big performance win for it. If naming and API boundary issues are the worst problems here, this sounds like something well worth pursuing as part of 9.2's still advancing performance theme. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Attachment
pgsql-hackers by date: