Re: COPY FROM WHEN condition - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: COPY FROM WHEN condition |
Date | |
Msg-id | b5fe1ea4-5007-2183-bd11-a8210cca3cd8@2ndquadrant.com Whole thread Raw |
In response to | Re: COPY FROM WHEN condition (David Rowley <david.rowley@2ndquadrant.com>) |
Responses |
Re: COPY FROM WHEN condition
|
List | pgsql-hackers |
On 1/29/19 8:18 AM, David Rowley wrote: > On Wed, 23 Jan 2019 at 06:35, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: >> It turned out to be a tad more complex due to partitioning, because when >> we find the partitions do not match, the tuple is already allocated in >> the "current" context (be it per-tuple or batch). So we can't just free >> the whole context at that point. The old code worked around this by >> alternating two contexts, but that seems a bit too cumbersome to me, so >> the patch simply copies the tuple to the new context. That allows us to >> reset the batch context always, right after emptying the buffer. I need >> to do some benchmarking to see if the extra copy causes any regression. > > I agree that fixing the problem mentioned by separating out tuple and > batch contexts is a good idea, but I disagree with removing the > alternating batch context idea. FWIW the reason the alternating > contexts went in wasn't just for fun, it was on performance grounds. > There's a lengthy discussion in [1] about not adding any new > performance regressions to COPY. To be more specific, Peter complains > about a regression of 5% in [2]. > > It's pretty disheartening to see the work there being partially undone. > Whoops :-( > Here are my performance tests of with and without your change to the > memory contexts (I missed where you posted your results). > > $ cat bench.pl > for (my $i=0; $i < 8912891; $i++) { > print "1\n1\n2\n2\n"; > } > 36a1281f86c: (with your change) > > postgres=# copy listp from program $$perl ~/bench.pl$$ delimiter '|'; > COPY 35651564 > Time: 26825.142 ms (00:26.825) > Time: 27202.117 ms (00:27.202) > Time: 26266.705 ms (00:26.267) > > 4be058fe9ec: (before your change) > > postgres=# copy listp from program $$perl ~/bench.pl$$ delimiter '|'; > COPY 35651564 > Time: 25645.460 ms (00:25.645) > Time: 25698.193 ms (00:25.698) > Time: 25737.251 ms (00:25.737) > > So looks like your change slows this code down 4% for this test case. > That's about twice as bad as the 2.2% regression that I had to solve > for the multi-insert partition patch (of which you've removed much of > the code from) > > I'd really like to see the alternating batch context code being put > back in to fix this. Of course, if you have a better idea, then we can > look into that, but just removing code that was carefully written and > benchmarked without any new benchmarks to prove that it's okay to do > so seems wrong to me. > Yeah, that's not very nice. Do you suggest to revert 36a1281f86c0f, or shall we try to massage it a bit first? ISTM we could simply create two batch memory contexts and alternate those, just like with the expression contexts before. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: