Re: Bug in batch tuplesort memory CLUSTER case (9.6 only) - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: Bug in batch tuplesort memory CLUSTER case (9.6 only)
Msg-id: CAM3SWZSgxehDkDMq1FdiW2A0Dxc79wH0hz1x-TnGy=1BXEL+nw@mail.gmail.com
In response to: Re: Bug in batch tuplesort memory CLUSTER case (9.6 only) (Noah Misch <noah@leadboat.com>)
Responses: Re: Bug in batch tuplesort memory CLUSTER case (9.6 only)
List: pgsql-hackers
On Thu, Jul 7, 2016 at 12:34 AM, Noah Misch <noah@leadboat.com> wrote:
>> How do you feel about adding testing to tuplesort.c not limited to
>> hitting this bug (when Valgrind memcheck is used)?
>
> Sounds great, but again, not in the patch fixing this bug.

Attached patch adds a CLUSTER external sort test case to the regression
tests, as discussed. This makes a hardly noticeable difference in the run
time of the CLUSTER tests, at least for me. Consider the following:

$ time make installcheck-tests TESTS="create_function_1 create_type create_table copy cluster"
*** SNIP ***
(using postmaster on Unix socket, default port)
============== dropping database "regression"         ==============
DROP DATABASE
============== creating database "regression"         ==============
CREATE DATABASE
ALTER DATABASE
============== running regression test queries        ==============
test create_function_1        ... ok
test create_type              ... ok
test create_table             ... ok
test copy                     ... ok
test cluster                  ... ok

=====================
 All 5 tests passed.
=====================

make[1]: Leaving directory '/home/pg/postgresql/src/test/regress'
make installcheck-tests  0.20s user 0.03s system 16% cpu 1.422 total

The added overhead is discernible from the noise for me, but not by much.
If I had to put a number on it, I'd say that these new additions make the
regression tests take approximately 120 milliseconds longer to complete
on a fast machine. And yet, the new tests add test coverage for:

* Multiple-pass external sorts. With only 1MB of maintenance_work_mem,
only the minimum number of tapes, 7, is used, so it's really easy to see
multiple merge passes with only 6 input tapes (the wide caller tuples
used by this CLUSTER case also help). The final on-the-fly merge still
uses batch memory, of course.
* This CLUSTER external sort bug -- I've verified that the new
movetup_cluster() function is hit (it's hit over 10 times, in fact), and
so the Valgrind animal ("Skink") would have caught the CLUSTER bug had
this testing been in place earlier.

* The best case for replacement selection, where only one run is
produced, and so no merge is required. (So, there are 2 CLUSTER
operations that perform external sort operations added in total.)

* Due to hitting that replacement selection best case, TSS_SORTEDONTAPE
tuple handling also gets some coverage, and so (since we also test
multiple passes) we perhaps manage to cover all code used for the
randomAccess/materialized tape as final output case, without directly
testing a randomAccess caller case separately (direct testing would not
be possible with CLUSTER, which never cares about getting randomAccess to
the final output).

Overall, significant test coverage is added, with only a small impact on
test runtime. It seemed to make little sense to split this much into
separate patches. As discussed, I intend to produce more test coverage
still, including coverage of hash index build tuplesorts. I hope Noah
does not imagine that I disregarded his remarks on doing more than the
minimum in my first pass.

I would have to increase maintenance_work_mem to not get multiple passes
in a CLUSTER of any conveniently available regression test table. With
2MB of maintenance_work_mem, a one-pass sort of 3 runs is all that is
required (and certainly not multiple passes). With 6MB of
maintenance_work_mem, the sort completes in memory. There seems to be
little reason not to do this much at once. Testing replacement selection
in the second CLUSTER is made very convenient by the fact that we just
ran CLUSTER, so the input should be presorted.

--
Peter Geoghegan
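[Editor's note: the relationship between run count, tape count, and merge passes described above can be sketched with a small back-of-the-envelope calculation. This is a simplified balanced k-way merge model, not PostgreSQL's actual polyphase merge, so treat it as an approximation; the function name and defaults are illustrative only.]

```python
def merge_passes(num_runs, num_tapes=7):
    """Estimate merge passes for an external sort with num_tapes tapes,
    assuming a balanced k-way merge where num_tapes - 1 tapes serve as
    inputs on each pass. PostgreSQL uses polyphase merge, so this is
    only a rough model of the behavior discussed above."""
    if num_runs <= 1:
        # Replacement selection's best case: one run, no merge needed.
        return 0
    passes = 0
    while num_runs > 1:
        # Each pass merges up to (num_tapes - 1) runs into one.
        num_runs = -(-num_runs // (num_tapes - 1))  # ceiling division
        passes += 1
    return passes
```

With the minimum of 7 tapes (6 input tapes), 6 or fewer runs merge in a single pass, while 7 or more runs force multiple passes, which matches why 1MB of maintenance_work_mem makes multiple passes easy to reach and 2MB (3 runs) needs only one.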