Re: pgsql: Add parallel-aware hash joins. - Mailing list pgsql-committers
| From | Tom Lane |
|---|---|
| Subject | Re: pgsql: Add parallel-aware hash joins. |
| Date | |
| Msg-id | 30655.1514673257@sss.pgh.pa.us |
| In response to | Re: pgsql: Add parallel-aware hash joins. (Thomas Munro <thomas.munro@enterprisedb.com>) |
| Responses | Re: pgsql: Add parallel-aware hash joins. |
| List | pgsql-committers |
Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> This is explained by the early exit case in
>> ExecParallelHashEnsureBatchAccessors().  With just the right timing,
>> it finishes up not reporting the true nbatch number, and never calling
>> ExecParallelHashUpdateSpacePeak().

> Hi Tom,
> You mentioned that prairiedog sees the problem about one time in
> thirty.  Would you mind checking if it goes away with this patch
> applied?

I've run 55 cycles of "make installcheck" without seeing a failure with
this patch installed.  That's not enough to be totally sure, of course,
but I think this probably fixes it.

However ... I noticed that my other dinosaur gaur shows the other failure
mode we see in the buildfarm, the "increased_batches = t" diff, and I can
report that this patch does *not* help that.  The underlying EXPLAIN
output goes from something like

! Finalize Aggregate  (cost=823.85..823.86 rows=1 width=8) (actual time=1378.102..1378.105 rows=1 loops=1)
!   ->  Gather  (cost=823.63..823.84 rows=2 width=8) (actual time=1377.909..1378.006 rows=3 loops=1)
!         Workers Planned: 2
!         Workers Launched: 2
!         ->  Partial Aggregate  (cost=823.63..823.64 rows=1 width=8) (actual time=1280.298..1280.302 rows=1 loops=3)
!               ->  Parallel Hash Join  (cost=387.50..802.80 rows=8333 width=0) (actual time=1070.179..1249.142 rows=6667 loops=3)
!                     Hash Cond: (r.id = s.id)
!                     ->  Parallel Seq Scan on simple r  (cost=0.00..250.33 rows=8333 width=4) (actual time=0.173..62.063 rows=6667 loops=3)
!                     ->  Parallel Hash  (cost=250.33..250.33 rows=8333 width=4) (actual time=454.305..454.305 rows=6667 loops=3)
!                           Buckets: 4096  Batches: 8  Memory Usage: 208kB
!                           ->  Parallel Seq Scan on simple s  (cost=0.00..250.33 rows=8333 width=4) (actual time=0.178..67.115 rows=6667 loops=3)
! Planning time: 1.861 ms
! Execution time: 1687.311 ms

to something like

! Finalize Aggregate  (cost=823.85..823.86 rows=1 width=8) (actual time=1588.733..1588.737 rows=1 loops=1)
!   ->  Gather  (cost=823.63..823.84 rows=2 width=8) (actual time=1588.529..1588.634 rows=3 loops=1)
!         Workers Planned: 2
!         Workers Launched: 2
!         ->  Partial Aggregate  (cost=823.63..823.64 rows=1 width=8) (actual time=1492.631..1492.635 rows=1 loops=3)
!               ->  Parallel Hash Join  (cost=387.50..802.80 rows=8333 width=0) (actual time=1270.309..1451.501 rows=6667 loops=3)
!                     Hash Cond: (r.id = s.id)
!                     ->  Parallel Seq Scan on simple r  (cost=0.00..250.33 rows=8333 width=4) (actual time=0.219..158.144 rows=6667 loops=3)
!                     ->  Parallel Hash  (cost=250.33..250.33 rows=8333 width=4) (actual time=634.614..634.614 rows=6667 loops=3)
!                           Buckets: 4096 (originally 4096)  Batches: 16 (originally 8)  Memory Usage: 176kB
!                           ->  Parallel Seq Scan on simple s  (cost=0.00..250.33 rows=8333 width=4) (actual time=0.182..120.074 rows=6667 loops=3)
! Planning time: 1.931 ms
! Execution time: 2219.417 ms

so again we have a case where the plan didn't change but the execution
behavior did.  This isn't quite 100% reproducible on gaur/pademelon, but
it seems to fail more often than not, so I can poke into it if you can
say what info would be helpful.

			regards, tom lane
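For readers following along, here is a rough sketch of how one might see a plan of this shape locally. The query itself is inferred from the EXPLAIN output above (a count over a self-join of "simple" on id); the row count, work_mem value, and planner settings are illustrative assumptions, not the actual regression-test setup:

```sql
-- Hypothetical reproduction sketch: table and alias names follow the
-- EXPLAIN output above, but the data volume and settings are assumptions.
CREATE TABLE simple (id int, t text);
INSERT INTO simple SELECT generate_series(1, 20000), 'foo';
ANALYZE simple;

-- Encourage a multi-batch parallel hash join.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET max_parallel_workers_per_gather = 2;
SET work_mem = '192kB';
SET enable_mergejoin = off;

EXPLAIN ANALYZE
  SELECT count(*) FROM simple r JOIN simple s USING (id);
```

The telling detail in the second plan is "Batches: 16 (originally 8)": the plan and its estimates are identical, but at execution time the parallel hash decided it needed to double the number of batches, which is exactly the run-time variation the "increased_batches" check in the regression test is flagging.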