Re: pgsql: Add parallel-aware hash joins. - Mailing list pgsql-committers
| From | Tom Lane |
|---|---|
| Subject | Re: pgsql: Add parallel-aware hash joins. |
| Date | |
| Msg-id | 30655.1514673257@sss.pgh.pa.us |
| In response to | Re: pgsql: Add parallel-aware hash joins. (Thomas Munro <thomas.munro@enterprisedb.com>) |
| Responses | Re: pgsql: Add parallel-aware hash joins. |
| List | pgsql-committers |
Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> This is explained by the early exit case in
>> ExecParallelHashEnsureBatchAccessors(). With just the right timing,
>> it finishes up not reporting the true nbatch number, and never calling
>> ExecParallelHashUpdateSpacePeak().
> Hi Tom,
> You mentioned that prairiedog sees the problem about one time in
> thirty. Would you mind checking if it goes away with this patch
> applied?
I've run 55 cycles of "make installcheck" without seeing a failure
with this patch installed. That's not enough to be totally sure,
of course, but I think this probably fixes it.
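
To illustrate the hazard described above, here is a minimal sketch (hypothetical names and structure, not the actual PostgreSQL source) of how an early exit can leave a participant's batch count and space-peak accounting stale:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real hash join state. */
typedef struct HashTable
{
    int   nbatch;      /* this participant's idea of the batch count */
    void *batches;     /* per-batch accessors, NULL until attached */
    long  space_peak;  /* high-water mark of memory used */
} HashTable;

static void
update_space_peak(HashTable *ht, long used)
{
    if (used > ht->space_peak)
        ht->space_peak = used;
}

static void
ensure_batch_accessors(HashTable *ht, int shared_nbatch, long used)
{
    /*
     * Early exit: accessors already exist, so assume nothing changed.
     * With just the right timing, shared_nbatch has grown since they
     * were created, so this backend keeps a stale nbatch and never
     * records its space peak -- the symptom described above.
     */
    if (ht->batches != NULL)
        return;

    ht->nbatch = shared_nbatch;   /* adopt the true batch count */
    update_space_peak(ht, used);  /* record peak memory usage */
}
```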
However ... I noticed that my other dinosaur, gaur, shows the other
failure mode we see in the buildfarm, the "increased_batches = t" diff,
and I can report that this patch does *not* help with that. The
underlying EXPLAIN output goes from something like
! Finalize Aggregate  (cost=823.85..823.86 rows=1 width=8) (actual time=1378.102..1378.105 rows=1 loops=1)
!   ->  Gather  (cost=823.63..823.84 rows=2 width=8) (actual time=1377.909..1378.006 rows=3 loops=1)
!         Workers Planned: 2
!         Workers Launched: 2
!         ->  Partial Aggregate  (cost=823.63..823.64 rows=1 width=8) (actual time=1280.298..1280.302 rows=1 loops=3)
!               ->  Parallel Hash Join  (cost=387.50..802.80 rows=8333 width=0) (actual time=1070.179..1249.142 rows=6667 loops=3)
!                     Hash Cond: (r.id = s.id)
!                     ->  Parallel Seq Scan on simple r  (cost=0.00..250.33 rows=8333 width=4) (actual time=0.173..62.063 rows=6667 loops=3)
!                     ->  Parallel Hash  (cost=250.33..250.33 rows=8333 width=4) (actual time=454.305..454.305 rows=6667 loops=3)
!                           Buckets: 4096  Batches: 8  Memory Usage: 208kB
!                           ->  Parallel Seq Scan on simple s  (cost=0.00..250.33 rows=8333 width=4) (actual time=0.178..67.115 rows=6667 loops=3)
! Planning time: 1.861 ms
! Execution time: 1687.311 ms
to something like
! Finalize Aggregate  (cost=823.85..823.86 rows=1 width=8) (actual time=1588.733..1588.737 rows=1 loops=1)
!   ->  Gather  (cost=823.63..823.84 rows=2 width=8) (actual time=1588.529..1588.634 rows=3 loops=1)
!         Workers Planned: 2
!         Workers Launched: 2
!         ->  Partial Aggregate  (cost=823.63..823.64 rows=1 width=8) (actual time=1492.631..1492.635 rows=1 loops=3)
!               ->  Parallel Hash Join  (cost=387.50..802.80 rows=8333 width=0) (actual time=1270.309..1451.501 rows=6667 loops=3)
!                     Hash Cond: (r.id = s.id)
!                     ->  Parallel Seq Scan on simple r  (cost=0.00..250.33 rows=8333 width=4) (actual time=0.219..158.144 rows=6667 loops=3)
!                     ->  Parallel Hash  (cost=250.33..250.33 rows=8333 width=4) (actual time=634.614..634.614 rows=6667 loops=3)
!                           Buckets: 4096 (originally 4096)  Batches: 16 (originally 8)  Memory Usage: 176kB
!                           ->  Parallel Seq Scan on simple s  (cost=0.00..250.33 rows=8333 width=4) (actual time=0.182..120.074 rows=6667 loops=3)
! Planning time: 1.931 ms
! Execution time: 2219.417 ms
so again we have a case where the plan didn't change but the execution
behavior did. This isn't quite 100% reproducible on gaur/pademelon,
but it fails more often than not, it seems, so I can poke into it
if you can say what info would be helpful.
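
The jump from "Batches: 8" to "Batches: 16 (originally 8)" means the hash table concluded at run time that it had exceeded its memory budget and doubled the batch count during execution. As a rough sketch of that doubling rule (simplified, hypothetical names, not the actual ExecParallelHashIncreaseNumBatches() logic):

```c
#include <stdbool.h>

/* Grow when the in-memory hash table exceeds its budget. */
static bool
needs_more_batches(long bytes_used, long space_allowed)
{
    return bytes_used > space_allowed;
}

static int
grow_batches(int nbatch, long *bytes_used, long space_allowed)
{
    /*
     * Doubling nbatch roughly halves the per-batch memory footprint,
     * at the cost of spilling more tuples to batch files.  Whether
     * this fires can depend on tuple arrival timing across workers,
     * so the same plan can report 8 batches on one run and
     * "16 (originally 8)" on another.
     */
    while (needs_more_batches(*bytes_used, space_allowed))
    {
        nbatch *= 2;
        *bytes_used /= 2;   /* idealized: half the tuples move to disk */
    }
    return nbatch;
}
```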
regards, tom lane