Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash

From:           James Coleman
Subject:        Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
Msg-id:         CAAaqYe8XAVk408rowxxUpSNMVXx=fpK-rRKGzwkuHBaCcv3jRQ@mail.gmail.com
In response to: Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses:      Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
List:           pgsql-bugs
On Fri, Nov 8, 2019 at 6:30 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> On Fri, Nov 08, 2019 at 09:52:16PM +0000, PG Bug reporting form wrote:
> >The following bug has been logged on the website:
> >
> >Bug reference:      16104
> >Logged by:          James Coleman
> >Email address:      jtc331@gmail.com
> >PostgreSQL version: 11.5
> >Operating system:   Debian
> >Description:
> >
> >We have a query that, after a recent logical migration to 11.5, ends up
> >with a parallel hash join (I don't think the query plan/query itself is
> >important here, but if needed after the rest of the explanation, I can
> >try to redact it for posting). The query results in this error:
> >
> >ERROR:  invalid DSA memory alloc request size 1375731712
> >
> >(the size changes sometimes significantly, but is always over a GB)
> >
> >At first glance it sounded eerily similar to this report, which preceded
> >the final release of 11.0:
> >https://www.postgresql.org/message-id/flat/CAEepm%3D1x48j0P5gwDUXyo6c9xRx0t_57UjVaz6X98fEyN-mQ4A%40mail.gmail.com#465f3a61bea2719bc4a7102541326dde
> >but I confirmed that the patch for that bug was applied and is in 11.5
> >(and earlier).
> >
> >We managed to reproduce this on a replica, and so were able to attach
> >gdb in production to capture a backtrace:
> >
> >#0  errfinish (dummy=dummy@entry=0) at
> >./build/../src/backend/utils/error/elog.c:423
> >#1  0x000055a7c0a00f79 in elog_finish (elevel=elevel@entry=20,
> >fmt=fmt@entry=0x55a7c0babc18 "invalid DSA memory alloc request size %zu")
> >at ./build/../src/backend/utils/error/elog.c:1385
> >#2  0x000055a7c0a2308b in dsa_allocate_extended (area=0x55a7c1d6aa38,
> >size=1140850688, flags=flags@entry=4) at
> >./build/../src/backend/utils/mmgr/dsa.c:677
> >#3  0x000055a7c079bd17 in ExecParallelHashJoinSetUpBatches
> >(hashtable=hashtable@entry=0x55a7c1db2740, nbatch=nbatch@entry=2097152)
> >at ./build/../src/backend/executor/nodeHash.c:2889
> >...
>
> I've briefly looked at this today, and I think the root cause is
> somewhat similar to what is described in [1], where we simply increase
> the number of batches in an effort to keep batch contents in work_mem,
> ignoring the fact that each batch itself requires quite a bit of memory.
> So we end up with a lot of batches, each small enough to fit into
> work_mem, but we need much more than work_mem to track the batches
> themselves.
>
> This seems to be about the same problem, except that instead of
> forgetting about BufFile, the parallel hash join ignores this:
>
>     pstate->batches =
>         dsa_allocate0(hashtable->area,
>                       EstimateParallelHashJoinBatch(hashtable) * nbatch);
>
> Looking at the backtrace, you ended up with nbatch=2097152 and it failed
> trying to allocate 1140850688 bytes, i.e. exactly 544B per batch.
>
> The EstimateParallelHashJoinBatch combines a bunch of structs and does
> depend on the number of participants, and by my rough calculation 544B
> means ~13 participants. Any idea if that matches the number of parallel
> workers for this query?

We currently have max_parallel_workers_per_gather = 6, which is where
the repro happened with the backtrace. We initially had the problem
with max_parallel_workers_per_gather = 2. I also just confirmed that an
explain shows Workers Planned: 6.

> Can you try running the query with fewer workers per gather?

Since it failed initially with 2, I think we've effectively tested this
already (I assume that's a significant enough change to be expected to
solve it).
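To sanity-check the per-batch figure myself, I put the numbers from the
backtrace into a trivial standalone program (just the arithmetic, not
the real structs from nodeHash.c; the max_alloc_size value is copied
from MaxAllocSize in src/include/utils/memutils.h):

    #include <stdio.h>
    #include <stddef.h>

    /* Back-of-the-envelope check of the failing allocation in
     * ExecParallelHashJoinSetUpBatches(). nbatch and request come
     * straight from the backtrace; max_alloc_size mirrors MaxAllocSize
     * (0x3fffffff, just under 1GB), which is, as I understand it, the
     * cap dsa_allocate0() enforces when DSA_ALLOC_HUGE isn't passed. */
    int main(void)
    {
        size_t max_alloc_size = 0x3fffffff; /* MaxAllocSize */
        size_t nbatch = 2097152;            /* frame #3 */
        size_t request = 1140850688;        /* frame #2 */

        printf("bytes per batch: %zu\n", request / nbatch); /* 544 */
        printf("exceeds limit:   %s\n",
               request > max_alloc_size ? "yes" : "no");    /* yes */
        return 0;
    }

So the 544B/batch figure checks out, and the request is only modestly
over the non-huge allocation cap, which would explain why the reported
size varies from run to run but is always just over a GB.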
> IMHO this has the same issue as the non-parallel case described in [1],
> in that we don't really consider this memory when planning the query,
> which means we may accept hash join for queries where we already know
> during planning that the number of batches will be too high. We need to
> reject parallel hash join in those cases.
>
> Of course, I don't know if this is such a case - perhaps this is a case
> of underestimation, and we only learn the actual number while executing
> the query, at which point it's too late to reject PHJ. I suppose the
> right solution in that case would be some form of BNL, as discussed in
> [2], but I'm not sure how far we are from having that, or whether it
> works for PHJ.
>
>
> regards
>
> [1] https://www.postgresql.org/message-id/flat/20190504003414.bulcbnge3rhwhcsh@development
>
> [2] https://www.postgresql.org/message-id/CA%2BhUKGKWWmf%3DWELLG%3DaUGbcugRaSQbtm0tKYiBut-B2rVKX63g%40mail.gmail.com
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
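As a closing thought on the planner-side rejection Tomas suggests: the
shape of such a guard might be something like the sketch below. To be
clear, this is purely illustrative; none of these names exist in
PostgreSQL, and a real patch would presumably get the per-batch cost
from EstimateParallelHashJoinBatch() rather than hard-coding the 544
bytes observed in this report:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical planner-side guard: refuse parallel hash join when
     * merely tracking the estimated batches would exceed the memory
     * budget. ASSUMED_PER_BATCH_BYTES is the per-batch cost observed in
     * this report's backtrace, not a real PostgreSQL constant. */
    #define ASSUMED_PER_BATCH_BYTES 544

    static bool
    phj_batch_overhead_acceptable(size_t est_nbatch, size_t work_mem_bytes)
    {
        return est_nbatch * ASSUMED_PER_BATCH_BYTES <= work_mem_bytes;
    }

    int main(void)
    {
        /* nbatch from this report vs. the default 4MB work_mem */
        printf("acceptable: %s\n",
               phj_batch_overhead_acceptable(2097152, 4 * 1024 * 1024)
               ? "yes" : "no");                          /* prints no */
        return 0;
    }

For example, with the default work_mem of 4MB, the nbatch=2097152 seen
here would fail that check by a factor of ~272, which at least matches
the scale of the failed allocation. Of course, as Tomas notes, that
only helps when the batch explosion is foreseeable at plan time.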