Thread: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18349
Logged by:          Alexey Ermakov
Email address:      alexey.ermakov@dataegret.com
PostgreSQL version: 16.2
Operating system:   Rocky Linux release 9.3 (Blue Onyx)
Description:

Hello! I found a problem with the memory consumption of a parallel worker when PostgreSQL 16.2 runs a complex analytical query. After simplifying, I was able to reproduce it with the following query with just two joins:

2024-02-16 12:14:25.669 UTC [425424] ERROR:  XX000: invalid DSA memory alloc request size 1811939328
2024-02-16 12:14:25.669 UTC [425424] LOCATION:  dsa_allocate_extended, dsa.c:690
2024-02-16 12:14:25.669 UTC [425424] STATEMENT:  SELECT * FROM history h LEFT JOIN coupon cpn ON cpn.recloc = h.recloc AND cpn.version = h.version LEFT JOIN free_text i2 ON i2.recloc = cpn.recloc AND i2.part = 1 WHERE h.date_operation >= '2024-02-13' and h.date_operation < '2024-02-14';
2024-02-16 12:14:25.669 UTC [425424] DEBUG:  00000: shmem_exit(1): 5 before_shmem_exit callbacks to make
2024-02-16 12:14:25.669 UTC [425424] LOCATION:  shmem_exit, ipc.c:240

Here are some memory/parallel-worker-related settings:

max_parallel_workers = 12
max_parallel_workers_per_gather = 2
hash_mem_multiplier = 2
work_mem = '4MB'

When I increase work_mem to 16MB, the query runs successfully (same plan):

Gather  (cost=1082271.57..8357367.15 rows=26292632 width=345) (actual time=50735.451..64000.782 rows=1325161 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=4102227 read=7004436, temp read=1808967 written=1809044
  I/O Timings: shared read=37739.143, temp read=4809.969 write=9139.669
  ->  Parallel Hash Right Join  (cost=1081271.57..5727103.95 rows=10955263 width=345) (actual time=50716.477..61271.276 rows=441720 loops=3)
        Hash Cond: ((i2.recloc)::text = (cpn.recloc)::text)
        Buffers: shared hit=4102227 read=7004436, temp read=1808967 written=1809044
        I/O Timings: shared read=37739.143, temp read=4809.969 write=9139.669
        ->  Parallel Seq Scan on free_text i2  (cost=0.00..4160436.50 rows=74411061 width=63) (actual time=0.196..37944.130 rows=60240671 loops=3)
              Filter: (part = 1)
              Rows Removed by Filter: 160726692
              Buffers: shared hit=158869 read=7004436
              I/O Timings: shared read=37739.143
        ->  Parallel Hash  (cost=1077675.12..1077675.12 rows=220524 width=282) (actual time=1406.500..1406.503 rows=209531 loops=3)
              Buckets: 131072  Batches: 8  Memory Usage: 30496kB
              Buffers: shared hit=3943254, temp written=10316
              I/O Timings: temp write=58.043
              ->  Nested Loop Left Join  (cost=1.14..1077675.12 rows=220524 width=282) (actual time=0.048..1239.847 rows=209531 loops=3)
                    Buffers: shared hit=3943254
                    ->  Parallel Index Scan using history_index_date on history h  (cost=0.57..42000.62 rows=156296 width=90) (actual time=0.021..76.927 rows=198024 loops=3)
                          Index Cond: ((date_operation >= '2024-02-13 00:00:00'::timestamp without time zone) AND (date_operation < '2024-02-14 00:00:00'::timestamp without time zone))
                          Buffers: shared hit=260503
                    ->  Index Scan using coupon_index_rl on coupon cpn  (cost=0.57..6.62 rows=1 width=192) (actual time=0.004..0.005 rows=1 loops=594071)
                          Index Cond: ((recloc)::text = (h.recloc)::text)
                          Filter: (version = h.version)
                          Rows Removed by Filter: 2
                          Buffers: shared hit=3682751
Planning:
  Buffers: shared hit=72
Planning Time: 0.506 ms
Execution Time: 64060.476 ms
(32 rows)

An interesting thing: after the query fails, it takes some time (seconds) before psql gives control back. It turns out that PostgreSQL wrote 1M+ (!) temporary files (most of them empty) and cleaning them up takes some time.
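For context, the error is raised by the size sanity check in dsa_allocate_extended(): without the DSA_ALLOC_HUGE flag, a single DSA request larger than MaxAllocSize is rejected. A tiny self-contained sketch of that comparison (the constant matches PostgreSQL's MaxAllocSize definition; the check itself is paraphrased rather than quoted from dsa.c):

#include <stdio.h>

int
main(void)
{
    /* MaxAllocSize as defined in memutils.h: just under 1 GB. */
    const unsigned long max_alloc_size = 0x3fffffff;
    /* Request size taken from the error message above. */
    const unsigned long request = 1811939328UL;

    if (request > max_alloc_size)
        printf("invalid DSA memory alloc request size %lu (limit %lu)\n",
               request, max_alloc_size);
    return 0;
}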
With bigger work_mem it won't write so many files. Maybe the memory issue is somehow related to the number of files.

In PostgreSQL 16.2 there was a fix for a similar bug: [1], [2]. Could it be that there is some issue with it? I wasn't able to make a reproducible test case so far, but maybe the given information is enough...

[1] https://github.com/postgres/postgres/commit/2a67b5a60ee68892bb028587ddc6de7650822480
[2] https://www.postgresql.org/message-id/flat/16925-ec96d83529d0d629%40postgresql.org

Thanks,
Alexey Ermakov
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Andrei Lepikhov
Date:
On 16/2/2024 20:18, PG Bug reporting form wrote:
> An interesting thing: after the query fails, it takes some time (seconds)
> before psql gives control back. It turns out that PostgreSQL wrote 1M+ (!)
> temporary files (most of them empty) and cleaning them up takes some time.
Interesting. It correlates with a performance issue I have been trying to catch for three months already. Could you provide some reproduction of that behavior?

> With bigger work_mem it won't write so many files. Maybe the memory issue
> is somehow related to the number of files.
>
> In PostgreSQL 16.2 there was a fix for a similar bug: [1], [2]. Could it be
> that there is some issue with it?
It certainly looks like the same issue we fixed recently: Parallel Hash Join + DSM is a sign of that problem.

--
regards, Andrei Lepikhov
Postgres Professional
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Alexey Ermakov
Date:
On 2024-02-16 20:40, Andrei Lepikhov wrote:
> Interesting. It correlates with a performance issue I have been trying to
> catch for three months already. Could you provide some reproduction of
> that behavior?
>
Yes, I'm still trying to make a reproducer; it will take some time. Thanks.

--
Alexey Ermakov
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Tomas Vondra
Date:
Hi,

On 2/20/24 15:05, Alexey Ermakov wrote:
> On 2024-02-16 20:40, Andrei Lepikhov wrote:
>> Interesting. It correlates with a performance issue I have been trying to
>> catch for three months already. Could you provide some reproduction of
>> that behavior?
>>
> Yes, I'm still trying to make a reproducer; it will take some time. Thanks.
>

I wonder if this might be yet another manifestation of the hashjoin batch explosion issue we have. The plan has a hash join, and the fact that it runs with a bit more memory would be consistent too.

The hashjoin batch explosion happens when we find a batch that's too large to fit into work_mem, and increasing the number of batches does not really make it smaller (e.g. because there are a lot of rows with exactly the same key). We end up doubling the number of batches, but each batch needs an 8kB file buffer, so it's not hard to consume a lot of memory this way. Chances are the DSA allocation fails simply because the system hits the overcommit limit, or something like that.

It's a bit weird that it needs 1.8GB of memory, but perhaps that's also linked to the number of batches, somehow?

Anyway, if you could set a breakpoint on the error and see how many batches the hash join has, that'd be helpful. I'd probably try doing that with a non-parallel query; it makes debugging easier and it may even report the number of batches if it completes.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
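To put rough numbers on those per-batch file buffers, here is a minimal self-contained sketch; the 8 kB buffer size and the two files per batch are assumptions for illustration, not values taken from the source:

#include <stdio.h>

/*
 * Back-of-the-envelope estimate of memory held by per-batch file buffers
 * while a hash join keeps doubling its batch count.  Assumptions (for
 * illustration only): one 8 kB buffer per temporary file, two files per
 * batch (inner side and outer side).
 */
int
main(void)
{
    const long long buffer_bytes = 8192;    /* assumed buffer per temp file */
    const long long files_per_batch = 2;    /* assumed: inner + outer file */

    for (long long nbatch = 4; nbatch <= 4LL * 1024 * 1024; nbatch *= 2)
        printf("nbatch = %8lld  ->  ~%6lld MB in file buffers\n",
               nbatch,
               nbatch * files_per_batch * buffer_bytes / (1024 * 1024));
    return 0;
}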
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Andrei Lepikhov
Date:
On 21/2/2024 19:52, Tomas Vondra wrote:
> Hi,
>
> On 2/20/24 15:05, Alexey Ermakov wrote:
>> On 2024-02-16 20:40, Andrei Lepikhov wrote:
>>> Interesting. It correlates with a performance issue I have been trying
>>> to catch for three months already. Could you provide some reproduction
>>> of that behavior?
>>>
>> Yes, I'm still trying to make a reproducer; it will take some time. Thanks.
>>
> It's a bit weird that it needs 1.8GB of memory, but perhaps that's also
> linked to the number of batches, somehow?

I found one possible weak point in the code of PHJ, in ExecParallelHashJoinSetUpBatches:

    pstate->batches =
        dsa_allocate0(hashtable->area,
                      EstimateParallelHashJoinBatch(hashtable) * nbatch);

It could explain why we have such a huge memory allocation with a size that is not tied to a power of 2. It also makes more sense given that we already have a second report [1] with the same DSA allocation size.

So, can Alexey apply a trivial patch and check whether the bug has gone?

[1] https://www.postgresql.org/message-id/flat/PAYP264MB3368AF4D63704E370B204C39FF582%40PAYP264MB3368.FRAP264.PROD.OUTLOOK.COM

--
regards, Andrei Lepikhov
Postgres Professional
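For scale, if the failing request is indeed this allocation, the implied batch count follows directly from the request size; a self-contained sketch (the per-batch size of 432 bytes is an assumption chosen for illustration — the real value depends on struct sizes and the number of participants):

#include <stdio.h>

/*
 * How many batches would make
 * EstimateParallelHashJoinBatch(hashtable) * nbatch equal the reported
 * request size?  The per-batch size below is an assumption.
 */
int
main(void)
{
    const long request_size = 1811939328L;  /* size from the error message */
    const long per_batch_bytes = 432;       /* assumed per-batch bookkeeping */

    printf("implied nbatch = %ld (i.e. 2^22 batches)\n",
           request_size / per_batch_bytes);
    return 0;
}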
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Alexey Ermakov
Date:
Hello! I reproduced the generation of multiple temporary files and the abnormal memory usage (in 'top' output both the backend and the parallel worker used around 3-4GB of RAM each; I often got the OOM killer), but not the memory allocation error so far. On the machine where I first got the error I always got the same allocation request size despite using different tables and conditions. Interesting that the same number appeared in the other bug report.

Here is what I used: PostgreSQL 16.2 (Ubuntu 16.2-1.pgdg20.04+1), a virtual machine with 8GB of RAM, 6GB swap and around 15-20GB of free space for temp files. Almost all settings in postgresql.conf are default:

=========================================================================
shared_buffers = 128MB
effective_cache_size = 4GB
work_mem = 1MB
hash_mem_multiplier = 2
max_parallel_workers_per_gather = 1
jit = off
random_page_cost = 4
seq_page_cost = 1

test data:
=========================================================================
--test_1 table, 3.5M rows, 1..3M, n_distinct -0.8
create table test_1(a bigint);
insert into test_1 select id from generate_series(1,3000000) gs(id);
insert into test_1 select id from generate_series(1,500000) gs(id);
analyze test_1;

--test_2 table, 2 identical columns, 10M rows, 1..1M, n_distinct 933k
create table test_2(a bigint, b bigint);
insert into test_2 select gs.id, gs.id from generate_series(1,1000000) gs(id), generate_series(1,10) gs2(id);
create index on test_2(b);
analyze test_2;

--test_3 table, 12.58M rows, 1..500k, 80k rows with same value, n_distinct 490k
create table test_3(a bigint);
insert into test_3 select gs.id from generate_series(1,500000) gs(id), generate_series(1,25) gs2(id);
insert into test_3 select 999 from generate_series(1,80000);
create index on test_3(a);
analyze test_3;
=========================================================================

set work_mem = '2MB';
explain (analyze, buffers) select test_2.a from test_2
  left join test_3 on test_3.a = test_2.a and (test_3.a + 0) = test_3.a
  left join test_1 on test_1.a = test_2.a
where test_2.b < 21500;

~224k temp files generated

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=280156.31..385334.66 rows=308388 width=8) (actual time=16342.410..35577.099 rows=12349500 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Buffers: shared hit=1077677 read=14901, temp read=245247 written=912164
   ->  Parallel Hash Right Join  (cost=279156.31..353495.86 rows=181405 width=8) (actual time=16340.950..29837.671 rows=6174750 loops=2)
         Hash Cond: (test_1.a = test_2.a)
         Buffers: shared hit=1077677 read=14901, temp read=245247 written=912164
         ->  Parallel Seq Scan on test_1  (cost=0.00..36075.24 rows=2058824 width=8) (actual time=0.142..169.721 rows=1750000 loops=2)
               Buffers: shared hit=586 read=14901
         ->  Parallel Hash  (cost=276962.04..276962.04 rows=133701 width=8) (actual time=4391.620..4391.624 rows=3087375 loops=2)
               Buckets: 131072 (originally 131072)  Batches: 65536 (originally 4)  Memory Usage: 32352kB
               Buffers: shared hit=1077043, temp read=72723 written=387876
               ->  Nested Loop Left Join  (cost=2814.38..276962.04 rows=133701 width=8) (actual time=2.884..567.014 rows=3087375 loops=2)
                     Buffers: shared hit=1077043
                     ->  Parallel Bitmap Heap Scan on test_2  (cost=2813.95..108618.25 rows=133701 width=8) (actual time=2.865..11.149 rows=107495 loops=2)
                           Recheck Cond: (b < 21500)
                           Heap Blocks: exact=652
                           Buffers: shared hit=1422
                           ->  Bitmap Index Scan on test_2_b_idx  (cost=0.00..2757.12 rows=227292 width=0) (actual time=3.901..3.901 rows=214990 loops=1)
                                 Index Cond: (b < 21500)
                                 Buffers: shared hit=250
                     ->  Index Only Scan using test_3_a_idx on test_3  (cost=0.43..1.25 rows=1 width=8) (actual time=0.001..0.003 rows=29 loops=214990)
                           Index Cond: (a = test_2.a)
                           Filter: ((a + 0) = a)
                           Heap Fetches: 0
                           Buffers: shared hit=1075621
 Planning:
   Buffers: shared hit=8
 Planning Time: 0.576 ms
 Execution Time: 35907.097 ms

Unfortunately that plan is not very stable with different parameters and on different versions of PostgreSQL (I wasn't able to get it on a machine with enough resources to test quickly). Sometimes the planner wants to hash join with the test_3 table instead of test_1, or joins with the test_1 table but on the inner side. With other plans it won't reproduce.

With work_mem = 4MB we get 141k files, with work_mem = 1MB we get 66k files. More files = more memory consumed = bigger execution time. With more than 700-1000k files the system became unresponsive.

In that test I've got:
> Batches: 65536 (originally 4)
With prod data the largest number of batches I've seen is:
> Buckets: 131072 (originally 131072)  Batches: 262144 (originally 16)  Memory Usage: 4736kB

Perhaps that helps somehow.

> So, can Alexey apply a trivial patch and check whether the bug has gone?
Yes, I think it's possible to test with prod data, but I don't fully understand how you would like to change the given line.

Thanks,
Alexey Ermakov

On 2024-02-29 09:36, Andrei Lepikhov wrote:
> I found one possible weak point in the code of PHJ, in
> ExecParallelHashJoinSetUpBatches:
>
>     pstate->batches =
>         dsa_allocate0(hashtable->area,
>                       EstimateParallelHashJoinBatch(hashtable) * nbatch);
>
> It could explain why we have such a huge memory allocation with a size
> that is not tied to a power of 2. It also makes more sense given that we
> already have a second report [1] with the same DSA allocation size.
> So, can Alexey apply a trivial patch and check whether the bug has gone?
>
> [1]
> https://www.postgresql.org/message-id/flat/PAYP264MB3368AF4D63704E370B204C39FF582%40PAYP264MB3368.FRAP264.PROD.OUTLOOK.COM
>
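For what it's worth, one minimal way to surface the numbers involved — purely a guess at the kind of "trivial patch" being discussed, not the actual patch — is to log nbatch and the computed size right before the allocation in ExecParallelHashJoinSetUpBatches():

/*
 * Hypothetical, diagnostic-only change in nodeHash.c,
 * ExecParallelHashJoinSetUpBatches(): log the batch count and the DSA
 * request size before the allocation that fails.
 */
elog(LOG, "PHJ batch array: nbatch = %d, request size = %zu",
     nbatch,
     EstimateParallelHashJoinBatch(hashtable) * (size_t) nbatch);

pstate->batches =
    dsa_allocate0(hashtable->area,
                  EstimateParallelHashJoinBatch(hashtable) * nbatch);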
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Andrei Lepikhov
Date:
On 1/3/2024 16:50, Alexey Ermakov wrote:
> Hello! I reproduced the generation of multiple temporary files and the
> abnormal memory usage (in 'top' output both the backend and the parallel
> worker used around 3-4GB of RAM each; I often got the OOM killer), but
> not the memory allocation error so far.
Could you provide a backtrace at the moment of the error? It can show us the exact place of the error without guesses.

--
regards, Andrei Lepikhov
Postgres Professional
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Alexey Ermakov
Date:
On 2024-03-01 18:48, Andrei Lepikhov wrote:
> On 1/3/2024 16:50, Alexey Ermakov wrote:
>> Hello! I reproduced the generation of multiple temporary files and the
>> abnormal memory usage (in 'top' output both the backend and the parallel
>> worker used around 3-4GB of RAM each; I often got the OOM killer), but
>> not the memory allocation error so far.
> Could you provide a backtrace at the moment of the error? It can show us
> the exact place of the error without guesses.
>
Yes, I will do it next week (it could take some time to get approvals to install gdb/debug packages or to set up another host).

--
Alexey Ermakov
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Thomas Munro
Date:
On Thu, Feb 29, 2024 at 4:37 PM Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
> On 21/2/2024 19:52, Tomas Vondra wrote:
> > It's a bit weird that it needs 1.8GB of memory, but perhaps that's also
> > linked to the number of batches, somehow?

> I found one possible weak point in the code of PHJ, in
> ExecParallelHashJoinSetUpBatches:
>
>     pstate->batches =
>         dsa_allocate0(hashtable->area,
>                       EstimateParallelHashJoinBatch(hashtable) * nbatch);
>
> It could explain why we have such a huge memory allocation with a size
> that is not tied to a power of 2.

Hmm, a couple of short term ideas:

1. Maybe the planner should charge a high cost for exceeding some soft limit on the expected number of batches; perhaps it could be linked to the number of file descriptors we can open (something like 1000), because during the build phase we'll be opening and closing random file descriptors like crazy due to vfd pressure, which is not free; that should hopefully discourage the planner from reaching cases like this, but of course only in cases the planner can predict.

2. It sounds like we should clamp nbatches. We don't want the per-partition state to exceed MaxAllocSize, which I guess is what happened here if the above-quoted line produced the error? (The flag that would allow "huge" allocations exceeding MaxAllocSize seems likely to make the world worse while we have so many footguns -- I think once a query has reached this stage of terrible execution, it's better to limit the damage.)

The performance and memory usage will still be terrible. We just don't deal well with huge numbers of tiny partitions. Tiny relative to input size, with input size being effectively unbounded.

Thoughts for later: A lower limit would of course be possible and likely desirable. Once the partition-bookkeeping memory exceeds work_mem * hash_mem_multiplier, it becomes stupid to double it just because a hash table has hit that size, because that actually increases total memory usage (and fast, because it's quadratic). We'd be better off in practice giving up on the hash table size limit and hoping for the best.

But the real long term question is what strategy we're going to use to actually deal with this situation properly *without* giving up our memory usage policies and hoping for the best, and that remains an open question. To summarise the two main ideas put forward so far: (1) allow a very high number of batches, but process at most N of M batches at a time, using a temporary "all-the-rest" batch to be re-partitioned to feed the next N batches + the rest in a later cycle, (2) fall back to looping over batches multiple times in order to keep nbatches <= a small limit while also not exceeding a hash table size limit. Both have some tricky edge cases, especially with parallelism in the picture, but probably even without it. I'm willing to work more on exploring this some time after the 17 cycle.
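A minimal sketch of what the clamp from idea 2 could look like, written as a hypothetical helper as if it lived in nodeHash.c where HashJoinTable, EstimateParallelHashJoinBatch and MaxAllocSize are in scope (the function name and call site are assumptions, not an actual patch):

/*
 * Hypothetical guard: before doubling the batch count of a parallel hash
 * join, check that the shared per-batch array would still fit in an
 * ordinary (non-huge) DSA allocation; if not, stop adding batches and
 * accept that the hash table will exceed its memory limit.
 */
static bool
phj_batch_count_can_double(HashJoinTable hashtable, int nbatch)
{
    size_t new_size = EstimateParallelHashJoinBatch(hashtable) *
        ((size_t) nbatch * 2);

    return new_size <= MaxAllocSize;
}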
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Tomas Vondra
Date:
On 3/3/24 23:12, Thomas Munro wrote:
> On Thu, Feb 29, 2024 at 4:37 PM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 21/2/2024 19:52, Tomas Vondra wrote:
>>> It's a bit weird that it needs 1.8GB of memory, but perhaps that's also
>>> linked to the number of batches, somehow?
>
>> I found one possible weak point in the code of PHJ, in
>> ExecParallelHashJoinSetUpBatches:
>>
>>     pstate->batches =
>>         dsa_allocate0(hashtable->area,
>>                       EstimateParallelHashJoinBatch(hashtable) * nbatch);
>>
>> It could explain why we have such a huge memory allocation with a size
>> that is not tied to a power of 2.
>

Maybe, but how many batches would that require for 1.8GB of memory? From what I see in my experiments, this uses ~432B per batch, so that'd be about 4M batches. That's ... a lot. There may be some dependency on the number of workers, but it's going to be in this ballpark.

OTOH I don't see any other dsa_allocate calls for the reproducer, so maybe it really is this. I really wonder how many batches there are and what the memory per batch is.

> Hmm, a couple of short term ideas:
>
> 1. Maybe the planner should charge a high cost for exceeding some
> soft limit on the expected number of batches; perhaps it could be
> linked to the number of file descriptors we can open (something like
> 1000), because during the build phase we'll be opening and closing
> random file descriptors like crazy due to vfd pressure, which is not
> free; that should hopefully discourage the planner from reaching cases
> like this, but of course only in cases the planner can predict.
>

Will we really open the file descriptors like crazy? I think the 8kB buffers we keep for temporary files are a pretty good protection against that (but also one of the reasons for the memory explosion in the serial hash join). Or does the parallel version use files differently?

> 2. It sounds like we should clamp nbatches. We don't want the
> per-partition state to exceed MaxAllocSize, which I guess is what
> happened here if the above-quoted line produced the error? (The flag
> that would allow "huge" allocations exceeding MaxAllocSize seems
> likely to make the world worse while we have so many footguns -- I
> think once a query has reached this stage of terrible execution, it's
> better to limit the damage.)
>
> The performance and memory usage will still be terrible. We just
> don't deal well with huge numbers of tiny partitions. Tiny relative
> to input size, with input size being effectively unbounded.
>

Yeah.

> Thoughts for later: A lower limit would of course be possible and
> likely desirable. Once the partition-bookkeeping memory exceeds
> work_mem * hash_mem_multiplier, it becomes stupid to double it just
> because a hash table has hit that size, because that actually
> increases total memory usage (and fast, because it's quadratic).

Right. This is pretty much exactly the reason for the memory explosion in the serial hash join, with a report maybe once or twice a year (who knows how many people don't report it, though).

> We'd be better off in practice giving up on the hash table size limit
> and hoping for the best.

Perhaps it'd be better not to give up entirely, but to relax the limit a bit. One of the proposals in [1] (from ~5 years ago) was to "balance" the hash table size and the memory needed for the temporary files. That is, when things go wrong, double the hash table size every time the temp files need the same amount of memory. That minimizes the total amount of memory needed by the join, I think.

Back then it seemed like an ad hoc band-aid, but maybe it's better than nothing. It was far simpler than various other ideas, and these failures should be quite rare. Perhaps a similar emergency approach would work here too?

> But the real long term question is what strategy
> we're going to use to actually deal with this situation properly
> *without* giving up our memory usage policies and hoping for the best,
> and that remains an open question. To summarise the two main ideas
> put forward so far: (1) allow a very high number of batches, but process
> at most N of M batches at a time, using a temporary "all-the-rest"
> batch to be re-partitioned to feed the next N batches + the rest in a
> later cycle, (2) fall back to looping over batches multiple times in
> order to keep nbatches <= a small limit while also not exceeding a
> hash table size limit.

I'm not sure about (2), but (1) sounds very much like a recursive hash join, where we'd allow only a limited fan-out at each stage. It's also a bit like the "spill" files in my other proposal - that worked fine, and it enforced the memory limit better than the memory balancing (which is more like a best-effort heuristic).

> Both have some tricky edge cases, especially
> with parallelism in the picture but probably even without it. I'm
> willing to work more on exploring this some time after the 17 cycle.

I haven't thought very much about parallel hash joins, but what would the corner cases be for the non-parallel case? Sure, it may increase the amount of I/O, but only for cases with unexpectedly many batches (and then it's just a natural trade-off: I/O vs. enforcing the memory limit).

regards

[1] https://www.postgresql.org/message-id/20190504003414.bulcbnge3rhwhcsh%40development

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
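A minimal sketch of that balancing rule, only to illustrate the idea; the function name and the 8 kB per-batch buffer figure are assumptions, not taken from the original proposal:

#include <stddef.h>
#include <stdbool.h>

/*
 * Illustrative sketch of the "balancing" heuristic: instead of doubling
 * the batch count unconditionally when the hash table hits its limit,
 * grow the allowed hash table size once the memory tied up in per-batch
 * file buffers matches it.
 */
#define ASSUMED_BATCH_BUFFER_BYTES 8192

static bool
should_grow_hash_table_instead(size_t hash_table_limit, int nbatch)
{
    /* Two files per batch (inner + outer), one buffer each (assumed). */
    size_t batch_buffer_bytes = (size_t) nbatch * 2 * ASSUMED_BATCH_BUFFER_BYTES;

    /*
     * Doubling nbatch would also double batch_buffer_bytes; once that term
     * is as large as the hash table limit, doubling the limit instead keeps
     * the total (hash table + buffers) lower.
     */
    return batch_buffer_bytes >= hash_table_limit;
}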
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Thomas Munro
Date:
On Mon, Mar 4, 2024 at 12:49 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> On 3/3/24 23:12, Thomas Munro wrote:
> > But the real long term question is what strategy
> > we're going to use to actually deal with this situation properly
> > *without* giving up our memory usage policies and hoping for the best,
> > and that remains an open question. To summarise the two main ideas
> > put forward so far: (1) allow a very high number of batches, but process
> > at most N of M batches at a time, using a temporary "all-the-rest"
> > batch to be re-partitioned to feed the next N batches + the rest in a
> > later cycle, (2) fall back to looping over batches multiple times in
> > order to keep nbatches <= a small limit while also not exceeding a
> > hash table size limit.
>
> I'm not sure about (2), but (1) sounds very much like a recursive hash
> join, where we'd allow only a limited fan-out at each stage. It's also a
> bit like the "spill" files in my other proposal - that worked fine, and
> it enforced the memory limit better than the memory balancing (which is
> more like a best-effort heuristic).

Yeah, I was trying to describe your spill idea in a few words. It's a good idea.

> > Both have some tricky edge cases, especially
> > with parallelism in the picture but probably even without it. I'm
> > willing to work more on exploring this some time after the 17 cycle.
>
> I haven't thought very much about parallel hash joins, but what would
> the corner cases be for the non-parallel case? Sure, it may increase the
> amount of I/O, but only for cases with unexpectedly many batches (and
> then it's just a natural trade-off: I/O vs. enforcing the memory limit).

For parallel-specific problems I'd have to swap that back into my brain... but I do remember one general problem with idea #1: if you have too many duplicate keys, then anything based only on partitioning by hash bits won't help, which was one of my objections to that idea and the reason why I was talking about idea #2, which doesn't suffer from that problem (but has other problems). This morning I've been wondering about a highly targeted solution to that: could we recognise this condition cheaply, and then activate a special case that spills a *bucket* to disk? That is, a problem bucket that is indivisible by hashing because all the hash values are the same. Or something like that.
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Thomas Munro
Date:
On Mon, Mar 4, 2024 at 1:11 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> ... This morning I've been
> wondering about a highly targeted solution to that: could we recognise
> this condition cheaply, and then activate a special case that spills a
> *bucket* to disk? That is, a problem bucket that is indivisible by
> hashing because all the hash values are the same. Or something like
> that.

(One of the main problems being how to manage the match bits for outer joins, a big problem also for idea #2. But maybe it's somehow more tractable there, I dunno. I'd need vastly more coffee to have any idea...)
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Alexey Ermakov
Date:
On 2024-03-01 18:48, Andrei Lepikhov wrote:
> Could you provide a backtrace at the moment of the error? It can show us
> the exact place of the error without guesses.

Hello. Yes, you are right about ExecParallelHashJoinSetUpBatches. The backtrace looks like this:

======== parallel worker
Breakpoint 1, 0x0000000000a2f1e0 in errfinish ()
(gdb) bt
#0  0x0000000000a2f1e0 in errfinish ()
#1  0x0000000000515d06 in dsa_allocate_extended.cold ()
#2  0x000000000071abdc in ExecParallelHashJoinSetUpBatches ()
#3  0x000000000071bd03 in ExecParallelHashIncreaseNumBatches ()
#4  0x000000000071c50d in ExecParallelHashTableInsert ()
#5  0x000000000071c6fa in MultiExecHash ()
#6  0x000000000071ecd1 in ExecParallelHashJoin.lto_priv.0 ()
#7  0x00000000006f8dc2 in standard_ExecutorRun ()
#8  0x00007f860667cf75 in pgss_ExecutorRun () from /usr/pgsql-16/lib/pg_stat_statements.so
#9  0x00007f8606271cad in pgsk_ExecutorRun () from /usr/pgsql-16/lib/pg_stat_kcache.so
#10 0x00000000006fdd36 in ParallelQueryMain ()
#11 0x00000000005b1ca2 in ParallelWorkerMain ()
#12 0x000000000083072c in StartBackgroundWorker ()
#13 0x000000000083a177 in maybe_start_bgworkers.lto_priv ()
#14 0x000000000083b7f4 in ServerLoop.lto_priv.0 ()
#15 0x0000000000833b6d in PostmasterMain ()
#16 0x000000000051c221 in main ()

======= backend
Breakpoint 1, 0x0000000000a2f1e0 in errfinish ()
(gdb) bt
#0  0x0000000000a2f1e0 in errfinish ()
#1  0x00000000005b12e8 in HandleParallelMessages ()
#2  0x00000000008c74f5 in ProcessInterrupts.part.0 ()
#3  0x00000000008b0715 in ConditionVariableTimedSleep.part.0 ()
#4  0x000000000089ea85 in BarrierArriveAndWait ()
#5  0x000000000071b9e7 in ExecParallelHashIncreaseNumBatches ()
#6  0x000000000071c255 in ExecParallelHashTupleAlloc ()
#7  0x000000000071c449 in ExecParallelHashTableInsert ()
#8  0x000000000071c6fa in MultiExecHash ()
#9  0x000000000071ecd1 in ExecParallelHashJoin.lto_priv.0 ()
#10 0x0000000000718124 in ExecGather ()
#11 0x00000000006f8dc2 in standard_ExecutorRun ()
#12 0x00007f860667cf75 in pgss_ExecutorRun () from /usr/pgsql-16/lib/pg_stat_statements.so
#13 0x00007f8606271cad in pgsk_ExecutorRun () from /usr/pgsql-16/lib/pg_stat_kcache.so
#14 0x00000000008c8ba6 in PortalRunSelect ()
#15 0x00000000008ca556 in PortalRun ()
#16 0x00000000008cb5e0 in exec_simple_query ()
#17 0x00000000008cd96f in PostgresMain ()
#18 0x000000000083d174 in ServerLoop.lto_priv.0 ()
#19 0x0000000000833b6d in PostmasterMain ()
#20 0x000000000051c221 in main ()

Would you like to get any additional info from gdb (like the number of batches)?

--
Thanks,
Alexey Ermakov
Re: BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
From
Andrei Lepikhov
Date:
On 11/3/2024 15:12, Alexey Ermakov wrote:
> Would you like to get any additional info from gdb (like the number of
> batches)?
Can you try to reproduce the case with the simple patch attached?

--
regards, Andrei Lepikhov
Postgres Professional