Re: BUG #18588: Cannot force/let database use parallel execution in simple case. - Mailing list pgsql-bugs

From Maxim Boguk
Subject Re: BUG #18588: Cannot force/let database use parallel execution in simple case.
Date
Msg-id CAK-MWwSmnp+VceM7evWD6py8g04Ybr7E=G-6O1uvqCh2mnmEvw@mail.gmail.com
Whole thread Raw
In response to Re: BUG #18588: Cannot force/let database use parallel execution in simple case.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs


On Thu, Aug 22, 2024 at 11:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Maxim Boguk <maxim.boguk@gmail.com> writes:
> With sufficiently large table - I didn't manage find a combinations of
> settings to force parallel execution (im my real case - the database huge
> and mostly cold on ssd raid - so parallel execution provides almost linear
> speedup with amount of parallel workers even on 1000 rows... not talking
> about 10k-10m rows, but I cannot convince the PostgreSQL use parallel index
> scan for 2m returned rows case).

Well, there may be something going on in your real case that isn't
apparent from this toy example.  Perhaps look for a better example?

                        regards, tom lane

On the production DB even in fully cached case performance difference 4х
in parallel vs single process mode

single process 3s
explain (analyze, costs, buffers, timing) select * from user_tmp.negotiation_topic_archive_test this_ where employer_id = 5516123;
                                                                                                QUERY PLAN                                                                                                
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using negotiation_topic_archive_test_employer_id_key on negotiation_topic_archive_test this_  (cost=0.11..11171373.49 rows=6446264 width=8) (actual time=0.020..2823.467 rows=6213979 loops=1)
   Index Cond: (employer_id = 5516123)
   Buffers: shared hit=1783337
 Planning Time: 0.054 ms
 Execution Time: 3069.555 ms

parallel (forced adding unrelated antijoin so with some extra work):
 explain (analyze, costs, buffers, timing) select * from user_tmp.negotiation_topic_archive_test this_ where employer_id = 5516123 AND NOT EXISTS (
        SELECT FROM applicant_black_list
        WHERE this_.employer_id=applicant_black_list.employer_id
        AND this_.applicant_id=applicant_black_list.applicant_id
);
                                                                                                         QUERY PLAN                                                                                                          
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=10.27..11123882.70 rows=6446241 width=8) (actual time=0.835..468.214 rows=6213979 loops=1)
   Workers Planned: 8
   Workers Launched: 8
   Buffers: shared hit=1787471
   ->  Parallel Hash Anti Join  (cost=10.27..11123882.70 rows=805780 width=8) (actual time=0.164..484.239 rows=690442 loops=9)
         Hash Cond: ((this_.employer_id = applicant_black_list.employer_id) AND (this_.applicant_id = applicant_black_list.applicant_id))
         Buffers: shared hit=1787471
         ->  Parallel Index Scan using negotiation_topic_archive_test_employer_id_key on negotiation_topic_archive_test this_  (cost=0.11..11114968.68 rows=805783 width=8) (actual time=0.031..352.918 rows=690442 loops=9)
               Index Cond: (employer_id = 5516123)
               Buffers: shared hit=1787076
         ->  Parallel Hash  (cost=10.13..10.13 rows=2 width=8) (actual time=0.022..0.023 rows=0 loops=9)
               Buckets: 1024  Batches: 1  Memory Usage: 8kB
               Buffers: shared hit=3
               ->  Parallel Index Only Scan using employer_to_user_index on applicant_black_list  (cost=0.08..10.13 rows=2 width=8) (actual time=0.009..0.009 rows=0 loops=1)
                     Index Cond: (employer_id = 5516123)
                     Heap Fetches: 0
                     Buffers: shared hit=3
 Planning:
   Buffers: shared hit=4
 Planning Time: 0.164 ms
 Execution Time: 711.817 ms


Once there is some IO involved - speed difference is usually close to 8x and so far I didn't find a way
to force a parallel plan except adding cheap unrelated anti-join.



--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

pgsql-bugs by date:

Previous
From: Maxim Boguk
Date:
Subject: Re: BUG #18588: Cannot force/let database use parallel execution in simple case.
Next
From: Tom Lane
Date:
Subject: Re: BUG #18588: Cannot force/let database use parallel execution in simple case.