Re: BUG #18588: Cannot force/let database use parallel execution in simple case. - Mailing list pgsql-bugs

From Maxim Boguk
Subject Re: BUG #18588: Cannot force/let database use parallel execution in simple case.
Date
Msg-id CAK-MWwTqDRerxrpaN8Xb25DqDGBw1UT+YSkfp4QN-MgOH97c+Q@mail.gmail.com
Whole thread Raw
In response to Re: BUG #18588: Cannot force/let database use parallel execution in simple case.  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: BUG #18588: Cannot force/let database use parallel execution in simple case.
List pgsql-bugs
On Thu, Aug 22, 2024 at 5:34 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 23 Aug 2024 at 01:14, PG Bug reporting form
> <noreply@postgresql.org> wrote:
> > set min_parallel_index_scan_size to '8kB';
> >
> > set min_parallel_table_scan_size to '8kB';
> > set parallel_tuple_cost to 0;
> > set parallel_setup_cost to 0;
> >
> > have no effect.
> >
> > Even with the set force_parallel_mode to on - no effect:
> > negotiation_chat_archive=# explain analyze select topic_id as y0_ from
> > public.negotiation_topic_archive_p005 this_ where
> > this_.employer_id='816144';
>
> Does it choose a parallel plan if you do:
>
> ALTER TABLE public.negotiation_topic_archive_p005 SET (parallel_workers = 2);
>
> I assume it must be a btree index given the other plan does use a
> parallel scan. So I wondered if something weird was happening in
> compute_parallel_worker() and it was choosing 1 worker, which might
> cause the path to be rejected.
>
> You might need to come up with a self-contained test case here as when
> I tried this on PG15, I do get a parallel index scan.
>
> create table t1 (a int);
> insert into t1 select x/1000 from generate_Series(1,1000000)X;
> create index on t1(a);
> set min_parallel_index_scan_size to '8kB';
> set min_parallel_table_scan_size to '8kB';
> set parallel_tuple_cost to 0;
> set parallel_setup_cost to 0;
> explain select * from t1 where a < 10;
>                                          QUERY PLAN
> --------------------------------------------------------------------------------------------
>  Gather  (cost=0.42..179.40 rows=8912 width=4)
>    Workers Planned: 2
>    ->  Parallel Index Only Scan using t1_a_idx on t1
> (cost=0.42..179.40 rows=3713 width=4)
>          Index Cond: (a < 10)
> (4 rows)
>
> select version();
>                                                 version
> -------------------------------------------------------------------------------------------------------
>  PostgreSQL 15.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
> 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
> (1 row)
>
> David

Hi David,

Some additional info about table:
\dt+ negotiation_topic_archive_p005
                                                List of relations
 Schema |              Name              | Type  |  Owner   |
Persistence | Access method |  Size  | Description
--------+--------------------------------+-------+----------+-------------+---------------+--------+-------------
 public | negotiation_topic_archive_p005 | table | postgres |
permanent   | heap          | 112 GB |

select relpages from pg_class where relname='negotiation_topic_archive_p005';
 relpages
----------
 14688625

select count(*) from negotiation_topic_archive_p005;
   count
-----------
 499710323
Quite big table, but nothing unusual for contemporary hardware.
Hm integer(?) overflow somewhere in planner costing model of parallel scans?
I'll looked into compute_parallel_worker code (
https://doxygen.postgresql.org/paths_8h.html#a64fa1e38a91614994697a7404cea242a
) but I didn't found anything obviously wrong for me there (but my C
knowledge pretty limited).

There are some results of different tests:
0. ALTER TABLE public.negotiation_topic_archive_p005 SET
(parallel_workers = 2); - have no effect

1. Create fresh table on same db with only required data (local
minimal test case):
Unfortunately - no luck:

--trying make miminal isolated case
create table user_tmp.negotiation_topic_archive_test as select
employer_id, applicant_id from negotiation_topic_archive_p005;
SELECT 499710323

create index negotiation_topic_archive_test_employer_id_key on
user_tmp.negotiation_topic_archive_test(employer_id);
CREATE INDEX

analyze verbose user_tmp.negotiation_topic_archive_test;
INFO:  analyzing "user_tmp.negotiation_topic_archive_test"
INFO:  "negotiation_topic_archive_test": scanned 30000 of 2211108
pages, containing 6780000 live rows and 0 dead rows; 30000 rows in
sample, 499710408 estimated total rows
ANALYZE

select relpages from pg_class where relname='negotiation_topic_archive_test';
 relpages
----------
  2211108

--no luck
explain
select * from user_tmp.negotiation_topic_archive_test this_
where
this_.employer_id=816144;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using negotiation_topic_archive_test_employer_id_key on
negotiation_topic_archive_test this_  (cost=0.11..2047834.92
rows=449739 width=8)
   Index Cond: (employer_id = 816144)

--still no luck
set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size to 0;
set parallel_tuple_cost to 0;
set parallel_setup_cost to 0;
set force_parallel_mode to on;

explain
select * from user_tmp.negotiation_topic_archive_test this_
where
this_.employer_id=816144;


  QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.11..2047834.92 rows=449739 width=8)
   Workers Planned: 1
   Single Copy: true
   ->  Index Scan using negotiation_topic_archive_test_employer_id_key
on negotiation_topic_archive_test this_  (cost=0.11..2047834.92
rows=449739 width=8)
         Index Cond: (employer_id = 816144)

--with not exists - still ok
explain
select * from user_tmp.negotiation_topic_archive_test this_
where
this_.employer_id=816144
    AND NOT EXISTS (
        SELECT FROM applicant_black_list
        WHERE this_.employer_id=applicant_black_list.employer_id
        AND this_.applicant_id=applicant_black_list.applicant_id
      );

         QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=10.27..2045714.44 rows=449737 width=8)
   Workers Planned: 4
   ->  Parallel Hash Anti Join  (cost=10.27..2045714.44 rows=112434 width=8)
         Hash Cond: ((this_.employer_id =
applicant_black_list.employer_id) AND (this_.applicant_id =
applicant_black_list.applicant_id))
         ->  Parallel Index Scan using
negotiation_topic_archive_test_employer_id_key on
negotiation_topic_archive_test this_  (cost=0.11..2044461.88
rows=112435 width=8)
               Index Cond: (employer_id = 816144)
         ->  Parallel Hash  (cost=10.13..10.13 rows=2 width=8)
               ->  Parallel Index Only Scan using
employer_to_user_index on applicant_black_list  (cost=0.08..10.13
rows=2 width=8)
                     Index Cond: (employer_id = 816144)

So isolated tests on table without inheritance parents or some
legacy/forgotten custom settings - show the same results.

2.Now lets try dump/reload to the test host with Postgresql 16.4 on
stock pgdg default postgresql.conf:
pg_dump -Fc --compress=2
--table=user_tmp.negotiation_topic_archive_test
--table=applicant_black_list db > negotiation_topic_archive_test.dump

ls -la negotiation_topic_archive_test.dump
-rw-r--r-- 1 postgres postgres 4028520703 Aug 22 19:10
negotiation_topic_archive_test.dump

--on test host... and still no luck
pg_restore -Fc --port=5433 --dbname=test negotiation_topic_archive_test.dump

test=# analyze applicant_black_list;
ANALYZE
test=# analyze user_tmp.negotiation_topic_archive_test;
ANALYZE

set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size to 0;
set parallel_tuple_cost to 0;
set parallel_setup_cost to 0;
set debug_parallel_query to on;
explain
select * from user_tmp.negotiation_topic_archive_test this_
where
this_.employer_id=816144;

  QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=0.57..1529820.39 rows=416420 width=8)
   Workers Planned: 1
   Single Copy: true
   ->  Index Scan using negotiation_topic_archive_test_employer_id_key
on negotiation_topic_archive_test this_  (cost=0.57..1529820.39
rows=416420 width=8)
         Index Cond: (employer_id = 816144)


In the same time modified youre test case on the same 16.4 test db -
test case seems ok even with large dataset:
insert into t1 select x/10000 from generate_Series(1,100000000)X;
create index on t1(a);
analyze t1;
set min_parallel_index_scan_size to 0;
set min_parallel_table_scan_size to 0;
set parallel_tuple_cost to 0;
set parallel_setup_cost to 0;
set debug_parallel_query to on;
explain select * from t1 where a=10;
explain select * from t1 where a=10;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Gather  (cost=0.57..4636.40 rows=10786 width=4)
   Workers Planned: 2
   ->  Parallel Index Only Scan using t1_a_idx on t1
(cost=0.57..4636.40 rows=4494 width=4)
         Index Cond: (a = 10)



--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678



pgsql-bugs by date:

Previous
From: Kuntal Ghosh
Date:
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session
Next
From: Andrei Lepikhov
Date:
Subject: Re: BUG #18588: Cannot force/let database use parallel execution in simple case.