Re: BUG #18588: Cannot force/let database use parallel execution in simple case. - Mailing list pgsql-bugs
From | Maxim Boguk |
---|---|
Subject | Re: BUG #18588: Cannot force/let database use parallel execution in simple case. |
Date | |
Msg-id | CAK-MWwTqDRerxrpaN8Xb25DqDGBw1UT+YSkfp4QN-MgOH97c+Q@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #18588: Cannot force/let database use parallel execution in simple case. (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: BUG #18588: Cannot force/let database use parallel execution in simple case.
|
List | pgsql-bugs |
On Thu, Aug 22, 2024 at 5:34 PM David Rowley <dgrowleyml@gmail.com> wrote: > > On Fri, 23 Aug 2024 at 01:14, PG Bug reporting form > <noreply@postgresql.org> wrote: > > set min_parallel_index_scan_size to '8kB'; > > > > set min_parallel_table_scan_size to '8kB'; > > set parallel_tuple_cost to 0; > > set parallel_setup_cost to 0; > > > > have no effect. > > > > Even with the set force_parallel_mode to on - no effect: > > negotiation_chat_archive=# explain analyze select topic_id as y0_ from > > public.negotiation_topic_archive_p005 this_ where > > this_.employer_id='816144'; > > Does it choose a parallel plan if you do: > > ALTER TABLE public.negotiation_topic_archive_p005 SET (parallel_workers = 2); > > I assume it must be a btree index given the other plan does use a > parallel scan. So I wondered if something weird was happening in > compute_parallel_worker() and it was choosing 1 worker, which might > cause the path to be rejected. > > You might need to come up with a self-contained test case here as when > I tried this on PG15, I do get a parallel index scan. > > create table t1 (a int); > insert into t1 select x/1000 from generate_Series(1,1000000)X; > create index on t1(a); > set min_parallel_index_scan_size to '8kB'; > set min_parallel_table_scan_size to '8kB'; > set parallel_tuple_cost to 0; > set parallel_setup_cost to 0; > explain select * from t1 where a < 10; > QUERY PLAN > -------------------------------------------------------------------------------------------- > Gather (cost=0.42..179.40 rows=8912 width=4) > Workers Planned: 2 > -> Parallel Index Only Scan using t1_a_idx on t1 > (cost=0.42..179.40 rows=3713 width=4) > Index Cond: (a < 10) > (4 rows) > > select version(); > version > ------------------------------------------------------------------------------------------------------- > PostgreSQL 15.6 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu > 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit > (1 row) > > David Hi David, Some additional info about table: \dt+ negotiation_topic_archive_p005 List of relations Schema | Name | Type | Owner | Persistence | Access method | Size | Description --------+--------------------------------+-------+----------+-------------+---------------+--------+------------- public | negotiation_topic_archive_p005 | table | postgres | permanent | heap | 112 GB | select relpages from pg_class where relname='negotiation_topic_archive_p005'; relpages ---------- 14688625 select count(*) from negotiation_topic_archive_p005; count ----------- 499710323 Quite big table, but nothing unusual for contemporary hardware. Hm integer(?) overflow somewhere in planner costing model of parallel scans? I'll looked into compute_parallel_worker code ( https://doxygen.postgresql.org/paths_8h.html#a64fa1e38a91614994697a7404cea242a ) but I didn't found anything obviously wrong for me there (but my C knowledge pretty limited). There are some results of different tests: 0. ALTER TABLE public.negotiation_topic_archive_p005 SET (parallel_workers = 2); - have no effect 1. Create fresh table on same db with only required data (local minimal test case): Unfortunately - no luck: --trying make miminal isolated case create table user_tmp.negotiation_topic_archive_test as select employer_id, applicant_id from negotiation_topic_archive_p005; SELECT 499710323 create index negotiation_topic_archive_test_employer_id_key on user_tmp.negotiation_topic_archive_test(employer_id); CREATE INDEX analyze verbose user_tmp.negotiation_topic_archive_test; INFO: analyzing "user_tmp.negotiation_topic_archive_test" INFO: "negotiation_topic_archive_test": scanned 30000 of 2211108 pages, containing 6780000 live rows and 0 dead rows; 30000 rows in sample, 499710408 estimated total rows ANALYZE select relpages from pg_class where relname='negotiation_topic_archive_test'; relpages ---------- 2211108 --no luck explain select * from user_tmp.negotiation_topic_archive_test this_ where this_.employer_id=816144; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Index Scan using negotiation_topic_archive_test_employer_id_key on negotiation_topic_archive_test this_ (cost=0.11..2047834.92 rows=449739 width=8) Index Cond: (employer_id = 816144) --still no luck set min_parallel_index_scan_size to 0; set min_parallel_table_scan_size to 0; set parallel_tuple_cost to 0; set parallel_setup_cost to 0; set force_parallel_mode to on; explain select * from user_tmp.negotiation_topic_archive_test this_ where this_.employer_id=816144; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------ Gather (cost=0.11..2047834.92 rows=449739 width=8) Workers Planned: 1 Single Copy: true -> Index Scan using negotiation_topic_archive_test_employer_id_key on negotiation_topic_archive_test this_ (cost=0.11..2047834.92 rows=449739 width=8) Index Cond: (employer_id = 816144) --with not exists - still ok explain select * from user_tmp.negotiation_topic_archive_test this_ where this_.employer_id=816144 AND NOT EXISTS ( SELECT FROM applicant_black_list WHERE this_.employer_id=applicant_black_list.employer_id AND this_.applicant_id=applicant_black_list.applicant_id ); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Gather (cost=10.27..2045714.44 rows=449737 width=8) Workers Planned: 4 -> Parallel Hash Anti Join (cost=10.27..2045714.44 rows=112434 width=8) Hash Cond: ((this_.employer_id = applicant_black_list.employer_id) AND (this_.applicant_id = applicant_black_list.applicant_id)) -> Parallel Index Scan using negotiation_topic_archive_test_employer_id_key on negotiation_topic_archive_test this_ (cost=0.11..2044461.88 rows=112435 width=8) Index Cond: (employer_id = 816144) -> Parallel Hash (cost=10.13..10.13 rows=2 width=8) -> Parallel Index Only Scan using employer_to_user_index on applicant_black_list (cost=0.08..10.13 rows=2 width=8) Index Cond: (employer_id = 816144) So isolated tests on table without inheritance parents or some legacy/forgotten custom settings - show the same results. 2.Now lets try dump/reload to the test host with Postgresql 16.4 on stock pgdg default postgresql.conf: pg_dump -Fc --compress=2 --table=user_tmp.negotiation_topic_archive_test --table=applicant_black_list db > negotiation_topic_archive_test.dump ls -la negotiation_topic_archive_test.dump -rw-r--r-- 1 postgres postgres 4028520703 Aug 22 19:10 negotiation_topic_archive_test.dump --on test host... and still no luck pg_restore -Fc --port=5433 --dbname=test negotiation_topic_archive_test.dump test=# analyze applicant_black_list; ANALYZE test=# analyze user_tmp.negotiation_topic_archive_test; ANALYZE set min_parallel_index_scan_size to 0; set min_parallel_table_scan_size to 0; set parallel_tuple_cost to 0; set parallel_setup_cost to 0; set debug_parallel_query to on; explain select * from user_tmp.negotiation_topic_archive_test this_ where this_.employer_id=816144; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------ Gather (cost=0.57..1529820.39 rows=416420 width=8) Workers Planned: 1 Single Copy: true -> Index Scan using negotiation_topic_archive_test_employer_id_key on negotiation_topic_archive_test this_ (cost=0.57..1529820.39 rows=416420 width=8) Index Cond: (employer_id = 816144) In the same time modified youre test case on the same 16.4 test db - test case seems ok even with large dataset: insert into t1 select x/10000 from generate_Series(1,100000000)X; create index on t1(a); analyze t1; set min_parallel_index_scan_size to 0; set min_parallel_table_scan_size to 0; set parallel_tuple_cost to 0; set parallel_setup_cost to 0; set debug_parallel_query to on; explain select * from t1 where a=10; explain select * from t1 where a=10; QUERY PLAN --------------------------------------------------------------------------------------------- Gather (cost=0.57..4636.40 rows=10786 width=4) Workers Planned: 2 -> Parallel Index Only Scan using t1_a_idx on t1 (cost=0.57..4636.40 rows=4494 width=4) Index Cond: (a = 10) -- Maxim Boguk Senior Postgresql DBA Phone UA: +380 99 143 0000 Phone AU: +61 45 218 5678
pgsql-bugs by date: