Re: BUG #15577: Query returns different results when executedmultiple times - Mailing list pgsql-bugs
From | Bartosz Polnik |
---|---|
Subject | Re: BUG #15577: Query returns different results when executedmultiple times |
Date | |
Msg-id | CAM37ZevSHt0pdKhZ_Bu7nDoMMWSDsJaaPUPR_ApowoFida4fcA@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #15577: Query returns different results when executedmultiple times (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses |
Re: BUG #15577: Query returns different results when executedmultiple times
|
List | pgsql-bugs |
For the following sql:
SET force_parallel_mode = on;
explain (costs false, analyze true)
SELECT ta.id AS table_a_id,
tc.id as table_c_id,
tba.id AS table_b_id
FROM test.table_b_active tba
INNER JOIN test.table_c tc ON tba.target_id = tc.id
INNER JOIN test.table_d td ON tc.table_d_id = td.id
LEFT JOIN test.table_a ta ON ta.table_c_id = tc.id AND ta.date =
'2018-08-31' :: DATE
WHERE tba.date BETWEEN '2018-08-10' :: DATE AND '2018-09-01' :: DATE
AND td.group = 'A'
AND tc.table_e_id = 4
AND (
(tba.target_id = tc.id AND tba.group_type = 'A')
OR tba.source_id = tc.id
);
I got:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=145.151..284.612 rows=26 loops=1)
-> Gather (actual time=145.092..304.166 rows=26 loops=1)
Workers Planned: 1
Workers Launched: 1
-> Nested Loop (actual time=153.289..247.768 rows=13 loops=2)
-> Hash Join (actual time=0.221..15.028 rows=945 loops=2)
Hash Cond: (tc.table_d_id = td.id)
-> Parallel Seq Scan on table_c tc (actual time=0.085..14.442 rows=1751 loops=2)
Filter: (table_e_id = 4)
Rows Removed by Filter: 49117
-> Hash (actual time=0.044..0.045 rows=8 loops=2)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on table_d td (actual time=0.028..0.033 rows=8 loops=2)
Filter: (group = 'A'::test.group)
Rows Removed by Filter: 55
-> Index Scan using table_b_idx_target_id on table_b (actual time=0.242..0.246 rows=0 loops=1890)
Index Cond: (target_id = tc.id)
Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A'::test.group_type)) OR (source_id = tc.id)))
Rows Removed by Filter: 26
-> Index Scan using table_a_uq_001 on table_a ta (actual time=0.016..0.016 rows=1 loops=26)
Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date))
Planning Time: 5.024 ms
Execution Time: 304.858 ms
On Second execution:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=228.405..279.383 rows=19 loops=1)
-> Gather (actual time=228.376..288.807 rows=19 loops=1)
Workers Planned: 1
Workers Launched: 1
-> Nested Loop (actual time=150.785..242.440 rows=10 loops=2)
-> Hash Join (actual time=0.212..14.353 rows=945 loops=2)
Hash Cond: (tc.table_d_id = td.id)
-> Parallel Seq Scan on table_c tc (actual time=0.075..13.794 rows=1751 loops=2)
Filter: (table_e_id = 4)
Rows Removed by Filter: 49117
-> Hash (actual time=0.050..0.050 rows=8 loops=2)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on table_d_id td (actual time=0.031..0.037 rows=8 loops=2)
Filter: (group = 'A'::test.group)
Rows Removed by Filter: 55
-> Index Scan using table_b_idx_target_id on table_b (actual time=0.237..0.241 rows=0 loops=1890)
Index Cond: (target_id = tc.id)
Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A'::test.group_type)) OR (source_id = tc.id)))
Rows Removed by Filter: 26
-> Index Scan using table_a_uq_001 on table_a ta (actual time=0.021..0.021 rows=1 loops=19)
Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date))
Planning Time: 4.816 ms
Execution Time: 289.503 ms
I executed it a couple of times and saw in output 26 rows (as above), 21 rows (also above), 28 rows, 28 rows, 29 rows etc.
The changing parts (nr of rows) were: Nested Loop Left Join and Nested Loop.
For the following sql:
SET parallel_leader_participation = on;
explain (costs false, analyze true)
SELECT ta.id AS table_a_id,
tc.id as table_c_id,
tba.id AS table_b_id
FROM test.table_b_active tba
INNER JOIN test.table_c tc ON tba.target_id = tc.id
INNER JOIN test.table_d td ON tc.table_d_id = td.id
LEFT JOIN test.table_a ta ON ta.table_c_id = tc.id AND ta.date =
'2018-08-31' :: DATE
WHERE tba.date BETWEEN '2018-08-10' :: DATE AND '2018-09-01' :: DATE
AND td.group = 'A'
AND tc.table_e_id = 4
AND (
(tba.target_id = tc.id AND tba.group_type = 'A')
OR tba.source_id = tc.id
);
I received:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=143.805..279.354 rows=29 loops=1)
-> Gather (actual time=143.735..299.066 rows=29 loops=1)
Workers Planned: 1
Workers Launched: 1
-> Nested Loop (actual time=149.334..242.108 rows=15 loops=2)
-> Hash Join (actual time=0.203..14.536 rows=945 loops=2)
Hash Cond: (tc.table_d_id = td.id)
-> Parallel Seq Scan on table_c tc (actual time=0.072..13.971 rows=1751 loops=2)
Filter: (table_e_id = 4)
Rows Removed by Filter: 49117
-> Hash (actual time=0.043..0.043 rows=8 loops=2)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on table_d td (actual time=0.028..0.034 rows=8 loops=2)
Filter: (group = 'A'::test.group)
Rows Removed by Filter: 55
-> Index Scan using table_b_idx_target_id on table_b (actual time=0.237..0.240 rows=0 loops=1890)
Index Cond: (target_id = tc.id)
Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A'::test.group_type)) OR (source_id = tc.id)))
Rows Removed by Filter: 26
-> Index Scan using table_a_uq_001 on table_a ta (actual time=0.018..0.018 rows=1 loops=29)
Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date))
Planning Time: 5.070 ms
Execution Time: 299.849 ms
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (actual time=231.714..276.770 rows=26 loops=1)
-> Gather (actual time=231.654..295.971 rows=26 loops=1)
Workers Planned: 1
Workers Launched: 1
-> Nested Loop (actual time=151.209..239.965 rows=13 loops=2)
-> Hash Join (actual time=0.214..14.467 rows=945 loops=2)
Hash Cond: (tc.table_d_id = td.id)
-> Parallel Seq Scan on table_c tc (actual time=0.070..13.923 rows=1751 loops=2)
Filter: (table_e_id = 4)
Rows Removed by Filter: 49117
-> Hash (actual time=0.045..0.045 rows=8 loops=2)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on table_d td (actual time=0.028..0.033 rows=8 loops=2)
Filter: (group = 'A'::test.group)
Rows Removed by Filter: 55
-> Index Scan using table_b_idx_target_id on table_b (actual time=0.235..0.238 rows=0 loops=1890)
Index Cond: (target_id = tc.id)
Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A'::test.group_type)) OR (source_id = tc.id)))
Rows Removed by Filter: 26
-> Index Scan using table_a_uq_001 on table_a ta (actual time=0.017..0.017 rows=1 loops=26)
Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date))
Planning Time: 4.976 ms
Execution Time: 296.729 ms
I also made a test with:
SET parallel_leader_participation = off;
SET force_parallel_mode = on;
explain (costs false, analyze true)
SELECT ta.id AS table_a_id,
tc.id as table_c_id,
tba.id AS table_b_id
FROM test.table_b_active tba
INNER JOIN test.table_c tc ON tba.target_id = tc.id
INNER JOIN test.table_d td ON tc.table_d_id = td.id
LEFT JOIN test.table_a ta ON ta.table_c_id = tc.id AND ta.date =
'2018-08-31' :: DATE
WHERE tba.date BETWEEN '2018-08-10' :: DATE AND '2018-09-01' :: DATE
AND td.group = 'A'
AND tc.table_e_id = 4
AND (
(tba.target_id = tc.id AND tba.group_type = 'A')
OR tba.source_id = tc.id
);
The output is correct - 31 rows, but the plan is different:
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (actual time=263.880..507.668 rows=31 loops=1)
Workers Planned: 1
Workers Launched: 1
Single Copy: true
-> Nested Loop Left Join (actual time=193.246..415.020 rows=31 loops=1)
-> Nested Loop (actual time=192.997..414.336 rows=31 loops=1)
-> Hash Join (actual time=0.287..23.958 rows=1890 loops=1)
Hash Cond: (tc.table_d_id = td.id)
-> Seq Scan on table_c tc (actual time=0.128..23.044 rows=3501 loops=1)
Filter: (table_e_id = 4)
Rows Removed by Filter: 98233
-> Hash (actual time=0.064..0.069 rows=8 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on table_d td (actual time=0.045..0.052 rows=8 loops=1)
Filter: (group = 'A'::test.group)
Rows Removed by Filter: 55
-> Index Scan using table_b_idx_target_id on table_b (actual time=0.203..0.206 rows=0 loops=1890)
Index Cond: (target_id = tc.id)
Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A'::test.group_type)) OR (source_id = tc.id)))
Rows Removed by Filter: 26
-> Index Scan using table_a_uq_001 on table_a ta (actual time=0.021..0.021 rows=1 loops=31)
Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date))
Planning Time: 5.051 ms
Execution Time: 507.858 ms
On Mon, Jan 7, 2019 at 8:59 PM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
On Tue, Jan 8, 2019 at 1:56 AM Bartosz Polnik <bartoszpolnik@gmail.com> wrote:
> I've modified the script:
> 1. I Added SET force_parallel_mode = on;:
What if you try that with your original query in the database that
shows the problem? Or alternatively SET parallel_leader_participation
= on. What I'm wondering is whether the parallel worker is
consistently getting the same result, but it's a different result from
the leader, and the variation comes from the race between leader and
worker, or whether the instability comes from somewhere else.
--
Thomas Munro
http://www.enterprisedb.com
pgsql-bugs by date: