BUG #15577: Query returns different results when executed multipletimes - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #15577: Query returns different results when executed multipletimes |
Date | |
Msg-id | 15577-ca61ab18904af852@postgresql.org Whole thread Raw |
Responses |
Re: BUG #15577: Query returns different results when executedmultiple times
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 15577 Logged by: Bartosz Polnik Email address: bartoszpolnik@gmail.com PostgreSQL version: 11.1 Operating system: Windows 10 Description: Hello, A couple of days ago I stumbled upon a query that doesn't return all rows it should. When we execute it multiple times, we get only a subset of data. The query is: SELECT ta.id AS table_a_id, tc.id as table_c_id, tba.id AS table_b_id FROM test.table_b_active tba INNER JOIN test.table_c tc ON tba.target_id = tc.id INNER JOIN test.table_d td ON tc.table_d_id = td.id LEFT JOIN test.table_a ta ON ta.table_c_id = tc.id AND ta.date = '2018-08-31' :: DATE WHERE tba.date BETWEEN '2018-08-10' :: DATE AND '2018-09-01' :: DATE AND td.group = 'A' AND tc.table_e_id = 4 AND ( (tba.target_id = tc.id AND tba.group_type = 'A') OR tba.source_id = tc.id ); We always expect 31 rows in return: table_a_id | table_c_id | table_b_id ------------+------------+------------ 16116185 | 328860 | 2936924 16115788 | 348539 | 2913754 16115788 | 348539 | 3039173 16115788 | 348539 | 2913874 16116256 | 293541 | 2997160 16116256 | 293541 | 2901938 16116256 | 293541 | 2901933 16114789 | 292051 | 3038539 16114813 | 342353 | 3052371 16116069 | 351585 | 3025941 16114814 | 331329 | 2946332 16115861 | 350487 | 2933633 16115106 | 350047 | 2902075 16116066 | 351434 | 3010909 16114811 | 298605 | 2998909 16114811 | 298605 | 2893809 16114811 | 298605 | 2987038 16114811 | 298605 | 3038877 16114811 | 298605 | 3010694 16114811 | 298605 | 2893188 16114811 | 298605 | 2893391 16114811 | 298605 | 2983360 16114811 | 298605 | 3038221 16114811 | 298605 | 3026078 16114811 | 298605 | 2998966 16116260 | 290583 | 2955483 16116260 | 290583 | 2921135 16116260 | 290583 | 2947914 16116260 | 290583 | 2901669 16116241 | 295971 | 3038921 16116249 | 296708 | 3038888 But when we run it using psql, we get different results (not all rows are returned). Here's the query I used to get the execution plan and example output from two consecutive plan executions: explain (costs false, analyze true) SELECT ta.id AS table_a_id, tc.id as table_c_id, tba.id AS table_b_id FROM test.table_b_active tba INNER JOIN test.table_c tc ON tba.target_id = tc.id INNER JOIN test.table_d td ON tc.table_d_id = td.id LEFT JOIN test.table_a ta ON ta.table_c_id = tc.id AND ta.date = '2018-08-31' :: DATE WHERE tba.date BETWEEN '2018-08-10' :: DATE AND '2018-09-01' :: DATE AND td.group = 'A' AND tc.table_e_id = 4 AND ( (tba.target_id = tc.id AND tba.group_type = 'A') OR tba.source_id = tc.id ); Nested Loop Left Join (actual time=157.436..329.538 rows=21 loops=1) -> Gather (actual time=157.407..349.743 rows=21 loops=1) Workers Planned: 1 Workers Launched: 1 -> Nested Loop (actual time=167.978..281.540 rows=11 loops=2) -> Hash Join (actual time=0.300..18.206 rows=945 loops=2) Hash Cond: (tc.table_d_id = td.id) -> Parallel Seq Scan on table_c tc (actual time=0.107..17.326 rows=1751 loops=2) Filter: (table_e_id = 4) Rows Removed by Filter: 49117 -> Hash (actual time=0.065..0.065 rows=8 loops=2) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on table_d td (actual time=0.043..0.049 rows=8 loops=2) Filter: (group = 'A') Rows Removed by Filter: 55 -> Index Scan using table_b_idx_target_id on table_b (actual time=0.273..0.278 rows=0 loops=1890) Index Cond: (target_id = tc.id) Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A')) OR (source_id = tc.id))) Rows Removed by Filter: 26 -> Index Scan using table_a_uq_001 on table_a ta (actual time=0.019..0.019 rows=1 loops=21) Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date)) Planning Time: 1.218 ms Execution Time: 350.283 ms Above we only got 21 rows, instead of 31. Nested Loop Left Join (actual time=164.189..358.068 rows=28 loops=1) -> Gather (actual time=164.140..370.034 rows=28 loops=1) Workers Planned: 1 Workers Launched: 1 -> Nested Loop (actual time=188.718..306.289 rows=14 loops=2) -> Hash Join (actual time=0.321..21.860 rows=945 loops=2) Hash Cond: (tc.table_d_id = td.id) -> Parallel Seq Scan on table_c tc (actual time=0.113..20.784 rows=1751 loops=2) Filter: (table_e_id = 4) Rows Removed by Filter: 49117 -> Hash (actual time=0.063..0.063 rows=8 loops=2) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on table_d td (actual time=0.040..0.049 rows=8 loops=2) Filter: (group = 'A') Rows Removed by Filter: 55 -> Index Scan using table_b_idx_target_id on table_b (actual time=0.294..0.300 rows=0 loops=1890) Index Cond: (target_id = tc.id) Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A')) OR (source_id = tc.id))) Rows Removed by Filter: 26 -> Index Scan using table_a_uq_001 on table_a ta (actual time=0.031..0.031 rows=1 loops=28) Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date)) Planning Time: 0.971 ms Execution Time: 371.202 ms Here we only got 28 rows, instead of 31. Used indices: create index table_b_idx_target_id on test.table_b (target_id) where (t_active IS TRUE); create unique index table_a_uq_001 on table_a (table_c_id, date); And views: create view test.table_b_active as SELECT * FROM test.table_b WHERE (table_b.t_active IS TRUE); I made a couple of tests and couldn't reproduce this problem with max_parallel_workers=0. With this parameter set, we get the execution plan: Nested Loop Left Join (actual time=212.010..509.274 rows=31 loops=1) -> Gather (actual time=211.976..508.855 rows=31 loops=1) Workers Planned: 1 Workers Launched: 0 -> Nested Loop (actual time=211.593..508.425 rows=31 loops=1) -> Hash Join (actual time=0.063..31.628 rows=1890 loops=1) Hash Cond: (tc.table_d_id = td.id) -> Parallel Seq Scan on table_c tc (actual time=0.042..30.000 rows=3501 loops=1) Filter: (table_e_id = 4) Rows Removed by Filter: 98233 -> Hash (actual time=0.016..0.016 rows=8 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on table_d_id td (actual time=0.008..0.012 rows=8 loops=1) Filter: (group = 'A') Rows Removed by Filter: 55 -> Index Scan using table_b_idx_target_id on table_b (actual time=0.248..0.251 rows=0 loops=1890) Index Cond: (target_id = tc.id) Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A')) OR (source_id = tc.id))) Rows Removed by Filter: 26 -> Index Scan using table_a_uq_001 on table_a ta (actual time=0.014..0.014 rows=1 loops=31) Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date)) Planning Time: 1.135 ms Execution Time: 509.431 ms Nr of rows returned is correct - 31. I tried to create a clean schema with test data, but couldn't get the same execution plan, so I can't include that. The data above comes from db snapshot with table, index and view names changed by hand, so there might be some typos. Sorry about them. I'm using PostgreSQL 11.1, compiled by Visual C++ build 1914, 64-bit. Best regards, Bartosz Polnik
pgsql-bugs by date: