Thread: Parallel Append Query
Hi!
I'm seeing varying number of rows removed by filter on a parallel append query:
CREATE TABLE usuario (
id bigint generated always as identity,
nome text
);
INSERT INTO usuario (nome)
SELECT md5(seq)::text
FROM generate_series(1,500000,1) tab(seq);
ANALYZE usuario;
EXPLAIN ANALYZE
(SELECT * FROM usuario WHERE nome = 'c4ca4238a0b923820dcc509a6f75849b')
UNION ALL
(SELECT * FROM usuario WHERE nome = 'c81e728d9d4c2f636f067f89cc14862c')
Gather (cost=1000.00..15574.56 rows=2 width=41) (actual time=0.320..56.154 rows=2 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..14574.36 rows=2 width=41) (actual time=13.449..47.899 rows=1 loops=3)
-> Parallel Seq Scan on usuario (cost=0.00..7287.17 rows=1 width=41) (actual time=8.934..25.476 rows=0 loops=3)
Filter: (nome = 'c81e728d9d4c2f636f067f89cc14862c'::text)
Rows Removed by Filter: 166666
-> Parallel Seq Scan on usuario usuario_1 (cost=0.00..7287.17 rows=1 width=41) (actual time=13.470..33.630 rows=0 loops=2)
Filter: (nome = 'c4ca4238a0b923820dcc509a6f75849b'::text)
Rows Removed by Filter: 250000
Planning Time: 0.106 ms
Execution Time: 56.184 ms
The execution plan varies just by running the query a few times, and it goes back and forth between these two:
Gather (cost=1000.00..15574.56 rows=2 width=41) (actual time=0.341..55.499 rows=2 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..14574.36 rows=2 width=41) (actual time=15.904..49.821 rows=1 loops=3)
-> Parallel Seq Scan on usuario (cost=0.00..7287.17 rows=1 width=41) (actual time=0.013..48.591 rows=1 loops=1)
Filter: (nome = 'c81e728d9d4c2f636f067f89cc14862c'::text)
Rows Removed by Filter: 499999
-> Parallel Seq Scan on usuario usuario_1 (cost=0.00..7287.17 rows=1 width=41) (actual time=15.985..33.621 rows=0 loops=3)
Filter: (nome = 'c4ca4238a0b923820dcc509a6f75849b'::text)
Rows Removed by Filter: 166666
Planning Time: 0.107 ms
Execution Time: 55.524 ms
But I can't understand why does it changes the number of loops from the queries
I'm seeing varying number of rows removed by filter on a parallel append query:
CREATE TABLE usuario (
id bigint generated always as identity,
nome text
);
INSERT INTO usuario (nome)
SELECT md5(seq)::text
FROM generate_series(1,500000,1) tab(seq);
ANALYZE usuario;
EXPLAIN ANALYZE
(SELECT * FROM usuario WHERE nome = 'c4ca4238a0b923820dcc509a6f75849b')
UNION ALL
(SELECT * FROM usuario WHERE nome = 'c81e728d9d4c2f636f067f89cc14862c')
Gather (cost=1000.00..15574.56 rows=2 width=41) (actual time=0.320..56.154 rows=2 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..14574.36 rows=2 width=41) (actual time=13.449..47.899 rows=1 loops=3)
-> Parallel Seq Scan on usuario (cost=0.00..7287.17 rows=1 width=41) (actual time=8.934..25.476 rows=0 loops=3)
Filter: (nome = 'c81e728d9d4c2f636f067f89cc14862c'::text)
Rows Removed by Filter: 166666
-> Parallel Seq Scan on usuario usuario_1 (cost=0.00..7287.17 rows=1 width=41) (actual time=13.470..33.630 rows=0 loops=2)
Filter: (nome = 'c4ca4238a0b923820dcc509a6f75849b'::text)
Rows Removed by Filter: 250000
Planning Time: 0.106 ms
Execution Time: 56.184 ms
The execution plan varies just by running the query a few times, and it goes back and forth between these two:
Gather (cost=1000.00..15574.56 rows=2 width=41) (actual time=0.341..55.499 rows=2 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..14574.36 rows=2 width=41) (actual time=15.904..49.821 rows=1 loops=3)
-> Parallel Seq Scan on usuario (cost=0.00..7287.17 rows=1 width=41) (actual time=0.013..48.591 rows=1 loops=1)
Filter: (nome = 'c81e728d9d4c2f636f067f89cc14862c'::text)
Rows Removed by Filter: 499999
-> Parallel Seq Scan on usuario usuario_1 (cost=0.00..7287.17 rows=1 width=41) (actual time=15.985..33.621 rows=0 loops=3)
Filter: (nome = 'c4ca4238a0b923820dcc509a6f75849b'::text)
Rows Removed by Filter: 166666
Planning Time: 0.107 ms
Execution Time: 55.524 ms
But I can't understand why does it changes the number of loops from the queries