Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Parallel INSERT (INTO ... SELECT ...) |
Date | |
Msg-id | CAA4eK1K4xwT4gcO2AJF7i3xF6kpg70ZjuhjBJh8UG+YjKEGHzA@mail.gmail.com Whole thread Raw |
In response to | RE: Parallel INSERT (INTO ... SELECT ...) ("Tang, Haiying" <tanghy.fnst@cn.fujitsu.com>) |
Responses |
RE: Parallel INSERT (INTO ... SELECT ...)
|
List | pgsql-hackers |
On Mon, Jan 18, 2021 at 2:40 PM Tang, Haiying <tanghy.fnst@cn.fujitsu.com> wrote: > > Hi Tsunakawa-san > > > From: Tang, Haiying <tanghy.fnst@cn.fujitsu.com> > > > (does this patch make some optimizes in serial insert? I'm a little > > > confused here, Because the patched execution time is less than > > > unpatched, but I didn't find information in commit messages about it. > > > If I missed something, please kindly let me know.) > > > > I haven't thought of anything yet. Could you show us the output of > > EXPLAIN (ANALYZE, BUFFERS, VERBOSE) of 1,000 partitions case for the > > patched and unpatched? If it doesn't show any difference, the output > > of perf may be necessary next. > > Execute EXPLAIN on Patched: > postgres=# explain (ANALYZE, BUFFERS, VERBOSE) insert into test_part select * from test_data1; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------ > Insert on public.test_part (cost=0.00..15.00 rows=0 width=0) (actual time=44.139..44.140 rows=0 loops=1) > Buffers: shared hit=1005 read=1000 dirtied=3000 written=2000 > -> Seq Scan on public.test_data1 (cost=0.00..15.00 rows=1000 width=8) (actual time=0.007..0.201 rows=1000 loops=1) > Output: test_data1.a, test_data1.b > Buffers: shared hit=5 > Planning: > Buffers: shared hit=27011 > Planning Time: 24.526 ms > Execution Time: 44.981 ms > > Execute EXPLAIN on non-Patched: > postgres=# explain (ANALYZE, BUFFERS, VERBOSE) insert into test_part select * from test_data1; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------ > Insert on public.test_part (cost=0.00..15.00 rows=0 width=0) (actual time=72.656..72.657 rows=0 loops=1) > Buffers: shared hit=22075 read=1000 dirtied=3000 written=2000 > -> Seq Scan on public.test_data1 (cost=0.00..15.00 rows=1000 width=8) (actual time=0.010..0.175 rows=1000 loops=1) > Output: test_data1.a, test_data1.b > Buffers: shared hit=5 > Planning: > Buffers: shared hit=72 > Planning Time: 0.135 ms > Execution Time: 79.058 ms > So, the results indicate that after the patch we touch more buffers during planning which I think is because of accessing the partition information, and during execution, the patch touches fewer buffers for the same reason. But why this can reduce the time with patch? I think this needs some investigation. -- With Regards, Amit Kapila.
pgsql-hackers by date: