Re: ALTER TABLE ADD COLUMN fast default - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: ALTER TABLE ADD COLUMN fast default |
Date | |
Msg-id | CAKJS1f8kOwpb6XtRRpONPtR35dyHp5m6WgOYF9iGWR-FLmoGTA@mail.gmail.com Whole thread Raw |
In response to | Re: ALTER TABLE ADD COLUMN fast default (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>) |
Responses |
Re: ALTER TABLE ADD COLUMN fast default
|
List | pgsql-hackers |
On 1 March 2018 at 11:49, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote: > On Wed, Feb 28, 2018 at 5:39 AM, Andres Freund <andres@anarazel.de> wrote: >> On 2018-02-27 14:29:44 +1030, Andrew Dunstan wrote: >>> Profiling through timer interrupt >>> samples % image name symbol name >>> 22584 28.5982 postgres ExecInterpExpr >>> 11950 15.1323 postgres slot_getmissingattrs >>> 5471 6.9279 postgres AllocSetAlloc >>> 3018 3.8217 postgres TupleDescInitEntry >>> 2042 2.5858 postgres MemoryContextAllocZeroAligned >>> 1807 2.2882 postgres SearchCatCache1 >>> 1638 2.0742 postgres expression_tree_walker >>> >>> >>> That's very different from what we see on the master branch. The time >>> spent in slot_getmissingattrs is perhaps not unexpected, but I don't >>> (at least yet) understand why we're getting so much time spent in >>> ExecInterpExpr, which doesn't show up at all when the same benchmark >>> is run on master. >> >> I'd guess that's because the physical tlist optimization is disabled. I >> assume you'd see something similar on master if you dropped a column. I put that to the test and there's still something fishy going on. I created some benchmark scripts to help out a bit. Basically, they'll allow you to choose how many columns you want in the test tables and how many rows to put in them. The following test is a 60 second single threaded pgbench test with 1000 columns, 400 rows, testing select sum(c10) from <table> *** Benchmarking normal table... tps = 1972.985639 (excluding connections establishing) *** Benchmarking missing table... tps = 433.779008 (excluding connections establishing) *** Benchmarking dropped table... tps = 3661.063986 (excluding connections establishing) The dropped table had 1001 columns, but the script drops the first. The missing table has all 1000 columns missing. In the normal table all tuples have all attrs. I imagine it should be possible to make the missing table case faster than either of the others. The reason it's slower is down to slot_getsomeattrs being called to get all 1000 attributes in all but the 2nd call to the function... ? whereas only the 10th attr is deformed in the Normal table case. Here's some perf output for each case: Normal: 15.88% postgres postgres [.] expression_tree_walker 7.82% postgres postgres [.] fix_expr_common 6.12% postgres [kernel.kallsyms] [k] __lock_text_start 5.45% postgres postgres [.] SearchCatCache3 3.98% postgres postgres [.] TupleDescInitEntry Missing: 47.80% postgres postgres [.] ExecInterpExpr 25.65% postgres postgres [.] slot_getmissingattrs 6.86% postgres postgres [.] build_tlist_index 4.43% postgres libc-2.17.so [.] __memset_sse2 2.57% postgres postgres [.] expression_tree_walker Dropped: 19.59% postgres postgres [.] slot_deform_tuple 6.63% postgres postgres [.] hash_search_with_hash_value 6.55% postgres postgres [.] heap_getnext 5.35% postgres postgres [.] ExecInterpExpr 4.79% postgres postgres [.] heap_page_prune_opt /me studies the code for a bit... Okay, it looks like the patch should disable physical tlists when there's a missing column the same way as we do for dropped columns. Patch attached. TPS looks much better now; *** Benchmarking missing table... tps = 5666.117627 (excluding connections establishing) everybody's going to want missing columns in their tables now... -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
pgsql-hackers by date: