Re: [HACKERS] Partitioned tables and relfilenode - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: [HACKERS] Partitioned tables and relfilenode |
Date | |
Msg-id | d937752c-bf74-5560-6b7c-a315efc1da77@lab.ntt.co.jp Whole thread Raw |
In response to | Re: [HACKERS] Partitioned tables and relfilenode (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] Partitioned tables and relfilenode
|
List | pgsql-hackers |
> On Tue, Mar 21, 2017 at 10:37 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Mar 21, 2017 at 9:49 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Tue, Mar 21, 2017 at 5:05 AM, Amit Langote >>> <Langote_Amit_f8@lab.ntt.co.jp> wrote: >>>> Attached updated patches. >>> >>> Committed 0001 after removing a comma. >> >> Regarding 0002, Thanks for the review. >> I notice this surprising behavior: >> >> rhaas=# create table foo (a int, b text) partition by list (a); >> CREATE TABLE >> rhaas=# select relfilenode from pg_class where oid = 'foo'::regclass; >> relfilenode >> ------------- >> 16385 >> (1 row) >> >> Why do we end up with a relfilenode if there's no storage? I would >> have expected to see 0 there. Hmm, I see that happening even with views (and other relkinds that do not have storage): create table foo (a int); create view foov as select * from foo; select relfilenode from pg_class where relname = 'foov'; -[ RECORD 1 ]------ relfilenode | 24606 create type typ as (a int); select relfilenode from pg_class where relname = 'typ'; -[ RECORD 1 ]------ relfilenode | 24610 After staring at the relevant code, fixing things so that relfilenode is 0 for relations without storage seems to be slightly involved. In the header comment of relmapper.c, there is a line: Mapped catalogs have zero in their pg_class.relfilenode entries. which is kind of significant for the relcache initialization code. RelationInitPhysicalAddr() called when building a relcache entry initializes the rd_rel.relfilenode from pg_class.relfilenode if it's valid and from relation mapper if invalid (i.e. 0). So when relcache entries for the mapped catalogs are being built, pg_class.relfilenode being InvalidOid is a signal to get the same from the relmapper. Since the same code path is exercised in other cases, all relations supposedly without storage would wrongly be looked up in the relmapper. It may be possible to fix that, but will require non-trivial rearrangement of relevant code. >> Other than that, there's not much to see here. I'm a little worried >> that you might have missed some case that can result in an access to >> the file, but I can't find one. Stuff I tried: >> >> VACUUM >> VACUUM FULL >> CLUSTER >> ANALYZE >> CREATE INDEX >> ALTER TABLE .. ALTER COLUMN .. TYPE >> TRUNCATE >> >> It would be good to go through and make sure all of those - and any >> others you can think of - are represented in the regression tests. We now have tests that cover commands that previously would try to access file for partitioned tables (most of those your listed above). The commit yesterday to eliminate scan nodes is also covered by tests. Speaking of - do you think the following error message is reasonable when a a partitioned table is passed to pg_prewarm(): select pg_prewarm('p'); ERROR: fork "main" does not exist for this relation It's the same if a view name is passed. >> Also, a documentation update is probably in order to explain that >> we're not going to accept heap_reloptions on the parent because the >> parent has no storage; such options must be set on the child in order >> to have effect. Hmm, actually I am no longer sure if we should completely get rid of the reloptions. Some options like fillfactor don't make sense, but we might want to use the values for autovacuum_* options when we will improve autovacuum's handling of partitioned tables. Maybe even parallel_workers could be useful to specify for parent tables. So, I took out reloptions.c changes from the patch for now and documented that specifying the storage options for partitioned parents is ineffective currently. Does that make sense? Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: