Re: [HACKERS] Partitioned tables and relfilenode - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [HACKERS] Partitioned tables and relfilenode |
Date | |
Msg-id | CA+TgmobusMYTZBN-Yd9=XTvsdEpD1U5fACqfA_x1V74KojVqxQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Partitioned tables and relfilenode (Simon Riggs <simon@2ndquadrant.com>) |
Responses |
Re: [HACKERS] Partitioned tables and relfilenode
|
List | pgsql-hackers |
On Tue, Mar 21, 2017 at 12:19 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On 16 March 2017 at 10:03, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> On 2017/03/15 7:09, Robert Haas wrote: > >>> I think that eliding the Append node when there's only one child may >>> be unsafe in the case where the child's attribute numbers are >>> different from the parent's attribute numbers. I remember Tom making >>> some comment about this when I was working on MergeAppend, although I >>> no longer remember the specific details. >> >> Append node elision does not occur in the one-child case. With the patch: > ... >> create table q1 partition of q for values in (1); >> explain select * from q; >> QUERY PLAN >> ------------------------------------------------------------ >> Append (cost=0.00..35.50 rows=2550 width=4) >> -> Seq Scan on q1 (cost=0.00..35.50 rows=2550 width=4) >> (2 rows) >> >> Maybe that should be done, but this patch doesn't implement that. > > Robert raises the possible problem that removing the Append wouldn't > work when the parent and child attribute numbers don't match. Surely > that never happens with partitions, by definition? No, the attribute numbers don't have to match. This decision was made a long time ago, and there have been a whole bunch of followup commits since the original partitioning patch that were dedicated to fixing up cases where that wasn't working properly in the original commit. It seems superficially attractive to require the attribute numbers to match, but it creates some really unpleasant cases. For example, suppose a user tries to creates a partitioned table, drops a column, then creates a standalone table which matches the apparent column list of the partitioned table, then tries to attach it as a partition. ERROR: the columns you previously dropped from the parent that you can't see and don't know about aren't the same as the ones dropped from the standalone table you're trying to attach as a partition DETAIL: Try recreating your proposed new partition with a pass-by-value column of width 8 after the third column, and then dropping that column before trying to attach it as a partition. HINT: Abandon all hope, ye who enter here. Not cool with that. The decision not to require the attribute numbers to match doesn't necessarily mean we can't get rid of the Append node, though. First of all, in a lot of practical cases the attribute numbers will all match. Second, if they don't, the most that would be required is a projection step, which could usually be done without a separate node because most nodes are projection-capable. And maybe not even that much is needed; I'd have to go back and look at what Tom was worried about the last time this came up. (Hmm, maybe the problem had to do with varnos matching, rather then attribute numbers?) Another and independent problem with eliding the Append node is that, if we did that, we'd still have to guarantee that the parent relation corresponding to the Append node got locked somehow. Otherwise, we'll be accessing the tuple routing information for a table on which we don't have a lock. That's probably a solvable problem, too, but it hasn't been solved yet. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: