Re: Declarative partitioning - another take - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Declarative partitioning - another take |
Date | |
Msg-id | CA+Tgmob9o3+xgYHtGbfzUWm9bknRryx53Q5pEVFkzVHSDO=QXQ@mail.gmail.com Whole thread Raw |
In response to | Re: Declarative partitioning - another take (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: Declarative partitioning - another take
|
List | pgsql-hackers |
On Wed, Nov 30, 2016 at 10:56 AM, Amit Langote <amitlangote09@gmail.com> wrote: > The latest patch I posted earlier today has this implementation. I decided to try out these patches today with #define CLOBBER_CACHE_ALWAYS 1 in pg_config_manual.h, which found a couple of problems: 1. RelationClearRelation() wasn't preserving the rd_partkey, even though there's plenty of code that relies on it not changing while we hold a lock on the relation - in particular, transformPartitionBound. 2. partition_bounds_equal() was using the comparator and collation for partitioning column 0 to compare the datums for all partitioning columns. It's amazing this passed the regression tests. The attached incremental patch fixes those things and some cosmetic issues I found along the way. 3. RelationGetPartitionDispatchInfo() is badly broken: 1010 pd[i] = (PartitionDispatch) palloc(sizeof(PartitionDispatchData)); 1011 pd[i]->relid = RelationGetRelid(partrel); 1012 pd[i]->key = RelationGetPartitionKey(partrel); 1013 pd[i]->keystate = NIL; 1014 pd[i]->partdesc = partdesc; 1015 pd[i]->indexes = (int *) palloc(partdesc->nparts * sizeof(int)); 1016 heap_close(partrel, NoLock); 1017 1018 m = 0; 1019 for (j = 0; j < partdesc->nparts; j++) 1020 { 1021 Oid partrelid = partdesc->oids[j]; This code imagines that pointers it extracted from partrel are certain to remain valid after heap_close(partrel, NoLock), perhaps on the strength of the fact that we still retain a lock on the relation. But this isn't the case. As soon as nobody has the relation open, a call to RelationClearRelation() will destroy the relcache entry and everything to which it points; with CLOBBER_CACHE_ALWAYS, I see a failure at line 1021: #0 RelationGetPartitionDispatchInfo (rel=0x1136dddf8, lockmode=3, leaf_part_oids=0x7fff5633b938) at partition.c:1021 1021 Oid partrelid = partdesc->oids[j]; (gdb) bt 5 #0 RelationGetPartitionDispatchInfo (rel=0x1136dddf8, lockmode=3, leaf_part_oids=0x7fff5633b938) at partition.c:1021 #1 0x0000000109b8d71f in ExecInitModifyTable (node=0x7fd12984d750, estate=0x7fd12b885438, eflags=0) at nodeModifyTable.c:1730 #2 0x0000000109b5e7ac in ExecInitNode (node=0x7fd12984d750, estate=0x7fd12b885438, eflags=0) at execProcnode.c:159 #3 0x0000000109b58548 in InitPlan (queryDesc=0x7fd12b87b638, eflags=0) at execMain.c:961 #4 0x0000000109b57dcd in standard_ExecutorStart (queryDesc=0x7fd12b87b638, eflags=0) at execMain.c:239 (More stack frames follow...) Current language: auto; currently minimal (gdb) p debug_query_string $1 = 0x7fd12b84c238 "insert into list_parted values (null, 1);" (gdb) p partdesc[0] $2 = { nparts = 2139062143, oids = 0x7f7f7f7f7f7f7f7f, boundinfo = 0x7f7f7f7f7f7f7f7f } As you can see, the partdesc is no longer valid here. I'm not immediately sure how to fix this; this isn't a simple thinko. You need to keep the relations open for the whole duration of the query, not just long enough to build the dispatch info. I think you should try to revise this so that each relation is opened once and kept open; maybe the first loop should be making a pointer-list of Relations rather than an int-list of relation OIDs. And it seems to me (though I'm getting a little fuzzy here because it's late) that you need all of the partitions open, not just the ones that are subpartitioned, because how else are you going to know how to remap the tuple if the column order is different? But I don't see this code doing that, which makes me wonder if the partitions are being opened yet again in some other location. I recommend that once you fix this, you run 'make check' with #define CLOBBER_CACHE_ALWAYS 1 and look for other hazards. Such mistakes are easy to make with this kind of patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: