Re: Declarative partitioning - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: Declarative partitioning |
Date | |
Msg-id | 56DE2D4B.1040001@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Declarative partitioning (Corey Huinker <corey.huinker@gmail.com>) |
Responses |
Re: Declarative partitioning
Re: Declarative partitioning |
List | pgsql-hackers |
Hi Corey, Sorry for replying so late. On 2016/02/25 3:31, Corey Huinker wrote: > [ ... ] > So I would assume that we'd use a syntax that presumed the columns were in > a composite range type. > > Which means your creates would look like (following Robert Haas's implied > suggestion that we leave off the string literal quotes): > > CREATE TABLE foo_ax1x PARTITION OF foo FOR VALUES ( , (b,2) ); > CREATE TABLE foo_ax1x PARTITION OF foo FOR VALUES [ (b,2), (b,3) ); > CREATE TABLE foo_ax1x PARTITION OF foo FOR VALUES [ (b,3), (b,4) ); > CREATE TABLE foo_ax1x PARTITION OF foo FOR VALUES [ (b,4), (c,2) ); > CREATE TABLE foo_ax1x PARTITION OF foo FOR VALUES [ (c,2), (c,3) ); > CREATE TABLE foo_ax1x PARTITION OF foo FOR VALUES [ (c,3), (c,4) ); > > That's not terrible looking. So I tried a grammar that looked like the following: range_spec: lb_inc bound ',' bound ub_inc {<create-range-struct>} lb_inc: '[' { $$ = true; } | '(' { $$ = false; } ub_inc: ']' { $$ = true; } | ')' { $$ = false; } bound: a_expr { if (IsA($1, RowExpr)) /* (key1, key2, ...) */ $$ = ((RowExpr) $1)->args; else /* key */ $$ = list_make1($1); } | /* EMPTY */ { $$ = NIL; } Everything seemed to go dandy until I tried FOR VALUES (blah , blah], where psql wouldn't send the command string without accepting the closing parenthesis, :(. So maybe I should try to put the whole thing in '', that is, accept the full range_spec in a string, but then we are back to requiring full-blown range parse function which I was trying to avoid by using the aforementioned grammar. So, I decided to move ahead with the following grammar for time being: START (lower-bound) [ EXCLUSIVE ] | END (upper-bound) [ INCLUSIVE ] | START (lower-bound) [ EXCLUSIVE ] END (upper-bound) [ INCLUSIVE ] Where, *-bound: a_expr | *-bound ',' a_expr Note that in the absence of explicit specification, lower-bound is inclusive and upper-bound is exclusive. So the verbosity couldn't be helped unless accepting range literal in string form and exporting rangetypes.c:range_parse() with range partitioning-specific hacks (consider composite bounds) to parse it are acceptable things. >> IOW, one shouldn't create an unbounded partition if more partitions in the >> unbounded direction are expected to be created. It would be OK for >> unbounded partitions to be on the lower end most of the times. >> > > On this I'll have to disagree. My own use case where I use my > range_partitioning extension starts off with a single partition () and all > new partitions are splits of that. The ranges evolve over time as > partitions grow and slow down. It's nice because we're not trying to > predict where growth will be, we split where growth is. Okay, perhaps I should not presume a certain usage. However, as you know, the usage like yours requires some mechanism of data redistribution (also not without some syntax), which I am not targeting with the initial patch.If that was the only way of creating partitions,matters would be a little easier - you only specify a split point and have some rule about inclusivity around the split point. But we have to start with the usage where each new partition is separately created with explicit partition bound specification that is complete in itself and that's where the logic to check partition invariants may get a bit complicated. > Ok, I'll wait a bit. In the mean time I can tell you a bit about the > existing production system I'm hoping to replicate in true partitioning > looks like this: > > Big Master Table: > Range partition by C collated text > Date Range > Date Range > ... > Range partition by C collated text > Date Range > Date Range > ... > ... > > Currently this is accomplished through my range_partitioning module, and > then using pg_partman on those partitions. It works, but it's a lot of > moving parts. > > The machine will be a 32 core AWS box. As per usual with AWS, it will be > have ample memory and CPU, and be somewhat starved for I/O. > > Question: I haven't dove into the code, but I was curious about your tuple > routing algorithm. Is there any way for the algorithm to begin it's scan of > candidate partitions based on the destination of the last row inserted this > statement? I ask because most use cases (that I am aware of) have data that > would naturally cluster in the same partition. No. Actually the tuple-routing function starts afresh for each row. For range partitions, it's binary search over an array of upper bounds. There is no row-to-row state caching in the partition module itself. Thanks, Amit
pgsql-hackers by date: