Re: [HACKERS] [POC] hash partitioning - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: [HACKERS] [POC] hash partitioning |
Date | |
Msg-id | cdd10cdc-0c92-8294-db56-3e89b65604b6@lab.ntt.co.jp Whole thread Raw |
In response to | Re: [HACKERS] [POC] hash partitioning (Jesper Pedersen <jesper.pedersen@redhat.com>) |
Responses |
Re: [HACKERS] [POC] hash partitioning
Re: [HACKERS] [POC] hash partitioning |
List | pgsql-hackers |
On 2017/09/27 22:41, Jesper Pedersen wrote: > On 09/27/2017 03:05 AM, amul sul wrote: >>>> Attached rebased patch, thanks. >>>> >>>> >>> While reading through the patch I thought it would be better to keep >>> MODULUS and REMAINDER in caps, if CREATE TABLE was in caps too in order to >>> highlight that these are "keywords" for hash partition. >>> >>> Also updated some of the documentation. >>> >>> >> Thanks a lot for the patch, included in the attached version. >> > > Thank you. > > Based on [1] I have moved the patch to "Ready for Committer". Thanks a lot Amul for working on this. Like Jesper said, the patch looks pretty good overall. I was looking at the latest version with intent to study certain things about hash partitioning the way patch implements it, during which I noticed some things. + The modulus must be a positive integer, and the remainder must a must be a + suppose you have a hash-partitioned table with 8 children, each of which + has modulus 8, but find it necessary to increase the number of partitions + to 16. Might it be a good idea to say 8 "partitions" instead of "children" in the first sentence? + each modulus-8 partition until none remain. While this may still involve + a large amount of data movement at each step, it is still better than + having to create a whole new table and move all the data at once. + </para> + I read the paragraph that ends with the above text and started wondering if the example to redistribute data in hash partitions by detaching and attaching with new modulus/remainder could be illustrated with an example? Maybe in the examples section of the ALTER TABLE page? + Since hash operator class provide only equality, not ordering, collation Either "Since hash operator classes provide" or "Since hash operator class provides" Other than the above points, patch looks good. By the way, I noticed a couple of things about hash partition constraints: 1. In get_qual_for_hash(), using get_fn_expr_rettype(&key->partsupfunc[i]), which returns InvalidOid for the lack of fn_expr being set to non-NULL value, causes funcrettype of the FuncExpr being generated for hashing partition key columns to be set to InvalidOid, which I think is wrong. That is, the following if condition in get_fn_expr_rettype() is always satisfied: if (!flinfo || !flinfo->fn_expr) return InvalidOid; I think we could use get_func_rettype(&key->partsupfunc[i].fn_oid) instead. Attached patch hash-v21-set-funcexpr-funcrettype-correctly.patch, which applies on top v21 of your patch. 2. It seems that the reason constraint exclusion doesn't work with hash partitions as implemented by the patch is that predtest.c: operator_predicate_proof() returns false even without looking into the hash partition constraint, which is of the following form: satisfies_hash_partition(<mod>, <rem>, <key1-exthash>,..) beccause the above constraint expression doesn't translate into a a binary opclause (an OpExpr), which operator_predicate_proof() knows how to work with. So, false is returned at the beginning of that function by the following code: if (!is_opclause(predicate)) return false; For example, create table p (a int) partition by hash (a); create table p0 partition of p for values with (modulus 4, remainder 0); create table p1 partition of p for values with (modulus 4, remainder 1); \d+ p0 <...> Partition constraint: satisfies_hash_partition(4, 0, hashint4extended(a, '8816678312871386367'::bigint)) -- both p0 and p1 scanned explain select * from p where satisfies_hash_partition(4, 0, hashint4extended(a, '8816678312871386367'::bigint)); QUERY PLAN ---------------------------------------------------------------------------------------------------- Append (cost=0.00..96.50 rows=1700 width=4) -> Seq Scan on p0 (cost=0.00..48.25 rows=850 width=4) Filter: satisfies_hash_partition(4, 0, hashint4extended(a, '8816678312871386367'::bigint)) -> Seq Scan on p1 (cost=0.00..48.25 rows=850 width=4) Filter: satisfies_hash_partition(4, 0, hashint4extended(a, '8816678312871386367'::bigint)) (5 rows) -- both p0 and p1 scanned explain select * from p where satisfies_hash_partition(4, 1, hashint4extended(a, '8816678312871386367'::bigint)); QUERY PLAN ---------------------------------------------------------------------------------------------------- Append (cost=0.00..96.50 rows=1700 width=4) -> Seq Scan on p0 (cost=0.00..48.25 rows=850 width=4) Filter: satisfies_hash_partition(4, 1, hashint4extended(a, '8816678312871386367'::bigint)) -> Seq Scan on p1 (cost=0.00..48.25 rows=850 width=4) Filter: satisfies_hash_partition(4, 1, hashint4extended(a, '8816678312871386367'::bigint)) (5 rows) I looked into how satisfies_hash_partition() works and came up with an idea that I think will make constraint exclusion work. What if we emitted the hash partition constraint in the following form instead: hash_partition_mod(hash_partition_hash(key1-exthash, key2-exthash), <mod>) = <rem> With that form, constraint exclusion seems to work as illustrated below: \d+ p0 <...> Partition constraint: (hash_partition_modulus(hash_partition_hash(hashint4extended(a, '8816678312871386367'::bigint)), 4) = 0) -- note only p0 is scanned explain select * from p where hash_partition_modulus(hash_partition_hash(hashint4extended(a, '8816678312871386367'::bigint)), 4) = 0; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Append (cost=0.00..61.00 rows=13 width=4) -> Seq Scan on p0 (cost=0.00..61.00 rows=13 width=4) Filter: (hash_partition_modulus(hash_partition_hash(hashint4extended(a, '8816678312871386367'::bigint)), 4) = 0) (3 rows) -- note only p1 is scanned explain select * from p where hash_partition_modulus(hash_partition_hash(hashint4extended(a, '8816678312871386367'::bigint)), 4) = 1; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Append (cost=0.00..61.00 rows=13 width=4) -> Seq Scan on p1 (cost=0.00..61.00 rows=13 width=4) Filter: (hash_partition_modulus(hash_partition_hash(hashint4extended(a, '8816678312871386367'::bigint)), 4) = 1) (3 rows) I tried to implement that in the attached hash-v21-hash-part-constraint.patch, which applies on top v21 of your patch (actually on top of hash-v21-set-funcexpr-funcrettype-correctly.patch, which I think should be applied anyway as it fixes a bug of the original patch). What do you think? Eventually, the new partition-pruning method [1] will make using constraint exclusion obsolete, but it might be a good idea to have it working if we can. Thanks, Amit [1] https://commitfest.postgresql.org/14/1272/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: