Bundle of patches - Mailing list pgsql-patches
From | Teodor Sigaev |
---|---|
Subject | Bundle of patches |
Date | |
Msg-id | 45742C51.9020602@sigaev.ru Whole thread Raw |
Responses |
Re: Bundle of patches
Re: Bundle of patches Re: Bundle of patches Re: Bundle of patches Re: Bundle of patches |
List | pgsql-patches |
The 1C (http://www.1c.ru/) company kindly permits to publish a set of patches we (Oleg and me) developed during our work on porting of the '1C:Enterprise' system to support the PostgreSQL database. We would like to suggest to commit they to HEAD. 1) Typmod for user-defined types http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz Patch is based on ideas from http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php Patch adds to type definition two optional function: typmodinput and typmodoutput. That allows to develop user-defined types with type's modificators. Built-in types use typmod input/output functions too. Typmod internally is represented as non-negative int4 value, but patch allows to point list of integer in type definition. So, NUMERIC type works with a help of typmodin/typmodout function. 2) ORDER BY .. [ NULLS ( FIRST | LAST ) ] http://www.sigaev.ru/misc/NULLS_82-0.5.gz Allow to sort NULLs as greater or lesser than any value. The goal was to simplificate migrations from MySQL/MS SQL which think that NULL is less. Also, syntax conforms to SQL2003. It operate on gram.y level, and adds 'field is [not] null' qualification to sortClause. Note, to allow queries like 'select .. union .. order by f nulls first' pgsql now can rewrite that query to 'select * from (select .. union ..) order by f nulls first'. This solves the problem with 'resjunk' column in SelectStmt->sortClause. 3) Allow to use index for IS [NOT] NULL http://www.sigaev.ru/misc/indexnulls_82-0.6.gz Initially patch was developed by Martijn van Oosterhout <kleptog@svana.org>. But it's reworked and support of searching NULLS to GiST too. Patch adds new column named amsearchnull to pg_am. To recognize IS NULL clause ScanKey->sk_flags contains (SK_ISNULL & SK_INDEXFINDNULL) and ScanKey->sk_strategy == BTEqualStrategyNumber. For IS NOT NULL, ScanKey->sk_strategy == BTLessStrategyNumber. Thats because NULLs are treated greater than any value. It might be look some odd that for IS [NOT] NULL clauses we use Btree strategy numbers even for GiST, but if sk_flags contains SK_ISNULL then we never call user-defined functions. 4) OR clauses optimizations http://www.sigaev.ru/misc/OR_82-0.6.gz Patch can suggest new indexpaths to optimizer for ORed clauses. Patch uses generate_bitmapscan and predicate_implied_by/predicate_refuted_by machineries 4.1) Allow any useful common restriction clauses to be extracted from OR-of-AND quals. Also, it allows to combine several different operations to one which can be used in index scan. SELECT a, b FROM tst WHERE ( a = 50000 ) OR ( a > 50000 AND b > 50000 ) ORDER BY a, b LIMIT 20; Limit (cost=0.00..2.95 rows=20 width=8) (actual time=0.271..0.677 rows=20 loops=1) -> Index Scan using abidx on tst (cost=0.00..3671.26 rows=24878 width=8) (actual time=0.265..0.611 rows=20 loops=1) Index Cond: (a >= 50000) Filter: ((a = 50000) OR ((a > 50000) AND (b > 50000))) 4.2) When OR clauses aren't intersect and use the same index, it's possible to just concatenate results of indexscans. For that, now postgres may use Append node. Append node is modified to have a pathkeys. SELECT a FROM tst WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 ) ORDER BY ASC LIMIT 20; Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.364..0.883 rows=20 loops=1) -> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.359..0.824 rows=20 loops=1) -> Append (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.349..0.742 rows=20 loops=1) -> Index Scan using aidx on tst (cost=0.00..2000.42 rows=990 width=4) (actual time=0.346..0.684 rows=20 loops=1) Index Cond: ((a > 20000) AND (a < 21000)) -> Index Scan using aidx on tst (cost=0.00..2001.12 rows=1018 width=4) (never executed) Index Cond: ((a > 60000) AND (a < 61000)) Also, if there is a 'ORDER BY' clause, childs nodes may be ordered by theys ranges (compare plan with previous one). SELECT a FROM tst WHERE ( a > 60000 AND a < 61000 ) OR ( a > 20000 AND a < 21000 ) ORDER BY a DESC LIMIT 20; Limit (cost=0.00..39.86 rows=20 width=4) (actual time=0.162..0.651 rows=20 loops=1) -> Result (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.157..0.589 rows=20 loops=1) -> Append (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.149..0.511 rows=20 loops=1) -> Index Scan Backward using aidx on tst (cost=0.00..2001.12 rows=1018 width=4) (actual time=0.145..0.450 rows=20 loops=1) Index Cond: ((a > 60000) AND (a < 61000)) -> Index Scan Backward using aidx on tst (cost=0.00..2000.42 rows=990 width=4) (never executed) Index Cond: ((a > 20000) AND (a < 21000)) 4.3) As side effect of previous point, overlapped clauses can be eliminated: SELECT a FROM tst WHERE ( a > 50000 AND a < 61000 ) OR ( a > 60000 AND a < 60100 ) ORDER BY a LIMIT 20; Limit (cost=0.00..4.14 rows=20 width=4) (actual time=0.168..1.001 rows=20 loops=1) -> Index Scan using aidx on tst (cost=0.00..2344.85 rows=11338 width=4) (actual time=0.162..0.935 rows=20 loops=1) Index Cond: ((a > 50000) AND (a < 61000)) -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
pgsql-patches by date: