Home > mailing lists
Re: SQL:2011 application time - Mailing list pgsql-hackers

From	Paul Jungwirth
Subject	Re: SQL:2011 application time
Date	January 18, 2024 03:59:06
Msg-id	0af6a323-f026-4cd9-adf1-941038cd00de@illuminatedcomputing.com Whole thread Raw
In response to	Re: SQL:2011 application time (jian he <jian.universality@gmail.com>)
Responses	Re: SQL:2011 application time Re: SQL:2011 application time
List	pgsql-hackers
Tree view
Hello,

Here are new patches consolidating feedback from several emails.
I haven't addressed everything but I think I'm overdue for a reply:

On 1/4/24 21:06, jian he wrote:
 >
 > I am confused.
 > say condition: " (cmp_l1l2 <= 0 && cmp_u1l2 >= 0 && cmp_u1u2 <= 0)"
 > the following code will only run PartA, never run PartB?
 >
 > `
 > else if (cmp_l1l2 >= 0 && cmp_u1u2 <= 0)
 >      PartA
 > else if (cmp_l1l2 <= 0 && cmp_u1l2 >= 0 && cmp_u1u2 <= 0)
 >      PartB
 > `
 >
 > minimum example:
 > #include<stdio.h>
 > #include<string.h>
 > #include<stdlib.h>
 > #include<assert.h>
 > int
 > main(void)
 > {
 >      int cmp_l1l2;
 >      int cmp_u1u2;
 >      int cmp_u1l2;
 >      int cmp_l1u2;
 >      cmp_l1u2 = -1;
 >      cmp_l1l2 = 0;
 >      cmp_u1u2 = 0;
 >      cmp_u1l2 = 0;
 >      assert(cmp_u1l2 == 0);
 > if (cmp_l1u2 > 0 || cmp_u1l2 < 0)
 >          printf("calling partA\n");
 >      else if (cmp_l1l2 >= 0 && cmp_u1u2 <= 0)
 >          printf("calling partB\n");
 >      else if (cmp_l1l2 <= 0 && cmp_u1l2 >= 0 && cmp_u1u2 <= 0)
 >          printf("calling partC\n");
 > }

All of the branches are used. I've attached a `without_portion.c` minimal example showing different 
cases. For ranges it helps to go through the Allen relationships 
(https://en.wikipedia.org/wiki/Allen%27s_interval_algebra) to make a comprehensive check. (But note 
that our operators don't exactly match that terminology, and it's important to consider 
closed-vs-open and unbounded cases.)

 > I am confused with the name "range_without_portion", I think
 > "range_not_overlap" would be better.

I think I covered this in my other reply and we are now in agreement, but if that's mistaken let 
know me.

 > select numrange(1.1, 2.2) @- numrange(2.0, 3.0);
 > the result is not the same as
 > select numrange(2.0, 3.0) @- numrange(1.1, 2.2);

Correct, @- is not commutative.

 > So your categorize oprkind as 'b' for operator "@-" is wrong?
 > select oprname,oprkind,oprcanhash,oprcanmerge,oprleft,oprright,oprresult,oprcode
 > from pg_operator
 > where oprname = '@-';

'b' is the correct oprkind. It is a binary (infix) operator.

 > aslo
 > select count(*), oprkind from pg_operator group by oprkind;
 > there are only 5% are prefix operators.
 > maybe we should design it as:
 > 1. if both inputs are empty range, the result array is empty.
 > 2. if both inputs are non-empty and never overlaps, put both of them
 > to the result array.
 > 3. if one input is empty another one is not, then put the non-empty
 > one into the result array.

Also covered before, but if any of this still applies please let me know.

 > after applying the patch: now the catalog data seems not correct to me.
 > SELECT  a1.amopfamily
 >          ,a1.amoplefttype::regtype
 >          ,a1.amoprighttype
 >          ,a1.amopstrategy
 >          ,amoppurpose
 >          ,amopsortfamily
 >          ,amopopr
 >          ,op.oprname
 >          ,am.amname
 > FROM    pg_amop as a1 join pg_operator op on op.oid = a1.amopopr
 > join    pg_am   am on am.oid = a1.amopmethod
 > where   amoppurpose = 'p';
 > output:
 >   amopfamily | amoplefttype  | amoprighttype | amopstrategy |
 > amoppurpose | amopsortfamily | amopopr | oprname | amname
 > 

------------+---------------+---------------+--------------+-------------+----------------+---------+---------+--------
 >         2593 | box           |           603 |           31 | p
 >    |              0 |     803 | #       | gist
 >         3919 | anyrange      |          3831 |           31 | p
 >    |              0 |    3900 | *       | gist
 >         6158 | anymultirange |          4537 |           31 | p
 >    |              0 |    4394 | *       | gist
 >         3919 | anyrange      |          3831 |           32 | p
 >    |              0 |    8747 | @-      | gist
 >         6158 | anymultirange |          4537 |           32 | p
 >    |              0 |    8407 | @-      | gist
 > (5 rows)
 >
 > select  oprcode, oprname, oprleft::regtype
 > from    pg_operator opr
 > where   opr.oprname in ('#','*','@-')
 > and     oprleft = oprright
 > and     oprleft in (603,3831,4537);
 > output:
 >
 >            oprcode           | oprname |    oprleft
 > ----------------------------+---------+---------------
 >   box_intersect              | #       | box
 >   range_intersect            | *       | anyrange
 >   multirange_intersect       | *       | anymultirange
 >   range_without_portion      | @-      | anyrange
 >   multirange_without_portion | @-      | anymultirange
 > (5 rows)

This seems correct. '#' is the name of the box overlaps operator. Probably I should add a box @- 
operator too. But see below. . . .

 > should amoppurpose = 'p' is true apply to ' @-' operator?

Yes.

 > catalog-pg-amop.html:
 > `
 > amopsortfamily oid (references pg_opfamily.oid):
 > The B-tree operator family this entry sorts according to, if an
 > ordering operator; zero if a search operator
 > `
 > you should also update the above entry, the amopsortfamily is also
 > zero for "portion operator" for the newly implemented "portion
 > operator".

Okay, done.

 > v21-0006-Add-UPDATE-DELETE-FOR-PORTION-OF.patch
 >   create mode 100644 src/backend/utils/adt/period.c
 >   create mode 100644 src/include/utils/period.h
 > you should put these two files to v21-0008-Add-PERIODs.patch.
 > it's not related to that patch, it also makes people easy to review.

You're right, sorry!

On 1/8/24 16:00, jian he wrote:
 >
 > +/*
 > + * ForPortionOfClause
 > + * representation of FOR PORTION OF <period-name> FROM <ts> TO <te>
 > + * or FOR PORTION OF <period-name> (<target>)
 > + */
 > +typedef struct ForPortionOfClause
 > +{
 > + NodeTag type;
 > + char   *range_name;
 > + int range_name_location;
 > + Node   *target;
 > + Node   *target_start;
 > + Node   *target_end;
 > +} ForPortionOfClause;
 >
 > "range_name_location" can be just "location"?
 > generally most of the struct put the "location" to the last field in the struct.
 > (that's the pattern I found all over other code)

Agreed, done.

 > + if (isUpdate)
 > + {
 > + /*
 > + * Now make sure we update the start/end time of the record.
 > + * For a range col (r) this is `r = r * targetRange`.
 > + */
 > + Expr *rangeSetExpr;
 > + TargetEntry *tle;
 > +
 > + strat = RTIntersectStrategyNumber;
 > + GetOperatorFromCanonicalStrategy(opclass, InvalidOid, "intersects",
 > "FOR PORTION OF", &opid, &strat);
 > + rangeSetExpr = (Expr *) makeSimpleA_Expr(AEXPR_OP, get_opname(opid),
 > + (Node *) copyObject(rangeVar), targetExpr,
 > + forPortionOf->range_name_location);
 > + rangeSetExpr = (Expr *) transformExpr(pstate, (Node *) rangeSetExpr,
 > EXPR_KIND_UPDATE_PORTION);
 > +
 > + /* Make a TLE to set the range column */
 > + result->rangeSet = NIL;
 > + tle = makeTargetEntry(rangeSetExpr, range_attno, range_name, false);
 > + result->rangeSet = lappend(result->rangeSet, tle);
 > +
 > + /* Mark the range column as requiring update permissions */
 > + target_perminfo->updatedCols = bms_add_member(target_perminfo->updatedCols,
 > +  range_attno - FirstLowInvalidHeapAttributeNumber);
 > + }
 > + else
 > + result->rangeSet = NIL;
 > I think the name "rangeSet" is misleading, since "set" is generally
 > related to a set of records.
 > but here it's more about the "range intersect".

Okay, I can see that. I used "rangeSet" because we add it to the SET clause of the UPDATE command. 
Here I've changed it to rangeTargetList. I think this matches other code and better indicates what 
it holds. Any objections?

In the PERIOD patch we will need two TLEs here (that's why it's a List): one for the start column 
and one for the end column.

 > in ExecDelete
 > we have following code pattern:
 > ExecDeleteEpilogue(context, resultRelInfo, tupleid, oldtuple, changingPart);
 > if (processReturning && resultRelInfo->ri_projectReturning)
 > {
 > ....
 > if (!table_tuple_fetch_row_version(resultRelationDesc, tupleid,
 >     SnapshotAny, slot))
 > elog(ERROR, "failed to fetch deleted tuple for DELETE RETURNING");
 > }
 > }
 >
 > but the ExecForPortionOfLeftovers is inside ExecDeleteEpilogue.
 > meaning even without ExecForPortionOfLeftovers, we can still call
 > table_tuple_fetch_row_version
 > also if it was *not* concurrently updated, then our current process
 > holds the lock until the ending of the transaction, i think.
 > So the following TODO is unnecessary?
 >
 > + /*
 > + * Get the range of the old pre-UPDATE/DELETE tuple,
 > + * so we can intersect it with the FOR PORTION OF target
 > + * and see if there are any "leftovers" to insert.
 > + *
 > + * We have already locked the tuple in ExecUpdate/ExecDelete
 > + * (TODO: if it was *not* concurrently updated, does
 > table_tuple_update lock the tuple itself?
 > + * I don't found the code for that yet, and maybe it depends on the AM?)
 > + * and it has passed EvalPlanQual.
 > + * Make sure we're looking at the most recent version.
 > + * Otherwise concurrent updates of the same tuple in READ COMMITTED
 > + * could insert conflicting "leftovers".
 > + */
 > + if (!table_tuple_fetch_row_version(resultRelInfo->ri_RelationDesc,
 > tupleid, SnapshotAny, oldtupleSlot))
 > + elog(ERROR, "failed to fetch tuple for FOR PORTION OF");

I think you're right. According to the comments on TM_Result (returned by table_tuple_update), a 
TM_Ok indicates that the lock was acquired.

 > +/* ----------------------------------------------------------------
 > + * ExecForPortionOfLeftovers
 > + *
 > + * Insert tuples for the untouched timestamp of a row in a FOR
 > + * PORTION OF UPDATE/DELETE
 > + * ----------------------------------------------------------------
 > + */
 > +static void
 > +ExecForPortionOfLeftovers(ModifyTableContext *context,
 > +   EState *estate,
 > +   ResultRelInfo *resultRelInfo,
 > +   ItemPointer tupleid)
 >
 > maybe change the comment to
 > "Insert tuples for the not intersection of a row in a FOR PORTION OF
 > UPDATE/DELETE."

Changed to "untouched portion".

 > + deconstruct_array(DatumGetArrayTypeP(allLeftovers),
 > typcache->type_id, typcache->typlen,
 > +   typcache->typbyval, typcache->typalign, &leftovers, NULL, &nleftovers);
 > +
 > + if (nleftovers > 0)
 > + {
 > I think add something like assert nleftovers >=0 && nleftovers <= 2
 > (assume only range not multirange) would improve readability.

I added the first assert. The second is not true for non-range types.

 > +  <para>
 > +   If the table has a range column or
 > +   <link linkend="ddl-periods-application-periods"><literal>PERIOD</literal></link>,
 > +   you may supply a <literal>FOR PORTION OF</literal> clause, and
 > your delete will
 > +   only affect rows that overlap the given interval. Furthermore, if
 > a row's span
 > +   extends outside the <literal>FOR PORTION OF</literal> bounds, then
 > your delete
 > +   will only change the span within those bounds. In effect you are
 > deleting any
 > +   moment targeted by <literal>FOR PORTION OF</literal> and no moments outside.
 > +  </para>
 > +
 > +  <para>
 > +   Specifically, after <productname>PostgreSQL</productname> deletes
 > the existing row,
 > +   it will <literal>INSERT</literal>
 > +   new rows whose range or start/end column(s) receive the remaining
 > span outside
 > +   the targeted bounds, containing the original values in other columns.
 > +   There will be zero to two inserted records,
 > +   depending on whether the original span extended before the targeted
 > +   <literal>FROM</literal>, after the targeted <literal>TO</literal>,
 > both, or neither.
 > +  </para>
 > +
 > +  <para>
 > +   These secondary inserts fire <literal>INSERT</literal> triggers. First
 > +   <literal>BEFORE DELETE</literal> triggers first, then
 > +   <literal>BEFORE INSERT</literal>, then <literal>AFTER INSERT</literal>,
 > +   then <literal>AFTER DELETE</literal>.
 > +  </para>
 > +
 > +  <para>
 > +   These secondary inserts do not require <literal>INSERT</literal>
 > privilege on the table.
 > +   This is because conceptually no new information has been added.
 > The inserted rows only preserve
 > +   existing data about the untargeted time period. Note this may
 > result in users firing <literal>INSERT</literal>
 > +   triggers who don't have insert privileges, so be careful about
 > <literal>SECURITY DEFINER</literal> trigger functions!
 > +  </para>
 >
 > I think you need to wrap them into a big paragraph, otherwise they
 > lose the context?
 > please see the attached build sql-update.html.

Still TODO.

 > also I think
 > +   <link linkend="ddl-periods-application-periods"><literal>PERIOD</literal></link>,
 > should shove into Add-PERIODs.patch.
 >
 > otherwise you cannot build  Add-UPDATE-DELETE-FOR-PORTION-OF.patch
 > without all the patches.

Fixed.

 > I think the "FOR-PORTION-OF" feature is kind of independ?
 > Because, IMHO, "for portion" is a range datum interacting with another
 > single range datum, but the primary key with  "WITHOUT OVERLAPS", is
 > range datum interacting with a set of range datums.
 > now I cannot  just git apply v22-0006-Add-UPDATE-DELETE-FOR-PORTION-OF.patch.
 > That maybe would make it more difficult to get commited?

Still TODO.

On 1/8/24 21:33, jian he wrote:
 >
 > src5=# select range_without_portion(numrange(1.0,3.0,'[]'),
 > numrange(1.5,2.0,'(]'));
 >     range_without_portion
 > ---------------------------
 >   {"[1.0,1.5]","(2.0,3.0]"}
 > (1 row)
 >
 > src5=# \gdesc
 >          Column         |   Type
 > -----------------------+-----------
 >   range_without_portion | numeric[]
 > (1 row)
 >
 > src5=# \df range_without_portion
 >                                   List of functions
 >     Schema   |         Name          | Result data type | Argument data
 > types | Type
 > ------------+-----------------------+------------------+---------------------+------
 >   pg_catalog | range_without_portion | anyarray         | anyrange,
 > anyrange  | func
 > (1 row)
 >
 > so apparently, you cannot from (anyrange, anyrange) get anyarray the
 > element type is anyrange.
 > I cannot find the documented explanation in
 > https://www.postgresql.org/docs/current/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC
 >
 > anyrange is POLYMORPHIC, anyarray is POLYMORPHIC,
 > but I suppose, getting an anyarray the element type is anyrange would be hard.

You're right, that is a problem.

I think the right approach is to make intersect and without_portion just be support functions, not 
operators. Then I don't need to introduce the new 'p' amop strategy at all, which seemed like a 
dubious idea anyway. Then the without_portion function can return a SETOF instead of an array.

Another idea is to add more polymorphic types, anyrangearray and anymultirangearray, but maybe that 
is too big a thing. OTOH I have wanted those same types before. I will take a stab at it.

On 1/11/24 06:44, Peter Eisentraut wrote:
 > Here is some more detailed review of the first two patches.  (I reviewed v20; I see you have also
 > posted v21, but they don't appear very different for this purpose.)
 >
 > v20-0001-Add-stratnum-GiST-support-function.patch
 >
 > * contrib/btree_gist/Makefile
 >
 > Needs corresponding meson.build updates.

Fixed.

 > * contrib/btree_gist/btree_gist--1.7--1.8.sql
 >
 > Should gist_stratnum_btree() live in contrib/btree_gist/ or in core?
 > Are there other extensions that use the btree strategy numbers for
 > gist?

Moved. None of our other contrib extensions use it. I thought it would be friendly to offer it to 
outside extensions, but maybe that is too speculative.

 > +ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD
 > +   FUNCTION  12 (varbit, varbit) gist_stratnum_btree (int2) ;
 >
 > Is there a reason for the extra space after FUNCTION here (repeated
 > throughout the file)?

Fixed.

 > +-- added in 1.4:
 >
 > What is the purpose of these "added in" comments?

I added those to help me make sure I was including every type in the extension, but I've taken them 
out here.

 > v20-0002-Add-temporal-PRIMARY-KEY-and-UNIQUE-constraints.patch
 >
 > * contrib/btree_gist/Makefile
 >
 > Also update meson.build.

Done.

 > * contrib/btree_gist/sql/without_overlaps.sql
 >
 > Maybe also insert a few values, to verify that the constraint actually
 > does something?

Done.

 > * doc/src/sgml/ref/create_table.sgml
 >
 > Is "must have a range type" still true?  With the changes to the
 > strategy number mapping, any type with a supported operator class
 > should work?

Updated. Probably more docs to come; I want to go through them all now that we support more types.

 > * src/backend/utils/adt/ruleutils.c
 >
 > Is it actually useful to add an argument to
 > decompile_column_index_array()?  Wouldn't it be easier to just print
 > the " WITHOUT OVERLAPS" in the caller after returning from it?

Okay, done.

 > * src/include/access/gist_private.h
 >
 > The added function gistTranslateStratnum() isn't really "private" to
 > gist.  So access/gist.h would be a better place for it.

Moved.

 > Also, most other functions there appear to be named "GistSomething",
 > so a more consistent name might be GistTranslateStratnum.
 >
 > * src/include/access/stratnum.h

Changed.

 > The added StrategyIsValid() doesn't seem that useful?  Plenty of
 > existing code just compares against InvalidStrategy, and there is only
 > one caller for the new function.  I suggest to do without it.
 >
 > * src/include/commands/defrem.h

Okay, removed.

 > We are using two terms here, well-known strategy number and canonical
 > strategy number, to mean the same thing (I think?).  Let's try to
 > stick with one.  Or explain the relationship?

True. Changed everything to "well-known" which seems like a better match for what's going on.

I haven't gone through jian he's Jan 13 patch yet, but since he was also implementing Peter's 
requests I thought I should share what I have. I did this work a while ago, but I was hoping to 
finish the TODOs above first, and then we got hit with a winter storm that knocked out power. Sorry 
to cause duplicate work!

Rebased to 2f35c14cfb.

Yours,

-- 
Paul              ~{:-)
pj@illuminatedcomputing.com
Attachment

pgsql-hackers by date:
From: Shubham Khanna
Date: 18 January 2024, 03:54:36
Subject: Re: Fix search_path for all maintenance commands
From: David Rowley
Date: 18 January 2024, 04:08:33
Subject: Re: Strange Bitmapset manipulation in DiscreteKnapsack()
Re: SQL:2011 application time - Mailing list pgsql-hackers

Attachment

Previous

Next