User defined window functions calling WinGetFuncArgInPartition/WINDOW_SEEK_HEAD - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject User defined window functions calling WinGetFuncArgInPartition/WINDOW_SEEK_HEAD
Date
Msg-id 20251009.150858.1461274753395856075.ishii@postgresql.org
Whole thread Raw
List pgsql-hackers
Hi Oliver,

PostGis has user defined window functions. I checked them and found
they call WinGetFuncArgInPartition with WINDOW_SEEK_HEAD.
https://gitea.osgeo.org/postgis/postgis/search?q=WinGetFuncArgInPartition&fuzzy=true

If they decide to call the window functions with IGNORE NULLS option,
I think they may be surprised that
WinGetFuncArgInPartition/WINDOW_SEEK_HEAD returns somewhat
counter intuitive results.

To study this I created a tiny user defined window function which
calls WinGetFuncArgInPartition with WINDOW_SEEK_HEAD (see attached).

CREATE FUNCTION mywindowfunc(IN anycompatible, IN int)
RETURNS anycompatible
AS 'MODULE_PATHNAME', 'mywindowfunc'
LANGUAGE C WINDOW STRICT;

The first argument is an any expression and the second argument is
passed to WinGetFuncArgInPartition's relpos argument.  Here is the
result:

CREATE TEMP TABLE g(x INT, y INT);
CREATE TABLE
INSERT INTO g (VALUES(NULL,1),(NULL,2),(10,3),(20,4));
INSERT 0 4
SELECT * FROM g;
 x  | y 
----+---
    | 1
    | 2
 10 | 3
 20 | 4
(4 rows)

SELECT x, y, mywindowfunc(x, 0) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x  | y | mywindowfunc 
----+---+--------------
    | 1 |             
    | 2 |             
 10 | 3 |             
 20 | 4 |             
(4 rows)

SELECT x, y, mywindowfunc(x, 1) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x  | y | mywindowfunc 
----+---+--------------
    | 1 |           10
    | 2 |           10
 10 | 3 |           10
 20 | 4 |           10
(4 rows)

SELECT x, y, mywindowfunc(x, 2) IGNORE NULLS OVER w FROM g
WINDOW w AS (ORDER BY y);
 x  | y | mywindowfunc 
----+---+--------------
    | 1 |           20
    | 2 |           20
 10 | 3 |           20
 20 | 4 |           20
(4 rows)

For relpos==0, 1, 2 it returns NULL, 10, 20 on the first result row.

I can explain why for relpos==0, NULL is returned. If IGNORE NULLS and
relpos==0, it is treated as if RESPECT NULLS is specified. But this
may be counterintuitive result.

reslpos==1, 2 case also seems strange to me. I expected for relpos==1,
the second non null row (20) is returned, but it returns 10. For
relpos==2, I expected NULL since there's no third non null row. But
others may think differently.

Since there's no such an idea like WINDOW_SEEK_HEAD in the SQL
standard, I think we have to decide what to do with WINDOW_SEEK_HEAD
case. One idea is, we treat it as nth_value with "ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING" frame option.
With nth_value we get 10, 20 and NULL for offset 1, 2 and 3.

What do you think?

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Logical Replication of sequences
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Patch for migration of the pg_commit_ts directory