Re: knngist - 0.8 - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: knngist - 0.8 |
Date | |
Msg-id | 9909.1298009274@sss.pgh.pa.us Whole thread Raw |
In response to | Re: knngist - 0.8 (Teodor Sigaev <teodor@sigaev.ru>) |
Responses |
Re: knngist - 0.8
Re: knngist - 0.8 |
List | pgsql-hackers |
Teodor Sigaev <teodor@sigaev.ru> writes: >> I've applied all of this, and written documentation for all of it, > Thank you a lot >> except for the contrib/btree_gist additions which still need to be >> redone for the revised API (and then documented!). My patience ran out > Done, btree_gist is reworked for a new API. I did a quick look at this patch. The major problem with it is of course that it needs to be fixed for the recent extension-related changes. I transposed the .sql.in changes into additions to btree_gist--1.0.sql (attached), but haven't really sanity-checked them beyond checking that the regression tests pass. The same mods would need to be made in btree_gist--unpackaged--1.0.sql. However, I feel that this is not ready to apply even given those fixes. Problems yet to solve: 1. oid_dist() returns oid ... really? Oid is unsigned. I'd be inclined to argue though that distance between Oids is a meaningless concept, so you should remove this not just mess with the result type. Anybody who actually wants to form a distance between Oids should have to cast them to an arithmetic type first. Let the user figure out how wraparound cases should be handled. 2. Beyond that, none of the distance routines have given any thought to avoiding overflow. For instance, dist_int2 had better return something wider than int2, and so on up. It looks to me like the internal gist distance functions also suffer overflow risks, in that they tend to form the difference first (in the source datatype) and only afterwards cast to float8. 3. I was surprised that there wasn't a distance implementation for numeric. I suppose that this might be difficult to do without risking overflow in conversion to float8, though. 4. I didn't much care for changing the result type of gbt_num_consistent from bool to float8; that's just messy, and I don't see any compensating advantage. I suggest you leave gbt_num_consistent and its callers alone, and add a separate gbt_num_distance routine that only handles the KNNDistance case. There might be more issues, I haven't read the patch in detail. But anyway, I'm going to set it to Waiting on Author. I think it needs at least a day or so's work, and I can't put in that kind of time on it now. regards, tom lane diff --git a/contrib/btree_gist/btree_gist--1.0.sql b/contrib/btree_gist/btree_gist--1.0.sql index 1ea5fa2db418d06f991f362a956fec07ccfba9e9..dd428995c185a09519ed65433081752b83b71bb7 100644 *** a/contrib/btree_gist/btree_gist--1.0.sql --- b/contrib/btree_gist/btree_gist--1.0.sql *************** CREATE TYPE gbtreekey_var ( *** 81,86 **** --- 81,231 ---- STORAGE = EXTENDED ); + --distance operators + + CREATE FUNCTION cash_dist(money, money) + RETURNS money + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = money, + RIGHTARG = money, + PROCEDURE = cash_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION date_dist(date, date) + RETURNS int4 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = date, + RIGHTARG = date, + PROCEDURE = date_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION float4_dist(float4, float4) + RETURNS float4 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = float4, + RIGHTARG = float4, + PROCEDURE = float4_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION float8_dist(float8, float8) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = float8, + RIGHTARG = float8, + PROCEDURE = float8_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION int2_dist(int2, int2) + RETURNS int2 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = int2, + RIGHTARG = int2, + PROCEDURE = int2_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION int4_dist(int4, int4) + RETURNS int4 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = int4, + RIGHTARG = int4, + PROCEDURE = int4_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION int8_dist(int8, int8) + RETURNS int8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = int8, + RIGHTARG = int8, + PROCEDURE = int8_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION interval_dist(interval, interval) + RETURNS interval + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = interval, + RIGHTARG = interval, + PROCEDURE = interval_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION oid_dist(oid, oid) + RETURNS oid + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = oid, + RIGHTARG = oid, + PROCEDURE = oid_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION time_dist(time, time) + RETURNS interval + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = time, + RIGHTARG = time, + PROCEDURE = time_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION ts_dist(timestamp, timestamp) + RETURNS interval + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = timestamp, + RIGHTARG = timestamp, + PROCEDURE = ts_dist, + COMMUTATOR = '<->' + ); + + CREATE FUNCTION tstz_dist(timestamptz, timestamptz) + RETURNS interval + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + + CREATE OPERATOR <-> ( + LEFTARG = timestamptz, + RIGHTARG = timestamptz, + PROCEDURE = tstz_dist, + COMMUTATOR = '<->' + ); -- *************** RETURNS bool *** 96,101 **** --- 241,251 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_oid_distance(internal,oid,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_oid_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 154,160 **** -- that's the only state that can be reproduced during an upgrade from 9.0. ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD ! OPERATOR 6 <> (oid, oid) ; -- --- 304,312 ---- -- that's the only state that can be reproduced during an upgrade from 9.0. ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD ! OPERATOR 6 <> (oid, oid) , ! OPERATOR 15 <-> (oid, oid) FOR ORDER BY pg_catalog.oid_ops , ! FUNCTION 8 (oid, oid) gbt_oid_distance (internal, oid, int2, oid) ; -- *************** RETURNS bool *** 170,175 **** --- 322,332 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_int2_distance(internal,int2,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_int2_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 214,220 **** STORAGE gbtreekey4; ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD ! OPERATOR 6 <> (int2, int2) ; -- --- 371,379 ---- STORAGE gbtreekey4; ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD ! OPERATOR 6 <> (int2, int2) , ! OPERATOR 15 <-> (int2, int2) FOR ORDER BY pg_catalog.integer_ops , ! FUNCTION 8 (int2, int2) gbt_int2_distance (internal, int2, int2, oid) ; -- *************** RETURNS bool *** 230,235 **** --- 389,399 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_int4_distance(internal,int4,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_int4_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 274,280 **** STORAGE gbtreekey8; ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD ! OPERATOR 6 <> (int4, int4) ; -- --- 438,446 ---- STORAGE gbtreekey8; ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD ! OPERATOR 6 <> (int4, int4) , ! OPERATOR 15 <-> (int4, int4) FOR ORDER BY pg_catalog.integer_ops , ! FUNCTION 8 (int4, int4) gbt_int4_distance (internal, int4, int2, oid) ; -- *************** RETURNS bool *** 290,295 **** --- 456,466 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_int8_distance(internal,int8,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_int8_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 334,340 **** STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD ! OPERATOR 6 <> (int8, int8) ; -- --- 505,513 ---- STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD ! OPERATOR 6 <> (int8, int8) , ! OPERATOR 15 <-> (int8, int8) FOR ORDER BY pg_catalog.integer_ops , ! FUNCTION 8 (int8, int8) gbt_int8_distance (internal, int8, int2, oid) ; -- *************** RETURNS bool *** 350,355 **** --- 523,533 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_float4_distance(internal,float4,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_float4_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 394,400 **** STORAGE gbtreekey8; ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD ! OPERATOR 6 <> (float4, float4) ; -- --- 572,580 ---- STORAGE gbtreekey8; ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD ! OPERATOR 6 <> (float4, float4) , ! OPERATOR 15 <-> (float4, float4) FOR ORDER BY pg_catalog.float_ops , ! FUNCTION 8 (float4, float4) gbt_float4_distance (internal, float4, int2, oid) ; -- *************** RETURNS bool *** 410,415 **** --- 590,600 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_float8_distance(internal,float8,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_float8_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 454,460 **** STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD ! OPERATOR 6 <> (float8, float8) ; -- --- 639,647 ---- STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD ! OPERATOR 6 <> (float8, float8) , ! OPERATOR 15 <-> (float8, float8) FOR ORDER BY pg_catalog.float_ops , ! FUNCTION 8 (float8, float8) gbt_float8_distance (internal, float8, int2, oid) ; -- *************** RETURNS bool *** 470,480 **** --- 657,677 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_ts_distance(internal,timestamp,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_tstz_consistent(internal,timestamptz,int2,oid,internal) RETURNS bool AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_tstz_distance(internal,timestamptz,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_ts_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 524,530 **** STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD ! OPERATOR 6 <> (timestamp, timestamp) ; -- Create the operator class --- 721,729 ---- STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD ! OPERATOR 6 <> (timestamp, timestamp) , ! OPERATOR 15 <-> (timestamp, timestamp) FOR ORDER BY pg_catalog.interval_ops , ! FUNCTION 8 (timestamp, timestamp) gbt_ts_distance (internal, timestamp, int2, oid) ; -- Create the operator class *************** AS *** 546,552 **** STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD ! OPERATOR 6 <> (timestamptz, timestamptz) ; -- --- 745,753 ---- STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD ! OPERATOR 6 <> (timestamptz, timestamptz) , ! OPERATOR 15 <-> (timestamptz, timestamptz) FOR ORDER BY pg_catalog.interval_ops , ! FUNCTION 8 (timestamptz, timestamptz) gbt_tstz_distance (internal, timestamptz, int2, oid) ; -- *************** RETURNS bool *** 562,567 **** --- 763,773 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_time_distance(internal,time,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_timetz_consistent(internal,timetz,int2,oid,internal) RETURNS bool AS 'MODULE_PATHNAME' *************** AS *** 616,622 **** STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_time_ops USING gist ADD ! OPERATOR 6 <> (time, time) ; CREATE OPERATOR CLASS gist_timetz_ops --- 822,830 ---- STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_time_ops USING gist ADD ! OPERATOR 6 <> (time, time) , ! OPERATOR 15 <-> (time, time) FOR ORDER BY pg_catalog.interval_ops , ! FUNCTION 8 (time, time) gbt_time_distance (internal, time, int2, oid) ; CREATE OPERATOR CLASS gist_timetz_ops *************** RETURNS bool *** 653,658 **** --- 861,871 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_date_distance(internal,date,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_date_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 697,703 **** STORAGE gbtreekey8; ALTER OPERATOR FAMILY gist_date_ops USING gist ADD ! OPERATOR 6 <> (date, date) ; -- --- 910,918 ---- STORAGE gbtreekey8; ALTER OPERATOR FAMILY gist_date_ops USING gist ADD ! OPERATOR 6 <> (date, date) , ! OPERATOR 15 <-> (date, date) FOR ORDER BY pg_catalog.integer_ops , ! FUNCTION 8 (date, date) gbt_date_distance (internal, date, int2, oid) ; -- *************** RETURNS bool *** 713,718 **** --- 928,938 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_intv_distance(internal,interval,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_intv_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 762,768 **** STORAGE gbtreekey32; ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD ! OPERATOR 6 <> (interval, interval) ; -- --- 982,990 ---- STORAGE gbtreekey32; ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD ! OPERATOR 6 <> (interval, interval) , ! OPERATOR 15 <-> (interval, interval) FOR ORDER BY pg_catalog.interval_ops , ! FUNCTION 8 (interval, interval) gbt_intv_distance (internal, interval, int2, oid) ; -- *************** RETURNS bool *** 778,783 **** --- 1000,1010 ---- AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_cash_distance(internal,money,int2,oid) + RETURNS float8 + AS 'MODULE_PATHNAME' + LANGUAGE C IMMUTABLE STRICT; + CREATE FUNCTION gbt_cash_compress(internal) RETURNS internal AS 'MODULE_PATHNAME' *************** AS *** 822,828 **** STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD ! OPERATOR 6 <> (money, money) ; -- --- 1049,1057 ---- STORAGE gbtreekey16; ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD ! OPERATOR 6 <> (money, money) , ! OPERATOR 15 <-> (money, money) FOR ORDER BY pg_catalog.money_ops , ! FUNCTION 8 (money, money) gbt_cash_distance (internal, money, int2, oid) ; --
pgsql-hackers by date: