Thread: [HACKERS] [PATCH] Improve geometric types
This is my next attempt to bring more sanity to the geometric types. After the previous one [1] went nowhere, I extracted the parts I can fix without touching the EPSILON. I still hope to improve the problems around EPSILON after this. I organised the changes as 4 patches for ease of review: == geo-funcs-v1 == Refactor geometric functions and operators code The geometric types were not using each other's functions. I believe the reason behind this is simpler types line point and line being developed after more complicated ones. This patch reduces duplicate code and makes functions of different datatypes more compatible. We can do much better than that, but it would require touching many more lines. The changes can be summarised as: * Re-use more functions to implement others * Unify *_construct functions to obtain pre-allocated memory * Unify *_interpt_internal functions to obtain pre-allocated memory * Remove private functions from geo_decls.h * Switch using C11 hypot() as the comment suggested src/backend/utils/adt/geo_ops.c | 810 +++++++++++++------------------- src/include/utils/geo_decls.h | 12 +- src/test/regress/regress.c | 11 +- 3 files changed, 345 insertions(+), 488 deletions(-) == float-header-v04 == Provide header file for built-in float datatypes Even though, some datatypes under adt/ have separate header files, most of the simple ones do not. Their public functions were on the builtins.h. We would need to make more functions of floats public to let the geometric types built on top of them. This is a good opportunity to make a separate header file for floats. 1acf7572554515b99ef6e783750aaea8777524ec made _cmp functions public to solve NaN problem locally for GiST indexes. This patch reworks it in favour of a more extensive API. Kevin Grittner suggested to design the API using inline functions. They are easier to use compared to macros, and avoid double-evaluation hazards. contrib/btree_gin/btree_gin.c | 3 +- contrib/btree_gist/btree_ts.c | 2 +- contrib/cube/cube.c | 2 +- contrib/postgres_fdw/postgres_fdw.c | 2 +- src/backend/access/gist/gistget.c | 2 +- src/backend/access/gist/gistproc.c | 55 +- src/backend/access/gist/gistutil.c | 2 +- src/backend/utils/adt/float.c | 593 ++++-------------- src/backend/utils/adt/formatting.c | 8 +- src/backend/utils/adt/geo_ops.c | 6 +- src/backend/utils/adt/geo_spgist.c | 2 +- src/backend/utils/adt/numeric.c | 1 + src/backend/utils/adt/rangetypes_gist.c | 3 +- src/backend/utils/adt/rangetypes_selfuncs.c | 3 +- src/backend/utils/adt/rangetypes_typanalyze.c | 3 +- src/backend/utils/adt/timestamp.c | 1 + src/backend/utils/misc/guc.c | 1 + src/include/utils/builtins.h | 14 - src/include/utils/float.h | 383 +++++++++++ src/include/utils/geo_decls.h | 1 + 20 files changed, 561 insertions(+), 526 deletions(-) == geo-float-v1 == Use the built-in float datatype to implement geometric types This will provide: * Check for underflow and overflow * Check for division by zero * Handle NaNs consistently The patch also replaces all occurrences of "double" as "float8". They are the same, but were randomly spread around on the same file. src/backend/access/gist/gistproc.c | 156 +++++---- src/backend/utils/adt/geo_ops.c | 546 +++++++++++++++-------------- src/backend/utils/adt/geo_spgist.c | 36 +- src/include/utils/float.h | 13 + src/include/utils/geo_decls.h | 40 +-- 5 files changed, 412 insertions(+), 379 deletions(-) == line-fixes-v1 == Fix obvious problems around the line datatype I have noticed some line operators retuning wrong results, and Tom Lane spotted similar problems on more places. Source history reveals that during 1990s, the internal format of the line datatype is changed, but most functions haven't got the hint. The fixes are: * Make operators more symmetric * Reject invalid specification A=B=0 on receive * Avoid division by zero on perpendicular operator * Fix intersection and distance operators when neither A nor B is 1 * Avoid point distance operator crashing on float precision loss * Fix line segment distance by getting the minimum as the comment suggested Previous discussion: https://www.postgresql.org/message-id/flat/CAE2gYzw_-z%3DV2kh8QqFjenu%3D8MJXzOP44wRW%3DAzzeamrmTT1%3DQ%40mail.gmail.com src/backend/utils/adt/geo_ops.c | 115 +++++++++++++++++++++----------- 1 file changed, 77 insertions(+), 38 deletions(-) [1] https://www.postgresql.org/message-id/flat/CAE2gYzwwxPWbzxY3mtN4WL7W0DCkWo8gnB2ThUHU2XQ9XwgHMg%40mail.gmail.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested Hi Emre, I'm afraid these patches conflict with current master branch. The new status of this patch is: Waiting on Author
> I'm afraid these patches conflict with current master branch. Rebased. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: not tested Spec compliant: not tested Documentation: not tested PostgreSQL fails with SIGSEGV during `make check-world`. Backtrace: http://afiskon.ru/s/d4/f3dc17838a_sigsegv.txt regression.diffs (not very useful): http://afiskon.ru/s/ac/ac5294656c_regression.diffs.txt regression.out: http://afiskon.ru/s/70/39d872e2b8_regression.out.txt How to reproduce: https://github.com/afiskon/pgscripts/blob/master/full-build.sh The environment is Arch Linux x64, gcc 7.1.1 The new status of this patch is: Waiting on Author
> PostgreSQL fails with SIGSEGV during `make check-world`. Fixed. I am sorry for not running "check-world" before. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed LGTM. The new status of this patch is: Ready for Committer
Hello, sorry to late for the party, but may I comment on this? At Tue, 05 Sep 2017 13:18:12 +0000, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote in <20170905131812.18925.13551.pgcf@coridan.postgresql.org> > The following review has been posted through the commitfest application: > make installcheck-world: tested, passed > Implements feature: tested, passed > Spec compliant: tested, passed > Documentation: tested, passed > > LGTM. > > The new status of this patch is: Ready for Committer The first patch reconstructs the operators in layers. These functions are called very frequently when used. Some function are already inlined in float.h but some static functions in float.h also can be and are better be inlined. Some of *_internal, point_construct, line_calculate_point and so on are the candidates. You removed some DirectFunctionCall to the functions within the same file but other functions remain in the style, ex. poly_center or on_sl. The function called from the former seems large enough but the latter function calls a so small function that it could be inlined. Would you like to make some additional functions use C call (instead of DirectFunctionCall) and inlining them? This is not a fault of this patch, but some functions like on_pb seems missing comment to describe what it is. Would you like to add some? In the second patch, the additional include fmgrprotos.h in btree_gin.c seems needless. Some float[48] features were macros so that they share the same expressions between float4 and float8. They still seems sharing perfectly the same expressions in float.h. Is there any reason for converting them into typed inline functions? In float.h, MAXDOUBLEWIDTH is redueced from 500 to 128, but the exponent of double is up to 308 so it doesn't seem sufficient. On the other hand we won't use non-scientific notation for extremely large numbers and it requires (perhaps) up to 26 bytes in the case. In the soruce code, most of them uses "%e" and one of them uses '%g". %e always takes the format of "-1.(17digits)e+308".. So it would be less than 26 characters. =# set extra_float_digits to 3; =# select -1.221423424320453e308::float8; ?column? ----------------------------1.22142342432045302e+308 man printf: (linux) > Style e is used if the exponent from its conversion is less than > -4 or greater than or equal to the precision. So we should be safe to have a buffer with 26 byte length and 500 bytes will apparently too large and even 128 will be too loose in most cases. So how about something like the following? #define MINDOUBLEWIDTH 32 ... float4out@float.c<modified>: > int ndig = FLT_DIG + extra_float_digits; > > if (ndig < 1) > ndig = 1; > > len = snprintf(ascii, MINDOUBLEWIDTH + 1, "%+.*g", ndig, num); > if (len > MINDOUBLEWIDTH + 1) > { > ascii = (char *) repalloc(ascii, len); > if (snprintf(ascii, len, "%+.*e", ndig, num) > len) > error(ERROR, "something wrong happens..."); > } I don't think the if part can be used so there would be no performance degradation, I believe. ---- I'd like to pause here. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
> Hello, sorry to late for the party, but may I comment on this? Thank you for picking this up again. > The first patch reconstructs the operators in layers. These > functions are called very frequently when used. Some function are > already inlined in float.h but some static functions in float.h > also can be and are better be inlined. Some of *_internal, > point_construct, line_calculate_point and so on are the > candidates. They are static functions. I though compiler can decide to inline them. Do you think adding "inline" to the function signatures are necessary? > You removed some DirectFunctionCall to the functions within the > same file but other functions remain in the style, > ex. poly_center or on_sl. The function called from the former > seems large enough but the latter function calls a so small > function that it could be inlined. Would you like to make some > additional functions use C call (instead of DirectFunctionCall) > and inlining them? I tried to minimise my changes to make reviewing easier. I can make "_internal" functions for the remaining DirectFunctionCall()s, if you find it necessary. > This is not a fault of this patch, but some functions like on_pb > seems missing comment to describe what it is. Would you like to > add some? I will add some on the next version. > In the second patch, the additional include fmgrprotos.h in > btree_gin.c seems needless. It must be something I missed on rebase. I will remove it. > Some float[48] features were macros > so that they share the same expressions between float4 and > float8. They still seems sharing perfectly the same expressions > in float.h. Is there any reason for converting them into typed > inline functions? Kevin Grittner suggested using inline functions instead of macros. They are easier to use compared to macros, and avoid double-evaluation hazards. > In float.h, MAXDOUBLEWIDTH is redueced from 500 to 128, but the > exponent of double is up to 308 so it doesn't seem sufficient. On > the other hand we won't use non-scientific notation for extremely > large numbers and it requires (perhaps) up to 26 bytes in the > case. In the soruce code, most of them uses "%e" and one of them > uses '%g". %e always takes the format of > "-1.(17digits)e+308".. So it would be less than 26 > characters. > > =# set extra_float_digits to 3; > =# select -1.221423424320453e308::float8; > ?column? > --------------------------- > -1.22142342432045302e+308 > > man printf: (linux) >> Style e is used if the exponent from its conversion is less than >> -4 or greater than or equal to the precision. > > So we should be safe to have a buffer with 26 byte length and 500 > bytes will apparently too large and even 128 will be too loose in > most cases. So how about something like the following? > > #define MINDOUBLEWIDTH 32 Should it be same for float4 and float8? > ... > float4out@float.c<modified>: >> int ndig = FLT_DIG + extra_float_digits; >> >> if (ndig < 1) >> ndig = 1; >> >> len = snprintf(ascii, MINDOUBLEWIDTH + 1, "%+.*g", ndig, num); >> if (len > MINDOUBLEWIDTH + 1) >> { >> ascii = (char *) repalloc(ascii, len); >> if (snprintf(ascii, len, "%+.*e", ndig, num) > len) >> error(ERROR, "something wrong happens..."); >> } > > I don't think the if part can be used so there would be no > performance degradation, I believe. Wouldn't this change the output of the float datatypes? That would be a backwards incompatible change. > I'd like to pause here. I will submit new versions after your are done with your review. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
At Tue, 12 Sep 2017 19:30:44 +0200, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzxDRvNdhrWQa7ym423uvHLWJXkqx=BoJePYMdrKPR1Zhg@mail.gmail.com> > > Hello, sorry to late for the party, but may I comment on this? > > Thank you for picking this up again. > > > The first patch reconstructs the operators in layers. These > > functions are called very frequently when used. Some function are > > already inlined in float.h but some static functions in float.h > > also can be and are better be inlined. Some of *_internal, > > point_construct, line_calculate_point and so on are the > > candidates. > > They are static functions. I though compiler can decide to inline > them. Do you think adding "inline" to the function signatures are > necessary? C++ surely make just static functions inlined but I'm not sure C compiler does that. "static" is a scope modifier and "inline" is not (what kind of modifier is it?). In regard to gcc, https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Inline.html#Inline > By declaring a function inline, you can direct GCC to make > calls to that function faster. <snip> You can also direct GCC > to try to integrate all “simple enough” functions into their > callers with the option -finline-functions. I didn't find that postgresql uses the options so I think we shouldn't expect that they are inlined automatically. > > You removed some DirectFunctionCall to the functions within the > > same file but other functions remain in the style, > > ex. poly_center or on_sl. The function called from the former > > seems large enough but the latter function calls a so small > > function that it could be inlined. Would you like to make some > > additional functions use C call (instead of DirectFunctionCall) > > and inlining them? > > I tried to minimise my changes to make reviewing easier. I can make > "_internal" functions for the remaining DirectFunctionCall()s, if you > find it necessary. Ok, it would be another patch even if we need that. > > This is not a fault of this patch, but some functions like on_pb > > seems missing comment to describe what it is. Would you like to > > add some? > > I will add some on the next version. > > > In the second patch, the additional include fmgrprotos.h in > > btree_gin.c seems needless. > > It must be something I missed on rebase. I will remove it. > > > Some float[48] features were macros > > so that they share the same expressions between float4 and > > float8. They still seems sharing perfectly the same expressions > > in float.h. Is there any reason for converting them into typed > > inline functions? > > Kevin Grittner suggested using inline functions instead of macros. > They are easier to use compared to macros, and avoid double-evaluation > hazards. I recall a bit about the double-evaluation hazards. I think the functions needs a comment describing the reasons so that anyone kind won't try to merge them into a macro again. > > In float.h, MAXDOUBLEWIDTH is redueced from 500 to 128, but the > > exponent of double is up to 308 so it doesn't seem sufficient. On > > the other hand we won't use non-scientific notation for extremely > > large numbers and it requires (perhaps) up to 26 bytes in the > > case. In the soruce code, most of them uses "%e" and one of them > > uses '%g". %e always takes the format of > > "-1.(17digits)e+308".. So it would be less than 26 > > characters. > > > > =# set extra_float_digits to 3; > > =# select -1.221423424320453e308::float8; > > ?column? > > --------------------------- > > -1.22142342432045302e+308 > > > > man printf: (linux) > >> Style e is used if the exponent from its conversion is less than > >> -4 or greater than or equal to the precision. > > > > So we should be safe to have a buffer with 26 byte length and 500 > > bytes will apparently too large and even 128 will be too loose in > > most cases. So how about something like the following? > > > > #define MINDOUBLEWIDTH 32 > > Should it be same for float4 and float8? Strictly they are different each other but float8 needs 26 bytes (additional 1 byte is the sign for expnent, which I forgot.) and float4 needs 15 bytes so it could be reduced to 32 and 16 respectively. | select -1.11111111111111111111111111e-38::float4; | -1.11111113e-38 | select -1.11111111111111111111111111e-4::float4; | -0.000111111112 | select -1.11111111111111111111111111e-5::float4; | -1.11111112e-05 On the other hand float4 is anyway converted into double in float4out and I'm not sure that the extension doesn't adds spurious digits. Therefore I think it would be reasonable that "%g" formatter requires up to 27 bytes (26 + terminating zero) so it should use MINDOUBLEWIDTH regardless of the precision of the value given to the formatter. Addition to that if we are deliberately using %g in float4out, we can use flot8out_internal instead of most of the stuff of float4out. | Datum | float4out(PG_FUNCTION_ARGS) | { | float8 num = PG_GETARG_FLOAT4(0); | | PG_RETURN_CSTRING(float8out_internal(num)); | } > > ... > > float4out@float.c<modified>: > >> int ndig = FLT_DIG + extra_float_digits; > >> > >> if (ndig < 1) > >> ndig = 1; > >> > >> len = snprintf(ascii, MINDOUBLEWIDTH + 1, "%+.*g", ndig, num); > >> if (len > MINDOUBLEWIDTH + 1) > >> { > >> ascii = (char *) repalloc(ascii, len); > >> if (snprintf(ascii, len, "%+.*e", ndig, num) > len) > >> error(ERROR, "something wrong happens..."); > >> } > > > > I don't think the if part can be used so there would be no > > performance degradation, I believe. > > Wouldn't this change the output of the float datatypes? That would be > a backwards incompatible change. It just reduces the unused part after terminator \0, and we won't get incompatible result. > > I'd like to pause here. > > I will submit new versions after your are done with your review. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Sep 14, 2017 at 3:33 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > I recall a bit about the double-evaluation hazards. I think the > functions needs a comment describing the reasons so that anyone > kind won't try to merge them into a macro again. I think we can count on PostgreSQL developers to understand the advantages of an inline function over a macro. Even if they don't, the solution can't be to put a comment in every place where an inline function is used explaining it. That would be very repetitive. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
At Thu, 14 Sep 2017 16:19:13 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmobinBA7uvQifYaYGdDUoF6VTo56dvoTT6nKSpJF-Zfv5A@mail.gmail.com> > On Thu, Sep 14, 2017 at 3:33 AM, Kyotaro HORIGUCHI > <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > I recall a bit about the double-evaluation hazards. I think the > > functions needs a comment describing the reasons so that anyone > > kind won't try to merge them into a macro again. > > I think we can count on PostgreSQL developers to understand the > advantages of an inline function over a macro. Even if they don't, > the solution can't be to put a comment in every place where an inline > function is used explaining it. That would be very repetitive. Of course putting such a comment to all inline functions is silly. The point here is that many pairs of two functions with exactly the same shape but handle different types are defined side by side. Such situation seems tempting to merge them into single macros, as the previous author did there. So a simple one like the following would be enough. /* don't merge the following same functions with different types into single macros so that double evaluation won't happen*/ Is it still too verbose? regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
At Fri, 15 Sep 2017 17:23:28 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20170915.172328.97446299.horiguchi.kyotaro@lab.ntt.co.jp> > At Thu, 14 Sep 2017 16:19:13 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmobinBA7uvQifYaYGdDUoF6VTo56dvoTT6nKSpJF-Zfv5A@mail.gmail.com> > > On Thu, Sep 14, 2017 at 3:33 AM, Kyotaro HORIGUCHI > > <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > > I recall a bit about the double-evaluation hazards. I think the > > > functions needs a comment describing the reasons so that anyone > > > kind won't try to merge them into a macro again. > > > > I think we can count on PostgreSQL developers to understand the > > advantages of an inline function over a macro. Even if they don't, > > the solution can't be to put a comment in every place where an inline > > function is used explaining it. That would be very repetitive. > > Of course putting such a comment to all inline functions is > silly. The point here is that many pairs of two functions with > exactly the same shape but handle different types are defined > side by side. Such situation seems tempting to merge them into > single macros, as the previous author did there. > > So a simple one like the following would be enough. > > /* don't merge the following same functions with different types > into single macros so that double evaluation won't happen */ > > Is it still too verbose? That being said, I'm not stick on that if Robert or others think it as needless. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hello, just one point on 0001. The patch replace pg_hypot with hypot in libc. The man page says as follows. man 3 hypot > If the result overflows, a range error occurs, and the functions return > HUGE_VAL, HUGE_VALF, or HUGE_VALL, respectively. .. >ERRORS > See math_error(7) for information on how to determine whether an error > has occurred when calling these functions. > > The following errors can occur: > > Range error: result overflow > errno is set to ERANGE. An overflow floating-point exception > (FE_OVERFLOW) is raised. > > Range error: result underflow > An underflow floating-point exception (FE_UNDERFLOW) is raised. > > These functions do not set errno for this case. So, the code seems to need some amendments following to this spec. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 15, 2017 at 4:23 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > /* don't merge the following same functions with different types > into single macros so that double evaluation won't happen */ > > Is it still too verbose? Personally, I don't think such a comment has much value, but I am not going to spend a lot of time arguing about it. It's really up to the eventual committer to decide, and that probably won't be me in this case. My knowledge of the geometric types isn't great, and I am a tad busy breaking^Wimproving things I do understand. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
At Fri, 15 Sep 2017 11:25:30 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoYgg8=m9+Y54Az1r+KBpMuiEOZM_DLDF04jMP4twGR3Ng@mail.gmail.com> > On Fri, Sep 15, 2017 at 4:23 AM, Kyotaro HORIGUCHI > <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > /* don't merge the following same functions with different types > > into single macros so that double evaluation won't happen */ > > > > Is it still too verbose? > > Personally, I don't think such a comment has much value, but I am not > going to spend a lot of time arguing about it. It's really up to the > eventual committer to decide, and that probably won't be me in this > case. My knowledge of the geometric types isn't great, and I am a tad > busy breaking^Wimproving things I do understand. Ok, I agree with you. # Though it is not a issue of geometric types :p I'll withdrow the comment. Maybe someone notices of that when reviewing such a patch. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
The new versions of the patches are attached addressing your comments. > C++ surely make just static functions inlined but I'm not sure C > compiler does that. Thank you for your explanation. I marked the mentioned functions "inline". > So we should be safe to have a buffer with 26 byte length and 500 > bytes will apparently too large and even 128 will be too loose in > most cases. So how about something like the following? > > #define MINDOUBLEWIDTH 32 I left this part out for now. We can improve it separately. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
> The patch replace pg_hypot with hypot in libc. The man page says > as follows. > > man 3 hypot >> If the result overflows, a range error occurs, and the functions return >> HUGE_VAL, HUGE_VALF, or HUGE_VALL, respectively. > .. >>ERRORS >> See math_error(7) for information on how to determine whether an error >> has occurred when calling these functions. >> >> The following errors can occur: >> >> Range error: result overflow >> errno is set to ERANGE. An overflow floating-point exception >> (FE_OVERFLOW) is raised. >> >> Range error: result underflow >> An underflow floating-point exception (FE_UNDERFLOW) is raised. >> >> These functions do not set errno for this case. > > So, the code seems to need some amendments following to this > spec. I included them on the latest version. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hello. Thank you for the new version. 0001: applies cleanly. regress passed. this mainly refactoring geo_ops.c and replacing pg_hypot with hypot(3). 0002: applies cleanly. regress passed. this just replaces float-ops macros into inline functions. 0003: applies cleanly. regress passed. replaces double with float8 and bare arithmetic with defined functions. 0004: applies cleanly. regress passsed. fix broken line-related functions. I have some comments on this (later). At Wed, 27 Sep 2017 17:44:52 +0200, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzz2=335XEMHK-neNo=HgGskkPHQxUXkh8yDNsZAnCB5Bg@mail.gmail.com> > The new versions of the patches are attached addressing your comments. > > > C++ surely make just static functions inlined but I'm not sure C > > compiler does that. > > Thank you for your explanation. I marked the mentioned functions "inline". Thanks. > > So we should be safe to have a buffer with 26 byte length and 500 > > bytes will apparently too large and even 128 will be too loose in > > most cases. So how about something like the following? > > > > #define MINDOUBLEWIDTH 32 > > I left this part out for now. We can improve it separately. Agreed. I found that psprintf does that with initial length of 128. By the way, I found that MAXDOUBLEWIDTH has two inconsistent definitions in formatting.c(500) and float.c(128). The definition in new float.h is according to float.c and it seems better to be untouched and it would be another issue. At Wed, 27 Sep 2017 17:45:04 +0200, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzz1KuT57vrO-XZk3KSqdpehAupJPLYu7AshuUwRF7MiMg@mail.gmail.com> > > The patch replace pg_hypot with hypot in libc. The man page says > > as follows. ... > I included them on the latest version. # The commit message of 0001 says that "using C11 hypot()" but it # is sine C99. I suppose we shold follow C99 at the time so I # suggest rewrite it in the next version if any. It seems bettern than expected. I confirmed that pg_hypot(DBL_MAX, DBL_MAX) returned a value that is equivalent to HUGE_VAL * HUGE_VAL on glibc, but I'm not sure the behavior on other platforms is the same. ====== For other potential reviewers: I found the origin of the function here. https://www.postgresql.org/message-id/4A90BD76.7070804@netspace.net.au https://www.postgresql.org/message-id/AANLkTim4cHELcGPf5w7Zd43_dQi_2RJ_b5_F_idSSbZI%40mail.gmail.com And the reason for pg_hypot is seen here. https://www.postgresql.org/message-id/407d949e0908222139t35ad3ad2q3e6b15646a27dd64@mail.gmail.com I think the replacement is now acceptable according to the discussion. ====== And this should be the last comment of mine on the patch set. In 0004, line_distance and line_interpt_internal seem correct. Seemingly horizontal/vertical checks are redundant but it is because unclearly defined EPSLON bahavior. lseg_perp seems correct. close_pl got a bug fix not only refactoring. I think it is recommended to separate bugs and improvements, but I'm fine with the current patch. You added sanity check "A==0 && B==0" (in Ax + By + C) in line_recv. I'm not sure the necessity of the check since it has been checked in line_in but anyway the comparisons seem to be typo(or thinko) of FPzero. dist_pl is changed to take the smaller distance of both ends of the segment. It seems absorbing error, so it might be better taking the mean of the two distances. If you have a firm reason for the change, it is better to be written there, or it might be better left alone. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
> And this should be the last comment of mine on the patch set. Thank you. The new versions of the patches are attached addressing your comments. > By the way, I found that MAXDOUBLEWIDTH has two inconsistent > definitions in formatting.c(500) and float.c(128). The definition > in new float.h is according to float.c and it seems better to be > untouched and it would be another issue. The last version of the patch don't move these declarations to the header file. > # The commit message of 0001 says that "using C11 hypot()" but it > # is sine C99. I suppose we shold follow C99 at the time so I > # suggest rewrite it in the next version if any. Changed. > close_pl got a bug fix not only refactoring. I think it is > recommended to separate bugs and improvements, but I'm fine with > the current patch. I split the refactoring to the first patch. > You added sanity check "A==0 && B==0" (in Ax + By + C) in > line_recv. I'm not sure the necessity of the check since it has > been checked in line_in but anyway the comparisons seem to be > typo(or thinko) of FPzero. Tom Lane suggested [1] this one. I now made it use FPzero(). > dist_pl is changed to take the smaller distance of both ends of > the segment. It seems absorbing error, so it might be better > taking the mean of the two distances. If you have a firm reason > for the change, it is better to be written there, or it might be > better left alone. I don't really, so I left that part out. [1] https://www.postgresql.org/message-id/11053.1466362319%40sss.pgh.pa.us -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Mon, Oct 2, 2017 at 4:23 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > For other potential reviewers: > > I found the origin of the function here. > > https://www.postgresql.org/message-id/4A90BD76.7070804@netspace.net.au > https://www.postgresql.org/message-id/AANLkTim4cHELcGPf5w7Zd43_dQi_2RJ_b5_F_idSSbZI%40mail.gmail.com > > And the reason for pg_hypot is seen here. > > https://www.postgresql.org/message-id/407d949e0908222139t35ad3ad2q3e6b15646a27dd64@mail.gmail.com > > I think the replacement is now acceptable according to the discussion. > ====== I think if we're going to do this it should be separated out as its own patch. Also, I think someone should explain what the reasoning behind the change is. Do we, for example, foresee that the built-in code might be faster or less likely to overflow? Because we're clearly taking a risk -- most trivially, that the BF will break, or more seriously, that some machines will have versions of this function that don't actually behave quite the same. That brings up a related point. How good is our test case coverage for hypot(), especially in strange corner cases, like this one mentioned in pg_hypot()'s comment: * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the* case of hypot(inf,nan) results in INF, and notNAN. I'm potentially willing to commit a patch that just makes the pg_hypot() -> hypot() change and does nothing else, if there are not objections to that change, but I want to be sure that we'll know right away if that turns out to break. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Thanks. At Mon, 2 Oct 2017 11:46:15 +0200, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzz7wiUnyb1TxCk5GzXt6j7efzGmMgH0gTe8xwKZ=FAF5A@mail.gmail.com> > > And this should be the last comment of mine on the patch set. > > Thank you. The new versions of the patches are attached addressing > your comments. > > > By the way, I found that MAXDOUBLEWIDTH has two inconsistent > > definitions in formatting.c(500) and float.c(128). The definition > > in new float.h is according to float.c and it seems better to be > > untouched and it would be another issue. > > The last version of the patch don't move these declarations to the header file. Yeah, it is not about the patch itself. > > # The commit message of 0001 says that "using C11 hypot()" but it > > # is sine C99. I suppose we shold follow C99 at the time so I > > # suggest rewrite it in the next version if any. > > Changed. > > > close_pl got a bug fix not only refactoring. I think it is > > recommended to separate bugs and improvements, but I'm fine with > > the current patch. > > I split the refactoring to the first patch. > > > You added sanity check "A==0 && B==0" (in Ax + By + C) in > > line_recv. I'm not sure the necessity of the check since it has > > been checked in line_in but anyway the comparisons seem to be > > typo(or thinko) of FPzero. > > Tom Lane suggested [1] this one. I now made it use FPzero(). I see. It's fine with me. I suppose that Tom didn't intened the suggestion to be teken literary so using FPzero() seems better (at least for now). > > dist_pl is changed to take the smaller distance of both ends of > > the segment. It seems absorbing error, so it might be better > > taking the mean of the two distances. If you have a firm reason > > for the change, it is better to be written there, or it might be > > better left alone. > > I don't really, so I left that part out. Mmm, sorry. It's my mistake. > [1] https://www.postgresql.org/message-id/11053.1466362319%40sss.pgh.pa.us I'll look the new version further. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
At Mon, 2 Oct 2017 08:23:49 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoYsgw0TcjJQ1CE_6vDOxgEhxYQkfNx93Mfwx23WOLM0NA@mail.gmail.com> > On Mon, Oct 2, 2017 at 4:23 AM, Kyotaro HORIGUCHI > <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > > For other potential reviewers: > > > > I found the origin of the function here. > > > > https://www.postgresql.org/message-id/4A90BD76.7070804@netspace.net.au > > https://www.postgresql.org/message-id/AANLkTim4cHELcGPf5w7Zd43_dQi_2RJ_b5_F_idSSbZI%40mail.gmail.com > > > > And the reason for pg_hypot is seen here. > > > > https://www.postgresql.org/message-id/407d949e0908222139t35ad3ad2q3e6b15646a27dd64@mail.gmail.com > > > > I think the replacement is now acceptable according to the discussion. > > ====== > > I think if we're going to do this it should be separated out as its > own patch. +1 > Also, I think someone should explain what the reasoning > behind the change is. Do we, for example, foresee that the built-in > code might be faster or less likely to overflow? Because we're > clearly taking a risk -- most trivially, that the BF will break, or > more seriously, that some machines will have versions of this function > that don't actually behave quite the same. > > That brings up a related point. How good is our test case coverage > for hypot(), especially in strange corner cases, like this one > mentioned in pg_hypot()'s comment: > > * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the > * case of hypot(inf,nan) results in INF, and not NAN. I'm not sure how precise we practically need them to be identical. FWIW as a rough confirmation on my box, I compared hypot and pg_hypot for the following arbitrary choosed pairs of parameters. {2.2e-308, 2.2e-308},{2.2e-308, 1.7e307},{1.7e307, 1.7e307},{1.7e308, 1.7e308},{2.2e-308, DBL_MAX},{1.7e308, DBL_MAX},{DBL_MIN,DBL_MAX},{DBL_MAX, DBL_MAX},{1.7e307, INFINITY},{2.2e-308, INFINITY},{0, INFINITY},{DBL_MIN, INFINITY},{INFINITY,INFINITY},{1, NAN},{INFINITY, NAN},{NAN, NAN}, Only the first pair gave slightly not-exactly-equal results but it seems to do no harm. hypot set underflow flag. 0: hypot=3.111269837220809e-308 (== 0.0 is 0, < DBL_MIN is 0) pg_hypot=3.11126983722081e-308 (== 0.0 is 0, < DBL_MIN is0) equal=0, hypot(errno:0, inval:0, div0:0, of=0, uf=1), pg_hypot(errno:0, inval:0, div0:0, of=0, uf=0) But not sure how the both functions behave on other platforms. > I'm potentially willing to commit a patch that just makes the > pg_hypot() -> hypot() change and does nothing else, if there are not > objections to that change, but I want to be sure that we'll know right > away if that turns out to break. -- Kyotaro Horiguchi NTT Open Source Software Center #include <stdio.h> #include <float.h> #include <math.h> #include <fenv.h> #include <errno.h> double pg_hypot(double x, double y) {double yx; /* Handle INF and NaN properly */if (isinf(x) || isinf(y)) return (double) INFINITY; if (isnan(x) || isnan(y)) return (double) NAN; /* Else, drop any minus signs */x = fabs(x);y = fabs(y); /* Swap x and y if needed to make x the larger one */if (x < y){ double temp = x; x = y; y = temp;} /* * If y is zero, the hypotenuse is x. This test saves a few cycles in * such cases, but more importantly it also protectsagainst * divide-by-zero errors, since now x >= y. */if (y == 0.0) return x; /* Determine the hypotenuse */yx = y / x;return x * sqrt(1.0 + (yx * yx)); } void setfeflags(int *invalid, int *divzero, int *overflow, int *underflow) {int err = fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW | FE_UNDERFLOW);*invalid = ((err &FE_INVALID) != 0);*divzero = ((err & FE_DIVBYZERO) != 0);*overflow = ((err & FE_OVERFLOW) != 0);*underflow = ((err & FE_UNDERFLOW)!= 0); } int main(void) {double x;double y;double p[][2] = { {2.2e-308, 2.2e-308}, {2.2e-308, 1.7e307}, {1.7e307, 1.7e307}, {1.7e308, 1.7e308}, {2.2e-308, DBL_MAX}, {1.7e308, DBL_MAX}, {DBL_MIN, DBL_MAX}, {DBL_MAX, DBL_MAX}, {1.7e307, INFINITY}, {2.2e-308, INFINITY}, {0, INFINITY}, {DBL_MIN, INFINITY}, {INFINITY, INFINITY}, {1, NAN}, {INFINITY, NAN}, {NAN, NAN}, {0.0, 0.0} };inti; for (i = 0 ; p[i][0] != 0.0 || p[i][1] != 0.0 ; i++){ int errno1, errno2; int invalid1 = 0, divzero1 = 0, overflow1= 0, underflow1 = 0; int invalid2 = 0, divzero2 = 0, overflow2 = 0, underflow2 = 0; int cmp; errno = 0; feclearexcept(FE_ALL_EXCEPT); x = hypot(p[i][0], p[i][1]); errno1 = errno; setfeflags(&invalid1,&divzero1, &overflow1, &underflow1); errno = 0; feclearexcept(FE_ALL_EXCEPT); y = pg_hypot(p[i][0], p[i][1]); errno2 = errno; setfeflags(&invalid2,&divzero2, &overflow2, &underflow2); /* assume NaN == NaN here */ if (isnan(x) && isnan(y)) cmp = 1; else cmp = (x == y); printf("%d: hypot=%.16lg (== 0.0 is %d, < DBL_MIN is %d)\n pg_hypot=%.16lg (== 0.0 is %d, < DBL_MIN is %d)\n equal=%d,hypot(errno:%d, inval:%d, div0:%d, of=%d, uf=%d), pg_hypot(errno:%d, inval:%d, div0:%d, of=%d, uf=%d)\n", i, x, x == 0.0, x < DBL_MIN, y, y == 0.0, y < DBL_MIN, cmp, errno1, invalid1, divzero1, overflow1, underflow1, errno2, invalid2, divzero2, overflow2, underflow2);}return 0; } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas wrote: > I'm potentially willing to commit a patch that just makes the > pg_hypot() -> hypot() change and does nothing else, if there are not > objections to that change, but I want to be sure that we'll know right > away if that turns out to break. Uh, I thought pg_hypot() was still needed on our oldest supported platforms. Or is hypot() already supported there? If not, I suppose we can keep the HYPOT() macro, and have it use hypot() on platforms that have it and pg_hypot elsewhere? That particular fraction of 0001 seemed a bit dubious to me, as the largest possible source of platform dependency issues. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
> I think if we're going to do this it should be separated out as its > own patch. Also, I think someone should explain what the reasoning > behind the change is. Do we, for example, foresee that the built-in > code might be faster or less likely to overflow? Because we're > clearly taking a risk -- most trivially, that the BF will break, or > more seriously, that some machines will have versions of this function > that don't actually behave quite the same. I included removal of pg_hypot() on my patch simply because the comment on the function header is suggesting it. I though if we are going to clean this module up, we better deal it first. I understand the risk. The patches include more changes. It may be a good idea to have those together. > That brings up a related point. How good is our test case coverage > for hypot(), especially in strange corner cases, like this one > mentioned in pg_hypot()'s comment: > > * This implementation conforms to IEEE Std 1003.1 and GLIBC, in that the > * case of hypot(inf,nan) results in INF, and not NAN. I don't see any tests of geometric types with INF or NaN. Currently, there isn't consistent behaviour for them. I don't think we can easily add portable ones on the current state, but we should be able to do so with my patches. I will look into it. > I'm potentially willing to commit a patch that just makes the > pg_hypot() -> hypot() change and does nothing else, if there are not > objections to that change, but I want to be sure that we'll know right > away if that turns out to break. I can split this one into another patch. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
> Uh, I thought pg_hypot() was still needed on our oldest supported > platforms. Or is hypot() already supported there? If not, I suppose we > can keep the HYPOT() macro, and have it use hypot() on platforms that > have it and pg_hypot elsewhere? That particular fraction of 0001 seemed > a bit dubious to me, as the largest possible source of platform > dependency issues. What is our oldest supported platform? We can also just keep pg_hypot(). I don't think getting rid of it justifies much work. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Emre Hasegeli <emre@hasegeli.com> writes: >> Uh, I thought pg_hypot() was still needed on our oldest supported >> platforms. Or is hypot() already supported there? > What is our oldest supported platform? Our normal reference for such questions is SUS v2 (POSIX 1997): http://pubs.opengroup.org/onlinepubs/007908799/ I looked into that, and noted that it does specify hypot(), but it has different rules for edge conditions: If the result would cause overflow, HUGE_VAL is returned and errnomay be set to [ERANGE]. If x or y is NaN, NaN is returned. and errno may be set to [EDOM]. If the correct result would cause underflow, 0 is returned anderrno may be set to [ERANGE]. whereas POSIX 2008 saith: If the correct value would cause overflow, a range error shalloccur and hypot(), hypotf(), and hypotl() shall return thevalueof the macro HUGE_VAL, HUGE_VALF, and HUGE_VALL, respectively. [MX] If x or y is ±Inf, +Inf shall be returned (even if one of xor y is NaN). If x or y is NaN, and the other is not ±Inf, a NaN shall bereturned. [MXX] If both arguments are subnormal and the correct result issubnormal, a range error may occur and the correct resultshallbe returned. In short, the reason that the existing comment makes a point of the Inf-and-NaN behavior is that the standard changed somewhere along the line. While I believe we could get away with assuming that hypot() exists everywhere, it's much less clear that we could rely on the result being Inf and not NaN in this case. Now, it's also not clear that anything in PG really cares. But if we do care, I think we should keep pg_hypot() ... and maybe clarify the comment a bit more. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
> Now, it's also not clear that anything in PG really cares. But if we > do care, I think we should keep pg_hypot() ... and maybe clarify the > comment a bit more. I am not sure how useful NaNs are in geometric types context, but we allow them, so inconsistent hypot() would be a problem. I will change my patches to keep pg_hypot(). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
> I am not sure how useful NaNs are in geometric types context, but we > allow them, so inconsistent hypot() would be a problem. I will change > my patches to keep pg_hypot(). New versions of the patches are attached with 2 additional ones. The new versions leave pg_hypot() in place. One of the new patches improves the test coverage. The line coverage of geo_ops.c increases from 55% to 81%. The other one fixes -0 values to 0 on float operators. I am not sure about performance implication of this, so kept it separate. It may be a better idea to check this only on the platforms that has tendency to produce -0. While working on the tests, I found some unreachable code and removed it. I also found that lseg ## lseg operator returning wrong results. It is defined as "closest point to first segment on the second segment", but: > # select '[(1,2),(3,4)]'::lseg ## '[(0,0),(6,6)]'::lseg; > ?column? > ---------- > (1,2) > (1 row) I appended the fix to the patches. This is also effecting lseg ## box operator. I also changed recently band-aided point ## lseg operator to return the point instead of NULL when it cannot find the correct result to avoid the operators depending on this one to crash. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Hello, thanks for the new patch. 0004 failed to be applied on the underneath patches. At Sun, 5 Nov 2017 15:54:19 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzzngpYgrQbJ-2TjzZ+MZBa0D0Xzj8tjjJLv6C3CPARMGw@mail.gmail.com> > > I am not sure how useful NaNs are in geometric types context, but we > > allow them, so inconsistent hypot() would be a problem. I will change > > my patches to keep pg_hypot(). > > New versions of the patches are attached with 2 additional ones. The > new versions leave pg_hypot() in place. One of the new patches > improves the test coverage. The line coverage of geo_ops.c increases > from 55% to 81%. The other one fixes -0 values to 0 on float > operators. I am not sure about performance implication of this, so > kept it separate. It may be a better idea to check this only on the > platforms that has tendency to produce -0. > > While working on the tests, I found some unreachable code and removed > it. I also found that lseg ## lseg operator returning wrong results. > It is defined as "closest point to first segment on the second > segment", but: > > > # select '[(1,2),(3,4)]'::lseg ## '[(0,0),(6,6)]'::lseg; > > ?column? > > ---------- > > (1,2) > > (1 row) > > I appended the fix to the patches. This is also effecting lseg ## box operator. Mmm.. It returns (1.5, 1.5) with the 0004 patch. It is surely a point on the second operand but I'm not sure it's right that the operator returns a specific point for two parallel segments. > I also changed recently band-aided point ## lseg operator to return > the point instead of NULL when it cannot find the correct result to > avoid the operators depending on this one to crash. I'd like to put comments on 0001 and 0004 only now: - Adding [LR]DELIM_L looks good but they should be described in the comment just above. - Renaming float8_slope to slope seems no problem. - I'm not sure the change of box_construct is good but currently all callers fits the new interface (giving two points,not four coordinates). - close_lseg seems forgetting the case where the two given segments are crossing (and parallel). For example, select '(-8,-8),(1,1)'::lseg ## '(-10,0),(2,0)'::lseg; is expected to return (0,0), which is the crossing point of the two segments, but it returns (1,0) (and returned (1,1) before the patch), which is the point on the l2 closest to the closer end of l1 to l2. As mentioned above, it is difficult to dicide what is the proper result for parallel segments. I suppose that the followingoperation should return an invalid value as a point. select '(-1,0),(1,0)'::lseg ## '(-1,1),(1,1)'::lseg; close_lseg does the following operations in the else case of 'if (float8_...)'. If i'm not missing something, the resultof the following operation is always the closer end of l2. In other words it seems a waste of cycles. | point= DirectFunctionCall2(close_ps, | PointPGetDatum(&l2->p[closer_end2]), | LsegPGetDatum(l1)); | return DirectFunctionCall2(close_ps, point, LsegPGetDatum(l2)); - make_bound_box operates directly on the poly->boundbox. I'm afraid that such coding hinders compiers from using registers. This is a bit apart from this patch, it would be better if we could eliminate internal calls using DirectFunctionCall. reagrds, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Hello, > I'd like to put comments on 0001 and 0004 only now: ... I don't have a comment on 0002. About 0003: > @@ -4487,21 +4486,21 @@ circle_in(PG_FUNCTION_ARGS) > ... > circle->radius = single_decode(s, &s, "circle", str); > - if (circle->radius < 0) > + if (float8_lt(circle->radius, 0.0)) > ereport(ERROR, > (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), flost8_lt and its family functions are provided to unify the sorting order including NaN. NaN is not rejected by the usage of float8_lt in the case but it is what the function is expected to be used for. If we wanted to check if it is positive, it unexpectedly throws an exception. (I suppose that NaNs should be silently ignored rather than stopping a query by throwng an exception.) Addition to that I don't think it proper that applying EPSILON(!) there. It should be strictly compared regardless whether EPSION is considered or not. Similary, circle_overlap for example, float8_le is used to compare the distance and the summed radius. NaN causes a problem in another place. > PG_RETURN_BOOL(FPle(point_dt(&circle1->center, &circle2->center), > float8_pl(circle1->radius, circle2->radius))); If the distance was NaN and the summed radius is not, the function returns true. I think that a reasonable behavior is that an object containing NaN cannot make any meaningful relationship with other objects as floating number itself behave so. (Any comparison other than != with NaN returns always false) Just using another series of comparison functions that return false for NaN-containing comparison is not a solution since the meaning of the returned false differs by context, just same as the first problem above. For exameple, the fictious functions below, | bool circle_overlaps() | ret = FPle(distance, radius_sum); This gives correct results, but | bool circle_not_overlaps() | ret = FPgt(distance, radius_sum); This gives a wrong result for NaN-containing objects. Perhaps it is enough to explicitly define behaviors for NaN before comparison. circle_overlap() > distance = point_dt(....); > radius_sum = float8_pl(...); > > /* NaN-containing objects doesn't overlap any other objects */ > if (isnan(distance) || isnan(radius_sum)) > PG_RETURN_BOOL(false); > > /* NaN ordering of FPle() doesn't get into mischief here */ > return PG_RETURN_BOOL(FPle(distance, radius_sum)); (End Of the Comment to 0003) regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
>> This is also effecting lseg ## box operator. > > Mmm.. It returns (1.5, 1.5) with the 0004 patch. It is surely a > point on the second operand but I'm not sure it's right that the > operator returns a specific point for two parallel segments. I am changing it to return NULL, when they are parallel. > I'd like to put comments on 0001 and 0004 only now: > > - Adding [LR]DELIM_L looks good but they should be described in > the comment just above. I will mention it on the new version. > - I'm not sure the change of box_construct is good but currently > all callers fits the new interface (giving two points, not > four coordinates). I tried to make things more consistent. The other constructors takes points. > - close_lseg seems forgetting the case where the two given > segments are crossing (and parallel). I am re-implementing it covering those cases. > - make_bound_box operates directly on the poly->boundbox. I'm > afraid that such coding hinders compiers from using registers. I am changing it back. > This is a bit apart from this patch, it would be better if we > could eliminate internal calls using DirectFunctionCall. We don't seem to be able to fix all issues without doing that. I will incorporate the change. Thank you for your review. I will address your other email before posting new versions. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
> dist_pl is changed to take the smaller distance of both ends of > the segment. It seems absorbing error, so it might be better > taking the mean of the two distances. If you have a firm reason > for the change, it is better to be written there, or it might be > better left alone. I am sorry for not paying enough attention to this before. The distance functions are meant to return the minimum distance. Getting the maximum is just wrong in here. Can you think of any particular error it would absorb? Otherwise I will put this change back to the patch for fixes.
> flost8_lt and its family functions are provided to unify the > sorting order including NaN. NaN is not rejected by the usage of > float8_lt in the case but it is what the function is expected to > be used for. If we wanted to check if it is positive, it > unexpectedly throws an exception. (I suppose that NaNs should be > silently ignored rather than stopping a query by throwng an > exception.) It would at least be dump-and-restore hazard if we don't let them in. The new version allows NaNs. > This gives a wrong result for NaN-containing objects. I removed the NaN aware comparisons from FP macros, and carefully reviewed the places that needs to be NaN aware. I am sorry that it took so long for me to post the new versions. The more I get into this the more problems I find. The new versions include non-trivial changes. I would be glad if you can look into them.
Attachment
Does this patch series fix bug #14971? https://www.postgresql.org/message-id/20171213172806.20142.73638@wrigleys.postgresql.org Eric: see https://www.postgresql.org/message-id/CAE2gYzzvp=uvpw4FytoaEvYK-WZE4jw7u2s1gLRoK35Mr1-kYQ@mail.gmail.com for last version of patch. Motivation for patch is at https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> Does this patch series fix bug #14971? No, it doesn't. I am trying to improve things without touching the EPSILON.
Rebased versions are attached.
Attachment
Hello, sorry for the absense. (Sorry for the long but should be hard-to-read article..) At Thu, 4 Jan 2018 14:53:47 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzz0hRfC-M8KDV4Aytj+6k6cvoiGqVvEjVyHTbtraf=YBQ@mail.gmail.com> > Rebased versions are attached. Thank you for the new patch. I'm looking 0001 patch. This does many things at once but seems hard to split into step-by-step patches. So I reviewed this from the viewpoint that how each of the external function is changed. (Attatchment 1). I think that this patch is not intending to change any behaviors of the external functions, but intending make it mathematically reasonable. So some behavioral changes are ineviablly requried. The problem here is what is the most reasonable (and acceptable) behavior. The comments below are just the starting of a discussion about some aspects of design. I'm sorry that this discussion might be changing the way to go largily, but I'd like to hear what you think about two topics. I found seemingly inconsistent handling of NaN. - Old box_same assumed NaN != NaN, but new assumes NaN == NaN. I'm not sure how the difference is significant out of the context of sorting, though... - box_overlap()'s behavior has not been changed. As the result box_same and box_overlap now behaves differently concerning NaN handling. - line_construct_pts has been changed so that generates {NaN,-1,NaN} for two identical points (old code generated {-1,0,0}) but I think the right behavior here is erroring out. (as line_in behaves.) I feel that it'd better to define how to handle non-numbers before refactoring. I prefer to just denying NaN as a part of the geometric types, or any input containing NaN just yields a result filled with NaNs. The next point is reasonability of the algorithm and consistency among related functions. - line_parallel considers the EPSILON(ugaa) with different scale from the old code, but both don't seem reasonable.. It might not be the time to touch it, but if we changed the behavior, I'd like to choose reasonable one. At least the tolerance of parallelity should be taken by rotation, not by slope. I think that one reasonable way to do this is taking the distance between the far ends of two direction (unit) vectors. Assuming Ax + By + C = 0, the direction vecotr dv(x, y) for the line is: n = sqrt(A^2 + B^2), dv = (B / n, -A / n). (and the normal vector nv = (A / n, B / n)) The vector needs to be restricted within upper or any two successive quadrants so that it is usable for this objective. So let's restrict A to be negative value for now. Something like the follwoing would be an answer. void line_direction_vector(Point *result, LINE *l) { float8 A = l->A; float8 B = l->B; float8 n; n = HYPOT(A, B); /* keep the result vector within the 1st-2nd quadrants */ if (A > 0) { A = -A; B = -B; } point_construct(result, B / n, -A / n); } bool line_parallel(LINE *l1, LINE *l2) { Point d1, d2; line_direction_vector(&d1, l1); line_direction_vector(&d2, l2); return FPzero(point_dt(&d1, &d2)); } Also perpendicularity is detected by comparing the direction vector of one line and the normal vector of another in the same manner. void line_normal_vector(Point *result, LINE *l) { float8 A = l->A; float8 B = l->B; float8 n; /* keep the result vector within the 1st-2nd quadrants */ n = HYPOT(A, B); if (B < 0) { A = -A; B = -B; } point_construct(result, A / n, B / n); } bool line_perp(LINE *l1, LINE *l2) { Point d1, d2; line_direction_vector(&d1, l1); line_normal_vector(&d2, l2); return FPzero(point_dt(&d1, &d2)); } In relation to this, equality of two lines is also nuisance. From the definitions, two equal lines are parallel. In contraposition, two nonparallel lines cannot be equal. This relationship is represented by the following expression: is_equal =: line_parallel(l1, l2) && FPzero(line_distance(l1, l2)) But the line_distance returns zero in most cases even if it is considered "parallel" with considering tolerance. This means that There's almost no two lines that are parallel but not equal (that is, the truely parallel lines)... sigh. So we might should calculate the distance in different (not purely mathematical/geometrical) way. For example, if we can assume the geometric objects are placed mainly around the origin, we could take the distance between the points nearest to origin on both lines. (In other words, between the foots of perpendicular lines from the origin onto the both lines). Of couse this is not ideal but... The same discussion also affects line_interpt_line or other similar functions. At last, just a simple comment. - point_eq_point() assumes that NaN == NaN. This is an inherited behavior from old float[48]_cmp_internal() but it's not a common treat. point_eq_point() needs any comment about the special definition as float[48]_cmp_internal has. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
> I found seemingly inconsistent handling of NaN. > > - Old box_same assumed NaN != NaN, but new assumes NaN == > NaN. I'm not sure how the difference is significant out of the > context of sorting, though... There is no box != box operator. If there was one, previous code would return false for every input including NaN, because it was ignorant about NaNs other than a few bandaids to avoid crashes. My idea is to make the equality (same) operators behave like the float datatypes treating NaNs as equals. The float datatypes also treat NaNs as greater than any non-NAN. We don't need to worry about this part now, because there are no basic comparison operators defined for the geometric types. > - box_overlap()'s behavior has not been changed. As the result > box_same and box_overlap now behaves differently concerning > NaN handling. After your previous email, I though this would be the right thing to do. I made the containment and placement operators return false for NaN input unless we can find the result using non-NaN coordinates. Do you find this behaviour reasonable? > - line_construct_pts has been changed so that generates > {NaN,-1,NaN} for two identical points (old code generated > {-1,0,0}) but I think the right behavior here is erroring out. > (as line_in behaves.) I agree. We should also check for the endpoints being the same on lseg_interpt functions. I will make the changes on the next version. > I feel that it'd better to define how to handle non-numbers > before refactoring. I prefer to just denying NaN as a part of > the geometric types, or any input containing NaN just yields a > result filled with NaNs. It would be nice to deny NaNs altogether, but I don't think it is feasible. People cannot restore their existing data if we start doing that. It would also require us to check any result we emit being NaN and error out in more cases. > The next point is reasonability of the algorithm and consistency > among related functions. > > - line_parallel considers the EPSILON(ugaa) with different scale > from the old code, but both don't seem reasonable.. It might > not be the time to touch it, but if we changed the behavior, > I'd like to choose reasonable one. At least the tolerance of > parallelity should be taken by rotation, not by slope. I think you are right. Vector based algorithms would be good for many other operations as well. This would be more changes, though. I am try to reduce the amount of changes to increase my chances to get this committed. I just used slope() there to increase the code reuse and to make the operation symmetrical. I can revert it back to its previous state, if you thing it is better. > So we might should calculate the distance in different (not > purely mathematical/geometrical) way. For example, if we can > assume the geometric objects are placed mainly around the origin, > we could take the distance between the points nearest to origin > on both lines. (In other words, between the foots of > perpendicular lines from the origin onto the both lines). Of > couse this is not ideal but... The EPSILON is unfair to coordinates closer to the origin. I am not sure what it the best way to improve this. I am trying to avoid dealing with EPSILON related issues in the scope of these changes. > At last, just a simple comment. > > - point_eq_point() assumes that NaN == NaN. This is an inherited > behavior from old float[48]_cmp_internal() but it's not a > common treat. point_eq_point() needs any comment about the > special definition as float[48]_cmp_internal has. I hope I answered this on the previous questions. I will incorporate the decision to threat NaNs as equals into the comments and the commit messages on the next version, if we agree on it.
Hello, I'm still wandering on the way and confused. Sorry for inconsistent comments in advanceX-( At Sun, 14 Jan 2018 13:20:57 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzyY3U4n1Zn48qG6dL=FWgv29yag5PdGV2Gc17w9Toeaog@mail.gmail.com> > > I found seemingly inconsistent handling of NaN. > > > > - Old box_same assumed NaN != NaN, but new assumes NaN == > > NaN. I'm not sure how the difference is significant out of the > > context of sorting, though... > > There is no box != box operator. If there was one, previous code > would return false for every input including NaN, because it was > ignorant about NaNs other than a few bandaids to avoid crashes. > > My idea is to make the equality (same) operators behave like the float > datatypes treating NaNs as equals. The float datatypes also treat > NaNs as greater than any non-NAN. We don't need to worry about this > part now, because there are no basic comparison operators defined for > the geometric types. I'm not sure what you mean by the "basic comparison ops" but I'm fine with the policy, handling each component values in the same way with float. So point_eq_point() is right and other functions should follow the same policy. > > - box_overlap()'s behavior has not been changed. As the result > > box_same and box_overlap now behaves differently concerning > > NaN handling. > > After your previous email, I though this would be the right thing to > do. I made the containment and placement operators return false for > NaN input unless we can find the result using non-NaN coordinates. Do > you find this behaviour reasonable? Sorry for going back and force, but I don't see a difference between it and the original behavior from the point of view of reasonability. Isn't is enough to let each component comparison follow float by redefining FPxx() functions? For example, define FPle as the follwoing: static inline bool FP8le(float8 a, float8 b) { return float8_le(a, float8_pl(b, EPSILON)); } Then define box_ov using this function: static bool box_ov(BOX *box1, BOX *box2) { return (FP8le(box1->low.x, box2->high.x) &&....); } Another example, define FPeq as: static inline bool FP8eq(float8 a, float8 b) { if (isnan(a)) { if (isnan(b)) return true; return false; } else if (isnan(b)) return false; if (isinf(a) && isinf(b)) return true; return abs(a - b) <= EPSILON; } Then redefine point_eq_point as: statinc inline bool point_eq_point(Point *pt1, Point *pt2) { return FP8eq(pt1->x, pt2->x) && FP8eq(pt1->y, pt2->y); } ... At least this is in a kind of consistency and the behavior is inferable (in a certain sense). > > - line_construct_pts has been changed so that generates > > {NaN,-1,NaN} for two identical points (old code generated > > {-1,0,0}) but I think the right behavior here is erroring out. > > (as line_in behaves.) > > I agree. We should also check for the endpoints being the same on > lseg_interpt functions. I will make the changes on the next version. > > > I feel that it'd better to define how to handle non-numbers > > before refactoring. I prefer to just denying NaN as a part of > > the geometric types, or any input containing NaN just yields a > > result filled with NaNs. > > It would be nice to deny NaNs altogether, but I don't think it is > feasible. People cannot restore their existing data if we start doing > that. It would also require us to check any result we emit being NaN > and error out in more cases. It sounds right. > > The next point is reasonability of the algorithm and consistency > > among related functions. > > > > - line_parallel considers the EPSILON(ugaa) with different scale > > from the old code, but both don't seem reasonable.. It might > > not be the time to touch it, but if we changed the behavior, > > I'd like to choose reasonable one. At least the tolerance of > > parallelity should be taken by rotation, not by slope. > > I think you are right. Vector based algorithms would be good for many > other operations as well. This would be more changes, though. I am > try to reduce the amount of changes to increase my chances to get this > committed. I just used slope() there to increase the code reuse and > to make the operation symmetrical. I can revert it back to its > previous state, if you thing it is better. I'm fine with both using slope with the same scale (that is, keeping the previous behavior) or going into vector based. But the latter raises a problem as below:( > > So we might should calculate the distance in different (not > > purely mathematical/geometrical) way. For example, if we can > > assume the geometric objects are placed mainly around the origin, > > we could take the distance between the points nearest to origin > > on both lines. (In other words, between the foots of > > perpendicular lines from the origin onto the both lines). Of > > couse this is not ideal but... > > The EPSILON is unfair to coordinates closer to the origin. I am not > sure what it the best way to improve this. I am trying to avoid > dealing with EPSILON related issues in the scope of these changes. Right. And I don't have a good idea here.. > > At last, just a simple comment. > > > > - point_eq_point() assumes that NaN == NaN. This is an inherited > > behavior from old float[48]_cmp_internal() but it's not a > > common treat. point_eq_point() needs any comment about the > > special definition as float[48]_cmp_internal has. > > I hope I answered this on the previous questions. I will incorporate > the decision to threat NaNs as equals into the comments and the commit > messages on the next version, if we agree on it. Perhas it's fine for me. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
> I'm not sure what you mean by the "basic comparison ops" but I'm > fine with the policy, handling each component values in the same > way with float. So point_eq_point() is right and other functions > should follow the same policy. I mean <, >, <= and >= by basic comparison operators. Operators with those symbols are available for some geometric types, but they are comparing the sizes of the objects. Currently only the equality operators follow the same policy with point_eq_point (), others never return true when NaNs are involved. > Sorry for going back and force, but I don't see a difference > between it and the original behavior from the point of view of > reasonability. Isn't is enough to let each component comparison > follow float by redefining FPxx() functions? My previous patch was doing exactly that. I though that is not what we want to do. Do we want box_overlap() to return true when NaNs are involved?
Hello, I played more around geometric types and recalled that geometric types don't have orderings. Also have corrected some other misunderstanding. Sorry for my confusion and I think I returned on the way. At Wed, 17 Jan 2018 18:59:30 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzzgJ7B-HdmxuSDoqu-_x1nEeoEA45is2hP8ex4r3KNH8Q@mail.gmail.com> > > I'm not sure what you mean by the "basic comparison ops" but I'm > > fine with the policy, handling each component values in the same > > way with float. So point_eq_point() is right and other functions > > should follow the same policy. > > I mean <, >, <= and >= by basic comparison operators. Operators with > those symbols are available for some geometric types, but they are > comparing the sizes of the objects. Currently only the equality > operators follow the same policy with point_eq_point (), others never > return true when NaNs are involved. > > > Sorry for going back and force, but I don't see a difference > > between it and the original behavior from the point of view of > > reasonability. Isn't is enough to let each component comparison > > follow float by redefining FPxx() functions? > > My previous patch was doing exactly that. I though that is not what > we want to do. Do we want box_overlap() to return true when NaNs are > involved? All comparison operators don't need consideration on ordering. And the existing comparison operators behaves diferrently from float and already behaves in the way of bare float. There's no behavior as the whole of an object. I have fixed my understanding as this. (And I saw all patches altoghter by mistake..) As the result most my concern was pointless. 0001: - You removed the check of parallelity check of line_interpt_line(old line_interpt_internal) but line_parallel is not changed so the consistency between the two functions are lost. It is better to be another patch file (maybe 0004?). - + Assert(p1->npts > 0 && p2->npts > 0); Other path_ functions are allowing path with no points so just returning false if (p1->npts == 0 || p2->npts == 0) seems enough. Assertion should be used only for something server cannot continue working. (division by zero doesn't crash server) I saw other similar assertions in the patch. 0002: I have no comment so far on this patch. 0003: close_pl/ps/lseg/pb/ls/sb have changed to return null when lseg_closept_line returns NAN, but they are defined as strict and that is reasonable. Anyway pg_hypot returns NaN only when parameters contain INF or NAN so we should error out when it returns nan. circle_in rejects negative radius and circle_recv modived to reject zero and negative. What is the reason for the change? I understand that we don't need to consider NAN is equal to NAN. circle_same, point_eq_point, line_eq no longer needs such change? Assertion is added in pg_hypot but it is impossible for result to be negative there. Why do you added it? 0004: Looking line_recv's change, I became to think that FPxx macros is provided for coordinate values. I don't think it is applied to non-coordinate values like coefficients. If some kind of tolerance is needed here, it is not the same with the EPSILON. + if (FPzero(line->A) && FPzero(line->B)) + ereport(ERROR, So, the above should be exact comparison with 0.0 and line_in also should be the same. And maybe the same thing should be done at many places.. But the following line of line_parallel still requires some kind of tolerance to work properly. Since this patch is an imoprovement patch, I think we can change it to the vector method. The problem of line_distance is an existing one so we can leave it alone. It returns 0.0 for the most cases but it is a long-standing behavior.. (Anyway I don't find a reasonable definition of the distance between very-nearly parallel lines..) -- Sorry time's up today. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
> 0001: > > - You removed the check of parallelity check of > line_interpt_line(old line_interpt_internal) but line_parallel > is not changed so the consistency between the two functions are > lost. It is better to be another patch file (maybe 0004?). I am making line_parallel() use line_interpt_line(). What they do is exactly the same. > - + Assert(p1->npts > 0 && p2->npts > 0); > > Other path_ functions are allowing path with no points so just > returning false if (p1->npts == 0 || p2->npts == 0) seems > enough. Assertion should be used only for something server > cannot continue working. (division by zero doesn't crash > server) I saw other similar assertions in the patch. path_in() and path_recv() reject paths with no points. I thought we shouldn't spend cycles for things that cannot happen. I can revert this part if you find previous practice better. > 0003: > > close_pl/ps/lseg/pb/ls/sb have changed to return null when > lseg_closept_line returns NAN, but they are defined as strict > and that is reasonable. Anyway pg_hypot returns NaN only when > parameters contain INF or NAN so we should error out when it > returns nan. I though strict is only related to the input being NULL. Tom Lane made close_ps() return NULL with commit 278148907a971ec7fa4ffb24248103d8012155d2. The other functions currently fail with elog()s from DirectFunctionCalls. I don't have strong preference for NULL or an error. Could you please suggest an errcode and errmsg, if you have? > circle_in rejects negative radius and circle_recv modived to > reject zero and negative. What is the reason for the change? Overlooking. Thank you for noticing. I am fixing it. > I understand that we don't need to consider NAN is equal to NAN. > circle_same, point_eq_point, line_eq no longer needs such > change? I though it would be better to consider NaNs equal only on the equality operators. That is why only they are changed that way. > Assertion is added in pg_hypot but it is impossible for result > to be negative there. Why do you added it? True. I am removing it. > 0004: > > Looking line_recv's change, I became to think that FPxx macros > is provided for coordinate values. I don't think it is applied > to non-coordinate values like coefficients. If some kind of > tolerance is needed here, it is not the same with the EPSILON. > > + if (FPzero(line->A) && FPzero(line->B)) > + ereport(ERROR, > > So, the above should be exact comparison with 0.0 and line_in > also should be the same. And maybe the same thing should be done > at many places.. I agree you. EPSILON is currently applied prematurely. I am trying to stay away from EPSION related issues to improve the chances to get this part committed. > But the following line of line_parallel still requires some kind > of tolerance to work properly. Since this patch is an > imoprovement patch, I think we can change it to the vector method. I am making line_parallel() use line_interpt_line() in response to your first remark. Vector based algorithm is probably better for every use of line_interpt_line(), but it is a bigger change with more user visible effects. > The problem of line_distance is an existing one so we can leave > it alone. It returns 0.0 for the most cases but it is a > long-standing behavior.. (Anyway I don't find a reasonable > definition of the distance between very-nearly parallel lines..) Exact comparison with 0.0 instead of FPzero() would also be an improvement for line_distance(), but I am not doing it now. > -- Sorry time's up today. I am waiting for the rest of your review to post the new versions.
Hello, At Thu, 18 Jan 2018 16:01:01 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzwgcan3w3=-3oHa4Efif=0yvoErn-e_V5KJLUi9pd8ivQ@mail.gmail.com> > > 0001: > > > > - You removed the check of parallelity check of > > line_interpt_line(old line_interpt_internal) but line_parallel > > is not changed so the consistency between the two functions are > > lost. It is better to be another patch file (maybe 0004?). > > I am making line_parallel() use line_interpt_line(). What they do is > exactly the same. Thanks. > > - + Assert(p1->npts > 0 && p2->npts > 0); > > > > Other path_ functions are allowing path with no points so just > > returning false if (p1->npts == 0 || p2->npts == 0) seems > > enough. Assertion should be used only for something server > > cannot continue working. (division by zero doesn't crash > > server) I saw other similar assertions in the patch. > > path_in() and path_recv() reject paths with no points. I thought we > shouldn't spend cycles for things that cannot happen. I can revert > this part if you find previous practice better. I'm fine with the current shape. Thanks. Maybe the same discussion applies to polygons. (cf. poly_overlap) > > 0003: > > > > close_pl/ps/lseg/pb/ls/sb have changed to return null when > > lseg_closept_line returns NAN, but they are defined as strict > > and that is reasonable. Anyway pg_hypot returns NaN only when > > parameters contain INF or NAN so we should error out when it > > returns nan. > > I though strict is only related to the input being NULL. Tom Lane Oops. Yes. you're rigtht. > made close_ps() return NULL with commit > 278148907a971ec7fa4ffb24248103d8012155d2. The other functions Thank you for the pointer. By the way, https://www.postgresql.org/message-id/1919.1468269494%40sss.pgh.pa.us | > Is it reasonable to disallow NaN coordinates on the next major | > release. Are there any reason to deal with them? | | I don't see how we can do that; what would you do about tables already | containing NaNs? Even without that consideration, assuming that there's | no way a NaN could creep in seems a pretty fragile assumption, considering | that operations like Infinity/Infinity will produce one. Ok, it is convicing. The policy is don't do anything that is not harmful to server. Currently close_* are *strict* so what we should do is avoid returning '(anything *)NULL' as a result. > currently fail with elog()s from DirectFunctionCalls. I don't have > strong preference for NULL or an error. Could you please suggest an > errcode and errmsg, if you have? =# select close_sb('((nan,0),(1,1))'::lseg, '((-1,-1),(2,2))'::box); ERROR: function 0x8fe61c returned NULL Recosdering on it and I came to think that such usage of SQL null is the same to "invalid" objects I mentioned upthread. But we don't actively check component NaNs but if we happen to see an invalid result, return SQL null instead. In this criteria close_* functions looks good that return SQL null. 0003: line_closept_point asserts line_interpt_line returns true but it is hazardous. It is safer if line_closept_point returns NaN if line_interpt_line returns false. All other functions looks good in the criteria. > > I understand that we don't need to consider NAN is equal to NAN. > > circle_same, point_eq_point, line_eq no longer needs such > > change? > > I though it would be better to consider NaNs equal only on the > equality operators. That is why only they are changed that way. I'm convinced by that. > > 0004: > > > > Looking line_recv's change, I became to think that FPxx macros > > is provided for coordinate values. I don't think it is applied > > to non-coordinate values like coefficients. If some kind of > > tolerance is needed here, it is not the same with the EPSILON. > > > > + if (FPzero(line->A) && FPzero(line->B)) > > + ereport(ERROR, > > > > So, the above should be exact comparison with 0.0 and line_in > > also should be the same. And maybe the same thing should be done > > at many places.. > > I agree you. EPSILON is currently applied prematurely. I am trying > to stay away from EPSION related issues to improve the chances to get > this part committed. Agreed. > > But the following line of line_parallel still requires some kind > > of tolerance to work properly. Since this patch is an > > imoprovement patch, I think we can change it to the vector method. > > I am making line_parallel() use line_interpt_line() in response to > your first remark. Vector based algorithm is probably better for > every use of line_interpt_line(), but it is a bigger change with more > user visible effects. I'm fine with that. > > The problem of line_distance is an existing one so we can leave > > it alone. It returns 0.0 for the most cases but it is a > > long-standing behavior.. (Anyway I don't find a reasonable > > definition of the distance between very-nearly parallel lines..) > > Exact comparison with 0.0 instead of FPzero() would also be an > improvement for line_distance(), but I am not doing it now. reagrds, -- Kyotaro Horiguchi NTT Open Source Software Center
> I'm fine with the current shape. Thanks. Maybe the same > discussion applies to polygons. (cf. poly_overlap) It indeed does. I am incorporating. > line_closept_point asserts line_interpt_line returns true but it > is hazardous. It is safer if line_closept_point returns NaN if > line_interpt_line returns false. Yes, it makes more sense. I am changing it. >> > I understand that we don't need to consider NAN is equal to NAN. >> > circle_same, point_eq_point, line_eq no longer needs such >> > change? >> >> I though it would be better to consider NaNs equal only on the >> equality operators. That is why only they are changed that way. > > I'm convinced by that. Great. I am documenting this decision better on the patches.
New versions are attached including all changes we discussed.
Attachment
At Sun, 21 Jan 2018 21:59:19 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzxDYs5tcvc4uErsWaFTb3UTYS0ERt_fFyi-28Ldvs5d4A@mail.gmail.com> > New versions are attached including all changes we discussed. Thanks for the new version. # there's many changes from the previous version.. About 0001 and 0002. 1."COPT=-DGEODEBUG make" complains as follows. | geo_ops.c:2445:62: error: invalid type argument of unary ‘*’ (have ‘float8 {aka double}’) | printf("dist_ppoly_internal- segment 0/n distance is %f\n", *result); 2. line_construct_pm has been renamed to line_construct. I noticed that the patch adds the second block for "(m == 0.0)" (from the ealier versions) but it seems to work exactly as the same to the "else" block. We need a comment about the reason for the seemingly redundant second block. 3. point_sl can return -0.0 and that is a thing that this patch intends to avoid. line_invsl has the same problem. 4. lseg_interpt_line is doing as follows. > if (FPeq(lseg->p[0].x, interpt.x) && FPeq(lseg->p[0].y, interpt.y)) > *result = lseg->p[0]; > else if (FPeq(lseg->p[1].x, interpt.x) && FPeq(lseg->p[1].y, interpt.y)) > *result = lseg->p[1]; I suppose we can use point_pt_point for this purpose. > if (point_eq_point(&lseg->p[0], &interpt)) > *result = lseg->p[0]; > else if (point_eq_point(&lseg->p[1], &interpt)) > *result = lseg->p[1]; However I'm not sure that adjusting the intersection to the tips of the segment is good or not. Adjusting onto the line can be better in another case. lseg_interpt_lseg, for instance, checks lseg_contain_point on the line parameter of lseg_interpt_line. # I'll be back later.. regards -- Kyotaro Horiguchi NTT Open Source Software Center
At Wed, 31 Jan 2018 13:09:09 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180131.130909.210233873.horiguchi.kyotaro@lab.ntt.co.jp> > At Sun, 21 Jan 2018 21:59:19 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzxDYs5tcvc4uErsWaFTb3UTYS0ERt_fFyi-28Ldvs5d4A@mail.gmail.com> > > New versions are attached including all changes we discussed. > > Thanks for the new version. > > # there's many changes from the previous version.. > > About 0001 and 0002. > > 1."COPT=-DGEODEBUG make" complains as follows. > > | geo_ops.c:2445:62: error: invalid type argument of unary ‘*’ (have ‘float8 {aka double}’) > | printf("dist_ppoly_internal- segment 0/n distance is %f\n", *result); > > 2. line_construct_pm has been renamed to line_construct. I > noticed that the patch adds the second block for "(m == 0.0)" > (from the ealier versions) but it seems to work exactly as the > same to the "else" block. We need a comment about the reason > for the seemingly redundant second block. > > 3. point_sl can return -0.0 and that is a thing that this patch > intends to avoid. line_invsl has the same problem. > > 4. lseg_interpt_line is doing as follows. > > > if (FPeq(lseg->p[0].x, interpt.x) && FPeq(lseg->p[0].y, interpt.y)) > > *result = lseg->p[0]; > > else if (FPeq(lseg->p[1].x, interpt.x) && FPeq(lseg->p[1].y, interpt.y)) > > *result = lseg->p[1]; > > I suppose we can use point_pt_point for this purpose. > > > if (point_eq_point(&lseg->p[0], &interpt)) > > *result = lseg->p[0]; > > else if (point_eq_point(&lseg->p[1], &interpt)) > > *result = lseg->p[1]; > > However I'm not sure that adjusting the intersection to the > tips of the segment is good or not. Adjusting onto the line > can be better in another case. lseg_interpt_lseg, for > instance, checks lseg_contain_point on the line parameter of > lseg_interpt_line. > > # I'll be back later.. I've been back. 0003: This patch replaces "double" with float and bare arithmetic and comparisons on double to special functions accompanied with checking against abnormal values. - Almost all of them are eliminated with a few safe exceptions. - circle_recv(), circle_distance(), dist_pc(), dist_cpoint() are using "< 0.0" comparison but it looks fine. - pg_hypot is right to use bare arithmetics. ! circle_contain_pt() does the following comparison and it seems to be out of our current policy. point_dt(center, point) <= radius I suppose this should use FPle. FPle(point_dt(center, point), radius) The same is true of circle_contain_pt(), pt_contained_circle . # Sorry, that' all for today.. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
> 1."COPT=-DGEODEBUG make" complains as follows. > > | geo_ops.c:2445:62: error: invalid type argument of unary ‘*’ (have ‘float8 {aka double}’) > | printf("dist_ppoly_internal- segment 0/n distance is %f\n", *result); Fixing. > 2. line_construct_pm has been renamed to line_construct. I > noticed that the patch adds the second block for "(m == 0.0)" > (from the ealier versions) but it seems to work exactly as the > same to the "else" block. We need a comment about the reason > for the seemingly redundant second block. It is indeed redundant. I am removing it. > 3. point_sl can return -0.0 and that is a thing that this patch > intends to avoid. line_invsl has the same problem. The existing version of the function has the same issue. I think we need a global solution for -0s. See the float-zero patch. > 4. lseg_interpt_line is doing as follows. > > > if (FPeq(lseg->p[0].x, interpt.x) && FPeq(lseg->p[0].y, interpt.y)) > > *result = lseg->p[0]; > > else if (FPeq(lseg->p[1].x, interpt.x) && FPeq(lseg->p[1].y, interpt.y)) > > *result = lseg->p[1]; > > I suppose we can use point_pt_point for this purpose. Yes, I am changing it. > > if (point_eq_point(&lseg->p[0], &interpt)) > > *result = lseg->p[0]; > > else if (point_eq_point(&lseg->p[1], &interpt)) > > *result = lseg->p[1]; > > However I'm not sure that adjusting the intersection to the > tips of the segment is good or not. Adjusting onto the line > can be better in another case. lseg_interpt_lseg, for > instance, checks lseg_contain_point on the line parameter of > lseg_interpt_line. Me neither, but it is probably better than returning a point that extends the endpoints of the segment. I am inclined to leave it alone for now.
> ! circle_contain_pt() does the following comparison and it > seems to be out of our current policy. > > point_dt(center, point) <= radius > > I suppose this should use FPle. > > FPle(point_dt(center, point), radius) > > The same is true of circle_contain_pt(), pt_contained_circle . box_contain_point() also doesn't use the macros. They are certainly inconsistent, but I don't think it would be an improvement to make them use the macros. As we have discussed, there are many problems with the current application of EPSILON. I think we would be better off not using the macros for none of the containment operators, but this is out of my scope for now. > # Sorry, that' all for today.. I am waiting the rest of your review to post the new versions.
Hello, At Wed, 31 Jan 2018 17:33:42 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180131.173342.26333067.horiguchi.kyotaro@lab.ntt.co.jp> > 0003: This patch replaces "double" with float and bare arithmetic > and comparisons on double to special functions accompanied > with checking against abnormal values. > > - Almost all of them are eliminated with a few safe exceptions. > > - circle_recv(), circle_distance(), dist_pc(), dist_cpoint() > are using "< 0.0" comparison but it looks fine. > > - pg_hypot is right to use bare arithmetics. > > ! circle_contain_pt() does the following comparison and it > seems to be out of our current policy. > > point_dt(center, point) <= radius > > I suppose this should use FPle. > > FPle(point_dt(center, point), radius) > > The same is true of circle_contain_pt(), pt_contained_circle . - line_eq looks too complex in the normal (not containing NANs) cases. We should avoid such complexity if possible. One problem here is that comparison conceals NANness of operands. Conversely arithmetics propagate it. We can converge NANness into a number. The attached line_eq() doesn that. This doesn't have almost no additional complexity when NAN is involved. I believe it qbehaves in the same way and shares a doubious behavior like this. =# select '{nan, 1, nan}'::line = '{nan, 2, nan}'::line; ?column? ---------- t But probably no point in fixing(?) it. The attached file contains line_eq, point_eq_point and circle_same. I expect that line_eq is fast but other two are doubious. 0004: - line_perp We can detect perpendicularity without division. The normal vecotor of Ax + Bx + C = 0 is (A, B). If two lines are perpendicular, the inner product of the normal vectors of v1 and v2 is 0. No point in dividing. l1->A * l2->A + l1->B * l2->B == 0 . . . Mmm.. The function seems broken. I posted the fix for the existing version is posted, and line_perp() in the attched file will work fine. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
> - line_eq looks too complex in the normal (not containing NANs) > cases. We should avoid such complexity if possible. > > One problem here is that comparison conceals NANness of > operands. Conversely arithmetics propagate it. We can converge > NANness into a number. The attached line_eq() doesn that. This > doesn't have almost no additional complexity when NAN is > involved. I believe it qbehaves in the same way > and shares a doubious behavior like this. > > =# select '{nan, 1, nan}'::line = '{nan, 2, nan}'::line; > ?column? > ---------- > t > > But probably no point in fixing(?) it. I think we should fix it. > The attached file contains line_eq, point_eq_point and > circle_same. I expect that line_eq is fast but other two are > doubious. I haven't got an attachment. > . . . Mmm.. The function seems broken. I posted the fix for > the existing version is posted, and line_perp() in the attched > file will work fine. I am incorporating the fix you have posted to the other thread to this patch.
Hi, On 2018-02-07 16:46:38 +0100, Emre Hasegeli wrote: > I am incorporating the fix you have posted to the other thread to this patch. You've not posted a version fixing the issues in the surrounding thread, do I see that right? I'm a bit confused how this patch has gone through several revisions since, but has been marked as "ready for committer" since 2017-09-05. Am I missing something? Greetings, Andres Freund
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: not tested Spec compliant: not tested Documentation: not tested Unfortunately according to http://commitfest.cputube.org/ this patch doesn't apply anymore. The new status of this patch is: Waiting on Author
> Unfortunately according to http://commitfest.cputube.org/ this patch doesn't apply anymore. New versions are attached.
Attachment
> I'm a bit confused how this patch has gone through several revisions > since, but has been marked as "ready for committer" since 2017-09-05. Am > I missing something? This is floating between commitfests for longer than a year. Aleksander Alekseev set it to "ready for committer", but Kyotaro HORIGUCHI continues his review. I hope not many issues have remained.
Hello, sorry to be late. At Fri, 2 Mar 2018 15:26:25 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in <CAE2gYzzj6wgGeBY1vdAHBzzQDSDs-8NhD+=eTcgBKEsCBWpUXg@mail.gmail.com> > > Unfortunately according to http://commitfest.cputube.org/ this patch doesn't apply anymore. > New versions are attached. Thank you for the revised version. This seems fine for me so marked this as "Ready for Committer". - This applies cleanly on the master head and regression passes. - The new behavior looks sane (except for the EPSILON, which is out of the scope of this patch). - Test is complete as long as I can see. At least far more completely filled than the previous state. Some test items might seem a bit big, but it seems to be needed to raise coverage on required combinaions of dimension values. By the way I think that line_perp is back patched, could you propose it for the older versions? (I already marked my entry as Rejected) Brief summary follows (almost same to the header of patch files): - 0001 looks fine. > * Eliminate SQL level function calls > * Re-use more functions to implement others > * Unify internal function names and signatures > * Remove private functions from geo_decls.h > * Replace not-possible checks with Assert() > * Add comments to describe some functions > * Remove some unreachable code > * Define delimiter symbols of line datatype like the other ones > * Remove debugging print()s from the refactored functions And one bug fix. - 0002 looks fine. Refactors adt/float.c and utils/float.h making float checking *macros* into inline functions. making float comparison operators more efficiently. others are the consequence of the above change. and fixes NaN problem of GiST. - 0003 looks fine. just changes the usage of double to float8 as more proper type. uses operator functions instead of bare arithmetics to handle arithmetic errors more properly. - 0004 looks fine. (Sorry for overlooking that this treats bugs) all overlooked failure cases seems to be fixed. - 0005 looks fine. this unifies +-0.0 to +0.0 for the convenient of later processing. - 0006 It seems cover the all types of operations. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
> Thank you for the revised version. This seems fine for me so > marked this as "Ready for Committer". Thank you for bearing with me for longer than year. > By the way I think that line_perp is back patched, could you > propose it for the older versions? (I already marked my entry as > Rejected) Do you mean back-patching? > Brief summary follows (almost same to the header of patch files): > > - 0001 looks fine. > > * Eliminate SQL level function calls > > * Re-use more functions to implement others > > * Unify internal function names and signatures > > * Remove private functions from geo_decls.h > > * Replace not-possible checks with Assert() > > * Add comments to describe some functions > > * Remove some unreachable code > > * Define delimiter symbols of line datatype like the other ones > > * Remove debugging print()s from the refactored functions > > And one bug fix. What is that one? Should I incorporate it into the commit message?
Rebased versions are attached.
Attachment
Hi Emre, Thanks for the rebased patch. I remember reviewing the patch in the last CF, and it seems in a pretty good shape. I plan to look at it again in the next commitfest, but it seems to have been reviewed by other experienced people so I'm not worried about this part. The main remaining question I have is what do do with back-branches. Shall we back-patch this or not? The trouble is that while the patch is essentially a bugfix, it refactors quite significant amount of code to make the fixes practical. If it was possible to back-patch just the fixes without the refactoring, that would be ideal, but unfortunately that's not the case. Based on discussion with Emre in Ottawa that would be rather impractical due to the nature of the bugs and low code reuse. I do believe we should back-patch - after all, it fixes real bugs. It's true the bugs were there for years and no one noticed/reported them, but it's still buggy and that's not fun. Opinions? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > The main remaining question I have is what do do with back-branches. > Shall we back-patch this or not? Given the behavioral changes involved, I'd say "no way". That's reinforced by the lack of field complaints; if there were lots of complaints, maybe we'd be willing to break backwards compatibility, but ... regards, tom lane
On 06/03/2018 11:50 PM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> The main remaining question I have is what do do with back-branches. >> Shall we back-patch this or not? > > Given the behavioral changes involved, I'd say "no way". That's > reinforced by the lack of field complaints; if there were lots of > complaints, maybe we'd be willing to break backwards compatibility, > but ... > Fair enough, I tend to over-estimate importance of bugfixes and under-estimate breakage due to behavior change. But if we don't want to back-patch this, I'm fine with that. I was a bit worried about making future backpatches more painful, but this code received only ~20 commits over the past files, half of that due tot pgindent, so that seems to be a non-issue. But now I'm wondering what does this mean for existing indexes? Doesn't this effectively mean those are unlikely to give meaningful responses (in the old or new semantics)? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> But now I'm wondering what does this mean for existing indexes? Doesn't this > effectively mean those are unlikely to give meaningful responses (in the old > or new semantics)? The patches shouldn't cause any change to the indexable operators. The fixes are mostly around the lines and the line segments which doesn't have index support.
On Sun, Jun 3, 2018 at 12:58 PM, Emre Hasegeli <emre@hasegeli.com> wrote: > Rebased versions are attached. Hi Emre, This produces build errors on Windows[1][2]: C:\projects\postgresql\src\include\utils/float.h(136): warning C4013: '_fpclass' undefined; assuming extern returning int [C:\projects\postgresql\postgres.vcxproj] C:\projects\postgresql\src\include\utils/float.h(297): warning C4013: '_isnan' undefined; assuming extern returning int [C:\projects\postgresql\postgres.vcxproj] C:\projects\postgresql\src\include\utils/float.h(136): error C2065: '_FPCLASS_PINF' : undeclared identifier [C:\projects\postgresql\postgres.vcxproj] C:\projects\postgresql\src\include\utils/float.h(136): error C2065: '_FPCLASS_NINF' : undeclared identifier [C:\projects\postgresql\postgres.vcxproj] 2778 This is apparently coming from the expansion of the following macros from src/include/port/win32_port.h: #if (_MSC_VER < 1800) #define isinf(x) ((_fpclass(x) == _FPCLASS_PINF) || (_fpclass(x) == _FPCLASS_NINF)) #define isnan(x) _isnan(x) #endif Those underscore-prefixed names are defined in Microsoft's <float.h>[3][4]. So now I'm wondering if win32_port.h needs to #include <float.h> if (_MSC_VER < 1800). [1] http://cfbot.cputube.org/emre-hasegeli.html [2] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.1022 [3] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fpclass-fpclassf [4] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/isnan-isnan-isnanf -- Thomas Munro http://www.enterprisedb.com
> Those underscore-prefixed names are defined in Microsoft's > <float.h>[3][4]. So now I'm wondering if win32_port.h needs to > #include <float.h> if (_MSC_VER < 1800). I don't have the C experience to decide the correct way. There are currently many .c files that are including float.h conditionally or unconditionally. The condition they use is "#ifdef _MSC_VER" without a version. One idea is to include float.h from the new utils/float.h file together with math.h, and remove those includes from the .c files which would include utils/float.h. We can do this only, or together with what you suggest, or by also keeping the includes on the .c files. Which way do you think is the proper?
On 06/05/2018 06:32 PM, Emre Hasegeli wrote: >> Those underscore-prefixed names are defined in Microsoft's >> <float.h>[3][4]. So now I'm wondering if win32_port.h needs to >> #include <float.h> if (_MSC_VER < 1800). > > I don't have the C experience to decide the correct way. There are > currently many .c files that are including float.h conditionally or > unconditionally. The condition they use is "#ifdef _MSC_VER" without > a version. > > One idea is to include float.h from the new utils/float.h file > together with math.h, and remove those includes from the .c files > which would include utils/float.h. We can do this only, or together > with what you suggest, or by also keeping the includes on the .c > files. Which way do you think is the proper? > Do we have any solution to the float.h include issues on Windows? I don't have any Windows box at hand so I can't verify it, but just using "#ifdef _MSC_VER" seems OK to me (and it's used elsewhere). Thomas, why do you think the version number restriction is needed here? I don't see the version mentioned in the MS docs you linked either. Once this gets resolved, I'd like to get this committed ... so if you have other objections, please speak now. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jul 10, 2018 at 7:21 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > On 06/05/2018 06:32 PM, Emre Hasegeli wrote: >>> Those underscore-prefixed names are defined in Microsoft's >>> <float.h>[3][4]. So now I'm wondering if win32_port.h needs to >>> #include <float.h> if (_MSC_VER < 1800). >> >> I don't have the C experience to decide the correct way. There are >> currently many .c files that are including float.h conditionally or >> unconditionally. The condition they use is "#ifdef _MSC_VER" without >> a version. >> >> One idea is to include float.h from the new utils/float.h file >> together with math.h, and remove those includes from the .c files >> which would include utils/float.h. We can do this only, or together >> with what you suggest, or by also keeping the includes on the .c >> files. Which way do you think is the proper? >> > > Do we have any solution to the float.h include issues on Windows? I > don't have any Windows box at hand so I can't verify it, but just using > "#ifdef _MSC_VER" seems OK to me (and it's used elsewhere). Thomas, why > do you think the version number restriction is needed here? I don't see > the version mentioned in the MS docs you linked either. The version number restriction isn't strictly needed. I only suggested it because it'd match the #if that wraps the code that's actually using those macros, introduced by commit cec8394b5ccd. That was presumably done because versions >= 1800 (= Visual Studio 2013) have their own definitions of isinf() and isnan(), and I guess that our definitions were probably breaking stuff on that compiler. > Once this gets resolved, I'd like to get this committed ... so if you > have other objections, please speak now. +1, no objections. -- Thomas Munro http://www.enterprisedb.com
> The version number restriction isn't strictly needed. I only > suggested it because it'd match the #if that wraps the code that's > actually using those macros, introduced by commit cec8394b5ccd. That > was presumably done because versions >= 1800 (= Visual Studio 2013) > have their own definitions of isinf() and isnan(), and I guess that > our definitions were probably breaking stuff on that compiler. Now I understand what you mean. win32_port.h defines isnan(x) as _isnan(x) if (_MSC_VER < 1800). It doesn't look right to have the definition in here but not include <float.h> as _isnan() is coming from there. I am preparing an additional patch to add the include and remove it from files where it is obviously put to work around this problem.
> Now I understand what you mean. win32_port.h defines isnan(x) as > _isnan(x) if (_MSC_VER < 1800). It doesn't look right to have the > definition in here but not include <float.h> as _isnan() is coming > from there. I am preparing an additional patch to add the include and > remove it from files where it is obviously put to work around this > problem. I posted this to another thread. Until this is sorted out I made the new float header patch include <float.h>, so they are not dependent. New versions are attached.
Attachment
New versions are attached after the <float.h> patch got in. I noticed tests failing on Windows [1] and added alternative .out file. [1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235
Attachment
On 07/11/2018 07:13 PM, Emre Hasegeli wrote: > New versions are attached after the <float.h> patch got in. I noticed > tests failing on Windows [1] and added alternative .out file. > > [1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235 > The version posted about two weeks ago is slightly broken - AFAICS the float.h includes in geo_ops.c and gistproc.c need to be part of 0002, not 0003. Attached is a version fixing that. Barring objections, I'll get this committed over the next few days, once I review all the individual parts once more. I'm planning to commit the 6 parts separately, as submitted. No backpatching, as discussed before. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
- 0001-Refactor-geometric-functions-and-operators-code.patch
- 0002-Provide-header-file-for-built-in-float-datatypes.patch
- 0006-Improve-test-coverage-of-geometric-types.patch
- 0005-Check-for-float-0-after-multiplications-and-division.patch
- 0004-Fix-obvious-problems-around-the-line-datatype.patch
- 0003-Use-the-built-in-float-datatype-to-implement-geometr.patch
Thank you for taking this. At Thu, 26 Jul 2018 17:12:50 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <672f4c42-6742-c1ec-e9a4-1994b815acc7@2ndquadrant.com> > On 07/11/2018 07:13 PM, Emre Hasegeli wrote: > > New versions are attached after the <float.h> patch got in. I noticed > > tests failing on Windows [1] and added alternative .out file. > > [1] > > https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235 > > > > The version posted about two weeks ago is slightly broken - AFAICS the > float.h includes in geo_ops.c and gistproc.c need to be part of 0002, > not 0003. Attached is a version fixing that. > > Barring objections, I'll get this committed over the next few days, > once I review all the individual parts once more. I'm planning to > commit the 6 parts separately, as submitted. No backpatching, as > discussed before. +1 for no backpatching, and not merging into single big patch. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 07/27/2018 10:20 AM, Kyotaro HORIGUCHI wrote: > Thank you for taking this. > > At Thu, 26 Jul 2018 17:12:50 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <672f4c42-6742-c1ec-e9a4-1994b815acc7@2ndquadrant.com> >> On 07/11/2018 07:13 PM, Emre Hasegeli wrote: >>> New versions are attached after the <float.h> patch got in. I noticed >>> tests failing on Windows [1] and added alternative .out file. >>> [1] >>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235 >>> >> >> The version posted about two weeks ago is slightly broken - AFAICS the >> float.h includes in geo_ops.c and gistproc.c need to be part of 0002, >> not 0003. Attached is a version fixing that. >> >> Barring objections, I'll get this committed over the next few days, >> once I review all the individual parts once more. I'm planning to >> commit the 6 parts separately, as submitted. No backpatching, as >> discussed before. > > +1 for no backpatching, and not merging into single big patch. > I've committed the first two parts, after a review and testing. As these two parts were primarily refactoring (and quite extensive), this seems like a good place to wait if the buildfarm is happy with it. If yes, I'll continue applying the patches that do fix/change the behavior in various ways. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/29/2018 03:54 AM, Tomas Vondra wrote: > > > On 07/27/2018 10:20 AM, Kyotaro HORIGUCHI wrote: >> Thank you for taking this. >> >> At Thu, 26 Jul 2018 17:12:50 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <672f4c42-6742-c1ec-e9a4-1994b815acc7@2ndquadrant.com> >>> On 07/11/2018 07:13 PM, Emre Hasegeli wrote: >>>> New versions are attached after the <float.h> patch got in. I noticed >>>> tests failing on Windows [1] and added alternative .out file. >>>> [1] >>>> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.5235 >>>> >>> >>> The version posted about two weeks ago is slightly broken - AFAICS the >>> float.h includes in geo_ops.c and gistproc.c need to be part of 0002, >>> not 0003. Attached is a version fixing that. >>> >>> Barring objections, I'll get this committed over the next few days, >>> once I review all the individual parts once more. I'm planning to >>> commit the 6 parts separately, as submitted. No backpatching, as >>> discussed before. >> >> +1 for no backpatching, and not merging into single big patch. >> > > I've committed the first two parts, after a review and testing. > > As these two parts were primarily refactoring (and quite extensive), > this seems like a good place to wait if the buildfarm is happy with it. > If yes, I'll continue applying the patches that do fix/change the > behavior in various ways. > Seems there's a couple of failures like this, so far: *** 42,48 **** s --------------------------------------------- {1,-1,1} ! {1,-1,0} {-0.4,-1,-6} {-0.000184615384615385,-1,15.3846153846154} {1,-1,11} --- 42,48 ---- s --------------------------------------------- {1,-1,1} ! {1,-1,-0} {-0.4,-1,-6} {-0.000184615384615385,-1,15.3846153846154} {1,-1,11} *************** It's always 0/-0 difference, and it's limited to power machines. I'll try to get access to such system and see what's wrong. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Jul 29, 2018 at 10:35 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > It's always 0/-0 difference, and it's limited to power machines. I'll > try to get access to such system and see what's wrong. This is suspicious: /* on some platforms, the preceding expression tends to produce -0 */ if (line->C == 0.0) line->C = 0.0; FWIW I tried this on our Linux/POWER8 box but it didn't show the problem. -- Thomas Munro http://www.enterprisedb.com
On Sun, Jul 29, 2018 at 10:57 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Sun, Jul 29, 2018 at 10:35 PM, Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: >> It's always 0/-0 difference, and it's limited to power machines. I'll >> try to get access to such system and see what's wrong. > > This is suspicious: > > /* on some platforms, the preceding expression tends to produce -0 */ > if (line->C == 0.0) > line->C = 0.0; I mean, it's suspiciously absent from the new line_construct() function. It was introduced here: commit 43fe90f66a0b200f6c32507428349afb45f661ca Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Fri Oct 25 15:55:15 2013 -0400 Suppress -0 in the C field of lines computed by line_construct_pts(). It's not entirely clear why some PPC machines are generating -0 here, since the underlying computation should be exactly 0 - 0. Perhaps there's some wider-than-nominal-precision calculations happening? Anyway, the best way to avoid platform-dependent results seems to be to explicitly reset -0 to regular zero. -- Thomas Munro http://www.enterprisedb.com
On 07/29/2018 01:28 PM, Thomas Munro wrote: > On Sun, Jul 29, 2018 at 10:57 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> On Sun, Jul 29, 2018 at 10:35 PM, Tomas Vondra >> <tomas.vondra@2ndquadrant.com> wrote: >>> It's always 0/-0 difference, and it's limited to power machines. I'll >>> try to get access to such system and see what's wrong. >> >> This is suspicious: >> >> /* on some platforms, the preceding expression tends to produce -0 */ >> if (line->C == 0.0) >> line->C = 0.0; > > I mean, it's suspiciously absent from the new line_construct() > function. It was introduced here: > > commit 43fe90f66a0b200f6c32507428349afb45f661ca > Author: Tom Lane <tgl@sss.pgh.pa.us> > Date: Fri Oct 25 15:55:15 2013 -0400 > > Suppress -0 in the C field of lines computed by line_construct_pts(). > > It's not entirely clear why some PPC machines are generating -0 here, since > the underlying computation should be exactly 0 - 0. Perhaps there's some > wider-than-nominal-precision calculations happening? Anyway, the best way > to avoid platform-dependent results seems to be to explicitly reset -0 to > regular zero. > Hmm, I see. I think adding it to the else branch should do the trick, then, I guess. But I'd be much happier if I could test it somewhere before the commit. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sat, Jul 28, 2018 at 9:54 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
I've committed the first two parts, after a review and testing.
I'm getting a compiler warning now:
geo_ops.c: In function 'line_closept_point':
geo_ops.c:2528:7: warning: variable 'retval' set but not used [-Wunused-but-set-variable]
bool retval;
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)
Cheers,
Jeff
As these two parts were primarily refactoring (and quite extensive),
this seems like a good place to wait if the buildfarm is happy with it.
If yes, I'll continue applying the patches that do fix/change the
behavior in various ways.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/29/2018 04:31 PM, Jeff Janes wrote: > > > On Sat, Jul 28, 2018 at 9:54 PM, Tomas Vondra > <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote: > > > > I've committed the first two parts, after a review and testing. > > > I'm getting a compiler warning now: > > geo_ops.c: In function 'line_closept_point': > geo_ops.c:2528:7: warning: variable 'retval' set but not used > [-Wunused-but-set-variable] > bool retval; > Yeah, the variable is apparently only used in an assert. Will fix. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/29/2018 02:03 PM, Tomas Vondra wrote: > > > On 07/29/2018 01:28 PM, Thomas Munro wrote: >> On Sun, Jul 29, 2018 at 10:57 PM, Thomas Munro >> <thomas.munro@enterprisedb.com> wrote: >>> On Sun, Jul 29, 2018 at 10:35 PM, Tomas Vondra >>> <tomas.vondra@2ndquadrant.com> wrote: >>>> It's always 0/-0 difference, and it's limited to power machines. I'll >>>> try to get access to such system and see what's wrong. >>> >>> This is suspicious: >>> >>> /* on some platforms, the preceding expression tends to produce -0 */ >>> if (line->C == 0.0) >>> line->C = 0.0; >> >> I mean, it's suspiciously absent from the new line_construct() >> function. It was introduced here: >> >> commit 43fe90f66a0b200f6c32507428349afb45f661ca >> Author: Tom Lane <tgl@sss.pgh.pa.us> >> Date: Fri Oct 25 15:55:15 2013 -0400 >> >> Suppress -0 in the C field of lines computed by line_construct_pts(). >> >> It's not entirely clear why some PPC machines are generating -0 here, since >> the underlying computation should be exactly 0 - 0. Perhaps there's some >> wider-than-nominal-precision calculations happening? Anyway, the best way >> to avoid platform-dependent results seems to be to explicitly reset -0 to >> regular zero. >> > > Hmm, I see. I think adding it to the else branch should do the trick, > then, I guess. But I'd be much happier if I could test it somewhere > before the commit. > FWIW I think this should fix it. Can someone with access to an affected machine confirm? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On 07/29/2018 05:14 PM, Tomas Vondra wrote: > On 07/29/2018 02:03 PM, Tomas Vondra wrote: >> >> ... >> >> Hmm, I see. I think adding it to the else branch should do the trick, >> then, I guess. But I'd be much happier if I could test it somewhere >> before the commit. >> > > FWIW I think this should fix it. Can someone with access to an affected > machine confirm? > Gah, shouldn't have posted before trying to compile it. Here is a fixed version of the fix. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 07/29/2018 05:14 PM, Tomas Vondra wrote: >> FWIW I think this should fix it. Can someone with access to an affected >> machine confirm? > Gah, shouldn't have posted before trying to compile it. Here is a fixed > version of the fix. Sure, I'll try this on prairiedog. It's slow though ... regards, tom lane
I wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On 07/29/2018 05:14 PM, Tomas Vondra wrote: >>> FWIW I think this should fix it. Can someone with access to an affected >>> machine confirm? >> Gah, shouldn't have posted before trying to compile it. Here is a fixed >> version of the fix. > Sure, I'll try this on prairiedog. It's slow though ... Yup, this fixes the core regression tests on that machine. I was too lazy to try contrib. regards, tom lane
On 07/29/2018 08:02 PM, Tom Lane wrote: > I wrote: >> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >>> On 07/29/2018 05:14 PM, Tomas Vondra wrote: >>>> FWIW I think this should fix it. Can someone with access to an affected >>>> machine confirm? > >>> Gah, shouldn't have posted before trying to compile it. Here is a fixed >>> version of the fix. > >> Sure, I'll try this on prairiedog. It's slow though ... > > Yup, this fixes the core regression tests on that machine. > I was too lazy to try contrib. > OK, thanks for confirming. I'll get it committed and we'll see what the animals think soon. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 07/29/2018 05:11 PM, Tomas Vondra wrote: > > > On 07/29/2018 04:31 PM, Jeff Janes wrote: >> >> >> On Sat, Jul 28, 2018 at 9:54 PM, Tomas Vondra >> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote: >> >> >> >> I've committed the first two parts, after a review and testing. >> >> >> I'm getting a compiler warning now: >> >> geo_ops.c: In function 'line_closept_point': >> geo_ops.c:2528:7: warning: variable 'retval' set but not used >> [-Wunused-but-set-variable] >> bool retval; >> > > Yeah, the variable is apparently only used in an assert. Will fix. > This should fix it I guess, and it's how we deal with unused return values elsewhere. I've considered using USE_ASSERT_CHECKING here, but it seems rather ugly with that. I'll wait for Emre's opinion ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > This should fix it I guess, and it's how we deal with unused return > values elsewhere. I've considered using USE_ASSERT_CHECKING here, but it > seems rather ugly with that. I'll wait for Emre's opinion ... I think what you want is to mark the variable with PG_USED_FOR_ASSERTS_ONLY. regards, tom lane
On 07/29/2018 10:57 PM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> This should fix it I guess, and it's how we deal with unused return >> values elsewhere. I've considered using USE_ASSERT_CHECKING here, but it >> seems rather ugly with that. I'll wait for Emre's opinion ... > > I think what you want is to mark the variable with > PG_USED_FOR_ASSERTS_ONLY. > Oh, good idea. I don't think I've ever used that macro and I've completely forgotten about it. Committed that way. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> This should fix it I guess, and it's how we deal with unused return > values elsewhere. I've considered using USE_ASSERT_CHECKING here, but it > seems rather ugly with that. I'll wait for Emre's opinion ... Assert() is the wrong thing to do in here. Drawn-perpendicular lines may not intersect because of precision loss. We have to check it and return NULL. There a few of those that we crash, or return garbage, or get NULL and fail in DirectFunctionCall()s. The next patch "line-fixes" fixes them.
> OK, thanks for confirming. I'll get it committed and we'll see what the > animals think soon. Thank you for fixing this. I wanted to preserve this code but wasn't sure about the correct place or whether it is still necessary. There are more places we produce -0. The regression tests have alternative results to cover them. I have the "float-zero" patch for this. Although I am not sure if it is a correct fix. I think we should find the correct fix, and apply it globally to floating point operations. This can be only enabled for platforms which produce -0, so the others don't have to pay the price.
On 07/30/2018 11:57 AM, Emre Hasegeli wrote: >> OK, thanks for confirming. I'll get it committed and we'll see what the >> animals think soon. > > Thank you for fixing this. I wanted to preserve this code but wasn't > sure about the correct place or whether it is still necessary. > > There are more places we produce -0. The regression tests have > alternative results to cover them. I have the "float-zero" patch for > this. Although I am not sure if it is a correct fix. I think we > should find the correct fix, and apply it globally to floating point > operations. This can be only enabled for platforms which produce -0, > so the others don't have to pay the price. > Hmmm. It'll be difficult to review such patch without access to a platform exhibiting such behavior ... IIRC IBM offers free access to open-source devs, I wonder if that would be a way. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Emre, Now that the buildfarm is no longer complaining about 0001 and 0002, I'm working on reviewing and committing 0003. It seems quite straightforward but I do have couple of comment/questions: 1) common_entry_cmp is still handling 'delta' as double, although the CommonEntry was modified to use float8. IMHO it should also simply call float8_cmp_internal instead of doing comparisons. 2) gist_box_picksplit does this int m = ceil(LIMIT_RATIO * (float8) nentries); instead of int m = ceil(LIMIT_RATIO * (double) nentries); which seems rather unnecessary, considering the only point of the cast was probably to do more accurate multiplication. And it seems pointless to cast it to float8 but then not use float8_mul(). 3) computeDistance does this: if (point->y > box->high.y) result = float8_mi(point->y, box->high.y); else if (point->y < box->low.y) result = float8_mi(box->low.y, point->y); which seems suspicious. Shouldn't the comparisons be done by float8_lt and float8_gt too? That's what we do elsewhere. 4) I think we should just get rid of GEODEBUG entirely. The preceding patches removes about 20 out of 27 occurrences anyway, so let's ditch the remaining few. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> 1) common_entry_cmp is still handling 'delta' as double, although the > CommonEntry was modified to use float8. IMHO it should also simply call > float8_cmp_internal instead of doing comparisons. I am changing it to define delta as "float8" and to use float8_cmp_internal(). > 2) gist_box_picksplit does this > > int m = ceil(LIMIT_RATIO * (float8) nentries); > > instead of > > int m = ceil(LIMIT_RATIO * (double) nentries); > > which seems rather unnecessary, considering the only point of the cast was > probably to do more accurate multiplication. And it seems pointless to cast > it to float8 but then not use float8_mul(). I am removing the cast. > 3) computeDistance does this: > > if (point->y > box->high.y) > result = float8_mi(point->y, box->high.y); > else if (point->y < box->low.y) > result = float8_mi(box->low.y, point->y); > > which seems suspicious. Shouldn't the comparisons be done by float8_lt and > float8_gt too? That's what we do elsewhere. I assumed the GiST code already handles NaNs correctly and tried not to change its behavior. It may be a good idea to revert existing NaN handling in favour of using the inline functions every time. Should I do that? > 4) I think we should just get rid of GEODEBUG entirely. The preceding > patches removes about 20 out of 27 occurrences anyway, so let's ditch the > remaining few. I agree. Shall I append it to this patch?
On 07/31/2018 05:14 PM, Emre Hasegeli wrote: >> 1) common_entry_cmp is still handling 'delta' as double, although the >> CommonEntry was modified to use float8. IMHO it should also simply call >> float8_cmp_internal instead of doing comparisons. > > I am changing it to define delta as "float8" and to use float8_cmp_internal(). > >> 2) gist_box_picksplit does this >> >> int m = ceil(LIMIT_RATIO * (float8) nentries); >> >> instead of >> >> int m = ceil(LIMIT_RATIO * (double) nentries); >> >> which seems rather unnecessary, considering the only point of the cast was >> probably to do more accurate multiplication. And it seems pointless to cast >> it to float8 but then not use float8_mul(). > > I am removing the cast. > >> 3) computeDistance does this: >> >> if (point->y > box->high.y) >> result = float8_mi(point->y, box->high.y); >> else if (point->y < box->low.y) >> result = float8_mi(box->low.y, point->y); >> >> which seems suspicious. Shouldn't the comparisons be done by float8_lt and >> float8_gt too? That's what we do elsewhere. > > I assumed the GiST code already handles NaNs correctly and tried not > to change its behavior. It may be a good idea to revert existing NaN > handling in favour of using the inline functions every time. Should I > do that? Ah, so there's an assumption that NaNs are handled earlier and never reach this place? That's probably a safe assumption. I haven't thought about that, it simply seemed suspicious that the code mixes direct comparisons and float8_mi() calls. > >> 4) I think we should just get rid of GEODEBUG entirely. The preceding >> patches removes about 20 out of 27 occurrences anyway, so let's ditch the >> remaining few. > > I agree. Shall I append it to this patch? > Not sure, I'll leave that up to you. I don't mind doing it in a separate patch (I'd probably prefer that over mixing it into unrelated patch). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> Ah, so there's an assumption that NaNs are handled earlier and never reach > this place? That's probably a safe assumption. I haven't thought about that, > it simply seemed suspicious that the code mixes direct comparisons and > float8_mi() calls. The comparison functions handle NaNs. The arithmetic functions handle returning error on underflow, overflow and division by zero. I assumed we want to return error on those in any case, but we don't want to handle NaNs at every place. > Not sure, I'll leave that up to you. I don't mind doing it in a separate > patch (I'd probably prefer that over mixing it into unrelated patch). It is attached separately.
Attachment
> Hmmm. It'll be difficult to review such patch without access to a platform > exhibiting such behavior ... IIRC IBM offers free access to open-source > devs, I wonder if that would be a way. I don't have access to such platform either, and I don't know too much about this business. I left this patch out for now. It is something that should affect all float operations not only the geometric types anyway. Anybody who knows and cares about these platforms can pick this up. We can fix -0 issues in localized way, like you have done, if there would be any more.
On 08/01/2018 01:40 PM, Emre Hasegeli wrote: >> Ah, so there's an assumption that NaNs are handled earlier and never reach >> this place? That's probably a safe assumption. I haven't thought about that, >> it simply seemed suspicious that the code mixes direct comparisons and >> float8_mi() calls. > > The comparison functions handle NaNs. The arithmetic functions handle > returning error on underflow, overflow and division by zero. I > assumed we want to return error on those in any case, but we don't > want to handle NaNs at every place. > >> Not sure, I'll leave that up to you. I don't mind doing it in a separate >> patch (I'd probably prefer that over mixing it into unrelated patch). > > It is attached separately. > OK, thanks. So, have we reached conclusion about all the bits I mentioned on 7/31? The delta and float8/double cast are fixed, and for computeDistance (i.e. doing comparisons directly or using float8_lt), the code may seem a bit inconsistent, but it is in fact correct as the NaNs are handled elsewhere. That seems reasonable, but perhaps a comment pointing that out would be nice. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
At Thu, 2 Aug 2018 11:50:55 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <ce3cf95a-4751-c168-54ae-636c486e06cd@2ndquadrant.com> > > > On 08/01/2018 01:40 PM, Emre Hasegeli wrote: > >> Ah, so there's an assumption that NaNs are handled earlier and never > >> reach > >> this place? That's probably a safe assumption. I haven't thought about > >> that, > >> it simply seemed suspicious that the code mixes direct comparisons and > >> float8_mi() calls. > > The comparison functions handle NaNs. The arithmetic functions handle > > returning error on underflow, overflow and division by zero. I > > assumed we want to return error on those in any case, but we don't > > want to handle NaNs at every place. > > > >> Not sure, I'll leave that up to you. I don't mind doing it in a > >> separate > >> patch (I'd probably prefer that over mixing it into unrelated patch). > > It is attached separately. > > > > OK, thanks. > > So, have we reached conclusion about all the bits I mentioned on 7/31? > The delta and float8/double cast are fixed, and for computeDistance > (i.e. doing comparisons directly or using float8_lt), the code may > seem a bit inconsistent, but it is in fact correct as the NaNs are > handled elsewhere. That seems reasonable, but perhaps a comment > pointing that out would be nice. I'm not confident on replacing double to float8 partially in gist code. After the 0002 patch applied, I see most of problematic usage of double or bare arithmetic on dimentional values in gistproc.c. > static inline float > non_negative(float val) > { > if (val >= 0.0f) > return val; > else > return 0.0f; > } It is used as "non_negative(overlap)", where overlap is float4, which is calculated using float8_mi. Float4 makes sense only if we need to store a large number of it to somewhere but they are just working varialbles. Couldn't we eliminate float4 that doesn't have a requirement to do so? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
At Fri, 03 Aug 2018 13:38:40 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180803.133840.180843182.horiguchi.kyotaro@lab.ntt.co.jp> > At Thu, 2 Aug 2018 11:50:55 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <ce3cf95a-4751-c168-54ae-636c486e06cd@2ndquadrant.com> > > > > > > On 08/01/2018 01:40 PM, Emre Hasegeli wrote: > > >> Ah, so there's an assumption that NaNs are handled earlier and never > > >> reach > > >> this place? That's probably a safe assumption. I haven't thought about > > >> that, > > >> it simply seemed suspicious that the code mixes direct comparisons and > > >> float8_mi() calls. > > > The comparison functions handle NaNs. The arithmetic functions handle > > > returning error on underflow, overflow and division by zero. I > > > assumed we want to return error on those in any case, but we don't > > > want to handle NaNs at every place. > > > > > >> Not sure, I'll leave that up to you. I don't mind doing it in a > > >> separate > > >> patch (I'd probably prefer that over mixing it into unrelated patch). > > > It is attached separately. > > > > > > > OK, thanks. > > > > So, have we reached conclusion about all the bits I mentioned on 7/31? > > The delta and float8/double cast are fixed, and for computeDistance > > (i.e. doing comparisons directly or using float8_lt), the code may > > seem a bit inconsistent, but it is in fact correct as the NaNs are > > handled elsewhere. That seems reasonable, but perhaps a comment > > pointing that out would be nice. > > I'm not confident on replacing double to float8 partially in gist > code. After the 0002 patch applied, I see most of problematic > usage of double or bare arithmetic on dimentional values in > gistproc.c. > > > static inline float > > non_negative(float val) > > { > > if (val >= 0.0f) > > return val; > > else > > return 0.0f; > > } > > This takes float(4) and it is used as "non_negative(overlap)", > where overlap is float4, which is calculated using float8_mi. > Float4 makes sense if we store a large number of values somewhere > but they are just working varialbles. Couldn't we eliminate > float(4) whthout any specifc requirement? I'm fine with the 0002-geo-float-v14 except the above. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 08/03/2018 06:40 AM, Kyotaro HORIGUCHI wrote: > ... > > I'm not confident on replacing double to float8 partially in gist > code. After the 0002 patch applied, I see most of problematic > usage of double or bare arithmetic on dimentional values in > gistproc.c. > >> static inline float >> non_negative(float val) >> { >> if (val >= 0.0f) >> return val; >> else >> return 0.0f; >> } > > It is used as "non_negative(overlap)", where overlap is float4, > which is calculated using float8_mi. Float4 makes sense only if > we need to store a large number of it to somewhere but they are > just working varialbles. Couldn't we eliminate float4 that > doesn't have a requirement to do so? > I'm not sure I follow. The patch does not modify non_negative() at all, and we still call it like this: if (non_negative(overlap) < non_negative(context->overlap) || (range > context->range && non_negative(overlap) <= non_negative(context->overlap))) selectthis = true; where all the "overlap" values are still float4. The only thing that changed here is that instead of doing the arithmetic operations directly we call float8_mi/float8_div to benefit from the float8 handling. So I'm not sure how does the patch beaks this? And what do you mean by 'eliminate float4'? thank you -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 08/03/2018 02:39 PM, Tomas Vondra wrote: > On 08/03/2018 06:40 AM, Kyotaro HORIGUCHI wrote: >> ... >> >> I'm not confident on replacing double to float8 partially in gist >> code. After the 0002 patch applied, I see most of problematic >> usage of double or bare arithmetic on dimentional values in >> gistproc.c. >> >>> static inline float >>> non_negative(float val) >>> { >>> if (val >= 0.0f) >>> return val; >>> else >>> return 0.0f; >>> } >> >> It is used as "non_negative(overlap)", where overlap is float4, >> which is calculated using float8_mi. Float4 makes sense only if >> we need to store a large number of it to somewhere but they are >> just working varialbles. Couldn't we eliminate float4 that >> doesn't have a requirement to do so? >> > > I'm not sure I follow. The patch does not modify non_negative() at all, > and we still call it like this: > > if (non_negative(overlap) < non_negative(context->overlap) || > (range > context->range && > non_negative(overlap) <= non_negative(context->overlap))) > selectthis = true; > > where all the "overlap" values are still float4. The only thing that > changed here is that instead of doing the arithmetic operations directly > we call float8_mi/float8_div to benefit from the float8 handling. > > So I'm not sure how does the patch beaks this? And what do you mean by > 'eliminate float4'? > Kyotaro-san, can you explain what your concerns regarding this bit are? I'd like to get 0002 committed, but I'm not sure I understand your point about the changes in gist code, so I can't address them. And I certainly don't want to just ignore them ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
At Wed, 8 Aug 2018 14:39:54 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <6ecb4f61-1fb1-08a1-31d6-e58e9c352374@2ndquadrant.com> > > > On 08/03/2018 02:39 PM, Tomas Vondra wrote: > > On 08/03/2018 06:40 AM, Kyotaro HORIGUCHI wrote: > >> ... > >> > >> I'm not confident on replacing double to float8 partially in gist > >> code. After the 0002 patch applied, I see most of problematic > >> usage of double or bare arithmetic on dimentional values in > >> gistproc.c. > >> > >>> static inline float > >>> non_negative(float val) > >>> { > >>> if (val >= 0.0f) > >>> return val; > >>> else > >>> return 0.0f; > >>> } > >> > >> It is used as "non_negative(overlap)", where overlap is float4, > >> which is calculated using float8_mi. Float4 makes sense only if > >> we need to store a large number of it to somewhere but they are > >> just working varialbles. Couldn't we eliminate float4 that > >> doesn't have a requirement to do so? > >> > > I'm not sure I follow. The patch does not modify non_negative() at > > all, and we still call it like this: > > if (non_negative(overlap) < non_negative(context->overlap) || > > (range > context->range && > > non_negative(overlap) <= non_negative(context->overlap))) > > selectthis = true; > > where all the "overlap" values are still float4. The only thing that > > changed here is that instead of doing the arithmetic operations > > directly we call float8_mi/float8_div to benefit from the float8 > > handling. > > So I'm not sure how does the patch beaks this? And what do you mean by > > 'eliminate float4'? > > > > Kyotaro-san, can you explain what your concerns regarding this bit > are? I'd like to get 0002 committed, but I'm not sure I understand > your point about the changes in gist code, so I can't address > them. And I certainly don't want to just ignore them ... It doesn't break nothing so nothing must be done with this. Just I was a bit uneasy to see meaninglessly used foat4. Sorry for the unnecessary argument. After all I don't object to commit it in this shape. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On 08/09/2018 11:42 AM, Kyotaro HORIGUCHI wrote: > At Wed, 8 Aug 2018 14:39:54 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <6ecb4f61-1fb1-08a1-31d6-e58e9c352374@2ndquadrant.com> >> >> >> On 08/03/2018 02:39 PM, Tomas Vondra wrote: >>> On 08/03/2018 06:40 AM, Kyotaro HORIGUCHI wrote: >>>> ... >>>> >>>> I'm not confident on replacing double to float8 partially in gist >>>> code. After the 0002 patch applied, I see most of problematic >>>> usage of double or bare arithmetic on dimentional values in >>>> gistproc.c. >>>> >>>>> static inline float >>>>> non_negative(float val) >>>>> { >>>>> if (val >= 0.0f) >>>>> return val; >>>>> else >>>>> return 0.0f; >>>>> } >>>> >>>> It is used as "non_negative(overlap)", where overlap is float4, >>>> which is calculated using float8_mi. Float4 makes sense only if >>>> we need to store a large number of it to somewhere but they are >>>> just working varialbles. Couldn't we eliminate float4 that >>>> doesn't have a requirement to do so? >>>> >>> I'm not sure I follow. The patch does not modify non_negative() at >>> all, and we still call it like this: >>> if (non_negative(overlap) < non_negative(context->overlap) || >>> (range > context->range && >>> non_negative(overlap) <= non_negative(context->overlap))) >>> selectthis = true; >>> where all the "overlap" values are still float4. The only thing that >>> changed here is that instead of doing the arithmetic operations >>> directly we call float8_mi/float8_div to benefit from the float8 >>> handling. >>> So I'm not sure how does the patch beaks this? And what do you mean by >>> 'eliminate float4'? >>> >> >> Kyotaro-san, can you explain what your concerns regarding this bit >> are? I'd like to get 0002 committed, but I'm not sure I understand >> your point about the changes in gist code, so I can't address >> them. And I certainly don't want to just ignore them ... > > It doesn't break nothing so nothing must be done with this. Just > I was a bit uneasy to see meaninglessly used foat4. Sorry for > the unnecessary argument. > > After all I don't object to commit it in this shape. > Understood. Thanks for the explanation. I've pushed parts 0001 and 0002, as submitted on August 1. Let's see if that upsets some of the buildfarm animals. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, the buildfarm seems to be mostly happy so far, so I've taken a quick look at the remaining two parts. The patches still apply, but I'm getting plenty of failures in regression tests, due to 0.0 being replaced by -0.0. This reminds me 74294c7301, except that these patches don't seem to remove any such checks by mistake. Instead it seems to be caused by simply switching to float8_ methods. The attached patch fixes the issue for me, although I'm not claiming it's the right way to fix it. Another thing I noticed is the last few lines from line_interpt_line are actually unreachable, because there's now 'else return false' branch. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
> the buildfarm seems to be mostly happy so far, so I've taken a quick > look at the remaining two parts. The patches still apply, but I'm > getting plenty of failures in regression tests, due to 0.0 being > replaced by -0.0. I think we are better off fixing them locally at the moment like your patch does. We should consider to eliminate -0 globally for all floating point based datatypes later. I simplified and incorporated your change to line_interpt_line() into mine. I am not sure about normalising -0s on point_construct(). We currently allow points to be initialized with -0s. I think it is fair for us to return -0 when -x and 0 are multiplied. That is the current behavior and the behavior of the float datatypes. I adjusted the results of the new regression tests accordingly. > Another thing I noticed is the last few lines from line_interpt_line are > actually unreachable, because there's now 'else return false' branch. Which lines do you mean exactly? I don't see any being unreachable.
Attachment
On 08/17/2018 06:40 PM, Emre Hasegeli wrote: >> the buildfarm seems to be mostly happy so far, so I've taken a quick >> look at the remaining two parts. The patches still apply, but I'm >> getting plenty of failures in regression tests, due to 0.0 being >> replaced by -0.0. > > I think we are better off fixing them locally at the moment like your > patch does. We should consider to eliminate -0 globally for all > floating point based datatypes later. I simplified and incorporated > your change to line_interpt_line() into mine. > > I am not sure about normalising -0s on point_construct(). We > currently allow points to be initialized with -0s. I think it is fair > for us to return -0 when -x and 0 are multiplied. That is the current > behavior and the behavior of the float datatypes. I adjusted the > results of the new regression tests accordingly. > Hmmm, I need to think about that a bit more. BTW how did we end up with the regression differences? Presumably you've tried that on your machine and it passed. So if we adjust the expected file, won't it fail on some other machines? >> Another thing I noticed is the last few lines from line_interpt_line are >> actually unreachable, because there's now 'else return false' branch. > > Which lines do you mean exactly? I don't see any being unreachable. > Apologies, I got confused - there are no unreachable lines. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> BTW how did we end up with the regression differences? Presumably you've > tried that on your machine and it passed. So if we adjust the expected > file, won't it fail on some other machines? I had another patch to check for -0 inside float{4,8}_{div,mul}(). I dropped it on the last set of patches, so the tests were broken. I get -0 as a result of -x * 0 both on Mac and Linux.
On 08/17/2018 08:24 PM, Emre Hasegeli wrote: >> BTW how did we end up with the regression differences? Presumably you've >> tried that on your machine and it passed. So if we adjust the expected >> file, won't it fail on some other machines? > > I had another patch to check for -0 inside float{4,8}_{div,mul}(). I > dropped it on the last set of patches, so the tests were broken. I > get -0 as a result of -x * 0 both on Mac and Linux. > OK, that explains is. I won't have time to get this committed before CF 2018-09, but I'll pick it up in September. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Emre Hasegeli <emre@hasegeli.com> writes: >> BTW how did we end up with the regression differences? Presumably you've >> tried that on your machine and it passed. So if we adjust the expected >> file, won't it fail on some other machines? > I had another patch to check for -0 inside float{4,8}_{div,mul}(). I > dropped it on the last set of patches, so the tests were broken. I > get -0 as a result of -x * 0 both on Mac and Linux. I'll bet a good deal of money that you'll find that does not hold true across the whole buildfarm. regards, tom lane
On 08/17/2018 08:56 PM, Tom Lane wrote: > Emre Hasegeli <emre@hasegeli.com> writes: >>> BTW how did we end up with the regression differences? Presumably you've >>> tried that on your machine and it passed. So if we adjust the expected >>> file, won't it fail on some other machines? > >> I had another patch to check for -0 inside float{4,8}_{div,mul}(). I >> dropped it on the last set of patches, so the tests were broken. I >> get -0 as a result of -x * 0 both on Mac and Linux. > > I'll bet a good deal of money that you'll find that does not hold > true across the whole buildfarm. > Hmm, yeah. Based on past experience, the powerpc machines are likely to stumble on this. FWIW my understanding is that these failures actually happen in new tests, it's not an issue introduced by this patch series. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > Hmm, yeah. Based on past experience, the powerpc machines are likely to > stumble on this. > FWIW my understanding is that these failures actually happen in new > tests, it's not an issue introduced by this patch series. Yeah, we've definitely hit such problems before. The geometric logic seems particularly prone to it because it's doing more and subtler float arithmetic than the rest of the system ... but it's not the sole source of minus-zero issues. regards, tom lane
On 08/17/2018 11:23 PM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> Hmm, yeah. Based on past experience, the powerpc machines are likely to >> stumble on this. > >> FWIW my understanding is that these failures actually happen in new >> tests, it's not an issue introduced by this patch series. > > Yeah, we've definitely hit such problems before. The geometric logic > seems particularly prone to it because it's doing more and subtler > float arithmetic than the rest of the system ... but it's not the sole > source of minus-zero issues. > It's not entirely clear to me what's the best way forward with this. We can go through the patch and fix places with obvious -0 risks, but then what? I don't have access to any powepc machines, so I can't check if there are any other failures. So the only thing I could do is commit and fix based on buildfarm failures ... Emre, any better ideas? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 08/17/2018 11:23 PM, Tom Lane wrote: >> Yeah, we've definitely hit such problems before. The geometric logic >> seems particularly prone to it because it's doing more and subtler >> float arithmetic than the rest of the system ... but it's not the sole >> source of minus-zero issues. > We can go through the patch and fix places with obvious -0 risks, but > then what? I don't have access to any powepc machines, so I can't check > if there are any other failures. So the only thing I could do is commit > and fix based on buildfarm failures ... That's the usual solution. I don't see anything particularly wrong with it. If the buildfarm results suggest a whole mess of problems, it might be appropriate to revert and rethink; but you shouldn't be afraid to use the buildfarm as a testbed. regards, tom lane
On 09/03/2018 04:08 AM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On 08/17/2018 11:23 PM, Tom Lane wrote: >>> Yeah, we've definitely hit such problems before. The geometric logic >>> seems particularly prone to it because it's doing more and subtler >>> float arithmetic than the rest of the system ... but it's not the sole >>> source of minus-zero issues. > >> We can go through the patch and fix places with obvious -0 risks, but >> then what? I don't have access to any powepc machines, so I can't check >> if there are any other failures. So the only thing I could do is commit >> and fix based on buildfarm failures ... > > That's the usual solution. I don't see anything particularly wrong > with it. If the buildfarm results suggest a whole mess of problems, > it might be appropriate to revert and rethink; but you shouldn't be > afraid to use the buildfarm as a testbed. > FWIW I plan to get this committed sometime this week, when I'm available to fix the expected buildfarm breakage. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 09/23/2018 11:55 PM, Tomas Vondra wrote: > On 09/03/2018 04:08 AM, Tom Lane wrote: >> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >>> On 08/17/2018 11:23 PM, Tom Lane wrote: >>>> Yeah, we've definitely hit such problems before. The geometric logic >>>> seems particularly prone to it because it's doing more and subtler >>>> float arithmetic than the rest of the system ... but it's not the sole >>>> source of minus-zero issues. >> >>> We can go through the patch and fix places with obvious -0 risks, but >>> then what? I don't have access to any powepc machines, so I can't check >>> if there are any other failures. So the only thing I could do is commit >>> and fix based on buildfarm failures ... >> >> That's the usual solution. I don't see anything particularly wrong >> with it. If the buildfarm results suggest a whole mess of problems, >> it might be appropriate to revert and rethink; but you shouldn't be >> afraid to use the buildfarm as a testbed. >> > > FWIW I plan to get this committed sometime this week, when I'm available > to fix the expected buildfarm breakage. > Pushed. Now let's wait for the buildfarm to complain ... regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > Pushed. Now let's wait for the buildfarm to complain ... gaur's not happy, but rather surprisingly, it looks like we're mostly OK elsewhere. Do you need me to trace down exactly what's going wrong on gaur? regards, tom lane
On 09/26/2018 06:45 PM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> Pushed. Now let's wait for the buildfarm to complain ... > > gaur's not happy, but rather surprisingly, it looks like we're > mostly OK elsewhere. Do you need me to trace down exactly what's > going wrong on gaur? > Hmmm, interesting. It seems both failures happen in the chunk that multiplies paths with points, i.e. essentially point_mul_point. So it seems most platforms end up with (0,0) * (-3,4) = (-0, 0) while gaur apparently thinks it's (0,0). And indeed, that's what the attached trivial program does - I'd bet if you run it on gaur, it'll print 0.000000, not -0.000000. Or you could just try doing select '(0,0)'::point * '(-3,4)'::point; If this is what's going on, I'd say the best solution is to make it produce (0,0) everywhere, so that we don't expect -0.0 anywhere. We could do that either by adding the == 0.0 check to yet another place, or to point_construct() directly. Adding it to point_construct() means we'll pay the price always, but I guess there are few paths where we know we don't need it. And if we add it to many places it's likely about as expensive as adding it to point_construct. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > Hmmm, interesting. It seems both failures happen in the chunk that > multiplies paths with points, i.e. essentially point_mul_point. So it > seems most platforms end up with > (0,0) * (-3,4) = (-0, 0) > while gaur apparently thinks it's (0,0). And indeed, that's what the > attached trivial program does - I'd bet if you run it on gaur, it'll > print 0.000000, not -0.000000. Nope, no cigar: $ gcc -Wall -O2 test.c $ ./a.out -0.000000 (I tried a couple other -O levels to see if that affected anything, but it didn't.) I'll try to isolate the problem more closely, but it will take awhile. That machine is slow :-( regards, tom lane
On 09/26/2018 10:58 PM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> Hmmm, interesting. It seems both failures happen in the chunk that >> multiplies paths with points, i.e. essentially point_mul_point. So it >> seems most platforms end up with > >> (0,0) * (-3,4) = (-0, 0) > >> while gaur apparently thinks it's (0,0). And indeed, that's what the >> attached trivial program does - I'd bet if you run it on gaur, it'll >> print 0.000000, not -0.000000. > > Nope, no cigar: > > $ gcc -Wall -O2 test.c > $ ./a.out > -0.000000 > > (I tried a couple other -O levels to see if that affected anything, > but it didn't.) > Interesting ... > I'll try to isolate the problem more closely, but it will take awhile. > That machine is slow :-( > OK, thanks. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 09/26/2018 06:45 PM, Tom Lane wrote: >> gaur's not happy, but rather surprisingly, it looks like we're >> mostly OK elsewhere. Do you need me to trace down exactly what's >> going wrong on gaur? > Or you could just try doing > select '(0,0)'::point * '(-3,4)'::point; > If this is what's going on, I'd say the best solution is to make it > produce (0,0) everywhere, so that we don't expect -0.0 anywhere. Actually, it seems simpler than that: gaur produces plus zero already from the multiplication: regression=# select '-3'::float8 * '0'::float8; ?column? ---------- 0 (1 row) whereas I get -0 elsewhere. I'm surprised that this doesn't create more widely-visible regression failures, but there you have it. > We could do that either by adding the == 0.0 check to yet another place, > or to point_construct() directly. Adding it to point_construct() means > we'll pay the price always, but I guess there are few paths where we > know we don't need it. And if we add it to many places it's likely about > as expensive as adding it to point_construct. If gaur is the only machine showing this failure, which seems more likely by the hour, I'm not sure that we should give up performance across-the-board to make it happy. Perhaps a variant expected-file is a better answer; or we could remove these specific test cases. Anyway, I'd counsel doing nothing for a day or so, till the buildfarm breakage from the strerror/snprintf changes clears up. Then we'll have a better idea of whether any other machines are affected. regards, tom lane
On 09/27/2018 12:48 AM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On 09/26/2018 06:45 PM, Tom Lane wrote: >>> gaur's not happy, but rather surprisingly, it looks like we're >>> mostly OK elsewhere. Do you need me to trace down exactly what's >>> going wrong on gaur? > >> Or you could just try doing >> select '(0,0)'::point * '(-3,4)'::point; >> If this is what's going on, I'd say the best solution is to make it >> produce (0,0) everywhere, so that we don't expect -0.0 anywhere. > > Actually, it seems simpler than that: gaur produces plus zero already > from the multiplication: > > regression=# select '-3'::float8 * '0'::float8; > ?column? > ---------- > 0 > (1 row) > > whereas I get -0 elsewhere. I'm surprised that this doesn't create > more widely-visible regression failures, but there you have it. > Hmmm, interesting. But I still don't quite understand why the test program still produced -0.000000 and not 0.000000. That seems like a direct contradiction to what we see in regression tests, doesn't it? >> We could do that either by adding the == 0.0 check to yet another place, >> or to point_construct() directly. Adding it to point_construct() means >> we'll pay the price always, but I guess there are few paths where we >> know we don't need it. And if we add it to many places it's likely about >> as expensive as adding it to point_construct. > > If gaur is the only machine showing this failure, which seems more > likely by the hour, I'm not sure that we should give up performance > across-the-board to make it happy. Perhaps a variant expected-file > is a better answer; or we could remove these specific test cases. > > Anyway, I'd counsel doing nothing for a day or so, till the buildfarm > breakage from the strerror/snprintf changes clears up. Then we'll > have a better idea of whether any other machines are affected. > Yep, gaur seems to be the only animal affected by this, so no need to rush anyway. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 09/27/2018 12:48 AM, Tom Lane wrote: >> Actually, it seems simpler than that: gaur produces plus zero already >> from the multiplication: >> regression=# select '-3'::float8 * '0'::float8; >> ?column? >> ---------- >> 0 >> (1 row) > Hmmm, interesting. But I still don't quite understand why the test > program still produced -0.000000 and not 0.000000. That seems like a > direct contradiction to what we see in regression tests, doesn't it? OK, so after poking at it for another hour and getting more and more confused, I realized that gdb was lying to me by printing genuine minus zero values as just "0". Throw out everything I thought I knew and start over ... ... and awhile later, this is the answer: on this machine, printf with "%f" will show the sign of minus zero. But printf with "%g" will not. Guess which format float8out uses. (I'll bet that gdb does too, so that its lie wasn't its fault.) AFAICT at the moment, gaur is doing the underlying IEEE float math the same as everybody else, which is not very surprising because HP bought into IEEE math pretty early. Hex-dumping shows conclusively that point_mul_point *does* emit (-0,0) in the case in question. But we've got a platform-specific issue with whether the minus zero gets printed as such. I wonder whether similar effects explain some of the other platform-specific oddities we've seen with minus zero. Anyway, at this point I'd say let's just leave gaur broken so far as the geometric tests are concerned, pending results from the concurrent thread about possibly rewriting snprintf.c's float handling to not depend on the platform's sprintf. If that doesn't happen, we can revisit some sort of narrower fix for this. The narrow fix ought to be in snprintf.c anyway, not anywhere near the geometric code. I notice BTW that it's sort of accidental that snprintf.c behaves properly for minus zero on most machines. The test "value < 0" isn't true, so it doesn't think there's a sign. When sprintf outputs a "-" anyway, that's effectively treated as a digit. We'd do the wrong thing with a format like "%+f", and maybe in other cases too. regards, tom lane
If you look at the differing results carefully, there's this one: *** 3249,3255 **** ! [(0,0),(3,0),(4,5),(1,6)] | (-5,-12) | [(0,-0),(-15,-36),(40,-73),(67,-42)] --- 3249,3255 ---- ! [(0,0),(3,0),(4,5),(1,6)] | (-5,-12) | [(0,0),(-15,-36),(40,-73),(67,-42)] (Third column is first multiplied by second). I wonder why the expected file has a -0 only in the second position and not both first and second. These are both positive zeroes being multiplied by a negative number. Why is 0 * -12 = -0 yet 0 * -5 = 0? What is going on? Is the sign suppressed for negative zeros only in the first coordinate? I suppose this is just a side effect of how float8_mi, _pl, _mul work (in point_mul_point). Anyway maybe your test case should use more of the float8 op combinations in order to show the difference. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 09/27/2018 07:21 PM, Alvaro Herrera wrote: > If you look at the differing results carefully, there's this one: > > *** 3249,3255 **** > ! [(0,0),(3,0),(4,5),(1,6)] | (-5,-12) | [(0,-0),(-15,-36),(40,-73),(67,-42)] > --- 3249,3255 ---- > ! [(0,0),(3,0),(4,5),(1,6)] | (-5,-12) | [(0,0),(-15,-36),(40,-73),(67,-42)] > > (Third column is first multiplied by second). > > I wonder why the expected file has a -0 only in the second position and > not both first and second. These are both positive zeroes being > multiplied by a negative number. Why is 0 * -12 = -0 yet 0 * -5 = 0? > What is going on? Is the sign suppressed for negative zeros only in the > first coordinate? I suppose this is just a side effect of how > float8_mi, _pl, _mul work (in point_mul_point). > > Anyway maybe your test case should use more of the float8 op > combinations in order to show the difference. > I may be missing what you're saying, but point_mul_point is not just a simple multiplication of coordinates, i.e. (x1,y1) * (x2,y2) != (x1*x2, y1*y2) It essentially does this: ((x1 * x2 - y1 * y2), (x1 * y2 + x2 * y1)) so I wouldn't be surprised if this was a difference between _pl and _mi. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 09/27/2018 07:05 PM, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On 09/27/2018 12:48 AM, Tom Lane wrote: >>> Actually, it seems simpler than that: gaur produces plus zero already >>> from the multiplication: >>> regression=# select '-3'::float8 * '0'::float8; >>> ?column? >>> ---------- >>> 0 >>> (1 row) > >> Hmmm, interesting. But I still don't quite understand why the test >> program still produced -0.000000 and not 0.000000. That seems like a >> direct contradiction to what we see in regression tests, doesn't it? > > OK, so after poking at it for another hour and getting more and more > confused, I realized that gdb was lying to me by printing genuine > minus zero values as just "0". Throw out everything I thought I knew > and start over ... > Heh. A debugger lying to you just a wee bit is fun ... > ... and awhile later, this is the answer: on this machine, > printf with "%f" will show the sign of minus zero. But printf > with "%g" will not. Guess which format float8out uses. > (I'll bet that gdb does too, so that its lie wasn't its fault.) > > AFAICT at the moment, gaur is doing the underlying IEEE float math > the same as everybody else, which is not very surprising because > HP bought into IEEE math pretty early. Hex-dumping shows conclusively > that point_mul_point *does* emit (-0,0) in the case in question. > But we've got a platform-specific issue with whether the minus zero > gets printed as such. I wonder whether similar effects explain some > of the other platform-specific oddities we've seen with minus zero. > > Anyway, at this point I'd say let's just leave gaur broken so far as the > geometric tests are concerned, pending results from the concurrent thread > about possibly rewriting snprintf.c's float handling to not depend on the > platform's sprintf. If that doesn't happen, we can revisit some sort > of narrower fix for this. The narrow fix ought to be in snprintf.c > anyway, not anywhere near the geometric code. > > I notice BTW that it's sort of accidental that snprintf.c behaves properly > for minus zero on most machines. The test "value < 0" isn't true, so > it doesn't think there's a sign. When sprintf outputs a "-" anyway, > that's effectively treated as a digit. We'd do the wrong thing with a > format like "%+f", and maybe in other cases too. > OK, makes sense. Thanks for the investigation! regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-Sep-27, Tomas Vondra wrote: > I may be missing what you're saying, but point_mul_point is not just a > simple multiplication of coordinates, i.e. > > (x1,y1) * (x2,y2) != (x1*x2, y1*y2) > > It essentially does this: > > ((x1 * x2 - y1 * y2), (x1 * y2 + x2 * y1)) > > so I wouldn't be surprised if this was a difference between _pl and _mi. Yeah, I had misinterpreted the operation before reading the code, then when reading it I realized the formula is what you were saying, so I updated the final part of my reply but failed to realize I had written my misunderstanding in the first portion. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services