Thread: BUG #15758: regr_* return wrong answers for some valid data
The following bug has been logged on the website: Bug reference: 15758 Logged by: Dingyuan Wang Email address: gumblex@aosc.io PostgreSQL version: 11.2 Operating system: Debian Description: This can reproduce on at least version 10 or 11, on Debian amd64. Given the following data: CREATE TABLE testdata ( x double precision, t1 double precision, t2 double precision ); INSERT INTO testdata VALUES (79.7439999999999998, 1506705739.38499999, 1506705817.38100004); INSERT INTO testdata VALUES (123.903999999999996, 1506705766.89499998, 1506705824.0079999); INSERT INTO testdata VALUES (87.2960099999999954, 1506705746.29299998, 1506705818.24300003); INSERT INTO testdata VALUES (116.352000000000004, 1506705761.87299991, 1506705823.05500007); INSERT INTO testdata VALUES (67.7120100000000065, 1506705734.74300003, 1506705815.5539999); INSERT INTO testdata VALUES (72.9600099999999969, 1506705735.35100007, 1506705815.6500001); INSERT INTO testdata VALUES (101.632000000000005, 1506705756.25999999, 1506705820.70600009); INSERT INTO testdata VALUES (108.927999999999997, 1506705761.30699992, 1506705821.59500003); INSERT INTO testdata VALUES (94.0799999999999983, 1506705747.37199998, 1506705819.82999992); This select: SELECT regr_slope(x, t1) a1, regr_intercept(x, t1) b1, regr_r2(x, t1) r1, regr_slope(x, t2) a2, regr_intercept(x, t2) b2, regr_r2(x, t2) r2, regr_slope(t1, x) a3, regr_intercept(t1, x) b3, regr_r2(t1, x) r3, regr_slope(t2, x) a4, regr_intercept(t2, x) b4, regr_r2(t2, x) r4 FROM testdata; Outputs: null | null | null | 0.131221234798431 | -197711616 | 0.020541283900245 | 0.613632754607929 | 1506705691.81008 | 1 | 0.156539327890021 | 1506705804.72837 | 0.020541283900245 Some numbers are obviously wrong. Using Python's Decimal arithmetic, the reference answer should be: 1.5950991071, -2403344901.6928107374, 0.9788051681 6.3409218262, -9553903722.1136632550, 0.9926034164 0.6136328230, 1506705691.8100808484, 0.9788051681 0.1565392925, 1506705804.7283713567, 0.9926034164 I understand that there may be some floating point overflows/underflows, but the major spreadsheets can calculate this well (to some precision). I doubt other regr_* functions have similar problems.
On Tue, Apr 16, 2019 at 3:55 AM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 15758
Logged by: Dingyuan Wang
Email address: gumblex@aosc.io
PostgreSQL version: 11.2
Operating system: Debian
Description:
This can reproduce on at least version 10 or 11, on Debian amd64.
Given the following data:
This will be improved in version 12 when that is released later this year. See the below commit:
commit e954a727f0c8872bf5203186ad0f5312f6183746
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Date: Sat Oct 6 11:20:09 2018 +0100
Improve the accuracy of floating point statistical aggregates.
Cheers,
Jeff
Pg 12 does fix the problem, and the numbers are slightly better than that of Gnumeric and LibreOffice Calc. 2019/4/16 21:41, Jeff Janes: > On Tue, Apr 16, 2019 at 3:55 AM PG Bug reporting form > <noreply@postgresql.org <mailto:noreply@postgresql.org>> wrote: > > The following bug has been logged on the website: > > Bug reference: 15758 > Logged by: Dingyuan Wang > Email address: gumblex@aosc.io <mailto:gumblex@aosc.io> > PostgreSQL version: 11.2 > Operating system: Debian > Description: > > This can reproduce on at least version 10 or 11, on Debian amd64. > Given the following data: > > > This will be improved in version 12 when that is released later this > year. See the below commit: > > commit e954a727f0c8872bf5203186ad0f5312f6183746 > Author: Dean Rasheed <dean.a.rasheed@gmail.com > <mailto:dean.a.rasheed@gmail.com>> > Date: Sat Oct 6 11:20:09 2018 +0100 > > Improve the accuracy of floating point statistical aggregates. > > > > Cheers, > > Jeff