Thread: convert real to numeric.
Hi,
Postresql version: 10.5
I need to convert an SQL field from real to numeric, but I’m getting a strange behavior.
See the following query in preprod:
select amount, amount::numeric, amount::numeric(16,4), amount::varchar::numeric from mytable where id = 32560545;
Result:
17637.75, 17637.8, 17637.8000, 17637.75
As you can see, the conversion to ::numeric is truncating the number to just 1 decimal digit.
Also we tried to change the schema definition of this table, from real to numeric, and the value was truncated.
Is the ::varchar::numeric workaround a good option in your opinion? Any other ideas to fix this issue?
Thank you,
On 10/18/18 3:28 AM, Alessandro Aste wrote: > Hi, > > > > Postresql version: 10.5 > > I need to convert an SQL field from real to numeric, but I’m getting a > strange behavior. > > See the following query in preprod: > > select amount, amount::numeric, amount::numeric(16,4), > amount::varchar::numeric from mytable where id = 32560545; > > Result: > > 17637.75, 17637.8, 17637.8000, 17637.75 > > As you can see, the conversion to ::numeric is truncating the number to > just 1 decimal digit. > > Also we tried to change the schema definition of this table, from real > to numeric, and the value was truncated. > > Is the ::varchar::numeric workaround a good option in your opinion? Any > other ideas to fix this issue? select 17637.75::real::numeric; numeric --------- 17637.8 select 17637.75::float::numeric; numeric ---------- 17637.75 > > Thank you, > -- Adrian Klaver adrian.klaver@aklaver.com
Alessandro Aste <alessandro.aste@gmail.com> writes: > I need to convert an SQL field from real to numeric, but I’m getting a > strange behavior. > select amount, amount::numeric, amount::numeric(16,4), > amount::varchar::numeric from mytable where id = 32560545; > Result: > 17637.75, 17637.8, 17637.8000, 17637.75 You realize of course that "real" (a/k/a float4) can only be trusted to six decimal digits in the first place. When I try this, I get regression=# select '17637.75'::real, '17637.75'::real::numeric; float4 | numeric ---------+--------- 17637.8 | 17637.8 (1 row) because the float4 output function rounds it off at the last trustworthy digit. Your results suggest that you must be running with extra_float_digits set to 1, which affects the behavior of the float4 output function ... but not that of float4->numeric conversion. You could ju-jitsu the system into duplicating that behavior by casting to text (which invokes float4out) and then to numeric: regression=# set extra_float_digits to 1; SET regression=# select '17637.75'::real, '17637.75'::real::numeric; float4 | numeric ----------+--------- 17637.75 | 17637.8 (1 row) regression=# select '17637.75'::real, '17637.75'::real::text::numeric; float4 | numeric ----------+---------- 17637.75 | 17637.75 (1 row) I think, however, that you ought to spend some time contemplating the fact that that extra digit is partially garbage. I'm not really convinced that doing it like this rather than doing the standard conversion is a good idea. You can't manufacture precision where there is none --- so it might be better to do the standard conversion and then go back and fix any values you can demonstrate are wrong. regards, tom lane
On 2018-10-18 10:15:40 -0400, Tom Lane wrote: > Alessandro Aste <alessandro.aste@gmail.com> writes: > > I need to convert an SQL field from real to numeric, but I’m getting a > > strange behavior. > > select amount, amount::numeric, amount::numeric(16,4), > > amount::varchar::numeric from mytable where id = 32560545; > > Result: > > 17637.75, 17637.8, 17637.8000, 17637.75 > > You realize of course that "real" (a/k/a float4) can only be trusted > to six decimal digits in the first place. When I try this, I get 24 bits, actually. Using decimal digits when talking about binary numbers is misleading. [...] > You could ju-jitsu the system into duplicating that behavior by casting > to text (which invokes float4out) and then to numeric: > > regression=# set extra_float_digits to 1; > SET > regression=# select '17637.75'::real, '17637.75'::real::numeric; > float4 | numeric > ----------+--------- > 17637.75 | 17637.8 > (1 row) > > regression=# select '17637.75'::real, '17637.75'::real::text::numeric; > float4 | numeric > ----------+---------- > 17637.75 | 17637.75 > (1 row) I suggest casting first to float8 and then to numeric. The conversion from float4 to float8 is exact, and any rounding error introduced by the float8->numeric conversion is certainly much smaller than the uncertainty of the original float4 value. > I think, however, that you ought to spend some time contemplating > the fact that that extra digit is partially garbage. If we assume that 17637.75 was the result of rounding a more precise value to a float4, then the real value was somewhere between 17637.7490234375 and 17637.7509765625. Rounding to 17637.8 introduces an error almost 50 times larger. > I'm not really convinced that doing it like this rather than doing the > standard conversion is a good idea. You can't manufacture precision > where there is none It may be that the real value of that number is only known to +/- 0.1. Or maybe only to +/- 100. But postgresql can't know that, and gratuitously adding additional rounding errors doesn't help. hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
Attachment
"Peter J. Holzer" <hjp-pgsql@hjp.at> writes: > On 2018-10-18 10:15:40 -0400, Tom Lane wrote: >> You could ju-jitsu the system into duplicating that behavior by casting >> to text (which invokes float4out) and then to numeric: > I suggest casting first to float8 and then to numeric. The conversion > from float4 to float8 is exact, and any rounding error introduced by the > float8->numeric conversion is certainly much smaller than the > uncertainty of the original float4 value. I do not think that the OP will care for the results of that. The problem is that now the output function will think that the result is worth printing to 16 digits, and the last ten or so of those will be garbage. As an example, even though the cited value happens to work nicely: regression=# select '17637.75'::float4::float8; float8 ---------- 17637.75 (1 row) nearby ones don't: regression=# select '17637.74'::float4::float8; float8 ----------------- 17637.740234375 (1 row) Yeah, in some sense that's a valid representation of the stored float4, but it likely has little to do with the originally presented value. regards, tom lane
On 2018-10-18 18:58:13 -0400, Tom Lane wrote: > "Peter J. Holzer" <hjp-pgsql@hjp.at> writes: > > On 2018-10-18 10:15:40 -0400, Tom Lane wrote: > >> You could ju-jitsu the system into duplicating that behavior by casting > >> to text (which invokes float4out) and then to numeric: > > > I suggest casting first to float8 and then to numeric. The conversion > > from float4 to float8 is exact, and any rounding error introduced by the > > float8->numeric conversion is certainly much smaller than the > > uncertainty of the original float4 value. > > I do not think that the OP will care for the results of that. The problem > is that now the output function will think that the result is worth > printing to 16 digits, and the last ten or so of those will be garbage. You are thinking about "printing" (i.e., presentation to a user), I am thinking about "converting" (i.e. storing the value as accurately as possible in a different presentation). These are different things, You should only think about "printing" when you actually print a value. Depending on the inherent accuracy of the value and the purpose of the display the best way to display the float4 value 17637.740234375 may be '17637.740234375' (the exact value), '17637.74' (the minimum number of decimal digits necessary to reconstruct the exact value), '17638' (we don't care about fractions), '17,700' (three digits ought to be enough for anybody), '17.7k' (the same, but more compact) or anything between. It is for the application programmer to decide how to display a value, because the programmer knows what it means, where it comes from and what the user is supposed to do with that information. The runtime environment doesn't know this. So it shouldn't throw away accuracy. Often even parts of the application don't know this. So they shouldn't either. > As an example, even though the cited value happens to work nicely: > > regression=# select '17637.75'::float4::float8; > float8 > ---------- > 17637.75 > (1 row) > > nearby ones don't: > > regression=# select '17637.74'::float4::float8; > float8 > ----------------- > 17637.740234375 > (1 row) > > Yeah, in some sense that's a valid representation of the stored float4, It is indeed the precise value which is stored (9030523 / 512). > but it likely has little to do with the originally presented value. It is much closer to the original value than 17637.7 (the relative errors are about 1.33e-8 and 2.27e-6, so the error introduced by PostgreSQLs default rounding is 170 times greater). I think using FLT_DIG and DBL_DIG for converting from binary to decimal is wrong. They represent the "number of decimal digits, q, such that any floating-point number with q decimal digits can be rounded into a floating-point number with p radix b digits and back again without change to the q decimal digits" (ISO/IEC 9899:2011), which is not the same as the number of decimal digits sufficient to convert a binary to decimal and back again. Unfortunately, the latter is not a constant and converting binary to decimal isn't trivial - nevertheless the problem has been solved (at least for IEEE-754 arithmetic), so I might take a stab at it (even with my limited numerical skills). hp -- _ | Peter J. Holzer | we build much bigger, better disasters now |_|_) | | because we have much more sophisticated | | | hjp@hjp.at | management tools. __/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>