Thread: my float function returns NaN
I have a function to calculate distance between two points. When asking for distance between two points that are actually the same point; my application (which rounds to only very few decimal places), might return zero or NaN depending on decimal accuracy of point. I am able to reproduce the problem with the following SQL statements. CREATE OR REPLACE FUNCTION geo_deg2rad(float) RETURNS float AS $$ SELECT ($1 * pi()::float / 180::float) $$ LANGUAGE SQL IMMUTABLE CREATE OR REPLACE FUNCTION geo_dist(point,point) RETURNS float AS $$ SELECT 6371.0 * acos( sin(geo_deg2rad($1[0]))*sin(geo_deg2rad($2[0])) + cos(geo_deg2rad($1[0]))*cos(geo_deg2rad($2[0])) * cos(geo_deg2rad($2[1])-geo_deg2rad($1[1])) ) $$ LANGUAGE SQL IMMUTABLE create table geo (p point); insert into geo values (point(44.85051666667,-79.5405)); insert into geo values (point(44.850516667,-79.5405)); select p,geo_dist(p,p) from geo; p | geo_dist ---------------------------+---------------------- (44.85051666667,-79.5405) | NaN (44.850516667,-79.5405) | 0.000134258785931453 Why does the point with more decimal accuracy result in NaN? If its not easy to "fix" my function; is the a function that can convert NaN to zero (like coalesce(p,0) would convert null to zero)? PostgreSQL 9.0.4 on x86_64-unknown-openbsd5.0, compiled by GCC cc (GCC) 4.2.1 20070719 , 64-bit
Dear Frank,
My guess would be that due to rounding errors, the argument passed to acos() is outwith the interval [-1,1] and acos() is only defined within this interval (for reals). I.e the expression:
sin(theta)^2 + cos(theta)^2*cos(0)
is coming out ever so slightly greater than 1, when it should, of course, be exactly 1.
You could simply insert a check prior to calling acos() to ensure it's argument is within the acceptable range. Eg, in pseudo code:
x = sin(lat1)*sin(lat2)+cos(lat1)*cos(lat2)*cos(dlon)
if x > 1 then x=1 else if x< -1 then x = -1
acos(x)
But I'd strongly suggest you investigate using the haversine formulation for spherical distances as it is much better for small distances. Your version is mathematically sound, but computationally prone to rounding errors.
Nathaniel
I have a function to calculate distance between two points. When asking for distance between two points that are actually the same point; my application (which rounds to only very few decimal places), might return zero or NaN depending on decimal accuracy of point.
I am able to reproduce the problem with the following SQL statements.
CREATE OR REPLACE FUNCTION geo_deg2rad(float) RETURNS float AS $$ SELECT ($1 * pi()::float / 180::float) $$ LANGUAGE SQL IMMUTABLE
CREATE OR REPLACE FUNCTION geo_dist(point,point) RETURNS float AS $$ SELECT 6371.0 * acos( sin(geo_deg2rad($1[0]))*sin(geo_deg2rad($2[0])) + cos(geo_deg2rad($1[0]))*cos(geo_deg2rad($2[0])) * cos(geo_deg2rad($2[1])-geo_deg2rad($1[1])) ) $$ LANGUAGE SQL IMMUTABLE
create table geo (p point);
insert into geo values (point(44.85051666667,-79.5405));
insert into geo values (point(44.850516667,-79.5405));
select p,geo_dist(p,p) from geo;
p | geo_dist
---------------------------+----------------------
(44.85051666667,-79.5405) | NaN
(44.850516667,-79.5405) | 0.000134258785931453
Why does the point with more decimal accuracy result in NaN?
If its not easy to "fix" my function; is the a function that can convert NaN to zero (like coalesce(p,0) would convert null to zero)?
PostgreSQL 9.0.4 on x86_64-unknown-openbsd5.0, compiled by GCC cc (GCC) 4.2.1 20070719 , 64-bit
--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice
Nathaniel Trellice <naptrel@yahoo.co.uk> writes: > My guess would be that due to rounding errors, the argument passed to > acos() is outwith the interval [-1,1] and acos() is only defined > within this interval (for reals). Yeah. It's not helping any that you've defined geo_deg2rad() to do its arithmetic in float (that is float4) precision. Making it work in float8 would probably improve matters. Still, I agree that switching to a more numerically stable distance calculation would be a good idea. regards, tom lane
On 04/07/12 10:54, Tom Lane wrote: > Nathaniel Trellice<naptrel@yahoo.co.uk> writes: >> My guess would be that due to rounding errors, the argument passed to >> acos() is outwith the interval [-1,1] and acos() is only defined >> within this interval (for reals). > > Yeah. It's not helping any that you've defined geo_deg2rad() to do its > arithmetic in float (that is float4) precision. Making it work in > float8 would probably improve matters. Still, I agree that switching to > a more numerically stable distance calculation would be a good idea. > Thanks Nathaniel & Tom: Using float8 rather than float4 for deg2rad() made no difference to results; I'll continue to use float8 anyway. I've read about haversine being more accurate and more expensive (cycles), but I haven't been able to find anything that says "how" much the accuracy is affected. My queries are typically between points less than 100km apart and rounding error of 1km is acceptable. When points are farther apart; larger error is acceptable. The extra decimal digits provided in my example (which seems to be causing the problems with NaN) just happen to be what's provided to me in database. The NaN result is causing problems when I ORDER BY distance since NaN is treated more like infinity rather than zero. Checking bounds of ACOS() does resolve the NaN issue; however, now I have a performance question. If I code the function using "LANGUAGE SQL" is the optimizer smart enough to know that distA (see below) should only be calculated once? It might be easier to read my function if I present the PHP code used to create it... $R = "6371.0"; $lat1="geo_deg2rad($1[0])"; $lat2="geo_deg2rad($2[0])"; $lng1="geo_deg2rad($1[1])"; $lng2="geo_deg2rad($2[1])"; $distA="sin($lat1)*sin($lat2) + cos($lat1)*cos($lat2) * cos($lng2-$lng1)"; $distB="CASE WHEN $distA > 1 THEN 1 WHEN $distA < -1 THEN -1 ELSE $distA END"; $dist_2="$R * acos( $distB )"; Otherwise I will need to rewrite in something like plpgsql (which I have not done before now).