Re: Distinct oddity - Mailing list pgsql-sql
From | Glenn Maynard |
---|---|
Subject | Re: Distinct oddity |
Date | |
Msg-id | bd36f99e0905131148k5255d63ai69c76072efa078c5@mail.gmail.com Whole thread Raw |
In response to | Re: Distinct oddity (Maximilian Tyrtania <maximilian.tyrtania@onlinehome.de>) |
Responses |
Re: Distinct oddity
|
List | pgsql-sql |
For purposes of DISTINCT, I'd expect any sort order should do; all it needs is for equal values to be grouped together. If strcoll() ever fails to do that, I'd call it a critical bug--even throwing total garbage at it should result in a consistent ordering, even if the ordering itself is totally meaningless. Many sort functions depend on this. On Wed, May 13, 2009 at 8:37 AM, Maximilian Tyrtania <maximilian.tyrtania@onlinehome.de> wrote: > Opened those files (with textwrangler, as I 've never used sed), stripped > off the '$', sorted and looked at the differences (using textwranglers > "compare documents"-feature). Can you narrow down what triggers this? Try copying off the table, and running this: ****** CREATE FUNCTION is_inconsistent() RETURNS BOOLEAN LANGUAGE SQL AS $$select (select count(distinct(f.bezeichnung)) fromfirmen_copy f) <> (select count(distinct(f.bezeichnung||'1')) from firmen_copy f) $$; -- If deleting [first,last] leaves the results inconsistent, return true; otherwise -- roll back the deletion and return false. CREATE FUNCTION test_delete_range(first BIGINT, last BIGINT) RETURNS BOOLEAN LANGUAGE plpgsql AS $$ BEGIN DELETE FROM firmen_copy WHERE id BETWEEN first AND last; IF is_inconsistent() THEN RETURN true; END IF; SELECT 1/0; EXCEPTION WHEN division_by_zero THEN RETURN false; END; $$; CREATE FUNCTION test_func() RETURNS INTEGER LANGUAGE plpgsql AS $$ DECLARE total bigint; BEGIN IF NOT is_inconsistent() THEN RETURN -1; END IF; LOOP total := (SELECT MAX(id) FROM firmen_copy); IF test_delete_range(0, total/2) THEN CONTINUE; END IF; IF test_delete_range(total*1/4, total*3/4)THEN CONTINUE; END IF; IF test_delete_range(total/2, total) THEN CONTINUE; END IF; RETURN 0; END LOOP; END; $$; SELECT test_func(); ****** This assumes you have a primary key named "id", and that your IDs start around 0 and are vaguely monotonic (renumber them in the copy if necessary). I can't easily test this code, of course, but it's a simple binary search. Depending on what's triggering this, it may or may not be able to narrow in on a test case. Tangentally, is there a better way of rolling back a function than a dumb hack like "SELECT 1/0"? -- Glenn Maynard