Re: Distinct oddity - Mailing list pgsql-sql
| From | Glenn Maynard |
|---|---|
| Subject | Re: Distinct oddity |
| Date | |
| Msg-id | bd36f99e0905131148k5255d63ai69c76072efa078c5@mail.gmail.com Whole thread Raw |
| In response to | Re: Distinct oddity (Maximilian Tyrtania <maximilian.tyrtania@onlinehome.de>) |
| Responses |
Re: Distinct oddity
|
| List | pgsql-sql |
For purposes of DISTINCT, I'd expect any sort order should do; all it
needs is for equal values to be grouped together. If strcoll() ever
fails to do that, I'd call it a critical bug--even throwing total
garbage at it should result in a consistent ordering, even if the
ordering itself is totally meaningless. Many sort functions depend on
this.
On Wed, May 13, 2009 at 8:37 AM, Maximilian Tyrtania
<maximilian.tyrtania@onlinehome.de> wrote:
> Opened those files (with textwrangler, as I 've never used sed), stripped
> off the '$', sorted and looked at the differences (using textwranglers
> "compare documents"-feature).
Can you narrow down what triggers this? Try copying off the table,
and running this:
******
CREATE FUNCTION is_inconsistent() RETURNS BOOLEAN LANGUAGE SQL AS $$select (select count(distinct(f.bezeichnung))
fromfirmen_copy f) <> (select count(distinct(f.bezeichnung||'1')) from firmen_copy f)
$$;
-- If deleting [first,last] leaves the results inconsistent, return
true; otherwise
-- roll back the deletion and return false.
CREATE FUNCTION test_delete_range(first BIGINT, last BIGINT)
RETURNS BOOLEAN LANGUAGE plpgsql AS $$
BEGIN DELETE FROM firmen_copy WHERE id BETWEEN first AND last; IF is_inconsistent() THEN RETURN true; END
IF; SELECT 1/0;
EXCEPTION WHEN division_by_zero THEN RETURN false;
END;
$$;
CREATE FUNCTION test_func()
RETURNS INTEGER LANGUAGE plpgsql AS $$
DECLARE total bigint;
BEGIN IF NOT is_inconsistent() THEN RETURN -1; END IF;
LOOP total := (SELECT MAX(id) FROM firmen_copy);
IF test_delete_range(0, total/2) THEN CONTINUE; END IF; IF test_delete_range(total*1/4,
total*3/4)THEN
CONTINUE; END IF; IF test_delete_range(total/2, total) THEN CONTINUE; END IF;
RETURN 0; END LOOP;
END;
$$;
SELECT test_func();
******
This assumes you have a primary key named "id", and that your IDs
start around 0 and are vaguely monotonic (renumber them in the copy if
necessary). I can't easily test this code, of course, but it's a
simple binary search. Depending on what's triggering this, it may or
may not be able to narrow in on a test case.
Tangentally, is there a better way of rolling back a function than a
dumb hack like "SELECT 1/0"?
--
Glenn Maynard