On Tue, Feb 7, 2017 at 4:13 PM, Joel Jacobson <joel@trustly.com> wrote:
Hi hackers,
Currently there is no simple way to check if two sets are equal.
Looks like no RDBMS in the world has a simple command for it.
You have to do something like:
WITH T1 AS (SELECT * FROM Foo WHERE FooID BETWEEN 1 AND 10000), T2 AS (SELECT * FROM Bar WHERE BarID BETWEEN 1 AND 10000) SELECT GREATEST( (SELECT COUNT(*) FROM T1), (SELECT COUNT(*) FROM T2) ) = (SELECT COUNT(*) FROM ( SELECT * FROM T1 INTERSECT ALL SELECT * FROM T2 ) AS X) INTO _Identical;
or,
SELECT 'Missmatch!' WHERE EXISTS ( SELECT * FROM Foo FULL JOIN Bar ON (Foo.FooID = Bar.BarID AND Foo IS NOT DISTINCT FROM Bar) WHERE TRUE AND ( Foo.FooID BETWEEN 1 AND 10000 AND Bar.BarID BETWEEN 1 AND 10000 ) AND ( Foo.FooID IS NULL OR Bar.BarID IS NULL);
Introducing new SQL keywords is of course not an option, since it would possibly break backwards compatibility.
So here is an idea that doesn't break backwards compatibility:
Let's give a meaning for the existing IS DISTINCT and IS NOT DISTINCT, that is currently a syntax error when used between two sets.
SELECT 1 IS DISTINCT FROM SELECT 1; ERROR: syntax error at or near "SELECT" LINE 1: SELECT 1 IS DISTINCT FROM SELECT 1;
The example above could be written as:
_Identical := ( SELECT * FROM Foo WHERE FooID BETWEEN 1 AND 10000 IS NOT DISTINCT FROM SELECT * FROM Bar WHERE BarID BETWEEN 1 AND 10000 );
Which would set _Identical to TRUE if the two sets are equal, and FALSE otherwise.
Since it's currently a syntax error, there is no risk for changed behaviour for any existing executable queries.