Thread: BUG #7509: x NOT IN (select x from z) extremely slow in compare to select x from y except select x from z;
BUG #7509: x NOT IN (select x from z) extremely slow in compare to select x from y except select x from z;
From
stefan@konink.de
Date:
The following bug has been logged on the website: Bug reference: 7509 Logged by: Stefan de Konink Email address: stefan@konink.de PostgreSQL version: 9.1.5 Operating system: Linux = Description: = The following is relatively fast: bag-2012-aug=3D# explain select count(*) from (select kvk from kvk_normal except select kvk from bag_kvk) as x; QUERY PLAN = = ---------------------------------------------------------------------------= ------------------------------- Aggregate (cost=3D1110465.88..1110465.89 rows=3D1 width=3D0) -> Subquery Scan on x (cost=3D1042163.45..1102413.23 rows=3D3221060 width=3D0) -> SetOp Except (cost=3D1042163.45..1070202.63 rows=3D3221060 width=3D8) -> Sort (cost=3D1042163.45..1056183.04 rows=3D5607836 widt= h=3D8) Sort Key: "*SELECT* 1".kvk -> Append (cost=3D0.00..183539.72 rows=3D5607836 width=3D8) -> Subquery Scan on "*SELECT* 1" = (cost=3D0.00..122902.20 rows=3D3221060 width=3D8) -> Seq Scan on kvk_normal = (cost=3D0.00..90691.60 rows=3D3221060 width=3D8) -> Subquery Scan on "*SELECT* 2" = (cost=3D0.00..60637.52 rows=3D2386776 width=3D8) -> Seq Scan on bag_kvk = (cost=3D0.00..36769.76 rows=3D2386776 width=3D8) The 'normal' case basically doesn't finish: bag-2012-aug=3D# explain select count(*) from (select kvk_normal.kvk from kvk_normal where kvk_normal.kvk not in (select bag_kvk.kvk from bag_kvk)) as x; QUERY PLAN = = ---------------------------------------------------------------------------= ---------- Aggregate (cost=3D103065293697.97..103065293697.98 rows=3D1 width=3D0) -> Seq Scan on kvk_normal (cost=3D0.00..103065289671.65 rows=3D1610530 width=3D0) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Materialize (cost=3D0.00..58027.64 rows=3D2386776 width=3D8) -> Seq Scan on bag_kvk (cost=3D0.00..36769.76 rows=3D238= 6776 width=3D8) (6 rows) Table size is 3.2mil rows in adres, and 2.3mil rows in bag_kvk.
Re: BUG #7509: x NOT IN (select x from z) extremely slow in compare to select x from y except select x from z;
From
Tom Lane
Date:
stefan@konink.de writes: > The following is relatively fast: > bag-2012-aug=# explain select count(*) from (select kvk from kvk_normal > except select kvk from bag_kvk) as x; > The 'normal' case basically doesn't finish: > bag-2012-aug=# explain select count(*) from (select kvk_normal.kvk from > kvk_normal where kvk_normal.kvk not in (select bag_kvk.kvk from bag_kvk)) as > x; NOT IN is difficult to optimize, as well as hard to use, because of its rather bizarre behavior for nulls. You might consider using NOT EXISTS instead. regards, tom lane