Thread: BUG #17590: [OPTIMIZER] DELETE never ends on a small table, found a workaround which makes it instant
BUG #17590: [OPTIMIZER] DELETE never ends on a small table, found a workaround which makes it instant
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17590 Logged by: Guillaume FOUET Email address: g.fouet@gmail.com PostgreSQL version: 14.5 Operating system: Windows 10 - 64x Description: Tested with Postgres 13.5, 13.8 and 14.5 (I updated to see if that was a bug). I have four tables: ADDRESS, COMPANY, CONTACT, TRIP ADDRESS has ~130000 rows COMPANY has a FK toward ADDRESS and ~16000 rows CONTACT has a FK toward ADDRESS and ~12000 rows TRIP has a FK toward ADDRESS and ~137500 rows We wanted to purge the address table from old, unused addresses: DELETE FROM address WHERE address_id NOT IN ( SELECT DISTINCT address_id FROM company UNION SELECT DISTINCT address_id FROM contact UNION SELECT DISTINCT address_id FROM trip ); This query above never ends (I waited 15 minutes and it was still going, HDD doing nothing, one core CPU used). The EXPLAIN says it materializes the address_ids aggregates then scans ADDRESS for deletion. After many tries, I made this query instead: CREATE TEMPORARY TABLE used_address_id AS ( SELECT DISTINCT address_id FROM company UNION SELECT DISTINCT address_id FROM contact UNION SELECT DISTINCT address_id FROM trip ); CREATE UNIQUE INDEX ON used_address_id (address_id); DELETE FROM used_address_id WHERE address_id IS NULL; DELETE FROM address WHERE address_id NOT IN (SELECT address_id FROM used_address_id); This was resolved in 500ms (basically instantly). I have the feeling there's an O(n²) somewhere in the first query. Thanks.
Re: BUG #17590: [OPTIMIZER] DELETE never ends on a small table, found a workaround which makes it instant
From
Tomas Vondra
Date:
On 8/19/22 17:13, PG Bug reporting form wrote: > The following bug has been logged on the website: > > Bug reference: 17590 > Logged by: Guillaume FOUET > Email address: g.fouet@gmail.com > PostgreSQL version: 14.5 > Operating system: Windows 10 - 64x > Description: > > Tested with Postgres 13.5, 13.8 and 14.5 (I updated to see if that was a > bug). > > I have four tables: ADDRESS, COMPANY, CONTACT, TRIP > ADDRESS has ~130000 rows > COMPANY has a FK toward ADDRESS and ~16000 rows > CONTACT has a FK toward ADDRESS and ~12000 rows > TRIP has a FK toward ADDRESS and ~137500 rows > > We wanted to purge the address table from old, unused addresses: > DELETE FROM address WHERE address_id NOT IN ( > SELECT DISTINCT address_id FROM company > UNION > SELECT DISTINCT address_id FROM contact > UNION > SELECT DISTINCT address_id FROM trip > ); > > This query above never ends (I waited 15 minutes and it was still going, HDD > doing nothing, one core CPU used). > The EXPLAIN says it materializes the address_ids aggregates then scans > ADDRESS for deletion. It's probably better to include the query plan. Anyway, the union essentially creates a new relation, making indexes (on the base relations unusable). > > After many tries, I made this query instead: > CREATE TEMPORARY TABLE used_address_id AS ( > SELECT DISTINCT address_id FROM company > UNION > SELECT DISTINCT address_id FROM contact > UNION > SELECT DISTINCT address_id FROM trip > ); > CREATE UNIQUE INDEX ON used_address_id (address_id); > DELETE FROM used_address_id WHERE address_id IS NULL; > DELETE FROM address WHERE address_id NOT IN (SELECT address_id FROM > used_address_id); > > This was resolved in 500ms (basically instantly). > I have the feeling there's an O(n²) somewhere in the first query. Yeah. The temporary table means we can use the indexes again. I'm not sure I'd call this a bug, it's simply how we deal with this sort of queries. Maybe try splitting the one "NOT IN" condition into a separate condition per table. I mean, something like DELETE FROM address WHERE address_id NOT IN (SELECT address_id FROM company) AND address_id NOT IN (SELECT address_id FROM contact) AND address_id NOT IN (SELECT address_id FROM trip) or something like that. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company