Materializing a sequential scan - Mailing list pgsql-performance
From | Steinar H. Gunderson |
---|---|
Subject | Materializing a sequential scan |
Date | |
Msg-id | 20051019174544.GA32303@samfundet.no Whole thread Raw |
Responses |
Re: Materializing a sequential scan
Re: Materializing a sequential scan Re: Materializing a sequential scan |
List | pgsql-performance |
Hi, I'm using PostgreSQL 8.1 beta 3 (packages from Debian experimental), and I have a (rather complex) query that seems to take forever -- when the database was just installed, it took about 1200ms (which is quite good, considering that the 7.4 system this runs on today uses about the same time, but has twice as much CPU power and runs sequential scans up to eight times as fast), but now I can never even get it to complete. I've tried running it for half an hour, but it still doesn't complete, so I'm a bit unsure what's going on. There's a _lot_ of tables and views in here, several hundres lines of SQL, but experience tells me that posting more is better than posting less, so here goes. (The data is unfortunately not public since it contains PIN codes and such, but if anybody asks I can probably send it off-list. It's ~30MB in plain pg_dump, though.) There might be a few tables that aren't referenced, but I don't really know a good way to figure out such dependencies automatically, and I'd guess most of them _are_ used :-) Apologies in advance for the Norwegian in the names. === cut here === CREATE TABLE gruppetype ( gruppetype_id integer NOT NULL PRIMARY KEY, gruppetype varchar ); CREATE TABLE gruppe ( gruppe_id serial NOT NULL PRIMARY KEY, gruppe varchar NOT NULL, beskrivelse varchar, gruppetype_id integer DEFAULT 1 NOT NULL REFERENCES gruppetype, adminacl varchar, aktiv boolean default 't' NOT NULL ); CREATE TABLE adgangsskjema ( adgangsskjema_id serial NOT NULL PRIMARY KEY, navn varchar NOT NULL, rita_navn varchar NOT NULL ); CREATE TABLE adgangsskjema_gruppe_kobling ( gruppe_id integer NOT NULL REFERENCES gruppe (gruppe_id), adgangsskjema_id integer NOT NULL REFERENCES adgangsskjema (adgangsskjema_id), PRIMARY KEY (adgangsskjema_id, gruppe_id) ); CREATE TABLE kortstatus ( kortstatus_id smallint NOT NULL PRIMARY KEY, kortstatus varchar ); CREATE TABLE korttype ( korttype_id serial NOT NULL PRIMARY KEY, korttype varchar NOT NULL, beskrivelse varchar ); CREATE TABLE medlemstatus ( medlemstatus_id serial NOT NULL PRIMARY KEY, medlemstatus varchar NOT NULL, beskrivelse varchar ); CREATE TABLE oblattype ( oblattype_id serial NOT NULL PRIMARY KEY, oblattype varchar NOT NULL, varighet interval NOT NULL ); CREATE TABLE skole ( skole_id serial NOT NULL PRIMARY KEY, skole varchar NOT NULL, beskrivelse varchar ); CREATE TABLE studie ( studie_id serial NOT NULL PRIMARY KEY, studie varchar NOT NULL, beskrivelse varchar ); CREATE TABLE poststed ( postnummer smallint NOT NULL PRIMARY KEY CHECK (postnummer >= 0 AND postnummer <= 9999), poststed varchar ); CREATE TABLE gruppekobling ( overgruppe_id integer NOT NULL REFERENCES gruppe (gruppe_id), undergruppe_id integer NOT NULL REFERENCES gruppe (gruppe_id), PRIMARY KEY (overgruppe_id, undergruppe_id) ); CREATE TABLE medlem ( medlem_id serial NOT NULL PRIMARY KEY CHECK (medlem_id > 0), fornavn varchar NOT NULL, etternavn varchar NOT NULL, hjemadresse varchar, hjem_postnummer smallint REFERENCES poststed (postnummer), studieadresse varchar, studie_postnummer smallint REFERENCES poststed (postnummer), fodselsdato date, telefon varchar, mail varchar UNIQUE, passord character(32) NOT NULL, registrert date DEFAULT now(), oppdatert date DEFAULT now(), skole_id integer REFERENCES skole, studie_id integer REFERENCES studie, medlemstatus_id integer DEFAULT 1 NOT NULL REFERENCES medlemstatus, pinkode smallint CHECK ((pinkode >= 0 AND pinkode <= 9999) OR pinkode IS NULL), UNIQUE ( LOWER(mail) ) ); CREATE TABLE kort ( kortnummer integer NOT NULL PRIMARY KEY CHECK (kortnummer > 0), medlem_id integer REFERENCES medlem DEFERRABLE, korttype_id integer DEFAULT 1 NOT NULL REFERENCES korttype, serie_registrert date DEFAULT now() NOT NULL, bruker_registrert date, kortstatus_id integer DEFAULT 1 NOT NULL REFERENCES kortstatus ); CREATE TABLE oblat ( oblatnummer integer NOT NULL PRIMARY KEY CHECK (oblatnummer > 0), oblattype_id integer NOT NULL REFERENCES oblattype, "start" date NOT NULL, kortnummer integer REFERENCES kort, bruker_registrert date, serie_registrert date DEFAULT NOW() NOT NULL ); CREATE TABLE verv ( medlem_id integer NOT NULL REFERENCES medlem, gruppe_id integer NOT NULL REFERENCES gruppe, "start" date DEFAULT now() NOT NULL, stopp date, CHECK ( stopp >= start ), PRIMARY KEY ( medlem_id, gruppe_id, "start" ) ); CREATE TABLE nytt_passord ( medlem_id integer NOT NULL REFERENCES medlem, hash varchar NOT NULL, tidspunkt date DEFAULT now() NOT NULL ); CREATE VIEW gyldige_medlemskap AS SELECT medlem_id,MAX("start"+varighet) AS stopp FROM kort JOIN oblat ON kort.kortnummer=oblat.kortnummer NATURAL JOIN oblattype WHERE kortstatus_id=1 AND medlem_id IS NOT NULL GROUP BY medlem_id HAVING MAX("start"+varighet) >= current_date; CREATE SCHEMA kortsys2; CREATE FUNCTION kortsys2.effektiv_dato(date) RETURNS date AS 'SELECT CASE WHEN $1 < CURRENT_DATE THEN CURRENT_DATE ELSE $1 END' LANGUAGE SQL STABLE; CREATE VIEW kortsys2.mdb_personer AS SELECT * FROM ( SELECT DISTINCT ON (medlem_id) medlem_id,fornavn,etternavn,mail,pinkode,kort.kortnummer AS kortnummer FROM medlem NATURAL JOIN kort -- the member must have an ID card WHERE kortstatus_id=1 -- the card must be active AND korttype_id IN (2,3) -- the card must be an ID card or UKA ID card AND pinkode IS NOT NULL -- the member must have a PIN AND medlem_id IN ( -- the member must be active in at least one group SELECT medlem_id FROM verv WHERE stopp IS NULL OR stopp >= current_date ) AND medlem_id IN ( -- the member must have a valid membership SELECT medlem_id FROM gyldige_medlemskap ) ORDER BY medlem_id, -- needed for the DISTINCT korttype_id -- prioritize ID cards over UKA ID cards ) AS t1 UNION ALL SELECT * FROM eksterne_kort.eksterne_personer; CREATE TABLE kortsys2.rita_personer ( medlem_id integer PRIMARY KEY NOT NULL, fornavn varchar NOT NULL, etternavn varchar NOT NULL, mail varchar NOT NULL, pinkode smallint NOT NULL CHECK (pinkode >= 0 AND pinkode <= 9999), kortnummer integer UNIQUE NOT NULL ); CREATE TABLE kortsys2.personer_tving_sletting ( medlem_id integer PRIMARY KEY NOT NULL ); CREATE VIEW kortsys2.personer_skal_slettes AS SELECT medlem_id FROM kortsys2.rita_personer WHERE (medlem_id,pinkode,kortnummer) NOT IN ( SELECT medlem_id,pinkode,kortnummer FROM kortsys2.mdb_personer ) UNION SELECT medlem_id FROM kortsys2.personer_tving_sletting; CREATE TABLE kortsys2.personer_nylig_slettet ( medlem_id integer PRIMARY KEY NOT NULL ); CREATE VIEW kortsys2.personer_skal_eksporteres AS SELECT * FROM kortsys2.mdb_personer WHERE medlem_id NOT IN ( SELECT medlem_id FROM kortsys2.rita_personer ) AND medlem_id NOT IN ( SELECT medlem_id FROM kortsys2.personer_nylig_slettet ); CREATE TABLE kortsys2.mdb_gruppekobling_temp ( overgruppe_id INTEGER NOT NULL, undergruppe_id INTEGER NOT NULL ); CREATE OR REPLACE FUNCTION kortsys2.mdb_gruppekobling_transitiv_tillukning() RETURNS SETOF gruppekobling AS ' DECLARE r RECORD; BEGIN INSERT INTO kortsys2.mdb_gruppekobling_temp SELECT overgruppe_id,undergruppe_id FROM gruppekobling gk JOIN gruppe g1 ON gk.overgruppe_id=g1.gruppe_id JOIN gruppe g2 ON gk.overgruppe_id=g2.gruppe_id WHERE g1.aktiv AND g2.aktiv; LOOP INSERT INTO kortsys2.mdb_gruppekobling_temp SELECT g1.overgruppe_id, g2.undergruppe_id FROM kortsys2.mdb_gruppekobling_temp g1 JOIN kortsys2.mdb_gruppekobling_temp g2 ON g1.undergruppe_id=g2.overgruppe_id WHERE (g1.overgruppe_id, g2.undergruppe_id) NOT IN ( SELECT * FROM kortsys2.mdb_gruppekobling_temp ); EXIT WHEN NOT FOUND; END LOOP; FOR r IN SELECT * from kortsys2.mdb_gruppekobling_temp LOOP RETURN NEXT r; END LOOP; DELETE FROM kortsys2.mdb_gruppekobling_temp; RETURN; END; ' LANGUAGE plpgsql; CREATE VIEW kortsys2.mdb_gruppetilgang AS SELECT DISTINCT gk.undergruppe_id AS gruppe_id, rita_navn FROM ( SELECT * FROM mdb_gruppekobling_transitiv_tillukning() UNION SELECT gruppe_id,gruppe_id FROM gruppe WHERE aktiv ) gk JOIN adgangsskjema_gruppe_kobling ak ON gk.overgruppe_id=ak.gruppe_id NATURAL JOIN adgangsskjema; CREATE VIEW kortsys2.mdb_tilgang AS SELECT t1.medlem_id AS medlem_id, rita_navn, "start", CASE WHEN m_stopp < stopp OR stopp IS NULL THEN m_stopp ELSE stopp END AS stopp FROM ( SELECT medlem_id, gruppe_id, ms.stopp AS m_stopp, MIN("start") AS start, MAX(v.stopp) AS stopp FROM ( SELECT * FROM verv UNION ALL SELECT * FROM eksterne_kort.vervekvivalens ) v JOIN ( SELECT * FROM gyldige_medlemskap ms UNION ALL SELECT medlem_id,stopp FROM eksterne_kort.vervekvivalens ) ms USING (medlem_id) WHERE ( v.stopp IS NULL OR v.stopp >= current_date ) GROUP BY medlem_id,gruppe_id,ms.stopp ) t1 JOIN mdb_gruppetilgang gt ON t1.gruppe_id=gt.gruppe_id WHERE medlem_id IN ( SELECT medlem_id FROM mdb_personer ) ; CREATE VIEW kortsys2.mdb_effektiv_tilgang AS SELECT medlem_id, rita_navn, MIN("start") AS "start", MAX(stopp) AS stopp FROM kortsys2.mdb_tilgang GROUP BY medlem_id,rita_navn HAVING MAX(stopp) >= current_date; CREATE TABLE kortsys2.rita_tilgang ( medlem_id integer NOT NULL REFERENCES kortsys2.rita_personer, rita_navn varchar NOT NULL, "start" date NOT NULL, stopp date NOT NULL, PRIMARY KEY ( medlem_id, rita_navn ) ); CREATE VIEW kortsys2.tilganger_skal_slettes AS SELECT * FROM kortsys2.rita_tilgang WHERE medlem_id NOT IN ( SELECT medlem_id FROM kortsys2.personer_nylig_slettet ) AND (medlem_id,rita_navn,kortsys2.effektiv_dato("start"),stopp) NOT IN ( SELECT medlem_id,rita_navn,kortsys2.effektiv_dato("start"),stopp FROM kortsys2.mdb_effektiv_tilgang ); CREATE VIEW kortsys2.tilganger_skal_gis AS SELECT medlem_id,rita_navn,"start",stopp FROM kortsys2.mdb_effektiv_tilgang WHERE medlem_id NOT IN ( SELECT medlem_id FROM kortsys2.personer_nylig_slettet ) AND (medlem_id,rita_navn,kortsys2.effektiv_dato("start"),stopp) NOT IN ( SELECT medlem_id,rita_navn,kortsys2.effektiv_dato("start"),stopp FROM kortsys2.rita_tilgang ); === cut here === Now for the simple query: mdb2_jodal=# explain select * from kortsys2.tilganger_skal_gis ; and the monster of a query plan (no EXPLAIN ANALYZE because, well, it never finishes): QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Subquery Scan mdb_effektiv_tilgang (cost=19821.69..4920621.69 rows=10000 width=48) Filter: ((NOT (hashed subplan)) AND (NOT (subplan))) -> HashAggregate (cost=19238.48..20838.48 rows=40000 width=52) Filter: (max(CASE WHEN ((m_stopp < (stopp)::timestamp without time zone) OR (stopp IS NULL)) THEN m_stopp ELSE (stopp)::timestampwithout time zone END) >= (('now'::text)::date)::timestamp without time zone) -> Merge Join (cost=12231.86..16091.27 rows=251777 width=52) Merge Cond: ("outer".gruppe_id = "inner".gruppe_id) -> Unique (cost=483.64..514.68 rows=4138 width=30) -> Sort (cost=483.64..493.99 rows=4138 width=30) Sort Key: gk.undergruppe_id, adgangsskjema.rita_navn -> Merge Join (cost=149.81..235.06 rows=4138 width=30) Merge Cond: ("outer".overgruppe_id = "inner".gruppe_id) -> Unique (cost=92.52..101.21 rows=1159 width=8) -> Sort (cost=92.52..95.41 rows=1159 width=8) Sort Key: overgruppe_id, undergruppe_id -> Append (cost=0.00..33.53 rows=1159 width=8) -> Function Scan on mdb_gruppekobling_transitiv_tillukning (cost=0.00..12.50rows=1000 width=8) -> Seq Scan on gruppe (cost=0.00..9.44 rows=159 width=4) Filter: aktiv -> Sort (cost=57.29..59.08 rows=714 width=30) Sort Key: ak.gruppe_id -> Hash Join (cost=1.60..23.45 rows=714 width=30) Hash Cond: ("outer".adgangsskjema_id = "inner".adgangsskjema_id) -> Seq Scan on adgangsskjema_gruppe_kobling ak (cost=0.00..11.14 rows=714width=8) -> Hash (cost=1.48..1.48 rows=48 width=30) -> Seq Scan on adgangsskjema (cost=0.00..1.48 rows=48 width=30) -> Sort (cost=11748.21..11778.64 rows=12169 width=24) Sort Key: t1.gruppe_id -> Hash Join (cost=8975.45..10922.49 rows=12169 width=24) Hash Cond: ("outer".medlem_id = "inner".medlem_id) -> HashAggregate (cost=5180.87..6093.55 rows=60845 width=24) -> Merge Join (cost=3496.19..4420.31 rows=60845 width=24) Merge Cond: ("outer".medlem_id = "inner".medlem_id) -> Sort (cost=2743.39..2749.11 rows=2290 width=12) Sort Key: ms.medlem_id -> Subquery Scan ms (cost=2483.70..2615.60 rows=2290 width=12) -> Append (cost=2483.70..2592.70 rows=2290 width=12) -> HashAggregate (cost=2483.70..2545.82 rows=2259 width=24) Filter: (max(("start" + varighet)) >= (('now'::text)::date)::timestampwithout time zone) -> Hash Join (cost=662.54..2427.49 rows=7494 width=24) Hash Cond: ("outer".oblattype_id = "inner".oblattype_id) -> Hash Join (cost=661.50..2314.03 rows=7494 width=12) Hash Cond: ("outer".kortnummer = "inner".kortnummer) -> Seq Scan on oblat (cost=0.00..632.17 rows=37817width=12) -> Hash (cost=614.81..614.81 rows=18673 width=8) -> Seq Scan on kort (cost=0.00..614.81rows=18673 width=8) Filter: ((kortstatus_id = 1) AND (medlem_idIS NOT NULL)) -> Hash (cost=1.04..1.04 rows=4 width=20) -> Seq Scan on oblattype (cost=0.00..1.04 rows=4width=20) -> Subquery Scan "*SELECT* 2" (cost=0.00..1.70 rows=31 width=4) -> Seq Scan on eksterne_kort (cost=0.00..1.39 rows=31 width=4) -> Sort (cost=752.80..766.08 rows=5314 width=16) Sort Key: v.medlem_id -> Append (cost=0.00..370.84 rows=5314 width=16) -> Seq Scan on verv (cost=0.00..316.31 rows=5283 width=16) Filter: ((stopp IS NULL) OR (stopp >= ('now'::text)::date)) -> Subquery Scan "*SELECT* 2" (cost=0.01..1.70 rows=31 width=8) -> Result (cost=0.01..1.39 rows=31 width=8) One-Time Filter: (('2030-01-01'::date IS NULL) OR ('2030-01-01'::date>= ('now'::text)::date)) -> Seq Scan on eksterne_kort (cost=0.00..1.31 rows=31 width=8) -> Hash (cost=3794.48..3794.48 rows=40 width=4) -> HashAggregate (cost=3794.08..3794.48 rows=40 width=4) -> Append (cost=3791.65..3793.58 rows=40 width=106) -> Subquery Scan t1 (cost=3791.65..3791.79 rows=9 width=106) -> Unique (cost=3791.65..3791.70 rows=9 width=60) -> Sort (cost=3791.65..3791.68 rows=9 width=60) Sort Key: medlem.medlem_id, public.kort.korttype_id -> Nested Loop (cost=2922.47..3791.51 rows=9 width=60) Join Filter: ("outer".medlem_id = "inner".medlem_id) -> Hash Join (cost=2918.46..3454.13 rows=42 width=60) Hash Cond: ("outer".medlem_id = "inner".medlem_id) -> Hash Join (cost=2574.06..3106.62 rows=538width=56) Hash Cond: ("outer".medlem_id = "inner".medlem_id) -> Seq Scan on medlem (cost=0.00..500.01rows=3623 width=52) Filter: (pinkode IS NOT NULL) -> Hash (cost=2568.41..2568.41 rows=2259width=4) -> HashAggregate (cost=2483.70..2545.82rows=2259 width=24) Filter: (max(("start" + varighet))>= (('now'::text)::date)::timestamp without time zone) -> Hash Join (cost=662.54..2427.49rows=7494 width=24) Hash Cond: ("outer".oblattype_id= "inner".oblattype_id) -> Hash Join (cost=661.50..2314.03rows=7494 width=12) Hash Cond: ("outer".kortnummer= "inner".kortnummer) -> Seq Scan onoblat (cost=0.00..632.17 rows=37817 width=12) -> Hash (cost=614.81..614.81rows=18673 width=8) -> Seq Scanon kort (cost=0.00..614.81 rows=18673 width=8) Filter:((kortstatus_id = 1) AND (medlem_id IS NOT NULL)) -> Hash (cost=1.04..1.04rows=4 width=20) -> Seq Scan onoblattype (cost=0.00..1.04 rows=4 width=20) -> Hash (cost=341.42..341.42 rows=1191 width=4) -> HashAggregate (cost=329.51..341.42rows=1191 width=4) -> Seq Scan on verv (cost=0.00..316.31rows=5283 width=4) Filter: ((stopp IS NULL) OR(stopp >= ('now'::text)::date)) -> Bitmap Heap Scan on kort (cost=4.01..8.02 rows=1width=12) Recheck Cond: ((("outer".medlem_id = kort.medlem_id)AND (kort.korttype_id = 2)) OR (("outer".medlem_id = kort.medlem_id) AND (kort.korttype_id = 3))) Filter: (kortstatus_id = 1) -> BitmapOr (cost=4.01..4.01 rows=1 width=0) -> Bitmap Index Scan on maksimalt_ett_aktivt_kort_per_medlem (cost=0.00..2.01 rows=1 width=0) Index Cond: (("outer".medlem_id =kort.medlem_id) AND (kort.korttype_id = 2)) -> Bitmap Index Scan on maksimalt_ett_aktivt_kort_per_medlem (cost=0.00..2.01 rows=1 width=0) Index Cond: (("outer".medlem_id =kort.medlem_id) AND (kort.korttype_id = 3)) -> Subquery Scan "*SELECT* 2" (cost=0.00..1.70 rows=31 width=25) -> Seq Scan on eksterne_kort (cost=0.00..1.39 rows=31 width=25) SubPlan -> Materialize (cost=546.45..742.37 rows=19592 width=38) -> Seq Scan on rita_tilgang (cost=0.00..526.86 rows=19592 width=38) -> Seq Scan on personer_nylig_slettet (cost=0.00..31.40 rows=2140 width=4) (105 rows) There's two oddities here at first sight: 1. Why does it materialize the sequential scan? What use would that have? 2. Why does it estimate four million disk page fetches in the top node? I can't find anything like that in the bottom nodes... All the obvious things are taken care of: The tables are freshly loaded, VACUUM ANALYZE just ran, sort_mem/shared_buffers/effective_cache_size is the same as on the 7.4 machine with the same amount of RAM (1GB). /* Steinar */ -- Homepage: http://www.sesse.net/
pgsql-performance by date: