Huge backends / self joins / long queries - Mailing list pgsql-general
From | Dirk Lutzebaeck |
---|---|
Subject | Huge backends / self joins / long queries |
Date | |
Msg-id | 14268.8672.901178.240810@blanc.aeccom.com Whole thread Raw |
List | pgsql-general |
Hi, I don't know if this fits into the current thread but I have also a problem with backends growing up to 100MB and more. This is on 6.5: CREATE TABLE docobj ( envelope OID NOT NULL, docspec OID, docindex INTEGER, attrid INTEGER, val_int INTEGER, val_str TEXT ); This table saves values with differnet base types for doing self joins. A typical select is the following: SELECT DISTINCT d0.envelope, d0.docindex FROM envelope e0, docobj d0, envelope e1, docobj d1, envelope e2, docobj d2, envelope e3, docobj d3 WHERE d0.docspec=734675 AND d0.envelope=e0.oid AND d0.attrid=0 AND d0.val_int='700050' AND d0.docindex=d0.docindex AND e0.oid=e0.oid AND d1.docspec=734675 AND d1.envelope=e1.oid AND d1.attrid=1 AND d1.val_str='01' AND d1.docindex=d0.docindex AND e1.oid=e0.oid AND d2.docspec=734675 AND d2.envelope=e2.oid AND d2.attrid=5 AND d2.val_str='00' AND d2.docindex=d0.docindex AND e2.oid=e0.oid AND d3.docspec=734675 AND d3.envelope=e3.oid AND d3.attrid=6 AND d3.val_str='UG' AND d3.docindex=d0.docindex AND e3.oid=e0.oid ORDER BY boxinfo.time DESC Indices are generated on all joined attributes. This particular select needs 38MB on the backend. There is not much data. Docobj has 1300 rows. This query is a bit of a fake. The real query takes some additional joins with other tables. So the backend reaches 100Mb. I have also set pg_geqo to Pool_Size 128 Effort low Generations 200 Random_Seed 830518260 Selection_Bias 1.750000 EXPLAIN on the query above gives: Unique (cost=59.40 rows=1000 width=52) -> Sort (cost=59.40 rows=1000 width=52) -> Nested Loop (cost=59.40 rows=1000 width=52) -> Nested Loop (cost=16.40 rows=1 width=48) -> Index Scan using docobj_spec_index on docobj d2 (cost=2.05 rows=1 width=8) -> Seq Scan (cost=14.35 rows=2 width=40) -> ??? (cost=14.35 rows=2 width=40) -> Nested Loop (cost=14.35 rows=2 width=40) -> Nested Loop (cost=12.30 rows=1 width=36) -> Index Scan using docobj_spec_index on docobj d3 (cost=2.05 rows=1 width=8) -> Seq Scan (cost=10.25 rows=2 width=28) -> ??? (cost=10.25 rows=2 width=28) -> Nested Loop (cost=10.25 rows=2 width=28) -> Nested Loop (cost=8.20 rows=1 width=24) -> Nested Loop (cost=6.15 rows=1 width=20) -> Index Scan using docobj_spec_index on docobj d0 (cost=2.05 rows=1 width=8) -> Seq Scan (cost=4.10 rows=2 width=12) -> ??? (cost=4.10 rows=2 width=12) -> Nested Loop (cost=4.10 rows=2 width=12) -> Index Scan using docobj_spec_index on docobj d1 (cost=2.05 rows=1 width=8) -> Index Scan using envelope_oid_index on envelope e1 (cost=2.05 rows=1000width=4) -> Index Scan using envelope_oid_index on envelope e0 (cost=2.05 rows=101 width=4) -> Index Scan using envelope_oid_index on envelope e3 (cost=2.05 rows=1000 width=4) -> Index Scan using envelope_oid_index on envelope e2 (cost=2.05 rows=1000 width=4) -> Seq Scan on boxinfo (cost=43.00 rows=1000 width=4) So, I guess it has something todo with the optimizer. What should I do? Thanks for help, Dirk
pgsql-general by date: