Re: how to improve perf of 131MM row table? - Mailing list pgsql-performance
From | AJ Weber |
---|---|
Subject | Re: how to improve perf of 131MM row table? |
Date | |
Msg-id | 53AC2CC3.9060403@comcast.net Whole thread Raw |
In response to | Re: how to improve perf of 131MM row table? (Shaun Thomas <sthomas@optionshouse.com>) |
Responses |
Re: how to improve perf of 131MM row table?
|
List | pgsql-performance |
I sent the details as identified by pgAdmin III. psql output shows this: \d alf_node Table "public.alf_node" Column | Type | Modifiers ----------------+------------------------+----------- id | bigint | not null version | bigint | not null store_id | bigint | not null uuid | character varying(36) | not null transaction_id | bigint | not null node_deleted | boolean | not null type_qname_id | bigint | not null locale_id | bigint | not null acl_id | bigint | audit_creator | character varying(255) | audit_created | character varying(30) | audit_modifier | character varying(255) | audit_modified | character varying(30) | audit_accessed | character varying(30) | Indexes: "alf_node_pkey" PRIMARY KEY, btree (id) CLUSTER "store_id" UNIQUE, btree (store_id, uuid) "fk_alf_node_acl" btree (acl_id) "fk_alf_node_loc" btree (locale_id) "fk_alf_node_store" btree (store_id) "fk_alf_node_tqn" btree (type_qname_id) "fk_alf_node_txn" btree (transaction_id) "idx_alf_node_del" btree (node_deleted) "idx_alf_node_txn_del" btree (transaction_id, node_deleted) Foreign-key constraints: "fk_alf_node_acl" FOREIGN KEY (acl_id) REFERENCES alf_access_control_list(id) "fk_alf_node_loc" FOREIGN KEY (locale_id) REFERENCES alf_locale(id) "fk_alf_node_store" FOREIGN KEY (store_id) REFERENCES alf_store(id) "fk_alf_node_tqn" FOREIGN KEY (type_qname_id) REFERENCES alf_qname(id) "fk_alf_node_txn" FOREIGN KEY (transaction_id) REFERENCES alf_transaction(id) Referenced by: TABLE "alf_child_assoc" CONSTRAINT "fk_alf_cass_cnode" FOREIGN KEY (child_node_id) REFERENCES alf_node(id) TABLE "alf_child_assoc" CONSTRAINT "fk_alf_cass_pnode" FOREIGN KEY (parent_node_id) REFERENCES alf_node(id) TABLE "alf_node_aspects" CONSTRAINT "fk_alf_nasp_n" FOREIGN KEY (node_id) REFERENCES alf_node(id) TABLE "alf_node_assoc" CONSTRAINT "fk_alf_nass_snode" FOREIGN KEY (source_node_id) REFERENCES alf_node(id) TABLE "alf_node_assoc" CONSTRAINT "fk_alf_nass_tnode" FOREIGN KEY (target_node_id) REFERENCES alf_node(id) TABLE "alf_node_properties" CONSTRAINT "fk_alf_nprop_n" FOREIGN KEY (node_id) REFERENCES alf_node(id) TABLE "alf_store" CONSTRAINT "fk_alf_store_root" FOREIGN KEY (root_node_id) REFERENCES alf_node(id) TABLE "alf_subscriptions" CONSTRAINT "fk_alf_sub_node" FOREIGN KEY (node_id) REFERENCES alf_node(id) ON DELETE CASCADE TABLE "alf_subscriptions" CONSTRAINT "fk_alf_sub_user" FOREIGN KEY (user_node_id) REFERENCES alf_node(id) ON DELETE CASCADE TABLE "alf_usage_delta" CONSTRAINT "fk_alf_usaged_n" FOREIGN KEY (node_id) REFERENCES alf_node(id) This line of the output: "alf_node_pkey" PRIMARY KEY, btree (id) CLUSTER would indicate to me that there is a PK on alf_node table, it is on column "id", it is of type btree, and the table is clustered around that index. Am I reading this totally wrong? The supporting table actually seems to have a multi-column PK defined, and a separate btree index on node_id as you mentioned. -AJ On 6/26/2014 10:05 AM, Shaun Thomas wrote: > On 06/26/2014 08:26 AM, AJ Weber wrote: > >> The "master table" definition is attached as "table1.sql". >> The "detail table" definition is attached as "table2.sql". > > I'm not sure what you think a primary key is, but neither of these > tables have one. Primary keys are declared one of two ways: > > CREATE TABLE foo > ( > id BIGINT PRIMARY KEY, > col1 VARCHAR, > col2 INT > ); > > Or this: > > CREATE TABLE foo > ( > id BIGINT, > col1 VARCHAR, > col2 INT > ); > > ALTER TABLE foo ADD constraint pk_foo PRIMARY KEY (id); > > On your alf_node_properties table, you only have an index on node_id > because you created one. If you look at your alf_node table, there is > no index on the id column at all. This is confirmed by the explain > output you attached: > > Seq Scan on alf_node node (cost=0.00..227265.29 rows=5733429 > width=16) (actual time=0.013..2029.649 rows=5733888 loops=1) > > Since it has no index, the database is reading the entire table to > find your matching values. Then it's using the index on node_id in the > other table to find the 'detail' matches, as seen here: > > Bitmap Index Scan on fk_alf_nprop_n (cost=0.00..1240.00 rows=52790 > width=0) (actual time=0.552..0.552 rows=1071 loops=1) > > Add an actual primary key to your alf_node table, and your query > performance should improve substantially. But I also strongly suggest > you spend some time learning how to read an EXPLAIN plan, as that > would have made your problem obvious immediately. > > Here's a link for your version: > > http://www.postgresql.org/docs/9.0/static/sql-explain.html > > You should still consider upgrading to the latest release of 9.0 too. >
pgsql-performance by date: