Optimizing the implementation of an optimized tree - Mailing list pgsql-sql
From | Christian Rishoej |
---|---|
Subject | Optimizing the implementation of an optimized tree |
Date | |
Msg-id | 1020733336.892.79.camel@penelope.dyndns.dk Whole thread Raw |
Responses |
Re: Optimizing the implementation of an optimized tree
|
List | pgsql-sql |
Hi, Recently I needed to store a hiearchy of pages on a website in Postgres in a way that would allow for fast extraction of parts of the tree for menu generation and "path" specification (which nodes in the tree make up the path from the root to my position?). This can be accomplished by letting each node in the tree have an l and r value with values determined by traversing the edge of the tree and assigning the value of integer that is incremented at each node visit to l and doing the same thing for r, this time traversing the edge of the tree backwards. The tree then becomes: CREATE TABLE tree ( id serial PRIMARY KEY, parent int REFERENCES tree, l int DEFAULT NULL, r int DEFAULT NULL, ); The parent id strictly not needed, but I chose to include it for convenience. I can easily extract a complete branch like this: SELECT treeNode.id FROM tree treeNode, tree treeParent WHERE treeNode.l BETWEEN treeParent.l AND treeParent.r AND treeParent.id = $1 ORDER BY treeNode.l And the "path" from the root to a particular node like this: SELECT treeNode.id FROM tree treeNode, tree currentNode WHERE currentNode.r BETWEEN treeNode.l AND treeNode.r AND currentNode.id = $1 ORDER BY treeNode.l; Now, in order to avoid having to maintain the values of l and r for each node when the tree is modified I created triggers in PL/PGSQL that handle INSERTs of new nodes into the tree, DELETEs of existing nodes and UPDATEs of a node's position in the tree (i.e. the parant field). The implementation is included as an attachment. I am very interested in comments on the implementation - especially hints on possible optimizations. It seems to me that the PL/PGSQL-definition of each function is parsed at each call. Is this true? Is it possible to avoid this? It is my understanding that Postgres does not yet support triggers that fire on the update of a paricular column. Therefore I have a check in the UPDATE trigger that checks if the parent-field was modified. Is there any was to do this in a nicer manner? Regards, Christian CREATE TABLE tree ( id serial PRIMARY KEY, parent int REFERENCES tree, l int DEFAULT NULL, r int DEFAULT NULL, ); -- Are these a good idea? CREATE INDEX tree_l_idx ON tree (l); CREATE INDEX tree_r_idx ON tree (r); CREATE OR REPLACE FUNCTION handle_tree_insert() RETURNS opaque AS ' DECLARE parentR INTEGER; BEGIN IF NEW.l ISNULL THEN parentR := ( SELECT r FROM tree WHERE id = NEW.parent ); UPDATE tree SET l=l+2 WHERE l >= parentR; UPDATE tree SET r=r+2 WHERE r >= parentR; NEW.l := parentR; NEW.r := parentR + 1; END IF; RETURN NEW; END; ' LANGUAGE PLPGSQL; CREATE TRIGGER on_tree_insert BEFORE INSERT ON tree FOR EACH ROW EXECUTE PROCEDURE handle_tree_insert(); CREATE OR REPLACE FUNCTION handle_tree_delete() RETURNS opaque AS ' DECLARE decr INTEGER; BEGIN -- RAISE NOTICE ''Gets here, ID %'', OLD.id; IF NOT OLD.l ISNULL THEN decr := (((OLD.r - OLD.l - 1) / 2 ) + 1) * 2; -- RAISE NOTICE ''...and executes stuff (decrementing with %)'', decr; UPDATE tree SET parent = NULL, l = NULL, r = NULL WHERE l > OLD.l AND r < OLD.r; DELETE FROM tree WHERE parent ISNULL AND l ISNULL AND r ISNULL; UPDATE tree SET l = l - decr WHERE l > OLD.l; UPDATE tree SET r = r - decr WHERE r > OLD.r; END IF; RETURN OLD; END; ' LANGUAGE PLPGSQL; CREATE TRIGGER on_tree_delete AFTER DELETE ON tree FOR EACH ROW EXECUTE PROCEDURE handle_tree_delete(); CREATE OR REPLACE FUNCTION handle_tree_update() RETURNS opaque AS ' DECLARE parentR INTEGER; span INTEGER; dist INTEGER; BEGIN IF (NOT OLD.parent ISNULL) AND (NOT NEW.parent ISNULL) AND (OLD.parent <> NEW.parent) THEN span := OLD.r - OLD.l + 1; -- Get the R value of the new parent parentR := ( SELECT r FROM tree WHERE id = NEW.parent ); -- Determine the distance dist = parentR - OLD.l; -- Make the gab UPDATE tree SET l = l + span WHERE l >= parentR; UPDATE tree SET r = r + span WHERE r >= parentR; -- Move our branch to the newly created gab -- Decrement nodes after the old position to close the old gab -- If dist is < 0 then our own branch was moved <span> steps - this must be taken into consideration IF (dist > 0) THEN UPDATE tree SET l = l + (dist), r = r + (dist) WHERE l >= OLD.l AND r <= OLD.r; UPDATE tree SET l = l - span WHERE l >= OLD.l; UPDATE tree SET r = r - span WHERE r >= OLD.r; ELSE UPDATE tree SET l = l + (dist - span), r = r + (dist - span) WHERE l >= (OLD.l + span) AND r <= (OLD.r +span); UPDATE tree SET l = l - span WHERE l >= (OLD.l + span); UPDATE tree SET r = r - span WHERE r >= (OLD.r + span); END IF; END IF; RETURN NULL; END; ' LANGUAGE PLPGSQL; CREATE TRIGGER on_tree_update AFTER UPDATE ON tree FOR EACH ROW EXECUTE PROCEDURE handle_tree_update();