Re: too complex query plan for not exists query and multicolumn indexes - Mailing list pgsql-performance

From Dave Crooke
Subject Re: too complex query plan for not exists query and multicolumn indexes
Date
Msg-id ca24673e1003191212j77cb9583i4aae6f83f9c9c16c@mail.gmail.com
Whole thread Raw
In response to too complex query plan for not exists query and multicolumn indexes  (Corin <wakathane@gmail.com>)
List pgsql-performance
K.I.S.S. here ..... the best way to do one of these in most DB's is typically an outer join and test for null:

select f1.* from friends f1
   left outer join friends f2 on (f1.user_id=f2.ref_id and f1.ref_id=f2.user_id)
   where f2.id is null;

On Fri, Mar 19, 2010 at 7:26 AM, Corin <wakathane@gmail.com> wrote:
Hi all!

While evaluting the pgsql query planer I found some weird behavior of the query planer. I think it's plan is way too complex and could much faster?

CREATE TABLE friends (
  id integer NOT NULL,
  user_id integer NOT NULL,
  ref_id integer NOT NULL,
);

ALTER TABLE ONLY friends ADD CONSTRAINT friends_pkey PRIMARY KEY (id);
CREATE INDEX user_ref ON friends USING btree (user_id, ref_id);

I fill this table with around 2.800.000 random rows (values between 1 and 500.000 for user_id, ref_id).

The intention of the query is to find rows with no "partner" row. The offset and limit are just to ignore the time needed to send the result to the client.

SELECT * FROM friends AS f1 WHERE NOT EXISTS (SELECT 1 FROM friends AS f2 WHERE f1.user_id=f2.ref_id AND f1.ref_id=f2.user_id) OFFSET 1000000 LIMIT 1

<snip>
 

pgsql-performance by date:

Previous
From: Stephen Frost
Date:
Subject: Re: too complex query plan for not exists query and multicolumn indexes
Next
From: Tom Lane
Date:
Subject: Re: PG using index+filter instead only use index