Re: Fast distinct not working as expected - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Fast distinct not working as expected
Date
Msg-id CAMkU=1wTdkHmNkt9euasQux8LZ=JbWBcVb+5y1=XH3cYCpPTZw@mail.gmail.com
Whole thread Raw
In response to Fast distinct not working as expected  (Franck Routier <franck.routier@axege.com>)
Responses Re: Fast distinct not working as expected
List pgsql-performance
On Thu, Apr 17, 2014 at 8:11 AM, Franck Routier <franck.routier@axege.com> wrote:
Hi,

we are using a mono-column index on a huge table to try to make a quick
'select distinct ' on the column.

This used to work fine, but... it does not anymore. We don't know what
happened.

Here are the facts:

- request:
SELECT  dwhinv___rfovsnide::varchar FROM dwhinv WHERE dwhinv___rfovsnide
> '201212_cloture' ORDER BY dwhinv___rfovsnide LIMIT 1

That is not equivalent to a distinct.  There must be more to it than that.
 

- Plan :
Limit  (cost=0.00..1.13 rows=1 width=12) (actual time=5798.915..5798.916
rows=1 loops=1)
  ->  Index Scan using vsn_idx on dwhinv  (cost=0.00..302591122.05
rows=267473826 width=12) (actual time=5798.912..5798.912 rows=1 loops=1)
        Index Cond: ((dwhinv___rfovsnide)::text > '201212_cloture'::text)
Total runtime: 5799.141 ms


My best guess would be that the index got stuffed full of entries for rows that are not visible, either because they are not yet committed, or have been deleted but are not yet vacuumable.  Do you have any long-lived transactions?
 

- postgresql Version 8.4

Newer versions have better diagnostic tools.  An explain (analyze, buffers)  would be nice, especially with track_io_timing on.

Cheers,

Jeff

pgsql-performance by date:

Previous
From: Franck Routier
Date:
Subject: Fast distinct not working as expected
Next
From: Franck Routier
Date:
Subject: Re: Fast distinct not working as expected