Thread: [GENERAL] Why so long?

[GENERAL] Why so long?

From

Steve Clark

Date:

19 April 2017, 18:24:00

Hello,

I am confused. I have a table that has an incrementing primary key id.

When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.

Shouldn't postgresql be able to quickly find the minimum id value in the index?

pmacct=# explain select max(id) from netflow;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
   InitPlan 1 (returns $0)
     -> Limit (cost=0.71..1.13 rows=1 width=8)
           -> Index Only Scan Backward using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
                 Index Cond: (id IS NOT NULL)
(5 rows)

pmacct=# explain select min(id) from netflow;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
   InitPlan 1 (returns $0)
     -> Limit (cost=0.71..1.13 rows=1 width=8)
           -> Index Only Scan using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
                 Index Cond: (id IS NOT NULL)
(5 rows)

pmacct=# \timing
Timing is on.
pmacct=# select max(id) from netflow;
     max
-------------
17547256873
(1 row)

Time: 0.626 ms

Re: [GENERAL] Why so long?

From

Steve Clark

Date:

19 April 2017, 18:32:38

Should add this is version 9.4.10 of postgresql

On 04/19/2017 11:24 AM, Steve Clark wrote:

Hello,

I am confused. I have a table that has an incrementing primary key id.

When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.

Shouldn't postgresql be able to quickly find the minimum id value in the index?

pmacct=# explain select max(id) from netflow;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
   InitPlan 1 (returns $0)
     -> Limit (cost=0.71..1.13 rows=1 width=8)
           -> Index Only Scan Backward using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
                 Index Cond: (id IS NOT NULL)
(5 rows)

pmacct=# explain select min(id) from netflow;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
   InitPlan 1 (returns $0)
     -> Limit (cost=0.71..1.13 rows=1 width=8)
           -> Index Only Scan using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
                 Index Cond: (id IS NOT NULL)
(5 rows)

pmacct=# \timing
Timing is on.
pmacct=# select max(id) from netflow;
     max
-------------
17547256873
(1 row)

Time: 0.626 ms

pmacct=# select min(id) from netflow;
^CCancel request sent
ERROR: canceling statement due to user request
Time: 339114.334 ms

                                               Table "public.netflow"     Column     |            Type             |                              Modifiers                              ----------------+-----------------------------+--------------------------------------------------------------------- id             | bigint                      | not null default nextval('netflow_id_seq'::regclass) agent_id       | bigint                      | bytes          | bigint                      | stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00'::timestamp without time zone stamp_updated | timestamp without time zone | packets        | integer                     | default 0 port_src       | integer                     | default 0 port_dst       | integer                     | default 0 ip_proto       | smallint                    | default 0 tos            | smallint                    | default 0 ip_src         | inet                        | not null default '0.0.0.0'::inet ip_dst         | inet                        | not null default '0.0.0.0'::inetIndexes:    "netflow_pkey" PRIMARY KEY, btree (id)
    "netflow_ts_key" btree (stamp_inserted)    "netflow_tsu_idx" btree (stamp_updated)Triggers:    netflow_import AFTER INSERT OR UPDATE ON netflow FOR EACH STATEMENT EXECUTE PROCEDURE netflow_update()

--

--
Stephen Clark
NetWolves Managed Services, LLC.
Director of Technology
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.clark@netwolves.com
http://www.netwolves.com

Re: [GENERAL] Why so long?

From

Jeff Janes

Date:

19 April 2017, 18:57:54

On Wed, Apr 19, 2017 at 8:24 AM, Steve Clark <steve.clark@netwolves.com> wrote:

Hello,

I am confused. I have a table that has an incrementing primary key id.

When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.

Shouldn't postgresql be able to quickly find the minimum id value in the index?

Not if the low end of the index is stuffed full of obsolete entries, which haven't been cleaned up because it is not being vacuumed often enough.

Do you have autovacuum on? Have you manually vacuumed the table recently?

Cheers,

Jeff

Re: [GENERAL] Why so long?

From

Steve Clark

Date:

19 April 2017, 19:20:16

On 04/19/2017 11:57 AM, Jeff Janes wrote:

On Wed, Apr 19, 2017 at 8:24 AM, Steve Clark <steve.clark@netwolves.com> wrote:
Hello,

I am confused. I have a table that has an incrementing primary key id.

When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.

Shouldn't postgresql be able to quickly find the minimum id value in the index?

Not if the low end of the index is stuffed full of obsolete entries, which haven't been cleaned up because it is not being vacuumed often enough.

Do you have autovacuum on? Have you manually vacuumed the table recently?

Cheers,

Jeff

Hi Jeff,

Autovacuum is turned on.

schemaname | relname | last_vacuum | last_autovacuum | vacuum_count | autovacuum_count ------------+-----------------------+-------------+-------------------------------+--------------+------------------ public | netflow | | 2017-04-11 01:18:53.261221-04 | 0 | 1It is a large table.
select pg_size_pretty(pg_relation_size('netflow'));
pg_size_pretty
----------------
1267 GB

select pg_size_pretty(pg_total_relation_size('netflow_pkey'));
pg_size_pretty
----------------
287 GB

Regards,
Steve

Re: [GENERAL] Why so long?

From

David Rowley

Date:

19 April 2017, 20:07:20

On 20 April 2017 at 03:24, Steve Clark <steve.clark@netwolves.com> wrote:
> pmacct=# explain select min(id) from netflow;
>                                                    QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>  Result  (cost=1.13..1.14 rows=1 width=0)
>    InitPlan 1 (returns $0)
>      ->  Limit  (cost=0.71..1.13 rows=1 width=8)
>            ->  Index Only Scan using netflow_pkey on netflow
> (cost=0.71..3799108784.10 rows=9123246080 width=8)
>                  Index Cond: (id IS NOT NULL)
> (5 rows)
>
> pmacct=# \timing
> Timing is on.
> pmacct=# select max(id) from netflow;
>      max
> -------------
>  17547256873
> (1 row)
>
> Time: 0.626 ms
>
>
>
> pmacct=# select min(id) from netflow;
> ^CCancel request sent
> ERROR:  canceling statement due to user request
> Time: 339114.334 ms

It may help the diagnosis if you run an EXPLAIN (ANALYZE, BUFFERS)
version of this query to completion.

The index pages could simply be cold and coming from disk on a very
much I/O starved system.

More recently added pages are more likely to be cached.

You may also want to consider running the EXPLAIN (ANALYZE, BUFFERS)
after having SET track_io_timing = on;

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [GENERAL] Why so long?

From

Rafia Sabih

Date:

20 April 2017, 12:28:18

On Wed, Apr 19, 2017 at 8:54 PM, Steve Clark <steve.clark@netwolves.com> wrote:

Hello,

I am confused. I have a table that has an incrementing primary key id.

When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.

Shouldn't postgresql be able to quickly find the minimum id value in the index?

pmacct=# explain select max(id) from netflow;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
   InitPlan 1 (returns $0)
     -> Limit (cost=0.71..1.13 rows=1 width=8)
           -> Index Only Scan Backward using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
                 Index Cond: (id IS NOT NULL)
(5 rows)

pmacct=# explain select min(id) from netflow;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
   InitPlan 1 (returns $0)
     -> Limit (cost=0.71..1.13 rows=1 width=8)
           -> Index Only Scan using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
                 Index Cond: (id IS NOT NULL)
(5 rows)

pmacct=# \timing
Timing is on.
pmacct=# select max(id) from netflow;
     max
-------------
17547256873
(1 row)

Time: 0.626 ms

pmacct=# select min(id) from netflow;
^CCancel request sent
ERROR: canceling statement due to user request
Time: 339114.334 ms

                                               Table "public.netflow"     Column     |            Type             |                              Modifiers                              ----------------+-----------------------------+--------------------------------------------------------------------- id             | bigint                      | not null default nextval('netflow_id_seq'::regclass) agent_id       | bigint                      | bytes          | bigint                      | stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00'::timestamp without time zone stamp_updated | timestamp without time zone | packets        | integer                     | default 0 port_src       | integer                     | default 0 port_dst       | integer                     | default 0 ip_proto       | smallint                    | default 0 tos            | smallint                    | default 0 ip_src         | inet                        | not null default '0.0.0.0'::inet ip_dst         | inet                        | not null default '0.0.0.0'::inetIndexes:    "netflow_pkey" PRIMARY KEY, btree (id)
    "netflow_ts_key" btree (stamp_inserted)    "netflow_tsu_idx" btree (stamp_updated)Triggers:    netflow_import AFTER INSERT OR UPDATE ON netflow FOR EACH STATEMENT EXECUTE PROCEDURE netflow_update()

Is this some one-time event or you could reproduce it consistently?

Regards,

Rafia Sabih

EnterpriseDB: http://www.enterprisedb.com/