Thread: [GENERAL] Why so long?
Hello,
I am confused. I have a table that has an incrementing primary key id.
When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.
Shouldn't postgresql be able to quickly find the minimum id value in the index?
pmacct=# explain select max(id) from netflow;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.71..1.13 rows=1 width=8)
-> Index Only Scan Backward using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
Index Cond: (id IS NOT NULL)
(5 rows)
pmacct=# explain select min(id) from netflow;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.71..1.13 rows=1 width=8)
-> Index Only Scan using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
Index Cond: (id IS NOT NULL)
(5 rows)
pmacct=# \timing
Timing is on.
pmacct=# select max(id) from netflow;
max
-------------
17547256873
(1 row)
Time: 0.626 ms
pmacct=# select min(id) from netflow;
^CCancel request sent
ERROR: canceling statement due to user request
Time: 339114.334 ms
Table "public.netflow"
Column | Type | Modifiers
----------------+-----------------------------+---------------------------------------------------------------------
id | bigint | not null default nextval('netflow_id_seq'::regclass)
agent_id | bigint |
bytes | bigint |
stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00'::timestamp without time zone
stamp_updated | timestamp without time zone |
packets | integer | default 0
port_src | integer | default 0
port_dst | integer | default 0
ip_proto | smallint | default 0
tos | smallint | default 0
ip_src | inet | not null default '0.0.0.0'::inet
ip_dst | inet | not null default '0.0.0.0'::inet
Indexes:
"netflow_pkey" PRIMARY KEY, btree (id)
"netflow_ts_key" btree (stamp_inserted)
"netflow_tsu_idx" btree (stamp_updated)
Triggers:
netflow_import AFTER INSERT OR UPDATE ON netflow FOR EACH STATEMENT EXECUTE PROCEDURE netflow_update()
--
I am confused. I have a table that has an incrementing primary key id.
When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.
Shouldn't postgresql be able to quickly find the minimum id value in the index?
pmacct=# explain select max(id) from netflow;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.71..1.13 rows=1 width=8)
-> Index Only Scan Backward using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
Index Cond: (id IS NOT NULL)
(5 rows)
pmacct=# explain select min(id) from netflow;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.71..1.13 rows=1 width=8)
-> Index Only Scan using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
Index Cond: (id IS NOT NULL)
(5 rows)
pmacct=# \timing
Timing is on.
pmacct=# select max(id) from netflow;
max
-------------
17547256873
(1 row)
Time: 0.626 ms
pmacct=# select min(id) from netflow;
^CCancel request sent
ERROR: canceling statement due to user request
Time: 339114.334 ms
Table "public.netflow"
Column | Type | Modifiers
----------------+-----------------------------+---------------------------------------------------------------------
id | bigint | not null default nextval('netflow_id_seq'::regclass)
agent_id | bigint |
bytes | bigint |
stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00'::timestamp without time zone
stamp_updated | timestamp without time zone |
packets | integer | default 0
port_src | integer | default 0
port_dst | integer | default 0
ip_proto | smallint | default 0
tos | smallint | default 0
ip_src | inet | not null default '0.0.0.0'::inet
ip_dst | inet | not null default '0.0.0.0'::inet
Indexes:
"netflow_pkey" PRIMARY KEY, btree (id)
"netflow_ts_key" btree (stamp_inserted)
"netflow_tsu_idx" btree (stamp_updated)
Triggers:
netflow_import AFTER INSERT OR UPDATE ON netflow FOR EACH STATEMENT EXECUTE PROCEDURE netflow_update()
--
Should add this is version 9.4.10 of postgresql
On 04/19/2017 11:24 AM, Steve Clark wrote:
On 04/19/2017 11:24 AM, Steve Clark wrote:
Hello,
I am confused. I have a table that has an incrementing primary key id.
When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.
Shouldn't postgresql be able to quickly find the minimum id value in the index?
pmacct=# explain select max(id) from netflow;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.71..1.13 rows=1 width=8)
-> Index Only Scan Backward using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
Index Cond: (id IS NOT NULL)
(5 rows)
pmacct=# explain select min(id) from netflow;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Result (cost=1.13..1.14 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.71..1.13 rows=1 width=8)
-> Index Only Scan using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
Index Cond: (id IS NOT NULL)
(5 rows)
pmacct=# \timing
Timing is on.
pmacct=# select max(id) from netflow;
max
-------------
17547256873
(1 row)
Time: 0.626 ms
pmacct=# select min(id) from netflow;
^CCancel request sent
ERROR: canceling statement due to user request
Time: 339114.334 ms
Table "public.netflow"
Column | Type | Modifiers
----------------+-----------------------------+---------------------------------------------------------------------
id | bigint | not null default nextval('netflow_id_seq'::regclass)
agent_id | bigint |
bytes | bigint |
stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00'::timestamp without time zone
stamp_updated | timestamp without time zone |
packets | integer | default 0
port_src | integer | default 0
port_dst | integer | default 0
ip_proto | smallint | default 0
tos | smallint | default 0
ip_src | inet | not null default '0.0.0.0'::inet
ip_dst | inet | not null default '0.0.0.0'::inet
Indexes:
"netflow_pkey" PRIMARY KEY, btree (id)
"netflow_ts_key" btree (stamp_inserted)
"netflow_tsu_idx" btree (stamp_updated)
Triggers:
netflow_import AFTER INSERT OR UPDATE ON netflow FOR EACH STATEMENT EXECUTE PROCEDURE netflow_update()
--
--
Stephen Clark
NetWolves Managed Services, LLC.
Director of Technology
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.clark@netwolves.com
http://www.netwolves.com
Stephen Clark
NetWolves Managed Services, LLC.
Director of Technology
Phone: 813-579-3200
Fax: 813-882-0209
Email: steve.clark@netwolves.com
http://www.netwolves.com
On Wed, Apr 19, 2017 at 8:24 AM, Steve Clark <steve.clark@netwolves.com> wrote:
Hello,
I am confused. I have a table that has an incrementing primary key id.
When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.
Shouldn't postgresql be able to quickly find the minimum id value in the index?
Not if the low end of the index is stuffed full of obsolete entries, which haven't been cleaned up because it is not being vacuumed often enough.
Do you have autovacuum on? Have you manually vacuumed the table recently?
Cheers,
Jeff
On 04/19/2017 11:57 AM, Jeff Janes wrote:
Hi Jeff,On Wed, Apr 19, 2017 at 8:24 AM, Steve Clark <steve.clark@netwolves.com> wrote:Hello,
I am confused. I have a table that has an incrementing primary key id.
When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.
Shouldn't postgresql be able to quickly find the minimum id value in the index?Not if the low end of the index is stuffed full of obsolete entries, which haven't been cleaned up because it is not being vacuumed often enough.Do you have autovacuum on? Have you manually vacuumed the table recently?Cheers,Jeff
Autovacuum is turned on.
schemaname | relname | last_vacuum | last_autovacuum | vacuum_count | autovacuum_count
------------+-----------------------+-------------+-------------------------------+--------------+------------------
public | netflow | | 2017-04-11 01:18:53.261221-04 | 0 | 1
It is a large table.
select pg_size_pretty(pg_relation_size('netflow'));
pg_size_pretty
----------------
1267 GB
select pg_size_pretty(pg_total_relation_size('netflow_pkey'));
pg_size_pretty
----------------
287 GB
Regards,
Steve
Steve
On 20 April 2017 at 03:24, Steve Clark <steve.clark@netwolves.com> wrote: > pmacct=# explain select min(id) from netflow; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------- > Result (cost=1.13..1.14 rows=1 width=0) > InitPlan 1 (returns $0) > -> Limit (cost=0.71..1.13 rows=1 width=8) > -> Index Only Scan using netflow_pkey on netflow > (cost=0.71..3799108784.10 rows=9123246080 width=8) > Index Cond: (id IS NOT NULL) > (5 rows) > > pmacct=# \timing > Timing is on. > pmacct=# select max(id) from netflow; > max > ------------- > 17547256873 > (1 row) > > Time: 0.626 ms > > > > pmacct=# select min(id) from netflow; > ^CCancel request sent > ERROR: canceling statement due to user request > Time: 339114.334 ms It may help the diagnosis if you run an EXPLAIN (ANALYZE, BUFFERS) version of this query to completion. The index pages could simply be cold and coming from disk on a very much I/O starved system. More recently added pages are more likely to be cached. You may also want to consider running the EXPLAIN (ANALYZE, BUFFERS) after having SET track_io_timing = on; -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Apr 19, 2017 at 8:54 PM, Steve Clark <steve.clark@netwolves.com> wrote:
Hello,
I am confused. I have a table that has an incrementing primary key id.
When I select max(id) from table is returns almost instantly but
when I select min(id) from table it takes longer than I want to wait.
Shouldn't postgresql be able to quickly find the minimum id value in the index?
pmacct=# explain select max(id) from netflow;
QUERY PLAN
------------------------------------------------------------ ------------------------------ ------------------------------ --
Result (cost=1.13..1.14 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.71..1.13 rows=1 width=8)
-> Index Only Scan Backward using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
Index Cond: (id IS NOT NULL)
(5 rows)
pmacct=# explain select min(id) from netflow;
QUERY PLAN
------------------------------------------------------------ ------------------------------ -----------------------
Result (cost=1.13..1.14 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.71..1.13 rows=1 width=8)
-> Index Only Scan using netflow_pkey on netflow (cost=0.71..3799108784.10 rows=9123246080 width=8)
Index Cond: (id IS NOT NULL)
(5 rows)
pmacct=# \timing
Timing is on.
pmacct=# select max(id) from netflow;
max
-------------
17547256873
(1 row)
Time: 0.626 ms
pmacct=# select min(id) from netflow;
^CCancel request sent
ERROR: canceling statement due to user request
Time: 339114.334 ms
Table "public.netflow"
Column | Type | Modifiers
----------------+-----------------------------+------------- ------------------------------ --------------------------
id | bigint | not null default nextval('netflow_id_seq'::regclass)
agent_id | bigint |
bytes | bigint |
stamp_inserted | timestamp without time zone | not null default '0001-01-01 00:00:00'::timestamp without time zone
stamp_updated | timestamp without time zone |
packets | integer | default 0
port_src | integer | default 0
port_dst | integer | default 0
ip_proto | smallint | default 0
tos | smallint | default 0
ip_src | inet | not null default '0.0.0.0'::inet
ip_dst | inet | not null default '0.0.0.0'::inet
Indexes:
"netflow_pkey" PRIMARY KEY, btree (id)
"netflow_ts_key" btree (stamp_inserted)
"netflow_tsu_idx" btree (stamp_updated)
Triggers:
netflow_import AFTER INSERT OR UPDATE ON netflow FOR EACH STATEMENT EXECUTE PROCEDURE netflow_update()
Is this some one-time event or you could reproduce it consistently?
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/