[PATCH] Improve EXPLAIN ANALYZE overhead by sampling - Mailing list pgsql-patches
From | Martijn van Oosterhout
---|---
Subject | [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
Date |
Msg-id | 20060509203704.GJ29652@svana.org
List | pgsql-patches

Responses:
- Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
- Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
- Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
This was a suggestion made back in March that would dramatically reduce the overhead of EXPLAIN ANALYZE on queries that loop continuously over the same nodes:

http://archives.postgresql.org/pgsql-hackers/2006-03/msg01114.php

What it does is behave normally for the first 50 tuples of any node, but after that it starts sampling at ever-increasing intervals, the intervals controlled by an exponential function. So for a node iterating over 1 million tuples it takes around 15,000 samples. The result is that EXPLAIN ANALYZE has a much smaller effect on the total execution time.

Without EXPLAIN ANALYZE:

postgres=# select count(*) from generate_series(1,1000000);
  count
---------
 1000000
(1 row)

Time: 2303.599 ms

EXPLAIN ANALYZE without patch:

postgres=# explain analyze select count(*) from generate_series(1,1000000);
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=15.00..15.01 rows=1 width=0) (actual time=8022.070..8022.073 rows=1 loops=1)
   ->  Function Scan on generate_series  (cost=0.00..12.50 rows=1000 width=0) (actual time=1381.762..4873.369 rows=1000000 loops=1)
 Total runtime: 8042.472 ms
(3 rows)

Time: 8043.401 ms

EXPLAIN ANALYZE with patch:

postgres=# explain analyze select count(*) from generate_series(1,1000000);
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=15.00..15.01 rows=1 width=0) (actual time=2469.491..2469.494 rows=1 loops=1)
   ->  Function Scan on generate_series  (cost=0.00..12.50 rows=1000 width=0) (actual time=1405.002..2187.417 rows=1000000 loops=1)
 Total runtime: 2496.529 ms
(3 rows)

Time: 2497.488 ms

As you can see, the overhead goes from 5.7 seconds to 0.2 seconds. Obviously this is an extreme case, but it will probably help in a lot of other cases people have been complaining about.
- To get this close it needs an estimate of the sampling overhead. It gets this from a little calibration loop that is run once per backend. If you don't do this, you end up assuming all tuples take the same time as the tuples that carry the overhead, resulting in nodes apparently taking longer than their parent nodes. Incidentally, I measured the overhead to be about 3.6us per tuple per node on my (admittedly slightly old) machine. Note that the resulting times still include the overhead actually incurred; I didn't filter it out. I want the times to reflect reality as closely as possible.

- I also removed InstrStopNodeMulti and made InstrStopNode take a tuple count parameter instead. This is much clearer all round.

- I didn't make it optional. I'm unsure whether it should be, given that the number of cases where it will make a difference is very small.

- The tuple counter for sampling restarts every loop. Thus a node that is called repeatedly but returns only one value each time will still measure every tuple, though its parent node won't. We'll need some field testing to see whether that remains a significant effect.

- I don't let the user know anywhere how many samples were taken. Is this something users should care about?

Any comments?

-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.