Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN - Mailing list pgsql-hackers
From: Masahiko Sawada
Subject: Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN
Date:
Msg-id: CAD21AoDY5SwtCTjxuLOJYXpoJ_gtTUUKwjLZGzsMEsDGpTPb+Q@mail.gmail.com
In response to: Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses: Re: Showing I/O timings spent reading/writing temp buffers in EXPLAIN
List: pgsql-hackers
On Thu, Apr 7, 2022 at 5:52 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 7, 2022 at 4:55 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > On Thu, Apr 07, 2022 at 04:24:54PM +0900, Michael Paquier wrote:
> > > On Thu, Apr 07, 2022 at 03:14:01PM +0800, Julien Rouhaud wrote:
> > > > Sure, but gettimeofday() has been implemented in vDSO for quite some time on
> > > > most platforms, so it shouldn't hurt that much on mainstream platforms,
> > > > especially compared to the cost of whatever operation is actually using that
> > > > temporary file.
> > > >
> > > > I don't think that having an extra GUC for temp IO is sensible, if that's what
> > > > you're suggesting? Or are you just asking to do some benchmarking on some
> > > > platform where getting the time is known to be slow (Windows?).
> > >
> > > I am asking about the latter, but the former could be one solution if
> > > the latter proves to be a problem, and this has not been discussed on
> > > the thread yet. So, with some kind of worst-case scenario, how much
> > > worse does the performance get once you add those extra calls, compared
> > > to HEAD? I think that we'd better be careful with any additions of
> > > INSTR_TIME_SET_CURRENT().
> >
> > I just did a quick test on my linux box, using this data:
> >
> > CREATE TABLE tt AS SELECT generate_series(1, 10000) id;
> > VACUUM ANALYZE tt;
> >
> > and this scenario:
> >
> > SET work_mem TO '64kB';
> > SELECT count(*) FROM (SELECT id FROM tt ORDER BY id) s;
> >
> > which yields this plan:
> >
> >                                                      QUERY PLAN
> > ------------------------------------------------------------------------------------------------------------------
> >  Aggregate  (cost=1349.39..1349.40 rows=1 width=8) (actual time=5.417..5.417 rows=1 loops=1)
> >    ->  Sort  (cost=1199.39..1224.39 rows=10000 width=4) (actual time=2.910..4.422 rows=10000 loops=1)
> >          Sort Key: tt.id
> >          Sort Method: external merge  Disk: 144kB
> >          ->  Seq Scan on tt  (cost=0.00..145.00 rows=10000 width=4) (actual time=0.008..1.239 rows=10000 loops=1)
> >  Planning Time: 0.405 ms
> >  Execution Time: 5.524 ms
> >
> > So maybe not the worst that could be tested, but probably bad enough for this
> > patch.
> >
> > I ran that with pgbench, 4 clients (I have 4 cores) for 30 seconds, 3 times.
> >
> > Comparing master and this patch with track_io_timing activated, I see a 0.95%
> > overhead, with a 2.6% noise level.
>
> I've done the same test with a larger data set (10M tuples) on my
> machine (macOS):
>
> HEAD: 5418.869 ms
> Patched: 5367.234 ms

Oops, the results are opposite:

HEAD: 5367.234 ms
Patched: 5418.869 ms

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
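For context on the numbers above: the cost being measured is a pair of extra clock reads wrapped around each temp-file I/O, taken only when track_io_timing is enabled. A minimal sketch of that pattern using PostgreSQL's instr_time.h macros (the function name and the accumulator below are illustrative placeholders, not the committed code):

    #include "postgres.h"
    #include "portability/instr_time.h"   /* instr_time, INSTR_TIME_* macros */
    #include "storage/buffile.h"          /* BufFile, BufFileRead() */
    #include "storage/bufmgr.h"           /* track_io_timing GUC */

    /* Illustrative accumulator; not necessarily the field name the patch uses. */
    static instr_time temp_blk_read_time;

    static size_t
    timed_temp_read(BufFile *file, void *buf, size_t len)
    {
        instr_time  io_start,
                    io_time;
        size_t      nread;

        if (track_io_timing)
            INSTR_TIME_SET_CURRENT(io_start);   /* first extra clock read */

        nread = BufFileRead(file, buf, len);    /* the actual temp-file I/O */

        if (track_io_timing)
        {
            INSTR_TIME_SET_CURRENT(io_time);    /* second extra clock read */
            /* temp_blk_read_time += io_time - io_start */
            INSTR_TIME_ACCUM_DIFF(temp_blk_read_time, io_time, io_start);
        }
        return nread;
    }

With track_io_timing off, the added per-I/O cost is just two predictable branches, which is why the comparisons above were run with track_io_timing activated. The exact pgbench invocation is not spelled out in the thread, but Julien's description matches something like "pgbench -n -f bench.sql -c 4 -T 30" with the SET/SELECT pair above saved in bench.sql.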