Re: [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first
Date
Msg-id 996401B2-0F18-4F59-B5CF-D100419C849F@decibel.org
Whole thread Raw
In response to Re: [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first
Re: [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first
List pgsql-hackers
On May 4, 2007, at 7:08 PM, Tom Lane wrote:
> What do you think the output should look like?  The first thought that
> comes to mind is to add "method=memory" (or disk or top-N) to the
> "actual" annotation:
>
> regression=# explain analyze select * from tenk1 order by fivethous  
> limit 100;
>                                                        QUERY PLAN
> ---------------------------------------------------------------------- 
> --------------------------------------------------
>  Limit  (cost=840.19..840.44 rows=100 width=244) (actual  
> time=140.511..141.604 rows=100 loops=1)
>    ->  Sort  (cost=840.19..865.19 rows=10000 width=244) (actual  
> time=140.492..140.880 rows=100 loops=1 method=top-N)
>                                                                        
>                                  ^^^^^^^^^^^^
>          Sort Key: fivethous
>          ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=10000  
> width=244) (actual time=0.074..51.849 rows=10000 loops=1)
>  Total runtime: 143.089 ms
> (5 rows)
>
> Another possibility, which could be wedged into explain.c slightly  
> more
> easily, is to append "Method: top-N" or some such to the Sort Key  
> line,
> but I'm not sure that that would look nice.

If the method is disk it would be nice to know how much spilled to  
disk. That would tell you if it would be worth increasing work_mem,  
and by how much.

On a related note, it would also be *really* nice if we kept stats on  
how many sorts or hashes had spilled to disk, perhaps along with how  
much had spilled. Right now the only way to monitor that in a  
production system is to setup a cron job to watch pgsql_tmp, which is  
far from elegant.

I know there's concern about how much we add to the stats file, but I  
don't think this needs to be on a per-relation basis; per-database  
should be fine.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)




pgsql-hackers by date:

Previous
From: "Magnus Hagander"
Date:
Subject: Re: [COMMITTERS] pgsql: Teach tuplesort.c about "top N" sorting, in which only the first
Next
From: Josh Berkus
Date:
Subject: Re: RETURN QUERY in PL/PgSQL?