Re: Why would this use 600Meg of VM? - Mailing list pgsql-hackers
| From | Larry Rosenman |
|---|---|
| Subject | Re: Why would this use 600Meg of VM? |
| Date | |
| Msg-id | 20010622122510.A10641@lerami.lerctr.org Whole thread Raw |
| In response to | Re: Why would this use 600Meg of VM? (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses |
Re: Why would this use 600Meg of VM?
|
| List | pgsql-hackers |
* Tom Lane <tgl@sss.pgh.pa.us> [010622 11:55]:
> Larry Rosenman <ler@lerctr.org> writes:
> > Can one of you knowledgeable people tell me why current CVS as of
> > a week ago would have the backend running this query grow to
> > 600 meg+?
>
> Sounds like there's still a memory leak in there somewhere, but the
> query looks fairly harmless. Could we see enough info to reproduce
> this? (Table declarations, explain output, etc) Another useful
> attack would be to let the query run awhile, then set a breakpoint
> at sbrk(). Stack traces from the first few hits of the breakpoint
> would give a pretty good indication of where the leak is, probably.
>
> regards, tom lane
neteng@tide.iadfw.net$ psql traffic_analysis
Welcome to psql, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms \h for help with SQL commands \? for help on internal slash commands
\g or terminate with semicolon to execute query \q to quit
traffic_analysis=# analyze traffic;
ANALYZE
traffic_analysis=# \i traffic_sum.sql
psql:traffic_sum.sql:15: NOTICE: QUERY PLAN:
Subquery Scan *SELECT* (cost=8471740.01..8994414.10 rows=1900633
width=72) -> Aggregate (cost=8471740.01..8994414.10 rows=1900633 width=72) -> Group
(cost=8471740.01..8614287.49rows=19006331
width=72) -> Sort (cost=8471740.01..8471740.01 rows=19006331
width=72) -> Seq Scan on traffic (cost=0.00..615601.86
rows=19006331 width=72)
EXPLAIN
traffic_analysis=#
neteng@tide.iadfw.net$ cat traffic_sum.sql
EXPLAIN
INSERT into traffic_summary
SELECT asn,protocol,
cast(sum(pkts_src) as float) as pkts_src,
cast(sum(pkts_dst) as float) as pkts_dst,
cast(sum(bytes_src) as float) as bytes_src,
cast(sum(bytes_dst) as float) as bytes_dst,
cast(sum(secs_src) as float) as secs_src,
cast(sum(secs_dst) as float) as secs_dst,
min(early) as early,
max(late) as late
FROM traffic
WHERE early between '2001-06-01 00:00:00'::timestamp and '2001-06-18 23:59:59'::timestamp
GROUP BY asn,protocol,date_part('epoch',early)/60/60;
neteng@tide.iadfw.net$
What else?
Failing a way to actually get this query to run, how would you suggest
aggregating the data down to 1 hour summaries?
neteng@tide.iadfw.net$ psql traffic_analysis
Welcome to psql, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms \h for help with SQL commands \? for help on internal slash commands
\g or terminate with semicolon to execute query \q to quit
traffic_analysis=# \d traffic Table "traffic"Attribute | Type | Modifier
-----------+--------------------------+----------asn | integer | protocol | integer
| pkts_src | bigint | pkts_dst | bigint | bytes_src | bigint
| bytes_dst | bigint | secs_src | bigint | secs_dst | bigint |
early | timestamp with time zone | late | timestamp with time zone |
Index: traffic_early
traffic_analysis=# \d traffic_summary Table "traffic_summary"Attribute | Type | Modifier
-----------+--------------------------+----------asn | integer | protocol | integer
| pkts_src | double precision | pkts_dst | double precision | bytes_src | double precision
| bytes_dst | double precision | secs_src | double precision | secs_dst | double precision |
early | timestamp with time zone | late | timestamp with time zone |
traffic_analysis=#
traffic_analysis=# \d traffic_early Index "traffic_early"Attribute | Type
-----------+--------------------------early | timestamp with time zone
btree
traffic_analysis=#
LER
--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
pgsql-hackers by date: