Re: Help with slow table update - Mailing list pgsql-general
From | Jim Nasby
Subject | Re: Help with slow table update
Date |
Msg-id | 552B106C.9030409@BlueTreble.com
In response to | Help with slow table update (Pawel Veselov <pawel.veselov@gmail.com>)
Responses | Re: Help with slow table update
List | pgsql-general
On 4/9/15 6:18 PM, Pawel Veselov wrote:
> Hi.
>
> I have a plpgsql procedure that updates a few similar tables.
> For some reason, updates on one of the tables take a lot longer than
> updates on the other ones. The difference is, say, 7 seconds vs. 80
> milliseconds.
>
> The procedure uses cursors and record variables to do the updates. For
> example:
>
>     update r_agrio_total set
>       unserved = unserved + (agrow->>'unserved')::numeric(38),
>       r_brkconn = mush_brk_conn(r_brkconn, q_item.r_brkconn)
>     where
>       tagid = _tagid and
>       unitid = (akey->>'unitid')::numeric and
>       placement = (akey->>'placement')::numeric and
>       device_type = (akey->>'device_type')::numeric;
>
> There is another table (xq_agr) that is read record by record, and for
> each of those records, such an update is executed.
>
> I tried to EXPLAIN ANALYZE the updates to see where the time could be
> spent. There are only 24 rows in the "bad" table, and 3,400 rows in the
> "good" table, so for the "bad" table most of the updates hit the same
> rows. The times were measured while processing 100 original records.
>
> When I analyze plain update statements, I don't see anything strange.
> > "bad" table: explain analyze update r_agrio_total set unconfirmed = > unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0 > and device_type = 100; > > RESULT: > Update on r_agrio_total (cost=0.42..4.46 rows=1 width=321) (actual > time=0.253..0.253 rows=0 loops=1) > -> Index Scan using tag_r_agrio_total on r_agrio_total > (cost=0.42..4.46 rows=1 width=321) (actual time=0.037..0.041 rows=1 > loops=1) > Index Cond: (tagid = 1000::numeric) > Filter: ((unitid = 1000::numeric) AND (placement = 0::numeric) > AND (device_type = 100::numeric)) > Rows Removed by Filter: 7 > Total runtime: 0.282 ms > > "good" table: explain analyze update r_agrio_hourly set unconfirmed = > unconfirmed +0 where tagid = 1000 and unitid = 1000 and placement = 0 > and device_type = 100 and rowdate = '2015-02-23T13'; > > RESULT: > Update on r_agrio_hourly (cost=0.42..17.36 rows=6 width=329) (actual > time=0.102..0.102 rows=0 loops=1) > -> Index Scan using u_r_agrio_hourly on r_agrio_hourly > (cost=0.42..17.36 rows=6 width=329) (actual time=0.047..0.048 rows=1 > loops=1) > Index Cond: ((tagid = 1000::numeric) AND (unitid = > 1000::numeric) AND ((rowdate)::text = '2015-02-23T13'::text) AND > (device_type = 100::numeric) AND (placement = 0::numeric)) > Total runtime: 0.135 ms > > When I try doing it with WITH statement (really, to apply the actual > data that the plpgsql function uses), there is something strange in the > "bad" table. 
> explain analyze
>
>     with SRC as (select * from xq_agr where id = 914830)
>     update r_agrio_total set
>       unconfirmed = unconfirmed + (SRC.r_agrio->>'unconfirmed')::numeric(38)
>     from SRC
>     where
>       tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
>       unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
>       placement = (SRC.r_agrio->'key'->>'placement')::numeric and
>       device_type = (SRC.r_agrio->'key'->>'device_type')::numeric;
>
> RESULT:
>
>     Update on r_agrio_total  (cost=8.91..32777.51 rows=19331 width=409) (actual time=0.107..0.107 rows=0 loops=1)
>       CTE src
>         ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44 rows=1 width=379) (actual time=0.026..0.027 rows=1 loops=1)
>               Index Cond: (id = 914830)
>       ->  Nested Loop  (cost=0.46..32769.07 rows=19331 width=409) (actual time=0.107..0.107 rows=0 loops=1)
>             ->  CTE Scan on src  (cost=0.00..0.02 rows=1 width=88) (actual time=0.032..0.033 rows=1 loops=1)
>             ->  Index Scan using u_r_agrio_total on r_agrio_total  (*cost=0.46..32285.78 rows=19331* width=321) (actual time=0.001..0.001 rows=0 loops=1)
>                   Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->> 'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text) ->> 'unit'::text))::numeric) AND (device_type = (((src.r_agrio -> 'key'::text) ->> 'device_type'::text))::numeric) AND (placement = (((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
>     Total runtime: 0.155 ms
>
> explain analyze
>
>     with SRC as (select * from xq_agr where id = 914830)
>     update r_agrio_hourly set
>       unconfirmed = unconfirmed + (SRC.r_agrio->>'unconfirmed')::numeric(38)
>     from SRC
>     where
>       tagid = (SRC.r_agrio->'key'->>'tagid')::numeric and
>       unitid = (SRC.r_agrio->'key'->>'unit')::numeric and
>       placement = (SRC.r_agrio->'key'->>'placement')::numeric and
>       device_type = (SRC.r_agrio->'key'->>'device_type')::numeric and
>       rowdate = (SRC.r_agrio->'key'->>'rowdate');
>
> RESULT:
>
>     Update on r_agrio_hourly  (cost=8.91..52.91 rows=20 width=417) (actual time=0.123..0.123 rows=0 loops=1)
>       CTE src
>         ->  Index Scan using xq_agr_pkey on xq_agr  (cost=0.42..8.44 rows=1 width=379) (actual time=0.023..0.024 rows=1 loops=1)
>               Index Cond: (id = 914830)
>       ->  Nested Loop  (cost=0.47..44.47 rows=20 width=417) (actual time=0.121..0.121 rows=0 loops=1)
>             ->  CTE Scan on src  (cost=0.00..0.02 rows=1 width=88) (actual time=0.030..0.031 rows=1 loops=1)
>             ->  Index Scan using u_r_agrio_hourly on r_agrio_hourly  (*cost=0.47..43.95 rows=20* width=329) (actual time=0.000..0.000 rows=0 loops=1)
>                   Index Cond: ((tagid = (((src.r_agrio -> 'key'::text) ->> 'tagid'::text))::numeric) AND (unitid = (((src.r_agrio -> 'key'::text) ->> 'unit'::text))::numeric) AND ((rowdate)::text = ((src.r_agrio -> 'key'::text) ->> 'rowdate'::text)) AND (device_type = (((src.r_agrio -> 'key'::text) ->> 'device_type'::text))::numeric) AND (placement = (((src.r_agrio -> 'key'::text) ->> 'placement'::text))::numeric))
>     Total runtime: 0.176 ms
>
> I've tried doing VACUUM FULL ANALYZE and rebuilding indexes. Vacuum did
> help somewhat, but the time differential is still huge (that 7 sec vs.
> 80 ms).
>
> The table structure (some fields thrown out just to save output length):
>
>     => \d r_agrio_total
>         Column      |     Type      | Modifiers
>     ----------------+---------------+------------------------------------------------------------
>      id             | bigint        | not null default nextval('r_agrio_total_id_seq'::regclass)
>      tagid          | numeric(38,0) | not null
>      unitid         | numeric(38,0) | not null
>      device_type    | numeric(38,0) | not null
>      placement      | numeric(38,0) | not null default 0
>      unserved       | numeric(38,0) | not null default 0
>      r_brkconn      | json          |
>     Indexes:
>         "r_agrio_total_pkey" PRIMARY KEY, btree (id)
>         "u_r_agrio_total" UNIQUE, btree (tagid, unitid, device_type, placement)
>         "unit_r_agrio_total" btree (unitid)
>         "tag_r_agrio_total" btree (tagid)
>
>     => \d r_agrio_hourly
>         Column      |         Type          | Modifiers
>     ----------------+-----------------------+-------------------------------------------------------------
>      id             | bigint                | not null default nextval('r_agrio_hourly_id_seq'::regclass)
>      tagid          | numeric(38,0)         | not null
>      unitid         | numeric(38,0)         | not null
>      rowdate        | character varying(15) | not null
>      device_type    | numeric(38,0)         | not null
>      placement      | numeric(38,0)         | not null default 0
>      unserved       | numeric(38,0)         | not null default 0
>      unconfirmed    | numeric(38,0)         | not null default 0
>      r_brkconn      | json                  |
>     Indexes:
>         "r_agrio_hourly_pkey" PRIMARY KEY, btree (id)
>         "u_r_agrio_hourly" UNIQUE, btree (tagid, unitid, rowdate, device_type, placement)
>         "unit_r_agrio_hourly" btree (unitid)
>         "rowdate_r_agrio_hourly" btree (rowdate)
>         "tag_r_agrio_hourly" btree (tagid)
>
> Would appreciate any help on this, thank you!

What is the ->> operator? I'm not familiar with it.

numeric is FAR slower than int/bigint. Unless you *really* need 38 digits, use one of the other types.

Cursors tend to make things slow. Avoid them if you can.

As for your specific question, I suggest you modify the plpgsql function so that it's doing an EXPLAIN ANALYZE on the slow table.
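To illustrate the numeric-vs-bigint point above: the key columns in the `\d` output are all numeric(38,0), and could be narrowed in place. A sketch, untested, assuming no stored value exceeds the bigint range:

```sql
-- Sketch: convert the numeric(38,0) key columns of r_agrio_total to bigint.
-- Note: ALTER COLUMN ... TYPE rewrites the table and its indexes and takes
-- an exclusive lock, so this needs a maintenance window.
ALTER TABLE r_agrio_total
  ALTER COLUMN tagid       TYPE bigint,
  ALTER COLUMN unitid      TYPE bigint,
  ALTER COLUMN device_type TYPE bigint,
  ALTER COLUMN placement   TYPE bigint;
```

The same change would apply to r_agrio_hourly, and the casts in the update statements (`::numeric`) would then become `::bigint`.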
EXPLAIN ANALYZE actually returns a recordset the same way a SELECT would, with a single column of type text, so you just need to do something with that output. The easiest thing would be to replace this in your function:

    UPDATE slow_table SET ...

with this (untested):

    RETURN QUERY EXPLAIN ANALYZE UPDATE slow_table SET ...

and change the function so it returns SETOF text instead of whatever it returns now.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
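The SETOF text approach described above could look roughly like this. An untested sketch: the function name, signature, and UPDATE body are placeholders, not the original function; it uses RETURN QUERY EXECUTE with a dynamically built string, since that form definitely accepts a utility command like EXPLAIN:

```sql
-- Hypothetical wrapper that returns the EXPLAIN ANALYZE output of the
-- slow UPDATE instead of running it silently.
CREATE OR REPLACE FUNCTION explain_slow_update(_tagid numeric)
RETURNS SETOF text AS $$
BEGIN
    -- Each row of EXPLAIN ANALYZE output (type text) is passed back
    -- to the caller; the UPDATE itself still executes.
    RETURN QUERY EXECUTE format(
        'EXPLAIN ANALYZE UPDATE r_agrio_total
            SET unconfirmed = unconfirmed + 0
          WHERE tagid = %s', _tagid);
END;
$$ LANGUAGE plpgsql;

-- Usage: SELECT * FROM explain_slow_update(1000);
```

Running this once per processed record would show whether the per-row plans degrade over the 100 iterations.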