Re: Assorted improvements in pg_dump - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Assorted improvements in pg_dump |
Date | |
Msg-id | 20211025193024.ssesntsq25muk7py@alap3.anarazel.de Whole thread Raw |
In response to | Re: Assorted improvements in pg_dump (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Assorted improvements in pg_dump
|
List | pgsql-hackers |
Hi, On 2021-10-24 17:10:55 -0400, Tom Lane wrote: > 0004 is not committable as-is, because it assumes that the source > server has single-array unnest(), which is not true before 8.4. > We could fix that by using "oid = ANY(array-constant)" conditions > instead, but I'm unsure about the performance properties of that > for large OID arrays on those old server versions. It doesn't seem bad at all. 8.3 assert: CREATE TABLE foo(oid oid primary key); INSERT INTO foo SELECT generate_series(1, 1000000); postgres[1164129][1]=# explain ANALYZE SELECT count(*) FROM foo WHERE oid = ANY(ARRAY(SELECT generate_series(1100000, 1,-1))); ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ Aggregate (cost=81.54..81.55 rows=1 width=0) (actual time=2433.656..2433.656 rows=1 loops=1) │ │ InitPlan │ │ -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.004..149.425 rows=1100000 loops=1) │ │ -> Bitmap Heap Scan on foo (cost=42.70..81.50 rows=10 width=0) (actual time=2275.137..2369.478 rows=1000000 loops=1) │ │ Recheck Cond: (oid = ANY (($0)::oid[])) │ │ -> Bitmap Index Scan on foo_pkey (cost=0.00..42.69 rows=10 width=0) (actual time=2274.077..2274.077 rows=1000000loops=1) │ │ Index Cond: (oid = ANY (($0)::oid[])) │ │ Total runtime: 2436.201 ms │ └─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (8 rows) Time: 2437.568 ms (00:02.438) > Lastly, 0006 implements the other idea we'd discussed in the other > thread: for queries that are issued repetitively but not within a > single pg_dump function invocation, use PREPARE/EXECUTE to cut down > the overhead. This gets only diminishing returns after 0004, but > it still brings "pg_dump -s regression" down from 0.39s to 0.33s, > so maybe it's worth doing. I think it's worth doing. There's things that the batch approach won't help with and even if it doesn't help a lot with the regression test database, I'd expect it to help plenty with other cases. A test database I had around with lots of functions got drastically faster to dump (7.4s to 2.5s), even though the number of queries didn't change significantly. According to pg_stat_statements plan_time for the dumpFunc query went from 2352ms to 0.4ms - interestingly execution time nearly halves as well. > I stopped after caching the plans for > functions/aggregates/operators/types, though. The remaining sorts > of objects aren't likely to appear in typical databases enough times > to be worth worrying over. (This patch will be a net loss if there > are more than zero but less than perhaps 10 instances of an object type, > so there's definitely reasons beyond laziness for not doing more.) Seems reasonable. > @@ -7340,25 +7340,37 @@ getDomainConstraints(Archive *fout, TypeInfo *tyinfo) > i_consrc; > int ntups; > > - query = createPQExpBuffer(); > + static bool query_prepared = false; > > - if (fout->remoteVersion >= 90100) > - appendPQExpBuffer(query, "SELECT tableoid, oid, conname, " > - "pg_catalog.pg_get_constraintdef(oid) AS consrc, " > - "convalidated " > - "FROM pg_catalog.pg_constraint " > - "WHERE contypid = '%u'::pg_catalog.oid " > - "ORDER BY conname", > - tyinfo->dobj.catId.oid); > + if (!query_prepared) > + { I wonder if it'd be better to store this in Archive or such. The approach with static variables might run into problems with parallel pg_dump at some point. These objects aren't dumped in parallel yet, but still... Greetings, Andres Freund
pgsql-hackers by date: