Thread: Re: pgbench with partitioned tables
On Fri, Jan 31, 2025 at 6:32 AM Sergey Tatarintsev <s.tatarintsev@postgrespro.ru> wrote: > > When using manually created partitioned tables for pgbench, an error > occurred: > ERROR: cannot perform COPY FREEZE on a partitioned table. > > Currently only pgbench_accounts can be partitioned, all others must be a > regular tables. > > I wrote a patch to disable WITH (FREEZE = ON) when generating data for > non-regular tables. Thanks for the patch! I recommend including a repro when proposing such things. Here's one I made # pgbench init so all the tables are there pgbench -i -s 10 # drop pgbench_tellers and recreate it partitioned psql -f- <<EOF DROP TABLE IF EXISTS pgbench_tellers; CREATE TABLE pgbench_tellers ( tid INT NOT NULL, bid INT, tbalance INT, filler CHAR(84) ) PARTITION BY RANGE (tid); CREATE TABLE pgbench_tellers_p1 PARTITION OF pgbench_tellers FOR VALUES FROM (1) TO (50); CREATE TABLE pgbench_tellers_p2 PARTITION OF pgbench_tellers FOR VALUES FROM (50) TO (2000); EOF # run pgbench init with only the data gen step pgbench -i -I g -s 10 Personally, I have needed to disable COPY FREEZE during pgbench -i when benchmarking features related to vacuum's freezing behavior. Maybe instead of just not using COPY FREEZE on a table if it is partitioned, we could add new data generation init_steps. Perhaps one that is client-side data generation (g) but with no freezing? I'm not really sure what the letter would be (f? making it f, g, and G?). It seems they did not consider making COPY FREEZE optional when adding the feature to pgbench [1]. This differs from your patch's behavior but it might still solve your problem? - Melanie [1] https://www.postgresql.org/message-id/flat/20210308.143907.2014279678657453983.t-ishii%40gmail.com
On 2025-Jan-31, Melanie Plageman wrote: > Maybe instead of just not using COPY FREEZE on a table if it is > partitioned, we could add new data generation init_steps. Perhaps one > that is client-side data generation (g) but with no freezing? I'm not > really sure what the letter would be (f? making it f, g, and G?). I think it makes sense to do what you suggest, but on the other hand, the original code that Sergey is patching looks like a hack in the sense that it hardcodes which tables to use FREEZE with. There's no point to doing things that way actually, so accepting Sergey's patch to replace that with a relkind check feels sensible to me. I think the query should be SELECT relkind FROM pg_catalog.pg_class WHERE oid='%s'::pg_catalog.regclass if only for consistency with pgbench's other query on catalogs. Your proposal to add different init_steps might be reasonable, at least if we allowed partitionedness of tables to vary in other ways (eg. if we made pgbench_history partitioned), but I don't think it conflicts with Sergey's patch in spirit. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Here's a general engineering tip: if the non-fun part is too complex for you to figure out, that might indicate the fun part is too ambitious." (John Naylor) https://postgr.es/m/CAFBsxsG4OWHBbSDM%3DsSeXrQGOtkPiOEOuME4yD7Ce41NtaAD9g%40mail.gmail.com
I was looking at the comments [1] for why COPY FREEZE is not allowed on a parent table, and it was mainly due to having to open up partitions to check if they are able to take the optimization (table created or truncated in the current transaction ). Obviously as the number of partitions grow, it will take more to perform these checks. My suspicious is that in the cases in which partitions are used, the benefits of COPY FREEZE could outweigh the overhead of these additional checks. So while we could try to solve the COPY FREEZE issue specifically for pgbench, I wonder if we should try to do better and see if the limitation on a parent partition can be removed. I can provide a patch and some benchmark numbers unless there is something bigger I am missing about the reason this limitation exists? [1] https://github.com/postgres/postgres/blob/master/src/backend/commands/copyfrom.c#L727-L735 Regards, Sami