Thread: tables with 300+ partitions
Hi all! I have a very large table. I tried to divide it into 300 partitions with 30M rows each. The problem appeared when I used the table to insert information: performance was LOW.

I did some testing. I created an empty table with 300 partitions, then inserted some rows into it, and performance was SLOW there too. SLOW = 1% of the performance of a non-partitioned table. That is too big a penalty.

Then I made a version with 10 partitions of 30M rows each and inserted rows there. Performance was the same as the non-partitioned version.

I suspect there is a locking problem. I think every SQL command takes a lock on ALL the partitions, so performance with concurrent inserts and updates is far worse than with the non-partitioned version. The performance degrades with the number of partitions, and it degrades fast: I already have trouble with 40 partitions.

Am I right? Is there a workaround? Can I replace the partitioned version with another schema? Any suggestions? I would prefer something transparent to the program, because it uses EJB3, which means deep changes and retesting for any change to the database layer.

Regards
Pablo Alcaraz
Pablo Alcaraz wrote:
> I had a big big big table. I tried to divide it in 300 partitions with
> 30M rows each one. The problem was when I used the table to insert
> information: the performance was LOW.

That's very vague. What exactly did you do? Just insert a few rows, or perhaps a large bulk load of millions of rows? What was the bottleneck, disk I/O or CPU usage? How long did the operation take, and how long did you expect it to take?

> I did some testing. I created a 300 partitioned empty table. Then, I
> inserted some rows on it and the performance was SLOW too.
>
> SLOW = 1% performance compared with a non partitioned table. That is too
> much.
>
> Then, I did a 10 partitioned table version with 30M rows each one and I
> inserted rows there. The performance was the same as the non
> partitioned table version.

That suggests that the CPU time is spent in planning the query, possibly in constraint exclusion. But that's a very different scenario from having millions of rows in each partition.

> I suspect there is a lock problem there. I think every SQL command does a
> lock on ALL the partitions so the performance with concurrent inserts and
> updates is far worse than the non partitioned version.

Every query takes an AccessShareLock on each partition, but that doesn't prevent concurrent inserts or updates, and acquiring the locks isn't very expensive. In other words: no, that's not it.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
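Heikki's planning hypothesis can be checked directly: constraint exclusion is controlled by a configuration parameter, and EXPLAIN shows whether the planner prunes the partitions. A minimal sketch, assuming the base table t partitioned on idt as posted later in this thread:

```sql
-- With constraint_exclusion enabled, the planner can use the CHECK
-- constraints on the child tables to skip partitions that cannot
-- match the WHERE clause.
SET constraint_exclusion = on;

-- If exclusion works, the plan scans only the one matching partition;
-- if it lists all 300 children, planning cost grows with partition count.
EXPLAIN SELECT * FROM t WHERE idt = 12345;
```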
On 10/30/07, Pablo Alcaraz <pabloa@laotraesquina.com.ar> wrote:
I did some testing. I created a 300 partitioned empty table. Then, I
inserted some rows on it and the perfomance was SLOW too.
Is the problem with inserting to the partitioned table or selecting from it? It sounds like inserting is the problem in which case I ask: how are you redirecting inserts to the appropriate partition? If you're using rules, then insert performance will quickly degrade with number of partitions as *every* rule needs to be evaluated for *every* row inserted to the base table. Using a trigger which you can modify according to some schedule is much faster, or better yet, use some application-level logic to insert directly to the desired partition.
Steve
Steven Flatt wrote:
The selects that need to evaluate all partitions were slow too, but I think I can wait for them. :D
I wonder if updates are slow too. I do not know.
Do I need a trigger for inserts only, or do I need triggers for updates and deletes too?
Modifying the application logic for this is not an option, because they would need to open the program, modify it and retest it, all because of an implementation problem. I prefer to solve it at the database level, because it is the database that needs this table partitioned.
Thanks for your help
Regards.
Pablo
On 10/31/07, Pablo Alcaraz <pabloa@laotraesquina.com.ar> wrote:
It was a program inserting into the base table. The program ran in 200+ threads and every thread inserted data into it. Every thread inserts a row every 3 seconds approx. (or they used to), but when I added more partitions the insert speed dropped to 1 insert every 2 minutes.
We still need to know how you're redirecting inserts to the appropriate partition. My guess is that you have rules on the base table which say "if an incoming row matches a certain criteria, then insert into partition x instead". There are several discussions on this newsgroup already about why using rules for partitioning hurts insert performance.
Next question is *how* are you partitioning the table? If you're partitioning based on some sort of log time field (i.e. today's data goes in this partition, tomorrow's data goes in that partition, etc.), then it is reasonably easy to use a single trigger (as opposed to many rules) on the base table, which you can modify on some schedule (using cron, for example). If you're partitioning based on some other id field, then using a trigger won't work as nicely.
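For the time-based case, the single schedule-modified trigger described here might look like the following sketch (the partition name is hypothetical); a cron job re-issues the CREATE OR REPLACE FUNCTION with the new partition name whenever the current partition changes:

```sql
-- Hypothetical daily partitioning: the function body names today's
-- partition statically, so every insert does one cheap redirect
-- instead of evaluating a long list of rules.
CREATE OR REPLACE FUNCTION t_insert_current() RETURNS trigger AS $$
BEGIN
    INSERT INTO t_2007_10_31 VALUES (NEW.*);  -- rewritten by cron daily
    RETURN NULL;  -- suppress the insert into the base table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_redirect_insert
    BEFORE INSERT ON t
    FOR EACH ROW EXECUTE PROCEDURE t_insert_current();
```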
Do I need to do a trigger for insert only or I need a trigger to update and delete too?
For delete, probably not. For update, depends on how you're partitioning. Could the update cause the row to belong to a different partition? Do you care about moving it at that point?
Steve
On 10/31/07, Pablo Alcaraz <pabloa@laotraesquina.com.ar> wrote:
> Steven Flatt wrote:
>> On 10/30/07, Pablo Alcaraz <pabloa@laotraesquina.com.ar> wrote:
>>> I did some testing. I created a 300 partitioned empty table. Then, I
>>> inserted some rows on it and the performance was SLOW too.
>>
>> Is the problem with inserting to the partitioned table or selecting from
>> it? It sounds like inserting is the problem, in which case I ask: how are
>> you redirecting inserts to the appropriate partition? If you're using
>> rules, then insert performance will quickly degrade with the number of
>> partitions, as *every* rule needs to be evaluated for *every* row
>> inserted to the base table. Using a trigger which you can modify
>> according to some schedule is much faster, or better yet, use some
>> application-level logic to insert directly to the desired partition.
>>
>> Steve
>
> It was a program inserting into the base table. The program ran in 200+
> threads and every thread inserted data into it. Every thread inserts a row
> every 3 seconds approx. (or they used to), but when I added more partitions
> the insert speed dropped to 1 insert every 2 minutes.
>
> The selects that need to evaluate all partitions were slow too, but I think
> I can wait for them. :D
>
> I wonder if updates are slow too. I do not know.
>
> Do I need a trigger for inserts only, or do I need triggers for updates
> and deletes too?

You need a trigger for any update / delete / insert you don't want to be really slow. Basically, if a rule is doing it now, you need a trigger to do it to speed it up. My experience has been that at 200 to 1000 partitions, the speed of the smaller tables still makes selects faster than with one big table for certain kinds of access. At some point, the cost of planning a lookup against thousands of tables will be more than the savings of looking in a really small table.
The nice thing about triggers is that you can use maths to figure out the name of the table you'll be writing to instead of a static table like most rules use. So, all you have to do is make sure the new tables get added under the parent and poof, you're ready to go, no need for a new trigger.
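Under the idt-range scheme posted later in the thread (30M-row partitions named t_00003, t_00006, ...), the "maths" mentioned here could look like the following sketch. Note that EXECUTE ... USING and format() appeared in PostgreSQL 8.4 and 9.1 respectively, so this illustrates the idea rather than the exact code available on the 8.x versions current at the time:

```sql
-- Compute the target partition name from NEW.idt instead of keeping
-- 300 static rules: idt 1..30M -> t_00003, 30M+1..60M -> t_00006, ...
CREATE OR REPLACE FUNCTION t_insert_router() RETURNS trigger AS $$
DECLARE
    suffix integer := 3 * ((NEW.idt - 1) / 30000000 + 1);
BEGIN
    EXECUTE format('INSERT INTO t_%s SELECT ($1).*',
                   to_char(suffix, 'FM00000'))
    USING NEW;
    RETURN NULL;  -- the row was routed to a child; skip the base table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_route_insert
    BEFORE INSERT ON t
    FOR EACH ROW EXECUTE PROCEDURE t_insert_router();
```

Because the partition name is computed, adding a new child table under the parent requires no change to the trigger at all, which is exactly the property described above.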
> Steven Flatt wrote:
>> On 10/30/07, Pablo Alcaraz <pabloa@laotraesquina.com.ar> wrote:
>>> I did some testing. I created a 300 partitioned empty table. Then, I
>>> inserted some rows on it and the performance was SLOW too.
>>
>> Is the problem with inserting to the partitioned table or selecting
>> from it? It sounds like inserting is the problem, in which case I
>> ask: how are you redirecting inserts to the appropriate partition?
>> If you're using rules, then insert performance will quickly degrade
>> with the number of partitions, as *every* rule needs to be evaluated for
>> *every* row inserted to the base table. Using a trigger which you
>> can modify according to some schedule is much faster, or better yet,
>> use some application-level logic to insert directly to the desired
>> partition.
>>
>> Steve
>
> It was a program inserting into the base table. The program ran in 200+
> threads and every thread inserted data into it. Every thread inserts a row
> every 3 seconds approx. (or they used to), but when I added more
> partitions the insert speed dropped to 1 insert every 2 minutes.

OK, that gives about 70 inserts per second - depending on the amount of data inserted this may or may not be manageable. What is the size of the data the threads are writing with each insert, or what is the size of the whole table (not the number of rows, but the size in MB / GB)? What is the table structure - what indices are defined on it, etc.? What kind of SELECT queries do you execute on the table / partitions? Aggregations or simple queries? Have you executed ANALYZE on all the partitions after loading the data? What are the EXPLAIN plans for the slow SELECT queries?

Anyway, 300 partitions for 200 threads seems a little too much to me. I'd use something like 10 partitions instead. What strategy have you chosen to redirect the inserts into the partitions, i.e. how do you determine which partition an insert should be written to?
Maybe I missed something, but what is the CPU and I/O load? In other words, is the system CPU bound or I/O bound?

> The selects that need to evaluate all partitions were slow too, but I
> think I can wait for them. :D
>
> I wonder if updates are slow too. I do not know.
>
> Do I need a trigger for inserts only, or do I need triggers for updates
> and deletes too?

If you have created the partitions using INHERITS, then all you need to do is redirect the inserts - either using a RULE, a BEFORE INSERT trigger, or a stored procedure. Each of these options has advantages / disadvantages:

Rules are quite easy to maintain (once you create a new partition you just need to create a new rule), but they may have serious overhead in the case of many partitions, as all the rules have to be evaluated.

Triggers are not as easy to maintain, as all the tables have to be handled in a single procedure, and adding / removing a partition means modifying that procedure. On the other side, the performance may be better in the case of many partitions.

Both of the solutions mentioned above have the advantage of transparency, i.e. the clients don't need to know about them.

Stored procedures have the advantages and disadvantages of a trigger, plus they have to be invoked explicitly by the client.

Tomas
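The stored-procedure option listed here might look like the following sketch (the function name and the use of dynamic SQL are illustrative assumptions, matching the schema posted later in this thread); it routes rows like a trigger would, but the client has to call it explicitly instead of issuing a plain INSERT, which is why it is not transparent:

```sql
-- Hypothetical insert procedure: the client runs
--   SELECT t_insert(idt, idtpadre, e, dmodi, p, m);
-- instead of INSERT INTO t ..., so no rule or trigger fires at all.
CREATE OR REPLACE FUNCTION t_insert(_idt bigint, _idtpadre bigint,
                                    _e integer, _dmodi timestamp,
                                    _p integer, _m text)
RETURNS void AS $$
BEGIN
    -- Same partition arithmetic a routing trigger would use:
    -- idt 1..30M -> t_00003, 30M+1..60M -> t_00006, ...
    EXECUTE 'INSERT INTO t_'
            || to_char(3 * ((_idt - 1) / 30000000 + 1), 'FM00000')
            || ' VALUES ($1, $2, $3, $4, $5, $6)'
    USING _idt, _idtpadre, _e, _dmodi, _p, _m;
END;
$$ LANGUAGE plpgsql;
```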
Scott Marlowe wrote:
Currently I have insert rules only, and updates work correctly. I think the UPDATEs use the constraints, because the program uses the base table everywhere.
This is the base table structure:
-- Table: t
-- DROP TABLE t;
CREATE TABLE t
(
idt bigint NOT NULL,
idtpadre bigint NOT NULL,
e integer NOT NULL,
dmodi timestamp without time zone NOT NULL DEFAULT now(),
p integer NOT NULL DEFAULT 0,
m text NOT NULL
)
WITHOUT OIDS;
ALTER TABLE t OWNER TO e;
-- Rule: "t_update_00003 ON t"
-- DROP RULE t_update_00003 ON t;
CREATE OR REPLACE RULE t_update_00003 AS
ON INSERT TO t
WHERE new.idt >= 1::bigint AND new.idt <= 30000000::bigint DO INSTEAD INSERT INTO t_00003 (idt, idtpadre, e, dmodi, p, m)
VALUES (new.idt, new.idtpadre, new.e, new.dmodi, new.p, new.m);
-- Rule: "t_update_00006 ON t"
-- DROP RULE t_update_00006 ON t;
CREATE OR REPLACE RULE t_update_00006 AS
ON INSERT TO t
WHERE new.idt >= 30000001::bigint AND new.idt <= 60000000::bigint DO INSTEAD INSERT INTO t_00006 (idt, idtpadre, e, dmodi, p, m)
VALUES (new.idt, new.idtpadre, new.e, new.dmodi, new.p, new.m);
-- Rule: "t_update_00009 ON t"
-- DROP RULE t_update_00009 ON t;
CREATE OR REPLACE RULE t_update_00009 AS
ON INSERT TO t
WHERE new.idt >= 60000001::bigint AND new.idt <= 90000000::bigint DO INSTEAD INSERT INTO t_00009 (idt, idtpadre, e, dmodi, p, m)
VALUES (new.idt, new.idtpadre, new.e, new.dmodi, new.p, new.m);
-- Rule: "t_update_00012 ON t"
-- DROP RULE t_update_00012 ON t;
CREATE OR REPLACE RULE t_update_00012 AS
ON INSERT TO t
WHERE new.idt >= 90000001::bigint AND new.idt <= 120000000::bigint DO INSTEAD INSERT INTO t_00012 (idt, idtpadre, e, dmodi, p, m)
VALUES (new.idt, new.idtpadre, new.e, new.dmodi, new.p, new.m);
etc. ... up to 300 partitions
The partitions are created like:
CREATE TABLE t_00003
(
CONSTRAINT t_00003_pkey PRIMARY KEY (idt),
CONSTRAINT t_00003_idt_check CHECK (idt >= 1::bigint AND idt <= 30000000::bigint)
) INHERITS (t)
WITHOUT OIDS;
ALTER TABLE t_00003 OWNER TO e;
CREATE INDEX t_00003_e
ON t_00003
USING btree
(e);
etc