Thread: partitioning / rules - strange behavior
Hi, we're running PostgreSQL 8.1.4 and I've encountered some strange problems with partitioning / rules, and I really don't know how to fix it. We have a table 'sessions' partitioned along the 'id' column which is a primary key. Each partition holds at most 500.000 rows, i.e. partition 'sessions_0' holds rows with ids 0 - 499.999, 'sessions_500000' holds rows with ids 500.000 - 999.999 and so on. We're currently using partition 'sessions_8500000'. I think the structure of the table is not important here, but there's nothing special about it - several columns, some foreign keys etc. There are no triggers on it or on the partitions. Inserts are redirected into the correct partition by RULES on the 'sessions' table, i.e. CREATE OR REPLACE RULE insert_8500000 AS ON INSERT TO sessions WHERE (id BETWEEN 8500000 AND 8999999) DO INSTEAD INSERT INTO sessions_8500000 (... columns here ...) VALUES (... NEW.columns ...); The problem is that once I create a rule for the next partition (it already exists), it simply stops working. It seems AS if the INSERT is succesfully redirected into the correct partition, but no data are inserted. So for example when I create a rule CREATE OR REPLACE RULE insert_9000000 AS ON INSERT TO sessions WHERE (id BETWEEN 9000000 AND 9499999) DO INSTEAD INSERT INTO sessions_9000000 (... columns here ...) VALUES (... NEW.columns ...); the rule insert_8500000 does not work anymore. When I insert a row with id between 8500000 and 8999999 it prints 'INSERT 0 0' as usual but when I try to fetch it no rows are found. That is db=> INSERT INTO sessions(... non-pk columns ...) VALUES (... data ...); INSERT 0 0 db=> SELECT * FROM sessions WHERE id = currval('sessions_id_seq'); (0 rows) which seems really strange to me. We're using sequence to generate the ids, but that shouldn't be a problem (at least it was not till today). All these problems started when one of our stored procedures used for a maintenance crashed. This procedure checks the currently used partition, checks if the 'next one' exists and creates the partition & rules / drop the old rules if needed. So there are some dynamic SQL, basically EXECUTE 'CREATE TABLE ...'; EXECUTE 'CREATE RULE ...'; EXECUTE 'CREATE INDEX ...'; EXECUTE 'DROP RULE ...'; I did some changes to this procedure recently and a stupid mistype in one of the SQL commands caused a runtime EXCEPTION yesterday. It created the next partition (sessions_9000000), and then failed because of the mistype, and this is the moment the problems described above began. This is really strange, as I thought all the procedures are running as a transaction, so a failure shouldn't cause such problems. I've fixed the mistype and rerun the procedure (succesfully) but it didn't help. I've tried to recreate the rules (drop / create) manually, but still no change. But when I drop the new rule (insert_9000000) it works fine, and once I create it again it stops working. What I suspect is the failure in the stored procedure did something wrong to the catalogs but maybe I'm completely wrong. Anyway I really don't know how to fix it. thanks for all advices Tomas
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 > db=> INSERT INTO sessions(... non-pk columns ...) VALUES (... data ...); > INSERT 0 0 > db=> SELECT * FROM sessions WHERE id = currval('sessions_id_seq'); > (0 rows) > > which seems really strange to me. We're using sequence to generate the > ids, but that shouldn't be a problem (at least it was not till today). Looks to me as though you have forgottent the table inheritance part of the table partitioning trick. Are you sure that sessions_900000 inherit from the sessions table? Does a direct count from the table referenced by the rule return a non-zero count? In other words: SELECT count(*) FROM sessions_900000 where id = currval('sessions_id_seq'); (or some other similar table) may show your "missing" rows. The other possibility is that the you are not using sessions_id_seq in the way you think you are. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200702041952 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iD8DBQFFxoDWvJuQZxSWSsgRAxqLAKDNedJ2pPg9otabsrinS1002SL0XQCeMwyF 8dqMxeB/x4XoZoj2WATZyPc= =mvMI -----END PGP SIGNATURE-----
> Looks to me as though you have forgottent the table inheritance part of the > table partitioning trick. Are you sure that sessions_900000 inherit from > the sessions table? Does a direct count from the table referenced by the > rule return a non-zero count? In other words: > > SELECT count(*) FROM sessions_900000 where id = currval('sessions_id_seq'); No, the tables are inherited, i.e. there's something like CREATE TABLE sessions_8000000 ( CHECK (id BETWEEN 8000000 AND 8499999 ) INHERITS (sessions); I've checked this for all the partitions (we're using three right now) as well as the direct count (returns nothing). As I already said - it works until I create a rule for the next partition (not used yet) - then is suddenly stops working. > (or some other similar table) may show your "missing" rows. The other > possibility is that the you are not using sessions_id_seq in the way > you think you are. What do you mean? I do a 'SELECT currval(...)' after the insert and it gives me the right value. It works until I create the next RULE, so this does not seem as a sequence problem. Tomas
Tomas Vondra <tv@fuzzy.cz> writes: > As I already said - it works until I create a rule for the next > partition (not used yet) - then is suddenly stops working. Suddenly stops working, or continues doing exactly what it did before? I'm wondering if you are relying on a cached plan that doesn't include the new rule. regards, tom lane
> Suddenly stops working, or continues doing exactly what it did before? > I'm wondering if you are relying on a cached plan that doesn't include > the new rule. > > regards, tom lane > If there´s only the insert_8500000 RULE then everything works as expected - the insert prints "INSERT 0 0", the row is inserted into the correct partition which is sessions_8500000 - I can fetch it using either SELECT * FROM sessions WHERE id = currval('sessions_id_seq'); or direcly by SELECT * FROM sessions_8500000 WHERE id = currval('sessions_id_seq'); When I create the next next rule (insert_9000000 for ids between 9000000 and 9499999) it stops working - it prints "INSERT 0 0" just as before, everything seems fine, but the row disappears - it's not available . I'm not sure about the query plans, but I think I've checked that and everything seemed ok - all the partitions were used as far as I remember. But this shouldn't be a problem as we have not reached the 9000000 limit yet (so the new partition is not used at all). And we've tried to restart the PostgreSQL as the last hope, yesterday, so there really should be no old plans. I don't have an access to the production database (I have not been able to simulate this on the development/testing system) - I'll play with that at night (european time). I'll try to drop / recreate the partition (I've tried to recreate only the RULEs, not the partitions). Tomas
> If there´s only the insert_8500000 RULE then everything works as expected - the > insert prints "INSERT 0 0", the row is inserted into the correct partition > which is sessions_8500000 - I can fetch it using either > > SELECT * FROM sessions WHERE id = currval('sessions_id_seq'); > > or direcly by > > SELECT * FROM sessions_8500000 WHERE id = currval('sessions_id_seq'); > > When I create the next next rule (insert_9000000 for ids between 9000000 and > 9499999) it stops working - it prints "INSERT 0 0" just as before, everything > seems fine, but the row disappears - it's not available . > > I'm not sure about the query plans, but I think I've checked that and everything > seemed ok - all the partitions were used as far as I remember. But this > shouldn't be a problem as we have not reached the 9000000 limit yet (so the new > partition is not used at all). And we've tried to restart the PostgreSQL as the > last hope, yesterday, so there really should be no old plans. > > I don't have an access to the production database (I have not been able to > simulate this on the development/testing system) - I'll play with that at night > (european time). I'll try to drop / recreate the partition (I've tried to > recreate only the RULEs, not the partitions). > > Tomas OK, I did some tests a while ago and the bad news is I still was not able to fix it. The table structure is this ======================================================================= db=> \d sessions Table "public.sessions" Column | Type | Modifiers ------------------+-----------------------------+--------------------- id | integer | not null default nextval('sessions_id_seq'::regclass) browser_id | integer | os_id | integer | arch_id | integer | language_id | character(2) | country_id | character(2) | visitor_id | integer | not null ip | inet | not null ip_forward | inet | session_date | timestamp without time zone | not null default now() user_agent | character varying(255) | screen_width | smallint | screen_height | smallint | screen_bit_depth | smallint | javascript | boolean | default false browser_minor | character varying(16) | browser_major | character varying(16) | referer | text | last_action | integer | not null default 0 Indexes: "sessions_pkey" PRIMARY KEY, btree (id) Check constraints: ... some foreign keys, not important here ... Rules: insert_8500000 AS ON INSERT TO sessions WHERE new.id >= 8500000 AND new.id <= 8999999 DO INSTEAD INSERT INTO sessions_8500000 (id, browser_id, os_id, arch_id, language_id, country_id, visitor_id, ip, ip_forward, session_date, user_agent, screen_width, screen_height, screen_bit_depth, javascript, browser_minor, browser_major, referer, last_action) VALUES (new.id, new.browser_id, new.os_id, new.arch_id, new.language_id, new.country_id, new.visitor_id, new.ip, new.ip_forward, new.session_date, new.user_agent, new.screen_width, new.screen_height, new.screen_bit_depth, new.javascript, new.browser_minor, new.browser_major, new.referer, new.last_action) ======================================================================= We're using sequence to generate the sessions(id) value, but that should not be a problem - with the structure / rules everything works fine (the current value in sessions_id_seq is about 8700000 so the values are inserted into the sessions_8500000 partition). The I create the 'next partition' for values between 9000000 and 9499999 using ======================================================================= CREATE TABLE sessions_9000000 ( CHECK (id BETWEEN 9000000 AND 9499999), PRIMARY KEY (id) ) INHERITS (sessions); ======================================================================= and everything still seems fine, even the execution plans reflect this new child table: ======================================================================= db=> explain select * from sessions; QUERY PLAN ---------------------------------------------------------------------- Result (cost=0.00..52262.48 rows=1052924 width=775) -> Append (cost=0.00..52262.48 rows=1052924 width=775) -> Seq Scan on sessions (cost=0.00..12.00 rows=100 width=775) -> Seq Scan on sessions_8000000 sessions (cost=0.00..23128.78 rows=500539 width=280) -> Seq Scan on sessions_8500000 sessions (cost=0.00..6147.60 rows=51230 width=775) -> Seq Scan on sessions_9000000 sessions (cost=0.00..12.00 rows=100 width=775) ======================================================================= but one I create a RULE for the new partition, thing go wrong. That is I execute this (I ommited the list of columns) ======================================================================= CREATE RULE insert_9000000 AS ON INSERT TO sessions WHERE (id BETWEEN 9000000 AND 9499999) DO INSTEAD INSERT INTO sessions_9000000 ( ... all the columns in sessions) VALUES ( ... all the columns in sessions prefixed with 'NEW' ...); ======================================================================= Now when I do for example ======================================================================= INSERT INTO sessions(id,visitor_id,ip) VALUES (8900000,0,'127.0.0.1'); ======================================================================= this new row should be inserted into the session_8500000 partition as the 8900000 is clearly between 8500000 AND 8999999. It even seems succesfully inserted (no exception, returns INSERT 0 0 as usual), but once I do SELECT * FROM sessions WHERE id = 8900000 it returns no rows. Even SELECT * FROM sessions_8500000 WHERE id = 8900000 returns no rows. Here is the execution plan for the INSERT (the execution plan for the SELECT can be found above). ======================================================================= db=> EXPLAIN ANALYZE INSERT INTO sessions(id,visitor_id,ip) VALUES (8900000,0,'127.0.0.1'); QUERY PLAN ---------------------------------------------------------------------- Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=1) One-Time Filter: ((true IS NOT TRUE) AND (false IS NOT TRUE)) Total runtime: 0.063 ms Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.012..0.013 rows=1 loops=1) ... some triggers for foreign keys on 'sessions' ... Total runtime: 0.209 ms Result (cost=0.00..0.03 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1) One-Time Filter: false Total runtime: 0.052 ms (15 rows) ==================================================================== The 'funny' thing is once I drop that new rule (insert_9000000) it starts working again. I really don't know how to solve this - today I've tried to drop / recreate the new _9000000 partitions (which are still empty) but no luck. Tomorrow I'll restore a fresh backup on a development system, and try if it 'works' in the same way. Maybe I'm missing something, but I see nothing wrong in the partitions and rules. I've taken the current backup (taken at night) and loaded that at the development system - everything works exactly as expected with exactly the same set-up. BTW we're using PostgreSQL 8.1.4 (on Linux) on both machines. Thanks in advance for all your advices how to fix this, optimally with as little downtime as possible. Tomas
On Tue, 06 Feb 2007 21:28:49 +0100, Tomas Vondra <tv@fuzzy.cz> wrote: > > We're using sequence to generate the sessions(id) value, but that should > not be a problem - with the structure / rules everything works fine (the > current value in sessions_id_seq is about 8700000 so the values are > inserted into the sessions_8500000 partition). It's likely to be a problem because of multiple evaluations of volatile expressions in the rule rewrite system....short example: CREATE TABLE sessions(id SERIAL PRIMARY KEY, value TEXT); ^ CREATE TABLE sessions_100(CHECK(id BETWEEN 1 AND 100), PRIMARY KEY(id)) INHERITS(sessions); CREATE TABLE sessions_200(CHECK(id BETWEEN 101 AND 200), PRIMARY KEY(id)) INHERITS(sessions); CREATE OR REPLACE RULE insert_100 AS ON INSERT TO sessions WHERE NEW.id BETWEEN 1 AND 100 DO INSTEAD INSERT INTO sessions_100(id, value) VALUES(NEW.id, NEW.value); CREATE OR REPLACE RULE insert_200 AS ON INSERT TO sessions WHERE NEW.id BETWEEN 101 AND 200 DO INSTEAD INSERT INTO sessions_200(id, value) VALUES(NEW.id, NEW.value); INSERT INTO sessions(value) VALUES('bernd'); SELECT * FROM sessions; id | value ----+------- 5 | bernd (1 row) but... SELECT * FROM sessions WHERE id = currval('sessions_id_seq'); id | value ----+------- (0 rows) SELECT currval('sessions_id_seq'); currval --------- 6 (1 row) [...] > > Now when I do for example > > ======================================================================= > > INSERT INTO sessions(id,visitor_id,ip) VALUES (8900000,0,'127.0.0.1'); > > ======================================================================= > > this new row should be inserted into the session_8500000 partition as > the 8900000 is clearly between 8500000 AND 8999999. It even seems > succesfully inserted (no exception, returns INSERT 0 0 as usual), but > once I do > > SELECT * FROM sessions WHERE id = 8900000 > > it returns no rows. Even > > SELECT * FROM sessions_8500000 WHERE id = 8900000 > > returns no rows. Here is the execution plan for the INSERT (the > execution plan for the SELECT can be found above). > Maybe i'm missing something, but with constant values i'm not able to reproduce this in my example above: INSERT INTO sessions VALUES(200, 'xyz'); SELECT * FROM sessions_200 WHERE id = 200; id | value -----+------- 200 | xyz (1 row) INSERT INTO sessions VALUES(87, 'xyz'); SELECT * FROM sessions_100 WHERE id = 87; id | value ----+------- 87 | xyz (1 row) Bernd
> It's likely to be a problem because of multiple evaluations of volatile expressions > in the rule rewrite system....short example: > > CREATE TABLE sessions(id SERIAL PRIMARY KEY, value TEXT); ^ > CREATE TABLE sessions_100(CHECK(id BETWEEN 1 AND 100), PRIMARY KEY(id)) > INHERITS(sessions); > CREATE TABLE sessions_200(CHECK(id BETWEEN 101 AND 200), PRIMARY KEY(id)) > INHERITS(sessions); > > CREATE OR REPLACE RULE insert_100 > AS ON INSERT TO sessions > WHERE NEW.id BETWEEN 1 AND 100 > DO INSTEAD > INSERT INTO sessions_100(id, value) VALUES(NEW.id, NEW.value); > > CREATE OR REPLACE RULE insert_200 > AS ON INSERT TO sessions > WHERE NEW.id BETWEEN 101 AND 200 > DO INSTEAD > INSERT INTO sessions_200(id, value) VALUES(NEW.id, NEW.value); > > INSERT INTO sessions(value) VALUES('bernd'); > > SELECT * FROM sessions; > id | value > ----+------- > 5 | bernd > (1 row) > > but... > > SELECT * FROM sessions WHERE id = currval('sessions_id_seq'); > id | value > ----+------- > (0 rows) > > SELECT currval('sessions_id_seq'); > currval > --------- > 6 > (1 row) Ouch! I've never noticed this behavior! When I do db=> SELECT currval('sessions_id_seq'); currval --------- 6 db=> INSERT INTO sessions(value) VALUES('bernd'); INSERT 0 0 db=> SELECT currval('sessions_id_seq'); currval --------- 12 I'll check if this is the reason why it works on the development system and not on the production. But I don't understand why the nextval('sessions_id_seq') is evaluated multiple times? Even when I do INSERT INTO sessions(id,value) VALUES(nextval('sessions_id_seq','x'); it calls sessions_id_seq several times. I'll fix it by first fetching the ID and then using it as a constant value in the INSERT, but I'd like to know the reason why it works this way. Tomas
On Wed, Feb 07, 2007 at 08:09:57PM +0100, Tomas Vondra wrote: > But I don't understand why the nextval('sessions_id_seq') is evaluated > multiple times? Even when I do RULE expand like macros, so every expression passed in is simply copied to each place, leading to multiple evaluations. If you've played with the C preprocessor: that has the same basic problem. It's rather unexpected for novices, which is why triggers are usually recommended. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
>> But I don't understand why the nextval('sessions_id_seq') is evaluated >> multiple times? Even when I do > > RULE expand like macros, so every expression passed in is simply copied > to each place, leading to multiple evaluations. > > If you've played with the C preprocessor: that has the same basic > problem. > > It's rather unexpected for novices, which is why triggers are usually > recommended. I wouldn't call myself a novice, but yes - this surprised me a little. Now it seems so natural ... I think there should be a clear explanation / example in the docs about this - I've read the documentation on RULE sytem several times and as far as I remember. Anyway thanks to those helped me to solve this problem, now it works as expected. Tomas
> I wouldn't call myself a novice, but yes - this surprised me a little. > Now it seems so natural ... I think there should be a clear explanation > / example in the docs about this - I've read the documentation on RULE > sytem several times and as far as I remember. ... and as far as I remember there's no clear statement about this. Tomas