Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables |
Date | |
Msg-id | 47C7D824.5000707@enterprisedb.com Whole thread Raw |
In response to | "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables ("John Smith" <sodgodofall@gmail.com>) |
Responses |
Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables
Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables |
List | pgsql-hackers |
John Smith wrote: > Architecture: Intel Core 2 Duo > OS: linux-2.6.20-gentoo-r8 > Filesystem: ext3 > Postgres v8.2.3 compiled with gcc 4.1.1-r3 > RAM - 2GB > Shared buffers - 24MB > [All other Postgres configuration parameters are default values] > > Problem description: > COPY into temp table fails using a specific combination of > create/insert on temp tables, prepare/commit in subsequent > transactions. The "could not open relation" error occurs reliably. > > Steps to reproduce: > > Existing schema (scripts to create and populate these tables are > uploaded to http://upload2.net/page/download/rfsLfnuVlYjEcCJ/basetables.tgz.html > ): I can't get that link to work. Can you please email me the files offlist? Or upload somewhere else if they're too big for email. > Observations: > 1. The size of the data seems to matters. If the amount of data being > inserted is dropped to just one or two records per table, the error > doesn't happen. > 2. The order of columns for the select into temp2 matters. Changing > the order can cause the error to go away. > 3. If the prepare/commit is replaced with a "commit;" the error goes away. > 4. Removing "temp3" or "temp4" from the transaction causes one run of > the above statements to succeed, but if the sequence is issued in the > same PSQL session, the second one will fail. > 5. Given the current dataset, the error always occurs on line 926 of > the COPY (even if the values at line 926 are changed). > 6. <tablespace>/<database>/<oid> typically always corresponds to that > of temp2 on my system. I think I see what's happening here. We have restricted two-phase commit so that you're not supposed to be able to PREPARE TRANSACTION if the transaction has touched any temporary tables. That's because the 2nd phase commit can be performed from another backend, and another backend can't mess with another backend's temporary tables. However in this case, where you CREATE and DROP the temporary table in the same transaction, we don't detect that, and let the PREPARE TRANSACTION to finish. The detection relies on the lock manager, but we're not holding any locks on the dropped relation. I think we could in fact allow CREATE+DROP in same transaction, and remove the table immediately at PREPARE TRANSACTION, but what happens right now is that we store the relfilenode of the temp table to the two-phase state file in pg_twophase, for deletion at COMMIT/ROLLBACK PREPARED. But we don't store the fact that it's a temporary table, and therefore we try to unlink it like a normal table, and fail to purge the temp buffers of that table which causes problems later. Attached is a simple patch to fix that by disallowing CREATE+DROP+PREPARE TRANSACTION more reliably. It'd still be nice to debug the full test case of yours to verify that that's what's happening, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com Index: src/backend/access/transam/twophase.c =================================================================== RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/twophase.c,v retrieving revision 1.39 diff -c -r1.39 twophase.c *** src/backend/access/transam/twophase.c 1 Jan 2008 19:45:48 -0000 1.39 --- src/backend/access/transam/twophase.c 29 Feb 2008 09:58:19 -0000 *************** *** 793,798 **** --- 793,799 ---- TransactionId *children; RelFileNode *commitrels; RelFileNode *abortrels; + bool haveTempCommit, haveTempAbort; /* Initialize linked list */ records.head = palloc0(sizeof(XLogRecData)); *************** *** 815,824 **** hdr.prepared_at = gxact->prepared_at; hdr.owner = gxact->owner; hdr.nsubxacts = xactGetCommittedChildren(&children); ! hdr.ncommitrels = smgrGetPendingDeletes(true, &commitrels, NULL); ! hdr.nabortrels = smgrGetPendingDeletes(false, &abortrels, NULL); StrNCpy(hdr.gid, gxact->gid, GIDSIZE); save_state_data(&hdr, sizeof(TwoPhaseFileHeader)); /* Add the additional info about subxacts and deletable files */ --- 816,830 ---- hdr.prepared_at = gxact->prepared_at; hdr.owner = gxact->owner; hdr.nsubxacts = xactGetCommittedChildren(&children); ! hdr.ncommitrels = smgrGetPendingDeletes(true, &commitrels, NULL, &haveTempCommit); ! hdr.nabortrels = smgrGetPendingDeletes(false, &abortrels, NULL, &haveTempAbort); StrNCpy(hdr.gid, gxact->gid, GIDSIZE); + if (haveTempCommit || haveTempAbort) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot PREPARE a transaction that has operated on temporary tables"))); + save_state_data(&hdr, sizeof(TwoPhaseFileHeader)); /* Add the additional info about subxacts and deletable files */ Index: src/backend/access/transam/xact.c =================================================================== RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/transam/xact.c,v retrieving revision 1.257 diff -c -r1.257 xact.c *** src/backend/access/transam/xact.c 15 Jan 2008 18:56:59 -0000 1.257 --- src/backend/access/transam/xact.c 29 Feb 2008 09:48:52 -0000 *************** *** 802,808 **** TransactionId *children; /* Get data needed for commit record */ ! nrels = smgrGetPendingDeletes(true, &rels, &haveNonTemp); nchildren = xactGetCommittedChildren(&children); /* --- 802,808 ---- TransactionId *children; /* Get data needed for commit record */ ! nrels = smgrGetPendingDeletes(true, &rels, &haveNonTemp, NULL); nchildren = xactGetCommittedChildren(&children); /* *************** *** 1174,1180 **** xid); /* Fetch the data we need for the abort record */ ! nrels = smgrGetPendingDeletes(false, &rels, NULL); nchildren = xactGetCommittedChildren(&children); /* XXX do we really need a critical section here? */ --- 1174,1180 ---- xid); /* Fetch the data we need for the abort record */ ! nrels = smgrGetPendingDeletes(false, &rels, NULL, NULL); nchildren = xactGetCommittedChildren(&children); /* XXX do we really need a critical section here? */ Index: src/backend/storage/smgr/smgr.c =================================================================== RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/smgr/smgr.c,v retrieving revision 1.109 diff -c -r1.109 smgr.c *** src/backend/storage/smgr/smgr.c 1 Jan 2008 19:45:52 -0000 1.109 --- src/backend/storage/smgr/smgr.c 29 Feb 2008 09:56:37 -0000 *************** *** 678,689 **** * * If haveNonTemp isn't NULL, the bool it points to gets set to true if * there is any non-temp table pending to be deleted; false if not. * * Note that the list does not include anything scheduled for termination * by upper-level transactions. */ int ! smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr, bool *haveNonTemp) { int nestLevel = GetCurrentTransactionNestLevel(); int nrels; --- 678,692 ---- * * If haveNonTemp isn't NULL, the bool it points to gets set to true if * there is any non-temp table pending to be deleted; false if not. + * haveTemp is similar, but gets set if there is any temp table deletions + * pending. * * Note that the list does not include anything scheduled for termination * by upper-level transactions. */ int ! smgrGetPendingDeletes(bool forCommit, RelFileNode **ptr, ! bool *haveNonTemp, bool *haveTemp) { int nestLevel = GetCurrentTransactionNestLevel(); int nrels; *************** *** 693,698 **** --- 696,703 ---- nrels = 0; if (haveNonTemp) *haveNonTemp = false; + if (haveTemp) + *haveTemp = false; for (pending = pendingDeletes; pending != NULL; pending = pending->next) { if (pending->nestLevel >= nestLevel && pending->atCommit == forCommit) *************** *** 711,716 **** --- 716,723 ---- *rptr++ = pending->relnode; if (haveNonTemp && !pending->isTemp) *haveNonTemp = true; + if (haveTemp && pending->isTemp) + *haveTemp = true; } return nrels; }
pgsql-hackers by date: