Thread: ERROR: missing chunk number 0 for toast value
Hi All,
Test case:
drop table if exists t;
create table t(c text);
insert into t values ('x'), (repeat(md5('abcdefghijklmnop'), 10000));
select pg_column_size(c), pg_column_size(c || '') FROM t;
CREATE OR REPLACE FUNCTION copy_toast_out() RETURNS VOID AS $$
declare
v text;
BEGIN
SELECT c INTO v FROM t WHERE c <> 'x';
Select 1/0;
Exception
When Others Then
PERFORM pg_sleep(30); -- go run "TRUNCATE t" in a 2nd session
raise notice 'length :%', length(v || ''); -- force detoast
END;
$$ language plpgsql;
postgres=# select copy_toast_out();
ERROR: missing chunk number 0 for toast value 16390 in pg_toast_16384
CONTEXT: PL/pgSQL function copy_toast_out() line 10 at RAISE
Analysis:
The basic problem here is that if the lock is released on table before
extracting toasted value, and in meantime someone truncates the table,
this error can occur. Here error coming with PL block contains an Exception
block (as incase there is an exception block, it calls RollbackAndReleaseCurrentSubTransaction).
Do you think we should detoast the local variable before RollbackAndReleaseCurrentSubTransaction ? Or any other options ?
Regards,
Rushabh Lathia On 01/02/2014 02:24 PM, Rushabh Lathia wrote: > Hi All, > > Test case: > > drop table if exists t; > create table t(c text); > insert into t values ('x'), (repeat(md5('abcdefghijklmnop'), 10000)); > select pg_column_size(c), pg_column_size(c || '') FROM t; > > CREATE OR REPLACE FUNCTION copy_toast_out() RETURNS VOID AS $$ > declare > v text; > BEGIN > SELECT c INTO v FROM t WHERE c <> 'x'; > Select 1/0; > Exception > When Others Then > PERFORM pg_sleep(30); -- go run "TRUNCATE t" in a 2nd session > > > raise notice 'length :%', length(v || ''); -- force detoast > > > END; > $$ language plpgsql; > > postgres=# select copy_toast_out(); > ERROR: missing chunk number 0 for toast value 16390 in pg_toast_16384 > CONTEXT: PL/pgSQL function copy_toast_out() line 10 at RAISE > > Analysis: > > The basic problem here is that if the lock is released on table before > extracting toasted value, and in meantime someone truncates the table, > this error can occur. Here error coming with PL block contains an Exception > block (as incase there is an exception block, it calls > RollbackAndReleaseCurrentSubTransaction). This is another variant of the bug discussed here: http://www.postgresql.org/message-id/0C41674C-FA02-4768-9E1B-548E568875BD@quarantainenet.nl. > Do you think we should detoast the local variable before > RollbackAndReleaseCurrentSubTransaction ? Or any other options ? Hmm, that would fix this particular test case, but not the other case where you DROP or TRUNCATE the table in the same transaction. The simplest fix would be to just detoast everything on assignment but that was rejected on performance grounds in that previous thread. I don't see any other realistic way to fix this, however, so maybe we should just bite the bullet and do it anyway. For simple variables like, in your test case, it's a good bet to detoast the value immediately; it'll be detoasted as soon as you try to do anything with it anyway. But it's not a good bet for record or row variables, because you often fetch the whole row into a variable but only access a field or two. Then again, if you run into that, at least you can work around it by changing your plpgsql code to only fetch the fields you need. - Heikki
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > The simplest fix would be to just detoast everything on assignment but > that was rejected on performance grounds in that previous thread. I > don't see any other realistic way to fix this, however, so maybe we > should just bite the bullet and do it anyway. Or just say "don't do that". TRUNCATE on a table that's in use by open transactions has all sorts of issues besides this one. The given example is a pretty narrow corner case anyway --- with a less contorted coding pattern, we'd still have AccessShareLock on the table, blocking the TRUNCATE from removing data. I'd still not want to blow up performance in order to make this example work. regards, tom lane
On 2014-01-02 21:21:15 +0200, Heikki Linnakangas wrote: > I don't see any other realistic way to fix this, however, so maybe we > should just bite the bullet and do it anyway. We could remember the subtransaction a variable was created in and error out if it the creating subtransaction aborted and it's not a pass-by-value datum or similar. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-01-02 21:21:15 +0200, Heikki Linnakangas wrote: >> I don't see any other realistic way to fix this, however, so maybe we >> should just bite the bullet and do it anyway. > We could remember the subtransaction a variable was created in and error > out if it the creating subtransaction aborted and it's not a > pass-by-value datum or similar. That would still result in throwing an error, though, so it isn't likely to make the OP happy. I was wondering if we could somehow arrange to not release the subtransaction's AccessShareLock on the table, as long as it was protecting toasted references someplace. regards, tom lane
On 2014-01-02 15:00:58 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-01-02 21:21:15 +0200, Heikki Linnakangas wrote: > >> I don't see any other realistic way to fix this, however, so maybe we > >> should just bite the bullet and do it anyway. > > > We could remember the subtransaction a variable was created in and error > > out if it the creating subtransaction aborted and it's not a > > pass-by-value datum or similar. > > That would still result in throwing an error, though, so it isn't likely > to make the OP happy. Yea, it would give a better error message which might help diagnose the issue, but not more. We could disallow accessing such variables generally unless they explicitly had been detoasted, that would make people notice the problem more easily. I shortly wondered if we couldn't "just" iterate over plpgsql variables and detoast them on subabort if created in the aborted xact, but that doesn't really work because we're in an aborted transaction where it might not be safe to access relations... Theoretically the subabort could be split into two phases allowing it by only releasing the lock after safely switching to the upper transaction but that sounds like a hammer too big for the problem. > I was wondering if we could somehow arrange to not > release the subtransaction's AccessShareLock on the table, as long as it > was protecting toasted references someplace. Sounds fairly ugly... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Jan 2, 2014 at 3:19 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> I was wondering if we could somehow arrange to not >> release the subtransaction's AccessShareLock on the table, as long as it >> was protecting toasted references someplace. > > Sounds fairly ugly... I think the only principled fixes are to either retain the lock or forcibly detoast before releasing it. The main problem I see with retaining the lock is that you'd need a way of finding out the relation OIDs of all toast pointers you might later decide to expand. I don't have an amazingly good idea about how to figure that out. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-01-02 16:05:09 -0500, Robert Haas wrote: > On Thu, Jan 2, 2014 at 3:19 PM, Andres Freund <andres@2ndquadrant.com> wrote: > >> I was wondering if we could somehow arrange to not > >> release the subtransaction's AccessShareLock on the table, as long as it > >> was protecting toasted references someplace. > > > > Sounds fairly ugly... > > I think the only principled fixes are to either retain the lock or > forcibly detoast before releasing it. I don't think that's sufficient. Unless I miss something the problem isn't restricted to TRUNCATE and such at all. I think a plain VACUUM should be sufficient? I haven't tested it, but INSERT RETURNING toasted_col a row, storing the result in a record, and then aborting the subtransaction will allow the inserted row to be VACUUMed by a concurrent transaction. So I don't think anything along those lines will be sufficient. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jan 3, 2014 at 12:51 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > On 01/02/2014 02:24 PM, Rushabh Lathia wrote: >> >> Hi All, >> >> Test case: >> >> drop table if exists t; >> create table t(c text); >> insert into t values ('x'), (repeat(md5('abcdefghijklmnop'), 10000)); >> select pg_column_size(c), pg_column_size(c || '') FROM t; >> >> CREATE OR REPLACE FUNCTION copy_toast_out() RETURNS VOID AS $$ >> declare >> v text; >> BEGIN >> SELECT c INTO v FROM t WHERE c <> 'x'; >> Select 1/0; >> Exception >> When Others Then >> PERFORM pg_sleep(30); -- go run "TRUNCATE t" in a 2nd session >> >> >> raise notice 'length :%', length(v || ''); -- force detoast >> >> >> END; >> $$ language plpgsql; >> >> postgres=# select copy_toast_out(); >> ERROR: missing chunk number 0 for toast value 16390 in pg_toast_16384 >> CONTEXT: PL/pgSQL function copy_toast_out() line 10 at RAISE >> >> Analysis: >> >> The basic problem here is that if the lock is released on table before >> extracting toasted value, and in meantime someone truncates the table, >> this error can occur. Here error coming with PL block contains an >> Exception >> block (as incase there is an exception block, it calls >> RollbackAndReleaseCurrentSubTransaction). > > > This is another variant of the bug discussed here: > http://www.postgresql.org/message-id/0C41674C-FA02-4768-9E1B-548E568875BD@quarantainenet.nl. > > >> Do you think we should detoast the local variable before >> RollbackAndReleaseCurrentSubTransaction ? Or any other options ? > > > Hmm, that would fix this particular test case, but not the other case where > you DROP or TRUNCATE the table in the same transaction. > > The simplest fix would be to just detoast everything on assignment but that > was rejected on performance grounds in that previous thread. I don't see any > other realistic way to fix this, however, so maybe we should just bite the > bullet and do it anyway. For simple variables like, in your test case, it's > a good bet to detoast the value immediately; it'll be detoasted as soon as > you try to do anything with it anyway. But it's not a good bet for record or > row variables, because you often fetch the whole row into a variable but > only access a field or two. Yeah, this is exactly what came to my mind as well the first time I saw this problem that for row and record variables it can be penalty which user might not expect as he might not be using toasted values. However is it possible that we do detoasting on assignment when the variable of function is declared with some specific construct. For example, we do detoasting at commit time for holdable portals (referred below code) /* * Change the destination to output to the tuplestore. Note we tell * the tuplestore receiver to detoast all data passed through it. */ queryDesc->dest = CreateDestReceiver(DestTuplestore); SetTuplestoreDestReceiverParams(..); When the Hold option is specified with cursor, then we perform detoast on commit, so on similar lines if the specific variable or function is declared with some particular construct, then we detoast on assignment. Another option is that we give more meaningful error with Hint suggesting the possible reason of error. This option can be used along with above option in case variable/function is not declared with particular construct. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Fri, Jan 3, 2014 at 9:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Fri, Jan 3, 2014 at 12:51 AM, Heikki Linnakangas > <hlinnakangas@vmware.com> wrote: >> On 01/02/2014 02:24 PM, Rushabh Lathia wrote: >>> Do you think we should detoast the local variable before >>> RollbackAndReleaseCurrentSubTransaction ? Or any other options ? >> >> >> Hmm, that would fix this particular test case, but not the other case where >> you DROP or TRUNCATE the table in the same transaction. >> >> The simplest fix would be to just detoast everything on assignment but that >> was rejected on performance grounds in that previous thread. I don't see any >> other realistic way to fix this, however, so maybe we should just bite the >> bullet and do it anyway. For simple variables like, in your test case, it's >> a good bet to detoast the value immediately; it'll be detoasted as soon as >> you try to do anything with it anyway. But it's not a good bet for record or >> row variables, because you often fetch the whole row into a variable but >> only access a field or two. > > Yeah, this is exactly what came to my mind as well the first time I saw this > problem that for row and record variables it can be penalty which user might > not expect as he might not be using toasted values. > > However is it possible that we do detoasting on assignment when the > variable of function is declared with some specific construct. After reading about handling for similar problem in other databases and thinking more on it, I wonder if we can make a rule such that values lesser than some threshold (8K or 16K or 32K) can be allowed to be retrieved in plpgsql variables. So with this, we can always detoast on assignment if the value is less than threshold and return error otherwise. I think this will help in reducing the performance impact and allow users to retrieve values (which are of less than threshold) in plpgsql variables without worrying about the behaviour reported in this and similar thread. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Jan 2, 2014 at 4:15 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-01-02 16:05:09 -0500, Robert Haas wrote: >> On Thu, Jan 2, 2014 at 3:19 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> >> I was wondering if we could somehow arrange to not >> >> release the subtransaction's AccessShareLock on the table, as long as it >> >> was protecting toasted references someplace. >> > >> > Sounds fairly ugly... >> >> I think the only principled fixes are to either retain the lock or >> forcibly detoast before releasing it. > > I don't think that's sufficient. Unless I miss something the problem > isn't restricted to TRUNCATE and such at all. I think a plain VACUUM > should be sufficient? I haven't tested it, but INSERT RETURNING > toasted_col a row, storing the result in a record, and then aborting the > subtransaction will allow the inserted row to be VACUUMed by a > concurrent transaction. Hmm, that's actually nastier than the case that the case Rushabh originally reported. A somewhat plausible response to "my holdable cursor didn't work after I truncated the table it read from" is "well don't do that then". But this case could actually happen to someone who wasn't trying to do anything screwy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-01-06 09:10:48 -0500, Robert Haas wrote: > On Thu, Jan 2, 2014 at 4:15 PM, Andres Freund <andres@2ndquadrant.com> wrote: > >> I think the only principled fixes are to either retain the lock or > >> forcibly detoast before releasing it. > > > > I don't think that's sufficient. Unless I miss something the problem > > isn't restricted to TRUNCATE and such at all. I think a plain VACUUM > > should be sufficient? I haven't tested it, but INSERT RETURNING > > toasted_col a row, storing the result in a record, and then aborting the > > subtransaction will allow the inserted row to be VACUUMed by a > > concurrent transaction. > > Hmm, that's actually nastier than the case that the case Rushabh > originally reported. A bit, yes. Somebody should probably verify that it can actually happen :P > A somewhat plausible response to "my holdable > cursor didn't work after I truncated the table it read from" is "well > don't do that then". But this case could actually happen to someone > who wasn't trying to do anything screwy. Personally I think everything that involves using data computed in an aborted subtransaction but the error code is screwy. I think plpgsql has been far too lenient in allowing that in an unconstrained fashion. I actually vote for not allowing doing so at all by erroring out when accessing a plpgsql variable created in an aborted subxact, unless you explicitly signal that you want to do do so by calling some function deleting the information about which subxact a variable was created in. I have seen several bugs caused by people assuming that EXCEPTION BLOCK/subtransaction rollback had some kind of effects on variables created in them. And we just don't have much support for doing anything in that direction safely. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jan 6, 2014 at 9:19 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-01-06 09:10:48 -0500, Robert Haas wrote: >> On Thu, Jan 2, 2014 at 4:15 PM, Andres Freund <andres@2ndquadrant.com> wrote: >> >> I think the only principled fixes are to either retain the lock or >> >> forcibly detoast before releasing it. >> > >> > I don't think that's sufficient. Unless I miss something the problem >> > isn't restricted to TRUNCATE and such at all. I think a plain VACUUM >> > should be sufficient? I haven't tested it, but INSERT RETURNING >> > toasted_col a row, storing the result in a record, and then aborting the >> > subtransaction will allow the inserted row to be VACUUMed by a >> > concurrent transaction. >> >> Hmm, that's actually nastier than the case that the case Rushabh >> originally reported. > > A bit, yes. Somebody should probably verify that it can actually happen :P > >> A somewhat plausible response to "my holdable >> cursor didn't work after I truncated the table it read from" is "well >> don't do that then". But this case could actually happen to someone >> who wasn't trying to do anything screwy. > > Personally I think everything that involves using data computed in an > aborted subtransaction but the error code is screwy. I think plpgsql has > been far too lenient in allowing that in an unconstrained fashion. > > I actually vote for not allowing doing so at all by erroring out when > accessing a plpgsql variable created in an aborted subxact, unless you > explicitly signal that you want to do do so by calling some function > deleting the information about which subxact a variable was created > in. I have seen several bugs caused by people assuming that EXCEPTION > BLOCK/subtransaction rollback had some kind of effects on variables > created in them. And we just don't have much support for doing anything > in that direction safely. So, you want to let users do things that are unsafe, but only if they ask nicely? That hardly seems right. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-01-06 09:43:45 -0500, Robert Haas wrote: > > I actually vote for not allowing doing so at all by erroring out when > > accessing a plpgsql variable created in an aborted subxact, unless you > > explicitly signal that you want to do do so by calling some function > > deleting the information about which subxact a variable was created > > in. I have seen several bugs caused by people assuming that EXCEPTION > > BLOCK/subtransaction rollback had some kind of effects on variables > > created in them. And we just don't have much support for doing anything > > in that direction safely. > > So, you want to let users do things that are unsafe, but only if they > ask nicely? That hardly seems right. Well, no. If they have to use that function explicitly *before* the subxact aborted, we can copy & detoast the value out of that context safely. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jan 6, 2014 at 9:47 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-01-06 09:43:45 -0500, Robert Haas wrote: >> > I actually vote for not allowing doing so at all by erroring out when >> > accessing a plpgsql variable created in an aborted subxact, unless you >> > explicitly signal that you want to do do so by calling some function >> > deleting the information about which subxact a variable was created >> > in. I have seen several bugs caused by people assuming that EXCEPTION >> > BLOCK/subtransaction rollback had some kind of effects on variables >> > created in them. And we just don't have much support for doing anything >> > in that direction safely. >> >> So, you want to let users do things that are unsafe, but only if they >> ask nicely? That hardly seems right. > > Well, no. If they have to use that function explicitly *before* the > subxact aborted, we can copy & detoast the value out of that context > safely. Oh, I see. I think that's pretty icky. Users won't expect (and will complain about) such restrictions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-01-06 11:08:41 -0500, Robert Haas wrote: > On Mon, Jan 6, 2014 at 9:47 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-01-06 09:43:45 -0500, Robert Haas wrote: > >> > I actually vote for not allowing doing so at all by erroring out when > >> > accessing a plpgsql variable created in an aborted subxact, unless you > >> > explicitly signal that you want to do do so by calling some function > >> > deleting the information about which subxact a variable was created > >> > in. I have seen several bugs caused by people assuming that EXCEPTION > >> > BLOCK/subtransaction rollback had some kind of effects on variables > >> > created in them. And we just don't have much support for doing anything > >> > in that direction safely. > >> > >> So, you want to let users do things that are unsafe, but only if they > >> ask nicely? That hardly seems right. > > > > Well, no. If they have to use that function explicitly *before* the > > subxact aborted, we can copy & detoast the value out of that context > > safely. > > Oh, I see. I think that's pretty icky. Users won't expect (and will > complain about) such restrictions. Yea. But at least it would fail reliably instead of just under concurrency and other strange circumstances - and there'd be a safe way out. Currently there seem to be all sorts of odd behaviour possible. I simply don't have a better idea :( Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jan 6, 2014 at 11:47 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-01-06 11:08:41 -0500, Robert Haas wrote: >> On Mon, Jan 6, 2014 at 9:47 AM, Andres Freund <andres@2ndquadrant.com> wrote: >> > On 2014-01-06 09:43:45 -0500, Robert Haas wrote: >> >> > I actually vote for not allowing doing so at all by erroring out when >> >> > accessing a plpgsql variable created in an aborted subxact, unless you >> >> > explicitly signal that you want to do do so by calling some function >> >> > deleting the information about which subxact a variable was created >> >> > in. I have seen several bugs caused by people assuming that EXCEPTION >> >> > BLOCK/subtransaction rollback had some kind of effects on variables >> >> > created in them. And we just don't have much support for doing anything >> >> > in that direction safely. >> >> >> >> So, you want to let users do things that are unsafe, but only if they >> >> ask nicely? That hardly seems right. >> > >> > Well, no. If they have to use that function explicitly *before* the >> > subxact aborted, we can copy & detoast the value out of that context >> > safely. >> >> Oh, I see. I think that's pretty icky. Users won't expect (and will >> complain about) such restrictions. > > Yea. But at least it would fail reliably instead of just under > concurrency and other strange circumstances - and there'd be a safe way > out. Currently there seem to be all sorts of odd behaviour possible. > > I simply don't have a better idea :( Is "forcibly detoast everything" a complete no-go? I realize there are performance concerns with that approach, but I'm not sure how realistic a worry it actually is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Is "forcibly detoast everything" a complete no-go? I realize there > are performance concerns with that approach, but I'm not sure how > realistic a worry it actually is. It's certainly possible to think of scenarios under which it'd be painful, eg, you fetch all columns into a record but you never actually use the toasted one(s). OTOH, I can think of cases where forced detoasting might save cycles too, if it prevents multiple detoastings on later accesses. Probably what we ought to do is put together a trial patch and try to do some benchmarking. I agree that this is the simplest route to a fix if we can stand the overhead. regards, tom lane
On 2014-01-06 12:40:25 -0500, Robert Haas wrote: > On Mon, Jan 6, 2014 at 11:47 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > On 2014-01-06 11:08:41 -0500, Robert Haas wrote: > > Yea. But at least it would fail reliably instead of just under > > concurrency and other strange circumstances - and there'd be a safe way > > out. Currently there seem to be all sorts of odd behaviour possible. > > > > I simply don't have a better idea :( > > Is "forcibly detoast everything" a complete no-go? I realize there > are performance concerns with that approach, but I'm not sure how > realistic a worry it actually is. The scenario I am primarily worried about is turning a record assignment which previously took up to BLOCK_SIZE + slop amount of memory into something taking up to a gigabyte. That's a pretty damn hefty change. And there's no good way of preventing it short of using a variable for each actually desired column which imnsho isn't really a solution. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-01-06 12:40:25 -0500, Robert Haas wrote: >> Is "forcibly detoast everything" a complete no-go? I realize there >> are performance concerns with that approach, but I'm not sure how >> realistic a worry it actually is. > The scenario I am primarily worried about is turning a record assignment > which previously took up to BLOCK_SIZE + slop amount of memory into > something taking up to a gigabyte. That's a pretty damn hefty > change. > And there's no good way of preventing it short of using a variable for > each actually desired column which imnsho isn't really a solution. Dunno ... if you have a table that contains a gigabyte-width column, should you be all that surprised if "SELECT * INTO r FROM table" results in "r" occupying about a gigabyte? And I can't count the number of times I've heard people deprecate using "SELECT *" at all in production code, so I don't agree with the claim that listing the columns you want is an unacceptable solution. I don't doubt that there are some folks for whom this would be a noticeable space-consumption hit compared to current behavior, but I have a hard time working up a lot of sympathy for them. I'm more concerned about the possible performance hit from detoasting more-reasonably-sized columns (say in the tens-of-KB range) when they might not get used. But we really need to benchmark that rather than just guess about whether it's a problem. regards, tom lane
On 1/2/14, 1:32 PM, Tom Lane wrote: > Heikki Linnakangas <hlinnakangas@vmware.com> writes: >> The simplest fix would be to just detoast everything on assignment but >> that was rejected on performance grounds in that previous thread. I >> don't see any other realistic way to fix this, however, so maybe we >> should just bite the bullet and do it anyway. > > Or just say "don't do that". TRUNCATE on a table that's in use by open > transactions has all sorts of issues besides this one. The given example > is a pretty narrow corner case anyway --- with a less contorted coding > pattern, we'd still have AccessShareLock on the table, blocking the > TRUNCATE from removing data. I'd still not want to blow up performance > in order to make this example work. If concurrent TRUNCATE isn't safe outside of this case then why do we allow it? IE: why doesn't TRUNCATE exclusive lock therelation? I'd much rather have working concurrent truncation than having to lock the relation, but if it's not safe we shouldn't handpeople that footgun... -- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On 1/6/14, 2:21 PM, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: >> On 2014-01-06 12:40:25 -0500, Robert Haas wrote: >>> Is "forcibly detoast everything" a complete no-go? I realize there >>> are performance concerns with that approach, but I'm not sure how >>> realistic a worry it actually is. > >> The scenario I am primarily worried about is turning a record assignment >> which previously took up to BLOCK_SIZE + slop amount of memory into >> something taking up to a gigabyte. That's a pretty damn hefty >> change. >> And there's no good way of preventing it short of using a variable for >> each actually desired column which imnsho isn't really a solution. > > Dunno ... if you have a table that contains a gigabyte-width column, > should you be all that surprised if "SELECT * INTO r FROM table" > results in "r" occupying about a gigabyte? And I can't count the > number of times I've heard people deprecate using "SELECT *" at all > in production code, so I don't agree with the claim that listing the > columns you want is an unacceptable solution. I see your logic, but the problem is a good developer would have actually tested that case and said "Oh look, plpgsql isn'tblindly copying the entire record." Now we're changing that case underneath them. That's a pretty significant changethat could affect a LOT of code on the user's side. And if they've got conditional code down-stream that sometimeshits the TOASTed value and sometimes doesn't then they're in for even more fun... The deferred access pattern of detoasting is a very powerful performance improvement and I'd hate to see us limiting it inplpgsql. -- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
On Mon, Jan 6, 2014 at 8:02 PM, Jim Nasby <jim@nasby.net> wrote: > If concurrent TRUNCATE isn't safe outside of this case then why do we allow > it? IE: why doesn't TRUNCATE exclusive lock the relation? It *does*. The problem is that the *other* transaction that's reading the relation can still retain a TOAST pointer after it no longer holds the lock. That's uncool. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 01/06/2014 08:29 PM, Andres Freund wrote: > On 2014-01-06 12:40:25 -0500, Robert Haas wrote: >> On Mon, Jan 6, 2014 at 11:47 AM, Andres Freund <andres@2ndquadrant.com> wrote: >>> On 2014-01-06 11:08:41 -0500, Robert Haas wrote: >>> Yea. But at least it would fail reliably instead of just under >>> concurrency and other strange circumstances - and there'd be a safe way >>> out. Currently there seem to be all sorts of odd behaviour possible. >>> >>> I simply don't have a better idea :( >> >> Is "forcibly detoast everything" a complete no-go? I realize there >> are performance concerns with that approach, but I'm not sure how >> realistic a worry it actually is. > > The scenario I am primarily worried about is turning a record assignment > which previously took up to BLOCK_SIZE + slop amount of memory into > something taking up to a gigabyte. That's a pretty damn hefty > change. > And there's no good way of preventing it short of using a variable for > each actually desired column which imnsho isn't really a solution. We could mitigate that somewhat by doing an optimization pass of the PL/pgSQL code after compilation, and check which fields of a row variable are never referenced, and skip the detoasting for those fields. It would only work for named row variables, not anonymous record variables, and you would still unnecessarily detoast fields that are sometimes accessed but usually not. But it would avoid the detoasting in the most egregious cases, e.g where you fetch a whole row into a variable just to access one field. Overall, I'm leaning towards biting the bullet and always detoasting everything in master. Probably best to just leave the stable branches alone. - Heikki
On 2014-01-07 10:45:24 +0200, Heikki Linnakangas wrote: > Overall, I'm leaning towards biting the bullet and always detoasting > everything in master. Probably best to just leave the stable branches alone. If we're doing something coarse grained as this, I agree, it should be master only. I personally vote to rather just leave things as is, seems better than this pessimization, and it's not like loads of people have hit the issue. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Jan7, 2014, at 09:45 , Heikki Linnakangas <hlinnakangas@vmware.com> wrote: > Overall, I'm leaning towards biting the bullet and always detoasting everything in master. Probably best to just leavethe stable branches alone. +1 The fact that de-TOAST-ing can happen lazily is, at least to me, an implementation detail that shouldn't be observable. If we want to allow people to use lazy de-TOAST-ing as an optimization tool, we should provide an explicit way to do so, e.g. by flagging variables in pl/pgsql as REFERENCE or something like that. best regards, Florian Pflug