Re: A long-running transaction - Mailing list pgsql-sql
| From | Andrew Sullivan |
|---|---|
| Subject | Re: A long-running transaction |
| Date | |
| Msg-id | 20070413114919.GA31083@phlogiston.dyndns.org |
| In response to | Re: A long-running transaction (John Summerfield <postgres@herakles.homelinux.org>) |
| Responses | Re: A long-running transaction; Re: A long-running transaction |
| List | pgsql-sql |
On Fri, Apr 13, 2007 at 09:02:58AM +0800, John Summerfield wrote:

> Why is this implementation preferable to not doing that?

Because this way, readers never wait for writers. On most database systems, that pending UPDATE would block anyone reading the row too, even if they weren't going to write on it. It's a trade-off, and it happens to be a bad trade for the particular thing you're doing; but since most database operations aren't bulk ones of the kind you're attempting, it's quite possibly a reasonable trade-off overall.

> Where is enterprisedb? enterprisedb.com seems to be held by a squatter
> in Kingston. About where they're playing an important cricket
> competition about now.

Google took me there in one: http://www.enterprisedb.com

But anyway, their stuff is built on top of Pg, so going to their site won't help you.

> I can see we need before and after, but why all those intermediate rows?

Because in the general case, you need the intermediate rows. The problem is that you'd have to write a special piece of code to catch the case where nobody else can see the row you're about to expire, and that test isn't free. Therefore, it gets expired the same way any other row does.

> Also, I don't see why (apparently) a sequential search is used; surely
> if all of these rows might be required, still a pointer to the last
> would be right? Or last two, if you need the ability to check the order.

No, it's not a seqscan. It's following the chain of versions of the row. You don't know until you look at the actual row whether the version of it you are looking at is valid for you. There isn't any other place to store that information. (If you want someone who really understands how all this works to explain it in more accurate terms, you should ask the question on -hackers. They'll point you to the developers' docs that I can't seem to find right now.)

> Is there a development version I can try, with this improvement in it?
Well, you could try using the current CVS HEAD, which is close to feature freeze. It'd sure be a service to the community, at least, because we'd learn whether the proposed change fixes this sort of case. (But I wouldn't trust my data to HEAD for real. You said you're not actually in production yet, though.)

> 1. For the first day or so, my observation was that the disk was not
> particularly busy.

That's completely consistent with the theory I have: as the number of dead tuples goes up, your disk activity will slowly get worse.

> At present I'm trying to clarify in my mind the nature of the problem.
> What I'm trying to do seems to me reasonable. I have some data, and I
> want it all in or none of it, so it fits the idea of a single transaction.
>
> It might be that my demands exceed Postgresql's current capabilities,
> but by itself that doesn't make what I'm trying to do unreasonable.

No, it's not unreasonable, but it happens to be a pessimal case under Postgres.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying November. --H.W. Fowler
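The version-chain behaviour discussed in this thread can be sketched in a few lines of Python. This is purely illustrative, not PostgreSQL's actual implementation: the names `xmin` and `xmax` follow PostgreSQL's system columns, but the `Version` and `Row` structures are invented for the sketch. Each UPDATE expires the live version and appends a new one, and a reader must walk the chain testing every version's visibility against its snapshot, which is why a bulk operation that keeps expiring rows inside one long transaction gets slower as dead versions pile up.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative MVCC sketch (hypothetical structures, not PostgreSQL code).
@dataclass
class Version:
    value: str
    xmin: int            # txid that created this version
    xmax: Optional[int]  # txid that expired it; None while live

class Row:
    def __init__(self, value: str, txid: int):
        self.versions = [Version(value, txid, None)]

    def update(self, value: str, txid: int) -> None:
        # An UPDATE never overwrites in place: it expires the live
        # version and appends a new one, so readers never block.
        self.versions[-1].xmax = txid
        self.versions.append(Version(value, txid, None))

    def read(self, snapshot_txid: int):
        # Walk the whole chain; a version is visible only if it was
        # created at or before our snapshot and not yet expired then.
        scanned, visible = 0, None
        for v in self.versions:
            scanned += 1
            if v.xmin <= snapshot_txid and (v.xmax is None or v.xmax > snapshot_txid):
                visible = v.value
        return visible, scanned

row = Row("v0", txid=1)
for txid in range(2, 1002):          # 1000 updates, as in a bulk load
    row.update(f"v{txid - 1}", txid)

value, scanned = row.read(snapshot_txid=1002)
print(value, scanned)                # one live value, 1001 versions walked
```

In real PostgreSQL it is VACUUM that reclaims expired versions, and while the long transaction stays open they cannot be removed; that matches the slowly growing disk activity reported in the thread.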