Thread: pg_dump, MVCC and consistency

pg_dump, MVCC and consistency

From: Florian Ledoux
Hello everybody !

I am coming from the (expensive) "Oracle World" and I am a newbie in
PG administration. I am currently working on backup concerns... I am
using pg_dump and have not encountered any problems, but I have some
questions about how the PG server manages data consistency internally.
I have read some articles about the MVCC mechanism, but I can't see how
it maintains a consistent "snapshot" of the database for the whole
export process.

If I have understood correctly, the default transaction isolation level
in PG is "read committed". If that is the isolation level used by
pg_dump, how can I be sure that tables accessed at the end of my export
are consistent with those accessed at the beginning? Does pg_dump use
the serializable isolation level?

We have this kind of concern with Oracle: a "CONSISTENT" flag can be
set in the exp utility to use a consistent snapshot of the database
from the beginning to the end of the export process. Unfortunately,
this mode makes heavy use of rollback segments and can fail with stale
snapshots (the well-known "Snapshot too old" error). Is there an
equivalent of rollback segments in PG? Are there issues like "snapshot
too old" on busy multi-user, transactional databases?

I do not have a good knowledge of PG's internal mechanisms; I hope that
my questions are clear enough...

Florian

Re: pg_dump, MVCC and consistency

From: Michael Fuhr
On Mon, Oct 24, 2005 at 02:29:24PM +0200, Florian Ledoux wrote:
> If I have understood correctly, the default transaction isolation level
> in PG is "read committed". If that is the isolation level used by
> pg_dump, how can I be sure that tables accessed at the end of my export
> are consistent with those accessed at the beginning? Does pg_dump use
> the serializable isolation level?

There are at least two ways to find out: examine the source code
or enable query logging on the server.  You'll discover that a
pg_dump session starts with:

BEGIN
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
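
If you want to see the effect for yourself, you can reproduce what
pg_dump relies on by hand in psql. A minimal sketch (the table names
are made up):

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- from the first query onward, every statement in this transaction sees
-- the same snapshot, no matter what other sessions commit in the meantime
SELECT count(*) FROM accounts;   -- hypothetical table
SELECT count(*) FROM orders;     -- hypothetical table
COMMIT;

That single long-lived snapshot is what gives the dump its consistency.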

> We have this kind of concern with Oracle: a "CONSISTENT" flag can be
> set in the exp utility to use a consistent snapshot of the database
> from the beginning to the end of the export process. Unfortunately,
> this mode makes heavy use of rollback segments and can fail with stale
> snapshots (the well-known "Snapshot too old" error). Is there an
> equivalent of rollback segments in PG? Are there issues like "snapshot
> too old" on busy multi-user, transactional databases?

PostgreSQL uses MVCC to get snapshots.  See "Concurrency Control"
and "Preventing transaction ID wraparound failures" in the documentation
for discussion of how this works, what problems you might encounter,
and how to avoid them.

http://www.postgresql.org/docs/8.0/interactive/mvcc.html
http://www.postgresql.org/docs/8.0/interactive/maintenance.html#VACUUM-FOR-WRAPAROUND
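
The second link boils down to routine vacuuming plus keeping an eye on
how far each database is from wraparound; roughly (the age() query is
the one from the maintenance docs, if I remember it right):

VACUUM;                            -- plain, database-wide vacuum; run it in every database
SELECT datname, age(datfrozenxid)
FROM pg_database;                  -- the age should stay comfortably below two billion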

PostgreSQL 8.1 makes checks to avoid data loss due to transaction
ID wraparound, but there's one situation I'm not sure how it handles:
when a transaction is so long-lived that it would appear to be in
the future of newly-created transactions due to wraparound.  I'd
have to dig into the source code to find out if that's possible,
and if so, what happens.  Maybe one of the developers will comment.

--
Michael Fuhr

Re: pg_dump, MVCC and consistency

From: Martijn van Oosterhout
On Mon, Oct 24, 2005 at 11:25:00AM -0600, Michael Fuhr wrote:
> PostgreSQL 8.1 makes checks to avoid data loss due to transaction
> ID wraparound, but there's one situation I'm not sure how it handles:
> when a transaction is so long-lived that it would appear to be in
> the future of newly-created transactions due to wraparound.  I'd
> have to dig into the source code to find out if that's possible,
> and if so, what happens.  Maybe one of the developers will comment.

To avoid this you need to do a VACUUM FULL over the database at least
once every two billion transactions (not statements or tuples,
transactions). To that end, the server begins complaining after one
billion. I've never seen this in practice. Perhaps you could calculate
how long it would take to do that many transactions. Most systems will
never see it...
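
If the stats collector is running, pg_stat_database gives a rough way
to measure that, e.g.:

-- committed plus rolled-back transactions per database since the stats were last reset
SELECT datname, xact_commit + xact_rollback AS total_xacts
FROM pg_stat_database;

Sample it twice, divide the difference by the elapsed time, and you
have your transaction rate; from there it's easy to see how many months
or years two billion transactions represent.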

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: pg_dump, MVCC and consistency

From: Alex Turner
Believe me, when you get data feeds with bad data and you have to do each insert as its own transaction because COPY will just bail out, you can hit 1 billion really fast.
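
For the curious, the pattern looks roughly like this (the table and
values are just placeholders):

CREATE TABLE feed_data (id integer PRIMARY KEY, payload text);  -- hypothetical staging table

-- COPY or one big multi-row transaction would abort on the first bad row,
-- so every row gets its own transaction, and every transaction uses up an XID:
BEGIN;
INSERT INTO feed_data VALUES (1, 'good row');
COMMIT;                                   -- one XID gone
BEGIN;
INSERT INTO feed_data VALUES (1, 'duplicate key, this one fails');
ROLLBACK;                                 -- that XID is gone too, rollback or not
-- tens of millions of feed rows a day works out to a billion XIDs within weeks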

Alex

On 10/24/05, Martijn van Oosterhout <kleptog@svana.org> wrote:
> To avoid this you need to do a VACUUM FULL over the database at least
> once every two billion transactions (not statements or tuples,
> transactions). To that end, the server begins complaining after one
> billion. I've never seen this in practice. Perhaps you could calculate
> how long it would take to do that many transactions. Most systems will
> never see it...

Re: pg_dump, MVCC and consistency

From: Jeff Davis
Martijn van Oosterhout wrote:
> On Mon, Oct 24, 2005 at 11:25:00AM -0600, Michael Fuhr wrote:
>
>>PostgreSQL 8.1 makes checks to avoid data loss due to transaction
>>ID wraparound, but there's one situation I'm not sure how it handles:
>>when a transaction is so long-lived that it would appear to be in
>>the future of newly-created transactions due to wraparound.  I'd
>>have to dig into the source code to find out if that's possible,
>>and if so, what happens.  Maybe one of the developers will comment.
>
>
> To avoid this you need to do a VACUUM FULL over the database at least
> once every two billion transactions (not statements or tuples,
> transactions). To that end, the server begins complaining after one
> billion. I've never seen this in practice. Perhaps you could calculate
> how long it would take to do that many transactions. Most systems will
> never see it...
>
> Hope this helps,

Docs section 21.1.3 (in the 8.0 docs anyway; it seems to have moved to
22.1.3 in the 8.1 docs) makes no mention of VACUUM FULL; only a plain
VACUUM is required.

VACUUM FULL is much more expensive, because it actually moves tuples
on-disk rather than just marking them reusable. So it's fortunate that
a VACUUM FULL is not required. As far as I can tell, VACUUM FULL is
never required for normal database activity, even for a 24/7/365.25
system.

I think Michael Fuhr was asking something a little different. The same
section of the docs appears to clear it up a little:

"...the normal XID space is circular with no endpoint. Therefore, once a
row version has been created with a particular normal XID, the row
version will appear to be "in the past" for the next two billion
transactions,..."

So it seems that, in order for the wraparound to be a problem, the
transaction would have to last longer than 2 billion other transactions.

And if a transaction did last that long, according to the 8.1 docs (22.1.3):

"...the system will shut down and refuse to execute any new transactions
once there are fewer than 1 million transactions left until wraparound..."

The word that stands out in that sentence to me is "new". So,
presumably, it would continue that one long transaction indefinitely,
refusing new transactions. That isn't great, but it's hard to imagine
leaving a transaction open for 2 billion other transactions. And at
least it doesn't cause data loss.

Is there any easy way to see what transactions are currently open, how
old the XID is, and what PID is executing it?

And what about a transaction left open for 2PC? Does a transaction get a
new XID if it's PREPAREd now and COMMIT PREPAREd in a year?
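
(For clarity, the 8.1 two-phase commit sequence I have in mind is
roughly this; the table and the gid are just placeholders:)

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'xfer-2005-10-24';  -- survives disconnects and even a server restart
-- ...possibly much later, from any session:
COMMIT PREPARED 'xfer-2005-10-24';      -- or ROLLBACK PREPARED 'xfer-2005-10-24';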

Regards,
    Jeff Davis

Re: pg_dump, MVCC and consistency

From: Martijn van Oosterhout
On Mon, Oct 24, 2005 at 02:42:09PM -0700, Jeff Davis wrote:
> So it seems that, in order for the wraparound to be a problem, the
> transaction would have to last longer than 2 billion other transactions.
>
> And if a transaction did last that long, according to the 8.1 docs (22.1.3):
>
> "...the system will shut down and refuse to execute any new transactions
> once there are fewer than 1 million transactions left until wraparound..."

I've thought about this and it seems to me that if you have a
transaction open for the duration of 2 billion other transactions, you
likely have a real problem. If even a substantial fraction of those
transactions altered any data, your tables and indexes are going to be
incredibly bloated, because VACUUM can't remove anything newer than
the oldest open transaction. A billion uncleaned tuples translates to
tens of gigabytes of useless space that can't be reclaimed.

At 100 transactions per second, 2 billion transactions is about eight
months. That's a very long-running query (running on very out-of-date
data).

Actually, there's some discussion in the archives about a "buffer death
spiral" [1]. If you have more dead tuples in your update chains than
you have buffers, you'll completely thrash the system just trying to
access them. Nasty.

> Is there any easy way to see what transactions are currently open, how
> old the XID is, and what PID is executing it?

Well, the XID info can be found in pg_database, and the PID info in
pg_stat_activity, which also gives you the time the current query
started...
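
For the pg_stat_activity half, something like this (it needs the stats
collector running, and stats_command_string = on if you want to see the
query text):

-- backend PIDs and how long each one's current query has been running
SELECT procpid, usename, query_start,
       now() - query_start AS running_for,
       current_query
FROM pg_stat_activity
ORDER BY query_start;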

> And what about a transaction left open for 2PC? Does a transaction get a
> new XID if it's PREPAREd now and COMMIT PREPAREd in a year?

That I don't know. Perhaps that's why it stops a million before
overflow, so that important transactions like that, or a VACUUM, or
even just logging in, can still complete.

[1] http://archives.postgresql.org/pgsql-performance/2005-10/msg00227.php

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: pg_dump, MVCC and consistency

From: Tom Lane
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Mon, Oct 24, 2005 at 02:42:09PM -0700, Jeff Davis wrote:
>> And what about a transaction left open for 2PC? Does a transaction get a
>> new XID if it's PREPAREd now and COMMIT PREPAREd in a year?

> That I don't know.

A prepared transaction is still open for this purpose (and for purposes
such as holding locks, preventing VACUUM reclamation of tuples, etc).
In practice, leaving prepared transactions hanging around will be just
as socially unfriendly as opening a regular transaction and then sitting
on it.
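
(8.1 exposes any lingering prepared transactions in the
pg_prepared_xacts view, so a quick check might look like:)

SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;   -- oldest, i.e. most antisocial, first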

> Perhaps that's why it stops a million before overflow.

No, that's to give the DBA some breathing room for recovery.  He needs
enough remaining transaction IDs to do database-wide vacuums (which,
remember, eat an XID per table ... so it's conceivable that even a
million might not be enough ...)

            regards, tom lane