Thread: Bug in point releases 9.3.6 and 9.2.10?
Heroku Postgres runs provisioning code that performs certain actions on roles when creating a new "fork" of an existing database. This often causes the new fork to be on the latest point release, where the database being forked was not. We want to create a new role when this happens, for various reasons. This occurs after recovery ends, but before the database has been "unfenced". The template code that generates various ALTER ROLE statements in our internal provisioning system - which has apparently worked just fine for a long time - is:

db.execute("ALTER ROLE #{old_database_user} RENAME TO #{database_user}")
db.execute("ALTER ROLE #{database_user} PASSWORD '#{database_password}' LOGIN")
db.execute("CREATE ROLE \"#{old_database_user}\" PASSWORD '#{old_database_password}' IN ROLE \"#{database_user}\" LOGIN")

I've seen multiple reports of apparent corruption, appearing as the resulting ALTER ROLE statements are executed:

PG::DataCorrupted: ERROR: could not read block 0 in file "global/12811": read only 0 of 8192 bytes

or:

PG::DataCorrupted: ERROR: could not read block 0 in file "global/12785": read only 0 of 8192 bytes

or:

PG::DataCorrupted: ERROR: could not read block 0 in file "global/12811": read only 0 of 8192 bytes

The only common factor is that this occurs on the latest point releases (either 9.3.6 or 9.2.10, at least so far). In all cases I've seen so far, the relation in question is the pg_auth_members heap relation. For example:

redacteddb=# select pg_relation_filenode(oid), oid, relname, relkind
from pg_class where pg_relation_filenode(oid) = 12785;
 pg_relation_filenode | oid  |     relname     | relkind
----------------------+------+-----------------+---------
                12785 | 1261 | pg_auth_members | r
(1 row)

Running "VACUUM FULL pg_auth_members;" has made the problem go away (to the extent that the above code doesn't trip up and everything is at least superficially okay) on at least one occasion. I'm currently investigating how consistently that works as a short-term remediation.

Theories?

--
Peter Geoghegan
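For concreteness, the template in the message above expands to statements of roughly this shape once interpolated (the role names and passwords here are placeholders, not real values):

  ALTER ROLE old_user_abc RENAME TO user_xyz;
  ALTER ROLE user_xyz PASSWORD 'new-password' LOGIN;
  CREATE ROLE "old_user_abc" PASSWORD 'old-password' IN ROLE "user_xyz" LOGIN;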
Hi,

On 2015-03-12 16:42:24 -0700, Peter Geoghegan wrote:
> We want to create a new role when this happens, for various reasons.
> This occurs after recovery ends, but before the database has been
> "unfenced". The template code that generates various ALTER ROLE
> statements in our internal provisioning system - which has apparently
> worked just fine for a long time - is:

Is this all the code that's executed after recovery? How are these forks brought up? Promoted how? Is it a common 'source' database?

> db.execute("ALTER ROLE #{old_database_user} RENAME TO #{database_user}")
> db.execute("ALTER ROLE #{database_user} PASSWORD '#{database_password}' LOGIN")
> db.execute("CREATE ROLE \"#{old_database_user}\" PASSWORD '#{old_database_password}' IN ROLE \"#{database_user}\" LOGIN")
>
> I've seen multiple reports of apparent corruption, appearing as the
> resulting ALTER ROLE statements are executed:
>
> PG::DataCorrupted: ERROR: could not read block 0 in file
> "global/12811": read only 0 of 8192 bytes
> or:
> PG::DataCorrupted: ERROR: could not read block 0 in file
> "global/12785": read only 0 of 8192 bytes
> or:
> PG::DataCorrupted: ERROR: could not read block 0 in file
> "global/12811": read only 0 of 8192 bytes

Have you looked at these files? Are they indeed zero bytes when this error occurs?

Do you still have a base backup from the relevant time, so you could repeat the whole thing?

> The only common factor is that this occurs on the latest point
> releases (either 9.3.6 or 9.2.10, at least so far). In all cases I've
> seen so far, the relation in question is the pg_auth_members heap
> relation. For example:

Any chance that the new nodes also use a different kernel version or such?

> redacteddb=# select pg_relation_filenode(oid), oid, relname, relkind
> from pg_class where pg_relation_filenode(oid) = 12785;
>  pg_relation_filenode | oid  |     relname     | relkind
> ----------------------+------+-----------------+---------
>                 12785 | 1261 | pg_auth_members | r
> (1 row)

This filenode has got to be pg_auth_members' original one, given it's below FirstNormalObjectId. I get a lower value, but that's probably caused by having fewer collations and other data generated during initdb. That implies that the table hasn't ever been rewritten.

What's 12811?

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
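For any filenode that shows up in one of these errors, a lookup along the following lines answers both questions raised above - which relation the file belongs to, and whether its filenode is still the original one from initdb (FirstNormalObjectId is 16384). The value 12811 is just the filenode from the error message; substitute whatever the actual error reports:

  -- Map a filenode from a "global/NNNNN" error back to its relation, and
  -- check whether it is still below FirstNormalObjectId, i.e. never rewritten.
  SELECT c.oid, c.relname, c.relkind,
         pg_relation_filenode(c.oid) AS filenode,
         pg_relation_filenode(c.oid) < 16384 AS original_filenode
  FROM pg_class c
  WHERE pg_relation_filenode(c.oid) = 12811;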
In a hurry right now, so unfortunately I'll need to be brief for now.

On Thu, Mar 12, 2015 at 5:21 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2015-03-12 16:42:24 -0700, Peter Geoghegan wrote:
>> We want to create a new role when this happens, for various reasons.
>> This occurs after recovery ends, but before the database has been
>> "unfenced". The template code that generates various ALTER ROLE
>> statements in our internal provisioning system - which has apparently
>> worked just fine for a long time - is:
>
> Is this all the code that's executed after recovery? How are these
> forks brought up? Promoted how? Is it a common 'source' database?

We do PITR up to a recovery target. We're talking about the same issue occurring on entirely distinct customer databases, with entirely distinct major PG versions. I'm not sure what other code might have already been run at this point, but it won't have been much. As I said, the only common factor that I know of is all affected databases being on the latest point release.

> Have you looked at these files? Are they indeed zero bytes when this
> error occurs?

I think that they are indeed zero. I looked at that last week, when I didn't consider that this might be a more widespread issue. I'll check again later.

> Do you still have a base backup from the relevant time, so you could
> repeat the whole thing?

Yes.

>> The only common factor is that this occurs on the latest point
>> releases (either 9.3.6 or 9.2.10, at least so far). In all cases I've
>> seen so far, the relation in question is the pg_auth_members heap
>> relation. For example:
>
> Any chance that the new nodes also use a different kernel version or
> such?

They may differ, but that doesn't seem likely to be relevant, at least to me. This has happened something like 6 or 7 times already, starting late last week. I am unfamiliar with this provisioning code, so, as I mentioned, offhand I cannot be entirely sure that there isn't some other code run when the problem originally arises (that I should have included in my report). What I can tell you is that I saw the same error messages when I manually ran the statements generated by the above code within a transaction...until I ran "VACUUM FULL pg_auth_members;".

> This filenode has got to be pg_auth_members' original one, given it's below
> FirstNormalObjectId. I get a lower value, but that's probably caused by
> having fewer collations and other data generated during initdb. That
> implies that the table hasn't ever been rewritten.
>
> What's 12811?

It's the same catalog, pg_auth_members. As I said, the messages you saw are from entirely different customer databases, servers, and (sometimes) PG versions.

--
Peter Geoghegan
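As an aside, the zero-byte question can also be checked from SQL alone, without shell access to the instance - a sketch along these lines (superuser required; pg_relation_filepath() returns a path relative to the data directory, which pg_stat_file() accepts):

  -- Report the on-disk path and size of the pg_auth_members heap file.
  SELECT pg_relation_filepath('pg_auth_members'::regclass) AS path,
         (pg_stat_file(pg_relation_filepath('pg_auth_members'::regclass))).size AS size_bytes;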
On 2015-03-12 17:42:33 -0700, Peter Geoghegan wrote:
> On Thu, Mar 12, 2015 at 5:21 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2015-03-12 16:42:24 -0700, Peter Geoghegan wrote:
> >> We want to create a new role when this happens, for various reasons.
> >> This occurs after recovery ends, but before the database has been
> >> "unfenced". The template code that generates various ALTER ROLE
> >> statements in our internal provisioning system - which has apparently
> >> worked just fine for a long time - is:
> >
> > Is this all the code that's executed after recovery? How are these
> > forks brought up? Promoted how? Is it a common 'source' database?
>
> We do PITR up to a recovery target.

So, no hot standby enabled?

> I'm not sure what other code might have already been run at this
> point, but it won't have been much.

> > Have you looked at these files? Are they indeed zero bytes when this
> > error occurs?

> I think that they are indeed zero. I looked at that last week, when I
> didn't consider that this might be a more widespread issue. I'll check
> again later.

> > Any chance that the new nodes also use a different kernel version or
> > such?
>
> They may differ, but that doesn't seem likely to be relevant, at least
> to me.

There've been some issues with seek(END) sometimes returning the wrong length in the past. And I've seen a report that might indicate a similar bug has been reintroduced. That could certainly cause such an error.

> I am unfamiliar with this provisioning code, so, as I
> mentioned, offhand I cannot be entirely sure that there isn't some
> other code run when the problem originally arises (that I should have
> included in my report).

It's probably worthwhile to dig out what's happening.

> What I can tell you is that I saw the same error messages when I
> manually ran the statements generated by the above code within a
> transaction...until I ran "VACUUM FULL pg_auth_members;".

You can reproduce that problem? How easily? If you can, please produce a backtrace. It'll certainly be interesting to see whether that access is through an index or whatnot.

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Mar 12, 2015 at 5:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> So, no hot standby enabled?

Right.

> There've been some issues with seek(END) sometimes returning the wrong
> length in the past. And I've seen a report that might indicate a similar
> bug has been reintroduced. That could certainly cause such an error.

Perhaps.

>> I am unfamiliar with this provisioning code, so, as I
>> mentioned, offhand I cannot be entirely sure that there isn't some
>> other code run when the problem originally arises (that I should have
>> included in my report).
>
> It's probably worthwhile to dig out what's happening.

I'll get to that after the backtrace.

>> What I can tell you is that I saw the same error messages when I
>> manually ran the statements generated by the above code within a
>> transaction...until I ran "VACUUM FULL pg_auth_members;".
>
> You can reproduce that problem? How easily? If you can, please produce a
> backtrace. It'll certainly be interesting to see whether that access is
> through an index or whatnot.

I suspect I can reproduce it quite easily. The next step should be to do a backtrace. I'll look at that tomorrow.

--
Peter Geoghegan
On Thu, Mar 12, 2015 at 5:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > Any chance that the new nodes also use a different kernel version or
>> > such?
>>
>> They may differ, but that doesn't seem likely to be relevant, at least
>> to me.
>
> There've been some issues with seek(END) sometimes returning the wrong
> length in the past. And I've seen a report that might indicate a similar
> bug has been reintroduced. That could certainly cause such an error.

It seems that I failed to take account of the fact that these cases all involved Ubuntu 14.04 LTS (that wasn't clear to me before, but I've now verified that it's the case), which is a small minority of the kernels our instances use. I've verified that the issue arises independently of point release, including on earlier point releases from about a year ago, so I was hasty in suggesting that there was a regression in the latest point releases.

I'm still going to get a backtrace here, because it seems reasonable to suppose that there is a Postgres bug even still - it may be that whatever differences are in the 14.04 kernel are enough to make a previously latent bug trip this code up. Plenty of people are using Ubuntu 14.04 LTS with Postgres without seeing this issue, or any other issue, and there is no other issue with the kernel that I know of. The issue continues to occur as new instances with this configuration are deployed.

I should also add that my temporary remediation ("VACUUM FULL pg_auth_members;") works consistently, as new cases start to crop up.

--
Peter Geoghegan
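For reference, the short-term remediation mentioned above amounts to the following sketch. Since pg_auth_members is a mapped shared catalog, its pg_class.relfilenode is always 0, so the rewrite shows up through pg_relation_filenode() instead:

  -- Force a rewrite of the catalog, then re-run the generated role statements.
  VACUUM FULL pg_auth_members;

  -- The catalog should now be backed by a fresh file with a new filenode.
  SELECT pg_relation_filenode('pg_auth_members'::regclass) AS new_filenode;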
On 03/17/2015 10:35 AM, Peter Geoghegan wrote:
> I'm still going to get a backtrace here, because it seems reasonable
> to suppose that there is a Postgres bug even still - it may be that
> whatever differences are in the 14.04 kernel are enough to make a
> previously latent bug trip this code up. Plenty of people are using
> Ubuntu 14.04 LTS with Postgres without seeing this issue, or any other
> issue, and there is no other issue with the kernel that I know of. The
> issue continues to occur as new instances with this configuration are
> deployed.

Yeah ... I think I have logins on something like 50 servers running 9.3.6 on 14.04, in replication, and haven't seen this issue. Which specific kernel version?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Well, when a database is first initdb'd that relation is in fact 0 bytes:
::***# select pg_relation_filenode(oid) from pg_class where relname = 'pg_auth_members';
┌──────────────────────┐
│ pg_relation_filenode │
├──────────────────────┤
│                12610 │
└──────────────────────┘
(1 row)
$ find . -name 12610 -ls
131772 0 -rw------- 1 stark stark 0 Mar 3 18:52 ./global/12610
So it's not surprising to find a 0-byte file there sometimes. The question is how the ALTER ROLE commands failed to extend it, or why they're expecting it to be non-zero when it is in fact empty.
Is it possible the permissions on the file are wrong for some reason? The symptoms would make sense if the directory had permissions that allowed unlinking (for VACUUM FULL) but the file didn't have "r" access for the postgres user, or SELinux rules blocked it from being read, or something like that.
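One way to probe that theory from inside the server (so the read happens as the postgres user, under the same file permissions and any SELinux policy) might be a sketch like the following; it should return 0 for an empty but readable file, and raise an error if the file can't be opened (superuser required):

  -- Try to read the pg_auth_members heap file through the server itself;
  -- a permissions problem should surface as an error here rather than a 0.
  SELECT octet_length(pg_read_binary_file(
           pg_relation_filepath('pg_auth_members'::regclass))) AS readable_bytes;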
--
greg