Thread: Bug in point releases 9.3.6 and 9.2.10?
Heroku Postgres runs provisioning code that performs certain actions on roles when creating a new "fork" of an existing database. This often causes the new fork to be on the latest point release, where the database being forked was not. We want to create a new role when this happens, for various reasons. This occurs after recovery ends, but before the database has been "unfenced". The template code that generates various ALTER ROLE statements in our internal provisioning system - which has apparently worked just fine for a long time - is:

db.execute("ALTER ROLE #{old_database_user} RENAME TO #{database_user}")
db.execute("ALTER ROLE #{database_user} PASSWORD '#{database_password}' LOGIN")
db.execute("CREATE ROLE \"#{old_database_user}\" PASSWORD '#{old_database_password}' IN ROLE \"#{database_user}\" LOGIN")

I've seen multiple reports of apparent corruption, appearing as the resulting ALTER ROLE statements are executed:

PG::DataCorrupted: ERROR: could not read block 0 in file "global/12811": read only 0 of 8192 bytes

or:

PG::DataCorrupted: ERROR: could not read block 0 in file "global/12785": read only 0 of 8192 bytes

or:

PG::DataCorrupted: ERROR: could not read block 0 in file "global/12811": read only 0 of 8192 bytes

The only common factor is that this occurs on the latest point releases (either 9.3.6 or 9.2.10, at least so far). In all cases I've seen so far, the relation in question is the pg_auth_members heap relation. For example:

redacteddb=# select pg_relation_filenode(oid), oid, relname, relkind
from pg_class where pg_relation_filenode(oid) = 12785;
 pg_relation_filenode | oid  |     relname     | relkind
----------------------+------+-----------------+---------
                12785 | 1261 | pg_auth_members | r
(1 row)

Running "VACUUM FULL pg_auth_members;" has made the problem go away (to the extent that the above code doesn't trip up and everything is at least superficially okay) on at least one occasion. I'm currently investigating how consistently that works as a short-term remediation.

Theories?

--
Peter Geoghegan
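For concreteness, the template in the message above expands to statements of roughly this shape once interpolated (the role names and passwords here are placeholders, not real values):

  ALTER ROLE old_user_abc RENAME TO user_xyz;
  ALTER ROLE user_xyz PASSWORD 'new-password' LOGIN;
  CREATE ROLE "old_user_abc" PASSWORD 'old-password' IN ROLE "user_xyz" LOGIN;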
Hi,

On 2015-03-12 16:42:24 -0700, Peter Geoghegan wrote:
> We want to create a new role when this happens, for various reasons.
> This occurs after recovery ends, but before the database has been
> "unfenced". The template code that generates various ALTER ROLE
> statements in our internal provisioning system - which has apparently
> worked just fine for a long time - is:

Is this all the code that's executed after recovery? How are these forks brought up? Promoted how? Is it a common 'source' database?

> db.execute("ALTER ROLE #{old_database_user} RENAME TO #{database_user}")
> db.execute("ALTER ROLE #{database_user} PASSWORD '#{database_password}' LOGIN")
> db.execute("CREATE ROLE \"#{old_database_user}\" PASSWORD '#{old_database_password}' IN ROLE \"#{database_user}\" LOGIN")
>
> I've seen multiple reports of apparent corruption, appearing as the
> resulting ALTER ROLE statements are executed:
>
> PG::DataCorrupted: ERROR: could not read block 0 in file
> "global/12811": read only 0 of 8192 bytes
> or:
> PG::DataCorrupted: ERROR: could not read block 0 in file
> "global/12785": read only 0 of 8192 bytes
> or:
> PG::DataCorrupted: ERROR: could not read block 0 in file
> "global/12811": read only 0 of 8192 bytes

Have you looked at these files? Are they indeed zero bytes when this error occurs?

Do you still have a base backup from the relevant time, so you could repeat the whole thing?

> The only common factor is that this occurs on the latest point
> releases (either 9.3.6 or 9.2.10, at least so far). In all cases I've
> seen so far, the relation in question is the pg_auth_members heap
> relation. For example:

Any chance that the new nodes also use a different kernel version or such?

> redacteddb=# select pg_relation_filenode(oid), oid, relname, relkind
> from pg_class where pg_relation_filenode(oid) = 12785;
>  pg_relation_filenode | oid  |     relname     | relkind
> ----------------------+------+-----------------+---------
>                 12785 | 1261 | pg_auth_members | r
> (1 row)

This filenode has got to be pg_auth_members' original one, given it's below FirstNormalObjectId. I get a lower value, but that's probably caused by having fewer collations and other data generated during initdb. That implies that the table hasn't ever been rewritten.

What's 12811?

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
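For any filenode that shows up in one of these errors, a lookup along the following lines answers both questions raised above - which relation the file belongs to, and whether its filenode is still the original one from initdb (FirstNormalObjectId is 16384). The value 12811 is just the filenode from the error message; substitute whatever the actual error reports:

  -- Map a filenode from a "global/NNNNN" error back to its relation, and
  -- check whether it is still below FirstNormalObjectId, i.e. never rewritten.
  SELECT c.oid, c.relname, c.relkind,
         pg_relation_filenode(c.oid) AS filenode,
         pg_relation_filenode(c.oid) < 16384 AS original_filenode
  FROM pg_class c
  WHERE pg_relation_filenode(c.oid) = 12811;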
In a hurry right now, so unfortunately I'll need to be brief for now.

On Thu, Mar 12, 2015 at 5:21 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2015-03-12 16:42:24 -0700, Peter Geoghegan wrote:
>> We want to create a new role when this happens, for various reasons.
>> This occurs after recovery ends, but before the database has been
>> "unfenced". The template code that generates various ALTER ROLE
>> statements in our internal provisioning system - which has apparently
>> worked just fine for a long time - is:
>
> Is this all the code that's executed after recovery? How are these
> forks brought up? Promoted how? Is it a common 'source' database?

We do PITR up to a recovery target. We're talking about the same issue occurring on entirely distinct customer databases, with entirely distinct major PG versions. I'm not sure what other code might have already been run at this point, but it won't have been much. As I said, the only common factor that I know of is all affected databases being on the latest point release.

> Have you looked at these files? Are they indeed zero bytes when this
> error occurs?

I think that they are indeed zero. I looked at that last week, when I didn't consider that this might be a more widespread issue. I'll check again later.

> Do you still have a base backup from the relevant time, so you could
> repeat the whole thing?

Yes.

>> The only common factor is that this occurs on the latest point
>> releases (either 9.3.6 or 9.2.10, at least so far). In all cases I've
>> seen so far, the relation in question is the pg_auth_members heap
>> relation. For example:
>
> Any chance that the new nodes also use a different kernel version or
> such?

They may differ, but that doesn't seem likely to be relevant, at least to me. This has happened something like 6 or 7 times already, starting late last week. I am unfamiliar with this provisioning code, so, as I mentioned, offhand I cannot be entirely sure that there isn't some other code run when the problem originally arises (that I should have included in my report). What I can tell you is that I saw the same error messages when I manually ran the statements generated by the above code within a transaction...until I ran "VACUUM FULL pg_auth_members;".

> This filenode has got to be pg_auth_members' original one, given it's below
> FirstNormalObjectId. I get a lower value, but that's probably caused by
> having fewer collations and other data generated during initdb. That
> implies that the table hasn't ever been rewritten.
>
> What's 12811?

It's the same catalog, pg_auth_members. As I said, the messages you saw are from entirely different customer databases, servers, and (sometimes) PG versions.

--
Peter Geoghegan
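As an aside, the zero-byte question can also be checked from SQL alone, without shell access to the instance - a sketch along these lines (superuser required; pg_relation_filepath() returns a path relative to the data directory, which pg_stat_file() accepts):

  -- Report the on-disk path and size of the pg_auth_members heap file.
  SELECT pg_relation_filepath('pg_auth_members'::regclass) AS path,
         (pg_stat_file(pg_relation_filepath('pg_auth_members'::regclass))).size AS size_bytes;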
On 2015-03-12 17:42:33 -0700, Peter Geoghegan wrote:
> On Thu, Mar 12, 2015 at 5:21 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2015-03-12 16:42:24 -0700, Peter Geoghegan wrote:
> >> We want to create a new role when this happens, for various reasons.
> >> This occurs after recovery ends, but before the database has been
> >> "unfenced". The template code that generates various ALTER ROLE
> >> statements in our internal provisioning system - which has apparently
> >> worked just fine for a long time - is:
> >
> > Is this all the code that's executed after recovery? How are these
> > forks brought up? Promoted how? Is it a common 'source' database?
>
> We do PITR up to a recovery target.

So, no hot standby enabled?

> I'm not sure what other code might have already been run at this
> point, but it won't have been much.

> > Have you looked at these files? Are they indeed zero bytes when this
> > error occurs?

> I think that they are indeed zero. I looked at that last week, when I
> didn't consider that this might be a more widespread issue. I'll check
> again later.

> > Any chance that the new nodes also use a different kernel version or
> > such?
>
> They may differ, but that doesn't seem likely to be relevant, at least
> to me.

There've been some issues with seek(END) sometimes returning the wrong length in the past. And I've seen a report that might indicate a similar bug has been reintroduced. That could certainly cause such an error.

> I am unfamiliar with this provisioning code, so, as I
> mentioned, offhand I cannot be entirely sure that there isn't some
> other code run when the problem originally arises (that I should have
> included in my report).

It's probably worthwhile to dig out what's happening.

> What I can tell you is that I saw the same error messages when I
> manually ran the statements generated by the above code within a
> transaction...until I ran "VACUUM FULL pg_auth_members;".

You can reproduce that problem? How easily? If you can, please produce a backtrace. It'll certainly be interesting to see whether that access is through an index or whatnot.

Greetings,

Andres Freund

--
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Mar 12, 2015 at 5:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> So, no hot standby enabled?

Right.

> There've been some issues with seek(END) sometimes returning the wrong
> length in the past. And I've seen a report that might indicate a similar
> bug has been reintroduced. That could certainly cause such an error.

Perhaps.

>> I am unfamiliar with this provisioning code, so, as I
>> mentioned, offhand I cannot be entirely sure that there isn't some
>> other code run when the problem originally arises (that I should have
>> included in my report).
>
> It's probably worthwhile to dig out what's happening.

I'll get to that after the backtrace.

>> What I can tell you is that I saw the same error messages when I
>> manually ran the statements generated by the above code within a
>> transaction...until I ran "VACUUM FULL pg_auth_members;".
>
> You can reproduce that problem? How easily? If you can, please produce a
> backtrace. It'll certainly be interesting to see whether that access is
> through an index or whatnot.

I suspect I can reproduce it quite easily. The next step should be to do a backtrace. I'll look at that tomorrow.

--
Peter Geoghegan
On Thu, Mar 12, 2015 at 5:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > Any chance that the new nodes also use a different kernel version or
>> > such?
>>
>> They may differ, but that doesn't seem likely to be relevant, at least
>> to me.
>
> There've been some issues with seek(END) sometimes returning the wrong
> length in the past. And I've seen a report that might indicate a similar
> bug has been reintroduced. That could certainly cause such an error.

It seems that I failed to take account of the fact that these cases all involved Ubuntu 14.04 LTS (that wasn't clear to me before, but I've now verified that it's the case), which is a small minority of the kernels our instances use. I've verified that the issue arises independently of point release, including on earlier point releases from about a year ago, so I was hasty in suggesting that there was a regression in the latest point releases.

I'm still going to get a backtrace here, because it seems reasonable to suppose that there is a Postgres bug even still - it may be that whatever differences are in the 14.04 kernel are enough to make a previously latent bug trip this code up. Plenty of people are using Ubuntu 14.04 LTS with Postgres without seeing this issue, or any other issue, and there is no other issue with the kernel that I know of. The issue continues to occur as new instances with this configuration are deployed.

I should also add that my temporary remediation ("VACUUM FULL pg_auth_members;") works consistently, as new cases start to crop up.

--
Peter Geoghegan
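For reference, the short-term remediation mentioned above amounts to the following sketch. Since pg_auth_members is a mapped shared catalog, its pg_class.relfilenode is always 0, so the rewrite shows up through pg_relation_filenode() instead:

  -- Force a rewrite of the catalog, then re-run the generated role statements.
  VACUUM FULL pg_auth_members;

  -- The catalog should now be backed by a fresh file with a new filenode.
  SELECT pg_relation_filenode('pg_auth_members'::regclass) AS new_filenode;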
On 03/17/2015 10:35 AM, Peter Geoghegan wrote:
> I'm still going to get a backtrace here, because it seems reasonable
> to suppose that there is a Postgres bug even still - it may be that
> whatever differences are in the 14.04 kernel are enough to make a
> previously latent bug trip this code up. Plenty of people are using
> Ubuntu 14.04 LTS with Postgres without seeing this issue, or any other
> issue, and there is no other issue with the kernel that I know of. The
> issue continues to occur as new instances with this configuration are
> deployed.

Yeah ... I think I have logins on something like 50 servers running 9.3.6 on 14.04, in replication, and haven't seen this issue. Which specific kernel version?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Well, when a database is first initdb'd that relation is in fact 0 bytes:
::***# select pg_relation_filenode(oid) from pg_class where relname = 'pg_auth_members';
┌──────────────────────┐
│ pg_relation_filenode │
├──────────────────────┤
│                12610 │
└──────────────────────┘
(1 row)
$ find . -name 12610 -ls
131772 0 -rw------- 1 stark stark 0 Mar 3 18:52 ./global/12610
So it's not surprising to find a 0-byte file there sometimes. The question is how the ALTER ROLE commands failed to extend it, or why they're expecting it to be non-zero when it is in fact empty.
Is it possible the permissions on the file are wrong for some reason? The symptoms would make sense if the directory had permissions that allowed unlinking (for VACUUM FULL) but the file didn't have "r" access for the postgres user, or SELinux rules blocked it from being read, or something like that.
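One way to probe that theory from inside the server (so the read happens as the postgres user, under the same file permissions and any SELinux policy) might be a sketch like the following; it should return 0 for an empty but readable file, and raise an error if the file can't be opened (superuser required):

  -- Try to read the pg_auth_members heap file through the server itself;
  -- a permissions problem should surface as an error here rather than a 0.
  SELECT octet_length(pg_read_binary_file(
           pg_relation_filepath('pg_auth_members'::regclass))) AS readable_bytes;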
--
greg