Home > mailing lists

pgstattuple "unexpected zero page" for gist and hash indexes - Mailing list pgsql-hackers

From	Nitin Motiani
Subject	pgstattuple "unexpected zero page" for gist and hash indexes
Date	September 16 10:36:00
Msg-id	CAH5HC95gT1J3dRYK4qEnaywG8RqjbwDdt04wuj8p39R=HukayA@mail.gmail.com Whole thread Raw
List	pgsql-hackers

Tree view

Hi Hackers,

$SUBJECT happens if we crash just after extending the index. Noah
Misch looked into it and came up with the steps to simulate this by
patching pgstattuple to extend each rel it visits:

====
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -254,6 +254,8 @@ pgstat_relation(Relation rel, FunctionCallInfo fcinfo)
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                 errmsg("cannot access temporary
tables of other sessions")));

+       ReleaseBuffer(ExtendBufferedRel(BMR_REL(rel), MAIN_FORKNUM,
NULL, EB_CLEAR_SIZE_CACHE));
+
        if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind) ||
====

Then, in the regression database, this reaches the error for each hash index and
each gist index:

====
CREATE EXTENSION pgstattuple;
CREATE FUNCTION pgstattuple_safe(IN reloid regclass) RETURNS text
LANGUAGE plpgsql AS $$
BEGIN
        RETURN pgstattuple($1);
EXCEPTION WHEN OTHERS THEN
        RETURN sqlerrm;
END
$$;
SELECT * from (
 SELECT pgstattuple_safe(oid), relkind, oid::regclass
 FROM pg_class
 WHERE pg_relation_filenode(oid) <> 0
)
WHERE pgstattuple_safe NOT LIKE '%supported%'
AND pgstattuple_safe NOT LIKE '(%'
ORDER BY 1;
====

There have been prior reports:
2016-09 https://www.postgresql.org/message-id/CAA4eK1%2BzT16bxVpF3J9UsiZxExCqFa6QxzUfazmJthW6dO78ww%40mail.gmail.com
2016-09 https://www.postgresql.org/message-id/CAE9k0PnCaBkMgsDGuuPnPPTrQUc%3Dy9NiQvvsFFQkDNGcjYSajg%40mail.gmail.com
2021-02 docs https://www.postgresql.org/message-id/161280049644.664.3414087059452030397@wrigleys.postgresql.org


For hash indexes, this error comes from pgstat_hash_page() ->
_hash_getbuf_with_strategy() -> _hash_checkpage() for a zero page.
Similarly gistcheckpage() is the cause for gist indexes.

Since after a crash recovery a zero page is normal (as long as not
referenced from the rest of the index), emitting
ERRCODE_INDEX_CORRUPTED is a false alarm.

To fix this, we propose that we remove the checks from
pgstat_hash_page() and pgstat_gist_page(). This is something which
pgstat_btree_page() already does. It uses a lower-level function
ReadBufferExtended() and avoids using _bt_getbuf() which would check
for the "unexpected zero page".

I'm attaching a patch for the above which does the following :

1. Replaces _hash_getbuf_with_strategy() with ReadBufferExtended() in
pgstat_hash_page(). Then for non-PageIsNew() pages, we explicitly
check the PageGetSpecialSize().

2. Similarly gistcheckpage() is removed in pgstat_gist_page(). And for
non-PageIsNew(), we explicitly check the PageGetSpecialSize() before
checking if it's a leaf page.

I confirmed that this was working by running the above mentioned
simulation steps along with the patch.

Note that _hash_getbuf_with_strategy() also checks if blkno is P_NEW.
But that check seems unnecessary here as P_NEW is the same as
InvalidBlockNo and pgstat_hash_page() is called by pgstat_index() for
the block nos starting from HASH_METAPAGE + 1 and remains <
RelationGetNumberOfBlocks().

An alternative might be that we still call the check functions for the
non-PageIsNew() cases. Following were Noah Misch's comments on these
options :

I mildly lean toward not calling these *check*page functions, for two
reasons.  First, pgstattuple() is not a general integrity checker.  It
should try its best to count things, and it
shouldn't go out of its way to report corruption.  Second, pgstattuple doesn't
depend on access-method-specific index integrity.  pgstat_gist_page() is a thin
wrapper around the generic pgstat_index_page().  pgstat_hash_page() does
slightly more, but a page failing the _hash_checkpage() checks still wouldn't
elicit intolerable behavior from pgstat_hash_page().

Let me know what you folks think of this.

Thanks & Regards,
Nitin Motiani
Google

Attachment

v1-0001-Fix-unexpected-zero-page-in-pgstattuple-for-hash-.patch

pgsql-hackers by date:

From: Dmitry Mityugov
Date: 16 September, 10:21:30
Subject: Re: --with-llvm on 32-bit platforms?

From: Fujii Masao
Date: 16 September, 11:04:18
Subject: Re: pg_restore --no-policies should not restore policies' comment

pgstattuple "unexpected zero page" for gist and hash indexes - Mailing list pgsql-hackers

Attachment

Previous

Next