Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Date
Msg-id 20171003164820.lmohsuyf7hygafdq@alvherre.pgsql
List pgsql-hackers
So here's my attempt at an explanation for what is going on.  At one
point, we have this:

select lp, lp_flags, t_xmin, t_xmax, t_ctid, to_hex(t_infomask) as infomask,
to_hex(t_infomask2) as infomask2
from heap_page_items(get_raw_page('t', 0));
 lp | lp_flags | t_xmin | t_xmax | t_ctid | infomask | infomask2 
----+----------+--------+--------+--------+----------+-----------
  1 |        1 |      2 |      0 | (0,1)  | 902      | 3
  2 |        0 |        |        |        |          | 
  3 |        1 |      2 |  19928 | (0,4)  | 3142     | c003
  4 |        1 |  14662 |  19929 | (0,5)  | 3142     | c003
  5 |        1 |  14663 |  19931 | (0,6)  | 3142     | c003
  6 |        1 |  14664 |  19933 | (0,7)  | 3142     | c003
  7 |        1 |  14665 |      0 | (0,7)  | 2902     | 8003
(7 rows)

which shows a HOT-update chain, where the t_xmax values are multixacts.  Then a
vacuum freeze comes, and because those multixacts are below the multixact
freeze horizon, we get this:

select lp, lp_flags, t_xmin, t_xmax, t_ctid, to_hex(t_infomask) as infomask,
to_hex(t_infomask2) as infomask2
from heap_page_items(get_raw_page('t', 0));
 lp | lp_flags | t_xmin | t_xmax | t_ctid | infomask | infomask2 
----+----------+--------+--------+--------+----------+-----------
  1 |        1 |      2 |      0 | (0,1)  | 902      | 3
  2 |        0 |        |        |        |          | 
  3 |        1 |      2 |  14662 | (0,4)  | 2502     | c003
  4 |        1 |      2 |  14663 | (0,5)  | 2502     | c003
  5 |        1 |      2 |  14664 | (0,6)  | 2502     | c003
  6 |        1 |      2 |  14665 | (0,7)  | 2502     | c003
  7 |        1 |      2 |      0 | (0,7)  | 2902     | 8003
(7 rows)

where the xmin values have all been frozen, and the xmax values are now
regular Xids.  I think the HOT code that walks the chain fails to detect
these as chains, because each tuple's xmin no longer matches the xmax of
the tuple before it.  I modified the multixact freeze code, so that whenever the
update Xid is below the cutoff Xid, it's set to FrozenTransactionId,
since keeping the other value is invalid anyway (even though we have set
the HEAP_XMAX_COMMITTED flag).  But that still doesn't fix the problem;
as far as I can see, vacuum removes the root of the chain, not yet sure
why, and then things are just as corrupted as before.
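
To make the xmin/xmax mismatch concrete, here is a minimal Python sketch of
such a chain walk.  It is a simplified model and not the actual heapam code:
HeapItem, walk_hot_chain and the two dictionaries are hypothetical names, and
the update Xid carried by each multixact before the freeze is an assumption
inferred from the xmax values that appear after the freeze.

```python
from typing import Dict, List, NamedTuple

INVALID_XID = 0  # stands in for InvalidTransactionId


class HeapItem(NamedTuple):
    """Hypothetical, reduced view of one line pointer on the page."""
    xmin: int          # inserting transaction id
    update_xid: int    # updater xid (taken from the multixact when xmax is one)
    next_lp: int       # offset part of t_ctid
    hot_updated: bool  # HEAP_HOT_UPDATED bit from infomask2


def walk_hot_chain(items: Dict[int, HeapItem], start_lp: int) -> List[int]:
    """Follow t_ctid links, stopping when xmin does not match the prior updater."""
    reached = []
    prev_update_xid = INVALID_XID
    lp = start_lp
    while lp in items:
        item = items[lp]
        # A chain member must have been created by the previous member's updater.
        if prev_update_xid != INVALID_XID and item.xmin != prev_update_xid:
            break
        reached.append(lp)
        if not item.hot_updated:
            break  # end of the chain
        prev_update_xid = item.update_xid
        lp = item.next_lp
    return reached


# Values from the first listing; update_xid per multixact is inferred (assumption).
before_freeze = {
    3: HeapItem(2, 14662, 4, True),
    4: HeapItem(14662, 14663, 5, True),
    5: HeapItem(14663, 14664, 6, True),
    6: HeapItem(14664, 14665, 7, True),
    7: HeapItem(14665, INVALID_XID, 7, False),
}

# Values from the second listing: every xmin is now 2 (FrozenTransactionId).
after_freeze = {
    3: HeapItem(2, 14662, 4, True),
    4: HeapItem(2, 14663, 5, True),
    5: HeapItem(2, 14664, 6, True),
    6: HeapItem(2, 14665, 7, True),
    7: HeapItem(2, INVALID_XID, 7, False),
}

print(walk_hot_chain(before_freeze, 3))
print(walk_hot_chain(after_freeze, 3))
```

With the pre-freeze values the walk reaches every member from lp 3 through
lp 7.  With the post-freeze values it stops after lp 3, because lp 4 now has
xmin 2 rather than 14662.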

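For reading the infomask and infomask2 columns in the listings above, a small
decoder can help.  The sketch below is only an illustration: the function
names are hypothetical, and the bit values are meant to mirror
src/include/access/htup_details.h from the sources of that era, so treat them
as an assumption and check them against the header for your version.

```python
# Bit values assumed to match src/include/access/htup_details.h.
INFOMASK_FLAGS = {
    0x0001: "HEAP_HASNULL",
    0x0002: "HEAP_HASVARWIDTH",
    0x0004: "HEAP_HASEXTERNAL",
    0x0008: "HEAP_HASOID",
    0x0010: "HEAP_XMAX_KEYSHR_LOCK",
    0x0020: "HEAP_COMBOCID",
    0x0040: "HEAP_XMAX_EXCL_LOCK",
    0x0080: "HEAP_XMAX_LOCK_ONLY",
    0x0100: "HEAP_XMIN_COMMITTED",
    0x0200: "HEAP_XMIN_INVALID",
    0x0400: "HEAP_XMAX_COMMITTED",
    0x0800: "HEAP_XMAX_INVALID",
    0x1000: "HEAP_XMAX_IS_MULTI",
    0x2000: "HEAP_UPDATED",
    0x4000: "HEAP_MOVED_OFF",
    0x8000: "HEAP_MOVED_IN",
}

INFOMASK2_FLAGS = {
    0x2000: "HEAP_KEYS_UPDATED",
    0x4000: "HEAP_HOT_UPDATED",
    0x8000: "HEAP_ONLY_TUPLE",
}

HEAP_NATTS_MASK = 0x07FF  # low bits of infomask2 hold the attribute count


def decode_infomask(hex_value):
    """Return the names of the flags set in a hex t_infomask string."""
    bits = int(hex_value, 16)
    return [name for bit, name in sorted(INFOMASK_FLAGS.items()) if bits & bit]


def decode_infomask2(hex_value):
    """Return (attribute count, flag names) for a hex t_infomask2 string."""
    bits = int(hex_value, 16)
    flags = [name for bit, name in sorted(INFOMASK2_FLAGS.items()) if bits & bit]
    return bits & HEAP_NATTS_MASK, flags


for value in ("3142", "2502", "2902"):
    print(value, decode_infomask(value))
for value in ("c003", "8003"):
    print(value, decode_infomask2(value))
```

Under those assumed values, 3142 includes HEAP_XMAX_IS_MULTI, while 2502 has
HEAP_XMAX_COMMITTED instead, which agrees with the two listings; c003 marks a
tuple that is both heap-only and HOT-updated.
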
-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

