Re: cache lookup failed when \d t concurrent with DML change column data type - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: cache lookup failed when \d t concurrent with DML change column data type
Date
Msg-id 02ae680f-d769-4b4c-b8a5-07b7cf6e3638@gmail.com
In response to Re: cache lookup failed when \d t concurrent with DML change column data type  (Andrei Lepikhov <lepihov@gmail.com>)
Responses Re: cache lookup failed when \d t concurrent with DML change column data type
List pgsql-hackers
On 10/25/24 10:05, Andrei Lepikhov wrote:
> On 10/24/24 22:30, jian he wrote:
>> hi. I think I found a bug.
>> PostgreSQL 18devel_debug_build_45188c2ea2 on x86_64-linux, compiled by
>> gcc-14.1.0, 64-bit
>> commit at 45188c2ea2.
>> Ubuntu 22.04.4 LTS
>>
>>
>> setup:
>> drop table t cascade;
>> create table t(a int PRIMARY key);
>>
>> IN session1:
>> step "change data type" {begin; alter table t alter column a set data
>> type int4;}
>> step "s1" {commit;}
>>
>> IN session2:
>> step "psql_another_session" {\d t}
>>
>> permutation "change data type" "psql_another_session" "s1"
> 
>> ERROR:  cache lookup failed for attribute 1 of relation 34418
> Yes, it looks like a bug existing for a long time, at least since PG11 
> (I didn't trace further down).
> It seems that the backend didn't apply invalidation messages before 
> touching system caches. Backtrace:
After a short investigation, I found the origin of the problem:
pg_get_indexdef receives an index OID and gets everything else it needs
just by looking up the syscaches. But it also wants to build a list of
the relation's column names as of a specific moment, so it opens the heap
relation. By the time that operation completes, the syscaches have already
been updated and the old index OID has been replaced with the new one.
It might make sense to lock the rows of the replaced index in pg_class
and pg_index until the transaction that altered it has committed. But,
because ALTER TABLE is not fully MVCC-safe, this may be expected (or
acceptable) behaviour.
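
To make the sequence above easier to follow, here is a rough two-session sketch in plain SQL. It reuses the table t from the original report. The query in session 2 is only my assumption of a stand-in for the index query that \d t issues, and the OID comparison in session 1 illustrates the explanation above rather than a verified trace.

```sql
-- Setup (from the original report).
DROP TABLE IF EXISTS t CASCADE;
CREATE TABLE t (a int PRIMARY KEY);

-- Session 1: note the index OID, then alter the column type
-- and leave the transaction open.
SELECT indexrelid AS index_oid, indexrelid::regclass AS index_name
  FROM pg_index
 WHERE indrelid = 't'::regclass;

BEGIN;
ALTER TABLE t ALTER COLUMN a SET DATA TYPE int4;

-- Session 2 (run while session 1 is still open): an assumed
-- stand-in for what \d t does. The index OID is taken from
-- pg_index and passed to pg_get_indexdef, which then opens
-- the heap relation.
SELECT pg_get_indexdef(indexrelid)
  FROM pg_index
 WHERE indrelid = 't'::regclass;

-- Session 1: finish the transaction.
COMMIT;

-- Session 1, afterwards: per the explanation above, the index
-- is expected to appear under a new OID.
SELECT indexrelid AS index_oid, indexrelid::regclass AS index_name
  FROM pg_index
 WHERE indrelid = 't'::regclass;
```

If the reasoning above is right, the statement in session 2 is where the "cache lookup failed for attribute 1 of relation ..." error from the report would be expected to show up, since it still carries the old index OID when the syscaches are refreshed.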

-- 
regards, Andrei Lepikhov



