Home > mailing lists

Re: Fix missing EvalPlanQual recheck for TID scans - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: Fix missing EvalPlanQual recheck for TID scans
Date	September 8 07:51:08
Msg-id	CAApHDvoHYhDn2UbwvCbuu_BXUkA2wf3RS+ZNxgzCHXkZ5_07FQ@mail.gmail.com Whole thread Raw
In response to	Fix missing EvalPlanQual recheck for TID scans ("Sophie Alpert" <pg@sophiebits.com>)
Responses	Re: Fix missing EvalPlanQual recheck for TID scans
List	pgsql-hackers

Tree view

On Mon, 8 Sept 2025 at 10:31, Sophie Alpert <pg@sophiebits.com> wrote:
> I learned that the TID Scan node does not properly implement EvalPlanQual recheck, which has already been noted in
commentadded in 2002 (commit 6799a6ca21). This does seem to have the potential to manifest in observable and surprising
wayswhen queries are filtering on `ctid`, like for optimistic concurrency control as in
https://stackoverflow.com/q/78487441.

Thanks for the report and the patch.

I'll summarise this issue, as you provided an external link which
might one day disappear:

drop table if exists test_optimistic_lock;
create table public.test_optimistic_lock (
    id bigserial primary key,
    name varchar,
    rest_count integer,
    update_user varchar,
    update_time timestamp without time zone,
    create_time timestamp without time zone
);

insert into public.test_optimistic_lock (name, rest_count,
update_time, create_time) values
('Product1', 1, current_timestamp, current_timestamp);

S1: begin;
S1: select ctid, rest_count from public.test_optimistic_lock where
name = 'Product1';

S2: begin;
S2: select ctid, rest_count from public.test_optimistic_lock where
name = 'Product1';

S1: update public.test_optimistic_lock set rest_count = rest_count -
1, update_user = 'A', update_time = current_timestamp where name =
'Product1' and ctid = '(0,1)';
S2: update public.test_optimistic_lock set rest_count = rest_count -
1, update_user = 'A', update_time = current_timestamp where name =
'Product1' and ctid = '(0,1)'; -- waits for S1

S1: commit:
S2: Gets UPDATE 1 when it should get UPDATE 0!!

I agree that this is a bug and we should fix it.  The behaviour should
match what you'd get if you ran it with "SET enable_tidscan = 0;".
When that's done, the Seq Scan will have the ctid related expressions
within the qual and it will correctly filter out the non-matching
tuple.

> Attached is an attempt from me at fixing this issue for TID Scan as well as TID Range Scan which appears to have the
sameissue. 

I think this is generally pretty good. Here's a quick review:

1. This part is quite annoying:

+ if (node->tss_TidList == NULL)
+ TidListEval(node);

Of course, it's required since ExecReScanTidScan() wiped out that list.

Maybe we can add a boolean variable named tts_inScan to TidScanState
after tss_isCurrentOf (so as it consumes the existing padding bytes
and does not enlarge that struct), and have ExecReScanTidScan() set
that to false rather than wipe out the tss_TidList. Then just reuse
the existing list in TidRecheck(). That should save on the extra
overhead of having to rebuild the list each time there's an EPQ
recheck.

2. For TidRangeRecheck, I don't see why this part is needed:

+ if (!TidRangeEval(node))
+ return false;

The TID range is preserved already and shouldn't need to be recalculated.

3. In TidRecheck(), can you make "match" an "ItemPointer" and do:
match = (ItemPointer) bsearch(...);

4. Can you put this comment above the "if"?

+ if (node->tss_isCurrentOf)
+ /* WHERE CURRENT OF always intends to resolve to the latest tuple */
+ return true;

What you've got there doesn't meet the style we use. Technically there
should be braces because there are multiple lines (per our style (not
C)), but it's a bit needless to do that as the comment makes the same
amount of sense if it goes before the "if".

5. Can you make TidRangeRecheck() look like this?

static bool
TidRangeRecheck(TidRangeScanState *node, TupleTableSlot *slot)
{
    /* Recheck the ctid is still within range */
    if (ItemPointerCompare(&slot->tts_tid, &node->trss_mintid) < 0 ||
        ItemPointerCompare(&slot->tts_tid, &node->trss_maxtid) > 0)
        return false;

    return true;
}

The reason being that it's closer to how heap_getnextslot_tidrange()
does it, and I'd rather the conditions were kept as similar as
possible to that function.

6. For the tests. It should be ok to make the Tid range scan test do
ctid BETWEEN '(0,1)' AND '(0,1)'. I feel this is more aligned to the
TID Range Scan version of what you're doing in the TID Scan test.

I'll need to give the tests a closer inspection later.

David

pgsql-hackers by date:

From: Emmanuel Sibi
Date: 08 September, 07:45:49
Subject: [BUG] PostgreSQL crashes with ThreadSanitizer during early initialization

From: Dilip Kumar
Date: 08 September, 08:05:32
Subject: Re: plan shape work

Re: Fix missing EvalPlanQual recheck for TID scans - Mailing list pgsql-hackers

Previous

Next