Hi all,
I learned that the TID Scan node does not properly implement EvalPlanQual recheck, which has already been noted in
commentadded in 2002 (commit 6799a6ca21). This does seem to have the potential to manifest in observable and surprising
wayswhen queries are filtering on `ctid`, like for optimistic concurrency control as in
https://stackoverflow.com/q/78487441.
Attached is an attempt from me at fixing this issue for TID Scan as well as TID Range Scan which appears to have the
sameissue.
This is my first time looking at the Postgres source (nor do I work with C on a regular basis), so I will not be
surprisedto hear that I've done things wrong, but I'm hopeful that this is usable as written. Running `meson test`
passesfor me, including the added isolation tests, and I confirmed that both of the tests that now result in a negative
recheckwere previously failing.
In my implementation I am calling TidRangeEval every time TidRangeRecheck is called; let me know if this is a
performanceconcern. It didn't seem that any of the existing TidRangeScanState flags are quite right to know if the
boundshave been initialized, but I am happy to add one.
The original issue appears to have been present since the introduction of TID scans in 1999 (commit 6f9ff92cc0) so I
imagineit may make sense to backport the fix, although evidently not many people are running into this.
Thanks,
Sophie