Hello!
On Wed, Nov 26, 2025 at 7:34 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
> We ran into one more problem with the new test, evidenced by timeouts by
> buildfarm member prion. For CATCACHE_FORCE_RELEASE builds on two of the
> tests, we get a few invalidations of the catalog snapshot ahead of what
> we expect, and because we have an injection point to sleep there, those
> tests get stuck.
Oh, I missed that. The not-yet-pushed tests are probably affected too.
> Here's one possible fix. I had to take the attach operation on
> invalidate-catalog-snapshot-end to a new step of s1, instead of
> occurring in the setup block. I understand that this is because no step
> can run until the setup of all steps completes, so if one setup gets
> stuck, we're out of luck. And then, session s4 can do a conditional
> wakeup of session s1.
I tried moving the attach of invalidate-catalog-snapshot-end into
s1_start_upsert as its first command, but for some reason it did not
work the way I expected. Maybe I missed something.
> Patch attached. Thoughts?
The solution seems reasonable to me. Some other related ideas:
* replace the "select case when" with a function like
injection_points_wakeup_if_waiting to avoid a possible race between the
select and the wakeup (though AFAIK that race cannot happen in the
current case)
* introduce an injection_points function that switches between an
"ignore all runs, but still allow attach/detach" mode and "normal"
mode. The first command of setup would enter this "setup mode", and the
last one would switch back to normal.
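
To make the first idea concrete, here is a toy model in Python. It is
not PostgreSQL code and does not use the real injection_points module;
the class and all method names are made up for illustration. The only
point it shows is that doing the "is anyone waiting?" check and the
wakeup under a single lock leaves no window between the two:

```python
import threading


class InjectionPointRegistry:
    """Toy stand-in for injection point wait bookkeeping (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = {}  # point name -> Event of the sleeping session

    def wait(self, name):
        # A session reaching the point registers itself, then sleeps.
        event = threading.Event()
        with self._lock:
            self._waiters[name] = event
        event.wait()

    def is_waiting(self, name):
        # Separate check, as a "select case when" style caller would do.
        with self._lock:
            return name in self._waiters

    def wakeup(self, name):
        # Unconditional wakeup: in this model it raises KeyError when
        # nobody is waiting on the point.
        with self._lock:
            event = self._waiters.pop(name)
        event.set()

    def wakeup_if_waiting(self, name):
        # Check and removal happen under one lock acquisition, so the
        # waiter cannot appear or vanish in between.
        # Returns True if a waiter was woken, False otherwise.
        with self._lock:
            event = self._waiters.pop(name, None)
        if event is None:
            return False
        event.set()
        return True
```

With this shape, a caller that today does is_waiting() followed by
wakeup() would make a single wakeup_if_waiting() call instead and get a
boolean back, rather than an error, when there is nothing to wake.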
> Maybe there's some other way to go about this -- for instance I
> considered the idea of moving the injection point somewhere else from
> InvalidateCatalogSnapshot(). I don't have any ideas about that though,
> but I'm willing to listen if anybody has any.
AFAIU that is the only place it can go.
Best regards,
Mikhail.