I tried to analyze the issue, and I found that it might be caused by this commit:
commit dad50f677c42de207168a3f08982ba23c9fc6720
bufmgr: Acquire and clean victim buffer separately
Thanks for looking into it!
...
With debug logging added in this code within ExtendBufferedRelLocal():

    if (found)
    {
        BufferDesc *existing_hdr = GetLocalBufferDescriptor(hresult->id);
        uint32      buf_state;

        buf_state = pg_atomic_read_u32(&existing_hdr->state);
        Assert(buf_state & BM_TAG_VALID);
        Assert(!(buf_state & BM_DIRTY));
        buf_state &= BM_VALID;
        pg_atomic_unlocked_write_u32(&existing_hdr->state, buf_state);
    ...

I see that this code is reached for the second INSERT (and NOSPC error) with existing_hdr->state == 0x2040000, but for the third INSERT I observe state == 0x0.
I wonder if "buf_state &= BM_VALID" is a typo here; maybe it is supposed to be "buf_state &= ~BM_VALID", as in ExtendBufferedRelShared()...
Yeah, that's true. I analyzed this issue again, and I think the root cause is the "buf_state &= BM_VALID" line.
In the issue I reported, buf_state & BM_VALID is true, but buf_state & BM_TAG_VALID is false. That state should be impossible:
the data in the local buffer pool can't be valid while LocalBufHash has no entry for it.
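To make the difference between the two masks concrete, here is a minimal standalone sketch. It is not PostgreSQL code: the flag bit positions are assumed to match buf_internals.h around the time of this commit, and the function names (mask_as_written, mask_as_suggested, demo) are hypothetical. It only shows what each expression does to the state value 0x2040000 reported above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed flag bits (believed to match buf_internals.h at the time of the
 * commit under discussion; treat them as illustrative, not authoritative).
 */
#define BM_VALID      (1U << 24)
#define BM_TAG_VALID  (1U << 25)

/* The expression as quoted: keeps ONLY the BM_VALID bit, drops all others. */
uint32_t
mask_as_written(uint32_t buf_state)
{
    return buf_state & BM_VALID;
}

/* The suggested fix: clears ONLY the BM_VALID bit, keeps all others. */
uint32_t
mask_as_suggested(uint32_t buf_state)
{
    return buf_state & ~BM_VALID;
}

/* Hypothetical check using the state value observed for the second INSERT. */
void
demo(void)
{
    uint32_t observed = 0x2040000U;     /* BM_TAG_VALID plus lower bits */

    /* As written, every bit is lost, matching the 0x0 seen later. */
    assert(mask_as_written(observed) == 0x0U);

    /* With the suggested fix, BM_TAG_VALID and the lower bits survive. */
    assert(mask_as_suggested(observed) == observed);
}
```

Under these assumptions, "&= BM_VALID" applied to 0x2040000 yields 0x0, which is consistent with the state observed for the third INSERT, while "&= ~BM_VALID" leaves the tag-valid bit in place.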
I modified the v1 patch; the attached v2 patch should fix the above issues.