I have been reading the source code of the BgWriter, and there is some
code in BgBufferSync() that I don't fully understand.
In BgBufferSync(), we have the following code:
while (num_to_scan > 0 && reusable_buffers < upcoming_alloc_est)
{
int sync_state = SyncOneBuffer(next_to_clean, true,
wb_context);
if (++next_to_clean >= NBuffers)
{
next_to_clean = 0;
next_passes++;
}
num_to_scan--;
if (sync_state & BUF_WRITTEN)
{
reusable_buffers++;
if (++num_written >= bgwriter_lru_maxpages)
{
PendingBgWriterStats.maxwritten_clean++;
break;
}
}
else if (sync_state & BUF_REUSABLE)
reusable_buffers++;
}
In SyncOneBuffer(), we lock the bufHdr and then check if both the
refcount and usage_count are zero. If so, we mark the return value as
BUF_REUSABLE.
My understanding is that this means that the buffer could be reused
when am empty shared buffer is needed by a backend. However, in the
code above, we seem to track these in the reusable_buffers variable.
But that variable is always incremented when the buffer was written in
SyncOneBuffer() even though that buffer might have a non-zero refcount
or non-zero usage_count.
Initially, I thought this might be a bug, but then realized that the
reusable_buffers variable is used to make the BGWriter more or less
aggressive in reading pages, and now I'm unsure.
I discussed this with some of the PG committers at Microsoft, and we
all agree that the current code is a little confusing and it might
help if some more comments were added.
Does anybody know what the reusable_buffers variable really accounts
for, and whether the code as currently written is correct?
Thanks,
Marcel
--
Marcel van der Holst
mvdholst@gmail.com