Thread: Use read streams in CREATE DATABASE command when the strategy is wal_log
Use read streams in CREATE DATABASE command when the strategy is wal_log
From
Nazir Bilal Yavuz
Date:
Hi,
I am working on using read streams in the CREATE DATABASE command when the strategy is wal_log. RelationCopyStorageUsingBuffer() function is used in this context. This function reads source buffers then copies them to the destination buffers. I used read streams only when reading source buffers because the destination buffers are read by 'RBM_ZERO_AND_LOCK' option, so it is not important.
I created a ~6 GB table [1] and created a new database with the wal_log strategy using the database that table was created in as a template [2]. My benchmarking results are:
a. Timings:
patched:
12955.027 ms
12917.475 ms
13177.846 ms
12971.308 ms
13059.985 ms
master:
13156.375 ms
13054.071 ms
13151.607 ms
13152.633 ms
13160.538 ms
There is no difference in timings, the patched version is a tiny bit better but it is negligible. I actually expected the patched version to be better because there was no prefetching before, but the read stream API detects sequential access and disables prefetching.
b. strace:
patched:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
68.02 3.749359 2 1285782 pwrite64
18.54 1.021734 21 46730 preadv
9.49 0.522889 826 633 fdatasync
2.55 0.140339 59 2368 pwritev
1.14 0.062583 409 153 fsync
master:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
59.71 3.831542 2 1288365 pwrite64
29.84 1.914540 2 747936 pread64
7.90 0.506843 837 605 fdatasync
1.58 0.101575 54 1856 pwritev
0.75 0.048431 400 121 fsync
There are fewer (~1/16) read system calls in the patched version.
c. perf:
patched:
- 97.83% 1.13% postgres postgres [.] RelationCopyStorageUsingBuffer
- 97.83% RelationCopyStorageUsingBuffer
- 44.28% ReadBufferWithoutRelcache
+ 42.20% GetVictimBuffer
0.81% ZeroBuffer
+ 31.86% log_newpage_buffer
- 19.51% read_stream_next_buffer
- 17.92% WaitReadBuffers
+ 17.61% mdreadv
- 1.47% read_stream_start_pending_read
+ 1.46% StartReadBuffers
master:
- 97.68% 0.57% postgres postgres [.] RelationCopyStorageUsingBuffer
- RelationCopyStorageUsingBuffer
- 65.48% ReadBufferWithoutRelcache
+ 41.16% GetVictimBuffer
- 20.42% WaitReadBuffers
+ 19.90% mdreadv
+ 1.85% StartReadBuffer
0.75% ZeroBuffer
+ 30.82% log_newpage_buffer
Patched version spends less CPU time in read calls and more CPU time in other calls such as write.
There are three patch files attached. First two are optimization and adding a way to create a read stream object by using SMgrRelation, these are already proposed in the streaming I/O thread [3]. The third one is the actual patch file.
Any kind of feedback would be appreciated.
[1] CREATE TABLE t as select repeat('a', 100) || i || repeat('b', 500) as filler from generate_series(1, 9000000) as i;
[2] CREATE DATABASE test_1 STRATEGY 'wal_log' TEMPLATE test;
[3] https://www.postgresql.org/message-id/CAN55FZ1yGvCzCW_aufu83VimdEYHbG_zuOY3J9JL-nBptyJyKA%40mail.gmail.com
--
Regards,
Nazir Bilal Yavuz
Microsoft
I am working on using read streams in the CREATE DATABASE command when the strategy is wal_log. RelationCopyStorageUsingBuffer() function is used in this context. This function reads source buffers then copies them to the destination buffers. I used read streams only when reading source buffers because the destination buffers are read by 'RBM_ZERO_AND_LOCK' option, so it is not important.
I created a ~6 GB table [1] and created a new database with the wal_log strategy using the database that table was created in as a template [2]. My benchmarking results are:
a. Timings:
patched:
12955.027 ms
12917.475 ms
13177.846 ms
12971.308 ms
13059.985 ms
master:
13156.375 ms
13054.071 ms
13151.607 ms
13152.633 ms
13160.538 ms
There is no difference in timings, the patched version is a tiny bit better but it is negligible. I actually expected the patched version to be better because there was no prefetching before, but the read stream API detects sequential access and disables prefetching.
b. strace:
patched:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
68.02 3.749359 2 1285782 pwrite64
18.54 1.021734 21 46730 preadv
9.49 0.522889 826 633 fdatasync
2.55 0.140339 59 2368 pwritev
1.14 0.062583 409 153 fsync
master:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
59.71 3.831542 2 1288365 pwrite64
29.84 1.914540 2 747936 pread64
7.90 0.506843 837 605 fdatasync
1.58 0.101575 54 1856 pwritev
0.75 0.048431 400 121 fsync
There are fewer (~1/16) read system calls in the patched version.
c. perf:
patched:
- 97.83% 1.13% postgres postgres [.] RelationCopyStorageUsingBuffer
- 97.83% RelationCopyStorageUsingBuffer
- 44.28% ReadBufferWithoutRelcache
+ 42.20% GetVictimBuffer
0.81% ZeroBuffer
+ 31.86% log_newpage_buffer
- 19.51% read_stream_next_buffer
- 17.92% WaitReadBuffers
+ 17.61% mdreadv
- 1.47% read_stream_start_pending_read
+ 1.46% StartReadBuffers
master:
- 97.68% 0.57% postgres postgres [.] RelationCopyStorageUsingBuffer
- RelationCopyStorageUsingBuffer
- 65.48% ReadBufferWithoutRelcache
+ 41.16% GetVictimBuffer
- 20.42% WaitReadBuffers
+ 19.90% mdreadv
+ 1.85% StartReadBuffer
0.75% ZeroBuffer
+ 30.82% log_newpage_buffer
Patched version spends less CPU time in read calls and more CPU time in other calls such as write.
There are three patch files attached. First two are optimization and adding a way to create a read stream object by using SMgrRelation, these are already proposed in the streaming I/O thread [3]. The third one is the actual patch file.
Any kind of feedback would be appreciated.
[1] CREATE TABLE t as select repeat('a', 100) || i || repeat('b', 500) as filler from generate_series(1, 9000000) as i;
[2] CREATE DATABASE test_1 STRATEGY 'wal_log' TEMPLATE test;
[3] https://www.postgresql.org/message-id/CAN55FZ1yGvCzCW_aufu83VimdEYHbG_zuOY3J9JL-nBptyJyKA%40mail.gmail.com
--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment
On Tue, Apr 16, 2024 at 02:12:19PM +0300, Nazir Bilal Yavuz wrote: > I am working on using read streams in the CREATE DATABASE command when the > strategy is wal_log. RelationCopyStorageUsingBuffer() function is used in > this context. This function reads source buffers then copies them to the Please rebase. I applied this to 40126ac for review purposes. > a. Timings: > b. strace: > c. perf: Thanks for including those details. That's valuable confirmation. > Subject: [PATCH v1 1/3] Refactor PinBufferForBlock() to remove if checks about > persistence > > There are if checks in PinBufferForBlock() function to set persistence > of the relation and this function is called for the each block in the > relation. Instead of that, set persistence of the relation before > PinBufferForBlock() function. I tried with the following additional patch to see if PinBufferForBlock() ever gets invalid smgr_relpersistence: ==== --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -1098,6 +1098,11 @@ PinBufferForBlock(Relation rel, Assert(blockNum != P_NEW); + if (!(smgr_persistence == RELPERSISTENCE_TEMP || + smgr_persistence == RELPERSISTENCE_PERMANENT || + smgr_persistence == RELPERSISTENCE_UNLOGGED)) + elog(WARNING, "unexpected relpersistence %d", smgr_persistence); + if (smgr_persistence == RELPERSISTENCE_TEMP) { io_context = IOCONTEXT_NORMAL; ==== That still gets relpersistence==0 in various src/test/regress cases. I think the intent was to prevent that. If not, please add a comment about when relpersistence==0 is still allowed. > --- a/src/backend/storage/aio/read_stream.c > +++ b/src/backend/storage/aio/read_stream.c > @@ -549,7 +549,7 @@ read_stream_begin_relation(int flags, > { > stream->ios[i].op.rel = rel; > stream->ios[i].op.smgr = RelationGetSmgr(rel); > - stream->ios[i].op.smgr_persistence = 0; > + stream->ios[i].op.smgr_persistence = rel->rd_rel->relpersistence; Does the following comment in ReadBuffersOperation need an update? /* * The following members should be set by the caller. If only smgr is * provided without rel, then smgr_persistence can be set to override the * default assumption of RELPERSISTENCE_PERMANENT. */ > --- a/src/backend/storage/buffer/bufmgr.c > +++ b/src/backend/storage/buffer/bufmgr.c > +/* > + * Helper struct for read stream object used in > + * RelationCopyStorageUsingBuffer() function. > + */ > +struct copy_storage_using_buffer_read_stream_private > +{ > + BlockNumber blocknum; > + int64 last_block; > +}; Why is last_block an int64, not a BlockNumber? > @@ -4667,19 +4698,31 @@ RelationCopyStorageUsingBuffer(RelFileLocator srclocator, > /* Iterate over each block of the source relation file. */ > for (blkno = 0; blkno < nblocks; blkno++) > { > CHECK_FOR_INTERRUPTS(); > > /* Read block from source relation. */ > - srcBuf = ReadBufferWithoutRelcache(srclocator, forkNum, blkno, > - RBM_NORMAL, bstrategy_src, > - permanent); > + srcBuf = read_stream_next_buffer(src_stream, NULL); > LockBuffer(srcBuf, BUFFER_LOCK_SHARE); I think this should check for read_stream_next_buffer() returning InvalidBuffer. pg_prewarm doesn't, but the other callers do, and I think the other callers are a better model. LockBuffer() doesn't check the InvalidBuffer case, so let's avoid the style of using a read_stream_next_buffer() return value without checking. Thanks, nm
Re: Use read streams in CREATE DATABASE command when the strategy is wal_log
From
Nazir Bilal Yavuz
Date:
Hi, Thank you for the review! On Fri, 12 Jul 2024 at 02:52, Noah Misch <noah@leadboat.com> wrote: > > On Tue, Apr 16, 2024 at 02:12:19PM +0300, Nazir Bilal Yavuz wrote: > > I am working on using read streams in the CREATE DATABASE command when the > > strategy is wal_log. RelationCopyStorageUsingBuffer() function is used in > > this context. This function reads source buffers then copies them to the > > Please rebase. I applied this to 40126ac for review purposes. Rebased. > > Subject: [PATCH v1 1/3] Refactor PinBufferForBlock() to remove if checks about > > persistence > > > > There are if checks in PinBufferForBlock() function to set persistence > > of the relation and this function is called for the each block in the > > relation. Instead of that, set persistence of the relation before > > PinBufferForBlock() function. > > I tried with the following additional patch to see if PinBufferForBlock() ever > gets invalid smgr_relpersistence: > > ==== > --- a/src/backend/storage/buffer/bufmgr.c > +++ b/src/backend/storage/buffer/bufmgr.c > @@ -1098,6 +1098,11 @@ PinBufferForBlock(Relation rel, > > Assert(blockNum != P_NEW); > > + if (!(smgr_persistence == RELPERSISTENCE_TEMP || > + smgr_persistence == RELPERSISTENCE_PERMANENT || > + smgr_persistence == RELPERSISTENCE_UNLOGGED)) > + elog(WARNING, "unexpected relpersistence %d", smgr_persistence); > + > if (smgr_persistence == RELPERSISTENCE_TEMP) > { > io_context = IOCONTEXT_NORMAL; > ==== > > That still gets relpersistence==0 in various src/test/regress cases. I think > the intent was to prevent that. If not, please add a comment about when > relpersistence==0 is still allowed. I fixed it, it is caused by (mode == RBM_ZERO_AND_CLEANUP_LOCK | mode == RBM_ZERO_AND_LOCK) case in the ReadBuffer_common(). The persistence was not updated for this path before. I also added an assert check for this problem to PinBufferForBlock(). > > --- a/src/backend/storage/aio/read_stream.c > > +++ b/src/backend/storage/aio/read_stream.c > > @@ -549,7 +549,7 @@ read_stream_begin_relation(int flags, > > { > > stream->ios[i].op.rel = rel; > > stream->ios[i].op.smgr = RelationGetSmgr(rel); > > - stream->ios[i].op.smgr_persistence = 0; > > + stream->ios[i].op.smgr_persistence = rel->rd_rel->relpersistence; > > Does the following comment in ReadBuffersOperation need an update? > > /* > * The following members should be set by the caller. If only smgr is > * provided without rel, then smgr_persistence can be set to override the > * default assumption of RELPERSISTENCE_PERMANENT. > */ > I believe it does not need to be updated but I renamed 'ReadBuffersOperation.smgr_persistence' as 'ReadBuffersOperation.persistence'. So, this comment is updated as well. I think that rename suits better because persistence does not need to come from smgr, it could come from relation, too. Do you think it is a good idea? If it is, does it need a separate commit? > > --- a/src/backend/storage/buffer/bufmgr.c > > +++ b/src/backend/storage/buffer/bufmgr.c > > > +/* > > + * Helper struct for read stream object used in > > + * RelationCopyStorageUsingBuffer() function. > > + */ > > +struct copy_storage_using_buffer_read_stream_private > > +{ > > + BlockNumber blocknum; > > + int64 last_block; > > +}; > > Why is last_block an int64, not a BlockNumber? You are right, the type of last_block should be BlockNumber; done. I copied it from pg_prewarm_read_stream_private struct and I guess the same should be applied to it as well but it is not the topic of this thread, so I did not update it yet. > > @@ -4667,19 +4698,31 @@ RelationCopyStorageUsingBuffer(RelFileLocator srclocator, > > > /* Iterate over each block of the source relation file. */ > > for (blkno = 0; blkno < nblocks; blkno++) > > { > > CHECK_FOR_INTERRUPTS(); > > > > /* Read block from source relation. */ > > - srcBuf = ReadBufferWithoutRelcache(srclocator, forkNum, blkno, > > - RBM_NORMAL, bstrategy_src, > > - permanent); > > + srcBuf = read_stream_next_buffer(src_stream, NULL); > > LockBuffer(srcBuf, BUFFER_LOCK_SHARE); > > I think this should check for read_stream_next_buffer() returning > InvalidBuffer. pg_prewarm doesn't, but the other callers do, and I think the > other callers are a better model. LockBuffer() doesn't check the > InvalidBuffer case, so let's avoid the style of using a > read_stream_next_buffer() return value without checking. There is an assert in the LockBuffer which checks for the InvalidBuffer. If that is not enough, we may add an if check for InvalidBuffer but what should we do in this case? It should not happen, so erroring out may be a good idea. Updated patches are attached (without InvalidBuffer check for now). -- Regards, Nazir Bilal Yavuz Microsoft
Attachment
On Tue, Jul 16, 2024 at 02:11:20PM +0300, Nazir Bilal Yavuz wrote: > On Fri, 12 Jul 2024 at 02:52, Noah Misch <noah@leadboat.com> wrote: > > On Tue, Apr 16, 2024 at 02:12:19PM +0300, Nazir Bilal Yavuz wrote: > > > --- a/src/backend/storage/aio/read_stream.c > > > +++ b/src/backend/storage/aio/read_stream.c > > > @@ -549,7 +549,7 @@ read_stream_begin_relation(int flags, > > > { > > > stream->ios[i].op.rel = rel; > > > stream->ios[i].op.smgr = RelationGetSmgr(rel); > > > - stream->ios[i].op.smgr_persistence = 0; > > > + stream->ios[i].op.smgr_persistence = rel->rd_rel->relpersistence; > > > > Does the following comment in ReadBuffersOperation need an update? > > > > /* > > * The following members should be set by the caller. If only smgr is > > * provided without rel, then smgr_persistence can be set to override the > > * default assumption of RELPERSISTENCE_PERMANENT. > > */ > > I believe it does not need to be updated but I renamed > 'ReadBuffersOperation.smgr_persistence' as > 'ReadBuffersOperation.persistence'. So, this comment is updated as > well. I think that rename suits better because persistence does not > need to come from smgr, it could come from relation, too. Do you think > it is a good idea? If it is, does it need a separate commit? The rename is good. I think the comment implies "persistence" is unused when rel!=NULL. That implication is true before the patch but false after the patch. > > > @@ -4667,19 +4698,31 @@ RelationCopyStorageUsingBuffer(RelFileLocator srclocator, > > > > > /* Iterate over each block of the source relation file. */ > > > for (blkno = 0; blkno < nblocks; blkno++) > > > { > > > CHECK_FOR_INTERRUPTS(); > > > > > > /* Read block from source relation. */ > > > - srcBuf = ReadBufferWithoutRelcache(srclocator, forkNum, blkno, > > > - RBM_NORMAL, bstrategy_src, > > > - permanent); > > > + srcBuf = read_stream_next_buffer(src_stream, NULL); > > > LockBuffer(srcBuf, BUFFER_LOCK_SHARE); > > > > I think this should check for read_stream_next_buffer() returning > > InvalidBuffer. pg_prewarm doesn't, but the other callers do, and I think the > > other callers are a better model. LockBuffer() doesn't check the > > InvalidBuffer case, so let's avoid the style of using a > > read_stream_next_buffer() return value without checking. > > There is an assert in the LockBuffer which checks for the > InvalidBuffer. If that is not enough, we may add an if check for > InvalidBuffer but what should we do in this case? It should not > happen, so erroring out may be a good idea. I like this style from read_stream_reset(): while ((buffer = read_stream_next_buffer(stream, NULL)) != InvalidBuffer) { ... } That is, don't iterate over block numbers. Drain the stream until empty. If the stream returns a number of blocks higher or lower than we expected, we won't detect that, and that's okay. It's not a strong preference, so I'm open to arguments against that from you or others. A counterargument could be that read_stream_reset() doesn't know the buffer count, so it has no choice. The counterargument could say that callers knowing the block count should use the pg_prewarm() style, and others should use the read_stream_reset() style.
Re: Use read streams in CREATE DATABASE command when the strategy is wal_log
From
Nazir Bilal Yavuz
Date:
Hi, On Tue, 16 Jul 2024 at 15:19, Noah Misch <noah@leadboat.com> wrote: > > On Tue, Jul 16, 2024 at 02:11:20PM +0300, Nazir Bilal Yavuz wrote: > > On Fri, 12 Jul 2024 at 02:52, Noah Misch <noah@leadboat.com> wrote: > > > On Tue, Apr 16, 2024 at 02:12:19PM +0300, Nazir Bilal Yavuz wrote: > > > > --- a/src/backend/storage/aio/read_stream.c > > > > +++ b/src/backend/storage/aio/read_stream.c > > > > @@ -549,7 +549,7 @@ read_stream_begin_relation(int flags, > > > > { > > > > stream->ios[i].op.rel = rel; > > > > stream->ios[i].op.smgr = RelationGetSmgr(rel); > > > > - stream->ios[i].op.smgr_persistence = 0; > > > > + stream->ios[i].op.smgr_persistence = rel->rd_rel->relpersistence; > > > > > > Does the following comment in ReadBuffersOperation need an update? > > > > > > /* > > > * The following members should be set by the caller. If only smgr is > > > * provided without rel, then smgr_persistence can be set to override the > > > * default assumption of RELPERSISTENCE_PERMANENT. > > > */ > > > > I believe it does not need to be updated but I renamed > > 'ReadBuffersOperation.smgr_persistence' as > > 'ReadBuffersOperation.persistence'. So, this comment is updated as > > well. I think that rename suits better because persistence does not > > need to come from smgr, it could come from relation, too. Do you think > > it is a good idea? If it is, does it need a separate commit? > > The rename is good. I think the comment implies "persistence" is unused when > rel!=NULL. That implication is true before the patch but false after the > patch. What makes it false after the patch? I think the logic did not change. If there is rel, the value of persistence is obtained from 'rel->rd_rel->relpersistence'. If there is no rel, then smgr is used to obtain its value. > > > > @@ -4667,19 +4698,31 @@ RelationCopyStorageUsingBuffer(RelFileLocator srclocator, > > > > > > > /* Iterate over each block of the source relation file. */ > > > > for (blkno = 0; blkno < nblocks; blkno++) > > > > { > > > > CHECK_FOR_INTERRUPTS(); > > > > > > > > /* Read block from source relation. */ > > > > - srcBuf = ReadBufferWithoutRelcache(srclocator, forkNum, blkno, > > > > - RBM_NORMAL, bstrategy_src, > > > > - permanent); > > > > + srcBuf = read_stream_next_buffer(src_stream, NULL); > > > > LockBuffer(srcBuf, BUFFER_LOCK_SHARE); > > > > > > I think this should check for read_stream_next_buffer() returning > > > InvalidBuffer. pg_prewarm doesn't, but the other callers do, and I think the > > > other callers are a better model. LockBuffer() doesn't check the > > > InvalidBuffer case, so let's avoid the style of using a > > > read_stream_next_buffer() return value without checking. > > > > There is an assert in the LockBuffer which checks for the > > InvalidBuffer. If that is not enough, we may add an if check for > > InvalidBuffer but what should we do in this case? It should not > > happen, so erroring out may be a good idea. > > I like this style from read_stream_reset(): > > while ((buffer = read_stream_next_buffer(stream, NULL)) != InvalidBuffer) > { > ... > } > > That is, don't iterate over block numbers. Drain the stream until empty. If > the stream returns a number of blocks higher or lower than we expected, we > won't detect that, and that's okay. It's not a strong preference, so I'm open > to arguments against that from you or others. A counterargument could be that > read_stream_reset() doesn't know the buffer count, so it has no choice. The > counterargument could say that callers knowing the block count should use the > pg_prewarm() style, and others should use the read_stream_reset() style. I think what you said in the counter argument makes sense. Also, there is an 'Assert(read_stream_next_buffer(src_stream, NULL) == InvalidBuffer);' after the loop. Which means all the blocks in the stream are read and there is no block left. v3 is attached. The only change is 'read_stream.c' changes in the 0003 are moved to 0002. -- Regards, Nazir Bilal Yavuz Microsoft
Attachment
On Wed, Jul 17, 2024 at 12:22:49PM +0300, Nazir Bilal Yavuz wrote: > On Tue, 16 Jul 2024 at 15:19, Noah Misch <noah@leadboat.com> wrote: > > On Tue, Jul 16, 2024 at 02:11:20PM +0300, Nazir Bilal Yavuz wrote: > > > On Fri, 12 Jul 2024 at 02:52, Noah Misch <noah@leadboat.com> wrote: > > > > On Tue, Apr 16, 2024 at 02:12:19PM +0300, Nazir Bilal Yavuz wrote: > > > > > --- a/src/backend/storage/aio/read_stream.c > > > > > +++ b/src/backend/storage/aio/read_stream.c > > > > > @@ -549,7 +549,7 @@ read_stream_begin_relation(int flags, > > > > > { > > > > > stream->ios[i].op.rel = rel; > > > > > stream->ios[i].op.smgr = RelationGetSmgr(rel); > > > > > - stream->ios[i].op.smgr_persistence = 0; > > > > > + stream->ios[i].op.smgr_persistence = rel->rd_rel->relpersistence; > > > > > > > > Does the following comment in ReadBuffersOperation need an update? > > > > > > > > /* > > > > * The following members should be set by the caller. If only smgr is > > > > * provided without rel, then smgr_persistence can be set to override the > > > > * default assumption of RELPERSISTENCE_PERMANENT. > > > > */ > > > > > > I believe it does not need to be updated but I renamed > > > 'ReadBuffersOperation.smgr_persistence' as > > > 'ReadBuffersOperation.persistence'. So, this comment is updated as > > > well. I think that rename suits better because persistence does not > > > need to come from smgr, it could come from relation, too. Do you think > > > it is a good idea? If it is, does it need a separate commit? > > > > The rename is good. I think the comment implies "persistence" is unused when > > rel!=NULL. That implication is true before the patch but false after the > > patch. > > What makes it false after the patch? I think the logic did not change. > If there is rel, the value of persistence is obtained from > 'rel->rd_rel->relpersistence'. If there is no rel, then smgr is used > to obtain its value. First, the patch removes the "default assumption of RELPERSISTENCE_PERMANENT". It's now an assertion failure. The second point is about "If only smgr is provided without rel". Before the patch, the extern functions that take a ReadBuffersOperation argument examine smgr_persistence if and only if rel==NULL. That's consistent with the comment. After the patch, StartReadBuffersImpl() calling PinBufferForBlock() uses the field unconditionally. On that note, does WaitReadBuffers() still have a reason to calculate its persistence as follows, or should this patch make it "persistence = operation->persistence"? persistence = operation->rel ? operation->rel->rd_rel->relpersistence : RELPERSISTENCE_PERMANENT; > I think what you said in the counter argument makes sense. Okay.
Re: Use read streams in CREATE DATABASE command when the strategy is wal_log
From
Nazir Bilal Yavuz
Date:
Hi, On Wed, 17 Jul 2024 at 23:41, Noah Misch <noah@leadboat.com> wrote: > > On Wed, Jul 17, 2024 at 12:22:49PM +0300, Nazir Bilal Yavuz wrote: > > On Tue, 16 Jul 2024 at 15:19, Noah Misch <noah@leadboat.com> wrote: > > > On Tue, Jul 16, 2024 at 02:11:20PM +0300, Nazir Bilal Yavuz wrote: > > > > On Fri, 12 Jul 2024 at 02:52, Noah Misch <noah@leadboat.com> wrote: > > > > > On Tue, Apr 16, 2024 at 02:12:19PM +0300, Nazir Bilal Yavuz wrote: > > > > > > --- a/src/backend/storage/aio/read_stream.c > > > > > > +++ b/src/backend/storage/aio/read_stream.c > > > > > > @@ -549,7 +549,7 @@ read_stream_begin_relation(int flags, > > > > > > { > > > > > > stream->ios[i].op.rel = rel; > > > > > > stream->ios[i].op.smgr = RelationGetSmgr(rel); > > > > > > - stream->ios[i].op.smgr_persistence = 0; > > > > > > + stream->ios[i].op.smgr_persistence = rel->rd_rel->relpersistence; > > > > > > > > > > Does the following comment in ReadBuffersOperation need an update? > > > > > > > > > > /* > > > > > * The following members should be set by the caller. If only smgr is > > > > > * provided without rel, then smgr_persistence can be set to override the > > > > > * default assumption of RELPERSISTENCE_PERMANENT. > > > > > */ > > > > > > > > I believe it does not need to be updated but I renamed > > > > 'ReadBuffersOperation.smgr_persistence' as > > > > 'ReadBuffersOperation.persistence'. So, this comment is updated as > > > > well. I think that rename suits better because persistence does not > > > > need to come from smgr, it could come from relation, too. Do you think > > > > it is a good idea? If it is, does it need a separate commit? > > > > > > The rename is good. I think the comment implies "persistence" is unused when > > > rel!=NULL. That implication is true before the patch but false after the > > > patch. > > > > What makes it false after the patch? I think the logic did not change. > > If there is rel, the value of persistence is obtained from > > 'rel->rd_rel->relpersistence'. If there is no rel, then smgr is used > > to obtain its value. > > First, the patch removes the "default assumption of RELPERSISTENCE_PERMANENT". > It's now an assertion failure. > > The second point is about "If only smgr is provided without rel". Before the > patch, the extern functions that take a ReadBuffersOperation argument examine > smgr_persistence if and only if rel==NULL. That's consistent with the > comment. After the patch, StartReadBuffersImpl() calling PinBufferForBlock() > uses the field unconditionally. I see, thanks for the explanation. I removed that part of the comment. > > On that note, does WaitReadBuffers() still have a reason to calculate its > persistence as follows, or should this patch make it "persistence = > operation->persistence"? > > persistence = operation->rel > ? operation->rel->rd_rel->relpersistence > : RELPERSISTENCE_PERMANENT; Nice catch, I do not think it is needed now. WaitReadBuffers() is called only from ReadBuffer_common() and read_stream_next_buffer(). For the ReadBuffer_common(), persistence is calculated before calling WaitReadBuffers(). And for the read_stream_next_buffer(), it is calculated while creating a read stream object in the read_stream_begin_impl(). v4 is attached. -- Regards, Nazir Bilal Yavuz Microsoft
Attachment
On Thu, Jul 18, 2024 at 02:11:13PM +0300, Nazir Bilal Yavuz wrote: > v4 is attached. Removal of the PinBufferForBlock() comment about the "persistence = RELPERSISTENCE_PERMANENT" fallback started to feel like a loss. I looked for a way to re-add a comment about the fallback. https://coverage.postgresql.org/src/backend/storage/buffer/bufmgr.c.gcov.html shows no test coverage of that fallback, and I think the fallback is unreachable. Hence, I've removed the fallback in a separate commit. I've pushed that and your three patches. Thanks.
Re: Use read streams in CREATE DATABASE command when the strategy is wal_log
From
Nazir Bilal Yavuz
Date:
Hi, On Sat, 20 Jul 2024 at 14:27, Noah Misch <noah@leadboat.com> wrote: > > On Thu, Jul 18, 2024 at 02:11:13PM +0300, Nazir Bilal Yavuz wrote: > > v4 is attached. > > Removal of the PinBufferForBlock() comment about the "persistence = > RELPERSISTENCE_PERMANENT" fallback started to feel like a loss. I looked for > a way to re-add a comment about the fallback. > https://coverage.postgresql.org/src/backend/storage/buffer/bufmgr.c.gcov.html > shows no test coverage of that fallback, and I think the fallback is > unreachable. Hence, I've removed the fallback in a separate commit. I've > pushed that and your three patches. Thanks. Thanks for the separate commit and push! With the separate commit (e00c45f685), does it make sense to rename the smgr_persistence parameter of the ReadBuffer_common() to persistence? Because, ExtendBufferedRelTo() calls ReadBuffer_common() with rel's persistence now, not with smgr's persistence. -- Regards, Nazir Bilal Yavuz Microsoft
On Sat, Jul 20, 2024 at 03:01:31PM +0300, Nazir Bilal Yavuz wrote: > On Sat, 20 Jul 2024 at 14:27, Noah Misch <noah@leadboat.com> wrote: > > > > On Thu, Jul 18, 2024 at 02:11:13PM +0300, Nazir Bilal Yavuz wrote: > > > v4 is attached. > > > > Removal of the PinBufferForBlock() comment about the "persistence = > > RELPERSISTENCE_PERMANENT" fallback started to feel like a loss. I looked for > > a way to re-add a comment about the fallback. > > https://coverage.postgresql.org/src/backend/storage/buffer/bufmgr.c.gcov.html > > shows no test coverage of that fallback, and I think the fallback is > > unreachable. Hence, I've removed the fallback in a separate commit. I've > > pushed that and your three patches. Thanks. > > Thanks for the separate commit and push! > > With the separate commit (e00c45f685), does it make sense to rename > the smgr_persistence parameter of the ReadBuffer_common() to > persistence? Because, ExtendBufferedRelTo() calls ReadBuffer_common() > with rel's persistence now, not with smgr's persistence. BMR_REL() doesn't set relpersistence, so bmr.relpersistence is associated with bmr.smgr and is unset if bmr.rel is set. That is to say, bmr.relpersistence is an smgr_persistence. It could make sense to change ReadBuffer_common() to take a BufferManagerRelation instead of the three distinct arguments. On a different naming topic, my review missed that field name copy_storage_using_buffer_read_stream_private.last_block doesn't fit how the field is used. Code uses it like an nblocks. So let's either rename the field or change the code to use it as a last_block (e.g. initialize it to nblocks-1, not nblocks).
Re: Use read streams in CREATE DATABASE command when the strategy is wal_log
From
Nazir Bilal Yavuz
Date:
Hi, On Sat, 20 Jul 2024 at 21:14, Noah Misch <noah@leadboat.com> wrote: > > On Sat, Jul 20, 2024 at 03:01:31PM +0300, Nazir Bilal Yavuz wrote: > > > > With the separate commit (e00c45f685), does it make sense to rename > > the smgr_persistence parameter of the ReadBuffer_common() to > > persistence? Because, ExtendBufferedRelTo() calls ReadBuffer_common() > > with rel's persistence now, not with smgr's persistence. > > BMR_REL() doesn't set relpersistence, so bmr.relpersistence is associated with > bmr.smgr and is unset if bmr.rel is set. That is to say, bmr.relpersistence > is an smgr_persistence. It could make sense to change ReadBuffer_common() to > take a BufferManagerRelation instead of the three distinct arguments. Got it. > > On a different naming topic, my review missed that field name > copy_storage_using_buffer_read_stream_private.last_block doesn't fit how the > field is used. Code uses it like an nblocks. So let's either rename the > field or change the code to use it as a last_block (e.g. initialize it to > nblocks-1, not nblocks). I prefered renaming it as nblocks, since that is how it was used in RelationCopyStorageUsingBuffer() before. Also, I realized that instead of setting p.blocknum = 0; initializing blkno as 0 and using p.blocknum = blkno makes sense. Because, p.blocknum and blkno should always start with the same block number. The relevant patch is attached. -- Regards, Nazir Bilal Yavuz Microsoft
Attachment
On Mon, Jul 22, 2024 at 12:00:45PM +0300, Nazir Bilal Yavuz wrote: > On Sat, 20 Jul 2024 at 21:14, Noah Misch <noah@leadboat.com> wrote: > > On a different naming topic, my review missed that field name > > copy_storage_using_buffer_read_stream_private.last_block doesn't fit how the > > field is used. Code uses it like an nblocks. So let's either rename the > > field or change the code to use it as a last_block (e.g. initialize it to > > nblocks-1, not nblocks). > > I prefered renaming it as nblocks, since that is how it was used in > RelationCopyStorageUsingBuffer() before. Also, I realized that instead > of setting p.blocknum = 0; initializing blkno as 0 and using > p.blocknum = blkno makes sense. Because, p.blocknum and blkno should > always start with the same block number. The relevant patch is > attached. I felt the local variable change was not a clear improvement. It would have been fine for the original patch to do it in that style, but the style of the original patch was also fine. So I've pushed just the struct field rename.