Thread: parallelizing the archiver
Hi hackers, I'd like to gauge interest in parallelizing the archiver process. From a quick scan, I was only able to find one recent thread [0] that brought up this topic, and ISTM the conventional wisdom is to use a backup utility like pgBackRest that does things in parallel behind-the-scenes. My experience is that the generating-more-WAL-than-we-can-archive problem is pretty common, and parallelization seems to help quite a bit, so perhaps it's a good time to consider directly supporting parallel archiving in PostgreSQL. Based on previous threads I've seen, I believe many in the community would like to replace archive_command entirely, but what I'm proposing here would build on the existing tools. I'm currently thinking of something a bit like autovacuum_max_workers, but the archive workers would be created once and would follow a competing consumers model. Another approach I'm looking at is to use background worker processes, although I'm not sure if linking such a critical piece of functionality to max_worker_processes is a good idea. However, I do see that logical replication uses background workers. Anyway, I'm curious what folks think about this. I think it'd help simplify server administration for many users. Nathan [0] https://www.postgresql.org/message-id/flat/20180828060221.x33gokifqi3csjj4%40depesz.com
On Wed, Sep 8, 2021 at 6:36 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > I'd like to gauge interest in parallelizing the archiver process. > [...] > Based on previous threads I've seen, I believe many in the community > would like to replace archive_command entirely, but what I'm proposing > here would build on the existing tools. Having a new implementation that would remove the archive_command is probably a better long term solution, but I don't know of anyone working on that and it's probably gonna take some time. Right now we have a lot of users that face archiving bottlenecks, so I think it would be a good thing to implement parallel archiving, fully compatible with the current archive_command, as a short term solution.
On 9/7/21, 11:38 PM, "Julien Rouhaud" <rjuju123@gmail.com> wrote: > On Wed, Sep 8, 2021 at 6:36 AM Bossart, Nathan <bossartn@amazon.com> wrote: >> >> I'd like to gauge interest in parallelizing the archiver process. >> [...] >> Based on previous threads I've seen, I believe many in the community >> would like to replace archive_command entirely, but what I'm proposing >> here would build on the existing tools. > > Having a new implementation that would remove the archive_command is > probably a better long term solution, but I don't know of anyone > working on that and it's probably gonna take some time. Right now we > have a lot of users that face archiving bottleneck so I think it would > be a good thing to implement parallel archiving, fully compatible with > current archive_command, as a short term solution. Thanks for chiming in. I'm planning to work on a patch next week. Nathan
On Fri, Sep 10, 2021 at 6:30 AM Bossart, Nathan <bossartn@amazon.com> wrote: > > Thanks for chiming in. I'm planning to work on a patch next week. Great news! About the technical concerns: > I'm currently thinking of > something a bit like autovacuum_max_workers, but the archive workers > would be created once and would follow a competing consumers model. In this approach, would the launched archiver workers be kept as long as the instance is up, or should they be stopped if they're not required anymore, e.g. if there was a temporary write activity spike? I think we should make sure that at least one worker is always up. > Another approach I'm looking at is to use background worker processes, > although I'm not sure if linking such a critical piece of > functionality to max_worker_processes is a good idea. However, I do > see that logical replication uses background workers. I think that using background workers is a good approach, and the various GUCs in that area should allow users to properly configure archiving too. If that's not the case, it might be an opportunity to add some new infrastructure that could benefit all bgworker users.
> On 8 Sept. 2021, at 03:36, Bossart, Nathan <bossartn@amazon.com> wrote: > > Anyway, I'm curious what folks think about this. I think it'd help > simplify server administration for many users. BTW this thread is also related [0]. My 2 cents. It's OK if external tool is responsible for concurrency. Do we want this complexity in core? Many users do not enable archiving at all. Maybe just add parallelism API for external tool? It's much easier to control concurrency in external tool than in PostgreSQL core. Maintaining parallel worker is tremendously harder than spawning goroutine, thread, task or whatever. External tool needs to know when xlog segment is ready and needs to report when it's done. Postgres should just ensure that external archiver\restorer is running. For example external tool could read xlog names from stdin and report finished files from stdout. I can prototype such tool swiftly :) E.g. postgres runs ```wal-g wal-archiver``` and pushes ready segment filenames on stdin. And no more listing of archive_status and hacky algorithms to predict next WAL name and completion time! Thoughts? Best regards, Andrey Borodin. [0] https://www.postgresql.org/message-id/flat/CA%2BTgmobhAbs2yabTuTRkJTq_kkC80-%2Bjw%3DpfpypdOJ7%2BgAbQbw%40mail.gmail.com
On Fri, Sep 10, 2021 at 1:28 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > It's OK if external tool is responsible for concurrency. Do we want this complexity in core? Many users do not enable archiving at all. > Maybe just add parallelism API for external tool? > It's much easier to control concurrency in external tool than in PostgreSQL core. Maintaining parallel worker is tremendously harder than spawning goroutine, thread, task or whatever. Yes, but it also means that it's up to every single archiving tool to implement a somewhat hackish parallel version of an archive_command, hoping that core won't break it. If this problem is solved in postgres core without API change, then all existing tools will automatically benefit from it (maybe not the ones that already have hacks to make it parallel, though it seems easier to disable that than to implement it). > External tool needs to know when xlog segment is ready and needs to report when it's done. Postgres should just ensure that external archiver\restorer is running. > For example external tool could read xlog names from stdin and report finished files from stdout. I can prototype such tool swiftly :) > E.g. postgres runs ```wal-g wal-archiver``` and pushes ready segment filenames on stdin. And no more listing of archive_status and hacky algorithms to predict next WAL name and completion time! Yes, but that requires fundamental design changes for the archive commands, right? So while I agree it could be a better approach overall, it seems like a longer term option. As far as I understand, what Nathan suggested seems more likely to be achieved in pg15 and could benefit from a larger set of backup solutions. This would give us enough time to properly design a new archiving approach.
> On 10 Sept. 2021, at 10:52, Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Fri, Sep 10, 2021 at 1:28 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: >> >> It's OK if external tool is responsible for concurrency. Do we want this complexity in core? Many users do not enable archiving at all. >> Maybe just add parallelism API for external tool? >> It's much easier to control concurrency in external tool than in PostgreSQL core. Maintaining parallel worker is tremendously harder than spawning goroutine, thread, task or whatever. > > Yes, but it also means that it's up to every single archiving tool to > implement a somewhat hackish parallel version of an archive_command, > hoping that core won't break it. I'm not proposing to remove existing archive_command. Just deprecate its one-WAL-per-call form. > If this problem is solved in > postgres core without API change, then all existing tools will > automatically benefit from it (maybe not the ones that already have > hacks to make it parallel, though it seems easier to disable that > than to implement it). True, hacky tools can already coordinate a swarm of their processes and are prepared to be called multiple times concurrently :) >> External tool needs to know when xlog segment is ready and needs to report when it's done. Postgres should just ensure that external archiver\restorer is running. >> For example external tool could read xlog names from stdin and report finished files from stdout. I can prototype such tool swiftly :) >> E.g. postgres runs ```wal-g wal-archiver``` and pushes ready segment filenames on stdin. And no more listing of archive_status and hacky algorithms to predict next WAL name and completion time! > > Yes, but that requires fundamental design changes for the archive > commands, right? So while I agree it could be a better approach > overall, it seems like a longer term option. As far as I understand, > what Nathan suggested seems more likely to be achieved in pg15 and > could benefit from a larger set of backup solutions. This would give us > enough time to properly design a new archiving approach. It's a very simplistic approach. If some GUC is set - archiver will just feed ready files to stdin of archive command. What fundamental design changes do we need? Best regards, Andrey Borodin.
On Fri, Sep 10, 2021 at 2:03 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > > On 10 Sept. 2021, at 10:52, Julien Rouhaud <rjuju123@gmail.com> wrote: > > > > Yes, but it also means that it's up to every single archiving tool to > > implement a somewhat hackish parallel version of an archive_command, > > hoping that core won't break it. > I'm not proposing to remove existing archive_command. Just deprecate its one-WAL-per-call form. Which is a big API break. > It's a very simplistic approach. If some GUC is set - archiver will just feed ready files to stdin of archive command. What fundamental design changes do we need? I'm talking about the commands themselves. Your suggestion is to change archive_command to be able to spawn a daemon, and it looks like a totally different approach. I'm not saying that having a daemon-based approach to take care of archiving is a bad idea, I'm saying that trying to fit that with the current archive_command + some new GUC looks like a bad idea.
> On 10 Sept. 2021, at 11:11, Julien Rouhaud <rjuju123@gmail.com> wrote: > > On Fri, Sep 10, 2021 at 2:03 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: >> >>> On 10 Sept. 2021, at 10:52, Julien Rouhaud <rjuju123@gmail.com> wrote: >>> >>> Yes, but it also means that it's up to every single archiving tool to >>> implement a somewhat hackish parallel version of an archive_command, >>> hoping that core won't break it. >> I'm not proposing to remove existing archive_command. Just deprecate its one-WAL-per-call form. > > Which is a big API break. Huge extension, not a break. >> It's a very simplistic approach. If some GUC is set - archiver will just feed ready files to stdin of archive command. What fundamental design changes do we need? > > I'm talking about the commands themselves. Your suggestion is to > change archive_command to be able to spawn a daemon, and it looks like > a totally different approach. I'm not saying that having a daemon-based > approach to take care of archiving is a bad idea, I'm saying > that trying to fit that with the current archive_command + some new > GUC looks like a bad idea. It fits nicely, even in corner cases. E.g. restore_command run from pg_rewind seems compatible with this approach. One more example: after failover a DBA can just run ```ls | wal-g wal-push``` to archive all WALs left unarchived before the network partition. This is a simple yet powerful approach, without any contradiction to the existing archive_command API. Why is it a bad idea? Best regards, Andrey Borodin.
On Tue, Sep 7, 2021 at 6:36 PM Bossart, Nathan <bossartn@amazon.com> wrote: > Based on previous threads I've seen, I believe many in the community > would like to replace archive_command entirely, but what I'm proposing > here would build on the existing tools. I'm currently thinking of > something a bit like autovacuum_max_workers, but the archive workers > would be created once and would follow a competing consumers model. To me, it seems way more beneficial to think about being able to invoke archive_command with many files at a time instead of just one. I think for most plausible archive commands that would be way more efficient than what you propose here. It's *possible* that if we had that, we'd still want this, but I'm not even convinced. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Sep 10, 2021 at 9:13 PM Robert Haas <robertmhaas@gmail.com> wrote: > > To me, it seems way more beneficial to think about being able to > invoke archive_command with many files at a time instead of just one. > I think for most plausible archive commands that would be way more > efficient than what you propose here. It's *possible* that if we had > that, we'd still want this, but I'm not even convinced. Those approaches don't really seem mutually exclusive? In both cases you will need to internally track the status of each WAL file and handle non-contiguous file sequences. In the case of parallel commands you only need the additional knowledge that some command is already working on a file. Wouldn't it be even better to eventually be able to launch multiple batches of multiple files rather than a single batch? If we start with parallelism first, the whole ecosystem could immediately benefit from it as is. To be able to handle multiple files in a single command, we would need some way to let the server know which files were successfully archived and which files weren't, so it requires a different communication approach than the command return code. But as I said, I'm not convinced that using the archive_command approach for that is the best one. If I understand correctly, most of the backup solutions would prefer to have a daemon being launched and use it as a queuing system. Wouldn't it be better to have a new archive_mode, e.g. "daemon", and have postgres responsible for (re)starting it, and pass information through the daemon's stdin/stdout or something like that?
On Fri, Sep 10, 2021 at 10:19 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> Those approaches don't really seem mutually exclusive? In both cases you will need to internally track the status of each WAL file and handle non-contiguous file sequences. In the case of parallel commands you only need the additional knowledge that some command is already working on a file. Wouldn't it be even better to eventually be able to launch multiple batches of multiple files rather than a single batch?

Well, I guess I'm not convinced. Perhaps people with more knowledge of this than I may already know why it's beneficial, but in my experience commands like 'cp' and 'scp' are usually limited by the speed of I/O, not the fact that you only have one of them running at once. Running several at once, again in my experience, is typically not much faster. On the other hand, scp has a LOT of startup overhead, so it's easy to see the benefits of batching.

[rhaas pgsql]$ touch x y z
[rhaas pgsql]$ time sh -c 'scp x cthulhu: && scp y cthulhu: && scp z cthulhu:'
x 100% 207KB 78.8KB/s 00:02
y 100% 0 0.0KB/s 00:00
z 100% 0 0.0KB/s 00:00
real 0m9.418s
user 0m0.045s
sys 0m0.071s
[rhaas pgsql]$ time sh -c 'scp x y z cthulhu:'
x 100% 207KB 273.1KB/s 00:00
y 100% 0 0.0KB/s 00:00
z 100% 0 0.0KB/s 00:00
real 0m3.216s
user 0m0.017s
sys 0m0.020s

> If we start with parallelism first, the whole ecosystem could immediately benefit from it as is. To be able to handle multiple files in a single command, we would need some way to let the server know which files were successfully archived and which files weren't, so it requires a different communication approach than the command return code.

That is possibly true. I think it might work to just assume that you have to retry everything if it exits non-zero, but that requires the archive command to be smart enough to do something sensible if an identical file is already present in the archive.

> But as I said, I'm not convinced that using the archive_command approach for that is the best one. If I understand correctly, most of the backup solutions would prefer to have a daemon being launched and use it as a queuing system. Wouldn't it be better to have a new archive_mode, e.g. "daemon", and have postgres responsible for (re)starting it, and pass information through the daemon's stdin/stdout or something like that?

Sure. Actually, I think a background worker would be better than a separate daemon. Then it could just talk to shared memory directly.

-- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Sep 10, 2021 at 11:22 PM Robert Haas <robertmhaas@gmail.com> wrote: > > Well, I guess I'm not convinced. Perhaps people with more knowledge of > this than I may already know why it's beneficial, but in my experience > commands like 'cp' and 'scp' are usually limited by the speed of I/O, > not the fact that you only have one of them running at once. Running > several at once, again in my experience, is typically not much faster. > On the other hand, scp has a LOT of startup overhead, so it's easy to > see the benefits of batching. I totally agree that batching as many files as possible in a single command is probably what's gonna achieve the best performance. But if the archiver only gets an answer from the archive_command once it has tried to process all of the files, it also means that postgres won't be able to remove any WAL file until all of them could be processed. It means that users will likely have to limit the batch size and therefore pay more startup overhead than they would like. In the case of archiving on a server with high latency / connection overhead it may be better to be able to run multiple commands in parallel. I may be overthinking here, and feedback from people with more experience around that would definitely be welcome. > That is possibly true. I think it might work to just assume that you > have to retry everything if it exits non-zero, but that requires the > archive command to be smart enough to do something sensible if an > identical file is already present in the archive. Yes, it could be. I think that we need more feedback for that too. > Sure. Actually, I think a background worker would be better than a > separate daemon. Then it could just talk to shared memory directly. I thought about it too, but I was under the impression that most people would want to implement a custom daemon (or already have one) in some more parallel/thread-friendly language.
> On 10 Sept. 2021, at 19:19, Julien Rouhaud <rjuju123@gmail.com> wrote: > Wouldn't it be better to > have a new archive_mode, e.g. "daemon", and have postgres responsible > for (re)starting it, and pass information through the daemon's > stdin/stdout or something like that? We don't even need to introduce a new archive_mode. Currently archive_command has no expectations regarding stdin/stdout. Let's just say that we will push new WAL names to stdin until the archive_command exits. And if the archive_command prints something to stdout we will interpret it as archived WAL names. That's it. Existing archive_commands will continue as is. Currently, information about what is archived is stored on the filesystem in the archive_status dir. We do not need to change anything. If the archive_command exits (with any exit code) we will restart it if there are WAL files that still were not archived. Best regards, Andrey Borodin.
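To make the proposed protocol concrete, a minimal external archiver loop might look roughly like the sketch below: read one ready WAL file name per line from stdin, archive it, and print the name back on stdout once it is safely stored. This is only an illustration of the idea; archive_one_file() is a hypothetical placeholder for the real upload logic, and none of this comes from an actual patch.

```c
/*
 * Hypothetical external archiver speaking the protocol sketched above:
 * the server writes one ready WAL file name per line to our stdin, and we
 * write each name back to stdout once it has been durably archived.
 * archive_one_file() is a placeholder for the real upload logic.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool
archive_one_file(const char *walname)
{
    /* placeholder: copy/upload pg_wal/<walname> somewhere durable */
    return true;
}

int
main(void)
{
    char        line[1024];

    while (fgets(line, sizeof(line), stdin) != NULL)
    {
        /* strip the trailing newline */
        line[strcspn(line, "\r\n")] = '\0';
        if (line[0] == '\0')
            continue;

        if (!archive_one_file(line))
            return 1;       /* non-zero exit: server restarts us and retries */

        /* report completion so the server can mark the segment as done */
        printf("%s\n", line);
        fflush(stdout);
    }
    return 0;
}
```

Exiting non-zero on failure would fit the restart behavior described above: the server simply relaunches the tool and re-feeds whatever is still marked ready in archive_status.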
On 9/10/21, 8:22 AM, "Robert Haas" <robertmhaas@gmail.com> wrote: > On Fri, Sep 10, 2021 at 10:19 AM Julien Rouhaud <rjuju123@gmail.com> wrote: >> Those approaches don't really seem mutually exclusive? In both cases >> you will need to internally track the status of each WAL file and >> handle non-contiguous file sequences. In the case of parallel commands >> you only need the additional knowledge that some command is already >> working on a file. Wouldn't it be even better to eventually be able >> to launch multiple batches of multiple files rather than a single batch? > > Well, I guess I'm not convinced. Perhaps people with more knowledge of > this than I may already know why it's beneficial, but in my experience > commands like 'cp' and 'scp' are usually limited by the speed of I/O, > not the fact that you only have one of them running at once. Running > several at once, again in my experience, is typically not much faster. > On the other hand, scp has a LOT of startup overhead, so it's easy to > see the benefits of batching. > > [...] > >> If we start with parallelism first, the whole ecosystem could >> immediately benefit from it as is. To be able to handle multiple >> files in a single command, we would need some way to let the server >> know which files were successfully archived and which files weren't, >> so it requires a different communication approach than the command >> return code. > > That is possibly true. I think it might work to just assume that you > have to retry everything if it exits non-zero, but that requires the > archive command to be smart enough to do something sensible if an > identical file is already present in the archive. My initial thinking was similar to Julien's. Assuming I have an archive_command that handles one file, I can just set archive_max_workers to 3 and reap the benefits. If I'm using an existing utility that implements its own parallelism, I can keep archive_max_workers at 1 and continue using it. This would be a simple incremental improvement. That being said, I think the discussion about batching is a good one to have. If the overhead described in your SCP example is representative of a typical archive_command, then parallelism does seem a bit silly. We'd essentially be using a ton more resources when there's obvious room for improvement via reducing the amount of overhead per archive. I think we could easily make the batch size configurable so that existing archive commands would work (e.g., archive_batch_size=1). However, unlike the simple parallel approach, you'd likely have to adjust your archive_command if you wanted to make use of batching. That doesn't seem terrible to me, though. As discussed above, there are some implementation details to work out for archive failures, but nothing about that seems intractable to me. Plus, if you still wanted to parallelize things, feeding your archive_command several files at a time could still be helpful. I'm currently leaning toward exploring the batching approach first. I suppose we could always make a prototype of both solutions for comparison with some "typical" archive commands if that would help with the discussion. Nathan
On Fri, 2021-09-10 at 23:48 +0800, Julien Rouhaud wrote: > I totally agree that batching as many file as possible in a single > command is probably what's gonna achieve the best performance. But if > the archiver only gets an answer from the archive_command once it > tried to process all of the file, it also means that postgres won't be > able to remove any WAL file until all of them could be processed. It > means that users will likely have to limit the batch size and > therefore pay more startup overhead than they would like. In case of > archiving on server with high latency / connection overhead it may be > better to be able to run multiple commands in parallel. Well, users would also have to limit the parallelism, right? If connections are high-overhead, I wouldn't imagine that running hundreds of them simultaneously would work very well in practice. (The proof would be in an actual benchmark, obviously, but usually I would rather have one process handling a hundred items than a hundred processes handling one item each.) For a batching scheme, would it be that big a deal to wait for all of them to be archived before removal? > > That is possibly true. I think it might work to just assume that you > > have to retry everything if it exits non-zero, but that requires the > > archive command to be smart enough to do something sensible if an > > identical file is already present in the archive. > > Yes, it could be. I think that we need more feedback for that too. Seems like this is the sticking point. What would be the smartest thing for the command to do? If there's a destination file already, checksum it and make sure it matches the source before continuing? --Jacob
On Fri, Sep 10, 2021 at 11:49 AM Julien Rouhaud <rjuju123@gmail.com> wrote: > I totally agree that batching as many files as possible in a single > command is probably what's gonna achieve the best performance. But if > the archiver only gets an answer from the archive_command once it has > tried to process all of the files, it also means that postgres won't be > able to remove any WAL file until all of them could be processed. It > means that users will likely have to limit the batch size and > therefore pay more startup overhead than they would like. In the case of > archiving on a server with high latency / connection overhead it may be > better to be able to run multiple commands in parallel. I may be > overthinking here, and feedback from people with more > experience around that would definitely be welcome. That's a fair point. I'm not sure how much it matters, though. I think you want to imagine a system where there are let's say 10 WAL files being archived per second. Using fork() + exec() to spawn a shell command 10 times per second is a bit expensive, whether you do it serially or in parallel, and even if the command is something with a less-insane startup overhead than scp. If we start a shell command say every 3 seconds and give it 30 files each time, we can reduce the startup costs we're paying by ~97% at the price of having to wait up to 3 additional seconds to know that archiving succeeded for any particular file. That sounds like a pretty good trade-off, because the main benefit of removing old files is that it keeps us from running out of disk space, and you should not be running a busy system in such a way that it is ever within 3 seconds of running out of disk space, so whatever. If on the other hand you imagine a system that's not very busy, say 1 WAL file being archived every 10 seconds, then using a batch size of 30 would very significantly delay removal of old files. However, on this system, batching probably isn't really needed. The rate of WAL file generation is low enough that if you pay the startup cost of your archive_command for every file, you're probably still doing just fine. Probably, any kind of parallelism or batching needs to take this kind of time-based thinking into account. For batching, the rate at which files are generated should affect the batch size. For parallelism, it should affect the number of processes used. -- Robert Haas EDB: http://www.enterprisedb.com
On 9/10/21, 10:12 AM, "Robert Haas" <robertmhaas@gmail.com> wrote: > If on the other hand you imagine a system that's not very busy, say 1 > WAL file being archived every 10 seconds, then using a batch size of > 30 would very significantly delay removal of old files. However, on > this system, batching probably isn't really needed. The rate of WAL > file generation is low enough that if you pay the startup cost of your > archive_command for every file, you're probably still doing just fine. > > Probably, any kind of parallelism or batching needs to take this kind > of time-based thinking into account. For batching, the rate at which > files are generated should affect the batch size. For parallelism, it > should affect the number of processes used. I was thinking that archive_batch_size would be the maximum batch size. If the archiver only finds a single file to archive, that's all it'd send to the archive command. If it finds more, it'd send up to archive_batch_size to the command. Nathan
On Fri, Sep 10, 2021 at 1:07 PM Bossart, Nathan <bossartn@amazon.com> wrote: > That being said, I think the discussion about batching is a good one > to have. If the overhead described in your SCP example is > representative of a typical archive_command, then parallelism does > seem a bit silly. I think that's pretty realistic, because a lot of people's archive commands are going to actually be, or need to use, scp specifically. However, there are also cases where people are using commands that just put the file in some local directory (maybe on a remote mount point) and I would expect the startup overhead to be much less in those cases. Maybe people are archiving via HTTPS or similar as well, and then you again have some connection overhead though, I suspect, not as much as scp, since web pages do not take 3 seconds to get an https connection going. I don't know why scp is so crazy slow. Even in the relatively low-overhead cases, though, I think we would want to do some real testing to see if the benefits are as we expect. See http://postgr.es/m/20200420211018.w2qphw4yybcbxksl@alap3.anarazel.de and downthread for context. I was *convinced* that parallel backup was a win. Benchmarking was a tad underwhelming, but there was a clear if modest benefit by running a synthetic test of copying a lot of files serially or in parallel, with the files spread across multiple filesystems on the same physical box. However, when Andres modified my test program to use posix_fadvise(), posix_fallocate(), and sync_file_range() while doing the copies, the benefits of parallelism largely evaporated, and in fact in some cases enabling parallelism caused major regressions. In other words, the apparent benefits of parallelism were really due to suboptimal behaviors in the Linux page cache and some NUMA effects that were in fact avoidable. So I'm suspicious that the same things might end up being true here. It's not exactly the same, because the goal of WAL archiving is to keep up with the rate of WAL generation, and the goal of a backup is (unless max-rate is used) to finish as fast as possible, and that difference in goals might end up being significant. Also, you can make an argument that some people will benefit from a parallelism feature even if a perfectly-implemented archive_command doesn't, because many people use really terrible archive_commands. But all that said, I think the parallel backup discussion is still a cautionary tale to which some attention ought to be paid. > We'd essentially be using a ton more resources when > there's obvious room for improvement via reducing the amount of overhead > per archive. I think we could easily make the batch size configurable > so that existing archive commands would work (e.g., > archive_batch_size=1). However, unlike the simple parallel approach, > you'd likely have to adjust your archive_command if you wanted to make > use of batching. That doesn't seem terrible to me, though. As > discussed above, there are some implementation details to work out for > archive failures, but nothing about that seems intractable to me. > Plus, if you still wanted to parallelize things, feeding your > archive_command several files at a time could still be helpful. Yep. > I'm currently leaning toward exploring the batching approach first. I > suppose we could always make a prototype of both solutions for > comparison with some "typical" archive commands if that would help > with the discussion. Yeah, I think the concerns here are more pragmatic than philosophical, at least for me.
I had kind of been thinking that the way to attack this problem is to go straight to allowing for a background worker, because the other problem with archive_command is that running a shell command like cp, scp, or rsync is not really safe. It won't fsync your data, it might not fail if the file is in the archive already, and it definitely won't succeed without doing anything if there's a byte for byte identical file in the archive and fail if there's a file with different contents already in the archive. Fixing that stuff by running different shell commands is hard, but it wouldn't be that hard to do it in C code, and you could then also extend whatever code you wrote to do batching and parallelism; starting more workers isn't hard. However, I can't see the idea of running a shell command going away any time soon, in spite of its numerous and severe drawbacks. Such an interface provides a huge degree of flexibility and allows system admins to whack around behavior easily, which you don't get if you have to code every change in C. So I think command-based enhancements are fine to pursue also, even though I don't think it's the ideal place for most users to end up. -- Robert Haas EDB: http://www.enterprisedb.com
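As a rough illustration of what "doing it in C code" could buy here, the sketch below copies a WAL segment into the archive by writing to a temporary name, fsyncing the file, renaming it into place, and fsyncing the directory. Error cleanup and the check for a pre-existing identical file are omitted, and nothing here comes from an actual patch; it is only meant to show the kind of guarantees a shell cp does not give you.

```c
/*
 * Illustration only: the sort of "safe copy" a C implementation could do
 * that a plain cp/scp archive_command does not.  Write to a temporary
 * name, fsync the file, rename it into place, then fsync the directory so
 * the rename itself survives a crash.  Error cleanup and the check for a
 * pre-existing identical file in the archive are omitted.
 */
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <unistd.h>

static int
durable_copy(const char *src, const char *dst)
{
    char        tmp[4096];
    char        dstdir[4096];
    char        buf[65536];
    ssize_t     nread;
    int         in, out, dirfd;

    snprintf(tmp, sizeof(tmp), "%s.tmp", dst);

    if ((in = open(src, O_RDONLY)) < 0)
        return -1;
    if ((out = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
    {
        close(in);
        return -1;
    }

    while ((nread = read(in, buf, sizeof(buf))) > 0)
    {
        if (write(out, buf, nread) != nread)
        {
            nread = -1;
            break;
        }
    }

    /* make the file contents durable before exposing them under "dst" */
    if (nread < 0 || fsync(out) != 0)
    {
        close(in);
        close(out);
        return -1;
    }
    close(in);
    close(out);

    if (rename(tmp, dst) != 0)
        return -1;

    /* fsync the containing directory so the rename is durable too */
    snprintf(dstdir, sizeof(dstdir), "%s", dst);
    if ((dirfd = open(dirname(dstdir), O_RDONLY)) < 0)
        return -1;
    if (fsync(dirfd) != 0)
    {
        close(dirfd);
        return -1;
    }
    close(dirfd);
    return 0;
}
```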
> On 10 Sept. 2021, at 22:18, Bossart, Nathan <bossartn@amazon.com> wrote: > > I was thinking that archive_batch_size would be the maximum batch > size. If the archiver only finds a single file to archive, that's all > it'd send to the archive command. If it finds more, it'd send up to > archive_batch_size to the command. I think that the concept of a "batch" is misleading. If you pass filenames via stdin you don't need to know all names upfront. Just send more names to the pipe if the archive command is still running and one more segment just became available. This way the level of parallelism will adapt to the workload. Best regards, Andrey Borodin.
Greetings, * Julien Rouhaud (rjuju123@gmail.com) wrote: > On Fri, Sep 10, 2021 at 2:03 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > > On 10 Sept. 2021, at 10:52, Julien Rouhaud <rjuju123@gmail.com> wrote: > > > Yes, but it also means that it's up to every single archiving tool to > > > implement a somewhat hackish parallel version of an archive_command, > > > hoping that core won't break it. We've got too many archiving tools as it is, if you want my 2c on that. > > I'm not proposing to remove existing archive_command. Just deprecate its one-WAL-per-call form. > > Which is a big API break. We definitely need to stop being afraid of this. We completely changed around how restores work and made pretty much all of the backup/restore tools have to make serious changes when we released v12. I definitely don't think that we should be making assumptions that changing archive command to start running things in parallel isn't *also* an API break too, in any case. It is also a change and there's definitely a good chance that it'd break some of the archivers out there. If we're going to make a change here, let's make a sensible one. > > It's a very simplistic approach. If some GUC is set - archiver will just feed ready files to stdin of archive command. What fundamental design changes do we need? Haven't really thought about this proposal but it does sound interesting. Thanks, Stephen
On Wed, Sep 15, 2021 at 4:14 AM Stephen Frost <sfrost@snowman.net> wrote: > > > > I'm not proposing to remove existing archive_command. Just deprecate its one-WAL-per-call form. > > > > Which is a big API break. > > We definitely need to stop being afraid of this. We completely changed > around how restores work and made pretty much all of the backup/restore > tools have to make serious changes when we released v12. I never said that we should avoid an API break at all costs, I said that if we break the API we should introduce something better. The proposal to pass multiple file names to the archive command said nothing about how to tell which ones were successfully archived and which ones weren't, which is a big problem in my opinion. But I think we should also consider a different approach, such as maintaining some kind of daemon and asynchronously passing all the WAL file names, waiting for answers. Or maybe something else. It's just that simply "passing multiple file names" doesn't seem like a big enough win to justify an API break to me. > I definitely don't think that we should be making assumptions that > changing archive command to start running things in parallel isn't > *also* an API break too, in any case. It is also a change and there's > definitely a good chance that it'd break some of the archivers out > there. If we're going to make a change here, let's make a sensible one. But doing parallel archiving can and should be controlled with a GUC, so if your archive_command isn't compatible you can simply just not use it (on top of having a default of not using parallel archiving, at least for some time). It doesn't seem like a big problem.
On 9/10/21, 10:42 AM, "Robert Haas" <robertmhaas@gmail.com> wrote: > I had kind of been thinking that the way to attack this problem is to > go straight to allowing for a background worker, because the other > problem with archive_command is that running a shell command like cp, > scp, or rsync is not really safe. It won't fsync your data, it might > not fail if the file is in the archive already, and it definitely > won't succeed without doing anything if there's a byte for byte > identical file in the archive and fail if there's a file with > different contents already in the archive. Fixing that stuff by > running different shell commands is hard, but it wouldn't be that hard > to do it in C code, and you could then also extend whatever code you > wrote to do batching and parallelism; starting more workers isn't > hard. > > However, I can't see the idea of running a shell command going away > any time soon, in spite of its numerous and severe drawbacks. Such an > interface provides a huge degree of flexibility and allows system > admins to whack around behavior easily, which you don't get if you > have to code every change in C. So I think command-based enhancements > are fine to pursue also, even though I don't think it's the ideal > place for most users to end up. I've given this quite a bit of thought. I hacked together a batching approach for benchmarking, and it seemed to be a decent improvement, but you're still shelling out every N files, and all the stuff about shell commands not being ideal that you mentioned still applies. Perhaps it's still a good improvement, and maybe we should still do it, but I get the idea that many believe we can still do better. So, I looked into adding support for setting up archiving via an extension. The attached patch is a first try at adding alternatives for archive_command, restore_command, archive_cleanup_command, and recovery_end_command. It adds the GUCs archive_library, restore_library, archive_cleanup_library, and recovery_end_library. Each of these accepts a library name that is loaded at startup, similar to shared_preload_libraries. _PG_init() is still used for initialization, and you can use the same library for multiple purposes by checking the new exported variables (e.g., process_archive_library_in_progress). The library is then responsible for implementing the relevant function, such as _PG_archive() or _PG_restore(). The attached patch also demonstrates a simple implementation for an archive_library that is similar to the sample archive_command in the documentation. I tested the sample archive_command in the docs against the sample archive_library implementation in the patch, and I saw about a 50% speedup. (The archive_library actually syncs the files to disk, too.) This is similar to the improvement from batching. Of course, there are drawbacks to using an extension. Besides the obvious added complexity of building an extension in C versus writing a shell command, the patch disallows changing the libraries without restarting the server. Also, the patch makes no effort to simplify error handling, memory management, etc. This is left as an exercise for the extension author. I'm sure there are other ways to approach this, but I thought I'd give it a try to see what was possible and to get the conversation started. Nathan
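For readers who want a feel for what such an archive_library might look like, here is a minimal sketch. Only _PG_init(), the _PG_archive()/_PG_restore() entry points, and the process_archive_library_in_progress flag are taken from the description above; the exact _PG_archive() signature (WAL file name plus path, returning success) is an assumption made purely for illustration and may not match the actual patch.

```c
/*
 * Minimal sketch of an archive_library under the proposed interface.
 * _PG_init(), _PG_archive()/_PG_restore() and the
 * process_archive_library_in_progress flag come from the patch
 * description; the _PG_archive() signature below is an assumption.
 */
#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

void        _PG_init(void);
bool        _PG_archive(const char *file, const char *path);

void
_PG_init(void)
{
    /*
     * One library can serve several purposes; flags such as
     * process_archive_library_in_progress tell it which role it is
     * currently being loaded for, so any setup could branch on that.
     */
}

bool
_PG_archive(const char *file, const char *path)
{
    /*
     * "path" points at the WAL segment to archive (e.g. pg_wal/0000...).
     * Copy it somewhere durable, fsync it, and return true only once it
     * is safely stored; returning false would presumably make the
     * archiver retry later.
     */
    return true;
}
```

The real work of copying and fsyncing the segment, plus all error handling and memory management, would live inside _PG_archive(), which is exactly the "exercise for the extension author" mentioned above.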
On 9/29/21, 9:49 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote: > I'm sure there are other ways to approach this, but I thought I'd give > it a try to see what was possible and to get the conversation started. BTW I am also considering the background worker approach that was mentioned upthread. My current thinking is that the backup extension would define a special background worker that communicates with the archiver via shared memory. As noted upthread, this would enable extension authors to do whatever batching, parallelism, etc. that they want, and it should also prevent failures from taking down the archiver process. However, this approach might not make sense for things like recovery_end_command that are only executed once. Maybe it's okay to leave that one alone for now. Nathan
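If the background-worker route is taken, the extension's _PG_init() could register its worker through the existing bgworker machinery, roughly as sketched below. The library and function names are hypothetical, and the shared-memory handshake with the archiver is left out entirely.

```c
/*
 * Sketch of how an archiving extension could register such a background
 * worker from _PG_init() using the existing bgworker API.  The library
 * and entry-point names are made up, and the shared-memory handshake with
 * the archiver is not shown.
 */
#include "postgres.h"
#include "fmgr.h"
#include "postmaster/bgworker.h"

PG_MODULE_MAGIC;

void        _PG_init(void);

void
_PG_init(void)
{
    BackgroundWorker worker;

    memset(&worker, 0, sizeof(worker));
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
    worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
    worker.bgw_restart_time = 5;    /* seconds; restart it if it dies */
    snprintf(worker.bgw_library_name, BGW_MAXLEN, "my_archive_ext");
    snprintf(worker.bgw_function_name, BGW_MAXLEN, "my_archive_worker_main");
    snprintf(worker.bgw_name, BGW_MAXLEN, "archive worker");
    snprintf(worker.bgw_type, BGW_MAXLEN, "archive worker");

    /* static registration, so the extension must be preloaded at startup */
    RegisterBackgroundWorker(&worker);
}
```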
> On 30 Sept. 2021, at 09:47, Bossart, Nathan <bossartn@amazon.com> wrote: > > The attached patch is a first try at adding alternatives for > archive_command Looks like an interesting alternative design. > I tested the sample archive_command in the docs against the sample > archive_library implementation in the patch, and I saw about a 50% > speedup. (The archive_library actually syncs the files to disk, too.) > This is similar to the improvement from batching. Why test sample against sample? I think if one tests this against a real archive tool doing archive_status lookups and ready->done renaming, the results will be much different. > Of course, there are drawbacks to using an extension. Besides the > obvious added complexity of building an extension in C versus writing > a shell command, the patch disallows changing the libraries without > restarting the server. Also, the patch makes no effort to simplify > error handling, memory management, etc. This is left as an exercise > for the extension author. I think the real problem with an extension is quite different from the ones mentioned above. There are many archive tools that already feature parallel archiving: PgBackrest, wal-e, wal-g, pg_probackup, pghoard, pgbarman and others. These tools by far outweigh tools that don't look into archive_status to parallelize archival. And we are going to ask them: also add a C extension, without any feasible benefit to the user. You only get some restrictions, like a system restart to enable the shared library. I think we need a design that legalises already existing de-facto standard features in archive tools. Or even better - enables these tools to be more efficient, reliable etc. Either way we will create legacy code from scratch. Thanks! Best regards, Andrey Borodin.
On 10/1/21, 12:08 PM, "Andrey Borodin" <x4mmm@yandex-team.ru> wrote: > On 30 Sept. 2021, at 09:47, Bossart, Nathan <bossartn@amazon.com> wrote: >> I tested the sample archive_command in the docs against the sample >> archive_library implementation in the patch, and I saw about a 50% >> speedup. (The archive_library actually syncs the files to disk, too.) >> This is similar to the improvement from batching. > Why test sample against sample? I think if one tests this against a real archive tool doing archive_status lookups and ready->done renaming, the results will be much different. My intent was to demonstrate the impact of reducing the amount of overhead when archiving. I don't doubt that third party archive tools can show improvements by doing batching/parallelism behind the scenes. >> Of course, there are drawbacks to using an extension. Besides the >> obvious added complexity of building an extension in C versus writing >> a shell command, the patch disallows changing the libraries without >> restarting the server. Also, the patch makes no effort to simplify >> error handling, memory management, etc. This is left as an exercise >> for the extension author. > I think the real problem with an extension is quite different from the ones mentioned above. > There are many archive tools that already feature parallel archiving: PgBackrest, wal-e, wal-g, pg_probackup, pghoard, pgbarman and others. These tools by far outweigh tools that don't look into archive_status to parallelize archival. > And we are going to ask them: also add a C extension, without any feasible benefit to the user. You only get some restrictions, like a system restart to enable the shared library. > > I think we need a design that legalises already existing de-facto standard features in archive tools. Or even better - enables these tools to be more efficient, reliable etc. Either way we will create legacy code from scratch. My proposal wouldn't require any changes to any of these utilities. This design just adds a new mechanism that would allow end users to set up archiving a different way with less overhead in hopes that it will help them keep up. I suspect a lot of work has been put into the archive tools you mentioned to make sure they can keep up with high rates of WAL generation, so I'm skeptical that anything we do here will really benefit them all that much. Ideally, we'd do something that improves matters for everyone, though. I'm open to suggestions. Nathan
Greetings, * Bossart, Nathan (bossartn@amazon.com) wrote: > On 10/1/21, 12:08 PM, "Andrey Borodin" <x4mmm@yandex-team.ru> wrote: > > On 30 Sept. 2021, at 09:47, Bossart, Nathan <bossartn@amazon.com> wrote: > >> Of course, there are drawbacks to using an extension. Besides the > >> obvious added complexity of building an extension in C versus writing > >> a shell command, the patch disallows changing the libraries without > >> restarting the server. Also, the patch makes no effort to simplify > >> error handling, memory management, etc. This is left as an exercise > >> for the extension author. > > I think the real problem with an extension is quite different from the ones mentioned above. > > There are many archive tools that already feature parallel archiving: PgBackrest, wal-e, wal-g, pg_probackup, pghoard, pgbarman and others. These tools by far outweigh tools that don't look into archive_status to parallelize archival. > > And we are going to ask them: also add a C extension, without any feasible benefit to the user. You only get some restrictions, like a system restart to enable the shared library. > > > > I think we need a design that legalises already existing de-facto standard features in archive tools. Or even better - enables these tools to be more efficient, reliable etc. Either way we will create legacy code from scratch. > > My proposal wouldn't require any changes to any of these utilities. > This design just adds a new mechanism that would allow end users to > set up archiving a different way with less overhead in hopes that it > will help them keep up. I suspect a lot of work has been put into the > archive tools you mentioned to make sure they can keep up with high > rates of WAL generation, so I'm skeptical that anything we do here > will really benefit them all that much. Ideally, we'd do something > that improves matters for everyone, though. I'm open to suggestions. This is something we've contemplated quite a bit and the last thing that I'd want to have is a requirement to configure a whole bunch of additional parameters to enable this. Why do we need to have so many new GUCs? I would have thought we'd probably be able to get away with just having the appropriate hooks and then telling folks to load the extension in shared_preload_libraries.. As for the hooks themselves, I'd certainly hope that they'd be designed to handle batches of WAL rather than individual ones as that's long been one of the main issues with the existing archive command approach. I appreciate that maybe that's less of an issue with a shared library but it's still something to consider. Admittedly, I haven't looked in depth at this patch set and am just going off of the description of them provided in the thread, so perhaps I missed something. Thanks, Stephen
On 10/4/21, 7:21 PM, "Stephen Frost" <sfrost@snowman.net> wrote: > This has something we've contemplated quite a bit and the last thing > that I'd want to have is a requirement to configure a whole bunch of > additional parameters to enable this. Why do we need to have some many > new GUCs? I would have thought we'd probably be able to get away with > just having the appropriate hooks and then telling folks to load the > extension in shared_preload_libraries.. That would certainly simplify my patch quite a bit. I'll do it this way in the next revision. > As for the hooks themselves, I'd certainly hope that they'd be designed > to handle batches of WAL rather than individual ones as that's long been > one of the main issues with the existing archive command approach. I > appreciate that maybe that's less of an issue with a shared library but > it's still something to consider. Will do. This seems like it should be easier with the hook because we can provide a way to return which files were successfully archived. Nathan
Greetings, * Bossart, Nathan (bossartn@amazon.com) wrote: > On 10/4/21, 7:21 PM, "Stephen Frost" <sfrost@snowman.net> wrote: > > This has something we've contemplated quite a bit and the last thing > > that I'd want to have is a requirement to configure a whole bunch of > > additional parameters to enable this. Why do we need to have some many > > new GUCs? I would have thought we'd probably be able to get away with > > just having the appropriate hooks and then telling folks to load the > > extension in shared_preload_libraries.. > > That would certainly simplify my patch quite a bit. I'll do it this > way in the next revision. > > > As for the hooks themselves, I'd certainly hope that they'd be designed > > to handle batches of WAL rather than individual ones as that's long been > > one of the main issues with the existing archive command approach. I > > appreciate that maybe that's less of an issue with a shared library but > > it's still something to consider. > > Will do. This seems like it should be easier with the hook because we > can provide a way to return which files were successfully archived. It's also been discussed, at least around the water cooler (as it were in pandemic times- aka our internal slack channels..) that the existing archive command might be reimplemented as an extension using these. Not sure if that's really necessary but it was a thought. In any case, thanks for working on this! Stephen
On 10/4/21, 8:19 PM, "Stephen Frost" <sfrost@snowman.net> wrote: > It's also been discussed, at least around the water cooler (as it were > in pandemic times- aka our internal slack channels..) that the existing > archive command might be reimplemented as an extension using these. Not > sure if that's really necessary but it was a thought. In any case, > thanks for working on this! Interesting. I like the idea of having one code path for everything instead of branching for the hook and non-hook paths. Thanks for sharing your thoughts. Nathan
On Tue, Oct 5, 2021 at 5:32 AM Bossart, Nathan <bossartn@amazon.com> wrote:
> On 10/4/21, 8:19 PM, "Stephen Frost" <sfrost@snowman.net> wrote:
>> It's also been discussed, at least around the water cooler (as it were
>> in pandemic times- aka our internal slack channels..) that the existing
>> archive command might be reimplemented as an extension using these. Not
>> sure if that's really necessary but it was a thought. In any case,
>> thanks for working on this!
> Interesting. I like the idea of having one code path for everything
> instead of branching for the hook and non-hook paths. Thanks for
> sharing your thoughts.
I remember having had this discussion a few times, I think mainly with Stephen and David as well (but not on their internal slack channels :P).
I definitely think that's the way to go. It gives a single path for everything which makes it simpler in the most critical parts. And once you have picked an implementation other than it, you're now completely rid of the old implementation. And of course the good old idea that having an extension already using the API is a good way to show that the API is in a good place.
As much as I dislike our current interface in archive_command, and would like to see it go away completely, I do believe we need to ship something that has it - if nothing else then for backwards compatibility. But an extension like this would also make it easier to eventually, down the road, deprecate this solution.
Oh, and please put said implementation in a better place than contrib :)
On 10/6/21, 1:34 PM, "Magnus Hagander" <magnus@hagander.net> wrote: > I definitely think that's the way to go. It gives a single path for everything which makes it simpler in the most criticalparts. And once you have picked an implementation other than it, you're now completely rid of the old implementation. And of course the good old idea that having an extension already using the API is a good way to show thatthe API is in a good place. > > As much as I dislike our current interface in archive_command, and would like to see it go away completely, I do believewe need to ship something that has it - if nothing else then for backwards compatibility. But an extension like thiswould also make it easier to eventually, down the road, deprecate this solution. > > Oh, and please put said implementation in a better place than contrib :) I've attached an attempt at moving the archive_command logic to its own module and replacing it with a hook. This was actually pretty straightforward. I think the biggest question is where to put the archive_command module, which I've called shell_archive. The only existing directory that looked to me like it might work is src/test/modules. It might be rather bold to relegate this functionality to a test module so quickly, but on the other hand, perhaps it's the right thing to do given we intend to deprecate it in the future. I'm curious what others think about this. I'm still working on the documentation updates, which are quite extensive. I haven't included any of those in the patch yet. Nathan
On Mon, Oct 18, 2021 at 7:25 PM Bossart, Nathan <bossartn@amazon.com> wrote: > I think the biggest question is where to put the archive_command > module, which I've called shell_archive. The only existing directory > that looked to me like it might work is src/test/modules. It might be > rather bold to relegate this functionality to a test module so > quickly, but on the other hand, perhaps it's the right thing to do > given we intend to deprecate it in the future. I'm curious what > others think about this. I don't see that as being a viable path forward based on my customer interactions working here at EDB. I am not quite sure why we wouldn't just compile the functions into the server. Function pointers can point to core functions as surely as loadable modules. The present design isn't too congenial to that because it's relying on the shared library loading mechanism to wire the thing in place - but there's no reason it has to be that way. Logical decoding plugins don't work that way, for example. We could still have a GUC, say call it archive_method, that selects the module -- with 'shell' being a builtin method, and others being loadable as modules. If you set archive_method='shell' then you enable this module, and it has its own GUC, say call it archive_command, to configure the behavior. An advantage of this approach is that it's perfectly backward-compatible. I understand that archive_command is a hateful thing to many people here, but software has to serve the user base, not just the developers. Lots of people use archive_command and rely on it -- and are not interested in installing yet another piece of out-of-core software to do what $OTHERDB has built in. -- Robert Haas EDB: http://www.enterprisedb.com
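To illustrate the shape of that idea (every name below is hypothetical and chosen only for this sketch), the archiver would always call through a single callback, and archive_method would merely decide what that callback points to:

```c
/*
 * Hypothetical sketch of the function-pointer arrangement described
 * above: the archiver always calls through one callback, and an
 * archive_method setting only decides whether that callback is the
 * built-in 'shell' implementation (driven by archive_command) or one
 * supplied by a loadable module.  None of these names are real.
 */
#include <stdbool.h>
#include <string.h>

typedef bool (*ArchiveFileCB) (const char *file, const char *path);

static ArchiveFileCB ArchiveFileCallback;

/* built-in method: shells out via archive_command, as today */
static bool
shell_archive_file(const char *file, const char *path)
{
    /* placeholder for the existing system()-based implementation */
    return true;
}

/* placeholder for looking up a callback exported by a loaded module */
static ArchiveFileCB
load_archive_module_cb(const char *name)
{
    return NULL;
}

static void
select_archive_method(const char *archive_method)
{
    if (strcmp(archive_method, "shell") == 0)
        ArchiveFileCallback = shell_archive_file;
    else
        ArchiveFileCallback = load_archive_module_cb(archive_method);
}
```

With 'shell' wired in as a compiled-in default, existing archive_command setups would keep working unchanged, which is the backward-compatibility point made above.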
On 10/19/21 8:50 AM, Robert Haas wrote: > On Mon, Oct 18, 2021 at 7:25 PM Bossart, Nathan <bossartn@amazon.com> wrote: >> I think the biggest question is where to put the archive_command >> module, which I've called shell_archive. The only existing directory >> that looked to me like it might work is src/test/modules. It might be >> rather bold to relegate this functionality to a test module so >> quickly, but on the other hand, perhaps it's the right thing to do >> given we intend to deprecate it in the future. I'm curious what >> others think about this. > > I don't see that as being a viable path forward based on my customer > interactions working here at EDB. > > I am not quite sure why we wouldn't just compile the functions into > the server. Function pointers can point to core functions as surely > as loadable modules. The present design isn't too congenial to that > because it's relying on the shared library loading mechanism to wire > the thing in place - but there's no reason it has to be that way. > Logical decoding plugins don't work that way, for example. We could > still have a GUC, say call it archive_method, that selects the module > -- with 'shell' being a builtin method, and others being loadable as > modules. If you set archive_method='shell' then you enable this > module, and it has its own GUC, say call it archive_command, to > configure the behavior. > > An advantage of this approach is that it's perfectly > backward-compatible. I understand that archive_command is a hateful > thing to many people here, but software has to serve the user base, > not just the developers. Lots of people use archive_command and rely > on it -- and are not interested in installing yet another piece of > out-of-core software to do what $OTHERDB has built in. +1 to all of this, certainly for the time being. The archive_command mechanism is not great, but it is simple, and this part is not really what makes writing a good archive command hard. I had also originally envisioned this as a default extension in core, but having the default 'shell' method built-in is certainly simpler. Regards, -- -David david@pgmasters.net
On Tue, Oct 19, 2021 at 2:50 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 18, 2021 at 7:25 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> I think the biggest question is where to put the archive_command
> module, which I've called shell_archive. The only existing directory
> that looked to me like it might work is src/test/modules. It might be
> rather bold to relegate this functionality to a test module so
> quickly, but on the other hand, perhaps it's the right thing to do
> given we intend to deprecate it in the future. I'm curious what
> others think about this.
I don't see that as being a viable path forward based on my customer
interactions working here at EDB.
I am not quite sure why we wouldn't just compile the functions into
the server. Function pointers can point to core functions as surely
as loadable modules. The present design isn't too congenial to that
because it's relying on the shared library loading mechanism to wire
the thing in place - but there's no reason it has to be that way.
Logical decoding plugins don't work that way, for example. We could
still have a GUC, say call it archive_method, that selects the module
-- with 'shell' being a builtin method, and others being loadable as
modules. If you set archive_method='shell' then you enable this
module, and it has its own GUC, say call it archive_command, to
configure the behavior.
Yeah, seems reasonable. It wouldn't serve as well as an example to developers, but then it's probably not the "loadable module" part of building it that people need examples of. So as long as it's using the same internal APIs and just happens to be compiled in by default, I see no problem with that.
But, is logical decoding really that great an example? I mean, we build pgoutput.so as a library, we don't provide it compiled-in. So we could build the "shell archiver" based on that pattern, in which case we should create a postmaster/shell_archiver directory or something like that?
It should definitely not go under "test".
An advantage of this approach is that it's perfectly
backward-compatible. I understand that archive_command is a hateful
thing to many people here, but software has to serve the user base,
not just the developers. Lots of people use archive_command and rely
on it -- and are not interested in installing yet another piece of
out-of-core software to do what $OTHERDB has built in.
Backwards compatibility is definitely a must, I'd say. Regardless of exactly how the backwards-compatible plugin is shipped, it should be what's turned on by default.
On 10/19/21, 6:39 AM, "David Steele" <david@pgmasters.net> wrote:
> On 10/19/21 8:50 AM, Robert Haas wrote:
>> I am not quite sure why we wouldn't just compile the functions into
>> the server. Function pointers can point to core functions as surely
>> as loadable modules. The present design isn't too congenial to that
>> because it's relying on the shared library loading mechanism to wire
>> the thing in place - but there's no reason it has to be that way.
>> Logical decoding plugins don't work that way, for example. We could
>> still have a GUC, say call it archive_method, that selects the module
>> -- with 'shell' being a builtin method, and others being loadable as
>> modules. If you set archive_method='shell' then you enable this
>> module, and it has its own GUC, say call it archive_command, to
>> configure the behavior.
>>
>> An advantage of this approach is that it's perfectly
>> backward-compatible. I understand that archive_command is a hateful
>> thing to many people here, but software has to serve the user base,
>> not just the developers. Lots of people use archive_command and rely
>> on it -- and are not interested in installing yet another piece of
>> out-of-core software to do what $OTHERDB has built in.
>
> +1 to all of this, certainly for the time being. The archive_command
> mechanism is not great, but it is simple, and this part is not really
> what makes writing a good archive command hard.
>
> I had also originally envisioned this as a default extension in core,
> but having the default 'shell' method built-in is certainly simpler.

I have no problem building it this way. It's certainly better for
backward compatibility, which I think everyone here feels is
important.

Robert's proposed design is a bit more like my original
proof-of-concept [0]. There, I added an archive_library GUC which was
basically an extension of shared_preload_libraries (which creates some
interesting problems in the library loading logic). You could only set
one of archive_command or archive_library at any given time. When the
archive_library was set, we ran that library's _PG_init() just like we
do for any other library, and then we set the archiver function
pointer to the library's _PG_archive() function. IIUC the main
difference between this design and what Robert proposes is that we'd
also move the existing archive_command stuff somewhere else and then
access it via the archiver function pointer. I think that is clearly
better than branching based on whether archive_command or
archive_library is set. (BTW I'm not wedded to these GUCs. If folks
would rather create something like the archive_method GUC, I think
that would work just as well.)

My original proof-of-concept also attempted to handle a bunch of other
shell command GUCs, but perhaps I'd better keep this focused on
archive_command for now. What we do here could serve as an example of
how to adjust the other shell command GUCs later on. I'll go ahead
and rework my patch to look more like what is being discussed here,
although I expect the exact design for the interface will continue to
evolve based on the feedback in this thread.

Nathan

[0] https://postgr.es/m/E9035E94-EC76-436E-B6C9-1C03FBD8EF54%40amazon.com
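To make the shape of that proof of concept concrete, here is a minimal module-side sketch. The bool-returning (file, path) signature for _PG_archive() and the server-exported PG_archive pointer are assumptions based on the description above, not code taken from the patch.

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

/* assumed to be provided by the server under this design */
typedef bool (*ArchiveFunc) (const char *file, const char *path);
extern ArchiveFunc PG_archive;

/* archive one WAL file; return true only once it is stored durably */
static bool
my_archive(const char *file, const char *path)
{
    /* e.g. copy "path" (pg_wal/<file>) to durable storage */
    return true;
}

void
_PG_init(void)
{
    /* with archive_library = 'my_archive_module', the server runs this
     * and we install our archiving function */
    PG_archive = my_archive;
}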
On Tue, Oct 19, 2021 at 10:19 AM Magnus Hagander <magnus@hagander.net> wrote:
> But, is logical decoding really that great an example? I mean, we
> build pgoutput.so as a library, we don't provide it compiled-in. So we
> could build the "shell archiver" based on that pattern, in which case
> we should create a postmaster/shell_archiver directory or something
> like that?

Well, I guess you could also use parallel contexts as an example.
There, the core facilities that most people will use are baked into
the server, but you can provide your own in an extension and the
parallel context stuff will happily call it for you if you so request.
I don't think the details here are too important. I'm just saying that
not everything needs to depend on _PG_init() as a way of bootstrapping
itself.

TBH, if I ran the zoo and also had infinite time to tinker with stuff
like this, I'd probably make a pass through the hooks we already have
and try to refactor as many of them as possible to use some mechanism
other than _PG_init() to bootstrap themselves. That mechanism actually
sucks. When we use other mechanisms -- like a language "C" function
that knows the shared object name and function name -- then load is
triggered when it's needed, and the user gets the behavior they want.
Similarly with logical decoding and FDWs -- you, as the user, say that
you want this or that kind of logical decoding or FDW or C function or
whatever -- and then the system either notices that it's already
loaded and does what you want, or notices that it's not loaded and
loads it, and then does what you want. But when the bootstrapping
mechanism is _PG_init(), then the user has got to make sure the
library is loaded at the correct time. They have to know whether it
should go into shared_preload_libraries or whether it should be put
into one of the other various GUCs or if it can be loaded on the fly
with LOAD. If they don't load it in the right way, or if it doesn't
get loaded at all, well then probably it just silently doesn't work.
Plus there can be weird cases if it gets loaded into some backends but
not others and things like that. And here we seem to have an
opportunity to improve the interface by not depending on it.

--
Robert Haas
EDB: http://www.enterprisedb.com
Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> Backwards compatibility is definitely a must, I'd say. Regardless of
> exactly how the backwards-compatible plugin is shipped, it should be
> what's turned on by default.

I keep seeing this thrown around and I don't quite get why we feel this
is the case. I'm not completely against trying to maintain backwards
compatibility, but at the same time, we just went through changing
quite a bit around in v12 with the restore command and that's the
other half of this. Why are we so concerned about backwards
compatibility here when there was hardly any complaint raised about
breaking it in the restore case?

If maintaining compatibility makes this a lot more difficult or ugly,
then I'm against doing so. I don't know that to be the case, none of
the proposed approaches really sound all that bad to me, but I
certainly don't think we should be entirely avoiding the idea of
breaking backwards compatibility here. We literally just did that and
while there's been some noise about it, it's hardly risen to the level
of being "something we should never, ever, even consider doing again"
as seems to be implied on this thread.

For those who might argue that maintaining compatibility for archive
command is somehow more important than for restore command- allow me
to save you the trouble and just let you know that I don't buy off on
such an argument. If anything, it should be the opposite. You back up
your database all the time and you're likely to see much more quickly
if that stops working. Database restores, on the other hand, are
nearly always done in times of great stress and when you want things
to be very clear and easy to follow and for everything to 'just work'.

Thanks,

Stephen
On 10/19/21, 9:14 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> My original proof-of-concept also attempted to handle a bunch of other
> shell command GUCs, but perhaps I'd better keep this focused on
> archive_command for now. What we do here could serve as an example of
> how to adjust the other shell command GUCs later on. I'll go ahead
> and rework my patch to look more like what is being discussed here,
> although I expect the exact design for the interface will continue to
> evolve based on the feedback in this thread.

Alright, I reworked the patch a bit to maintain backward
compatibility. My initial intent for 0001 was to just do a clean
refactor to move the shell archiving stuff to its own file. However,
after I did that, I realized that adding the hook wouldn't be too much
more work, so I did that as well. This seems to be enough to support
custom archiving modules. I included a basic example of such a module
in 0002. 0002 is included primarily for demonstration purposes.

I do wonder if there are some further enhancements we should make to
the archiving module interface. With 0001 applied, archive_command is
silently ignored if you've preloaded a library that uses the hook.
There's no way to indicate that you actually want to use
archive_command or that you want to use a specific library as the
archive library. On the other hand, just adding the hook keeps things
simple, and it doesn't preclude future improvements in this area.

Nathan
On 10/20/21, 3:23 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> Alright, I reworked the patch a bit to maintain backward
> compatibility. My initial intent for 0001 was to just do a clean
> refactor to move the shell archiving stuff to its own file. However,
> after I did that, I realized that adding the hook wouldn't be too much
> more work, so I did that as well. This seems to be enough to support
> custom archiving modules. I included a basic example of such a module
> in 0002. 0002 is included primarily for demonstration purposes.

It looks like the FreeBSD build is failing because sys/wait.h is
missing. Here is an attempt at fixing that.

Nathan
On Tue, Oct 19, 2021 at 2:50 PM Stephen Frost <sfrost@snowman.net> wrote:
> I keep seeing this thrown around and I don't quite get why we feel this
> is the case. I'm not completely against trying to maintain backwards
> compatibility, but at the same time, we just went through changing quite
> a bit around in v12 with the restore command and that's the other half
> of this. Why are we so concerned about backwards compatibility here
> when there was hardly any complaint raised about breaking it in the
> restore case?

There are 0 references to restore_command in the v12 release notes.
Just in case you had the version number wrong in this email, I
compared the documentation for restore_command in v10 to the
documentation in v14. The differences seem to be only cosmetic. So I'm
not sure what functional change you think we made. It was probably
less significant than what was being discussed here in regards to
archive_command.

Also, more to the point, when there's a need to break backward
compatibility in order to get some improvement, it's worth
considering, but here there just isn't.

--
Robert Haas
EDB: http://www.enterprisedb.com
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Oct 19, 2021 at 2:50 PM Stephen Frost <sfrost@snowman.net> wrote:
> > I keep seeing this thrown around and I don't quite get why we feel this
> > is the case. I'm not completely against trying to maintain backwards
> > compatibility, but at the same time, we just went through changing quite
> > a bit around in v12 with the restore command and that's the other half
> > of this. Why are we so concerned about backwards compatibility here
> > when there was hardly any complaint raised about breaking it in the
> > restore case?
>
> There are 0 references to restore_command in the v12 release notes.
> Just in case you had the version number wrong in this email, I
> compared the documentation for restore_command in v10 to the
> documentation in v14. The differences seem to be only cosmetic. So I'm
> not sure what functional change you think we made. It was probably
> less significant than what was being discussed here in regards to
> archive_command.

restore_command used to be in recovery.conf, which disappeared with v12
and it now has to go into postgresql.auto.conf or postgresql.conf.

That's a huge breaking change.

> Also, more to the point, when there's a need to break backward
> compatibility in order to get some improvement, it's worth
> considering, but here there just isn't.

There won't be any thought towards a backwards-incompatible capability
if everyone is saying that we can't possibly break it. That's why I was
commenting on it.

Thanks,

Stephen
On Thu, Oct 21, 2021 at 4:29 PM Stephen Frost <sfrost@snowman.net> wrote:
> restore_command used to be in recovery.conf, which disappeared with v12
> and it now has to go into postgresql.auto.conf or postgresql.conf.
>
> That's a huge breaking change.

Not in the same sense. Moving the functionality to a different
configuration file can and probably did cause a lot of problems for
people, but the same basic functionality was still available.

(Also, I'm pretty sure that the recovery.conf changes would have
happened years earlier if there hadn't been backward compatibility
concerns, from Simon in particular. So saying that there was "hardly
any complaint raised" in that case doesn't seem to me to be entirely
accurate.)

> > Also, more to the point, when there's a need to break backward
> > compatibility in order to get some improvement, it's worth
> > considering, but here there just isn't.
>
> There won't be any thought towards a backwards-incompatible capability
> if everyone is saying that we can't possibly break it. That's why I was
> commenting on it.

I can't speak for anyone else, but that is not what I am saying. I am
open to the idea of breaking it if we thereby get some valuable
benefit which cannot be obtained otherwise. But Nathan has now
implemented something which, from the sound of it, will allow us to
obtain all of the available benefits with no incompatibilities. If we
think of additional benefits that we cannot obtain without
incompatibilities, then we can consider that situation when it arises.
In the meantime, there's no need to go looking for reasons to break
stuff that works in existing releases.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Oct 21, 2021 at 11:05 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Oct 21, 2021 at 4:29 PM Stephen Frost <sfrost@snowman.net> wrote:
> restore_command used to be in recovery.conf, which disappeared with v12
> and it now has to go into postgresql.auto.conf or postgresql.conf.
>
> That's a huge breaking change.
Not in the same sense. Moving the functionality to a different
configuration file can and probably did cause a lot of problems for
people, but the same basic functionality was still available.
Yeah.
And as a bonus it got a bunch of people to upgrade their backup software that suddenly stopped working. Or in some cases, to install backup software instead of using the hand-rolled scripts. So there were some good side-effects specifically to breaking it as well.
(Also, I'm pretty sure that the recovery.conf changes would have
happened years earlier if there hadn't been backward compatibility
concerns, from Simon in particular. So saying that there was "hardly
any complaint raised" in that case doesn't seem to me to be entirely
accurate.)
> > Also, more to the point, when there's a need to break backward
> > compatibility in order to get some improvement, it's worth
> > considering, but here there just isn't.
>
> There won't be any thought towards a backwards-incompatible capability
> if everyone is saying that we can't possibly break it. That's why I was
> commenting on it.
I can't speak for anyone else, but that is not what I am saying. I am
open to the idea of breaking it if we thereby get some valuable
benefit which cannot be obtained otherwise. But Nathan has now
implemented something which, from the sound of it, will allow us to
obtain all of the available benefits with no incompatibilities. If we
think of additional benefits that we cannot obtain without
incompatibilities, then we can consider that situation when it arises.
In the meantime, there's no need to go looking for reasons to break
stuff that works in existing releases.
Agreed.
On Thu, Oct 21, 2021 at 9:51 PM Bossart, Nathan <bossartn@amazon.com> wrote:
On 10/20/21, 3:23 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> Alright, I reworked the patch a bit to maintain backward
> compatibility. My initial intent for 0001 was to just do a clean
> refactor to move the shell archiving stuff to its own file. However,
> after I did that, I realized that adding the hook wouldn't be too much
> more work, so I did that as well. This seems to be enough to support
> custom archiving modules. I included a basic example of such a module
> in 0002. 0002 is included primarily for demonstration purposes.
It looks like the FreeBSD build is failing because sys/wait.h is
missing. Here is an attempt at fixing that.
I still like the idea of loading the library via a special parameter, archive_library or such.
One reason for that is that adding/removing modules in shared_preload_libraries has a terrible UX in that you have to replace the whole thing. This makes it much more complex to deal with when different modules just want to add to it.
E.g. my awesome backup program could set archive_library='my_awesome_backups', and know it didn't break anything else. But it couldn't set shared_preload_libraries='my_awesome_backups', because then it might break a bunch of other modules that used to be there. So it has to go try to parse the whole config and figure out where to make such modifications.
Now, this could *also* be solved by allowing shared_preload_libraries to be a "list" instead of a string, and allowing postgresql.conf to accept syntax like shared_preload_libraries+='my_awesome_backups'.
But without that level of functionality available, I think a separate parameter for the archive library would be a good thing.
Other than that:
+
+/*
+ * Is WAL archiving configured? For consistency with previous releases, this
+ * checks that archive_command is set when archiving via shell is enabled.
+ * Otherwise, we just check that an archive function is set, and it is the
+ * responsibility of that archive function to ensure it is properly configured.
+ */
+#define XLogArchivingConfigured() \
+ (PG_archive && (PG_archive != shell_archive || XLogArchiveCommand[0] != '\0'))
Wouldn't that be better as a callback into the module? So that shell_archive would implement the check for XLogArchiveCommand. Then another third party module can make its own decision on what to check. And PGarchive would then be a struct that holds a function pointer to the archive command and another function pointer to the isenabled command? (I think having a struct for it would be useful regardless -- for possible future extensions with more API points).
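Something along these lines, perhaps; the struct and field names below are invented and only meant to illustrate the struct-of-callbacks idea:

typedef struct ArchiveModuleCallbacks
{
    /* is the module configured well enough for archiving to proceed? */
    bool        (*check_configured) (void);

    /* archive one WAL file; return true once it is safely stored */
    bool        (*archive_file) (const char *file, const char *path);
} ArchiveModuleCallbacks;

The built-in shell method's check_configured callback would then simply test that archive_command is non-empty, and XLogArchivingConfigured() could call through the struct instead of looking at XLogArchiveCommand directly.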
On 10/22/21, 7:43 AM, "Magnus Hagander" <magnus@hagander.net> wrote:
> I still like the idea of loading the library via a special
> parameter, archive_library or such.

My first attempt [0] added a GUC like this, so I can speak to some of
the interesting design decisions that follow.

The simplest thing we could do would be to add the archive_library GUC
and to load that just like the library is at the end of
shared_preload_libraries. This would mean that the archive library
could be specified in either GUC, and there would effectively be no
difference between the two.

The next thing we could consider doing is adding a new boolean called
process_archive_library_in_progress, which would be analogous to
process_shared_preload_libraries_in_progress. If a library is loaded
from the archive_library GUC, its _PG_init() will be called with
process_archive_library_in_progress set. This also means that if a
library is specified in both shared_preload_libraries and
archive_library, we'd call its _PG_init() twice. The library could
then branch based on whether
process_shared_preload_libraries_in_progress or
process_archive_library_in_progress was set.

Another approach would be to add a new initialization function (e.g.,
PG_archive_init()) that would be used if the library is being loaded
from archive_library. Like before, you can use the library for both
shared_preload_libraries and archive_library, but your initialization
logic would be expected to go in separate functions. However, there
still wouldn't be anything forcing that. A library could still break
the rules and do everything in _PG_init() and be loaded via
shared_preload_libraries.

One more thing we could do is to discover the relevant symbols for
archiving in the library loading function. Rather than expecting the
initialization function to set the hook correctly, we'd just look up
the _PG_archive() function during loading. Again, a library could
probably still break the rules and do everything in
_PG_init()/shared_preload_libraries, but there would at least be a
nicer interface available.

I believe the main drawbacks of going down this path are the
additional complexity in the backend and the slippery slope of adding
all kinds of new GUCs in the future. My original patch also tried to
do something similar for some other shell command GUCs
(archive_cleanup_command, restore_command, and recovery_end_command).
While I'm going to try to keep this focused on archive_command for
now, presumably we'd eventually want the ability to use hooks for all
of these things. I don't know if we really want to incur a new GUC for
every single one of these.

To be clear, I'm not against adding a GUC if it seems like the right
thing to do. I just want to make sure we are aware of the tradeoffs
compared to a simple shared_preload_libraries approach with its
terrible UX.

> Wouldn't that be better as a callback into the module? So that
> shell_archive would implement the check for XLogArchiveCommand. Then
> another third party module can make its own decision on what to
> check. And PGarchive would then be a struct that holds a function
> pointer to the archive command and another function pointer to the
> isenabled command? (I think having a struct for it would be useful
> regardless -- for possible future extensions with more API points).

+1. This crossed my mind, too. I'll add this in the next revision.

Nathan

[0] https://postgr.es/m/E9035E94-EC76-436E-B6C9-1C03FBD8EF54%40amazon.com
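For the second option, the module-side branching might look roughly like this. process_archive_library_in_progress is the hypothetical flag described above; process_shared_preload_libraries_in_progress already exists in miscadmin.h.

#include "postgres.h"
#include "fmgr.h"
#include "miscadmin.h"      /* process_shared_preload_libraries_in_progress */

PG_MODULE_MAGIC;

/* hypothetical flag from the second option described above */
extern bool process_archive_library_in_progress;

void
_PG_init(void)
{
    if (process_shared_preload_libraries_in_progress)
    {
        /* define GUCs, request shared memory, register bgworkers, ... */
    }

    if (process_archive_library_in_progress)
    {
        /* install the archive callback(s) for the archiver to use */
    }
}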
On Fri, Oct 22, 2021 at 1:42 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> Another approach would be to add a new initialization function (e.g.,
> PG_archive_init()) that would be used if the library is being loaded
> from archive_library. Like before, you can use the library for both
> shared_preload_libraries and archive_library, but your initialization
> logic would be expected to go in separate functions. However, there
> still wouldn't be anything forcing that. A library could still break
> the rules and do everything in _PG_init() and be loaded via
> shared_preload_libraries.

I was imagining something like what logical decoding does. In that
case, you make a _PG_output_plugin_init function and it returns a
table of callbacks. Then the core code invokes those callbacks at the
appropriate times.

--
Robert Haas
EDB: http://www.enterprisedb.com
On 10/22/21, 4:35 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> I was imagining something like what logical decoding does. In that
> case, you make a _PG_output_plugin_init function and it returns a
> table of callbacks. Then the core code invokes those callbacks at the
> appropriate times.

Here is an attempt at doing this. Archive modules are expected to
declare _PG_archive_module_init(), which can define GUCs, register
background workers, etc. This function must at least define the
archive callbacks. For now, I've introduced two callbacks. The first
is for checking that the archive module is configured, and the second
contains the actual archiving logic. I've written this so that the
same library can be used for multiple purposes (e.g., it could be in
shared_preload_libraries and archive_library). I don't know if that's
really necessary, but it seemed to me like a reasonable way to handle
the changes to the library loading logic that we need anyway.

0002 is still a sample backup module, but I also added some handling
for preexisting archives. If the preexisting archive file has the same
contents as the current file to archive, archiving is allowed to
continue. If the contents don't match, archiving fails. This sample
module could still produce unexpected results if two servers were
sending archives to the same directory. I stopped short of adding
handling for that case, but that might be a good thing to tackle next.

Nathan
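To give a feel for the interface being described, here is a bare-bones module sketch. The struct layout and the exact callback signatures are assumptions for illustration and are not copied from the patch; my_archive_directory stands in for whatever GUCs a real module would define.

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

/* assumed shape of the callback table from the patch described above */
typedef struct ArchiveModuleCallbacks
{
    bool        (*check_configured_cb) (void);
    bool        (*archive_file_cb) (const char *file, const char *path);
} ArchiveModuleCallbacks;

/* hypothetical GUC the module would define in _PG_init() */
static char *my_archive_directory = NULL;

static bool
my_check_configured(void)
{
    return my_archive_directory != NULL && my_archive_directory[0] != '\0';
}

static bool
my_archive_file(const char *file, const char *path)
{
    /* copy "path" into my_archive_directory as "file", fsync it, and
     * only return true once the copy is durable */
    return true;
}

void
_PG_archive_module_init(ArchiveModuleCallbacks *cb)
{
    cb->check_configured_cb = my_check_configured;
    cb->archive_file_cb = my_archive_file;
}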
Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> On Thu, Oct 21, 2021 at 11:05 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > On Thu, Oct 21, 2021 at 4:29 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > restore_command used to be in recovery.conf, which disappeared with v12
> > > and it now has to go into postgresql.auto.conf or postgresql.conf.
> > >
> > > That's a huge breaking change.
> >
> > Not in the same sense. Moving the functionality to a different
> > configuration file can and probably did cause a lot of problems for
> > people, but the same basic functionality was still available.
>
> Yeah.
>
> And as a bonus it got a bunch of people to upgrade their backup software
> that suddenly stopped working. Or in some cases, to install backup software
> instead of using the hand-rolled scripts. So there were some good
> side-effects specifically to breaking it as well.

I feel like there's some confusion here- just to clear things up, I
wasn't suggesting that we wouldn't include the capability, just that
we should be open to changing the interface/configuration based on
what makes sense and not, necessarily, insist on perfect backwards
compatibility. Seems everyone else has come out in support of that as
well at this point and so I don't think there's much more to say here.

The original complaint I had made was that it felt like folks were
pushing hard on backwards compatibility for the sake of it and I was
just trying to make sure it's clear that we can, and do, break
backwards compatibility sometimes and the bar to clear isn't
necessarily all that high, though of course we should be gaining
something if we do decide to make such a change.

Thanks,

Stephen
On Sun, Oct 24, 2021 at 2:15 AM Bossart, Nathan <bossartn@amazon.com> wrote:
> Here is an attempt at doing this. Archive modules are expected to
> declare _PG_archive_module_init(), which can define GUCs, register
> background workers, etc. This function must at least define the
> archive callbacks. For now, I've introduced two callbacks. The first
> is for checking that the archive module is configured, and the second
> contains the actual archiving logic.

I don't see why this patch should need to make any changes to
internal_load_library(), PostmasterMain(), SubPostmasterMain(), or any
other central point of control, and I don't think it should.
pgarch_archiveXlog() can just load the library at the time it's
needed. That way it only gets loaded in the archiver process, and the
required changes are much more localized. Like instead of asserting
that the functions are initialized, just
load_external_function(libname, "_PG_archive_module_init") and call it
if they aren't.

I think the attempt in check_archive_command()/check_archive_library()
to force exactly one of those two to be set is not going to work well
and should be removed. In general, GUCs whose legal values depend on
the values of other GUCs don't end up working out well. I think what
should happen instead is that if archive_library=shell then
archive_command does whatever it does; otherwise archive_command is
without effect.

--
Robert Haas
EDB: http://www.enterprisedb.com
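A sketch of that archiver-side lazy loading might look as follows. load_external_function() is the existing dfmgr.c routine Robert mentions; the ArchiveModuleCallbacks type is the callback table sketched earlier in the thread, and XLogArchiveLibrary stands in for whatever variable ends up holding the archive_library setting.

/* Sketch only: resolve the archive module the first time the archiver
 * needs it, instead of wiring it up at postmaster start. */
typedef void (*ArchiveModuleInit) (ArchiveModuleCallbacks *cb);

static ArchiveModuleCallbacks ArchiveContext;
static bool ArchiveContextInitialized = false;

static void
LoadArchiveLibrary(void)
{
    ArchiveModuleInit archive_init;

    archive_init = (ArchiveModuleInit)
        load_external_function(XLogArchiveLibrary,
                               "_PG_archive_module_init",
                               false, NULL);

    if (archive_init == NULL)
        ereport(ERROR,
                (errmsg("archive library \"%s\" did not define _PG_archive_module_init",
                        XLogArchiveLibrary)));

    archive_init(&ArchiveContext);
    ArchiveContextInitialized = true;
}

static bool
pgarch_archive_file(const char *file, const char *path)
{
    if (!ArchiveContextInitialized)
        LoadArchiveLibrary();

    return ArchiveContext.archive_file_cb(file, path);
}

And per the last paragraph above, when archive_library = 'shell' the built-in callbacks would be installed instead, leaving archive_command to behave exactly as it does today.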
On 10/25/21, 10:02 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> I don't see why this patch should need to make any changes to
> internal_load_library(), PostmasterMain(), SubPostmasterMain(), or any
> other central point of control, and I don't think it should.
> pgarch_archiveXlog() can just load the library at the time it's
> needed. That way it only gets loaded in the archiver process, and the
> required changes are much more localized. Like instead of asserting
> that the functions are initialized, just
> load_external_function(libname, "_PG_archive_module_init") and call it
> if they aren't.

IIUC this would mean that archive modules that need to define GUCs or
register background workers would have to separately define a
_PG_init() and be loaded via shared_preload_libraries in addition to
archive_library. That doesn't seem too terrible to me, but it was
something I was trying to avoid.

> I think the attempt in check_archive_command()/check_archive_library()
> to force exactly one of those two to be set is not going to work well
> and should be removed. In general, GUCs whose legal values depend on
> the values of other GUCs don't end up working out well. I think what
> should happen instead is that if archive_library=shell then
> archive_command does whatever it does; otherwise archive_command is
> without effect.

I'm fine with this approach. I'll go this route in the next revision.

Nathan
On Mon, Oct 25, 2021 at 1:14 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> IIUC this would mean that archive modules that need to define GUCs or
> register background workers would have to separately define a
> _PG_init() and be loaded via shared_preload_libraries in addition to
> archive_library. That doesn't seem too terrible to me, but it was
> something I was trying to avoid.

Hmm. That doesn't seem like a terrible goal, but I think we should try
to find some way of achieving it that looks tidier than this does.

--
Robert Haas
EDB: http://www.enterprisedb.com
On 10/25/21, 10:18 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Mon, Oct 25, 2021 at 1:14 PM Bossart, Nathan <bossartn@amazon.com> wrote:
>> IIUC this would mean that archive modules that need to define GUCs or
>> register background workers would have to separately define a
>> _PG_init() and be loaded via shared_preload_libraries in addition to
>> archive_library. That doesn't seem too terrible to me, but it was
>> something I was trying to avoid.
>
> Hmm. That doesn't seem like a terrible goal, but I think we should try
> to find some way of achieving it that looks tidier than this does.

We could just treat archive_library as if it is tacked onto the
shared_preload_libraries list. I think I can make that look relatively
tidy.

Nathan
On 10/25/21, 10:50 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> On 10/25/21, 10:18 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
>> On Mon, Oct 25, 2021 at 1:14 PM Bossart, Nathan <bossartn@amazon.com> wrote:
>>> IIUC this would mean that archive modules that need to define GUCs or
>>> register background workers would have to separately define a
>>> _PG_init() and be loaded via shared_preload_libraries in addition to
>>> archive_library. That doesn't seem too terrible to me, but it was
>>> something I was trying to avoid.
>>
>> Hmm. That doesn't seem like a terrible goal, but I think we should try
>> to find some way of achieving it that looks tidier than this does.
>
> We could just treat archive_library as if it is tacked onto the
> shared_preload_libraries list. I think I can make that look
> relatively tidy.

Alright, here is an attempt at that. With this revision, archive
libraries are preloaded (and _PG_init() is called), and the archiver
is responsible for calling _PG_archive_module_init() to get the
callbacks. I've also removed the GUC check hooks as previously
discussed.

Nathan
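A rough idea of what tacking archive_library onto the list could look like; XLogArchiveLibrary again stands in for the GUC variable, and load_file() is the existing dfmgr.c entry point, which is effectively a no-op for a library that has already been loaded.

/* Sketch: after shared_preload_libraries has been processed, also
 * preload the library named by archive_library, if any. */
static void
process_archive_library(void)
{
    if (XLogArchiveLibrary != NULL && XLogArchiveLibrary[0] != '\0')
        load_file(XLogArchiveLibrary, false);
}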
On Mon, Oct 25, 2021 at 3:45 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> Alright, here is an attempt at that. With this revision, archive
> libraries are preloaded (and _PG_init() is called), and the archiver
> is responsible for calling _PG_archive_module_init() to get the
> callbacks. I've also removed the GUC check hooks as previously
> discussed.

I would need to spend more time on this to have a detailed opinion on
all of it, but I agree that part looks better this way.

--
Robert Haas
EDB: http://www.enterprisedb.com
On 10/25/21, 1:29 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Mon, Oct 25, 2021 at 3:45 PM Bossart, Nathan <bossartn@amazon.com> wrote:
>> Alright, here is an attempt at that. With this revision, archive
>> libraries are preloaded (and _PG_init() is called), and the archiver
>> is responsible for calling _PG_archive_module_init() to get the
>> callbacks. I've also removed the GUC check hooks as previously
>> discussed.
>
> I would need to spend more time on this to have a detailed opinion on
> all of it, but I agree that part looks better this way.

Great. Unless I see additional feedback on the basic design shortly,
I'll give the documentation updates a try.

Nathan
On 10/25/21, 1:41 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> Great. Unless I see additional feedback on the basic design shortly,
> I'll give the documentation updates a try.

Okay, here is a more complete patch with a first attempt at the
documentation changes. I tried to keep the changes to the existing
docs as minimal as possible, and then I added a new chapter that
describes what goes into creating an archive module. Separately, I
simplified the basic_archive module, moved it to src/test/modules, and
added a simple test. My goal is for this to serve as a basic example
and to provide some test coverage on the new infrastructure.

Nathan
Greetings,

* Bossart, Nathan (bossartn@amazon.com) wrote:
> On 10/25/21, 1:41 PM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> > Great. Unless I see additional feedback on the basic design shortly,
> > I'll give the documentation updates a try.
>
> Okay, here is a more complete patch with a first attempt at the
> documentation changes. I tried to keep the changes to the existing
> docs as minimal as possible, and then I added a new chapter that
> describes what goes into creating an archive module. Separately, I
> simplified the basic_archive module, moved it to src/test/modules,
> and added a simple test. My goal is for this to serve as a basic
> example and to provide some test coverage on the new infrastructure.

Definitely interested and plan to look at this more shortly, and
generally this all sounds good, but maybe we should have it be posted
under a new thread as it's moved pretty far from the subject and folks
might not appreciate what this is about at this point..?

Thanks,

Stephen
On 11/1/21, 10:57 AM, "Stephen Frost" <sfrost@snowman.net> wrote:
> Definitely interested and plan to look at this more shortly, and
> generally this all sounds good, but maybe we should have it be posted
> under a new thread as it's moved pretty far from the subject and folks
> might not appreciate what this is about at this point..?

Done: https://postgr.es/m/668D2428-F73B-475E-87AE-F89D67942270%40amazon.com

Looking forward to your feedback.

Nathan