Re: [HACKERS] WIP: [[Parallel] Shared] Hash - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: [HACKERS] WIP: [[Parallel] Shared] Hash |
Date | |
Msg-id | CAM3SWZRx87vdJ1fW=rbYsPdFE6-dSX2qEYBFCP8DYRqrJA2R1Q@mail.gmail.com |
In response to | Re: [HACKERS] WIP: [[Parallel] Shared] Hash (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] WIP: [[Parallel] Shared] Hash
Re: [HACKERS] WIP: [[Parallel] Shared] Hash |
List | pgsql-hackers |
On Wed, Jan 11, 2017 at 10:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Jan 10, 2017 at 8:56 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Instead of all this, I suggest copying some of my changes to fd.c, so
>> that resource ownership within fd.c differentiates between a vfd that
>> is owned by the backend in the conventional sense, including having a
>> need to delete at eoxact, as well as a lesser form of ownership where
>> deletion should not happen.
>
> If multiple processes are using the same file via the BufFile
> interface, I think that it is absolutely necessary that there should
> be a provision to track the "attach count" of the BufFile. Each
> process that reaches EOXact decrements the attach count and when it
> reaches 0, the process that reduced it to 0 removes the BufFile. I
> think anything that's based on the notion that leaders will remove
> files and workers won't is going to be fragile and limiting, and I am
> going to push hard against any such proposal.

Okay. My BufFile unification approach happens to assume that backends clean up after themselves, but that isn't a rigid assumption (of course, these are always temp files, so we reason about them as temp files). It could be based on a refcount fairly easily, such that, as you say here, deletion of files occurs within workers (that "own" the files) only as a consequence of their being the last backend with a reference, that must therefore "turn out the lights" (delete the file). That seems consistent with what I've done within fd.c, and what I suggested to Thomas (that he more or less follow that approach). You'd probably still want to throw an error when workers ended up not deleting BufFile segments they owned, though, at least for parallel tuplesort.
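
To make the "attach count" idea concrete, here is a minimal sketch of last-one-out deletion. It is not code from any patch on this thread: the names `SharedFileSet`, `shared_file_init`, `shared_file_attach` and `shared_file_detach` are hypothetical, and C11 atomics stand in for whatever primitive would actually protect the counter in shared memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical shared state for one temp file.  In a real backend this
 * would live in shared memory; here it is an ordinary struct.
 */
typedef struct SharedFileSet
{
    atomic_int  attach_count;   /* participants still attached */
    char        path[64];       /* file to remove when the count hits 0 */
} SharedFileSet;

static void
shared_file_init(SharedFileSet *set, const char *path)
{
    atomic_init(&set->attach_count, 0);
    snprintf(set->path, sizeof(set->path), "%s", path);
}

/* Called once by every participant that starts using the file. */
static void
shared_file_attach(SharedFileSet *set)
{
    atomic_fetch_add(&set->attach_count, 1);
}

/*
 * Called once by every participant at end of transaction.  Whoever takes
 * the count from 1 to 0 "turns out the lights" and deletes the file.
 * Returns true if this caller performed the deletion.
 */
static bool
shared_file_detach(SharedFileSet *set)
{
    int         before = atomic_fetch_sub(&set->attach_count, 1);

    assert(before > 0);         /* detach without a matching attach */
    if (before == 1)
    {
        remove(set->path);
        return true;
    }
    return false;
}
```

Because the decrement is a single atomic operation, exactly one participant observes the transition to zero, so it does not matter whether that participant is the leader or a worker.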
This idea is something that's much more limited than the SharedTemporaryFile() API that you sketched on the parallel sort thread, because it only concerns resource management, and not how to make access to the shared file concurrency safe in any special, standard way. I think that this resource management is something that should be managed by buffile.c (and the temp file routines within fd.c that are morally owned by buffile.c, their only caller). It shouldn't be necessary for a client of this new infrastructure, such as parallel tuplesort or parallel hash join, to know anything about file paths. Instead, they should be passing around some kind of minimal private-to-buffile state in shared memory that coordinates backends participating in BufFile unification. Private state created by buffile.c, and passed back to buffile.c. Everything should be encapsulated within buffile.c, IMV, making parallel implementations as close as possible to their serial implementations.

--
Peter Geoghegan
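
As an illustration of the encapsulation argued for in the message above, the following sketch shows clients handling only an opaque piece of state while a single module-private routine builds file names. Everything here is an assumption made for the example: `BufFileShared`, `buffile_shared_size`, `buffile_shared_init`, `buffile_segment_path` and the `pgsql_tmp/shared%u.%d` naming scheme are hypothetical and are not taken from buffile.c or from any posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical private-to-buffile state.  A client (parallel sort, hash
 * join) reserves space for it in shared memory and passes the pointer
 * back, but is never expected to look inside.
 */
typedef struct BufFileShared
{
    unsigned    set_id;         /* identifies this group of temp files */
    int         nparticipants;  /* number of cooperating backends */
} BufFileShared;

/* Client-facing: how much shared memory to reserve. */
static size_t
buffile_shared_size(void)
{
    return sizeof(BufFileShared);
}

/* Client-facing: called once, by whichever backend sets things up. */
static void
buffile_shared_init(BufFileShared *shared, unsigned set_id, int nparticipants)
{
    shared->set_id = set_id;
    shared->nparticipants = nparticipants;
}

/*
 * Module-private: the only place a path is ever constructed.  The naming
 * scheme is invented for this example.
 */
static void
buffile_segment_path(const BufFileShared *shared, int participant,
                     char *buf, size_t buflen)
{
    assert(participant >= 0 && participant < shared->nparticipants);
    snprintf(buf, buflen, "pgsql_tmp/shared%u.%d",
             shared->set_id, participant);
}
```

With this shape, a parallel client would differ from its serial counterpart only in passing a `BufFileShared *` and a participant number; file naming and cleanup policy could change inside the module without touching callers.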