Re: refactoring relation extension and BufferAlloc(), faster COPY - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: refactoring relation extension and BufferAlloc(), faster COPY |
Date | |
Msg-id | 1e2df302-d0c4-8baa-fbd6-50a810b4e1ad@iki.fi Whole thread Raw |
In response to | Re: refactoring relation extension and BufferAlloc(), faster COPY (Andres Freund <andres@anarazel.de>) |
Responses |
Re: refactoring relation extension and BufferAlloc(), faster COPY
|
List | pgsql-hackers |
On 01/03/2023 09:33, Andres Freund wrote: > On 2023-02-21 17:33:31 +0100, Alvaro Herrera wrote: >> On 2023-Feb-21, Heikki Linnakangas wrote: >> >>>> +static BlockNumber >>>> +BulkExtendSharedRelationBuffered(Relation rel, >>>> + SMgrRelation smgr, >>>> + bool skip_extension_lock, >>>> + char relpersistence, >>>> + ForkNumber fork, ReadBufferMode mode, >>>> + BufferAccessStrategy strategy, >>>> + uint32 *num_pages, >>>> + uint32 num_locked_pages, >>>> + Buffer *buffers) >>> >>> Ugh, that's a lot of arguments, some are inputs and some are outputs. I >>> don't have any concrete suggestions, but could we simplify this somehow? >>> Needs a comment at least. >> >> Yeah, I noticed this too. I think it would be easy enough to add a new >> struct that can be passed as a pointer, which can be stack-allocated >> by the caller, and which holds the input arguments that are common to >> both functions, as is sensible. > > I played a fair bit with various options. I ended up not using a struct to > pass most options, but instead go for a flags argument. However, I did use a > struct for passing either relation or smgr. > > > typedef enum ExtendBufferedFlags { > EB_SKIP_EXTENSION_LOCK = (1 << 0), > EB_IN_RECOVERY = (1 << 1), > EB_CREATE_FORK_IF_NEEDED = (1 << 2), > EB_LOCK_FIRST = (1 << 3), > > /* internal flags follow */ > EB_RELEASE_PINS = (1 << 4), > } ExtendBufferedFlags; Is EB_IN_RECOVERY always set when RecoveryInProgress()? Is it really needed? What does EB_LOCK_FIRST do? > /* > * To identify the relation - either relation or smgr + relpersistence has to > * be specified. Used via the EB_REL()/EB_SMGR() macros below. This allows us > * to use the same function for both crash recovery and normal operation. > */ > typedef struct ExtendBufferedWhat > { > Relation rel; > struct SMgrRelationData *smgr; > char relpersistence; > } ExtendBufferedWhat; > > #define EB_REL(p_rel) ((ExtendBufferedWhat){.rel = p_rel}) > /* requires use of EB_SKIP_EXTENSION_LOCK */ > #define EB_SMGR(p_smgr, p_relpersistence) ((ExtendBufferedWhat){.smgr = p_smgr, .relpersistence = p_relpersistence}) Clever. I'm still not 100% convinced we need the EB_SMGR variant, but with this we'll have the flexibility in any case. > extern Buffer ExtendBufferedRel(ExtendBufferedWhat eb, > ForkNumber forkNum, > BufferAccessStrategy strategy, > uint32 flags); > extern BlockNumber ExtendBufferedRelBy(ExtendBufferedWhat eb, > ForkNumber fork, > BufferAccessStrategy strategy, > uint32 flags, > uint32 extend_by, > Buffer *buffers, > uint32 *extended_by); > extern Buffer ExtendBufferedRelTo(ExtendBufferedWhat eb, > ForkNumber fork, > BufferAccessStrategy strategy, > uint32 flags, > BlockNumber extend_to, > ReadBufferMode mode); > > As you can see I removed ReadBufferMode from most of the functions (as > suggested by Heikki earlier). When extending by 1/multiple pages, we only need > to know whether to lock or not. Ok, that's better. Still complex and a lot of arguments, but I don't have any great suggestions on how to improve it. > The reason ExtendBufferedRelTo() has a 'mode' argument is that that allows to > fall back to reading page normally if there was a concurrent relation > extension. Hmm, I think you'll need another return value, to let the caller know if the relation was extended or not. Or a flag to ereport(ERROR) if the page already exists, for ginbuild() and friends. > The reason EB_CREATE_FORK_IF_NEEDED exists is to remove the duplicated, > gnarly, code to do so from vm_extend(), fsm_extend(). Makes sense. > I'm not sure about the function naming pattern. I do like 'By' a lot more than > the Bulk prefix I used before. +1 - Heikki
pgsql-hackers by date: