Thread: add function for creating/attaching hash table in DSM registry

add function for creating/attaching hash table in DSM registry

From
Nathan Bossart
Date:
Libraries commonly use shared memory to store hash tables.  While it's
possible to set up a dshash table using the DSM registry today, doing so is
complicated; you need to set up two LWLock tranches, a DSA, and finally the
dshash table.  The attached patch adds a new function called
GetNamedDSMHash() that makes creating/attaching a hash table in the DSM
registry much easier.

-- 
nathan

Attachment

Re: add function for creating/attaching hash table in DSM registry

From
Dagfinn Ilmari Mannsåker
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:

> +typedef struct NamedDSMHashState
> +{
> +    dsa_handle    dsah;
> +    dshash_table_handle dshh;
> +    int            dsa_tranche;
> +    char        dsa_tranche_name[68];    /* name + "_dsa" */
> +    int            dsh_tranche;
> +    char        dsh_tranche_name[68];    /* name + "_dsh" */
> +} NamedDSMHashState;

I don't have enough knowledge to review the rest of the patch, but
shouldn't this use NAMEDATALEN, rather than hard-coding the default
length?

- ilmari



Thanks for this patch! I have implemented this pattern several times,
so this is really helpful.

I have a few early comments, but I plan on trying this out next.

1/
> > +typedef struct NamedDSMHashState
> > +{
> > +     dsa_handle      dsah;
> > +     dshash_table_handle dshh;
> > +     int                     dsa_tranche;
> > +     char            dsa_tranche_name[68];   /* name + "_dsa" */
> > +     int                     dsh_tranche;
> > +     char            dsh_tranche_name[68];   /* name + "_dsh" */
> > +} NamedDSMHashState;
>
> I don't have enough knowledge to review the rest of the patch, but
> shouldn't this use NAMEDATALEN, rather than hard-coding the default
> length?

NamedLWLockTrancheRequest uses NAMEDATALEN = 64 bytes for the
tranche_name

typedef struct NamedLWLockTrancheRequest
{
char tranche_name[NAMEDATALEN];
int num_lwlocks;
} NamedLWLockTrancheRequest;

but in the case here, "_dsa" or "_dsh" will occupy another 4 bytes.
I think instead of hardcoding, we should #define a length,

i.e. #define NAMEDDSMTRANCHELEN (NAMEDATALEN + 4)

2/ Can you group the dsa and dsh separately. I felt this was a bit
difficult to read?

+               /* Initialize LWLock tranches for the DSA and dshash table. */
+               state->dsa_tranche = LWLockNewTrancheId();
+               state->dsh_tranche = LWLockNewTrancheId();
+               sprintf(state->dsa_tranche_name, "%s_dsa", name);
+               sprintf(state->dsh_tranche_name, "%s_dsh", name);
+               LWLockRegisterTranche(state->dsa_tranche,
state->dsa_tranche_name);
+               LWLockRegisterTranche(state->dsh_tranche,
state->dsh_tranche_name);

3/ It will be good to "Assert(dsh)" before "return dsh;" for safety?

    MemoryContextSwitchTo(oldcontext);
    LWLockRelease(DSMRegistryLock);

    return dsh;
}


--
Sami Imseih
Amazon Web Services (AWS)



Re: add function for creating/attaching hash table in DSM registry

From
Nathan Bossart
Date:
On Thu, Jun 05, 2025 at 01:38:25PM -0500, Sami Imseih wrote:
> I have a few early comments, but I plan on trying this out next.

Thanks for reviewing.

>> > +typedef struct NamedDSMHashState
>> > +{
>> > +     dsa_handle      dsah;
>> > +     dshash_table_handle dshh;
>> > +     int                     dsa_tranche;
>> > +     char            dsa_tranche_name[68];   /* name + "_dsa" */
>> > +     int                     dsh_tranche;
>> > +     char            dsh_tranche_name[68];   /* name + "_dsh" */
>> > +} NamedDSMHashState;
>>
>> I don't have enough knowledge to review the rest of the patch, but
>> shouldn't this use NAMEDATALEN, rather than hard-coding the default
>> length?

I straightened this out in v2.  I've resisted using NAMEDATALEN because
this stuff is unrelated to the name type.  But I have moved all the lengths
and suffixes to macros.

> NamedLWLockTrancheRequest uses NAMEDATALEN = 64 bytes for the
> tranche_name
> 
> typedef struct NamedLWLockTrancheRequest
> {
> char tranche_name[NAMEDATALEN];
> int num_lwlocks;
> } NamedLWLockTrancheRequest;

I think the NAMEDATALEN limit only applies to tranches requested at startup
time.  LWLockRegisterTranche() just saves whatever pointer you give it, so
AFAICT there's no real limit there.

> 2/ Can you group the dsa and dsh separately. I felt this was a bit
> difficult to read?
> 
> +               /* Initialize LWLock tranches for the DSA and dshash table. */
> +               state->dsa_tranche = LWLockNewTrancheId();
> +               state->dsh_tranche = LWLockNewTrancheId();
> +               sprintf(state->dsa_tranche_name, "%s_dsa", name);
> +               sprintf(state->dsh_tranche_name, "%s_dsh", name);
> +               LWLockRegisterTranche(state->dsa_tranche,
> state->dsa_tranche_name);
> +               LWLockRegisterTranche(state->dsh_tranche,
> state->dsh_tranche_name);

Done.

> 3/ It will be good to "Assert(dsh)" before "return dsh;" for safety?
> 
>     MemoryContextSwitchTo(oldcontext);
>     LWLockRelease(DSMRegistryLock);
> 
>     return dsh;

Eh, I would expect the tests to start failing horribly if I managed to mess
that up.

-- 
nathan

Attachment