Re: Single transaction in the tablesync worker? - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: Single transaction in the tablesync worker?
Date
Msg-id e2d6c854-61ac-2720-37bd-d463a8fc1210@enterprisedb.com
In response to Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Single transaction in the tablesync worker?
List pgsql-hackers
Hi,

We had a high-level discussion about these patches with Amit 
off-list, so I decided to also take a look at the actual code.

My main concern originally was the potential for left-over slots on the 
publisher, but I think the state now is relatively okay, with a couple of 
corner cases that are documented and don't seem much worse than the main 
slot.

I wonder if we should mention the max_slot_wal_keep_size GUC in the 
table sync docs, though.
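For illustration, capping the WAL retained by slots is a one-line setting 
on the publisher; the value below is just an example, not a recommendation:

```
# postgresql.conf on the publisher: limit how much WAL any replication
# slot can retain, so a leaked tablesync slot cannot fill the disk
# indefinitely (default is -1, i.e. no limit).
max_slot_wal_keep_size = 10GB
```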

Another thing that might need documentation is that the visibility 
of changes done by table sync is no longer isolated, in that the table 
contents will show intermediate progress to other backends, rather than 
switching from nothing to a state consistent with the rest of the 
replication.


Some minor comments about code:

> +        else if (res->status == WALRCV_ERROR && missing_ok)
> +        {
> +            /* WARNING. Error, but missing_ok = true. */
> +            ereport(WARNING,

I wonder if we need to add an error code to the WalRcvExecResult and check 
for the appropriate ones here, because this can return an error due to a 
timeout, for example, not because the slot is missing. Not sure if it 
matters for the current callers though (but then maybe don't call the 
param missing_ok?).


> +ReplicationSlotNameForTablesync(Oid suboid, Oid relid, char syncslotname[NAMEDATALEN])
> +{
> +    if (syncslotname)
> +        sprintf(syncslotname, "pg_%u_sync_%u", suboid, relid);
> +    else
> +        syncslotname = psprintf("pg_%u_sync_%u", suboid, relid);
> +
> +    return syncslotname;
> +}

Given that we are now explicitly dropping slots, what happens here if we 
have 2 different downstreams that happen to get the same suboid and relid? 
Will one of them drop the slot of the other one? Previously, with the 
cleanup being left to the temp slot, we'd at most have gotten an error 
when creating it, but with the new logic in LogicalRepSyncTableStart it 
feels like we could get into a situation where 2 downstreams are fighting 
over the slot, no?


-- 
Petr


