Home > mailing lists
Re: Turning recovery.conf into GUCs - Mailing list pgsql-hackers

From	Alex Shulgin
Subject	Re: Turning recovery.conf into GUCs
Date	November 21, 2014 17:36:39
Msg-id	87389ctq24.fsf@commandprompt.com Whole thread Raw
In response to	Re: Turning recovery.conf into GUCs (Andres Freund <andres@2ndquadrant.com>)
Responses	Re: Turning recovery.conf into GUCs Re: Turning recovery.conf into GUCs
List	pgsql-hackers
Tree view
Hello,

Here's an attempt to revive this patch.

It is rebased onto the latest master and also includes handling and
documentation of newly added recovery.conf parameters such as
primary_slot_name, recovery_min_apply_delay and
recovery_target='immediate'.

The following feedback had been addressed:

Andres Freund <andres@2ndquadrant.com> writes:
>> >
>> > * --write-standby-enable seems to loose quite some functionality in
>> >   comparison to --write-recovery-conf since it doesn't seem to set
>> >   primary_conninfo, standby anymore.
>>
>> we can add code that looks for postgresql.conf in $PGDATA but if
>> postgresql.conf is not there (like the case in debian, there is not
>> much we can do about it) or we can write a file ready to be included
>> in postgresql.conf, any sugestion?
>
> People might not like me for the suggestion, but I think we should
> simply always include a 'recovery.conf' in $PGDATA
> unconditionally. That'd make this easier.
> Alternatively we could pass a filename to --write-recovery-conf.

Well, the latest version of this patch fails to start when it sees
'recovery.conf' in PGDATA:

  FATAL:  "recovery.conf" is not supported anymore as a recovery method
  DETAIL:  Refer to appropriate documentation about migration methods

I've missed all the discussion behind this decision and after reading
the ALTER SYSTEM/conf.d thread I'm even more confused so I'd like
someone more knowledgeable to speak up on the status of this.

Do we want to keep this behavior of the patch?

>> > * CheckRecoveryReadyFile() doesn't seem to be a very descriptive
>> >   function name.
>>
>> I left it as CheckStartingAsStandby() but i still have a problem of
>> this not being completely clear. this function is useful for standby
>> or pitr.

There's not much left for this function in the current patch version, so
maybe we should just move it to StartupXLOG (it's not called from
anywhere else either way).

>> which leads me to the other problem i have: the recovery trigger file,
>> i have left it as standby.enabled but i still don't like it.
>
>> recovery.trigger (Andres objected on this name)
>> forced_recovery.trigger
>> user_forced_recovery.trigger
>
> stay_in_recovery.trigger? That'd be pretty clear for anybody involved in
> pg, but otherwise...

The docs use the term "continuous recovery".

Either way, from the code it is clear that we only stay in recovery if
standby_mode is directly turned on.  This makes the whole check for a
specially named file unnecessary, IMO: we should just check the value of
standby_mode (which is off by default).

By the way, is there any use in setting standby_mode=on and any of the
recovery_target* GUCs at the same time?

I think it can only play together if you set the target farther than the
latest point you've got in the archive locally.  So that's sort of
"Point-in-Future-Recovery".  Does that make any sense at all?

>> > * the description of archive_cleanup_command seems wrong to me.
>>
>> why? it seems to be the same that was in recovery.conf. where did you
>> see the description you're complaining at?
>
> I dislike the description in guc.c
>
>> +        {"archive_cleanup_command", PGC_POSTMASTER, WAL_ARCHIVE_RECOVERY,
>> +            gettext_noop("Sets the shell command that will be executed at every restartpoint."),
>> +            NULL
>> +        },
>> +        &archive_cleanup_command,
>
> previously it was:
>
>> -# specifies an optional shell command to execute at every restartpoint.
>> -# This can be useful for cleaning up the archive of a standby server.

Expanded the GUC desc.

>> > * Why the PG_TRY/PG_CATCH in check_recovery_target_time? Besides being
>> >   really strangely formatted (multiline :? inside a function?) it
>> >   doesn't a) seem to be correct to ignore potential memory allocation
>> >   errors by just switching back into the context that just errored out,
>> >   and continue to work there b) forgo cleanup by just continuing as if
>> >   nothing happened. That's unlikely to be acceptable.
>>
>> the code that read recovery.conf didn't has that, so i just removed it
>
> Well, that's not necessarily correct. recovery.conf was only read during
> startup, while this is read on SIGHUP.
>
[copied from the bottom, related]
>
> I don't think that's correct. Afaics it will cause the postmaster to
> crash on a SIGHUP with invalid data. I think you're unfortunately going
> to have to copy a fair bit of timestamptz_in() and even
> DateTimeParseError(), replacing the ereport()s by the guc error
> reporting.

The use of PG_TRY/CATCH does protect from FATALs in SIGHUP indeed.
Using CopyErrorData() we can also fetch the actual error message from
timestamptz_in, though I wonder we really have to make a full copy.

>> > * Why do you include xlog_internal.h in guc.c and not xlog.h?
>
>> we actually need both but including xlog_internal.h also includes xlog.h
>> i added xlog.h and if someone things is enough only putting
>> xlog_internal.h let me know
>
> What's required from xlog_internal.h?

Looks like this was addressed before me.

>> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
>> index b53ae87..54f6a0d 100644
>> --- a/src/backend/access/transam/xlog.c
>> +++ b/src/backend/access/transam/xlog.c
>> @@ -64,11 +64,12 @@
>>  extern uint32 bootstrap_data_checksum_version;
>>
>>  /* File path names (all relative to $PGDATA) */
>> -#define RECOVERY_COMMAND_FILE    "recovery.conf"
>> -#define RECOVERY_COMMAND_DONE    "recovery.done"
>> +#define RECOVERY_ENABLE_FILE    "standby.enabled"
>
> Imo the file and variable names should stay coherent.

Yes, once we settle on the name (and if we really need that extra
trigger file.)

>> +/* recovery.conf is not supported anymore */
>> +#define RECOVERY_COMMAND_FILE    "recovery.conf"
>
>> +bool        StandbyModeRequested = false;
>> +static TimestampTz recoveryDelayUntilTime;
>
> This imo should be lowercase now, the majority of GUC variables are.

This is not a GUC variable, though it's calculated based on a GUC
recovery_min_apply_delay.

>> +/* are we currently in standby mode? */
>> +bool StandbyMode = false;
>
> Why did you move this?

It was easy to move it back though.

>> -    if (rtliGiven)
>> +    if (strcmp(recovery_target_timeline_string, "") != 0)
>>      {
>
> Why not have the convention that NULL indicates a unset target_timeline
> like you use for other GUCs? Mixing things like this is confusing.
>
> Why is recovery_target_timeline stored as a string? Because it's a
> unsigned int? If so, you should have an assign hook setting up a)
> rtliGiven, b) properly typed variable.

Yes, I believe setting these to NULL by default makes a lot more sense.
Then one can check if there was a non-default setting by checking the
*_string variable, which is not possible with int or TimestampTz.

>> -        if (rtli)
>> +        if (recovery_target_timeline)
>>          {
>>              /* Timeline 1 does not have a history file, all else should */
>> -            if (rtli != 1 && !existsTimeLineHistory(rtli))
>> +            if (recovery_target_timeline != 1 &&
>> +                !existsTimeLineHistory(recovery_target_timeline))
>>                  ereport(FATAL,
>>                          (errmsg("recovery target timeline %u does not exist",
>> -                                rtli)));
>> -            recoveryTargetTLI = rtli;
>> +                                recovery_target_timeline)));
>> +            recoveryTargetTLI = recovery_target_timeline;
>>              recoveryTargetIsLatest = false;
>
> So, now we have a recoveryTargetTLI and recovery_target_timeline
> variable? Really? Why do we need recoveryTargetTLI at all now?

Looks like we still do need both of them.  The initial value of
recoveryTargetTLI is derived from pg_control, but later it can be
overriden by this setting.

However, if the recovery_target_timeline was "latest" we need to find
the latest TLI, based on the initial value from pg_control.  But we
can't set final recoveryTargetTLI value in the GUC assign hook, because
we might need to fetch some history files first.

>> +static void
>> +assign_recovery_target_time(const char *newval, void *extra)
>> +{
>> +    recovery_target_time = *((TimestampTz *) extra);
>> +
>> +    if (recovery_target_xid != InvalidTransactionId)
>> +        recovery_target = RECOVERY_TARGET_XID;
>> +    else if (recovery_target_name[0])
>> +        recovery_target = RECOVERY_TARGET_NAME;
>> +    else if (recovery_target_time != 0)
>> +        recovery_target = RECOVERY_TARGET_TIME;
>> +    else
>> +        recovery_target = RECOVERY_TARGET_UNSET;
>> +}
>> +
>
> I don't think it's correct to do such hangups in the assign hook - you
> have no ideas in which order they will be called and such. Imo that
> should happen at startup, like we also do it for other interdependent
> variables like wal_buffers.

Yeah, that looked weird.  Moved to StartupXLOG().

I disliked the strtoul handling in the earlier version of the patch,
especially given that with the base=0 it can parse 0x-prefixed hex
strings.  I would rather error out on non-hex digit instead of stopping
and calling it OK.  This change is included in the new version.

Should we really allow specifying negative values for XID/timeline?
Right now it will happily consume "-1" for recovery_target_xid and
complain if it's out of range, like this:

LOG:  starting point-in-time recovery to XID 4294967295
LOG:  invalid primary checkpoint record
LOG:  invalid secondary checkpoint record

Allowing negative values makes even less sense for timelines, IMO.

--
Alex
Attachment

recovery_guc_v5.3.patch
pgsql-hackers by date:
From: Tom Lane
Date: 21 November 2014, 17:35:10
Subject: Re: psql \sf doesn't show it's SQL when ECHO_HIDDEN is on
From: Josh Berkus
Date: 21 November 2014, 17:55:13
Subject: Re: Turning recovery.conf into GUCs
Re: Turning recovery.conf into GUCs - Mailing list pgsql-hackers

Attachment

Previous

Next