Re: Duplicate history file? - Mailing list pgsql-hackers

From Tatsuro Yamada
Subject Re: Duplicate history file?
Msg-id 4d9aa52e-00dd-a68d-da45-50ab863af6b6@nttcom.co.jp_1
In response to Duplicate history file?  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers

Hi Horiguchi-san,

> This thread should have been started here:
> 
> https://www.postgresql.org/message-id/20210531.165825.921389284096975508.horikyota.ntt%40gmail.com
>>
>> (To recap: In a replication set using archive, startup tries to
>> restore WAL files from archive before checking pg_wal directory for
>> the desired file.  The behavior itself is intentionally designed and
>> reasonable. However, the restore code notifies of a restored file
>> regardless of whether it has been already archived or not.  If
>> archive_command is written so as to return error for overwriting as we
>> suggest in the documentation, that behavior causes archive failure.)
>>
>> After playing with this, I see the problem just by restarting a
>> standby even in a simple archive-replication set after making
>> not-special prerequisites.  So I think this is worth fixing.
>>
>> With this patch, KeepFileRestoredFromArchive compares the contents of
>> just-restored file and the existing file for the same segment only
>> when:
>>
>>       - archive_mode = always
>>   and - the file to restore already exists in pg_wal
>>   and - it has a .done and/or .ready status file.
>>
>> which doesn't happen usually.  Then the function skips archive
>> notification if the contents are identical.  The included TAP test is
>> working both on Linux and Windows.
> 
> 
> Thank you for the analysis and the patch.
> I'll try the patch tomorrow.
> 
> I just noticed that this thread is still tied to another thread
> (it's not an independent thread). To fix that, it may be better to
> create a new thread again. 


I've tried your patch. Unfortunately, it did not seem to fix the
problem shown by the script I sent to reproduce it.

I understand that, as Stefan says, the test and cp commands have
problems and should not be used in archive_command. Maybe this is not
a big problem for the community.
Nevertheless, even if we do not improve the feature, I think it would
be a good idea, for the benefit of new users, to state explicitly in
the documentation that archiving may fail under certain conditions.
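
As a minimal sketch of the behavior under discussion (an illustration
only, not part of the patch and not taken from the PostgreSQL
documentation), an archive script could treat a file that is already
archived with identical contents as success, and report failure only
when the contents differ. The function name archive_file and the script
layout are hypothetical, and concurrent archivers are not handled.

```python
#!/usr/bin/env python3
"""Hypothetical helper for archive_command: <script> <source path> <archive dir>."""
import filecmp
import os
import shutil
import sys
import tempfile


def archive_file(src: str, archive_dir: str) -> int:
    """Return 0 on success, 1 on failure (non-zero makes PostgreSQL retry)."""
    dest = os.path.join(archive_dir, os.path.basename(src))
    if os.path.exists(dest):
        # Already archived: succeed only if the contents are byte-identical,
        # so re-archiving a restored file does not fail forever.
        return 0 if filecmp.cmp(src, dest, shallow=False) else 1
    # Copy to a temporary name in the same directory, sync, then rename,
    # so a crash never leaves a partial file under the final name.
    fd, tmp = tempfile.mkstemp(dir=archive_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())
        os.rename(tmp, dest)
    except OSError:
        if os.path.exists(tmp):
            os.unlink(tmp)
        return 1
    return 0


# Only act as a command when called with exactly two arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(archive_file(sys.argv[1], sys.argv[2]))
```

Assuming the script were saved at a path of your choosing, it could be
wired up with something like
archive_command = 'python3 /path/to/script.py "%p" /mnt/server/archivedir',
where both paths are placeholders.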

I'd like to hear the opinions of experts on the archive command.

P.S.
My customer's problem has already been solved, so it's ok. I've
emailed -hackers with the aim of preventing users from encountering
the same problem.


Regards,
Tatsuro Yamada




