Re: BUG #17731: Server doesn't start after abnormal shutdown while creating unlogged tables - Mailing list pgsql-bugs

From Kyotaro Horiguchi
Subject Re: BUG #17731: Server doesn't start after abnormal shutdown while creating unlogged tables
Date
Msg-id 20230425.123332.1657429858787993644.horikyota.ntt@gmail.com
In response to Re: BUG #17731: Server doesn't start after abnormal shutdown while creating unlogged tables  (Karina Litskevich <litskevichkarina@gmail.com>)
Responses Re: BUG #17731: Server doesn't start after abnormal shutdown while creating unlogged tables
List pgsql-bugs
At Mon, 24 Apr 2023 15:59:38 +0300, Karina Litskevich <litskevichkarina@gmail.com> wrote in 
> So in case before WAL recovery main fork exists and init fork isn't, and during
> recovery init fork is created, we get this problem. The second
> ResetUnloggedRelations() call sees just created init fork and tries to create a
> main fork from it expecting that the old main fork was already deleted by the
> first ResetUnloggedRelations() call, but it wasn't because the main fork hasn't
> corresponding init fork at that moment yet.

Seems right.
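
To make the sequence concrete, here is a toy Python model of the two
reset passes.  It is only a sketch of the behavior described above: the
set-of-file-names representation, the relfilenode 16384, and the helpers
cleanup_pass, init_pass and crash_recovery are made up for illustration
and are not PostgreSQL code.  Only main and init forks are modeled.

```python
# Toy model, NOT PostgreSQL code: a data directory is a set of file names,
# "16384" is a main fork and "16384_init" is its init fork (hypothetical names).

SUFFIX = "_init"


def cleanup_pass(files):
    """First reset pass (before WAL replay): for every relation that already
    has an init fork, drop its main fork. Relations without one are skipped."""
    with_init = {f[:-len(SUFFIX)] for f in files if f.endswith(SUFFIX)}
    return {f for f in files if f.endswith(SUFFIX) or f not in with_init}


def init_pass(files):
    """Second reset pass (after WAL replay): create a main fork from each init
    fork, refusing to overwrite a main fork that is still present."""
    result = set(files)
    for f in files:
        if f.endswith(SUFFIX):
            main = f[:-len(SUFFIX)]
            if main in result:
                raise FileExistsError(main)
            result.add(main)
    return result


def crash_recovery(files_at_crash, init_forks_created_by_replay):
    """Run cleanup, simulate WAL replay creating init forks, then run init."""
    files = cleanup_pass(files_at_crash)
    files |= set(init_forks_created_by_replay)
    return init_pass(files)
```

In this model, crash_recovery({"16384", "16384_init"}, []) succeeds
because the first pass removes the old main fork, while
crash_recovery({"16384"}, ["16384_init"]) raises FileExistsError: the
main fork survives the first pass, and the init fork only appears
during replay.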

> Theoretically, this applies to all versions, but the script somehow doesn't lead
> to an error on REL_11_STABLE. I haven't investigated it yet.
> 
> I see two solutions: 1) keep init fork files until the next checkpoint as well
> as main fork files, 2) ignore (rewrite if exists) presence of an empty main
> fork file when copying from init fork. I found the latter less elegant so I
> implemented the first one. The patch is attached.
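
For reference, the second option could look roughly like the Python
sketch below.  It is an assumption-laden illustration, not the attached
patch and not PostgreSQL code: copy_init_to_main and its
overwrite_stale_main flag are hypothetical names, and the sketch
replaces any existing main file rather than checking that it is empty.

```python
import os
import shutil
import tempfile


def copy_init_to_main(init_path, main_path, overwrite_stale_main=False):
    """Copy an init fork to the main fork path (hypothetical helper).

    By default the destination is created exclusively, so a leftover main
    fork is an error. With overwrite_stale_main=True the leftover file is
    truncated and rewritten from the init fork instead.
    """
    mode = "wb" if overwrite_stale_main else "xb"
    with open(init_path, "rb") as src, open(main_path, mode) as dst:
        shutil.copyfileobj(src, dst)
```

With the default, a stale main file makes the copy fail with
FileExistsError, which mirrors the reported startup failure; with
overwrite_stale_main=True the copy goes through.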

The init-fork related code has some other issues with crash-restart. A
minor one is that a crash of the transaction creating an unlogged
relation leaves orphaned init fork files.  I haven't fully chased down
the specific issue raised here, but I think the common cause in these
cases is that the file operations around unlogged files are not fully
transactional.  There is a proposed patchset [1], the first patch of
which makes storage file creation and deletion transactional and
crash-safe.  As far as I can see, it seems to fix this case, too.

The latest version of it posted to this ML [2] needs a rebase and
some fixes for now, though.  (I'll post a rebased version soon.)

As for the proposed patch, I haven't looked closely, but I don't think
delaying init-file removal is the right approach. The reason for the
delay, as mentioned, is that someone might be accessing the file
(causing deletion to fail on some platforms). Init-fork files don't
fall into that category.

regards.

[1] https://commitfest.postgresql.org/43/3461/

[2] https://www.postgresql.org/message-id/20230317.151634.1038632016265639446.horikyota.ntt%40gmail.com

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


