Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) - Mailing list pgsql-bugs
From | Alexander Lakhin |
---|---|
Subject | Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) |
Date | |
Msg-id | b2cc98b1-0ba0-1358-572f-8a1a01a3880e@gmail.com Whole thread Raw |
In response to | Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)
|
List | pgsql-bugs |
18.12.2019 19:03, Tom Lane wrote:
Regarding fsync_fname_ext(), I thought that it's OK to avoid performing a call that is known to fail. And even if someone will try to open() a directory, he will need to add the same check for EACCES, otherwise the code will fail on Windows. In fact, I see the same pattern in pre_sync_fname, fsync_fname in file_utils.c.Alexander Lakhin <exclusion@gmail.com> writes:18.12.2019 6:25, Kyotaro Horiguchi wrote:At Tue, 17 Dec 2019 18:50:26 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote inIt does look suspiciously like this wait is triggering on these machines and somehow getting masked in most other test cases.https://web.archive.org/web/20150317121919/http://support.microsoft.com/en-us/kb/113996 tells that STATUS_FILE_IS_A_DIRECTORY is mapped to ERROR_ACCESS_DENIED too.Ugh. I wondered if we could really get away with slowing down all ERROR_ACCESS_DENIED cases.It seems that the cause of the issue is in fsync_fname_ext:I think you're blaming the messenger. We cannot suddenly impose a rule that people mustn't try to open() files that might be directories. Even if fsync_fname_ext() happens to be the only place that does that today (which I doubt) it's going to bite us on the rear regularly in the future, because there's no way to enforce it, and most developers aren't going to notice the problem in testing.
I see your point about future uncertainty, but I don't think that receiving e.g. STATUS_FILE_DELETED or STATUS_NO_EFS error every few seconds and ignoring it is a normal practice.It's possible that we could deal with this specific case by having pgwin32_open() try to stat the file and see if it's a directory. But I think that's messy. Also, the list of kernel-level conditions that that webpage says are mapped to ERROR_ACCESS_DENIED is scarily long: STATUS_THREAD_IS_TERMINATING ERROR_ACCESS_DENIED STATUS_PROCESS_IS_TERMINATING ERROR_ACCESS_DENIED STATUS_INVALID_LOCK_SEQUENCE ERROR_ACCESS_DENIED STATUS_INVALID_VIEW_SIZE ERROR_ACCESS_DENIED STATUS_ALREADY_COMMITTED ERROR_ACCESS_DENIED STATUS_ACCESS_DENIED ERROR_ACCESS_DENIED STATUS_FILE_IS_A_DIRECTORY ERROR_ACCESS_DENIED STATUS_CANNOT_DELETE ERROR_ACCESS_DENIED STATUS_FILE_DELETED ERROR_ACCESS_DENIED STATUS_FILE_RENAMED ERROR_ACCESS_DENIED STATUS_DELETE_PENDING ERROR_ACCESS_DENIED STATUS_PORT_CONNECTION_REFUSED ERROR_ACCESS_DENIED ... STATUS_ENCRYPTION_FAILED ERROR_ACCESS_DENIED STATUS_DECRYPTION_FAILED ERROR_ACCESS_DENIED STATUS_NO_RECOVERY_POLICY ERROR_ACCESS_DENIED STATUS_NO_EFS ERROR_ACCESS_DENIED STATUS_WRONG_EFS ERROR_ACCESS_DENIED STATUS_NO_USER_KEYS ERROR_ACCESS_DENIED ... SEC_E_MESSAGE_ALTERED ERROR_ACCESS_DENIED SEC_E_OUT_OF_SEQUENCE ERROR_ACCESS_DENIED Maybe most of these can't occur during fopen, but I'd rather not assume that
Unfortunately no, there is no known way to get the kernel error code. As stated in https://stackoverflow.com/questions/3764072, there is the GetFileInformationByHandleEx function, that can return DeletePending state, but to use it we should open the file first (to get a file handle).Is there a way to get the original kernel error code? It'd be a lot better if we could be sure that the condition is STATUS_DELETE_PENDING before looping.
So I see three ways now:
1) Revert the pgwin32_open change and choose the previously rejected approach: Unlink postmaster.pid using rename operation (adopt the solution from https://stackoverflow.com/questions/3764072/).
2) Add a check for directory in pgwin32_open, but I think then we're getting close to checking there just for postmaster.pid.
3) Find out how often we get ERROR_ACCESS_DENIED during all regression tests and if such error count is near zero, then add the check for isdir in fsync_fname_ext and friends, and get done.
I'm still thinking that we shouldn't perform doomed calls, but I'm ready to change the course.
Best regards,
Alexander
pgsql-bugs by date: