Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) - Mailing list pgsql-bugs
| From | Tom Lane |
|---|---|
| Subject | Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) |
| Date | |
| Msg-id | 32374.1576770248@sss.pgh.pa.us Whole thread Raw |
| In response to | Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) (Alexander Lakhin <exclusion@gmail.com>) |
| Responses |
Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)
|
| List | pgsql-bugs |
Alexander Lakhin <exclusion@gmail.com> writes:
> Please look at the patch that implements (2). It makes `vcregress
> recoverycheck` pass (and my demo restart test still passes too).
I think we could be a little smarter here. If the problem is
STATUS_DELETE_PENDING, doesn't that affect stat() as well? That is,
if stat() succeeds, we needn't wait, independently of whether it
says S_ISDIR or not? This seems like a noticeable improvement if
true, because it would mean that ordinary file-permission failures
also need not wait. We'd still get confused if we get a permission
failure on a containing directory rather than the file itself, but
that I'm prepared to dismiss as an uncommon case.
Entirely-untested patch along this line attached.
BTW ... it's likely that stat() here is actually going to invoke
pgwin32_safestat(), which has its own opinions about this, and
indeed seems to think we'll get ERROR_DELETE_PENDING. I think
that's harmless here, but it makes me wonder if we should use
a native Windows API instead of going through stat().
> As open(dir) is getting a little more expensive than before, maybe it's
> still worthwhile to patch fsync*(..., isdir,...). I can prepare such a
> patch if so.
I think we should leave that for later, so that the buildfarm can
actually test whatever logic we put in here.
regards, tom lane
diff --git a/src/port/open.c b/src/port/open.c
index 5165276..24af6f4 100644
--- a/src/port/open.c
+++ b/src/port/open.c
@@ -21,6 +21,7 @@
#include <fcntl.h>
#include <assert.h>
+#include <sys/stat.h>
static int
@@ -137,18 +138,33 @@ pgwin32_open(const char *fileName, int fileFlags,...)
}
/*
- * ERROR_ACCESS_DENIED can be returned if the file is deleted but not
- * yet gone (Windows NT status code is STATUS_DELETE_PENDING). Wait a
- * bit and try again, giving up after 1 second (since this condition
- * should never persist very long).
+ * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet
+ * gone (Windows NT status code is STATUS_DELETE_PENDING). In that
+ * case we want to wait a bit and try again, giving up after 1 second
+ * (since this condition should never persist very long). However,
+ * there are other commonly-hit cases that return ERROR_ACCESS_DENIED,
+ * so care is needed. In particular that happens if we try to open a
+ * directory, or of course if there's an actual file-permissions
+ * problem. To distinguish these cases, try a stat(). In the
+ * delete-pending case, it will either also get STATUS_DELETE_PENDING,
+ * or it will see the file as gone and fail with ENOENT. In other
+ * cases it will usually succeed. The only somewhat-likely case where
+ * this coding will uselessly wait is if there's a permissions problem
+ * with a containing directory, which we hope will never happen in any
+ * performance-critical code paths.
*/
if (err == ERROR_ACCESS_DENIED)
{
if (loops < 10)
{
- pg_usleep(100000);
- loops++;
- continue;
+ struct stat st;
+
+ if (stat(fileName, &st) != 0)
+ {
+ pg_usleep(100000);
+ loops++;
+ continue;
+ }
}
}
pgsql-bugs by date: