Thread: [PATCH] initdb: Treat empty -U argument as unset username
While working with `initdb`, I noticed that passing an empty string to the `-U` option (e.g., `initdb -U ''`) causes it to fail with a misleading error:
performing post-bootstrap initialization ... 2025-07-01 19:48:42.006 PDT [14888] FATAL: role """ does not exist at character 72
2025-07-01 19:48:42.006 PDT [14888] STATEMENT:
UPDATE pg_class SET relacl = (SELECT array_agg(a.acl) FROM (SELECT E'=r/""' as acl UNION SELECT unnest(pg_catalog.acldefault( CASE WHEN relkind = 'S' THEN 's' ELSE 'r' END::"char",10::oid)) ) as a) WHERE relkind IN ('r', 'v', 'm', 'S') AND relacl IS NULL;
This happens because `initdb` accepts the empty string as a valid role name and attempts to use it as the database superuser, which is not intended and fails during bootstrap SQL.
I propose a small patch that treats an empty string passed to `-U` as if the option was not provided at all — falling back to the current system user, which is the documented and expected behavior when `-U` is omitted.
This change improves robustness and avoids confusing failure messages due to user input that is technically invalid but easy to produce (e.g., via scripting or argument quoting issues).
### Patch summary:
- Checks if the passed `username` is non-null but empty (`'\0'`)
- Replaces it with the effective system user in that case
- Keeps the logic consistent with the existing behavior when `-U` is omitted
Let me know if this approach seems reasonable or if you’d prefer we explicitly reject empty usernames with an error instead.
Patch attached.
Best regards,
Jianghua Yang
Attachment
git show 8e673801262c66af4a54837f63ff596407835c20
effective_user = get_id();
- if (strlen(username) == 0)
+ if (!username)
username = effective_user;
strlen(username) == 0
to !username
, but this inadvertently stopped handling the empty-string case. This patch restores the original intent and makes the behavior consistent.On Tue, Jul 1, 2025 at 7:56 PM Jianghua Yang <yjhjstz@gmail.com> wrote:Let me know if this approach seems reasonable or if you’d prefer we explicitly reject empty usernames with an error instead.I'd rather we reject the ambiguous input.David J.
git show 8e673801262c66af4a54837f63ff596407835c20
effective_user = get_id();
- if (strlen(username) == 0)
+ if (!username)
username = effective_user;
The previous code already intended to treat a missing username as falling back to the system user.The check was changed fromstrlen(username) == 0
to!username
, but this inadvertently stopped handling the empty-string case. This patch restores the original intent and makes the behavior consistent.
On Wed, Jul 2, 2025 at 12:01 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > On Tue, Jul 1, 2025 at 8:31 PM Jianghua Yang <yjhjstz@gmail.com> wrote: >> >> git show 8e673801262c66af4a54837f63ff596407835c20 >> >> >> effective_user = get_id(); >> >> - if (strlen(username) == 0) >> >> + if (!username) >> >> username = effective_user; >> >> >> The previous code already intended to treat a missing username as falling back to the system user. >> The check was changed from strlen(username) == 0 to !username, but this inadvertently stopped handling the empty-stringcase. This patch restores the original intent and makes the behavior consistent. >>> >>> > > At this point I'd rather take advantage of this behaveing in the "doesn't work" category for the past 8 years, and thusall supported releases, and not change existing behavior (just improve the error message) rather than accept originalintent. Also, the amount of things it has to be consistent with is quite small and I'm pytr sure that some of thoseare also broken - encoding/pgdata/textsearch all exhibit the same pattern (xlog is the reverse so maybe ok...) > FWIW, I tend to agree with David; I feel like if a user passes in -U, there was probably a reason, and a good error message would be more useful in clarifying things rather than blindly pushing forward with potentially the wrong thing. Robert Treat https://xzilla.net
> On 2 Jul 2025, at 15:52, Jianghua Yang <yjhjstz@gmail.com> wrote: > > Hi hackers, > > Based on the suggestion that we should explicitly reject empty usernames instead of silently falling back, I’ve updatedthe patch accordingly. > > ### Changes in v2: > > - `initdb` now errors out immediately if the `-U` or `--username` argument is an empty string. > - The error message is: > > superuser name must not be empty > > - A regression test is added to `src/bin/initdb/t/001_initdb.pl` to verify that the case `initdb -U ''` fails as expected. > > This approach avoids any ambiguity about whether an empty username is valid, and fails early with a clear message. It alsobrings consistency with existing checks, such as the one disallowing superuser names starting with `pg_`. > > Let me know if this looks acceptable or if further refinement is needed. + pg_log_error("superuser name must not be empty"); + exit(1); I would prefer pg_fatal() for this. -- Daniel Gustafsson
Jianghua Yang <yjhjstz@gmail.com> writes: > - A regression test is added to `src/bin/initdb/t/001_initdb.pl` to verify > that the case `initdb -U ''` fails as expected. [ ... ] > diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl > index 15dd10ce40a..67eb53064f6 100644 > --- a/src/bin/initdb/t/001_initdb.pl > +++ b/src/bin/initdb/t/001_initdb.pl > @@ -37,6 +37,10 @@ command_fails( > command_fails([ 'initdb', '--username' => 'pg_test', $datadir ], > 'role names cannot begin with "pg_"'); > > +command_fails( > + [ 'initdb', '-U', '', $datadir ], > + 'empty username not allowed'); > + This only tests that it fails, not that it fails as expected. It should use command_fails_like() to check that stderr contains the expected error. Also, it shoud use => between the -U option and its argument, as seen in the above test with --username. - ilmari
I've updated the test to use `command_fails_like()` instead of `command_fails()`, so it now asserts that the error message matches the expected stderr output.
Let me know if any further adjustments are needed.
Best regards,
Jianghua Yang
Jianghua Yang <yjhjstz@gmail.com> writes:
> - A regression test is added to `src/bin/initdb/t/001_initdb.pl` to verify
> that the case `initdb -U ''` fails as expected.
[ ... ]
> diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
> index 15dd10ce40a..67eb53064f6 100644
> --- a/src/bin/initdb/t/001_initdb.pl
> +++ b/src/bin/initdb/t/001_initdb.pl
> @@ -37,6 +37,10 @@ command_fails(
> command_fails([ 'initdb', '--username' => 'pg_test', $datadir ],
> 'role names cannot begin with "pg_"');
>
> +command_fails(
> + [ 'initdb', '-U', '', $datadir ],
> + 'empty username not allowed');
> +
This only tests that it fails, not that it fails as expected. It should
use command_fails_like() to check that stderr contains the expected
error. Also, it shoud use => between the -U option and its argument, as
seen in the above test with --username.
- ilmari
Attachment
On 02.07.25 04:55, Jianghua Yang wrote: > While working with `initdb`, I noticed that passing an empty string to > the `-U` option (e.g., `initdb -U ''`) causes it to fail with a > misleading error: > > performing post-bootstrap initialization ... 2025-07-01 19:48:42.006 PDT > [14888] FATAL:role """ does not exist at character 72 > > 2025-07-01 19:48:42.006 PDT [14888] STATEMENT: > > UPDATE pg_class SET relacl = (SELECT array_agg(a.acl) FROM(SELECT > E'=r/""' as acl UNION SELECT unnest(pg_catalog.acldefault(CASE WHEN > relkind = 'S' THEN 's'ELSE 'r' END::"char",10::oid)) ) as a) WHERE > relkind IN ('r', 'v', 'm', 'S')AND relacl IS NULL; > > This happens because `initdb` accepts the empty string as a valid role > name and attempts to use it as the database superuser, which is not > intended and fails during bootstrap SQL. I'll start by saying, of course an empty user name isn't going to work, so we should reject it. But let's dig a little deeper into why it fails. Observe the error: FATAL:role """ does not exist at character 72 It thinks that the role name is `"` (a sole double-quote, not empty!). Why is that? This error comes from the literal E'=r/""' interpreted as an aclitem value. The aclitem parsing ends up in getid() in src/backend/utils/adt/acl.c, which thinks that an input string consisting entirely of "" is an escaped double quote. Maybe it's worth fixing that, and making putid() also print empty user names correspondingly. Alternatively, it's the fault of initdb that it constructs aclitem values that don't follow the aclitem-specific quoting rules. Another thought is, if we don't allow zero-length names, shouldn't namein() reject empty input strings? Then this whole thing would fail as postgres.bki is being loaded. (This is more hypothetical, since this appears to break a number of other things.) All of this is to say, it's worth looking at the actual cause and think about if there are related problems, maybe other name patterns that we don't handle well, instead of just papering over it at the top level.
Thanks for your detailed analysis. I appreciate you digging deeper into the root cause.
For this patch, I'd like to keep the changes to `initdb` minimal and focused on rejecting empty usernames, as that seems to be the consensus from the previous discussion.
I'll be happy to discuss the `getid()` and `aclitem` parsing behavior in a separate thread.
Best regards,
Jianghua Yang
On 02.07.25 04:55, Jianghua Yang wrote:
> While working with `initdb`, I noticed that passing an empty string to
> the `-U` option (e.g., `initdb -U ''`) causes it to fail with a
> misleading error:
>
> performing post-bootstrap initialization ... 2025-07-01 19:48:42.006 PDT
> [14888] FATAL:role """ does not exist at character 72
>
> 2025-07-01 19:48:42.006 PDT [14888] STATEMENT:
>
> UPDATE pg_class SET relacl = (SELECT array_agg(a.acl) FROM(SELECT
> E'=r/""' as acl UNION SELECT unnest(pg_catalog.acldefault(CASE WHEN
> relkind = 'S' THEN 's'ELSE 'r' END::"char",10::oid)) ) as a) WHERE
> relkind IN ('r', 'v', 'm', 'S')AND relacl IS NULL;
>
> This happens because `initdb` accepts the empty string as a valid role
> name and attempts to use it as the database superuser, which is not
> intended and fails during bootstrap SQL.
I'll start by saying, of course an empty user name isn't going to work,
so we should reject it.
But let's dig a little deeper into why it fails. Observe the error:
FATAL:role """ does not exist at character 72
It thinks that the role name is `"` (a sole double-quote, not empty!).
Why is that?
This error comes from the literal
E'=r/""'
interpreted as an aclitem value. The aclitem parsing ends up in getid()
in src/backend/utils/adt/acl.c, which thinks that an input string
consisting entirely of "" is an escaped double quote.
Maybe it's worth fixing that, and making putid() also print empty user
names correspondingly.
Alternatively, it's the fault of initdb that it constructs aclitem
values that don't follow the aclitem-specific quoting rules.
Another thought is, if we don't allow zero-length names, shouldn't
namein() reject empty input strings? Then this whole thing would fail
as postgres.bki is being loaded. (This is more hypothetical, since this
appears to break a number of other things.)
All of this is to say, it's worth looking at the actual cause and think
about if there are related problems, maybe other name patterns that we
don't handle well, instead of just papering over it at the top level.
Peter Eisentraut <peter@eisentraut.org> writes: > ... The aclitem parsing ends up in getid() > in src/backend/utils/adt/acl.c, which thinks that an input string > consisting entirely of "" is an escaped double quote. Yeah, that is definitely broken, and also it occurs to me that this coding is not robust in the face of non-ASCII identifiers (since the behavior of isalnum is unportable in that case). I've posted a draft patch for that in a separate thread [1]. > Maybe it's worth fixing that, and making putid() also print empty user > names correspondingly. putid() prints an empty name as an empty string, which seems fine at least at this level. > Alternatively, it's the fault of initdb that it constructs aclitem > values that don't follow the aclitem-specific quoting rules. With the patch at [1], initdb -U '' still fails at the same place, but now with FATAL: a name must follow the "/" sign at character 72 which is a consequence of getid()'s output not distinguishing "I found nothing" from "I found an empty string". Now if we cared to support empty identifiers, we could make some more revisions here so that those were consistently represented as "" rather than nothing. But I fail to see that that's a useful expenditure of time: we're never going to allow empty names. > Another thought is, if we don't allow zero-length names, shouldn't > namein() reject empty input strings? I'm inclined to think not. It's introducing too much of a gotcha for too little return. As a perhaps-not-quite-exact analogy, NULL::name also doesn't correspond to any identifier you can spell in SQL; but we're not going to try to forbid that value. So there is at least one value of type "name" that isn't valid in SQL, and I don't see why ''::name can't be another. regards, tom lane [1] https://www.postgresql.org/message-id/3792884.1751492172%40sss.pgh.pa.us