Thread: Re: getting "shell command argument contains a newline or carriage return:" error with pg_dumpall when db name have new line in double quote

On Thu, 30 Jan 2025 at 16:47, Srinath Reddy <srinath2133@gmail.com> wrote:
>
>
>
> On Wed, Jan 29, 2025 at 9:55 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>
>> Hi,
>> While doing some testing with pg_dumpall, I noticed one weird behaviour.
>>
>> While we create the database, we are allowing the database name with a new line (if name is in double quote).
>> For example:
>>>
>>> postgres=# create database "dbstr1;
>>> dbstr 2";
>>> CREATE DATABASE
>>> postgres=#
>>
>> Here, the database name is in 2 lines.
>>
>> With the help of pg_dumpall, I tried to dump but I am getting an error for the new line.
>>
>>> --
>>> -- Database "dbstr1;
>>> dbstr 2" dump
>>> --
>>>
>>> shell command argument contains a newline or carriage return: " dbname='dbstr1;
>>> dbstr 2'"
>>
>>  
>> After this message, we are stopping the dump.
>
>
> I have reproduced and verified the same.The reason is in runPgDump during appendShellString for forming the pg_dump command , in appendShellStringNoError we are considering the string as invalid if it has '\n' and '\r'.
>  
>>
>>
>> I think, if we are allowing new lines in the db name, then we should dump it.
>

In another thread[1], we have some discussions regarding \n\r in dbname and they think that we should fix it.

As per code,
*
 * Append the given string to the shell command being built in the buffer,
 * with shell-style quoting as needed to create exactly one argument.
 *
 * Forbid LF or CR characters, which have scant practical use beyond designing
 * security breaches.  The Windows command shell is unusable as a conduit for
 * arguments containing LF or CR characters.  A future major release should
 * reject those characters in CREATE ROLE and CREATE DATABASE, because use
 * there eventually leads to errors here.

 *
 * appendShellString() simply prints an error and dies if LF or CR appears.
 * appendShellStringNoError() omits those characters from the result, and
 * returns false if there were any.
 */
void
appendShellString(PQExpBuffer buf, const char *str)
{
    if (!appendShellStringNoError(buf, str))
    {
        fprintf(stderr,
                _("shell command argument contains a newline or carriage return: \"%s\"\n"),
                str);
        exit(EXIT_FAILURE);
    }
}

Here, we are mentioning that in future majar releases, we should reject \n\r in CREATE ROLE and CREATE DATABASE.

Above comment was added in 2016.
commit 142c24c23447f212e642a0ffac9af878b93f490d
Author: Noah Misch <noah@leadboat.com>
Date:   Mon Aug 8 10:07:46 2016 -0400

    Reject, in pg_dumpall, names containing CR or LF.
   
    These characters prematurely terminate Windows shell command processing,
    causing the shell to execute a prefix of the intended command.  The
    chief alternative to rejecting these characters was to bypass the
    Windows shell with CreateProcess(), but the ability to use such names
    has little value.  Back-patch to 9.1 (all supported versions).
   
    This change formally revokes support for these characters in database
    names and roles names.  Don't document this; the error message is
    self-explanatory, and too few users would benefit.  A future major
    release may forbid creation of databases and roles so named.  For now,
    check only at known weak points in pg_dumpall.  Future commits will,
    without notice, reject affected names from other frontend programs.
   
    Also extend the restriction to pg_dumpall --dbname=CONNSTR arguments and
    --file arguments.  Unlike the effects on role name arguments and
    database names, this does not reflect a broad policy change.  A
    migration to CreateProcess() could lift these two restrictions.
   
    Reviewed by Peter Eisentraut.
   
    Security: CVE-2016-5424

As per above comments, we can work on a patch which will reject \n\r in roles and database names.

I will work on this.

 
--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
On Thu, 20 Mar 2025 at 23:39, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>
> On Thu, 30 Jan 2025 at 16:47, Srinath Reddy <srinath2133@gmail.com> wrote:
> >
> >
> >
> > On Wed, Jan 29, 2025 at 9:55 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
> >>
> >> Hi,
> >> While doing some testing with pg_dumpall, I noticed one weird behaviour.
> >>
> >> While we create the database, we are allowing the database name with a new line (if name is in double quote).
> >> For example:
> >>>
> >>> postgres=# create database "dbstr1;
> >>> dbstr 2";
> >>> CREATE DATABASE
> >>> postgres=#
> >>
> >> Here, the database name is in 2 lines.
> >>
> >> With the help of pg_dumpall, I tried to dump but I am getting an error for the new line.
> >>
> >>> --
> >>> -- Database "dbstr1;
> >>> dbstr 2" dump
> >>> --
> >>>
> >>> shell command argument contains a newline or carriage return: " dbname='dbstr1;
> >>> dbstr 2'"
> >>
> >>  
> >> After this message, we are stopping the dump.
> >
> >
> > I have reproduced and verified the same.The reason is in runPgDump during appendShellString for forming the pg_dump command , in appendShellStringNoError we are considering the string as invalid if it has '\n' and '\r'.
> >  
> >>
> >>
> >> I think, if we are allowing new lines in the db name, then we should dump it.
> >
>
> In another thread[1], we have some discussions regarding \n\r in dbname and they think that we should fix it.
>
> As per code,
>>
>> *
>>  * Append the given string to the shell command being built in the buffer,
>>  * with shell-style quoting as needed to create exactly one argument.
>>  *
>>  * Forbid LF or CR characters, which have scant practical use beyond designing
>>  * security breaches.  The Windows command shell is unusable as a conduit for
>>  * arguments containing LF or CR characters.  A future major release should
>>  * reject those characters in CREATE ROLE and CREATE DATABASE, because use
>>  * there eventually leads to errors here.
>>  *
>>  * appendShellString() simply prints an error and dies if LF or CR appears.
>>  * appendShellStringNoError() omits those characters from the result, and
>>  * returns false if there were any.
>>  */
>> void
>> appendShellString(PQExpBuffer buf, const char *str)
>> {
>>     if (!appendShellStringNoError(buf, str))
>>     {
>>         fprintf(stderr,
>>                 _("shell command argument contains a newline or carriage return: \"%s\"\n"),
>>                 str);
>>         exit(EXIT_FAILURE);
>>     }
>> }
>
>
> Here, we are mentioning that in future majar releases, we should reject \n\r in CREATE ROLE and CREATE DATABASE.
>
> Above comment was added in 2016.
>>
>> commit 142c24c23447f212e642a0ffac9af878b93f490d
>> Author: Noah Misch <noah@leadboat.com>
>> Date:   Mon Aug 8 10:07:46 2016 -0400
>>
>>     Reject, in pg_dumpall, names containing CR or LF.
>>    
>>     These characters prematurely terminate Windows shell command processing,
>>     causing the shell to execute a prefix of the intended command.  The
>>     chief alternative to rejecting these characters was to bypass the
>>     Windows shell with CreateProcess(), but the ability to use such names
>>     has little value.  Back-patch to 9.1 (all supported versions).
>>    
>>     This change formally revokes support for these characters in database
>>     names and roles names.  Don't document this; the error message is
>>     self-explanatory, and too few users would benefit.  A future major
>>     release may forbid creation of databases and roles so named.  For now,
>>     check only at known weak points in pg_dumpall.  Future commits will,
>>     without notice, reject affected names from other frontend programs.
>>    
>>     Also extend the restriction to pg_dumpall --dbname=CONNSTR arguments and
>>     --file arguments.  Unlike the effects on role name arguments and
>>     database names, this does not reflect a broad policy change.  A
>>     migration to CreateProcess() could lift these two restrictions.
>>    
>>     Reviewed by Peter Eisentraut.
>>    
>>     Security: CVE-2016-5424
>
>
> As per above comments, we can work on a patch which will reject \n\r in roles and database names.
>
> I will work on this.
>
> [1] : names with \n\r in dbnames
>  
> --

Hi,
I tried to do some improvements for database names that have \n or \r in dbname.

Solution 1:
As per code comments in appendShellString function, we can block database creation with \n or \r. 
sol1_v01* patch is doing the same for database creation.

Solution 2:
While dumping the database, report WARNING if the database name has \n or \r and skip dump for a particular database but dump all other databases by pg_dumpall.
sol2_v01* patch is doing this.

Solution 3:
While dumping the database, report FATAL if the database name has \n or \r and add a hint message in FALAL (rename particular database to dump without \n\r char).
sol3_v01* is doing the same.

Please review attached patches and let me know feedback.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment


On Mon, Mar 24, 2025 at 11:21 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

> As per code,
>>
>> *
>>  * Append the given string to the shell command being built in the buffer,
>>  * with shell-style quoting as needed to create exactly one argument.
>>  *
>>  * Forbid LF or CR characters, which have scant practical use beyond designing
>>  * security breaches.  The Windows command shell is unusable as a conduit for
>>  * arguments containing LF or CR characters.  A future major release should
>>  * reject those characters in CREATE ROLE and CREATE DATABASE, because use
>>  * there eventually leads to errors here.
>>  *
>>  * appendShellString() simply prints an error and dies if LF or CR appears.
>>  * appendShellStringNoError() omits those characters from the result, and
>>  * returns false if there were any.
>>  */
>> void
>> appendShellString(PQExpBuffer buf, const char *str)
>> {
>>     if (!appendShellStringNoError(buf, str))
>>     {
>>         fprintf(stderr,
>>                 _("shell command argument contains a newline or carriage return: \"%s\"\n"),
>>                 str);
>>         exit(EXIT_FAILURE);
>>     }
>> }
>
>
> Here, we are mentioning that in future majar releases, we should reject \n\r in CREATE ROLE and CREATE DATABASE.
>
> Above comment was added in 2016.
>>
>> commit 142c24c23447f212e642a0ffac9af878b93f490d
>> Author: Noah Misch <noah@leadboat.com>
>> Date:   Mon Aug 8 10:07:46 2016 -0400
>>
>>     Reject, in pg_dumpall, names containing CR or LF.
>>    
>>     These characters prematurely terminate Windows shell command processing,
>>     causing the shell to execute a prefix of the intended command.  The
>>     chief alternative to rejecting these characters was to bypass the
>>     Windows shell with CreateProcess(), but the ability to use such names
>>     has little value.  Back-patch to 9.1 (all supported versions).
>>    
>>     This change formally revokes support for these characters in database
>>     names and roles names.  Don't document this; the error message is
>>     self-explanatory, and too few users would benefit.  A future major
>>     release may forbid creation of databases and roles so named.  For now,
>>     check only at known weak points in pg_dumpall.  Future commits will,
>>     without notice, reject affected names from other frontend programs.
>>    
>>     Also extend the restriction to pg_dumpall --dbname=CONNSTR arguments and
>>     --file arguments.  Unlike the effects on role name arguments and
>>     database names, this does not reflect a broad policy change.  A
>>     migration to CreateProcess() could lift these two restrictions.
>>    
>>     Reviewed by Peter Eisentraut.
>>    
>>     Security: CVE-2016-5424
>
>
> As per above comments, we can work on a patch which will reject \n\r in roles and database names.
>
> I will work on this.
>
> [1] : names with \n\r in dbnames
>  
> --

Hi,
I tried to do some improvements for database names that have \n or \r in dbname.

Solution 1:
As per code comments in appendShellString function, we can block database creation with \n or \r. 
sol1_v01* patch is doing the same for database creation.

Solution 2:
While dumping the database, report WARNING if the database name has \n or \r and skip dump for a particular database but dump all other databases by pg_dumpall.
sol2_v01* patch is doing this.

Solution 3:
While dumping the database, report FATAL if the database name has \n or \r and add a hint message in FALAL (rename particular database to dump without \n\r char).
sol3_v01* is doing the same.

Please review attached patches and let me know feedback.


Hi ,

I have reviewed all solutions but based on the commit message and comments, it is clear that the goal is to entirely forbid database names containing carriage return (CR) or line feed (LF) characters. Solution 1 LGTM and aligns with this approach by enforcing the restriction at the time of database creation, ensuring consistency throughout PostgreSQL. This approach eliminates ambiguity and guarantees that such database names cannot be created or dumped with CR or LF.

To validate this behavior, I have also implemented a TAP test for Solution 1.
Thanks and regards,
Srinath Reddy Sadipiralla,
EnterpriseDB: http://www.enterprisedb.com 

Attachment


On 2025-03-26 We 3:04 AM, Srinath Reddy wrote:

I have reviewed all solutions but based on the commit message and comments, it is clear that the goal is to entirely forbid database names containing carriage return (CR) or line feed (LF) characters. Solution 1 LGTM and aligns with this approach by enforcing the restriction at the time of database creation, ensuring consistency throughout PostgreSQL. This approach eliminates ambiguity and guarantees that such database names cannot be created or dumped with CR or LF.

To validate this behavior, I have also implemented a TAP test for Solution 1.


You can still create a database with these using "CREATE DATABASE" though. Shouldn't we should really be preventing that?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
./psql postgres

Hi,

On Wed, Mar 26, 2025 at 5:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:

You can still create a database with these using "CREATE DATABASE" though. Shouldn't we should really be preventing that?


yes, solution 1 which i mentioned prevents these while we are using "CREATE DATABASE".

/*
  * Create a new database using the WAL_LOG strategy.
@@ -741,6 +742,13 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
  CreateDBStrategy dbstrategy = CREATEDB_WAL_LOG;
  createdb_failure_params fparms;
 
+ /* Report error if dbname have newline or carriage return in name. */
+ if (is_name_contain_lfcr(dbname))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
+ errmsg("database name contains a newline or carriage return character"),
+ errhint("newline or carriage return character is not allowed in database name"));
+

psql (18devel)
Type "help" for help.

postgres=# create database "test
postgres"# lines";
ERROR:  database name contains a newline or carriage return character
HINT:  newline or carriage return character is not allowed in database name


Thanks and regards,
Srinath Reddy Sadipiralla,
EnterpriseDB: http://www.enterprisedb.com

postgres=#\q

 
sorry for the noise ,previous response had my editor's formatting,just resending without that formatting.

./psql postgres

Hi,

On Wed, Mar 26, 2025 at 5:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:

You can still create a database with these using "CREATE DATABASE" though. Shouldn't we should really be preventing that?


yes, solution 1 which I mentioned prevents these while we are using "CREATE DATABASE".

/*
  * Create a new database using the WAL_LOG strategy.
@@ -741,6 +742,13 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
  CreateDBStrategy dbstrategy = CREATEDB_WAL_LOG;
  createdb_failure_params fparms;
 
+ /* Report error if dbname have newline or carriage return in name. */
+ if (is_name_contain_lfcr(dbname))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
+ errmsg("database name contains a newline or carriage return character"),
+ errhint("newline or carriage return character is not allowed in database name"));
+

psql (18devel)
Type "help" for help.

postgres=# create database "test
postgres"# lines";
ERROR:  database name contains a newline or carriage return character
HINT:  newline or carriage return character is not allowed in database name

Thanks and regards,
Srinath Reddy Sadipiralla,
EDB: https://www.enterprisedb.com

postgres=#\q


On 2025-03-26 We 8:52 AM, Srinath Reddy wrote:
sorry for the noise ,previous response had my editor's formatting,just resending without that formatting.

./psql postgres

Hi,

On Wed, Mar 26, 2025 at 5:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:

You can still create a database with these using "CREATE DATABASE" though. Shouldn't we should really be preventing that?


yes, solution 1 which I mentioned prevents these while we are using "CREATE DATABASE".

/*
  * Create a new database using the WAL_LOG strategy.
@@ -741,6 +742,13 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
  CreateDBStrategy dbstrategy = CREATEDB_WAL_LOG;
  createdb_failure_params fparms;
 
+ /* Report error if dbname have newline or carriage return in name. */
+ if (is_name_contain_lfcr(dbname))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
+ errmsg("database name contains a newline or carriage return character"),
+ errhint("newline or carriage return character is not allowed in database name"));
+

psql (18devel)
Type "help" for help.

postgres=# create database "test
postgres"# lines";
ERROR:  database name contains a newline or carriage return character
HINT:  newline or carriage return character is not allowed in database name




Yes, sorry, I misread the thread. I think we should proceed with options 1 and 3 i.e. prevent creation of new databases with a CR or LF, and have pgdumpall exit with a more useful error message.

Your invention of an is_name_contain_lfcr() function is unnecessary - we can just use the standard library function strpbrk() to look for a CR or LF.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com
./psql postgres

On Thu, Mar 27, 2025 at 4:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:

Yes, sorry, I misread the thread. I think we should proceed with options 1 and 3 i.e. prevent creation of new databases with a CR or LF, and have pgdumpall exit with a more useful error message.

agreed. 

Your invention of an is_name_contain_lfcr() function is unnecessary - we can just use the standard library function strpbrk() to look for a CR or LF.

makes sense,but I have a dumb doubt why in appendShellStringNoError() it still continues even after it found CR or LF? ,AFAIK The reasoning is this function is designed to silently filter out \n and \r while still producing a usable shell-safe argument. It informs the caller of the issue (false return value) but does not abruptly stop execution,then the caller will decide what to do but every place this function is called they are just throwing the error.

if we could just break the loop right after we found \n or \r in appendShellStringNoError() we can also use strpbrk() here and during creation of new database as you suggested.

thoughts?
 
May the force be with you,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/

postgres=#\q
On Thu, 27 Mar 2025 at 16:16, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2025-03-26 We 8:52 AM, Srinath Reddy wrote:
>
> sorry for the noise ,previous response had my editor's formatting,just resending without that formatting.
>
> ./psql postgres
>
> Hi,
>
> On Wed, Mar 26, 2025 at 5:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> You can still create a database with these using "CREATE DATABASE" though. Shouldn't we should really be preventing
that?
>
>
> yes, solution 1 which I mentioned prevents these while we are using "CREATE DATABASE".
>
> /*
>   * Create a new database using the WAL_LOG strategy.
> @@ -741,6 +742,13 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
>   CreateDBStrategy dbstrategy = CREATEDB_WAL_LOG;
>   createdb_failure_params fparms;
>
> + /* Report error if dbname have newline or carriage return in name. */
> + if (is_name_contain_lfcr(dbname))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
> + errmsg("database name contains a newline or carriage return character"),
> + errhint("newline or carriage return character is not allowed in database name"));
> +
>
> psql (18devel)
> Type "help" for help.
>
> postgres=# create database "test
> postgres"# lines";
> ERROR:  database name contains a newline or carriage return character
> HINT:  newline or carriage return character is not allowed in database name
>
>
>
>
> Yes, sorry, I misread the thread. I think we should proceed with options 1 and 3 i.e. prevent creation of new
databaseswith a CR or LF, and have pgdumpall exit with a more useful error message. 
>
> Your invention of an is_name_contain_lfcr() function is unnecessary - we can just use the standard library function
strpbrk()to look for a CR or LF. 
>
>
> cheers
>

Thanks Andrew and Srinath for feedback.

Yes, we should use the strpbrk function. Fixed.

Here, I am attaching an updated patch which has check in createdb and
RenameDatabase. For older versions, we can add more useful error
message (like: rename database as database has \n\r")

I will add some TAP tests and will make patches for older branches.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment


On 2025-03-27 Th 7:33 AM, Srinath Reddy wrote:
./psql postgres

On Thu, Mar 27, 2025 at 4:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:

Yes, sorry, I misread the thread. I think we should proceed with options 1 and 3 i.e. prevent creation of new databases with a CR or LF, and have pgdumpall exit with a more useful error message.

agreed. 

Your invention of an is_name_contain_lfcr() function is unnecessary - we can just use the standard library function strpbrk() to look for a CR or LF.

makes sense,but I have a dumb doubt why in appendShellStringNoError() it still continues even after it found CR or LF? ,AFAIK The reasoning is this function is designed to silently filter out \n and \r while still producing a usable shell-safe argument. It informs the caller of the issue (false return value) but does not abruptly stop execution,then the caller will decide what to do but every place this function is called they are just throwing the error.

if we could just break the loop right after we found \n or \r in appendShellStringNoError() we can also use strpbrk() here and during creation of new database as you suggested.

thoughts?
 


I don't know. If you want to submit a patch cleaning it up go ahead. But right now I just want to get this original issue cleaned up to go along with the pg_dumpall improvements.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 2025-03-27 Th 7:57 AM, Mahendra Singh Thalor wrote:
> On Thu, 27 Mar 2025 at 16:16, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> On 2025-03-26 We 8:52 AM, Srinath Reddy wrote:
>>
>> sorry for the noise ,previous response had my editor's formatting,just resending without that formatting.
>>
>> ./psql postgres
>>
>> Hi,
>>
>> On Wed, Mar 26, 2025 at 5:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>>> You can still create a database with these using "CREATE DATABASE" though. Shouldn't we should really be preventing
that?
>>
>> yes, solution 1 which I mentioned prevents these while we are using "CREATE DATABASE".
>>
>> /*
>>    * Create a new database using the WAL_LOG strategy.
>> @@ -741,6 +742,13 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
>>    CreateDBStrategy dbstrategy = CREATEDB_WAL_LOG;
>>    createdb_failure_params fparms;
>>
>> + /* Report error if dbname have newline or carriage return in name. */
>> + if (is_name_contain_lfcr(dbname))
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
>> + errmsg("database name contains a newline or carriage return character"),
>> + errhint("newline or carriage return character is not allowed in database name"));
>> +
>>
>> psql (18devel)
>> Type "help" for help.
>>
>> postgres=# create database "test
>> postgres"# lines";
>> ERROR:  database name contains a newline or carriage return character
>> HINT:  newline or carriage return character is not allowed in database name
>>
>>
>>
>>
>> Yes, sorry, I misread the thread. I think we should proceed with options 1 and 3 i.e. prevent creation of new
databaseswith a CR or LF, and have pgdumpall exit with a more useful error message.
 
>>
>> Your invention of an is_name_contain_lfcr() function is unnecessary - we can just use the standard library function
strpbrk()to look for a CR or LF.
 
>>
>>
>> cheers
>>
> Thanks Andrew and Srinath for feedback.
>
> Yes, we should use the strpbrk function. Fixed.
>
> Here, I am attaching an updated patch which has check in createdb and
> RenameDatabase. For older versions, we can add more useful error
> message (like: rename database as database has \n\r")
>
> I will add some TAP tests and will make patches for older branches.
>


I don't think we can backpatch this. It's a behaviour change.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




On Thu, 27 Mar 2025 at 18:10, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2025-03-27 Th 7:57 AM, Mahendra Singh Thalor wrote:
> > On Thu, 27 Mar 2025 at 16:16, Andrew Dunstan <andrew@dunslane.net> wrote:
> >>
> >> On 2025-03-26 We 8:52 AM, Srinath Reddy wrote:
> >>
> >> sorry for the noise ,previous response had my editor's formatting,just resending without that formatting.
> >>
> >> ./psql postgres
> >>
> >> Hi,
> >>
> >> On Wed, Mar 26, 2025 at 5:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> >>> You can still create a database with these using "CREATE DATABASE" though. Shouldn't we should really be preventing that?
> >>
> >> yes, solution 1 which I mentioned prevents these while we are using "CREATE DATABASE".
> >>
> >> /*
> >>    * Create a new database using the WAL_LOG strategy.
> >> @@ -741,6 +742,13 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
> >>    CreateDBStrategy dbstrategy = CREATEDB_WAL_LOG;
> >>    createdb_failure_params fparms;
> >>
> >> + /* Report error if dbname have newline or carriage return in name. */
> >> + if (is_name_contain_lfcr(dbname))
> >> + ereport(ERROR,
> >> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
> >> + errmsg("database name contains a newline or carriage return character"),
> >> + errhint("newline or carriage return character is not allowed in database name"));
> >> +
> >>
> >> psql (18devel)
> >> Type "help" for help.
> >>
> >> postgres=# create database "test
> >> postgres"# lines";
> >> ERROR:  database name contains a newline or carriage return character
> >> HINT:  newline or carriage return character is not allowed in database name
> >>
> >>
> >>
> >>
> >> Yes, sorry, I misread the thread. I think we should proceed with options 1 and 3 i.e. prevent creation of new databases with a CR or LF, and have pgdumpall exit with a more useful error message.
> >>
> >> Your invention of an is_name_contain_lfcr() function is unnecessary - we can just use the standard library function strpbrk() to look for a CR or LF.
> >>
> >>
> >> cheers
> >>
> > Thanks Andrew and Srinath for feedback.
> >
> > Yes, we should use the strpbrk function. Fixed.
> >
> > Here, I am attaching an updated patch which has check in createdb and
> > RenameDatabase. For older versions, we can add more useful error
> > message (like: rename database as database has \n\r")
> >
> > I will add some TAP tests and will make patches for older branches.
> >
>
>
> I don't think we can backpatch this. It's a behaviour change.

We can add a good error message in pg_dumpall and pg_dump for all older branches by checking dbname.
[mst@localhost postgres]$ git diff
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 2ea574b0f06..991dd4db2ca 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -1611,6 +1611,10 @@ dumpDatabases(PGconn *conn)
                if (strcmp(dbname, "template0") == 0)
                        continue;
 
+               /* Report fatal if database name have \n\r */
+               if (strpbrk(dbname, "\n\r"))
+                       pg_fatal("database name has newline or carriage return so stoping dump. To fix, rename dbname.");
+
                /* Skip any explicitly excluded database */
                if (simple_string_list_member(&database_exclude_names, dbname))
                {
[mst@localhost postgres]$ 

This will change the error message in older versions. If it is OK to report more readable errors in back branches, then we can do the above change.

Thoughts?

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
On 2025-Mar-27, Andrew Dunstan wrote:

> I don't think we can backpatch this. It's a behaviour change.

I agree, we can't.

Also, if we're going to enforce this rule, then pg_upgrade --check needs
to alert users that they have a database name that's no longer valid.
That needs to be part of this patch as well.  We should also ensure that
these are the only problem characters, which IMO means it should add a
test for pg_dumpall that creates a database whose name has all possible
characters and ensures that it is dumpable.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"La virtud es el justo medio entre dos defectos" (Aristóteles)



On Thu, 27 Mar 2025 at 18:33, Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2025-Mar-27, Andrew Dunstan wrote:
>
> > I don't think we can backpatch this. It's a behaviour change.
>
> I agree, we can't.
>
> Also, if we're going to enforce this rule, then pg_upgrade --check needs
> to alert users that they have a database name that's no longer valid.
> That needs to be part of this patch as well.  We should also ensure that
> these are the only problem characters, which IMO means it should add a
> test for pg_dumpall that creates a database whose name has all possible
> characters and ensures that it is dumpable.
>
> --
> Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
> "La virtud es el justo medio entre dos defectos" (Aristóteles)

Thanks Álvaro.

Yes, I also agree with you. "pg_upgrade --check" should alert users regarding database names because pg_upgrade is failing if we have \n\r in dbname(old cluster).

pg_upgrade --check: This is passing but pg_upgrade is failing.
............
Checking for new cluster tablespace directories               ok
*Clusters are compatible*

pg_upgrade
.........................
Creating dump of global objects                               ok
Creating dump of database schemas                            
shell command argument contains a newline or carriage return: "dbname='invalid
db'"


As a part of this patch, we can teach pg_upgrade to alert users regarding database names with invalid characters or we can keep old behavior as upgrade is already failing without this patch also.

I will try to make a separate patch to teach "pg_upgrade --check" to alert users regarding database/user/role with \n\r in the old cluster.

Here, I am attaching an updated patch for review.

This patch has changes for: CREATE DATABASE, CREATE ROLE, CREATE USER and RENAME DATABASE/USER/ROLE and have some tests also. (EXCEPT RENAME test case)

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachment
On Fri, 28 Mar 2025 at 01:37, Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>
> On Thu, 27 Mar 2025 at 18:33, Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2025-Mar-27, Andrew Dunstan wrote:
> >
> > > I don't think we can backpatch this. It's a behaviour change.
> >
> > I agree, we can't.
> >
> > Also, if we're going to enforce this rule, then pg_upgrade --check needs
> > to alert users that they have a database name that's no longer valid.
> > That needs to be part of this patch as well.  We should also ensure that
> > these are the only problem characters, which IMO means it should add a
> > test for pg_dumpall that creates a database whose name has all possible
> > characters and ensures that it is dumpable.
> >
> > --
> > Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
> > "La virtud es el justo medio entre dos defectos" (Aristóteles)
>
> Thanks Álvaro.
>
> Yes, I also agree with you. "pg_upgrade --check" should alert users regarding database names because pg_upgrade is failing if we have \n\r in dbname(old cluster).
>
> pg_upgrade --check: This is passing but pg_upgrade is failing.
> ............
> Checking for new cluster tablespace directories               ok
> *Clusters are compatible*
>
> pg_upgrade:
> .........................
> Creating dump of global objects                               ok
> Creating dump of database schemas                            
> shell command argument contains a newline or carriage return: "dbname='invalid
> db'"

In the attached patch, instead of the above error, we will get proper ALERT for invalid names even with "pg_upgrade --check" also.

Ex:
Performing Consistency Checks
-----------------------------
Checking cluster versions                                     ok
Checking database connection settings                         ok
Checking names of databases/users/roles                       fatal

All the database names should have only valid characters.  A newline or
carriage return character is not allowed in database name.  To fix this,
please rename database names with valid names.
    /home/mst/pg_all/head_pg/postgres/inst/bin/data/pg_upgrade_output.d/20250328T164926.680/db_role_invalid_names.txt
Failure, exiting


>
> As a part of this patch, we can teach pg_upgrade to alert users regarding database names with invalid characters or we can keep old behavior as upgrade is already failing without this patch also.
>
> I will try to make a separate patch to teach "pg_upgrade --check" to alert users regarding database/user/role with \n\r in the old cluster.
>
> Here, I am attaching an updated patch for review.
>
> This patch has changes for: CREATE DATABASE, CREATE ROLE, CREATE USER and RENAME DATABASE/USER/ROLE and have some tests also. (EXCEPT RENAME test case)
>
> --
> Thanks and Regards
> Mahendra Singh Thalor
> EnterpriseDB: http://www.enterprisedb.com

Here, I am attaching updated patches for review.

v04_001* has the changes for CREATE DATABASE/ROLE/USER and
v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachment
On Fri, Mar 28, 2025 at 05:08:26PM +0530, Mahendra Singh Thalor wrote:
> Here, I am attaching updated patches for review.
> 
> v04_001* has the changes for CREATE DATABASE/ROLE/USER and
> v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.

In general, +1 for these changes.  Thanks for picking this up.

If these are intended for v18, we probably should have a committer
attached to it soon.  I'm not confident that I'll have time for it,
unfortunately.

+    /* Report error if dbname have newline or carriage return in name. */
+    if (strpbrk(dbname, "\n\r"))
+        ereport(ERROR,
+                (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
+                errmsg("database name contains a newline or carriage return character"),
+                errhint("newline or carriage return character is not allowed in database name"));

I think it would be better to move this to a helper function instead of
duplicating this code in several places.

Taking a step back, are we sure that 1) this is the right place to do these
checks and 2) we shouldn't apply the same restrictions to all names?  I'm
wondering if it would be better to add these checks to the grammar instead
of trying to patch up all the various places they are used in the tree.

-- 
nathan



./psql postgres

Hi,

On Fri, Mar 28, 2025 at 5:08 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:

v04_001* has the changes for CREATE DATABASE/ROLE/USER and
v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.

i have reviewed these patches,in v04_002* i think it would be better if log all the names like database ,roles/users names then throw error instead of just logging database names into db_role_invalid_names and throwing error,which helps the user to see at once all the invalid names and resolve then again try pg_upgrade,so here i am attaching delta patch for the same ,except this the patches LGTM.


May the force be with you,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/

postgres=# \q
Attachment
./psql postgres
postgres=# BEGIN;

Hi,May the force be with you,
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/

postgres=# COMMIT;
postgres=# \q
On Fri, 28 Mar 2025 at 20:13, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Mar 28, 2025 at 05:08:26PM +0530, Mahendra Singh Thalor wrote:
> > Here, I am attaching updated patches for review.
> >
> > v04_001* has the changes for CREATE DATABASE/ROLE/USER and
> > v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.
>
> In general, +1 for these changes.  Thanks for picking this up.

Thanks Nathan for quick feedback.

>
> If these are intended for v18, we probably should have a committer
> attached to it soon.  I'm not confident that I'll have time for it,
> unfortunately.

I think Andrew is planning this as a cleanup for "non-text pg_dumpall" patch. Andrew, please if you have some bandwidth, then please let us know your feedback for these patches.
 
>
> +       /* Report error if dbname have newline or carriage return in name. */
> +       if (strpbrk(dbname, "\n\r"))
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
> +                               errmsg("database name contains a newline or carriage return character"),
> +                               errhint("newline or carriage return character is not allowed in database name"));
>
> I think it would be better to move this to a helper function instead of
> duplicating this code in several places.

Agreed. Fixed this and as Srinath pointed out, I moved it inside "src/backend/commands/define.c" file.

>
> Taking a step back, are we sure that 1) this is the right place to do these
> checks and 2) we shouldn't apply the same restrictions to all names?  I'm
> wondering if it would be better to add these checks to the grammar instead
> of trying to patch up all the various places they are used in the tree.
>

As of now, we made this restriction to only "database/role/user" because dump is not allowed with these special characters.
yes, we can add these checks in grammar also. (probably we should add checks in grammar only). If others also feel same, I can try to add these checks in grammar.

On Sun, 30 Mar 2025 at 00:03, Srinath Reddy <srinath2133@gmail.com> wrote:
>
> ./psql postgres
>
> Hi,
>
> On Fri, Mar 28, 2025 at 5:08 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>
>>
>> v04_001* has the changes for CREATE DATABASE/ROLE/USER and
>> v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.
>
>
> i have reviewed these patches,in v04_002* i think it would be better if log all the names like database ,roles/users names then throw error instead of just logging database names into db_role_invalid_names and throwing error,which helps the user to see at once all the invalid names and resolve then again try pg_upgrade,so here i am attaching delta patch for the same ,except this the patches LGTM.
>
Thanks Srinath for the review.

In attached v05_0002* patch, instead of 2 exec commands, I merged into 1 and as per your comment, we will report all the invalid names at once and additionally we are giving count of invalid names.
 
Ex: pg_upgrade --check:
Performing Consistency Checks
-----------------------------
Checking cluster versions                                     ok
Checking database connection settings                         ok
Checking names of databases/users/roles                       fatal

All the database, role/user names should have only valid characters. A newline or
carriage return character is not allowed in these object names.  To fix this, please
rename these names with valid names.
To see all 5 invalid object names, refer db_role_user_invalid_names.txt file.
    /home/mst/pg_all/head_pg/postgres/inst/bin/data/pg_upgrade_output.d/20250402T160610.664/db_role_user_invalid_names.txt
Failure, exiting

 Here, I am attaching updated patches for review.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
Attachment


On 2025-04-02 We 7:45 AM, Mahendra Singh Thalor wrote:
On Fri, 28 Mar 2025 at 20:13, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Mar 28, 2025 at 05:08:26PM +0530, Mahendra Singh Thalor wrote:
> > Here, I am attaching updated patches for review.
> >
> > v04_001* has the changes for CREATE DATABASE/ROLE/USER and
> > v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.
>
> In general, +1 for these changes.  Thanks for picking this up.

Thanks Nathan for quick feedback.

>
> If these are intended for v18, we probably should have a committer
> attached to it soon.  I'm not confident that I'll have time for it,
> unfortunately.

I think Andrew is planning this as a cleanup for "non-text pg_dumpall" patch. Andrew, please if you have some bandwidth, then please let us know your feedback for these patches.


Not sure if I will have time for this.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com


On 2025-04-02 We 7:45 AM, Mahendra Singh Thalor wrote:
On Fri, 28 Mar 2025 at 20:13, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Mar 28, 2025 at 05:08:26PM +0530, Mahendra Singh Thalor wrote:
> > Here, I am attaching updated patches for review.
> >
> > v04_001* has the changes for CREATE DATABASE/ROLE/USER and
> > v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.
>
> In general, +1 for these changes.  Thanks for picking this up.

Thanks Nathan for quick feedback.

>
> If these are intended for v18, we probably should have a committer
> attached to it soon.  I'm not confident that I'll have time for it,
> unfortunately.

I think Andrew is planning this as a cleanup for "non-text pg_dumpall" patch. Andrew, please if you have some bandwidth, then please let us know your feedback for these patches.
 
>
> +       /* Report error if dbname have newline or carriage return in name. */
> +       if (strpbrk(dbname, "\n\r"))
> +               ereport(ERROR,
> +                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
> +                               errmsg("database name contains a newline or carriage return character"),
> +                               errhint("newline or carriage return character is not allowed in database name"));
>
> I think it would be better to move this to a helper function instead of
> duplicating this code in several places.

Agreed. Fixed this and as Srinath pointed out, I moved it inside "src/backend/commands/define.c" file.

>
> Taking a step back, are we sure that 1) this is the right place to do these
> checks and 2) we shouldn't apply the same restrictions to all names?  I'm
> wondering if it would be better to add these checks to the grammar instead
> of trying to patch up all the various places they are used in the tree.
>

As of now, we made this restriction to only "database/role/user" because dump is not allowed with these special characters.
yes, we can add these checks in grammar also. (probably we should add checks in grammar only). If others also feel same, I can try to add these checks in grammar.

On Sun, 30 Mar 2025 at 00:03, Srinath Reddy <srinath2133@gmail.com> wrote:
>
> ./psql postgres
>
> Hi,
>
> On Fri, Mar 28, 2025 at 5:08 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
>>
>>
>> v04_001* has the changes for CREATE DATABASE/ROLE/USER and
>> v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.
>
>
> i have reviewed these patches,in v04_002* i think it would be better if log all the names like database ,roles/users names then throw error instead of just logging database names into db_role_invalid_names and throwing error,which helps the user to see at once all the invalid names and resolve then again try pg_upgrade,so here i am attaching delta patch for the same ,except this the patches LGTM.
>
Thanks Srinath for the review.

In attached v05_0002* patch, instead of 2 exec commands, I merged into 1 and as per your comment, we will report all the invalid names at once and additionally we are giving count of invalid names.
 
Ex: pg_upgrade --check:
Performing Consistency Checks
-----------------------------
Checking cluster versions                                     ok
Checking database connection settings                         ok
Checking names of databases/users/roles                       fatal

All the database, role/user names should have only valid characters. A newline or
carriage return character is not allowed in these object names.  To fix this, please
rename these names with valid names.
To see all 5 invalid object names, refer db_role_user_invalid_names.txt file.
    /home/mst/pg_all/head_pg/postgres/inst/bin/data/pg_upgrade_output.d/20250402T160610.664/db_role_user_invalid_names.txt
Failure, exiting

 Here, I am attaching updated patches for review.



+void
+check_lfcr_in_objname(const char *objname, const char *objtype)
+{
+   /* Report error if name has \n or \r character. */
+   if (strpbrk(objname, "\n\r"))
+       ereport(ERROR,
+               (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
+                errmsg("%s name contains a newline or carriage return character", objtype),
+                errhint("a newline or a carriage return character is not allowed in %s name\n"
+                    "here, given %s name is : \"%s\"", objtype, objtype, objname));
+}


I don't think the errhint here adds much. If we're going to put the offending name in an error message I think it possibly needs to be escaped so it's more obvious where the CR/LF are. This should be in an errdetail rather than an errhint.


+static void
+check_database_user_role_names_in_old_cluser(ClusterInfo *cluster)

s/cluser/cluster/

+   prep_status("Checking names of databases/users/roles ");

I would just say "Checking names of databases and roles".


+       pg_fatal("All the database, role/user names should have only valid characters. A newline or \n"
+               "carriage return character is not allowed in these object names.  To fix this, please \n"
+               "rename these names with valid names. \n"
+               "To see all %d invalid object names, refer db_role_user_invalid_names.txt file. \n"
+               "    %s", count, output_path);


Also needs some cleanup.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Thanks Andrew for the review.

On Sat, 5 Apr 2025 at 20:12, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2025-04-02 We 7:45 AM, Mahendra Singh Thalor wrote:
>
> On Fri, 28 Mar 2025 at 20:13, Nathan Bossart <nathandbossart@gmail.com> wrote:
> >
> > On Fri, Mar 28, 2025 at 05:08:26PM +0530, Mahendra Singh Thalor wrote:
> > > Here, I am attaching updated patches for review.
> > >
> > > v04_001* has the changes for CREATE DATABASE/ROLE/USER and
> > > v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.
> >
> > In general, +1 for these changes.  Thanks for picking this up.
>
> Thanks Nathan for quick feedback.
>
> >
> > If these are intended for v18, we probably should have a committer
> > attached to it soon.  I'm not confident that I'll have time for it,
> > unfortunately.
>
> I think Andrew is planning this as a cleanup for "non-text pg_dumpall" patch. Andrew, please if you have some
bandwidth,then please let us know your feedback for these patches. 
>
> >
> > +       /* Report error if dbname have newline or carriage return in name. */
> > +       if (strpbrk(dbname, "\n\r"))
> > +               ereport(ERROR,
> > +                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
> > +                               errmsg("database name contains a newline or carriage return character"),
> > +                               errhint("newline or carriage return character is not allowed in database name"));
> >
> > I think it would be better to move this to a helper function instead of
> > duplicating this code in several places.
>
> Agreed. Fixed this and as Srinath pointed out, I moved it inside "src/backend/commands/define.c" file.
>
> >
> > Taking a step back, are we sure that 1) this is the right place to do these
> > checks and 2) we shouldn't apply the same restrictions to all names?  I'm
> > wondering if it would be better to add these checks to the grammar instead
> > of trying to patch up all the various places they are used in the tree.
> >
>
> As of now, we made this restriction to only "database/role/user" because dump is not allowed with these special
characters.
> yes, we can add these checks in grammar also. (probably we should add checks in grammar only). If others also feel
same,I can try to add these checks in grammar. 
>
> On Sun, 30 Mar 2025 at 00:03, Srinath Reddy <srinath2133@gmail.com> wrote:
> >
> > ./psql postgres
> >
> > Hi,
> >
> > On Fri, Mar 28, 2025 at 5:08 PM Mahendra Singh Thalor <mahi6run@gmail.com> wrote:
> >>
> >>
> >> v04_001* has the changes for CREATE DATABASE/ROLE/USER and
> >> v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.
> >
> >
> > i have reviewed these patches,in v04_002* i think it would be better if log all the names like database
,roles/usersnames then throw error instead of just logging database names into db_role_invalid_names and throwing
error,whichhelps the user to see at once all the invalid names and resolve then again try pg_upgrade,so here i am
attachingdelta patch for the same ,except this the patches LGTM. 
> >
> Thanks Srinath for the review.
>
> In attached v05_0002* patch, instead of 2 exec commands, I merged into 1 and as per your comment, we will report all
theinvalid names at once and additionally we are giving count of invalid names. 
>
> Ex: pg_upgrade --check:
>>
>> Performing Consistency Checks
>> -----------------------------
>> Checking cluster versions                                     ok
>> Checking database connection settings                         ok
>> Checking names of databases/users/roles                       fatal
>>
>> All the database, role/user names should have only valid characters. A newline or
>> carriage return character is not allowed in these object names.  To fix this, please
>> rename these names with valid names.
>> To see all 5 invalid object names, refer db_role_user_invalid_names.txt file.
>>
/home/mst/pg_all/head_pg/postgres/inst/bin/data/pg_upgrade_output.d/20250402T160610.664/db_role_user_invalid_names.txt
>> Failure, exiting
>
>
>  Here, I am attaching updated patches for review.
>
>
>
> +void
> +check_lfcr_in_objname(const char *objname, const char *objtype)
> +{
> +   /* Report error if name has \n or \r character. */
> +   if (strpbrk(objname, "\n\r"))
> +       ereport(ERROR,
> +               (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
> +                errmsg("%s name contains a newline or carriage return character", objtype),
> +                errhint("a newline or a carriage return character is not allowed in %s name\n"
> +                    "here, given %s name is : \"%s\"", objtype, objtype, objname));
> +}
>
>
> I don't think the errhint here adds much. If we're going to put the offending name in an error message I think it
possiblyneeds to be escaped so it's more obvious where the CR/LF are. This should be in an errdetail rather than an
errhint.

Fixed. I replaced errhint with errdetail as "errdetail("invalid %s
name is  \"%s\"", objtype, objname));"

>
> +static void
> +check_database_user_role_names_in_old_cluser(ClusterInfo *cluster)
>
> s/cluser/cluster/
>
> +   prep_status("Checking names of databases/users/roles ");
>
> I would just say "Checking names of databases and roles".

Fixed.

>
>
> +       pg_fatal("All the database, role/user names should have only valid characters. A newline or \n"
> +               "carriage return character is not allowed in these object names.  To fix this, please \n"
> +               "rename these names with valid names. \n"
> +               "To see all %d invalid object names, refer db_role_user_invalid_names.txt file. \n"
> +               "    %s", count, output_path);
>
>
> Also needs some cleanup.
>
>
>
> cheers
>
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com

Here, I am attaching updated patches for review.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment
On 2025-03-28 Fr 10:43 AM, Nathan Bossart wrote:
> On Fri, Mar 28, 2025 at 05:08:26PM +0530, Mahendra Singh Thalor wrote:
>> Here, I am attaching updated patches for review.
>>
>> v04_001* has the changes for CREATE DATABASE/ROLE/USER and
>> v04_002* has the changes into pg_upgrade to give ALERTS for invalid names.
> In general, +1 for these changes.  Thanks for picking this up.
>
> If these are intended for v18, we probably should have a committer
> attached to it soon.  I'm not confident that I'll have time for it,
> unfortunately.
>
> +    /* Report error if dbname have newline or carriage return in name. */
> +    if (strpbrk(dbname, "\n\r"))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_INVALID_PARAMETER_VALUE)),
> +                errmsg("database name contains a newline or carriage return character"),
> +                errhint("newline or carriage return character is not allowed in database name"));
>
> I think it would be better to move this to a helper function instead of
> duplicating this code in several places.


The latest patches do that, but I'm not really sure it's an improvement, 
nor that define.c is the right place for it (everything else there works 
on a defElem)


>
> Taking a step back, are we sure that 1) this is the right place to do these
> checks and 2) we shouldn't apply the same restrictions to all names?  I'm
> wondering if it would be better to add these checks to the grammar instead
> of trying to patch up all the various places they are used in the tree.
>

Maybe. I don't think there is time for that for v18, so we'd have to 
defer this for now. I can live with that - it's been like this for a 
long time.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




On Sun, Apr 06, 2025 at 10:24:58AM -0400, Andrew Dunstan wrote:
> On 2025-03-28 Fr 10:43 AM, Nathan Bossart wrote:
>> Taking a step back, are we sure that 1) this is the right place to do these
>> checks and 2) we shouldn't apply the same restrictions to all names?  I'm
>> wondering if it would be better to add these checks to the grammar instead
>> of trying to patch up all the various places they are used in the tree.
> 
> Maybe. I don't think there is time for that for v18, so we'd have to defer
> this for now. I can live with that - it's been like this for a long time.

+1, I think deferring is the right call.

-- 
nathan



On 2025-Apr-06, Andrew Dunstan wrote:

> On 2025-03-28 Fr 10:43 AM, Nathan Bossart wrote:

> > Taking a step back, are we sure that 1) this is the right place to do these
> > checks and 2) we shouldn't apply the same restrictions to all names?  I'm
> > wondering if it would be better to add these checks to the grammar instead
> > of trying to patch up all the various places they are used in the tree.
> 
> Maybe. I don't think there is time for that for v18, so we'd have to defer
> this for now. I can live with that - it's been like this for a long time.

Grumble.  I'd rather introduce a partial restriction now only for names
that affect the tools failing outright (pg_dumpall in particular), than
do nothing for this release.  If we feel the need to extend the
restriction later, that's easy to do and bothers nobody.  We've wanted
these characters to be forbidden on database names for a long time, but
nobody has cared as much for their use on other types of names.  I'm not
even sure we'd support the idea of forbidding them on all names (though
the SQL standard doesn't allow control chars in identifiers.)

Another point is that we can easily have pg_upgrade check for invalid
database and role names, but checking *all* names would be more onerous.

I don't like the present implementation though, on translability
grounds.  I think the error message should appear at each callsite
rather than be hardcoded in the new function, to avoid string building.
I think it'd be cleaner if the new function (maybe "name_contains_crlf"
or "is_identifier_awful") just returned a boolean based on strpbrk(),
and the callsite throws the error.

I wonder why does the patch restrict both database and role names.  Does
a user with a newline also cause pg_upgrade to fail?  I mean, this
thread started with a consideration for database names only, and the
usage on role names seemed to have appeared out of nowhere in [1].

Cheers

[1] https://postgr.es/m/CAKYtNAoVKQL5rLD4P4hZXZSnThwO-j4q3Y1vTDHQGjzwC-kUJg@mail.gmail.com

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Aprender sin pensar es inútil; pensar sin aprender, peligroso" (Confucio)



Sent from my iPhone

> On Apr 6, 2025, at 11:23 AM, Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2025-Apr-06, Andrew Dunstan wrote:
>
>> On 2025-03-28 Fr 10:43 AM, Nathan Bossart wrote:
>
>>> Taking a step back, are we sure that 1) this is the right place to do these
>>> checks and 2) we shouldn't apply the same restrictions to all names?  I'm
>>> wondering if it would be better to add these checks to the grammar instead
>>> of trying to patch up all the various places they are used in the tree.
>>
>> Maybe. I don't think there is time for that for v18, so we'd have to defer
>> this for now. I can live with that - it's been like this for a long time.
>
> Grumble.  I'd rather introduce a partial restriction now only for names
> that affect the tools failing outright (pg_dumpall in particular), than
> do nothing for this release.  If we feel the need to extend the
> restriction later, that's easy to do and bothers nobody.  We've wanted
> these characters to be forbidden on database names for a long time, but
> nobody has cared as much for their use on other types of names.  I'm not
> even sure we'd support the idea of forbidding them on all names (though
> the SQL standard doesn't allow control chars in identifiers.)

Ok. That seems reasonable, let’s do this now and revisit in the v19 cycle

Cheers

Andrew




On Sun, 6 Apr 2025 at 20:53, Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2025-Apr-06, Andrew Dunstan wrote:
>
> > On 2025-03-28 Fr 10:43 AM, Nathan Bossart wrote:
>
> > > Taking a step back, are we sure that 1) this is the right place to do these
> > > checks and 2) we shouldn't apply the same restrictions to all names?  I'm
> > > wondering if it would be better to add these checks to the grammar instead
> > > of trying to patch up all the various places they are used in the tree.
> >
> > Maybe. I don't think there is time for that for v18, so we'd have to defer
> > this for now. I can live with that - it's been like this for a long time.
>
> Grumble.  I'd rather introduce a partial restriction now only for names
> that affect the tools failing outright (pg_dumpall in particular), than
> do nothing for this release.  If we feel the need to extend the
> restriction later, that's easy to do and bothers nobody.  We've wanted
> these characters to be forbidden on database names for a long time, but
> nobody has cared as much for their use on other types of names.  I'm not
> even sure we'd support the idea of forbidding them on all names (though
> the SQL standard doesn't allow control chars in identifiers.)
>
> Another point is that we can easily have pg_upgrade check for invalid
> database and role names, but checking *all* names would be more onerous.
>
> I don't like the present implementation though, on translability
> grounds.  I think the error message should appear at each callsite
> rather than be hardcoded in the new function, to avoid string building.
> I think it'd be cleaner if the new function (maybe "name_contains_crlf"
> or "is_identifier_awful") just returned a boolean based on strpbrk(),
> and the callsite throws the error.
>
> I wonder why does the patch restrict both database and role names.  Does
> a user with a newline also cause pg_upgrade to fail?  I mean, this
> thread started with a consideration for database names only, and the
> usage on role names seemed to have appeared out of nowhere in [1].

In my testing, pg_dumpall was not failing with roles/user (\n\r) in my machine but due to the below comment in code, I restricted roles also.

+++ b/src/fe_utils/string_utils.c
@@ -568,12 +568,6 @@ appendByteaLiteral(PQExpBuffer buf, const unsigned char *str, size_t length,
  * Append the given string to the shell command being built in the buffer,
  * with shell-style quoting as needed to create exactly one argument.
  *
- * Forbid LF or CR characters, which have scant practical use beyond designing
- * security breaches.  The Windows command shell is unusable as a conduit for
- * arguments containing LF or CR characters.  A future major release should
- * reject those characters in CREATE ROLE and CREATE DATABASE, because use
- * there eventually leads to errors here.
- *
  * appendShellString() simply prints an error and dies if LF or CR appears.


> Cheers
>
> [1] https://postgr.es/m/CAKYtNAoVKQL5rLD4P4hZXZSnThwO-j4q3Y1vTDHQGjzwC-kUJg@mail.gmail.com
>
> --
> Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
> "Aprender sin pensar es inútil; pensar sin aprender, peligroso" (Confucio)

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@alvh.no-ip.org> writes:
> Grumble.  I'd rather introduce a partial restriction now only for names
> that affect the tools failing outright (pg_dumpall in particular), than
> do nothing for this release.  If we feel the need to extend the
> restriction later, that's easy to do and bothers nobody.  We've wanted
> these characters to be forbidden on database names for a long time, but
> nobody has cared as much for their use on other types of names.  I'm not
> even sure we'd support the idea of forbidding them on all names (though
> the SQL standard doesn't allow control chars in identifiers.)

I'd be 100% behind forbidding all ASCII control characters in all
identifiers.  I can't see any situation in which that's a good thing,
and I can think of plenty where it's a mistake (eg your editor
decided to change space to tab) or done with underhanded intent.
If we can cite the SQL standard then it's an entirely defensible
restriction.

Having said that, I'm not quite sure where we ought to implement
the restriction, and it's possible that there are multiple places
that would need to check.  I concur that the day before feature
freeze is not a good time to be designing this.  Let's defer.

            regards, tom lane



On 2025-Apr-06, Tom Lane wrote:

> I'd be 100% behind forbidding all ASCII control characters in all
> identifiers.  I can't see any situation in which that's a good thing,
> and I can think of plenty where it's a mistake (eg your editor
> decided to change space to tab) or done with underhanded intent.

Right.

> If we can cite the SQL standard then it's an entirely defensible
> restriction.

We can.  It says (in 5.2 <token> and <separator>)

<regular identifier> ::= <identifier body>
<identifier body> ::= <identifier start> [ <identifier part>... ]
<identifier part> ::= <identifier start> | <identifier extend>
<identifier start> ::= !! See the Syntax Rules.
<identifier extend> ::= !! See the Syntax Rules.

Syntax Rules
1) An <identifier start> is any character in the Unicode General Category
   classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
      NOTE 112 — The Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”,
      “Lo”, and “Nl” are assigned to Unicode characters that are, respectively,
      upper-case letters, lower-case letters, title-case letters, modifier
      letters, other letters, and letter numbers.
2) An <identifier extend> is U+00B7, “Middle Dot”, or any character in the
   Unicode General Category classes “Mn”, “Mc”, “Nd”, or “Pc”.
      NOTE 113 — The Unicode General Category classes “Mn”, “Mc”, “Nd”, and
      “Pc”, are assigned to Unicode characters that are, respectively,
      non-spacing marks, spacing combining marks, decimal numbers, and connector
      punctuations.

The class for control characters is "C", so there are allowed nowhere.

https://www.unicode.org/charts/script/

> Having said that, I'm not quite sure where we ought to implement
> the restriction, and it's possible that there are multiple places
> that would need to check.

Yeah, a general ban on control characters for all identifiers is harder
to implement than a restricted ban, because it probably involves the
lexer, and I'm not sure the resulting "syntax error" type of rejections
are going to be nice enough to users.  A C-function based rejection
seems more convenient at this stage.

> I concur that the day before feature freeze is not a good time to be
> designing this.  Let's defer.

Augh.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle."  (Larry Wall, Apocalypse 6)



=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@alvh.no-ip.org> writes:
> On 2025-Apr-06, Tom Lane wrote:
>> If we can cite the SQL standard then it's an entirely defensible
>> restriction.

> We can.  It says (in 5.2 <token> and <separator>)

> <regular identifier> ::= <identifier body>
> <identifier body> ::= <identifier start> [ <identifier part>... ]
> <identifier part> ::= <identifier start> | <identifier extend>
> <identifier start> ::= !! See the Syntax Rules.
> <identifier extend> ::= !! See the Syntax Rules.

Hmm, but that's about non-quoted identifiers, so of course their
character set is pretty restricted.  What's of concern here is
what's allowed in double-quoted identifiers.  AFAICS there's
not much restriction: it can be any <nondoublequote character>,
and SR 7 says

    7) A <nondoublequote character> is any character of the source
    language character set other than a <double quote>.

        NOTE 115 — “source language character set” is defined in
        Subclause 4.10.1, “Host languages”, in ISO/IEC 9075-1.

The referenced bit of 9075-1 is pretty vague too:

    No matter what binding style is chosen, SQL-statements are written
    in an implementation-defined character set, known as the source
    language character set. The source language character set is not
    required to be the same as the character set of any character
    string appearing in SQL-data.

So I'm not really seeing anything there that justifies forbidding any
characters.  However, I still think that if we're going to forbid CR
or LF, we might as well go the whole way and forbid all the ASCII
control characters; none of them are any saner to use in identifiers
than those two.  (I'd be for banning   and similar as well, on
the same usability grounds as banning tabs, except that putting an
encoding dependency into this rule will not end well.)

            regards, tom lane



On 2025-04-06 Su 1:51 PM, Tom Lane wrote:
> =?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@alvh.no-ip.org> writes:
>> On 2025-Apr-06, Tom Lane wrote:
>>> If we can cite the SQL standard then it's an entirely defensible
>>> restriction.
>> We can.  It says (in 5.2 <token> and <separator>)
>> <regular identifier> ::= <identifier body>
>> <identifier body> ::= <identifier start> [ <identifier part>... ]
>> <identifier part> ::= <identifier start> | <identifier extend>
>> <identifier start> ::= !! See the Syntax Rules.
>> <identifier extend> ::= !! See the Syntax Rules.
> Hmm, but that's about non-quoted identifiers, so of course their
> character set is pretty restricted.  What's of concern here is
> what's allowed in double-quoted identifiers.  AFAICS there's
> not much restriction: it can be any <nondoublequote character>,
> and SR 7 says
>
>      7) A <nondoublequote character> is any character of the source
>      language character set other than a <double quote>.
>
>          NOTE 115 — “source language character set” is defined in
>          Subclause 4.10.1, “Host languages”, in ISO/IEC 9075-1.
>
> The referenced bit of 9075-1 is pretty vague too:
>
>      No matter what binding style is chosen, SQL-statements are written
>      in an implementation-defined character set, known as the source
>      language character set. The source language character set is not
>      required to be the same as the character set of any character
>      string appearing in SQL-data.
>
> So I'm not really seeing anything there that justifies forbidding any
> characters.  However, I still think that if we're going to forbid CR
> or LF, we might as well go the whole way and forbid all the ASCII
> control characters; none of them are any saner to use in identifiers
> than those two.  (I'd be for banning   and similar as well, on
> the same usability grounds as banning tabs, except that putting an
> encoding dependency into this rule will not end well.)


Sound like we have some work to do, and that's not going to happen in 24 
hours.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




On Mon, 7 Apr 2025 at 02:40, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2025-04-06 Su 1:51 PM, Tom Lane wrote:
> > =?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@alvh.no-ip.org> writes:
> >> On 2025-Apr-06, Tom Lane wrote:
> >>> If we can cite the SQL standard then it's an entirely defensible
> >>> restriction.
> >> We can.  It says (in 5.2 <token> and <separator>)
> >> <regular identifier> ::= <identifier body>
> >> <identifier body> ::= <identifier start> [ <identifier part>... ]
> >> <identifier part> ::= <identifier start> | <identifier extend>
> >> <identifier start> ::= !! See the Syntax Rules.
> >> <identifier extend> ::= !! See the Syntax Rules.
> > Hmm, but that's about non-quoted identifiers, so of course their
> > character set is pretty restricted.  What's of concern here is
> > what's allowed in double-quoted identifiers.  AFAICS there's
> > not much restriction: it can be any <nondoublequote character>,
> > and SR 7 says
> >
> >      7) A <nondoublequote character> is any character of the source
> >      language character set other than a <double quote>.
> >
> >          NOTE 115 — “source language character set” is defined in
> >          Subclause 4.10.1, “Host languages”, in ISO/IEC 9075-1.
> >
> > The referenced bit of 9075-1 is pretty vague too:
> >
> >      No matter what binding style is chosen, SQL-statements are written
> >      in an implementation-defined character set, known as the source
> >      language character set. The source language character set is not
> >      required to be the same as the character set of any character
> >      string appearing in SQL-data.
> >
> > So I'm not really seeing anything there that justifies forbidding any
> > characters.  However, I still think that if we're going to forbid CR
> > or LF, we might as well go the whole way and forbid all the ASCII
> > control characters; none of them are any saner to use in identifiers
> > than those two.  (I'd be for banning   and similar as well, on
> > the same usability grounds as banning tabs, except that putting an
> > encoding dependency into this rule will not end well.)
>
>
> Sound like we have some work to do, and that's not going to happen in 24
> hours.
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>

Hi,
As per above discussions, for v18, we will not do any change to server
side to fix the issue of \n\r in database names. But as a cleanup
patch, we can give an alert to the user by "pg_upgrade --check". As
per current code, pg_dump and pg_upgrade will fail with "shell
command" error but in the attached patch, we will give some extra info
to the user by "pg_upgrade --check" so that they can fix database
names before trying to upgrade.

Here, I am attaching a patch which will give a list of invalid
database names in "pg_upgrade --check". We can consider this as a
cleanup patch.

--
Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com

Attachment
On Thu, Apr 10, 2025 at 11:58:41PM +0530, Mahendra Singh Thalor wrote:
> As per above discussions, for v18, we will not do any change to server
> side to fix the issue of \n\r in database names. But as a cleanup
> patch, we can give an alert to the user by "pg_upgrade --check". As
> per current code, pg_dump and pg_upgrade will fail with "shell
> command" error but in the attached patch, we will give some extra info
> to the user by "pg_upgrade --check" so that they can fix database
> names before trying to upgrade.
> 
> Here, I am attaching a patch which will give a list of invalid
> database names in "pg_upgrade --check". We can consider this as a
> cleanup patch.

Are you proposing this for v18?  I think this is all v19 material at this
point.  Perhaps we could argue this is a bug fix that should be
back-patched, but IMHO that's a bit of a stretch.  I don't sense a
tremendous amount of urgency, either.

-- 
nathan