Thread: pgsql: With GB18030, prevent SIGSEGV from reading past end of allocatio

With GB18030, prevent SIGSEGV from reading past end of allocation.

With GB18030 as source encoding, applications could crash the server via
SQL functions convert() or convert_from().  Applications themselves
could crash after passing unterminated GB18030 input to libpq functions
PQescapeLiteral(), PQescapeIdentifier(), PQescapeStringConn(), or
PQescapeString().  Extension code could crash by passing unterminated
GB18030 input to jsonapi.h functions.  All those functions have been
intended to handle untrusted, unterminated input safely.

A crash required allocating the input such that the last byte of the
allocation was the last byte of a virtual memory page.  Some malloc()
implementations take measures against that, making the SIGSEGV hard to
reach.  Back-patch to v13 (all supported versions).

Author: Noah Misch <noah@leadboat.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Backpatch-through: 13
Security: CVE-2025-4207

Branch
------
REL_14_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/3f2ab73934ab1e27151ecd14fd7d8ef602555093

Modified Files
--------------
src/backend/utils/mb/mbutils.c             | 18 ++++--
src/common/jsonapi.c                       |  7 ++-
src/common/wchar.c                         | 51 +++++++++++++--
src/include/mb/pg_wchar.h                  |  2 +
src/interfaces/libpq/fe-exec.c             |  6 +-
src/interfaces/libpq/fe-misc.c             | 15 ++---
src/test/modules/test_escape/test_escape.c | 99 ++++++++++++++++++++++++++++++
src/test/regress/expected/conversion.out   | 13 ++--
src/test/regress/sql/conversion.sql        |  7 ++-
9 files changed, 188 insertions(+), 30 deletions(-)


Hi Noah Misch,

   I read the patch you committed to pgsql, and I found that some other routines

   still call pg_encoding_mblen() to get the character length.

   Why don't these routines have to be changed to call pg_encoding_mblen_or_incomplete()?

   Also, how can I reproduce this crash?


   Thank you for your time.


Regards.


At 2025-05-09 01:33:58, "Noah Misch" <noah@leadboat.com> wrote:

>With GB18030, prevent SIGSEGV from reading past end of allocation.
>
>With GB18030 as source encoding, applications could crash the server via
>SQL functions convert() or convert_from().  Applications themselves
>could crash after passing unterminated GB18030 input to libpq functions
>PQescapeLiteral(), PQescapeIdentifier(), PQescapeStringConn(), or
>PQescapeString().  Extension code could crash by passing unterminated
>GB18030 input to jsonapi.h functions.  All those functions have been
>intended to handle untrusted, unterminated input safely.
>
>A crash required allocating the input such that the last byte of the
>allocation was the last byte of a virtual memory page.  Some malloc()
>implementations take measures against that, making the SIGSEGV hard to
>reach.  Back-patch to v13 (all supported versions).
>
>Author: Noah Misch <noah@leadboat.com>
>Author: Andres Freund <andres@anarazel.de>
>Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
>Backpatch-through: 13
>Security: CVE-2025-4207
>
>Branch
>------
>REL_15_STABLE
>
>Details
>-------
>https://git.postgresql.org/pg/commitdiff/44ba3f55f552b56b2fbefae028fcf3ea5b53461d
>
>Modified Files
>--------------
>src/backend/utils/mb/mbutils.c             | 18 ++++--
>src/common/jsonapi.c                       |  7 ++-
>src/common/wchar.c                         | 51 +++++++++++++--
>src/include/mb/pg_wchar.h                  |  2 +
>src/interfaces/libpq/fe-exec.c             |  6 +-
>src/interfaces/libpq/fe-misc.c             | 15 ++---
>src/test/modules/test_escape/test_escape.c | 99 ++++++++++++++++++++++++++++++
>src/test/regress/expected/conversion.out   | 13 ++--
>src/test/regress/sql/conversion.sql        |  7 ++-
>9 files changed, 188 insertions(+), 30 deletions(-)
>
On Wed, May 14, 2025 at 04:38:06PM +0800, sean wrote:
>    I read the patch you committed to pgsql, and I found that some other routines
>    still call pg_encoding_mblen() to get the character length.
>    Why don't these routines have to be changed to call pg_encoding_mblen_or_incomplete()?

See the pg_encoding_mblen() header comment for the rules on when calling it is
okay.  For example, it's okay for NUL-terminated input.
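
To illustrate the distinction, here is a minimal sketch of why a length lookup
that inspects the second byte needs to know how many bytes remain when the
input is not NUL-terminated.  The names sketch_gb18030_mblen() and
sketch_gb18030_mblen_bounded() are hypothetical, and the INT_MAX sentinel is an
assumption made for this example; this is not the code in src/common/wchar.c.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: GB18030 character length as deduced from the first
 * two bytes.  It reads s[1] whenever s[0] has its high bit set, so the
 * caller must guarantee that s[1] is readable (e.g. NUL-terminated input).
 */
int
sketch_gb18030_mblen(const unsigned char *s)
{
    if ((s[0] & 0x80) == 0)
        return 1;               /* single-byte (ASCII) character */
    if (s[1] >= 0x30 && s[1] <= 0x39)
        return 4;               /* four-byte form */
    return 2;                   /* two-byte form */
}

/*
 * Hypothetical bounded variant for unterminated input: returns INT_MAX
 * instead of touching bytes beyond "remaining".
 */
int
sketch_gb18030_mblen_bounded(const unsigned char *s, size_t remaining)
{
    assert(s != NULL);
    if (remaining < 1)
        return INT_MAX;         /* nothing left to inspect */
    if ((s[0] & 0x80) != 0 && remaining < 2)
        return INT_MAX;         /* lead byte with no second byte available */
    return sketch_gb18030_mblen(s);
}
```

A caller would still compare the returned length against the bytes remaining
and treat any larger value, including the sentinel, as an incomplete character.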

>     Also, how can I reproduce this crash?

The patch-added test cases provide some indication on how to reproduce the
crash.
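
As a rough illustration of the precondition described in the commit message
(the last byte of the input being the last byte of a virtual memory page), the
sketch below places a buffer directly in front of an inaccessible guard page
on a POSIX system.  The helper place_at_page_end() is hypothetical and is not
taken from the patch's test code, which may set things up differently.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy "len" bytes so that the final byte is the last
 * byte of a readable page, with a PROT_NONE guard page right after it.
 * Any read one byte past the input then raises SIGSEGV.  Returns NULL on
 * failure.  The two-page mapping is intentionally never unmapped here.
 */
unsigned char *
place_at_page_end(const unsigned char *src, size_t len)
{
    long        pagesz = sysconf(_SC_PAGESIZE);
    size_t      page;
    unsigned char *base;

    assert(src != NULL);
    if (pagesz <= 0 || len == 0 || len > (size_t) pagesz)
        return NULL;
    page = (size_t) pagesz;

    /* Map two adjacent pages: one for data, one to act as the guard. */
    base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Make the second page inaccessible. */
    if (mprotect(base + page, page, PROT_NONE) != 0)
    {
        munmap(base, 2 * page);
        return NULL;
    }

    /* Put the input flush against the end of the first page. */
    memcpy(base + page - len, src, len);
    return base + page - len;
}
```

Passing a lone GB18030 lead byte such as 0x81 with len = 1 yields an input
whose following byte is unreadable, so a length routine that inspects the
second byte without a bounds check would be expected to fault on it.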