Use-after-free in reorderbuffer.c for INSERT ON CONFLICT - Mailing list pgsql-bugs

From Michael Paquier
Subject Use-after-free in reorderbuffer.c for INSERT ON CONFLICT
Date
Msg-id aIsQqDZ7x4LAQ6u1@paquier.xyz
Whole thread Raw
Responses Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT
Re: Use-after-free in reorderbuffer.c for INSERT ON CONFLICT
List pgsql-bugs
Hi all,

Ethan Mertz (colleague, in CC) has found a bug in reorderbuffer.c when
processing a REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM change, based
on the data gathered from a customer issue.  The bug is a
use-after-free, causing random crashes, that can be summarized like
that:
1) "specinsert" is saved as a new change to process:
   change = specinsert;
   change->action = REORDER_BUFFER_CHANGE_INSERT;
2) A bit later on the change and specinsert are freed when we are done
with the speculative insert:
  change_done:

       /*
        * If speculative insertion was confirmed, the record
        * isn't needed anymore.
        */
       if (specinsert != NULL)
       {
           ReorderBufferFreeChange(rb, specinsert, true);
           specinsert = NULL;
       }
3) Finally, a couple of lines down, we then do the following things,
and attempt to use a reference to change->lsn that has been freed:
#define CHANGES_THRESHOLD 100

       if (++changes_count >= CHANGES_THRESHOLD)
       {
           rb->update_progress_txn(rb, txn, change->lsn);
           changes_count = 0;
       }

This issue has been introduced in 8c58624df462, down to REL_16_STABLE.
Committer of the related change is added in CC (aka Amit K.).

Sawada-san (also in CC) has come up with an imaginative trick to
easily reproduce the issue, as of:
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2595,7 +2595,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
              * overhead, but testing showed there is no noticeable overhead if
              * we do it after every ~100 changes.
              */
-#define CHANGES_THRESHOLD 100
+#define CHANGES_THRESHOLD 1

             if (++changes_count >= CHANGES_THRESHOLD)
             {

When running the test "ddl" from test_decoding, an instance running
valgrind comes up with that, pointing immediately at the problem:
==28821== Invalid read of size 8
==28821==    at 0x222B2D8: ReorderBufferProcessTXN (reorderbuffer.c:2602)
==28821==    by 0x222C6D6: ReorderBufferReplay (reorderbuffer.c:2864)
==28821==    by 0x222C754: ReorderBufferCommit (reorderbuffer.c:2888)
==28821==    by 0x21EDA48: DecodeCommit (decode.c:743)
==28821==    by 0x21EA54C: xact_decode (decode.c:242)
==28821==    by 0x21E98AA: LogicalDecodingProcessRecord (decode.c:116)
==28821==    by 0x22093A4: pg_logical_slot_get_changes_guts (logicalfuncs.c:266)
==28821==    by 0x22097BC: pg_logical_slot_get_changes (logicalfuncs.c:333)
==28821==    by 0x1AF25D3: ExecMakeTableFunctionResult (execSRF.c:234)
==28821==    by 0x1B65606: FunctionNext (nodeFunctionscan.c:94)
==28821==    by 0x1AF7306: ExecScanFetch (execScan.h:126)
==28821==    by 0x1AF748D: ExecScanExtended (execScan.h:187)
==28821==  Address 0x107fae08 is 7,656 bytes inside a recently re-allocated block of size 8,192 alloc'd
==28821==    at 0x4844818: malloc (vg_replace_malloc.c:446)
==28821==    by 0x2AD3573: SlabAllocFromNewBlock (slab.c:565)
==28821==    by 0x2AD3EF7: SlabAlloc (slab.c:656)
==28821==    by 0x2AC5282: MemoryContextAlloc (mcxt.c:1237)
==28821==    by 0x221E17A: ReorderBufferAllocChange (reorderbuffer.c:511)
==28821==    by 0x222F64B: ReorderBufferAddNewTupleCids (reorderbuffer.c:3444)
==28821==    by 0x2248439: SnapBuildProcessNewCid (snapbuild.c:700)
==28821==    by 0x21EB9D8: heap2_decode (decode.c:437)
==28821==    by 0x21E98AA: LogicalDecodingProcessRecord (decode.c:116)
==28821==    by 0x22093A4: pg_logical_slot_get_changes_guts (logicalfuncs.c:266)
==28821==    by 0x22097BC: pg_logical_slot_get_changes (logicalfuncs.c:333)
==28821==    by 0x1AF25D3: ExecMakeTableFunctionResult (execSRF.c:234)

A simple solution suggested by Ethan would be to use the "prev_lsn"
guessed from the change at the beginning of the loop, rather than the
problematic change->lsn.  But that does not seem completely right to
me because we can switch to "specinsert" as the change to process,
hence wouldn't we want to call update_progress_txn() based on the LSN
of the actual change we are looking at?  All that leads me to the
attached.  Note that I am not the most familiar with this area of the
code, so please take my arguments with a grain of salt.

Comments and thoughts are welcome.
--
Michael

Attachment

pgsql-bugs by date:

Previous
From: Tender Wang
Date:
Subject: Re: BUG #19000: gist index returns inconsistent result with gist_inet_ops
Next
From: John Naylor
Date:
Subject: Re: BUG #19004: Incorrect lowercasing of word-final Greek capital Sigma (Σ)