Re: PostgreSQL crashes with SIGSEGV - Mailing list pgsql-bugs
From | Peter Geoghegan |
---|---|
Subject | Re: PostgreSQL crashes with SIGSEGV |
Date | |
Msg-id | CAH2-WzkAWqMQCOoT7xV_otN77FGa8psK8rVM_gu9Q=R7XiQVtw@mail.gmail.com Whole thread Raw |
In response to | Re: PostgreSQL crashes with SIGSEGV (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: PostgreSQL crashes with SIGSEGV
|
List | pgsql-bugs |
On Thu, Dec 14, 2017 at 5:58 PM, Peter Geoghegan <pg@bowt.ie> wrote: > The ambiguity about who owns the tuple memory when is the basic > problem, once again. One could argue that it's the caller's fault for > not knowing to not pfree() the tuple after tuplesort_end() is called, > but I don't buy that because it seems like an undue burden on callers > to do that. It seems okay to either provide a very weak, simple > guarantee about tuple memory lifetime, or very strong simple guarantee > about tuple memory lifetime. That's what routines like > tuplesort_gettupleslot() and tuplesort_getindextuple() should limit > their contracts to -- we've had enough problems in this area over the > years that that seems pretty clear to me. > > ISTM that an appropriate fix is one that results in having > tuplesort_gettupleslot() tuple memory that is *consistently* allocated > within caller's own memory context, at least on versions prior to > Postgres 10 (it's a bit more complicated in Postgres 10). This took longer than expected. Attached patch, which targets 9.6, shows what I have in mind. I'm going on vacation shortly, but wanted to post this patch before I leave. It could definitely use some more testing, since it was written under time pressure. I expect to be back early in the new year, so if someone feels like taking this off my hands, they're more than welcome to do so. Postgres v10 isn't affected because there is no such thing as should_free from commit f1f5ec1ef onwards (commit fa117ee40 reintroduced a no-copy optimization a while later, which looks like it made things unsafe again, but it's actually fine -- the ExecClearTuple() pfree() won't happen, so no crash). Everything is fine on v10 because the design was improved for v10. On v10, tuplesort_gettupleslot() calls ExecStoreMinimalTuple() with arguments that indicate that the memory still belongs to the tuplesort, unless it was an explicit copy made with heap_copy_minimal_tuple() there and then, in which case it belong to caller. This works because on v10 we're always managing some temp buffers, regardless of state->status. There is no question of a routine like readtup_alloc() allocating or using memory without knowing about its lifetime, because it's always using the smallish slab allocator buffers. To put in another way, in v10 everything is fine because callers to routines like tuplesort_gettupleslot() either have very limited guarantees about fetched tuple memory that is owned by tuplesort (it can go away almost immediately, on the next call to tuplesort_gettupleslot()), or infinitely flexible guarantees (it's unambiguously caller's copy, and lives in caller's memory context). There is no awkward mix of these two situations, as is the case in earlier versions. That awkward mix is: "You (caller) should free this memory, and can rely on this memory sticking around until tuplesort_end() is called, at which point I (tuplesort) will free it if you haven't -- though I might *not* free it then, based on whether or not I felt like using your memory context!". This is an accident waiting to happen. Remember, the crash isn't a use-after-free; it's a double-free caused by conflated responsibilities across two low-level, refcount-like mechanisms. The idea of a tts_shouldFree=true tuple whose memory exists in a context that isn't under the executor's direct control is a fundamentally bogus idea. It kinda works most of the time, but only to the extent that the lifetime of a tuplesort is accidentally tied to the lifetime of an executor node, without the tuple/slot making in into something like estate->es_tupleTable (as was the case in Andreas' example). An alternative approach to fixing this issue, that I also want to throw out there, is perform a simple change within ExecResetTupleTable() to make ExecClearTuple()'d slots force "tts_shouldFree = false" -- a second patch shows what I mean here. All ExecResetTupleTable() callers pass "shouldFree = false" anyway, because all memory is about to go away (within FreeExecutorState()). This second approach seems like a real kludge to me, but it's also a fix that requires a tiny tweak, rather than non-trivial tuplesort.c memory management changes. I would feel worse about the idea of applying this kludge if we had to rely on it forever, but since v10 is unaffected anyway, we really don't. But it's still an awful kludge. I find it hard to know what to do here (the vacation is overdue, it seems). (Note that v10 does have a minor, related bug, noted in commit messages of both patches.) -- Peter Geoghegan
Attachment
pgsql-bugs by date: