Thread: A modest proposal: make parser/rewriter/planner inputs read-only
The parser, rewriter, and planner all have a bad habit of scribbling on their input parsetrees. At this point I've lost count of how many bugs that's caused (but e33f2335a and 0f43083d1 are the two latest examples), and of how many places there are that are brute-forcing a solution to the problem by copying a whole parsetree before letting one of these subsystems see it. Years ago we fixed the executor to treat its input Plan trees as read-only. It seems like we really ought to do the same for these upstream subsystems. Surely, whatever benefit we get from changing Node contents in-place is lost due to all these other hacks. While I've had this thought in the back of my head for a long while, I haven't pursued it because I couldn't think of a practical way to find all the places that are violating such a policy, much less prevent future violations. However, today I thought of this sketch of a solution: 1. Invent a way to make a particular memory context read-only after putting some data into it. 2. In debug builds, after we've built a tree that should be considered read-only, copy it into such a context and make it read-only. Or perhaps build it there in the first place. 3. Fix the resulting crashes. 4. Profit! (In particular, nuke a lot of no-longer-needed copyObject calls.) My first thought about implementing #1 was to seek Valgrind's help, but so far as I can find out there's no VALGRIND_MAKE_MEM_READ_ONLY. Step #3 would be pretty tedious anyway if it required running under Valgrind. However, all modern hardware has the ability to mark memory read-only at the page level, and most platforms expose that in some way or other. So it doesn't seem unreasonable to invent a memory context option (or whole new context type, if that seems easier) that is careful to align its allocation blocks on page boundaries and then can set or clear the hardware R/O flag on demand. It'd be enough if the R/O enforcement worked on popular development platforms, we don't have to make it work absolutely everywhere. I'm not planning to pursue this idea Right Now, but it seems like something that could happen for v19 or so. In the meantime I wanted to get the ideas down on electrons. Thoughts? regards, tom lane
Hi, On 2025-04-05 12:46:37 -0400, Tom Lane wrote: > 1. Invent a way to make a particular memory context read-only > after putting some data into it. > > 2. In debug builds, after we've built a tree that should be considered > read-only, copy it into such a context and make it read-only. Or > perhaps build it there in the first place. > 3. Fix the resulting crashes. > > 4. Profit! (In particular, nuke a lot of no-longer-needed copyObject > calls.) > > My first thought about implementing #1 was to seek Valgrind's help, > but so far as I can find out there's no VALGRIND_MAKE_MEM_READ_ONLY. > Step #3 would be pretty tedious anyway if it required running under > Valgrind. However, all modern hardware has the ability to mark > memory read-only at the page level, and most platforms expose that > in some way or other. So it doesn't seem unreasonable to invent > a memory context option (or whole new context type, if that seems > easier) that is careful to align its allocation blocks on page > boundaries and then can set or clear the hardware R/O flag on > demand. It'd be enough if the R/O enforcement worked on popular > development platforms, we don't have to make it work absolutely > everywhere. FWIW, while hacking on patch to making hint bit writes not happening while IO is going on (so we don't need to copy the page anymore and don't cause filesystem level issues with DIO), I hacked up protection for shared buffers using mprotect() - it worked way better than I thought it would. The overhead ended up surprisingly low: base: real 1m4.613s user 4m31.409s sys 3m20.445s ENFORCE_BUFFER_PROT real 1m11.912s user 4m27.332s sys 3m28.063s See https://postgr.es/m/043c8b50-d183-46e5-b054-145cc0f6f908%40iki.fi I'm mostly sharing that to say that a) yes, mprotect() is viable and works surprisingly well b) it might be worth inventing some common platform abstraction for mprotect That prototype patch already worked on most platforms, windows should be entirely doable. Greetings, Andres Freund
On Sun, Apr 6, 2025 at 1:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > The parser, rewriter, and planner all have a bad habit of scribbling > on their input parsetrees. At this point I've lost count of how many > bugs that's caused (but e33f2335a and 0f43083d1 are the two latest > examples), and of how many places there are that are brute-forcing a > solution to the problem by copying a whole parsetree before letting > one of these subsystems see it. > > Years ago we fixed the executor to treat its input Plan trees as > read-only. It seems like we really ought to do the same for these > upstream subsystems. Surely, whatever benefit we get from changing > Node contents in-place is lost due to all these other hacks. While I'm in favor of this idea, it seems that it will require lots of code changes. In particular, within the planner, the parsetree goes through considerable transformations during the preprocessing phase, such as subquery pull-up, constant folding, and so on. Would this proposal mean we'd need to refactor all those places to treat the input parsetrees as read-only? Just trying to understand the scope of what would be involved. Thanks Richard
On 4/5/25 18:46, Tom Lane wrote: > I'm not planning to pursue this idea Right Now, but it seems like > something that could happen for v19 or so. In the meantime I wanted > to get the ideas down on electrons. > > Thoughts? I generally like the idea because, for now, I need to be sure that no one touched the parse tree before copying it to do additional transformations before the optimisation phase. But what is the way you are proposing here? Do you mean that one more entity will be explicitly introduced: a transformed parse tree? It would open an opportunity for extensions to build a set of alternative transformed trees, pass them through the optimisation phase and choose the best plan. -- regards, Andrei Lepikhov
Richard Guo <guofenglinux@gmail.com> writes: > On Sun, Apr 6, 2025 at 1:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Years ago we fixed the executor to treat its input Plan trees as >> read-only. It seems like we really ought to do the same for these >> upstream subsystems. Surely, whatever benefit we get from changing >> Node contents in-place is lost due to all these other hacks. > While I'm in favor of this idea, it seems that it will require lots of > code changes. In particular, within the planner, the parsetree goes > through considerable transformations during the preprocessing phase, > such as subquery pull-up, constant folding, and so on. Would this > proposal mean we'd need to refactor all those places to treat the > input parsetrees as read-only? Just trying to understand the scope of > what would be involved. From a high-level point of view, I'd be satisfied to have this property at the subsystem level: parse analysis can't modify the raw parse tree output by the grammar, rewriter can't modify the analyzed parse tree it's given, planner can't modify the parse tree it's given. That would not preclude, say, portions of the planner doing modify-in-place as long as it was certain that what they were modifying was a planner-local copy. (The proposed test infrastructure would only be able to enforce things at the subsystem level anyway.) As we get into the project we might decide that it'd be worth enforcing similar rules on portions of those subsystems, eg successive phases of the planner. I don't have strong feelings about that at present. An example here is that it'd be nice to have some way of verifying that a GEQO search step doesn't change any of the data structures that are supposed to remain the same. I do suspect that there are specific functions that will need to be made strict about this, for example eval_const_expressions, but that doesn't necessarily translate to a hard requirement across the board. regards, tom lane
Andrei Lepikhov <lepihov@gmail.com> writes: > But what is the way you are proposing here? Do you mean that one more > entity will be explicitly introduced: a transformed parse tree? No, I wasn't thinking of adding new concepts, just saying that the inputs to certain operations need to be treated as read-only. What would you envision a "transformed parse tree" being that's not what we have today? > It would open an opportunity for extensions to build a set of > alternative transformed trees, pass them through the optimisation phase > and choose the best plan. Yeah, the ability to repeatedly operate on a tree without the overhead of making copies would be a major benefit of being stricter here. regards, tom lane
On Sat, Apr 5, 2025 at 7:31 PM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2025-04-05 12:46:37 -0400, Tom Lane wrote:
> 1. Invent a way to make a particular memory context read-only
> after putting some data into it.
>
> 2. In debug builds, after we've built a tree that should be considered
> read-only, copy it into such a context and make it read-only. Or
> perhaps build it there in the first place.
> 3. Fix the resulting crashes.
>
> 4. Profit! (In particular, nuke a lot of no-longer-needed copyObject
> calls.)
>
> My first thought about implementing #1 was to seek Valgrind's help,
> but so far as I can find out there's no VALGRIND_MAKE_MEM_READ_ONLY.
> Step #3 would be pretty tedious anyway if it required running under
> Valgrind. However, all modern hardware has the ability to mark
> memory read-only at the page level, and most platforms expose that
> in some way or other. So it doesn't seem unreasonable to invent
> a memory context option (or whole new context type, if that seems
> easier) that is careful to align its allocation blocks on page
> boundaries and then can set or clear the hardware R/O flag on
> demand. It'd be enough if the R/O enforcement worked on popular
> development platforms, we don't have to make it work absolutely
> everywhere.
FWIW, while hacking on patch to making hint bit writes not happening while IO
is going on (so we don't need to copy the page anymore and don't cause
filesystem level issues with DIO), I hacked up protection for shared buffers
using mprotect() - it worked way better than I thought it would. The overhead
ended up surprisingly low:
base:
real 1m4.613s
user 4m31.409s
sys 3m20.445s
ENFORCE_BUFFER_PROT
real 1m11.912s
user 4m27.332s
sys 3m28.063s
See https://postgr.es/m/043c8b50-d183-46e5-b054-145cc0f6f908%40iki.fi
I'm mostly sharing that to say that
a) yes, mprotect() is viable and works surprisingly well
b) it might be worth inventing some common platform abstraction for mprotect
That prototype patch already worked on most platforms, windows should be
entirely doable.
Also, I would like to provide reference to this old thread [1] in favor of mprotect().
In that thread, and in Greenplum using mprotect helps detect Shared buffer access rule violations.
This electronic communication and the information and any files transmitted with it, or attached to it, are confidential and are intended solely for the use of the individual or entity to whom it is addressed and may contain information that is confidential, legally privileged, protected by privacy laws, or otherwise restricted from disclosure to anyone else. If you are not the intended recipient or the person responsible for delivering the e-mail to the intended recipient, you are hereby notified that any use, copying, distributing, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited. If you received this e-mail in error, please return the e-mail to the sender, delete it from your computer, and destroy any printed copy of it.
On 4/15/25 19:34, Tom Lane wrote: > Andrei Lepikhov <lepihov@gmail.com> writes: >> But what is the way you are proposing here? Do you mean that one more >> entity will be explicitly introduced: a transformed parse tree? > > No, I wasn't thinking of adding new concepts, just saying that the > inputs to certain operations need to be treated as read-only. > What would you envision a "transformed parse tree" being that's not > what we have today? I just want to understand how your idea will work. The query_planner does the job for subqueries separately. If a query is transformed in some way (let's say, an unnecessary join is deleted), we need to change references in the parse tree of another subquery, or it will not find the reference at the moment of planning, right? -- regards, Andrei Lepikhov
Andrei Lepikhov <lepihov@gmail.com> writes: > I just want to understand how your idea will work. The query_planner > does the job for subqueries separately. If a query is transformed in > some way (let's say, an unnecessary join is deleted), we need to change > references in the parse tree of another subquery, or it will not find > the reference at the moment of planning, right? Don't see why. If we're separately planning a subquery, we would not dare to change anything about its outputs, just as we would not make a change that affects the topmost level's outputs. I don't believe there's anything right now that requires a recursive subquery_planner call to change the outer parsetree, and this idea wouldn't affect that. Now, subquery_planner does have side effects on the PlannerGlobal struct, but that's planner-local data, not an input to the planner. Maybe we would like to have some enforced contract about what subquery_planner can and can't touch in the outer planner level's data, but I'm not feeling a great need for that right now. regards, tom lane
On 4/16/25 18:22, Tom Lane wrote: > Maybe we would like to have some enforced contract about what > subquery_planner can and can't touch in the outer planner level's > data, but I'm not feeling a great need for that right now. I wouldn't say I'm entirely convinced that side effects are absent, but multiple copies of the same query tree look messy now. It makes sense to try this way and see what happens. -- regards, Andrei Lepikhov