Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode - Mailing list pgsql-committers
| From | Robert Haas |
|---|---|
| Subject | Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode |
| Date | |
| Msg-id | CA+TgmoazkgAyyMiOy7t2cCRX-VpWUULAOxx+5mb92WCTvkyDmA@mail.gmail.com |
| In response to | Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode (Andres Freund <andres@anarazel.de>) |
| Responses | Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode<br>RE: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode<br>Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode<br>Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode |
| List | pgsql-committers |
On Tue, Mar 23, 2021 at 3:13 PM Andres Freund <andres@anarazel.de> wrote:
> You cache it.

Yeah, exactly. I don't think it's super-easy to understand exactly how to make that work well for something like this. It would be easy enough to set a flag in the relcache whose value is computed the first time we need it and is then consulted every time after that, and you just invalidate it based on sinval messages. But, if you go with that design, you've got a big problem: now an insert has to lock all the tables in the partitioning hierarchy to decide whether it can run in parallel or not, and we do not want that. We want to be able to just lock the partitions we need, so really, we want to be able to test for parallel-safety without requiring a relation lock, or only requiring it on the partitioned table itself and not all the partitions. However, that introduces a race condition, because if you ever check the value of the flag without a lock on the relation, then you can't rely on sinval to blow away the cached state.

I don't have a good solution to that problem in mind right now, because I haven't really devoted much time to thinking about it, but I think it might be possible to solve it with more thought. But if I had thought about it and had not come up with anything better than what you committed here, I would have committed nothing, and I think that's what you should have done, too.

This patch is full of grotty hacks. Just to take one example, consider the code that forces a transaction ID assignment prior to the operation. You *could* have solved this problem by introducing new machinery to make it safe to assign an XID in parallel mode. Then, we'd have a fundamental new capability that we currently lack. Instead, you just force-assigned an XID before entering parallel mode. That's not building any new capability; that's just hacking around the lack of a capability to make something work, while ignoring the disadvantages of doing it that way, namely that sometimes an XID will be assigned for no purpose.

Likewise, the XXX comment you added to max_parallel_hazard_walker claims that some of the code introduced there is to compensate for an unspecified bug in the rewriter. I'm a bit skeptical that the comment is correct, and there's no way to find out because the comment doesn't say what the bug supposedly is, but let's just say for the sake of argument that it's true. Well, you *could* have fixed the bug, but instead you hacked around it, and in a relatively expensive way that affects every query with a CTE in it, whether it can benefit from this patch or not. That's not a responsible way of maintaining the core PostgreSQL code.

I also agree with Andres's criticism of the code in target_rel_index_max_parallel_hazard entirely. It's completely unacceptable to be doing index_open() here. If you don't understand the design of the planner well enough to know why that's not OK, then you shouldn't be committing patches like this.

--
Robert Haas
EDB: http://www.enterprisedb.com
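To make the relcache-caching idea discussed above concrete, here is a minimal, self-contained sketch in plain C. It is not PostgreSQL source: the struct, field, and function names (CachedRel, insert_safety, invalidate_rel, and so on) are invented for illustration, and in the real system the invalidation would arrive via sinval processing rather than a direct call.

```c
/*
 * Illustrative sketch of "compute the flag on first use, cache it,
 * and reset it on invalidation".  All names here are hypothetical;
 * they only model the pattern being debated, not PostgreSQL's code.
 */
#include <stdbool.h>
#include <stdio.h>

typedef enum
{
    PS_UNKNOWN,                     /* not computed since last invalidation */
    PS_SAFE,
    PS_UNSAFE
} ParallelSafety;

typedef struct CachedRel
{
    const char    *relname;
    bool           has_unsafe_trigger;  /* stands in for the real checks */
    ParallelSafety insert_safety;       /* the cached verdict */
} CachedRel;

/*
 * Expensive check: in the real system this would walk column defaults,
 * check constraints, index expressions, triggers, and every partition.
 */
static ParallelSafety
compute_insert_safety(const CachedRel *rel)
{
    return rel->has_unsafe_trigger ? PS_UNSAFE : PS_SAFE;
}

/* First use computes and caches; later uses just read the flag. */
static bool
rel_insert_parallel_safe(CachedRel *rel)
{
    if (rel->insert_safety == PS_UNKNOWN)
        rel->insert_safety = compute_insert_safety(rel);
    return rel->insert_safety == PS_SAFE;
}

/*
 * Stand-in for processing an invalidation message.  In PostgreSQL this
 * only happens at points where the backend reads its inval queue while
 * holding an adequate lock, which is exactly why reading the cached
 * flag without locking the relation (or its partitions) is racy.
 */
static void
invalidate_rel(CachedRel *rel)
{
    rel->insert_safety = PS_UNKNOWN;
}

int
main(void)
{
    CachedRel t = {"part_tab", false, PS_UNKNOWN};

    printf("safe? %d\n", rel_insert_parallel_safe(&t));  /* computes: 1 */

    /* Someone adds a parallel-unsafe trigger and an invalidation arrives. */
    t.has_unsafe_trigger = true;
    invalidate_rel(&t);

    printf("safe? %d\n", rel_insert_parallel_safe(&t));  /* recomputes: 0 */
    return 0;
}
```

The crux is the last step: invalidate_rel only helps if it actually runs before the next read, and that is not guaranteed for a backend that checks the flag without a lock on the affected partitions, which is the race condition described in the message above.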