Re: making update/delete of inheritance trees scale better - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: making update/delete of inheritance trees scale better |
Date | |
Msg-id | CA+HiwqFMw9G8SQMzjnPge2X3tdZF0uvkFTjrV--7_yQb6eEDSQ@mail.gmail.com Whole thread Raw |
In response to | Re: making update/delete of inheritance trees scale better (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: making update/delete of inheritance trees scale better
Re: making update/delete of inheritance trees scale better |
List | pgsql-hackers |
Hello, I have been working away at this and have updated the patches for many cosmetic and some functional improvements. On Fri, Jun 12, 2020 at 3:46 PM Amit Langote <amitlangote09@gmail.com> wrote: > I divided that into two patches: > > 1. Make the plan producing tuples to be updated emit only the columns > that are actually updated. postgres_fdw test fails unless you also > apply the patch I posted at [1], because there is an unrelated bug in > UPDATE tuple routing code that manifests due to some changes of this > patch. > > 2. Due to 1, inheritance_planner() is no longer needed, that is, > inherited update/delete can be handled by pulling the rows to > update/delete from only one plan, not one per child result relation. > This one makes that so. > > There are some unsolved problems having to do with foreign tables in > both 1 and 2: > > In 1, FDW update APIs still assume that the plan produces "full" tuple > for update. That needs to be fixed so that FDWs deal with getting > only the updated columns in the plan's output targetlist. > > In 2, still haven't figured out a way to call PlanDirectModify() on > child foreign tables. Lacking that, inherited updates on foreign > tables are now slower, because they are not pushed down. I'd like to > figure something out to fix that situation. In the updated patch, I have implemented a partial solution to this, but I think it should be enough in most practically useful situations. With the updated patch, PlanDirectModify is now called for child result relations, but the FDWs will need to be revised to do useful work in that call (as the patch does for postgres_fdw), because a potentially pushable ForeignScan involving a given child result relation will now be at the bottom of the source plan tree, whereas before it would be the top-level plan. Another disadvantage of this new situation is that inherited update/delete involving joins that were previously pushable cannot be pushed anymore. If update/delete would have been able to use partition-wise join, a child join involving a given child result relation could in principle be pushed, but some semi-related issues prevent the use of partition-wise joins for update/delete, especially when there are foreign table partitions. Another major change is that instead of "tableoid" junk attribute to identify the target result relation for a given tuple to be updated/deleted, the patch now makes the tuples to be updated/deleted contain a junk attribute that gives the index of the result relation in the query's list of result relations which can be used to look up the target result relation directly. With "tableoid", we would need to build a hash table to map the result relation OIDs to result relation indexes, a step that could be seen to become a bottleneck with large partition counts (I am talking about executing generic plans here and have mentioned this problem on the thread to make generic plan execution for update/delete faster [1]). Here are the commit messages of the attached patches: [PATCH v3 1/3] Overhaul how updates compute a new tuple Currently, the planner rewrites the top-level targetlist of an update statement's parsetree so that it contains entries for all attributes of the target relation, including for those columns that have not been changed. This arrangement means that the executor can take a tuple that the plan produces, remove any junk attributes in it and pass it down to the table AM or FDW update API as the new tuple. It also means that in an inherited update, where there are multiple target relations, the planner must produce that many plans, because the targetlists for different target relations may not all look the same considering that child relations may have different sets of columns with varying attribute numbers. This commit revises things so that the planner no longer expands the parsetree targetlist to include unchanged columns so that the plan only produces values of the changed columns. To make the new tuple to pass to table AM and FDW update API, executor now evaluates another targetlist matching the target table's TupleDesc which refers to the plan's output tuple to gets values of the changed columns and to the old tuple that is refetched for values of unchanged columns. To get values for unchanged columns to use when forming the new tuple to pass to ExecForeignUpdate(), we now require foreign scans to always include the wholerow Var corresponding to the old tuple being updated, because the unchanged columns are not present in the plan's targetlist. As a note to FDW authors, any FDW update planning APIs that look at the plan's targetlist for checking if it is pushable to remote side (e.g. PlanDirectModify) should now instead look at "update targetlist" that is set by the planner in PlannerInfo.update_tlist, because resnos in the plan's targetlist is no longer indexable by target column's attribute numbers. Note that even though the main goal of doing this is to avoid having to make multiple plans in the inherited update case, this commit does not touch that subject. A subsequent commit will change things that are necessary to make inherited updates work with a single plan. [PATCH v3 2/3] Include result relation index if any in ForeignScan FDWs that can perform an UPDATE/DELETE remotely using the "direct modify" set of APIs need in some cases to access the result relation properties for which they can currently look at EState.es_result_relation_info. However that means the executor must ensure that es_result_relation_info points to the correct result relation at all times, especially during inherited updates. This requirement gets in the way of number of projects related to changing how ModifyTable operates. For example, an upcoming patch will change things such that there will be one source plan for all result relations whereas currently there is one per result relation, an arrangement which makes it convenient to switch the result relation when the source plan changes. This commit installs a new field 'resultRelIndex' in ForeignScan node which must be set by an FDW if the node will be used to carry out an UPDATE/DELETE operation on a given foreign table, which is the case if the FDW manages to push that operations to the remote side. This commit also modifies postgres_fdw to implement that. [PATCH v3 3/3] Revise how inherited update/delete are handled Now that we have the ability to maintain and evaluate the targetlist needed to generate an update's new tuples independently of the plan which fetches the tuples to be updated, there is no need to make separate plans for child result relations as inheritance_planner() currently does. We generated separate plans before such capability was present, because that was the only way to generate new tuples of child relations where each may have its own unique set of columns (albeit all sharing the set columns present in the root parent). With this commit, an inherited update/delete query will now be planned just as a non-inherited one, generating a single plan that goes under ModifyTable. The plan for the inherited case is essentially the one that we get for a select query, although the targetlist additionally contains junk attributes needed by update/delete. By going from one plan per result relation to only one shared across all result relations, the executor now needs a new way to identify the result relation to direct a given tuple's update/delete to, whereas before, it could tell that from the plan it is executing. To that end, the planner now adds a new junk attribute to the query's targetlist that for each tuple gives the index of the result relation in the query's list of result relations. That is in addition to the junk attribute that the planner already adds to identify the tuple's position in a given relation (such as "ctid"). Given the way query planning with inherited tables work where child relations are not part of the query's jointree and only the root parent is, there are some challenges that arise in the update/delete case: * The junk attributes needed by child result relations need to be represented as root parent Vars, which is a non-issue for a given child if what the child needs and what is added for the root parent are one and the same column. But considering that that may not always be the case, more parent Vars might get added to the top-level targetlist as children are added to the query as result relations. In some cases, a child relation may use a column that is not present in the parent (allowed by traditional inheritance) or a non-column expression, which must be represented using what this patch calls "fake" parent vars. These fake parent vars are really only placeholders for the underlying child relation's column or expression and don't reach the executor's expression evluation machinery. * FDWs that are able to push update/delete fully to the remote side using DirectModify set of APIs now have to go through hoops to identify the subplan and the UPDATE targetlist to push for child result relations, because the subplans for individual result relations are no loger top-level plans. In fact, if the result relation is joined to another relation, update/delete cannot be pushed down at all anymore, whereas before since the child relations would be present in the main jointree, they could be in the case where the relation being joined to was present on the same server as the child result relation. -- Amit Langote EnterpriseDB: http://www.enterprisedb.com [1] https://www.postgresql.org/message-id/CA%2BHiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA%40mail.gmail.com
Attachment
pgsql-hackers by date: