Re: Query plan and Inheritance. Weird behavior - Mailing list pgsql-performance
From | John Lange |
---|---|
Subject | Re: Query plan and Inheritance. Weird behavior |
Date | |
Msg-id | 1043336171.2048.35.camel@johnlaptop.darkcore.net Whole thread Raw |
In response to | Re: Query plan and Inheritance. Weird behavior (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-performance |
On Thu, 2003-01-23 at 00:07, Tom Lane wrote: > > On 22 Jan 2003, John Lange wrote: > >> In this way the parent table would not need to know, nor would it care > >> about child tables in any way (just like inheritance in most programming > >> languages). If done this way a select on a parent table would only > >> require the retrieval of a single row and a select on a child table > >> would only require the retrieval of two rows (one in the child table and > >> one in the parent table). > > No, it'd require the retrieval of N rows: you're failing to think about > multiple levels of inheritance or multi-parent inheritance, both of > which are supported reasonably effectively by the current model. Lets not be too nit-picky here. In the case of multiple layers of inheritance you are still only selecting two rows (at a time), one child, one parent. However if the parent also has a parent, then the process repeats, once for every layer. This is entirely reasonable and efficient compared to the current model where a select on a parent table requires the same select to be executed on EVERY child table. If it's a large expensive JOIN of some kind then this is verging on un-workable. > My guess is that this scheme would crash and burn just on locking > considerations. (When you want to update a child row, what locks do you > have to get in what order? With pieces of the row scattered through > many tables, it'd be pretty messy.) You lock the parent on down to the last child. I'm not a database developer but that seems fairly straight forward? The choice between the schema I've suggested and the way it is currently implemented is a trade off between more efficient selects vs. more efficient updates. If you are selecting on the parent table more than updating then my idea is vastly more efficient. If you INSERT a lot then the current way is marginally better. With apologies to the developers, I don't feel the current implementation is really usable for the simple fact that expensive operations performed on the parent table causes them to be repeated for every child table. And, as an added penalty, indexes on parent tables are NOT inherited to the children so the child operations can be even more expensive. This solution is not that large and I've already got 6 child tables. It just so happens that I do a LOT of selects on the parent so I'm going to have to make a decision on where to go from here. Solving this programmatically is not really that hard but I've gone a ways down this path now so I'm not anxious to redo the entire database schema since we do have customers already using this. > You may care to look in the pghackers archives for prior discussions. I will, thanks. > The variant scheme that's sounded most interesting to me so far is to > store *all* rows of an inheritance hierarchy in a single physical table. Unless I'm not understanding I don't think that works. In my case for example, a single parent has 4-5 children so the only columns they have in common are the ones in the parent. Combining them all into a single table means a big de-normalized table (loads of empty columns). If you are going to go this route then you might as well just do it. It doesn't need to be implemented on the DBMS. Regards, John Lange > This'd require giving up multiple inheritance, but few people seem to > use that, and the other benefits (like being able to enforce uniqueness > constraints over the whole hierarchy with just a standard unique index) > seem worth it. No one's stepped up to bat to do the legwork on the idea > yet, though. One bit that looks pretty tricky is ALTER TABLE ADD > COLUMN. > > regards, tom lane
pgsql-performance by date: