Re: ALTER TABLE ADD COLUMN fast default - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: ALTER TABLE ADD COLUMN fast default |
Date | |
Msg-id | CAA8=A7_MS4biaP2nuAcOMAcZ++_2e9uks8kYkHYPKB0q3SB44Q@mail.gmail.com Whole thread Raw |
In response to | Re: ALTER TABLE ADD COLUMN fast default (Andres Freund <andres@anarazel.de>) |
Responses |
Re: ALTER TABLE ADD COLUMN fast default
Re: ALTER TABLE ADD COLUMN fast default |
List | pgsql-hackers |
On Fri, Mar 30, 2018 at 5:00 AM, Andres Freund <andres@anarazel.de> wrote: > Hi, > > On 2018-03-29 10:16:23 +1030, Andrew Dunstan wrote: >> On Wed, Mar 28, 2018 at 5:30 PM, Andres Freund <andres@anarazel.de> wrote: >> > Hi, >> > >> > On 2018-03-26 09:02:10 +1030, Andrew Dunstan wrote: >> >> Thanks for this, all looks good. Here is the consolidate patch >> >> rebased. If there are no further comments I propose to commit this in >> >> a few days time. >> > >> > Here's a bit of post commit review: >> > >> > @@ -1310,13 +1679,11 @@ slot_getsomeattrs(TupleTableSlot *slot, int attnum) >> > >> > /* >> > * If tuple doesn't have all the atts indicated by tupleDesc, read the >> > - * rest as null >> > + * rest as NULLs or missing values >> > */ >> > - for (; attno < attnum; attno++) >> > - { >> > - slot->tts_values[attno] = (Datum) 0; >> > - slot->tts_isnull[attno] = true; >> > - } >> > + if (attno < attnum) >> > + slot_getmissingattrs(slot, attno, attnum); >> > + >> > slot->tts_nvalid = attnum; >> > } >> > >> > It's worthwhile to note that this'll re-process *all* missing values, >> > even if they've already been deformed. I.e. if >> > slot_getmissingattrs(.., first-missing) >> > slot_getmissingattrs(.., first-missing + 1) >> > slot_getmissingattrs(.., first-missing + 2) >> > is called, all three missing values will be copied every time. That's >> > because tts_nvalid isn't taken into account. I wonder if slot_getmissingattrs >> > could take tts_nvalid into account? >> > >> > I also wonder if this doesn't deserve an unlikely(), to avoid the cost >> > of spilling registers in the hot branch of not missing any values. > >> One of us at least is very confused about this function. >> slot_getmissingattrs() gets called at most once by slot_getsomeattrs >> and never for any attribute that's already been deformed. > > Imagine the same slot being used in two different expressions. The > typical case for that is e.g. something like > SELECT a,b,c,d FROM foo WHERE b = 1; > > this'll trigger one slot_getsomeattrs(slot, 2) call from within qual > evaluation, and then a slot_getsomeattrs(slot, 4) from within the > projection code. If you then imagine a tuple where only the first > column exists physically, we'd copy b twice, because attno is only going > to be one 1. I think we might just want to check tts nvalid in > getmissingattrs? > > OK. so add something like this to the top of slot_getmissingattrs()? startAttNum = Max(startAttNum, slot->tts_nvalid); >> > I'm still strongly object to do doing this unconditionally for queries >> > that never need any of this. We're can end up with a significant number >> > of large tuples in memory here, and then copy that through dozens of >> > tupledesc copies in queries. > >> We're just doing the same thing we do for default values. > > That's really not a defense. It's hurting us there too. > I will take a look and see if it can be avoided easily. > >> None of the tests I did with large numbers of missing values seemed to >> show performance impacts of the kind you describe. Now, none of the >> queries were particularly complex, but the worst case was from >> actually using very large numbers of attributes with missing values, >> where we'd need this data anyway. If we just selected a few attributes >> performance went up rather than down. > > Now imagine a partitioned workload with a few thousand partitions over a > few tables. The additional memory is going to start being noticeable. > > >> pg_attrdef isn't really a good place for it (what if they drop the >> default?). So the only alternative then would be a completely new >> catalog. I'd need a deal of convincing that that was justified. > > There's plenty databases with pg_attribute being many gigabytes large, > and this is going to make that even worse. > I'll change it if there's a consensus about it, but so far the only other opinion has been from Tom who doesn't apparently see much of a problem. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: