Re: Faster methods for getting SPI results (460%improvement) - Mailing list pgsql-hackers
From | Jim Nasby |
---|---|
Subject | Re: Faster methods for getting SPI results (460%improvement) |
Date | |
Msg-id | a96b9506-5587-7290-9eed-8830ce7f318e@openscg.com Whole thread Raw |
In response to | Re: Faster methods for getting SPI results (460% improvement) (Craig Ringer <craig@2ndquadrant.com>) |
Responses |
Re: Faster methods for getting SPI results (460%improvement)
|
List | pgsql-hackers |
On 4/4/17 7:44 PM, Craig Ringer wrote: > On 5 April 2017 at 08:23, Craig Ringer <craig@2ndquadrant.com> wrote: >> On 5 April 2017 at 08:00, Craig Ringer <craig@2ndquadrant.com> wrote: > This patch fails to update the documentation at all. > > https://www.postgresql.org/docs/devel/static/spi.html I'll fix that soon. > missing newline Fixed. > +/* Execute a previously prepared plan with a callback Destination */ > > > caps "Destination" Hmm, I capitalized it since DestReceiver is capitalized. I guess it's best to just drop Destination. > + // XXX throw error if callback is set Fixed (opted to just use an Assert). > +static DestReceiver spi_callbackDR = { > + donothingReceive, donothingStartup, donothingCleanup, donothingCleanup, > + DestSPICallback > +}; > Presumably that's a default destination you're then supposed to modify > with your own callbacks? There aren't any comments to indicate what's > going on here. Correct. Added the following comment: > /* > * This is strictly a starting point for creating a callback. It should not > * actually be used. > */ > Comments on patch 2: > > If this is the "bare minimum" patch, what is pending? Does it impose > any downsides or limits? No limits. I'm not aware of any downsides. It's "bare minimum" because a follow-on step is to allow different methods of returning results. In particular, my testing indicates that patch 1 + returning a dict of lists (as opposed to the current list of dicts) results in a 460% improvement vs ~30% with patch 2. > +/* Get switch execution contexts */ > +extern PLyExecutionContext > *PLy_switch_execution_context(PLyExecutionContext *new); > > Comment makes no sense to me. This seems something like memory context > switch, where you supply the new and return the old. But the comment > doesn't explain it at all. Comment changed to /* Switch execution context (similar to MemoryContextSwitchTo */ > +void PLy_CSStartup(DestReceiver *self, int operation, TupleDesc typeinfo); > +void PLy_CSDestroy(DestReceiver *self); > > These are declared in the plpy_spi.c. Why aren't these static? Derp. Fixed. > + volatile MemoryContext oldcontext; > + volatile ResourceOwner oldowner; > int rv; > - volatile MemoryContext oldcontext; > - volatile ResourceOwner oldowner; > > Unnecessary code movement. IMHO it reads better that way. I've left it for now so $COMMITTER can decide, but if it really bugs you let me know and I'll swap it. > In PLy_Callback_New, I think your use of a sub-context is sensible. Is > it necessary to palloc0 though? Hrm, maybe not... but it seems like cheap insurance considering the amount of other stuff involved in just starting a new SPI call. And honestly, I'd rather not mess with it at this point. :) I have added an XXX comment though. > +static CallbackState * > +PLy_Callback_Free(CallbackState *callback) > > The code here looks like it could be part of spi.c's callback support, > rather than plpy_spi specific, since you provide a destroy callback in > the SPI callback struct. I added this for use in PG_CATCH() blocks, because it's not clear to me that the portal gets cleaned up in those cases. It's certainly possible that it's pointless. FWIW, I'm pretty sure I copied that pattern from somewhere else... probably plpgsql or pltcl. > + /* We need to store this because the TupleDesc the receive > function gets has no names. */ > + myState->desc = typeinfo; > > Is it safe to just store the pointer to the TupleDesc here? What's the lifetime? Only Portal lifetime. > + * will clean it up when the time is right. XXX This might result in a leak > + * if an error happens and the result doesn't get dereferenced. > + */ > + MemoryContextSwitchTo(TopMemoryContext); > + result->tupdesc = CreateTupleDescCopy(typeinfo); > > especially given this XXX comment... I've changed the comment to the following. Hopefully this clarifies things: > /* > * Save tuple descriptor for later use by result set metadata > * functions. Save it in TopMemoryContext so that it survives outside of > * an SPI context. We trust that PLy_result_dealloc() will clean it up > * when the time is right. The difference between result and everything > * else is that result needs to survive after the portal is destroyed, > * because result is what's handed back to the plpython function. While > * it's tempting to use something other than TopMemoryContext, that won't > * work: the user could potentially put result into the global dictionary, > * which means it could survive as long as the session does. This might > * result in a leak if an error happens and the result doesn't get > * dereferenced, but if that happens it means the python GC has failed us, > * at which point we probably have bigger problems. > * > * This still isn't perfect though; if something the result tupledesc > * references has it's OID changed then the tupledesc will be invalid. I'm > * not sure it's worth worrying about that though. > */ Updated patches attached, but I still need to update the docs. -- Jim Nasby, Chief Data Architect, Austin TX OpenSCG http://OpenSCG.com
Attachment
pgsql-hackers by date: