Thread: Lookup penalty for VARIADIC patch
The proposed variadic-functions patch inserts some none-too-cheap code into FuncnameGetCandidates (it's deconstructing the proargmodes column to see if the function is variadic or not) which gets executed whether or not there are any variadic functions involved. I checked whether this would cause a noticeable slowdown in practice, and got a discouraging answer: $ cat functest.sql select sin(5), cos(45); $ pgbench -c 1 -t 10000 -n -f functest.sql regression transaction type: Custom query scaling factor: 1 query mode: simple number of clients: 1 number of transactions per client: 10000 number of transactions actually processed: 10000/10000 tps = 927.418555 (including connections establishing) tps = 928.953281 (excluding connections establishing) That's with the patch. CVS HEAD gets tps = 1017.901218 (including connections establishing) tps = 1019.724948 (excluding connections establishing) so that code is adding about 10% to the total round-trip execution time for the select --- considering all the other overhead involved there, that means the actual cost of FuncnameGetCandidates has gone up probably by an order of magnitude. And that's for the *best* case, where proargmodes is null so SysCacheGetAttr will fall out without returning an array to examine. This doesn't seem acceptable to me. What I'm thinking of doing is adding a column to pg_proc that provides the needed info in a trivial-to-get-at format. There are two ways we could do it: a bool column that is TRUE if the function is variadic, or an oid column that is the variadic array's element type, or zero if the function isn't variadic. The second would take more space but would avoid having to do a catalog lookup to get the element type in the case that the function is indeed variadic. I'm leaning to the second way but wanted to know if anyone objected? Also, it occurs to me that we could buy back a good part of the extra space if we allowed pg_proc.probin to be NULL for internal functions. Right now it's always "-" in that case, which is useless ... regards, tom lane
2008/7/15 Tom Lane <tgl@sss.pgh.pa.us>: > The proposed variadic-functions patch inserts some none-too-cheap code > into FuncnameGetCandidates (it's deconstructing the proargmodes column > to see if the function is variadic or not) which gets executed whether > or not there are any variadic functions involved. I checked whether > this would cause a noticeable slowdown in practice, and got a > discouraging answer: > > $ cat functest.sql > select sin(5), cos(45); > $ pgbench -c 1 -t 10000 -n -f functest.sql regression > transaction type: Custom query > scaling factor: 1 > query mode: simple > number of clients: 1 > number of transactions per client: 10000 > number of transactions actually processed: 10000/10000 > tps = 927.418555 (including connections establishing) > tps = 928.953281 (excluding connections establishing) > > That's with the patch. CVS HEAD gets > tps = 1017.901218 (including connections establishing) > tps = 1019.724948 (excluding connections establishing) > > so that code is adding about 10% to the total round-trip execution time > for the select --- considering all the other overhead involved there, > that means the actual cost of FuncnameGetCandidates has gone up probably > by an order of magnitude. And that's for the *best* case, where > proargmodes is null so SysCacheGetAttr will fall out without returning > an array to examine. This doesn't seem acceptable to me. > > What I'm thinking of doing is adding a column to pg_proc that provides > the needed info in a trivial-to-get-at format. There are two ways we > could do it: a bool column that is TRUE if the function is variadic, > or an oid column that is the variadic array's element type, or zero > if the function isn't variadic. The second would take more space but > would avoid having to do a catalog lookup to get the element type in > the case that the function is indeed variadic. I'm leaning to the > second way but wanted to know if anyone objected? > > Also, it occurs to me that we could buy back a good part of the extra > space if we allowed pg_proc.probin to be NULL for internal functions. > Right now it's always "-" in that case, which is useless ... probin is used in some unofficial pl hacks, so this space its some times used. I vote for special column that containst variadic element type Regards Pavel Stehule > > regards, tom lane > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
"Pavel Stehule" <pavel.stehule@gmail.com> writes: > 2008/7/15 Tom Lane <tgl@sss.pgh.pa.us>: >> Also, it occurs to me that we could buy back a good part of the extra >> space if we allowed pg_proc.probin to be NULL for internal functions. >> Right now it's always "-" in that case, which is useless ... > probin is used in some unofficial pl hacks, so this space its some > times used. Sure, if you want to use it you can. I'm just saying we should allow it to be really NULL, instead of a dummy value, when it isn't being used. regards, tom lane
Decibel! <decibel@decibel.org> writes: > On Jul 15, 2008, at 4:58 PM, Tom Lane wrote: >> There are two ways we >> could do it: a bool column that is TRUE if the function is variadic, >> or an oid column that is the variadic array's element type, or zero >> if the function isn't variadic. The second would take more space but >> would avoid having to do a catalog lookup to get the element type in >> the case that the function is indeed variadic. I'm leaning to the >> second way but wanted to know if anyone objected? > If you go the second route, I'd vote for it being NULL if the > function isn't variadic, unless that would play hell with the C side > of the catalog code... Getting rid of the check for null is *exactly* the point here --- AFAICT that's what's eating all the time in the existing code. regards, tom lane
On Jul 15, 2008, at 4:58 PM, Tom Lane wrote: > There are two ways we > could do it: a bool column that is TRUE if the function is variadic, > or an oid column that is the variadic array's element type, or zero > if the function isn't variadic. The second would take more space but > would avoid having to do a catalog lookup to get the element type in > the case that the function is indeed variadic. I'm leaning to the > second way but wanted to know if anyone objected? If you go the second route, I'd vote for it being NULL if the function isn't variadic, unless that would play hell with the C side of the catalog code... > Also, it occurs to me that we could buy back a good part of the extra > space if we allowed pg_proc.probin to be NULL for internal functions. > Right now it's always "-" in that case, which is useless ... I'd vote for that being NULL in any case... magic values should be avoided when possible. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828