Re: Manipulating complex types as non-contiguous structures in-memory - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Manipulating complex types as non-contiguous structures in-memory |
Date | |
Msg-id | 12295.1431274181@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Manipulating complex types as non-contiguous structures in-memory (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Manipulating complex types as non-contiguous
structures in-memory
Re: Manipulating complex types as non-contiguous structures in-memory |
List | pgsql-hackers |
Andres Freund <andres@anarazel.de> writes: > Looking at this. First reading the patch to understand the details. > * The VARTAG_IS_EXPANDED(tag) trick in VARTAG_SIZE is unlikely to > beneficial, before the compiler could implement the whole thing as a > computed goto or lookup table, afterwards not. Well, if you're worried about the speed of VARTAG_SIZE() then the right thing to do would be to revert your change that made enum vartag_external distinct from the size of the struct, so that we could go back to just using the second byte of a varattrib_1b_e datum as its size. As I said at the time, inserting pad bytes to force each different type of toast pointer to be a different size would probably be a better tradeoff than what commit 3682025015 did. > * It'd be nice if the get_flat_size comment in expandeddatm.h could > specify whether the header size is included. That varies enough around > toast that it seems worthwhile. OK. > * You were rather bothered by the potential of multiple evaluations for > the ilist stuff. And now the AARR macros are full of them... Yeah, there is doubtless some added cost there. But I think it's a better answer than duplicating each function in toto; the code space that that would take isn't free either. > * I find the ARRAY_ITER_VARS/ARRAY_ITER_NEXT macros rather ugly. I don't > buy the argument that turning them into functions will be slower. I'd > bet the contrary on common platforms. Perhaps; do you want to do some testing and see? > * Not a fan of the EH_ prefix in array_expanded.c and EOH_ > elsewhere. Just looks ugly to me. Whatever. I'm not wedded to that naming if you have a better suggestion. > * The list of hardwired safe ops in exec_check_rw_parameter is somewhat > sad. Don't have a better idea though. It's very sad, and it will be high on my list to improve that in 9.6. But I do not think it's a fatal problem to ship it that way in 9.5, because *as things stand today* those are the only two functions that could benefit anyway. It won't really matter until we have extensions that want to use this mechanism. > * "Also, a C function that is modifying a read-write expanded value > in-place should take care to leave the value in a sane state if it > fails partway through." - that's a pretty hefty requirement imo. It is, which is one reason that I'm nervous about relaxing the test in exec_check_rw_parameter. Still, it was possible to code array_set_element to meet that restriction without too much pain. I'm inclined to leave the stronger requirement in the docs for now, until we get more push-back. > * The forced RW->RO conversion in subquery scans is a bit sad, but I > seems like something left for later. Yes, there are definitely some things that look like future opportunities here. > Somewhere in the thread you comment on the fact that it's a bit sad that > plpgsql is the sole benefactor of this (unless some function forces > expansion internally). I'd be ok to leave it at that for now. It'd be > quite cool to get some feedback from postgis folks about the suitability > of this for their cases. Indeed, that's the main reason I'm eager to ship something in 9.5, even if it's not perfect. I hope those guys will look at it and use it, and maybe give us feedback on ways to improve it. > ISTM that the worst case for the new situation is large arrays that > exist as plpgsql variables but are only ever passed on. Well, more to the point, large arrays that are forced into expanded format (costing a conversion step) but then we never do anything with them that would benefit from that. Just saying they're "passed on" doesn't prove much since the called function might or might not get any benefit. array_length doesn't, but some other things would. Your example with array_agg() is interesting, since one of the things on my to-do list is to see whether we could change array_agg to return an expanded array. It would not be hard to make it build that representation directly, instead of its present ad-hoc internal state. The trick would be, when can you return the internal state without an additional copy step? But maybe it could return a R/O pointer ... > ... Expanding only in > cases where it'd be beneficial is going to be hard. Yeah, improving that heuristic looks like a research project. Still, even with all the limitations and to-do items in the patch now, I'm pretty sure this will be a net win for practically all applications. regards, tom lane
pgsql-hackers by date: