Thread: Double linking MemoryContext children
The attached script demonstrates an O(N^2) problem we recently became aware of. The script simply executes a large anonymous code block that doesn't do anything useful. Usage is ./t1.py [NUM_STMTS] | psql [DBNAME] NUM_STMTS defaults to 20,000. The code block executes rather fast, but the entire script runs for over a minute (at 20,000 statements) on a 2.66 GHz Xeon. The time is spent when all the prepared SPI statements are freed at the end of execution. All prepared SPI statements are children of the Cache context. MemoryContext children are a single linked list where new members are inserted at the head. This works best when children are created and destroyed in a last-in-first-out pattern. SPI however destroys the SPI statements in the order they were created, forcing MemoryContextSetParent() to traverse the entire linked list for each child. The attached patch makes MemoryContext children a double linked list that no longer needs any list traversal no find the position of the child within the list. Regards, Jan -- Jan Wieck Senior Software Engineer http://slony.info
Attachment
Jan Wieck <jan@wi3ck.info> writes: > The attached script demonstrates an O(N^2) problem we recently became > aware of. The script simply executes a large anonymous code block that > doesn't do anything useful. Usage is > ./t1.py [NUM_STMTS] | psql [DBNAME] > NUM_STMTS defaults to 20,000. The code block executes rather fast, but > the entire script runs for over a minute (at 20,000 statements) on a > 2.66 GHz Xeon. > The time is spent when all the prepared SPI statements are freed at the > end of execution. All prepared SPI statements are children of the Cache > context. MemoryContext children are a single linked list where new > members are inserted at the head. This works best when children are > created and destroyed in a last-in-first-out pattern. SPI however > destroys the SPI statements in the order they were created, forcing > MemoryContextSetParent() to traverse the entire linked list for each child. > The attached patch makes MemoryContext children a double linked list > that no longer needs any list traversal no find the position of the > child within the list. Seems less invasive to fix SPI to delete in the opposite order? regards, tom lane
On 09/11/2015 09:38 AM, Tom Lane wrote: > Jan Wieck <jan@wi3ck.info> writes: >> The attached script demonstrates an O(N^2) problem we recently became >> aware of. The script simply executes a large anonymous code block that >> doesn't do anything useful. Usage is > >> ./t1.py [NUM_STMTS] | psql [DBNAME] > >> NUM_STMTS defaults to 20,000. The code block executes rather fast, but >> the entire script runs for over a minute (at 20,000 statements) on a >> 2.66 GHz Xeon. > >> The time is spent when all the prepared SPI statements are freed at the >> end of execution. All prepared SPI statements are children of the Cache >> context. MemoryContext children are a single linked list where new >> members are inserted at the head. This works best when children are >> created and destroyed in a last-in-first-out pattern. SPI however >> destroys the SPI statements in the order they were created, forcing >> MemoryContextSetParent() to traverse the entire linked list for each child. > >> The attached patch makes MemoryContext children a double linked list >> that no longer needs any list traversal no find the position of the >> child within the list. > > Seems less invasive to fix SPI to delete in the opposite order? That would assume that there are no other code paths that destroy memory contexts out of LIFO order. Regards, Jan -- Jan Wieck Senior Software Engineer http://slony.info
Jan Wieck <jan@wi3ck.info> writes: > On 09/11/2015 09:38 AM, Tom Lane wrote: >> Seems less invasive to fix SPI to delete in the opposite order? > That would assume that there are no other code paths that destroy memory > contexts out of LIFO order. Given that we've not seen this sort of problem reported in the dozen or more years the code has been like that, I'm a bit dubious that we need to worry about that. It's possible you're right that we need to expend more space in the MemoryContext lists --- but I'd like to see more than one example. regards, tom lane
On 09/11/2015 09:58 AM, Tom Lane wrote: > Jan Wieck <jan@wi3ck.info> writes: >> On 09/11/2015 09:38 AM, Tom Lane wrote: >>> Seems less invasive to fix SPI to delete in the opposite order? > >> That would assume that there are no other code paths that destroy memory >> contexts out of LIFO order. > > Given that we've not seen this sort of problem reported in the dozen or > more years the code has been like that, I'm a bit dubious that we need > to worry about that. It's possible you're right that we need to expend > more space in the MemoryContext lists --- but I'd like to see more than > one example. I added a bit of debug code to MemoryContextSetParent(), tracking loop iterations per context name. This is what happens during "make check" of current master: name | count | not_first | iterations --------------------------------------+--------+-----------+------------ CacheMemoryContext | 6271 | 2634 | 104846 ExecutorState | 222290 | 2951 | 7176 TopMemoryContext | 28464 | 620 | 2716 SPI Proc | 8424 | 225 | 381 PortalMemory | 24616 | 31 | 291 TopTransactionContext | 19312 | 114 | 141 ExprContext | 3317 | 1 | 1 There was a total of 6,271 calls to MemoryContextSetParent() where the old parent is the CacheMemoryContext. For 2,634 of those the context is not parent->firstchild and this lead to 104,846 loop iterations total. The remaining numbers indicate that other contexts are mostly used in the intended fashion, but not strictly. This means there is definitely potential for more edge cases, similar to the SPI example above. Regards, Jan -- Jan Wieck Senior Software Engineer http://slony.info
Jan Wieck <jan@wi3ck.info> wrote: >>> On 09/11/2015 09:38 AM, Tom Lane wrote: >>>> Seems less invasive to fix SPI to delete in the opposite order? > The remaining numbers indicate that other contexts are mostly used in > the intended fashion, but not strictly. This means there is definitely > potential for more edge cases, similar to the SPI example above. I guess the question is whether to add a pointer for each memory context to give the MemoryContextSetParent() function O(1) performance characteristics or add comments in front of this function to document how callers should organize their code to avoid O(N^2) performance. I generally prefer that callers of such a function need not be that aware of implementation details, so I would prefer the former. On the other hand, a grep indicates that there are two places that MemoryContextData.nextchild is set (and we therefore probably need to also set the new field), and Jan's proposed patch only changes one of them. If we do this, I think we need to change both places that are affected, so ResourceOwnerCreate() in resowner.c would need a line or two added. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@ymail.com> writes: > I guess the question is whether to add a pointer for each memory > context to give the MemoryContextSetParent() function O(1) > performance characteristics or add comments in front of this > function to document how callers should organize their code to > avoid O(N^2) performance. I generally prefer that callers of such > a function need not be that aware of implementation details, so I > would prefer the former. On reflection, it's a bit silly to complain about one extra pointer per MemoryContext, when the memory represented by the context is going to be at least one kilobyte and usually a lot more. So I withdraw my objection to the concept. I concur that we'd just as soon not worry about what order things are done in. > On the other hand, a grep indicates that there are two places that > MemoryContextData.nextchild is set (and we therefore probably need > to also set the new field), and Jan's proposed patch only changes > one of them. If we do this, I think we need to change both places > that are affected, so ResourceOwnerCreate() in resowner.c would > need a line or two added. Um. Sounds like it needs some actual code review then ... regards, tom lane
On 09/12/2015 11:35 AM, Kevin Grittner wrote: > On the other hand, a grep indicates that there are two places that > MemoryContextData.nextchild is set (and we therefore probably need > to also set the new field), and Jan's proposed patch only changes > one of them. If we do this, I think we need to change both places > that are affected, so ResourceOwnerCreate() in resowner.c would > need a line or two added. ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not MemoryContextData.nextchild. Regards, Jan -- Jan Wieck Senior Software Engineer http://slony.info
On 9/14/15 7:24 AM, Jan Wieck wrote: > On 09/12/2015 11:35 AM, Kevin Grittner wrote: > >> On the other hand, a grep indicates that there are two places that >> MemoryContextData.nextchild is set (and we therefore probably need >> to also set the new field), and Jan's proposed patch only changes >> one of them. If we do this, I think we need to change both places >> that are affected, so ResourceOwnerCreate() in resowner.c would >> need a line or two added. > > ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not > MemoryContextData.nextchild. Anything ever happen with this? </Momjian-Mode> -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
On 7 December 2015 at 01:30, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > On 9/14/15 7:24 AM, Jan Wieck wrote: >> >> On 09/12/2015 11:35 AM, Kevin Grittner wrote: >> >>> On the other hand, a grep indicates that there are two places that >>> MemoryContextData.nextchild is set (and we therefore probably need >>> to also set the new field), and Jan's proposed patch only changes >>> one of them. If we do this, I think we need to change both places >>> that are affected, so ResourceOwnerCreate() in resowner.c would >>> need a line or two added. >> >> >> ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not >> MemoryContextData.nextchild. > > > Anything ever happen with this? </Momjian-Mode> Yeah, it was committed... a few mins ago. Thom
On Sun, Dec 6, 2015 at 7:30 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote: > On 9/14/15 7:24 AM, Jan Wieck wrote: >> On 09/12/2015 11:35 AM, Kevin Grittner wrote: >> >>> On the other hand, a grep indicates that there are two places that >>> MemoryContextData.nextchild is set (and we therefore probably need >>> to also set the new field), and Jan's proposed patch only changes >>> one of them. If we do this, I think we need to change both places >>> that are affected, so ResourceOwnerCreate() in resowner.c would >>> need a line or two added. >> >> ResourceOwnerCreate() sets ResourceOwnerData.nextchild, not >> MemoryContextData.nextchild. > > Anything ever happen with this? </Momjian-Mode> Jan was right; the code for operating on resource owners was similar enough that I mistook it for memory context code in a quick review of grep results looking for any places that Jan might have missed. I went over it all again and couldn't resist adding an Assert() at one point, but otherwise it looks good. An optimized build without assertions runs his 20000 statement DO test in 25646.811 ms without the patch and 2933.754 ms with the patch. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company