Re: exec_execute_message crash - Mailing list pgsql-hackers
From | Tatsuo Ishii |
---|---|
Subject | Re: exec_execute_message crash |
Date | |
Msg-id | 20091230.214635.09776596.t-ishii@sraoss.co.jp |
In response to | exec_execute_message crush (Tatsuo Ishii <ishii@postgresql.org>) |
Responses |
Re: exec_execute_message crash
Re: exec_execute_message crash |
List | pgsql-hackers |
> While inspecting a complaint from a pgpool user, I found that
> PostgreSQL crashes with the following stack trace:
>
> #0  0x0826436a in list_length (l=0xaabe4e28)
>     at ../../../src/include/nodes/pg_list.h:94
> #1  0x08262168 in IsTransactionStmtList (parseTrees=0xaabe4e28)
>     at postgres.c:2429
> #2  0x0826132e in exec_execute_message (portal_name=0x857bab0 "", max_rows=0)
>     at postgres.c:1824
> #3  0x08263b2a in PostgresMain (argc=4, argv=0x84f6c28,
>     username=0x84f6b08 "t-ishii") at postgres.c:3671
> #4  0x0823299e in BackendRun (port=0x8511e68) at postmaster.c:3449
> #5  0x08231f78 in BackendStartup (port=0x8511e68) at postmaster.c:3063
> #6  0x0822f90a in ServerLoop () at postmaster.c:1387
> #7  0x0822f131 in PostmasterMain (argc=3, argv=0x84f4bf8) at postmaster.c:1040
> #8  0x081c6217 in main (argc=3, argv=0x84f4bf8) at main.c:188

Ok, I think I understand what's going on.

parse
bind
describe
execute

This sequence of commands creates a cached plan in the unnamed portal.

$5 = {name = 0x8574de4 "", prepStmtName = 0x0, heap = 0x8598400,
  resowner = 0x8598488, cleanup = 0x81632ca <PortalCleanup>,
  createSubid = 1, sourceText = 0x85ab818 " SELECT <omitted>"...,
  commandTag = 0x84682ca "SELECT", stmts = 0xaabf43b0, cplan = 0xaabf4950,
  portalParams = 0x0, strategy = PORTAL_ONE_SELECT, cursorOptions = 4,
  status = PORTAL_READY, queryDesc = 0x85abc20, tupDesc = 0x85ddcb0,
  formats = 0x85abc68, holdStore = 0x0, holdContext = 0x0,
  atStart = 1 '\001', atEnd = 1 '\001', posOverflow = 0 '\0',
  portalPos = 0, creation_time = 315487957498169, visible = 1 '\001'}

The cached plan (portal->cplan) and the statements (portal->stmts) are
created by exec_bind_message():

    /*
     * Revalidate the cached plan; this may result in replanning.  Any
     * cruft will be generated in MessageContext.  The plan refcount will
     * be assigned to the Portal, so it will be released at portal
     * destruction.
     */
    cplan = RevalidateCachedPlan(psrc, false);
    plan_list = cplan->stmt_list;

Please note that cplan and stmts belong to the same memory context.

Then the following commands arrive:

parse invalid SQL, thus aborting the transaction
bind (error)
describe (error)
execute (crash)

The parse causes the transaction to abort, which leads to the call chain
AbortCurrentTransaction->AbortTransaction->AtAbort_Portals->ReleaseCachedPlan.
That is, ReleaseCachedPlan(portal->cplan) gets called. ReleaseCachedPlan
calls MemoryContextDelete(plan->context), which destroys both portal->cplan
and portal->stmts. The later execute then reads the freed portal->stmts,
which is why I got the segfault.

To fix this, I think exec_execute_message should throw an error if
portal->cleanup is NULL, since portal->cleanup is set to NULL by
AtAbort_Portals at transaction abort (or when the portal is dropped).
Here is a suggested fix:

diff -c postgres.c~ postgres.c
*** postgres.c~ 2009-06-18 19:08:08.000000000 +0900
--- postgres.c  2009-12-30 21:34:49.000000000 +0900
***************
*** 1804,1810 ****
          dest = DestRemoteExecute;

      portal = GetPortalByName(portal_name);
!     if (!PortalIsValid(portal))
          ereport(ERROR,
                  (errcode(ERRCODE_UNDEFINED_CURSOR),
                   errmsg("portal \"%s\" does not exist", portal_name)));
--- 1804,1810 ----
          dest = DestRemoteExecute;

      portal = GetPortalByName(portal_name);
!     if (!PortalIsValid(portal) || (PortalIsValid(portal) && portal->cleanup == NULL))
          ereport(ERROR,
                  (errcode(ERRCODE_UNDEFINED_CURSOR),
                   errmsg("portal \"%s\" does not exist", portal_name)));

--
Tatsuo Ishii
SRA OSS, Inc. Japan
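
For illustration only, the failure mode and the proposed guard can be modelled in a small standalone C sketch. This is not PostgreSQL code: ToyPortal, toy_bind, toy_abort and toy_execute are hypothetical names, and a single malloc'd block stands in for the memory context shared by cplan and stmts. It only assumes what the message describes, namely that abort frees the plan memory and sets cleanup to NULL, so a NULL cleanup can be used as the "portal is dead" signal before stmts is touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a portal; NOT the real PostgreSQL Portal struct. */
typedef struct ToyPortal ToyPortal;
struct ToyPortal
{
    void (*cleanup)(ToyPortal *portal); /* set to NULL once the portal is aborted */
    int  *plan_context;                 /* one allocation backing both plan and stmts */
    int  *stmts;                        /* points into plan_context */
};

/* Placeholder cleanup hook; only its non-NULL address matters here. */
void toy_cleanup(ToyPortal *portal)
{
    (void) portal;
}

/* "bind": allocate the plan and point stmts into the same allocation. */
void toy_bind(ToyPortal *portal, int nstmts)
{
    portal->plan_context = malloc(sizeof(int));
    assert(portal->plan_context != NULL);
    *portal->plan_context = nstmts;
    portal->stmts = portal->plan_context;
    portal->cleanup = toy_cleanup;
}

/* "abort": free the plan memory and mark the portal dead. */
void toy_abort(ToyPortal *portal)
{
    free(portal->plan_context);
    portal->plan_context = NULL;
    /* stmts is deliberately left dangling, mirroring the reported bug */
    portal->cleanup = NULL;
}

/*
 * "execute": return the statement count, or -1 if the portal is unusable.
 * The cleanup check comes first, so the dangling stmts pointer is never
 * dereferenced after an abort.
 */
int toy_execute(const ToyPortal *portal)
{
    if (portal == NULL || portal->cleanup == NULL)
        return -1;
    return *portal->stmts;
}
```

Calling toy_bind, then toy_abort, then toy_execute on the same ToyPortal returns -1 instead of reading freed memory. Without the cleanup check, the last call would dereference the stale stmts pointer, which is the analogue of the list_length crash in the stack trace.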