BUG #15677: Crash while deleting from partitioned table - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #15677: Crash while deleting from partitioned table |
Date | |
Msg-id | 15677-0bf089579b4cd02d@postgresql.org Whole thread Raw |
Responses |
Re: BUG #15677: Crash while deleting from partitioned table
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 15677 Logged by: Norbert Benkocs Email address: infernorb@gmail.com PostgreSQL version: 11.2 Operating system: CentOS Linux release 7.4.1708 (Core) Description: Version: PostgreSQL 11.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36), 64-bit OS: CentOS Linux release 7.4.1708 (Core) Hello, We have an insert/update/delete query on a partitioned table (multiple CTE-s) that causes our PostgreSQL server to crash once every few days. We haven't been able to reproduce this crash so far, and re-running the same query with the same parameters didn't result in a crash either. The table in question is updated thousands of times each hour, and most of these work fine. Previously this table was not partitioned, we started seeing the crash after partitioning the table. DDL: create table rates.room_daily_rates ( "ratePlanId" integer not null, "roomTypeId" integer not null, date date not null, duration smallint not null, guests smallint not null, rate bigint not null, "fixedRate" bigint, constraint room_daily_rates_pkey primary key ("roomTypeId", "ratePlanId", date, guests, duration), constraint rate_sanity check (rate > 0 and "fixedRate" >= 0), constraint guests_sanity check (guests > 0), constraint duration_sanity check (duration >= 0), constraint room_daily_rates_rate_fkey foreign key ("roomTypeId", "ratePlanId") references rates.room_rate_plans on update restrict on delete cascade ) partition by range (date); create table rates.room_daily_rates_2014 partition of rates.room_daily_rates for values from ('2014-01-01') to ('2015-01-01'); create table rates.room_daily_rates_2015 partition of rates.room_daily_rates for values from ('2015-01-01') to ('2016-01-01'); [ ... repeated for each year up to 2023 ... ] The query that caused the crash: with in_rows (requested, "roomTypeId","ratePlanId","duration","guests","date","rate","fixedRate") as ( values (true,$1::integer,$2::integer,$3::integer,$4::integer,$5::"date",$6::"int8",$7::"int8") ), candidates as ( select t.*, case when found and not requested then 'Delete' when not found and requested then 'Add' when "rateNew" is distinct from "rateOriginal" or "fixedRateNew" is distinct from "fixedRateOriginal" then 'Update' else null end as kind from ( select coalesce(in_rows."roomTypeId", db."roomTypeId") as "roomTypeId",coalesce(in_rows."ratePlanId", db."ratePlanId") as "ratePlanId",coalesce(in_rows."date", db."date") as "date",coalesce(in_rows."duration", db."duration") as "duration",coalesce(in_rows."guests", db."guests") as "guests",in_rows.requested is not null as requested,db."roomTypeId" is not null as found,db."rate" as "rateOriginal",in_rows."rate" as "rateNew",db."fixedRate" as "fixedRateOriginal",in_rows."fixedRate" as "fixedRateNew" from in_rows full outer join (select * from rates.room_daily_rates db where (db."roomTypeId", db."ratePlanId", db."date", db."duration", db."guests") in (select * from (values ($1::integer, $2::integer, $5::"date", $3::integer, $4::integer)) t)) db on (in_rows."roomTypeId" = db."roomTypeId" and in_rows."ratePlanId" = db."ratePlanId" and in_rows."date" = db."date" and in_rows."duration" = db."duration" and in_rows."guests" = db."guests") ) t ), inserts as ( insert into rates.room_daily_rates ("roomTypeId","ratePlanId","date","duration","guests","rate","fixedRate") select in_rows."roomTypeId",in_rows."ratePlanId",in_rows."date",in_rows."duration",in_rows."guests",in_rows."rateNew",in_rows."fixedRateNew" from candidates in_rows where in_rows.kind = 'Add' returning rates.room_daily_rates.* ), updates as ( update rates.room_daily_rates db set "rate" = in_rows."rateNew","fixedRate" = in_rows."fixedRateNew" from candidates in_rows where in_rows.kind = 'Update' and in_rows."roomTypeId" = db."roomTypeId" and in_rows."ratePlanId" = db."ratePlanId" and in_rows."date" = db."date" and in_rows."duration" = db."duration" and in_rows."guests" = db."guests" and (db."roomTypeId", db."ratePlanId", db."date", db."duration", db."guests") in (select * from (values ($1::integer, $2::integer, $5::"date", $3::integer, $4::integer)) t) returning db.* ), deletes as ( delete from rates.room_daily_rates db using candidates in_rows where in_rows.kind = 'Delete' and in_rows."roomTypeId" = db."roomTypeId" and in_rows."ratePlanId" = db."ratePlanId" and in_rows."date" = db."date" and in_rows."duration" = db."duration" and in_rows."guests" = db."guests" and (db."roomTypeId", db."ratePlanId", db."date", db."duration", db."guests") in (select * from (values ($1::integer, $2::integer, $5::"date", $3::integer, $4::integer)) t) returning db.* ), deletion_logger as ( insert into rates.room_daily_rates_clog ("changeId", action, "roomTypeId", "ratePlanId", "duration", "guests", "date", "rate", "fixedRate") select $8, 'Delete', changes."roomTypeId", changes."ratePlanId", changes."duration", changes."guests", changes."date", changes."rate", changes."fixedRate" from deletes as changes ), insert_logger as ( insert into rates.room_daily_rates_clog ("changeId", action, "roomTypeId", "ratePlanId", "date", "duration", "guests" ) select $8, 'Insert', changes."roomTypeId", changes."ratePlanId", changes."date", changes."duration", changes."guests" from inserts as changes ), update_logger as ( insert into rates.room_daily_rates_clog ("changeId", action, "roomTypeId", "ratePlanId", "date", "duration", "guests", "rate", "fixedRate") select $8, 'Update', changes."roomTypeId", changes."ratePlanId", changes."date", changes."duration", changes."guests", case when changes."rateOriginal" is distinct from changes."rateNew" then changes."rateOriginal" else null end, case when changes."fixedRateOriginal" is distinct from changes."fixedRateNew" then changes."fixedRateOriginal" else null end from candidates as changes where changes.kind = 'Update' ) select (select count(*) from inserts) + (select count(*) from deletes) + (select count(*) from updates) Server log: 2019-03-07 16:16:08.831 CET [22673] LOG: server process (PID 49730) was terminated by signal 11: Segmentation fault 2019-03-07 16:16:08.831 CET [22673] DETAIL: Failed process was running: with in_rows (requested, "roomTypeId","ratePlanId","duration","guests","date","rate","fixedRate") as ( values (true,$1::integer,$2::integer,$3::integer,$4::integer,$5::"date",$6::"int8",$7::"int8") ), candidates as ( select t.*, case when found and not requested then 'Delete' when not found and requested then 'Add' when "rateNew" is distinct from "rateOriginal" or "fixedRateNew" is distinct from "fixedRateOriginal" then 'Update' else null end as kind from ( select coalesce(in_rows."roomTypeId", db."roomTypeId") as "roomTypeId",coalesce(in_rows."ratePlanId", db."ratePlanId") as "ratePlanId",coalesce(in_rows."date", db."date") as "date",coalesce(in_rows."duration", db."duration") as "duration",coalesce(in_rows."guests", db."guests") as "guests 2019-03-07 16:16:08.831 CET [22673] LOG: terminating any other active server processes 2019-03-07 16:16:08.831 CET [56666] WARNING: terminating connection because of crash of another server process 2019-03-07 16:16:08.831 CET [56666] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2019-03-07 16:16:08.831 CET [56666] HINT: In a moment you should be able to reconnect to the database and repeat your command. [ ... The above 3 lines repeated many times ... ] 2019-03-07 16:16:09.044 CET [56807] FATAL: the database system is in recovery mode [ ... repeated many times ... ] 2019-03-07 16:16:10.817 CET [22673] LOG: all server processes terminated; reinitializing 2019-03-07 16:16:12.504 CET [57223] LOG: database system was interrupted; last known up at 2019-03-07 16:15:27 CET 2019-03-07 16:16:13.043 CET [57223] LOG: database system was not properly shut down; automatic recovery in progress 2019-03-07 16:16:13.167 CET [57223] LOG: redo starts at 399B/ECCF9450 2019-03-07 16:16:16.680 CET [57223] LOG: invalid record length at 399B/FCF3F2B8: wanted 24, got 0 2019-03-07 16:16:16.680 CET [57223] LOG: redo done at 399B/FCF3F290 2019-03-07 16:16:16.680 CET [57223] LOG: last completed transaction was at log time 2019-03-07 16:16:08.826429+01 2019-03-07 16:16:16.685 CET [57223] LOG: checkpoint starting: end-of-recovery immediate 2019-03-07 16:16:18.437 CET [57223] LOG: checkpoint complete: wrote 39408 buffers (5.0%); 0 WAL file(s) added, 2 removed, 0 recycled; write=1.728 s, sync=0.014 s, total=1.754 s; sync files=836, longest=0.010 s, average=0.000 s; distance=264471 kB, estimate=264471 kB 2019-03-07 16:16:18.491 CET [22673] LOG: database system is ready to accept connections Backtrace from the latest crash: (gdb) bt #0 ExecInitModifyTable (node=node@entry=0x2568180, estate=estate@entry=0x35f1440, eflags=eflags@entry=0) at nodeModifyTable.c:2327 #1 0x000000000060af88 in ExecInitNode (node=0x2568180, estate=estate@entry=0x35f1440, eflags=eflags@entry=0) at execProcnode.c:174 #2 0x0000000000606fdd in EvalPlanQualStart (epqstate=0x3773848, epqstate=0x3773848, planTree=0x36c3f08, parentestate=0xa6) at execMain.c:3257 #3 EvalPlanQualBegin (epqstate=epqstate@entry=0x3773848, parentestate=parentestate@entry=0x3698450) at execMain.c:3066 #4 0x00000000006070c1 in EvalPlanQual (estate=estate@entry=0x3698450, epqstate=epqstate@entry=0x3773848, relation=relation@entry=0x7efa6640b040, rti=38, lockmode=<optimized out>, tid=tid@entry=0x7ffcede66e80, priorXmax=4181341875) at execMain.c:2536 #5 0x0000000000626b7e in ExecUpdate (mtstate=mtstate@entry=0x3773788, tupleid=0x7ffcede66f60, oldtuple=0x0, slot=0x230b768, planSlot=planSlot@entry=0x289ffa8, epqstate=epqstate@entry=0x3773848, estate=estate@entry=0x3698450, canSetTag=false) at nodeModifyTable.c:1267 #6 0x000000000062715e in ExecModifyTable (pstate=0x3773788) at nodeModifyTable.c:2166 #7 0x000000000062b21d in ExecProcNode (node=0x3773788) at ../../../src/include/executor/executor.h:247 #8 CteScanNext (node=node@entry=0x337aa78) at nodeCtescan.c:103 #9 0x000000000060c2fa in ExecScanFetch (recheckMtd=0x62b120 <CteScanRecheck>, accessMtd=0x62b140 <CteScanNext>, node=0x337aa78) at execScan.c:95 #10 ExecScan (node=0x337aa78, accessMtd=0x62b140 <CteScanNext>, recheckMtd=0x62b120 <CteScanRecheck>) at execScan.c:145 #11 0x00000000006127b9 in ExecProcNode (node=0x337aa78) at ../../../src/include/executor/executor.h:247 #12 fetch_input_tuple (aggstate=aggstate@entry=0x337a640) at nodeAgg.c:406 #13 0x0000000000614230 in agg_retrieve_direct (aggstate=0x337a640) at nodeAgg.c:1740 #14 ExecAgg (pstate=0x337a640) at nodeAgg.c:1555 #15 0x000000000062d1d1 in ExecProcNode (node=0x337a640) at ../../../src/include/executor/executor.h:247 #16 ExecSetParamPlan (node=<optimized out>, econtext=econtext@entry=0x337d528) at nodeSubplan.c:1095 #17 0x00000000005fe246 in ExecEvalParamExec (state=state@entry=0x337d7a0, op=op@entry=0x337e1a8, econtext=econtext@entry=0x337d528) at execExprInterp.c:2247 #18 0x0000000000600f4e in ExecInterpExpr (state=0x337d7a0, econtext=0x337d528, isnull=<optimized out>) at execExprInterp.c:991 #19 0x0000000000628fbb in ExecEvalExprSwitchContext (isNull=0x7ffcede67267, econtext=0x337d528, state=0x337d7a0) at ../../../src/include/executor/executor.h:313 #20 ExecProject (projInfo=0x337d798) at ../../../src/include/executor/executor.h:347 #21 ExecResult (pstate=<optimized out>) at nodeResult.c:136 #22 0x000000000060401a in ExecProcNode (node=0x337d418) at ../../../src/include/executor/executor.h:247 #23 ExecutePlan (execute_once=<optimized out>, dest=0x2708fb8, direction=<optimized out>, numberTuples=0, sendTuples=true, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x337d418, estate=0x3698450) at execMain.c:1723 #24 standard_ExecutorRun (queryDesc=0x20ccfb8, direction=<optimized out>, count=0, execute_once=<optimized out>) at execMain.c:364 #25 0x00007efbf28ea08d in pgss_ExecutorRun (queryDesc=0x20ccfb8, direction=ForwardScanDirection, count=0, execute_once=<optimized out>) at pg_stat_statements.c:892 #26 0x000000000074669a in ProcessQuery (plan=<optimized out>, sourceText=0x2857d18 ' ' <repeats 12 times>, "with in_rows (requested, \"roomTypeId\",\"ratePlanId\",\"duration\",\"guests\",\"date\",\"rate\",\"f"..., params=0x2708d80, queryEnv=0x0, dest=0x2708fb8, completionTag=0x7ffcede674b0 "") at pquery.c:161 #27 0x00000000007468d2 in PortalRunMulti (portal=portal@entry=0x1aed290, isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=true, dest=dest@entry=0x2708fb8, altdest=0xcef3a0 <donothingDR>, completionTag=completionTag@entry=0x7ffcede674b0 "") at pquery.c:1286 #28 0x0000000000746b91 in FillPortalStore (portal=portal@entry=0x1aed290, isTopLevel=isTopLevel@entry=true) at pquery.c:1033 #29 0x000000000074759f in PortalRun (portal=<optimized out>, count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x1a4eaf0, altdest=0x1a4eaf0, completionTag=0x7ffcede678d0 "") at pquery.c:768 #30 0x0000000000744e6d in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:2036 #31 0x00000000004803e9 in BackendRun (port=0x1a96eb0) at postmaster.c:4361 #32 BackendStartup (port=0x1a96eb0) at postmaster.c:4033 #33 ServerLoop () at postmaster.c:1706 #34 0x00000000006d7b99 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1a49280) at postmaster.c:1379 #35 0x000000000048124f in main (argc=3, argv=0x1a49280) at main.c:228 Dump of vars in the ExecInitModifyTable stack frame: (gdb) info local mtstate = 0x39c7e78 operation = CMD_DELETE nplans = 15 saved_resultRelInfo = 0x0 resultRelInfo = <optimized out> subplan = <optimized out> l = <optimized out> i = <optimized out> rel = <optimized out> update_tuple_routing_needed = false (gdb) p *mtstate $4 = {ps = {type = T_ModifyTableState, plan = 0x2568180, state = 0x35f1440, ExecProcNode = 0x626e30 <ExecModifyTable>, ExecProcNodeReal = 0x0, instrument = 0x0, worker_instrument = 0x0, worker_jit_instrument = 0x0, qual = 0x0, lefttree = 0x0, righttree = 0x0, initPlan = 0x0, subPlan = 0x0, chgParam = 0x0, ps_ResultTupleSlot = 0x0, ps_ExprContext = 0x0, ps_ProjInfo = 0x0, scandesc = 0x0}, operation = CMD_DELETE, canSetTag = false, mt_done = false, mt_plans = 0x39c8088, mt_nplans = 15, mt_whichplan = 0, resultRelInfo = 0x35f3f78, rootResultRelInfo = 0xc0, mt_arowmarks = 0x39c8118, mt_epqstate = {estate = 0x0, planstate = 0x0, origslot = 0x0, plan = 0x0, arowMarks = 0x0, epqParam = 86}, fireBSTriggers = true, mt_existing = 0x0, mt_excludedtlist = 0x0, mt_conflproj = 0x0, mt_partition_tuple_routing = 0x0, mt_transition_capture = 0x0, mt_oc_transition_capture = 0x0, mt_per_subplan_tupconv_maps = 0x0} (gdb) p *mtstate->resultRelInfo $2 = {type = T_ResultRelInfo, ri_RangeTableIndex = 96, ri_RelationDesc = 0x7efa664049c0, ri_NumIndices = 0, ri_IndexRelationDescs = 0x0, ri_IndexRelationInfo = 0x0, ri_TrigDesc = 0x27a4328, ri_TrigFunctions = 0x27a4528, ri_TrigWhenExprs = 0x27a45b8, ri_TrigInstrument = 0x0, ri_FdwRoutine = 0x0, ri_FdwState = 0x0, ri_usesFdwDirectModify = false, ri_WithCheckOptions = 0x0, ri_WithCheckOptionExprs = 0x0, ri_ConstraintExprs = 0x0, ri_junkFilter = 0x336fae0, ri_returningList = 0x25af130, ri_projectReturning = 0x336aa38, ri_onConflictArbiterIndexes = 0x0, ri_onConflict = 0x0, ri_PartitionCheck = 0x27a45d8, ri_PartitionCheckExpr = 0x0, ri_PartitionRoot = 0x0, ri_PartitionReadyForRouting = false} (gdb) p *mtstate->resultRelInfo->ri_RelationDesc $3 = {rd_node = {spcNode = 1663, dbNode = 20254, relNode = 458573263}, rd_smgr = 0x2098ee8, rd_refcnt = 8, rd_backend = -1, rd_islocaltemp = false, rd_isnailed = false, rd_isvalid = true, rd_indexvalid = 1 '\001', rd_statvalid = true, rd_createSubid = 0, rd_newRelfilenodeSubid = 0, rd_rel = 0x7efa66404bd0, rd_att = 0x7efa66404ce0, rd_id = 458573263, rd_lockInfo = {lockRelId = {relId = 458573263, dbId = 20254}}, rd_rules = 0x0, rd_rulescxt = 0x0, trigdesc = 0x7efa66405c40, rd_rsdesc = 0x0, rd_fkeylist = 0x7efa66422940, rd_fkeyvalid = true, rd_partkeycxt = 0x0, rd_partkey = 0x0, rd_pdcxt = 0x0, rd_partdesc = 0x0, rd_partcheck = 0x7efa6646f590, rd_indexlist = 0x7efa66417d60, rd_oidindex = 0, rd_pkindex = 458573269, rd_replidindex = 458573269, rd_statlist = 0x0, rd_indexattr = 0x0, rd_projindexattr = 0x0, rd_keyattr = 0x0, rd_pkattr = 0x0, rd_idattr = 0x0, rd_projidx = 0x0, rd_pubactions = 0x0, rd_options = 0x0, rd_index = 0x0, rd_indextuple = 0x0, rd_amhandler = 0, rd_indexcxt = 0x0, rd_amroutine = 0x0, rd_opfamily = 0x0, rd_opcintype = 0x0, rd_support = 0x0, rd_supportinfo = 0x0, rd_indoption = 0x0, rd_indexprs = 0x0, rd_indpred = 0x0, rd_exclops = 0x0, rd_exclprocs = 0x0, rd_exclstrats = 0x0, rd_amcache = 0x0, rd_indcollation = 0x0, rd_fdwroutine = 0x0, rd_toastoid = 0, pgstat_info = 0x1b37f18} (gdb) p *mtstate->resultRelInfo->ri_RelationDesc->rd_rel $1 = {relname = {data = "room_daily_rates_2014", '\000' <repeats 42 times>}, relnamespace = 20273, reltype = 458573265, reloftype = 0, relowner = 17696, relam = 0, relfilenode = 458573263, reltablespace = 0, relpages = 29797, reltuples = 4678117, relallvisible = 0, reltoastrelid = 0, relhasindex = true, relisshared = false, relpersistence = 112 'p', relkind = 114 'r', relnatts = 7, relchecks = 3, relhasoids = false, relhasrules = false, relhastriggers = true, relhassubclass = false, relrowsecurity = false, relforcerowsecurity = false, relispopulated = true, relreplident = 100 'd', relispartition = true, relrewrite = 0, relfrozenxid = 4167686063, relminmxid = 9744227} (gdb) p *node $6 = {plan = {type = T_ModifyTable, startup_cost = 0.52249999999999996, total_cost = 6788616.1286984282, plan_rows = 15, plan_width = 94, parallel_aware = false, parallel_safe = false, plan_node_id = 149, targetlist = 0x25af130, qual = 0x0, lefttree = 0x0, righttree = 0x0, initPlan = 0x0, extParam = 0x0, allParam = 0x0}, operation = CMD_DELETE, canSetTag = false, nominalRelation = 92, partitioned_rels = 0x25688e8, partColsUpdated = false, resultRelations = 0x2568938, resultRelIndex = 16, rootResultRelIndex = 1, plans = 0x2568b28, withCheckOptionLists = 0x0, returningLists = 0x25af0e0, fdwPrivLists = 0x25b5038, fdwDirectModifyPlans = 0x0, rowMarks = 0x25b5248, epqParam = 86, onConflictAction = ONCONFLICT_NONE, arbiterIndexes = 0x0, onConflictSet = 0x0, onConflictWhere = 0x0, exclRelRTI = 91, exclRelTlist = 0x0} (gdb) p *estate $7 = {type = T_EState, es_direction = ForwardScanDirection, es_snapshot = 0x208ba70, es_crosscheck_snapshot = 0x0, es_range_table = 0x282af48, es_plannedstmt = 0x2829e98, es_sourceText = 0x0, es_junkFilter = 0x0, es_output_cid = 0, es_result_relations = 0x35f3378, es_num_result_relations = 34, es_result_relation_info = 0x0, es_root_result_relations = 0x0, es_num_root_result_relations = 0, es_tuple_routing_result_relations = 0x0, es_trig_target_relations = 0x0, es_trig_tuple_slot = 0x3814810, es_trig_oldtup_slot = 0x0, es_trig_newtup_slot = 0x0, es_param_list_info = 0x2708d80, es_param_exec_vals = 0x35f7388, es_queryEnv = 0x0, es_query_cxt = 0x35f1330, es_tupleTable = 0x35f96e8, es_rowMarks = 0x27f81b8, es_processed = 0, es_lastoid = 0, es_top_eflags = 0, es_instrument = 0, es_finished = false, es_exprcontexts = 0x35f3310, es_subplanstates = 0x35f97c8, es_auxmodifytables = 0x38148e0, es_per_tuple_exprcontext = 0x0, es_epqTuple = 0x35f6398, es_epqTupleSet = 0x35f9398, es_epqScanDone = 0x35f71c8, es_use_parallel_mode = false, es_query_dsa = 0x0, es_jit_flags = 0, es_jit = 0x0, es_jit_worker_instr = 0x0}
pgsql-bugs by date: