SerializeParamList vs machines with strict alignment - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | SerializeParamList vs machines with strict alignment |
Date | |
Msg-id | 11629.1536550032@sss.pgh.pa.us |
Responses | Re: SerializeParamList vs machines with strict alignment; Re: SerializeParamList vs machines with strict alignment |
List | pgsql-hackers |

I wondered why buildfarm member chipmunk has been failing hard for the last little while. Fortunately, it's supplying us with a handy backtrace:

    Program terminated with signal 7, Bus error.
    #0  EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329
    329             aresult->dataoffset = dataoffset;
    #0  EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329
    #1  EA_flatten_into (eohptr=0x188f440, result=0xb55ff30e, allocated_size=<optimized out>) at array_expanded.c:293
    #2  0x003c3dfc in EOH_flatten_into (eohptr=<optimized out>, result=<optimized out>, allocated_size=<optimized out>) at expandeddatum.c:84
    #3  0x003c076c in datumSerialize (value=3934060, isnull=<optimized out>, typByVal=<optimized out>, typLen=<optimized out>, start_address=0xbea3bd54) at datum.c:341
    #4  0x002a8510 in SerializeParamList (paramLI=0x1889f18, start_address=0xbea3bd54) at params.c:195
    #5  0x002342cc in ExecInitParallelPlan (planstate=0xffffffff, estate=0x18863e0, sendParams=0x46e, nworkers=1, tuples_needed=-1) at execParallel.c:700
    #6  0x002461dc in ExecGather (pstate=0x18864f0) at nodeGather.c:151
    #7  0x00236b20 in ExecProcNodeFirst (node=0x18864f0) at execProcnode.c:445
    #8  0x0022fc2c in ExecProcNode (node=0x18864f0) at ../../../src/include/executor/executor.h:237
    #9  ExecutePlan (execute_once=<optimized out>, dest=0x188a108, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x18864f0, estate=0x18863e0) at execMain.c:1721
    #10 standard_ExecutorRun (queryDesc=0x188a138, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
    #11 0x0023d630 in postquel_getnext (fcache=0x1888408, es=0x1889d68) at functions.c:867
    #12 fmgr_sql (fcinfo=0x701c7c) at functions.c:1164

This is remarkably hard to replicate on other machines, but I eventually managed to duplicate it on gaur's host, after which it became really obvious that the parallel-query data transfer logic has never been stressed very hard on machines with strict data alignment rules.

In particular, SerializeParamList does this:

    /* Write flags. */
    memcpy(*start_address, &prm->pflags, sizeof(uint16));
    *start_address += sizeof(uint16);

immediately followed by this:

    datumSerialize(prm->value, prm->isnull, typByVal, typLen,
                   start_address);

and datumSerialize might do this:

    EOH_flatten_into(eoh, (void *) *start_address, header);

Now, I will plead mea culpa that the expanded-object API doesn't say in large red letters that the target address for EOH_flatten_into is supposed to be maxaligned. It only says

     * The flattened representation must be a valid in-line, non-compressed,
     * 4-byte-header varlena object.

Still, one might reasonably suspect from that that *at least* 4-byte alignment is expected. This code path isn't providing such alignment, and machines that require it will crash. The only reason we've not noticed, AFAICS, is that nobody has been running with force_parallel_mode = regress on alignment-picky hardware.

			regards, tom lane
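
To make the failure mode above concrete: the destination in the backtrace, result=0xb55ff30e, is divisible by 2 but not by 4, which is what one would expect right after a 2-byte flags field has been copied and the cursor advanced. The following is a minimal standalone sketch, not PostgreSQL code and not necessarily the fix that was applied, of one way a serializer could avoid handing an unaligned cursor to a flatten routine: flatten into a separately allocated (hence suitably aligned) scratch buffer, then memcpy the bytes to the cursor. The names FlatHeader, flatten_into and serialize_param are hypothetical stand-ins, and plain malloc is used here where backend code would presumably use palloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the header of a flattened object. */
typedef struct FlatHeader
{
    int32_t total_size;
    int32_t dataoffset;
} FlatHeader;

/*
 * Hypothetical stand-in for a flatten routine.  Like the code in the
 * backtrace, it stores through a typed pointer, so the destination must
 * be aligned for FlatHeader.  The assert spells out that contract
 * (_Alignof requires C11).
 */
static void
flatten_into(void *result, size_t allocated_size, int32_t dataoffset)
{
    FlatHeader *hdr = (FlatHeader *) result;

    assert(allocated_size >= sizeof(FlatHeader));
    assert(((uintptr_t) result % _Alignof(FlatHeader)) == 0);

    memset(result, 0, allocated_size);
    hdr->total_size = (int32_t) allocated_size;
    hdr->dataoffset = dataoffset;
}

/*
 * Write a 2-byte flags field followed by a flattened object, advancing
 * the cursor as we go.  After the flags are written the cursor may be
 * only 2-byte aligned, so the object is flattened into a scratch buffer
 * and then copied byte-wise to the cursor.
 */
static void
serialize_param(uint16_t flags, int32_t dataoffset, size_t obj_size,
                char **start_address)
{
    void *tmp;

    /* Write flags; memcpy is safe at any alignment. */
    memcpy(*start_address, &flags, sizeof(uint16_t));
    *start_address += sizeof(uint16_t);

    /* malloc returns memory suitably aligned for any object type. */
    tmp = malloc(obj_size);
    if (tmp == NULL)
        abort();
    flatten_into(tmp, obj_size, dataoffset);

    /* Copy the flattened bytes to the possibly unaligned cursor. */
    memcpy(*start_address, tmp, obj_size);
    *start_address += obj_size;
    free(tmp);
}
```

The cost of this approach is an extra allocation and copy per expanded object. The alternative would be to pad the serialized stream so the cursor is aligned before each datum, but that changes the wire format and the size estimate that has to match it. A reader of the serialized stream would likewise need to memcpy the object out before accessing its fields through a typed pointer.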