Re: PosgreSQL is crashing with a signal 11 - Bug? - Mailing list pgsql-bugs
| From | Kjetil Torgrim Homme |
|---|---|
| Subject | Re: PosgreSQL is crashing with a signal 11 - Bug? |
| Date | |
| Msg-id | 1rbrge70qq.fsf@rovereto.ifi.uio.no |
| In response to | Re: PosgreSQL is crashing with a signal 11 - Bug? (Kjetil Torgrim Homme <kjetilho@ifi.uio.no>) |
| Responses | Re: PosgreSQL is crashing with a signal 11 - Bug? |
| List | pgsql-bugs |
we got a new coredump of 7.3.7 today. this instance was running on a
freshly installed computer, to eliminate(?) all hardware issues. it's
still the same brand and model, though. the old system has been
running hard disk tests 30+ hours with no errors yet.
the core dump happens at the same place in the code, and this time we
got a complete backtrace:
(gdb) bt
#0 0xb734d07c in memcpy () from /lib/tls/libc.so.6
#1 0x0806bba8 in DataFill (data=0xb7488fff "", tupleDesc=0x82899a0,
    value=0x8289980, nulls=0xbfffd3c0 " n \"", infomask=0x8806b04c,
    bit=0x8806b04f "ï\001") at heaptuple.c:139
#2 0x0806c3ee in heap_formtuple (tupleDescriptor=0x8279ec0, value=0x8289980,
    nulls=0xbfffd3c0 " n \"") at heaptuple.c:623
#3 0x080d1af1 in ExecTargetList (targetlist=0x8278298, nodomains=9,
    targettype=0x8279ec0, values=0x8289980, econtext=0x8279a60,
    isDone=0xbfffd468) at execQual.c:2230
#4 0x080d1cdb in ExecScan (node=0x827a208, accessMtd=0xbfffd468)
    at execScan.c:49
#5 0x080d1d7d in ExecScan (node=0x8278c70, accessMtd=0x80d7c58 <SeqNext+24>)
    at execScan.c:146
#6 0x080d7cfb in InitScanRelation (node=0x82899a0, estate=0x8278c70,
    scanstate=0xbfffd4c8) at nodeSeqscan.c:162
#7 0x080cfd86 in ExecProcNode (node=0x8289bf8, parent=0x0)
    at execProcnode.c:315
#8 0x080cecf3 in ExecutePlan (estate=0x8279c90, plan=0x8278c70,
    operation=CMD_SELECT, numberTuples=0, direction=136878496,
    destfunc=0x82899c8) at execMain.c:964
#9 0x080ce392 in ExecutorEnd (queryDesc=0x82899a0, estate=0x0)
    at execMain.c:223
#10 0x0811d069 in ProcessQuery (parsetree=0x82899c8, plan=0x8278c70,
    dest=Remote, completionTag=0xbfffd610 "") at pquery.c:251
#11 0x0811b7ed in pg_exec_query_string (query_string=0xbfffd610, dest=Remote,
    parse_context=0x823d610) at postgres.c:844
#12 0x0811c64d in PostgresMain (argc=4, argv=0xbfffd850,
    username=0x8238c69 "cerebrum") at postgres.c:2018
#13 0x0810413d in DoBackend (port=0x8238b38) at postmaster.c:2304
#14 0x08103cb2 in BackendStartup (port=0x8238b38) at postmaster.c:1935
#15 0x08102dad in ServerLoop () at postmaster.c:1016
#16 0x081027ea in PostmasterMain (argc=1, argv=0x8220170) at postmaster.c:797
#17 0x080e1234 in main (argc=1, argv=0xbfffe204) at main.c:217
(gdb) print *att[i]
$20 = {attrelid = 0, attname = {
    data = "pageunits_total", '\0' <repeats 48 times>,
    alignmentDummy = 1701273968}, atttypid = 1700, attstattarget = -1,
  attlen = -1, attnum = 9, attndims = 0, attcacheoff = -1, atttypmod = 393220,
  attbyval = 0 '\0', attstorage = 109 'm', attisset = 0 '\0',
  attalign = 105 'i', attnotnull = 0 '\0', atthasdef = 0 '\0',
  attisdropped = 0 '\0', attislocal = 1 '\001', attinhcount = 0}
(gdb) print i
$21 = 8
(gdb) x/10 value[i]
0xb7190928: 0x2f00000b 0x00000000 0x00200000 0x00000207
0xb7190938: 0x00000314 0x01bf913d 0x10120000 0x00090020
0xb7190948: 0xef201553 0x00000001
the relevant code again is:
        if (att[i]->attbyval)
            [...]
        else if (att[i]->attlen == -1)
            [...]
        else if (att[i]->attlen == -2)
            [...]
        else
        {
            /* fixed-length pass-by-reference */
            Assert(att[i]->attlen > 0);
            data_length = att[i]->attlen;
===>        memcpy(data, DatumGetPointer(value[i]), data_length);
        }
(gdb) print data_length
$25 = 788529163
(gdb) print att[i]->attlen
$26 = -1
how can att[i]->attlen possibly change in the interim? but
data_length looks corrupted, too.
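
a small sanity check on the numbers above, as a sketch rather than a diagnosis: with attbyval = 0 and attlen = -1, the chain as quoted should take the attlen == -1 arm, so the fixed-length memcpy ought to be unreachable for this attribute. also, 788529163 is 0x2f00000b, which is the first word that x/10 shows at value[i]. one possible reading (an assumption, not confirmed anywhere in this session) is that data_length was loaded from the start of the datum in the attlen == -1 arm, and that the line gdb reports is off because of optimisation. the enum and function names below are made up for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* which arm of the quoted if/else chain runs; enum names are hypothetical */
enum branch { BY_VALUE, ATTLEN_MINUS_1, ATTLEN_MINUS_2, FIXED_BY_REF };

static enum branch pick_branch(int attbyval, int attlen)
{
    if (attbyval)
        return BY_VALUE;
    else if (attlen == -1)
        return ATTLEN_MINUS_1;
    else if (attlen == -2)
        return ATTLEN_MINUS_2;
    else
        return FIXED_BY_REF;
}

/* values copied from the gdb session */
static const int      dump_attbyval    = 0;           /* att[i]->attbyval */
static const int      dump_attlen      = -1;          /* att[i]->attlen ($26) */
static const int32_t  dump_data_length = 788529163;   /* data_length ($25) */
static const uint32_t dump_first_word  = 0x2f00000bu; /* first word of x/10 value[i] */

/* returns 1 if data_length equals the first 32-bit word at value[i] */
static int length_matches_first_word(void)
{
    return (uint32_t) dump_data_length == dump_first_word;
}

/* asserts both observations; call it from a main() to run the check */
static void check_session_values(void)
{
    assert(pick_branch(dump_attbyval, dump_attlen) == ATTLEN_MINUS_1);
    assert(length_matches_first_word());
}
```

calling check_session_values() from a main() passes silently if both observations hold. it says nothing about why the backend ended up in memcpy with that length.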
(gdb) print *att[i-1]
$27 = {attrelid = 0, attname = {
    data = "pageunits_paid", '\0' <repeats 49 times>,
    alignmentDummy = 1701273968}, atttypid = 1700, attstattarget = -1,
  attlen = -1, attnum = 8, attndims = 0, attcacheoff = -1, atttypmod = 393220,
  attbyval = 0 '\0', attstorage = 109 'm', attisset = 0 '\0',
  attalign = 105 'i', attnotnull = 0 '\0', atthasdef = 0 '\0',
  attisdropped = 0 '\0', attislocal = 1 '\001', attinhcount = 0}
also:
(gdb) print data
$39 = 0xb7488fff ""
which doesn't seem very aligned for an integer.
(gdb) print data[1]
Cannot access memory at address 0xb7489000
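
the arithmetic behind the two observations above, as a rough sketch: assuming 4 KiB pages and 4-byte integer alignment (both typical for 32-bit x86 Linux, neither verified on this machine), data sits 3 bytes past a 4-byte boundary and is the last byte of its page, while 0xb7489000 is the first byte of the next page, the one gdb cannot read. the macro and helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ASSUMED_PAGE_SIZE 4096u /* assumption: 4 KiB pages */
#define ASSUMED_INT_ALIGN 4u    /* assumption: 4-byte int alignment */

/* byte offset of addr within its (assumed) page */
static uint32_t offset_in_page(uint32_t addr)
{
    return addr % ASSUMED_PAGE_SIZE;
}

/* how many bytes addr lies past the previous int-aligned address */
static uint32_t misalignment(uint32_t addr)
{
    return addr % ASSUMED_INT_ALIGN;
}

/* 1 if addr is the final byte of its page, so addr + 1 starts a new page */
static int is_last_byte_of_page(uint32_t addr)
{
    return offset_in_page(addr) == ASSUMED_PAGE_SIZE - 1u;
}
```

with data = 0xb7488fff, misalignment() gives 3 and is_last_byte_of_page() is true, so under these assumptions any copy longer than one byte starting at data crosses into the next page.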
thank you for any insights.
-- 
Kjetil T.