Thread: Broken PostgreSQL (latest CVS)
Hi hackers.

I grabbed the latest CVS from this morning and did a clean build
and initdb.

Things look a little broken as I get a SIGSEGV when trying to
create a table.

Any idea what went wrong?

Keith.

Program received signal SIGSEGV, Segmentation fault.
0xe016e6b4 in _wordcopy_fwd_aligned ()
(gdb) bt
#0  0xe016e6b4 in _wordcopy_fwd_aligned ()
#1  0xe011d62c in memmove ()
#2  0x26ad0 in DataFill (data=0x1af964 "", tupleDesc=0x4, value=0xefffce3c,
    nulls=0xefffce40 "", infomask=0xefffcbc6, bit=0x1af958 "\003")
    at heaptuple.c:208
#3  0x27abc in index_formtuple (tupleDescriptor=0x15b710, value=0xefffce3c,
    null=0xefffce40 "") at indextuple.c:78
#4  0x3ab8c in btinsert (rel=0x1b0510, datum=0xefffce3c, nulls=0xefffce40 "",
    ht_ctid=0x1b04a8, heapRel=0xefffcc8f) at nbtree.c:358
#5  0xd5700 in fmgr_c (finfo=0xefffcd20, values=0xefffcd30,
    isNull=0xefffcd17 "") at fmgr.c:115
#6  0xd5a10 in fmgr (procedureId=331) at fmgr.c:286
#7  0x33ab8 in index_insert (relation=0x1b0510, datum=0xefffce3c,
    nulls=0xefffce40 "", heap_t_ctid=0x1b04a8, heapRel=0x158410)
    at indexam.c:190
#8  0x4782c in CatalogIndexInsert (idescs=0xefffcef8, nIndices=3,
    heapRelation=0x158410, heapTuple=0x1b0490) at indexing.c:162
#9  0x44bb8 in AddNewAttributeTuples (new_rel_oid=18144, tupdesc=0x15b910)
    at heap.c:588
#10 0x44e5c in heap_create_with_catalog (relname=0x159290 "dummy2",
    tupdesc=0x15b930, relkind=114 'r') at heap.c:805
#11 0x4bc6c in DefineRelation (stmt=0x159350, relkind=114 'r')
    at creatinh.c:140
#12 0xb0618 in ProcessUtility (parsetree=0x159350, dest=Remote)
    at utility.c:176
#13 0xaedfc in pg_exec_query_dest (
    query_string=0xefffd1a0 " create table dummy2 (dummy2 int);", dest=Remote)
    at postgres.c:654
#14 0xaed2c in pg_exec_query (
    query_string=0xefffd1a0 " create table dummy2 (dummy2 int);")
    at postgres.c:602
#15 0xafcb8 in PostgresMain (argc=1159168, argv=0xeffff1a0, real_argc=10,
    real_argv=0xeffffd84) at postgres.c:1429
#16 0x98b9c in DoBackend (port=0xfd400) at postmaster.c:1412
#17 0x985d8 in BackendStartup (port=0x15cc00) at postmaster.c:1191
#18 0x97c04 in ServerLoop () at postmaster.c:725
#19 0x9775c in PostmasterMain (argc=0, argv=0xeffffd84) at postmaster.c:534
#20 0x6c15c in main (argc=10, argv=0xeffffd84) at main.c:93
>
> Hi hackers.
>
> I grabbed the latest CVS from this morning and did a clean build
> and initdb.
>
> Things look a little broken as I get a SIGSEGV when trying to
> create a table.

I noticed them too (but I thought it was a flaw, since they
were gone (up to now) after a 'make install' and 'initdb').

>
> Any idea what went wrong?
>
> Keith.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xe016e6b4 in _wordcopy_fwd_aligned ()
> (gdb) bt
> #0  0xe016e6b4 in _wordcopy_fwd_aligned ()
> #1  0xe011d62c in memmove ()
> #2  0x26ad0 in DataFill (data=0x1af964 "", tupleDesc=0x4, value=0xefffce3c,
>     nulls=0xefffce40 "", infomask=0xefffcbc6, bit=0x1af958 "\003")
>     at heaptuple.c:208
> #3  0x27abc in index_formtuple (tupleDescriptor=0x15b710, value=0xefffce3c,
>     null=0xefffce40 "") at indextuple.c:78

Looks like the tuple descriptor given to index_formtuple() is
corrupted somewhere before the call to memmove() in
DataFill() line 208.

I think it must have happened inside of DataFill() or
something DataFill() called, because at the time of failure,
the on-stack variable tupleDescriptor of index_formtuple()
still looks good - so I assume index_formtuple() handed the
correct value to DataFill() at line 78 and it was
corrupted afterwards.

BTW: I love gdb.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #
> >
> > Hi hackers.
> >
> > I grabbed the latest CVS from this morning and did a clean build
> > and initdb.
> >
> > Things look a little broken as I get a SIGSEGV when trying to
> > create a table.
>
> I noticed them too (but I thought it was a flaw, since they
> were gone (up to now) after a 'make install' and 'initdb').
>
> >
> > Any idea what went wrong?
> >
> > Keith.
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0xe016e6b4 in _wordcopy_fwd_aligned ()
> > (gdb) bt
> > #0  0xe016e6b4 in _wordcopy_fwd_aligned ()
> > #1  0xe011d62c in memmove ()
> > #2  0x26ad0 in DataFill (data=0x1af964 "", tupleDesc=0x4, value=0xefffce3c,
> >     nulls=0xefffce40 "", infomask=0xefffcbc6, bit=0x1af958 "\003")
> >     at heaptuple.c:208
> > #3  0x27abc in index_formtuple (tupleDescriptor=0x15b710, value=0xefffce3c,
> >     null=0xefffce40 "") at indextuple.c:78
>
> Looks like the tuple descriptor given to index_formtuple() is
> corrupted somewhere before the call to memmove() in
> DataFill() line 208.
>
> I think it must have happened inside of DataFill() or
> something DataFill() called, because at the time of failure,
> the on-stack variable tupleDescriptor of index_formtuple()
> still looks good - so I assume index_formtuple() handed the
> correct value to DataFill() at line 78 and it was
> corrupted afterwards.
>
> BTW: I love gdb.

OK, I have just applied another patch to fix the problem I was seeing
with vacuum analyze.

The problem was in catalog/index.c::UpdateStats(). First, this was
being called during initialization, but was using the cache. I had to
put in an IsBootstrapProcessingMode() test, and add a sequential scan.
Second, there was no WriteBuffer() after the modification of the
record, and that caused problems. I assume the old code was so kludgy
that it somehow got the block written by accident.

The manifestation of the problem was that the pg_class.relhasindex
record was not being set, and therefore the pg_description index was
not getting updated, and vacuum was complaining.

Adding syscache calls is tricky because it is hard to know if a
function is called during bootstrap.

Please do a clean compile and a new initdb, and let me know if anyone
has any more problems. I still have the patch file I used, and have
consulted it to see what the cause is. In this case, there never was a
WriteBuffer() call where it should have been. It is possible there is
some other missing code that has been tickled by my changes.

initdb runs fine here now under BSDI. Of course, it ran fine earlier
too, with just the vacuum analyze problem.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
> Looks like the tuple descriptor given to index_formtuple() is
> corrupted somewhere before the call to memmove() in
> DataFill() line 208.
>
> I think it must have happened inside of DataFill() or
> something DataFill() called, because at the time of failure,
> the on-stack variable tupleDescriptor of index_formtuple()
> still looks good - so I assume index_formtuple() handed the
> correct value to DataFill() at line 78 and it was
> corrupted afterwards.
>
> BTW: I love gdb.

This is happening during initdb? Hope the new code fixes that. BTW, it
does pass the regression tests on my machine, so I figured it was safe.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
I think my new fixes may have fixed this too. Destruction of indexes
was causing major corruption in random areas.

> Hi hackers.
>
> I grabbed the latest CVS from this morning and did a clean build
> and initdb.
>
> Things look a little broken as I get a SIGSEGV when trying to
> create a table.
>
> Any idea what went wrong?
>
> Keith.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0xe016e6b4 in _wordcopy_fwd_aligned ()
> (gdb) bt
> #0  0xe016e6b4 in _wordcopy_fwd_aligned ()
> #1  0xe011d62c in memmove ()
> #2  0x26ad0 in DataFill (data=0x1af964 "", tupleDesc=0x4, value=0xefffce3c,
>     nulls=0xefffce40 "", infomask=0xefffcbc6, bit=0x1af958 "\003")
>     at heaptuple.c:208
> #3  0x27abc in index_formtuple (tupleDescriptor=0x15b710, value=0xefffce3c,
>     null=0xefffce40 "") at indextuple.c:78
> #4  0x3ab8c in btinsert (rel=0x1b0510, datum=0xefffce3c, nulls=0xefffce40 "",
>     ht_ctid=0x1b04a8, heapRel=0xefffcc8f) at nbtree.c:358
> #5  0xd5700 in fmgr_c (finfo=0xefffcd20, values=0xefffcd30,
>     isNull=0xefffcd17 "") at fmgr.c:115
> #6  0xd5a10 in fmgr (procedureId=331) at fmgr.c:286
> #7  0x33ab8 in index_insert (relation=0x1b0510, datum=0xefffce3c,
>     nulls=0xefffce40 "", heap_t_ctid=0x1b04a8, heapRel=0x158410)
>     at indexam.c:190
> #8  0x4782c in CatalogIndexInsert (idescs=0xefffcef8, nIndices=3,
>     heapRelation=0x158410, heapTuple=0x1b0490) at indexing.c:162
> #9  0x44bb8 in AddNewAttributeTuples (new_rel_oid=18144, tupdesc=0x15b910)
>     at heap.c:588
> #10 0x44e5c in heap_create_with_catalog (relname=0x159290 "dummy2",
>     tupdesc=0x15b930, relkind=114 'r') at heap.c:805
> #11 0x4bc6c in DefineRelation (stmt=0x159350, relkind=114 'r')
>     at creatinh.c:140
> #12 0xb0618 in ProcessUtility (parsetree=0x159350, dest=Remote)
>     at utility.c:176
> #13 0xaedfc in pg_exec_query_dest (
>     query_string=0xefffd1a0 " create table dummy2 (dummy2 int);", dest=Remote)
>     at postgres.c:654
> #14 0xaed2c in pg_exec_query (
>     query_string=0xefffd1a0 " create table dummy2 (dummy2 int);")
>     at postgres.c:602
> #15 0xafcb8 in PostgresMain (argc=1159168, argv=0xeffff1a0, real_argc=10,
>     real_argv=0xeffffd84) at postgres.c:1429
> #16 0x98b9c in DoBackend (port=0xfd400) at postmaster.c:1412
> #17 0x985d8 in BackendStartup (port=0x15cc00) at postmaster.c:1191
> #18 0x97c04 in ServerLoop () at postmaster.c:725
> #19 0x9775c in PostmasterMain (argc=0, argv=0xeffffd84) at postmaster.c:534
> #20 0x6c15c in main (argc=10, argv=0xeffffd84) at main.c:93

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)