Re: [HACKERS] memory problems in copying large table to STDOUT - Mailing list pgsql-hackers
From | Martin Weinberg |
---|---|
Subject | Re: [HACKERS] memory problems in copying large table to STDOUT |
Date | |
Msg-id | 199910110417.AAA18542@osprey.astro.umass.edu Whole thread Raw |
In response to | Re: [HACKERS] memory problems in copying large table to STDOUT (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] memory problems in copying large table to STDOUT
|
List | pgsql-hackers |
Hi Tom, Tom Lane wrote on Sun, 10 Oct 1999 21:33:00 EDT > >I think it's pretty likely that the original cause is the kernel disk >driver or disk hardware flakiness that we already have evidence for. >However, I hate passing the buck like that, so I'm willing to continue >digging if you are. Honestly, I'm inclined to agree but I'd like to know for sure before go buy a hardware raid controller (or give up trying to build a database this large with Linux altogether). The "coincidence" that's haunting me here is that I used pgsql on a 12GB database of the same sort with no trouble for about 6 months. All of this started when I tried to load 60GB. And after loading the table, I index it and exercise it and everything is fine. Twice now, this problem has arisen when I've tried to backup the table with a "copy". The first time, I wiped out the raid array, got the latest kernel with the up-to-date raid patches and rebuilt (takes a while . . .). Same thing again. Of course, this copy hits the disks pretty hard, so maybe it's not so coincidental. Anyway, ruling out a pgsql problem would be progress. >> Again, I've saved the process in case checking any symbols >> would be helpful. > >You should look at the source tuple location info in CopyTo --- >something like > (gdb) f 6 -- frame 6, ie, CopyTo > (gdb) p i -- get column number > (gdb) p *tuple -- print contents of HeapTupleData > (gdb) p *tuple->t_data -- print contents of HeapTupleHeaderData > >The last is mainly to find out the tuple's OID for possible future >reference. What we want right now is the tuple location info, >tuple->t_self, which will give us a block number (bi_hi and bi_lo in >that struct are the high and low 16 bits of the block number). Then, >if you can use dd and od to get a hex dump of that block from the >relation's data files, we can see what's really on disk there. >(Remember that the "blocks" are 8K each; also, if you get an offset >beyond 1 gig, then it's going to be in one of the continuation files >"psc.1", "psc.2", etc --- one gig apiece.) > Ok done. Here's what I find: (gdb) p i $1 = 48 (gdb) p *tuple $2 = {t_len = 352, t_self = {ip_blkid = {bi_hi = 24, bi_lo = 26279}, ip_posid = 19}, t_data = 0x41061710} (gdb) p *tuple->t_data $3 = {t_oid = 37497689, t_cmin = 0, t_cmax = 0, t_xmin = 17943, t_xmax = 0, t_ctid = {ip_blkid = {bi_hi = 24, bi_lo = 26279}, ip_posid = 19}, t_natts = 63, t_infomask = 2307, t_hoff = 40 '(',t_bits = "<FF><FF><FF><FF>"} Now, check me to make sure I've followed you correctly: Since 1GB of blocks is 0x20000, this data in the 13th GB. The offset into the 13th is 26279. So I did: dd if=psc.12 skip=26279 count=1 bs=8k | od -t x > ~/dump.hex I'm not sure what I'm looking for in dump.hex. The first hand full of lines are: 0000000 01840064 20002000 02c09ea0 02809d60 0000020 02c09c00 02989ab4 02c09954 02989808 0000040 028096c8 02809588 02c09428 02c092c8 0000060 02c09168 02c09008 02c08ea8 02988d5c 0000100 02c08bfc 02c08a9c 0280895c 02588830 0000120 02c086d0 02c08570 02588444 02c082e4 0000140 02c08184 00000000 00000000 00000000 0000160 00000000 00000000 00000000 00000000 * 0000600 00000000 023c2b5d 00000000 00000000 0000620 00004617 00000000 66a70018 003f0017 0000640 ff280903 ffffffff 000fffff 00000005 0000660 00000073 fffffd95 000000ac 00003d2b . . . --Martin =========================================================================== Martin Weinberg Phone: (413) 545-3821 Dept. of Physics and Astronomy FAX: (413) 545-2117/0648 530 Graduate Research Tower weinberg@astro.umass.edu University of Massachusetts http://www.astro.umass.edu/~weinberg/ Amherst, MA 01003-4525
pgsql-hackers by date: