Re: Help: 8.0.3 Vacuum of an empty table never completes ... - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Help: 8.0.3 Vacuum of an empty table never completes ...
Msg-id 11653.1133197204@sss.pgh.pa.us
In response to Re: Help: 8.0.3 Vacuum of an empty table never completes ...  (James Robinson <jlrobins@socialserve.com>)
Responses Re: Help: 8.0.3 Vacuum of an empty table never completes ...
List pgsql-hackers
James Robinson <jlrobins@socialserve.com> writes:
> On Nov 28, 2005, at 11:38 AM, Tom Lane wrote:
>> Can you get a similar backtrace from the vacuumdb process?   

> OK:

> (gdb) bt
> #0  0xffffe410 in ?? ()
> #1  0xbfffe4f8 in ?? ()
> #2  0x00000030 in ?? ()
> #3  0x08057b68 in ?? ()
> #4  0xb7e98533 in __write_nocancel () from /lib/tls/libc.so.6
> #5  0xb7e4aae6 in _IO_new_file_write () from /lib/tls/libc.so.6
> #6  0xb7e4a7e5 in new_do_write () from /lib/tls/libc.so.6
> #7  0xb7e4aa63 in _IO_new_file_xsputn () from /lib/tls/libc.so.6
> #8  0xb7e413a2 in fputs () from /lib/tls/libc.so.6
> #9  0xb7fd8f99 in defaultNoticeProcessor () from /usr/local/pgsql/lib/libpq.so.4
> #10 0xb7fd8fe5 in defaultNoticeReceiver () from /usr/local/pgsql/lib/libpq.so.4
> #11 0xb7fe2d34 in pqGetErrorNotice3 () from /usr/local/pgsql/lib/libpq.so.4
> #12 0xb7fe3921 in pqParseInput3 () from /usr/local/pgsql/lib/libpq.so.4
> #13 0xb7fdb174 in parseInput () from /usr/local/pgsql/lib/libpq.so.4
> #14 0xb7fdca99 in PQgetResult () from /usr/local/pgsql/lib/libpq.so.4
> #15 0xb7fdcc4b in PQexecFinish () from /usr/local/pgsql/lib/libpq.so.4
> #16 0x0804942c in vacuum_one_database ()
> #17 0x080497a1 in main ()

OK, so evidently the backend is sending NOTICE messages, and the
vacuumdb is blocked trying to copy those messages to stderr.
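
To illustrate the failure mode, here is a minimal sketch in Python,
assuming a POSIX system, of a writer filling a pipe that nobody drains.
The function name and the message text are hypothetical, not anything
from vacuumdb or libpq. The sketch uses non-blocking descriptors so it
can report the condition instead of hanging; a blocking write, such as
the fputs() in the backtrace above, would simply sleep at the point
where this code catches BlockingIOError.

```python
import os


def demo_blocked_writer(chunk=b"NOTICE: vacuum progress\n"):
    """Fill an undrained pipe, then drain it and try writing again.

    Returns (bytes_queued_before_full, write_succeeded_after_drain).
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode lets us observe "would block" as an exception
    # rather than hanging the way a blocking writer would.
    os.set_blocking(write_fd, False)
    os.set_blocking(read_fd, False)
    queued = 0
    try:
        # Writer side: keep emitting messages while the reader is idle.
        while True:
            try:
                queued += os.write(write_fd, chunk)
            except BlockingIOError:
                break  # kernel buffer is full; a blocking writer sleeps here

        # Reader side wakes up and empties the pipe.
        while True:
            try:
                if not os.read(read_fd, 65536):
                    break
            except BlockingIOError:
                break  # nothing left to read

        # With the queue drained, the writer can make progress again.
        resumed = os.write(write_fd, chunk) == len(chunk)
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return queued, resumed


if __name__ == "__main__":
    queued, resumed = demo_blocked_writer()
    print("bytes queued before the pipe filled:", queued)
    print("write succeeded after draining:", resumed)
```

The exact number of bytes queued depends on the kernel's pipe buffer
size, so the sketch reports it rather than assuming a value.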

> Things to know which could possibly be of use. This cron is kicked  
> off on the backup database box, and the vacuumdb is run via ssh to  
> the primary box. The primary box is running the vacuumdb operation  
> with --analyze --verbose, with the output being streamed to a logfile  
> on the backup box. Lemme guess __write_nocancel calls syscall write,  
> and 0x00000030 might could well be the syscall entry point? Something  
> gumming up the networking or sshd itself could have stopped up the  
> output queues, and the backups populated all the way down to this level?

That's what it looks like: the output queue from the vacuumdb has
stopped up somehow.  Your next move is to look at the state of sshd
and whatever is running at the client end of the ssh tunnel.
        regards, tom lane
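
When checking each stage of such a pipeline, one question is whether a
given descriptor would accept more output right now. The following is a
rough sketch in Python, assuming a POSIX system; write_would_block and
fill_pipe are hypothetical helper names, and this is not a procedure
from the thread. It uses select() with a zero timeout to ask the kernel
whether a write could proceed without sleeping.

```python
import os
import select


def write_would_block(fd, timeout=0.0):
    """Return True if fd is not ready for writing within `timeout` seconds."""
    _, writable, _ = select.select([], [fd], [], timeout)
    return fd not in writable


def fill_pipe(write_fd, chunk=b"x" * 1024):
    """Write to write_fd in non-blocking mode until the kernel buffer is full."""
    os.set_blocking(write_fd, False)
    try:
        while True:
            os.write(write_fd, chunk)
    except BlockingIOError:
        pass  # buffer full: no reader is draining the other end


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    print("empty pipe would block:", write_would_block(write_fd))
    fill_pipe(write_fd)
    print("full pipe would block:", write_would_block(write_fd))
    os.close(read_fd)
    os.close(write_fd)
```

A descriptor that stays unwritable for a long time suggests that the
consumer at the other end has stopped reading.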

