Thread: How to cancel a query if SIGINT does not work?
Hi, I have a query that appears to have hung somehow. I have tried sending a SIGINT but this does not cancel it... What's the next step without taking down the whole server? Thanks Adam -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
Adam Witney <awitney@sgul.ac.uk> writes: > I have a query that appears to have hung somehow. I have tried sending a > SIGINT but this does not cancel it... What's the next step without taking > down the whole server? In theory that should always work. What PG version is this? Can you attach to the stuck backend with gdb and get a stack trace? regards, tom lane
On 19/8/05 5:27 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > Adam Witney <awitney@sgul.ac.uk> writes: >> I have a query that appears to have hung somehow. I have tried sending a >> SIGINT but this does not cancel it... What's the next step without taking >> down the whole server? > > In theory that should always work. What PG version is this? Can you > attach to the stuck backend with gdb and get a stack trace? select version(); version --------------------------------------------------------------- PostgreSQL 7.4.8 on i686-pc-linux-gnu, compiled by GCC 2.95.4 Am not particularly familiar with gdb, but if you tell me the steps I will do it and report back Thanks for your help Adam -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
Adam Witney <awitney@sgul.ac.uk> writes: > On 19/8/05 5:27 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: >> In theory that should always work. What PG version is this? Can you >> attach to the stuck backend with gdb and get a stack trace? > Am not particularly familiar with gdb, but if you tell me the steps I will > do it and report back Think I've posted this before, but: Determine PID of backend you are interested in, eg with ps As postgres user, do $ gdb /path/to/postgres-executable PID-of-process gdb prints a bunch of junk gdb> bt ... copy and paste the output given here ... gdb> quit gdb: ok to detach? y $ If you get a pile of purely numeric output from bt, it's useless (means you're running symbol-stripped executables). I'm hoping for at least some function names. regards, tom lane
On 19/8/05 6:08 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > Adam Witney <awitney@sgul.ac.uk> writes: >> On 19/8/05 5:27 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: >>> In theory that should always work. What PG version is this? Can you >>> attach to the stuck backend with gdb and get a stack trace? > >> Am not particularly familiar with gdb, but if you tell me the steps I will >> do it and report back > > Think I've posted this before, but: > > Determine PID of backend you are interested in, eg with ps > > As postgres user, do > $ gdb /path/to/postgres-executable PID-of-process > gdb prints a bunch of junk > gdb> bt > ... copy and paste the output given here ... > gdb> quit > gdb: ok to detach? y > $ > > If you get a pile of purely numeric output from bt, it's useless > (means you're running symbol-stripped executables). I'm hoping > for at least some function names. (gdb) bt #0 0x40193812 in recv () from /lib/libc.so.6 #1 0x081044d8 in secure_read () #2 0x081084c7 in pq_recvbuf () #3 0x0810853d in pq_getbyte () #4 0x080cb12b in CopyGetData () #5 0x080cb36e in CopyGetChar () #6 0x080cd9d7 in CopyReadLine () #7 0x080cce90 in CopyFrom () #8 0x080cbdea in DoCopy () #9 0x08157759 in ProcessUtility () #10 0x08156a1b in PortalRunUtility () #11 0x08156ae2 in PortalRunMulti () #12 0x0815667e in PortalRun () #13 0x08153843 in exec_simple_query () #14 0x08155ab9 in PostgresMain () #15 0x0813523d in BackendFork () #16 0x08134ad5 in BackendStartup () #17 0x08133439 in ServerLoop () #18 0x08132fa5 in PostmasterMain () #19 0x081097af in main () #20 0x400dc14f in __libc_start_main () from /lib/libc.so.6 Also, what's the best way to cancel the query now? Thanks again for your help adam -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
On Fri, Aug 19, 2005 at 06:10:28PM +0100, Adam Witney wrote: > > If you get a pile of purely numeric output from bt, it's useless > > (means you're running symbol-stripped executables). I'm hoping > > for at least some function names. [trace below] That's not stuck, somebody has typed COPY FROM STDIN and not sent any data (or not finished it). Should be able to Ctrl-C it though. Are you sure you have the right backend? Closing the frontend attached to it should do... > (gdb) bt > #0 0x40193812 in recv () from /lib/libc.so.6 > #1 0x081044d8 in secure_read () > #2 0x081084c7 in pq_recvbuf () > #3 0x0810853d in pq_getbyte () > #4 0x080cb12b in CopyGetData () > #5 0x080cb36e in CopyGetChar () > #6 0x080cd9d7 in CopyReadLine () > #7 0x080cce90 in CopyFrom () > #8 0x080cbdea in DoCopy () > #9 0x08157759 in ProcessUtility () > #10 0x08156a1b in PortalRunUtility () > #11 0x08156ae2 in PortalRunMulti () > #12 0x0815667e in PortalRun () > #13 0x08153843 in exec_simple_query () > #14 0x08155ab9 in PostgresMain () > #15 0x0813523d in BackendFork () > #16 0x08134ad5 in BackendStartup () > #17 0x08133439 in ServerLoop () > #18 0x08132fa5 in PostmasterMain () > #19 0x081097af in main () > #20 0x400dc14f in __libc_start_main () from /lib/libc.so.6 > > Also, what's the best way to cancel the query now? -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Adam Witney <awitney@sgul.ac.uk> writes: > (gdb) bt > #0 0x40193812 in recv () from /lib/libc.so.6 > #1 0x081044d8 in secure_read () > #2 0x081084c7 in pq_recvbuf () > #3 0x0810853d in pq_getbyte () > #4 0x080cb12b in CopyGetData () > #5 0x080cb36e in CopyGetChar () > #6 0x080cd9d7 in CopyReadLine () > #7 0x080cce90 in CopyFrom () > #8 0x080cbdea in DoCopy () > #9 0x08157759 in ProcessUtility () > #10 0x08156a1b in PortalRunUtility () > #11 0x08156ae2 in PortalRunMulti () > #12 0x0815667e in PortalRun () > #13 0x08153843 in exec_simple_query () > #14 0x08155ab9 in PostgresMain () > #15 0x0813523d in BackendFork () > #16 0x08134ad5 in BackendStartup () > #17 0x08133439 in ServerLoop () > #18 0x08132fa5 in PostmasterMain () > #19 0x081097af in main () > #20 0x400dc14f in __libc_start_main () from /lib/libc.so.6 OK, apparently it's waiting for data from the client during a COPY FROM. Not sure if we should try to fix things so that a SIGINT will work there --- seems like it'd be more likely to bollix the protocol state than do anything very useful. Something for the TODO list. > Also, what's the best way to cancel the query now? Kill the connected client. The recv() should error out once the network connection is dropped. regards, tom lane
On 19/8/05 6:20 pm, "Martijn van Oosterhout" <kleptog@svana.org> wrote: > On Fri, Aug 19, 2005 at 06:10:28PM +0100, Adam Witney wrote: >>> If you get a pile of purely numeric output from bt, it's useless >>> (means you're running symbol-stripped executables). I'm hoping >>> for at least some function names. > > [trace below] > > That's not stuck, somebody has typed COPY FROM STDIN and not sent any > data (or not finished it). Should be able to Ctrl-C it though. Are you > sure you have the right backend? > > Closing the frontend attached to it should do... Ah.... Its a web application... I assumed that because the browser had been closed that the query must have hung. But just restarted the apache server and that cancelled it, so apache must have kept the query running despite the user closing the browser. Sorry I probably should have thought of that! Thanks for your help Tom and Martijn Adam -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.
I'm just starting out using VIEWs -- and mostly new to postgresql and I'm trying to learn a little about reading the EXPLAIN ANALYZE output. I have the following view setup and I'm wonder where to create indexes, and mostly if I'm doing anything the incorrect or a very expensive way. CREATE VIEW class_list ( id, name, class_time, location, workshop, review_mode, workshop_group, location_name, address, city, state, zip, region, region_name ) AS SELECT class.id, class.name, class.class_time, class.location, class.workshop, class.review_mode, class.workshop_group, location.name, address.id, address.city, address.state, address.zip, region.id, region.name FROM class, location, address, region WHERE class.location = location.id AND location.address = address.id AND location.region = region.id; I'm not clear about the Seq Scan below. The region table is quite small, so am I correct that is why the planner is doing a seq scan on that table? \d region Table "public.region" Column | Type | Modifiers ------------+---------+-------------------------------------------------------- id | integer | not null default nextval('public.region_id_seq'::text) active | boolean | not null default true sort_order | integer | not null default 1 name | text | not null Indexes: "region_pkey" primary key, btree (id) "region_name_key" unique, btree (name) EXPLAIN ANALYZE select * from class_list where workshop = 28; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=51.78..93.07 rows=9 width=157) (actual time=1.306..1.468 rows=6 loops=1) -> Hash Join (cost=51.78..76.87 rows=8 width=129) (actual time=1.245..1.299 rows=6 loops=1) Hash Cond: ("outer".id = "inner".region) -> Seq Scan on region (cost=0.00..20.00 rows=1000 width=36) (actual time=0.016..0.027 rows=10 loops=1) -> Hash (cost=51.76..51.76 rows=8 width=97) (actual time=1.019..1.019 rows=0 loops=1) -> Hash Join (cost=26.68..51.76 rows=8 width=97) (actual time=0.201..1.007 rows=6 loops=1) Hash Cond: ("outer".id = "inner"."location") -> Seq Scan on "location" (cost=0.00..20.00 rows=1000 width=44) (actual time=0.014..0.694 rows=104loops=1) -> Hash (cost=26.66..26.66 rows=7 width=57) (actual time=0.150..0.150 rows=0 loops=1) -> Index Scan using class_workshop_index on "class" (cost=0.00..26.66 rows=7 width=57) (actualtime=0.057..0.137 rows=6 loops=1) Index Cond: (workshop = 28) -> Index Scan using address_pkey on address (cost=0.00..2.01 rows=1 width=32) (actual time=0.013..0.015 rows=1 loops=6) Index Cond: ("outer".address = address.id) Total runtime: 1.853 ms (14 rows) By the way -- at one point I managed to hang postgresql (7.4.8-16 on Debian Sid). I have not been able to make it happen again, but it seemed odd. (gdb) bt #0 0x081e51ee in tuplestore_gettuple () #1 0x0810c7f0 in ExecMaterial () #2 0x08102cb2 in ExecProcNode () #3 0x0810d8d5 in ExecNestLoop () #4 0x08102ceb in ExecProcNode () #5 0x081093a4 in ExecAgg () #6 0x08102c79 in ExecProcNode () #7 0x08101ecc in ExecutorRun () #8 0x0816f58b in PortalSetResultFormat () #9 0x0816f8c7 in PortalRun () #10 0x0816da9f in PostgresMain () #11 0x08148b4e in ClosePostmasterPorts () #12 0x0814a4e1 in PostmasterMain () #13 0x0811c2e7 in main () -- Bill Moseley moseley@hank.org