Autovacuum worker doesn't immediately exit on postmaster death - Mailing list pgsql-hackers
From | Alexander Kukushkin |
---|---|
Subject | Autovacuum worker doesn't immediately exit on postmaster death |
Date | |
Msg-id | CAFh8B=kcdk8k-Y21RfXPu5dX=bgPqJ8TC3p_qxR_ygdBS=JN5w@mail.gmail.com Whole thread Raw |
Responses |
Re: Autovacuum worker doesn't immediately exit on postmaster death
Re: Autovacuum worker doesn't immediately exit on postmaster death |
List | pgsql-hackers |
Hello, I know, nobody in their mind should do that, but, if the postmaster process is killed with SIGKILL signal, most backend processes correctly notice the fact of the postmaster process absence and exit. There is one exception though, when there are autovacuum worker processes they are continuing to run until eventually finish and exit. Steps to reproduce: 1. Initialize the cluster and start it up 2. Create a table and fill it up with some data: localhost/postgres=# create table foo(id int); CREATE TABLE localhost/postgres=# alter table foo set (autovacuum_vacuum_cost_delay = 100); ALTER TABLE localhost/postgres=# insert into foo select * from generate_series(1, 10000000); INSERT 0 10000000 localhost/postgres=# \dt+ foo List of relations Schema │ Name │ Type │ Owner │ Persistence │ Size │ Description ────────┼──────┼───────┼──────────┼─────────────┼────────┼───────────── public │ foo │ table │ postgres │ permanent │ 346 MB │ (1 row) 3. Wait until autovacuum worker process started and kill it: $ ps auxwwwf | grep [p]ostgres akukush+ 7728 0.0 0.1 321244 26836 ? S 15:51 0:00 postgres -D data akukush+ 7730 0.0 0.0 173488 4312 ? Ss 15:51 0:00 \_ postgres: logger akukush+ 7732 0.0 0.0 321584 8808 ? Ss 15:51 0:00 \_ postgres: checkpointer akukush+ 7733 0.0 0.4 321376 70688 ? Ss 15:51 0:00 \_ postgres: background writer akukush+ 7734 0.0 0.0 321244 9780 ? Ss 15:51 0:00 \_ postgres: walwriter akukush+ 7735 0.0 0.0 321796 6684 ? Ss 15:51 0:00 \_ postgres: autovacuum launcher akukush+ 7736 0.0 0.0 175608 6224 ? Ss 15:51 0:00 \_ postgres: archiver last was 0000002E000000000000002A akukush+ 7737 0.0 0.0 175608 4340 ? Ss 15:51 0:00 \_ postgres: stats collector akukush+ 7738 0.0 0.0 321672 6812 ? Ss 15:51 0:00 \_ postgres: logical replication launcher akukush+ 7743 0.0 0.0 322460 14624 ? Ss 15:52 0:00 \_ postgres: postgres postgres 127.0.0.1(39130) idle $ ps auxwwwf | grep [p]ostgres | grep auto akukush+ 7735 0.0 0.0 321796 6684 ? Ss 15:51 0:00 \_ postgres: autovacuum launcher akukush+ 10483 1.0 0.0 322432 12472 ? Ss 16:28 0:00 \_ postgres: autovacuum worker postgres $ kill -9 7728 $ ps auxwwwf | grep [p]ostgres | grep auto akukush+ 10483 0.7 0.0 322432 12472 ? Ss 16:28 0:00 postgres: autovacuum worker postgres And here is gdb backtrace: (gdb) bt #0 0x00007f6e1c80c0f7 in __GI___select (nfds=nfds@entry=0, readfds=readfds@entry=0x0, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff6cf1b580) at ../sysdeps/unix/sysv/linux/select.c:41 #1 0x000055dbd93ade2d in pg_usleep (microsec=<optimized out>) at ./build/../src/port/pgsleep.c:56 #2 0x000055dbd90b7543 in vacuum_delay_point () at ./build/../src/backend/commands/vacuum.c:2034 #3 0x000055dbd8f5c00d in lazy_scan_heap (aggressive=false, nindexes=0, Irel=0x0, vacrelstats=<optimized out>, params=0x55dbdaac7e7c, onerel=<optimized out>) at ./build/../src/backend/access/heap/vacuumlazy.c:1034 #4 heap_vacuum_rel (onerel=<optimized out>, params=0x55dbdaac7e7c, bstrategy=<optimized out>) at ./build/../src/backend/access/heap/vacuumlazy.c:518 #5 0x000055dbd90b561d in table_relation_vacuum (bstrategy=<optimized out>, params=0x55dbdaac7e7c, rel=0x7f6e1f3105f0) at ./build/../src/include/access/tableam.h:1460 #6 vacuum_rel (relid=16396, relation=<optimized out>, params=params@entry=0x55dbdaac7e7c) at ./build/../src/backend/commands/vacuum.c:1893 #7 0x000055dbd90b68c5 in vacuum (relations=0x55dbdab38588, params=params@entry=0x55dbdaac7e7c, bstrategy=<optimized out>, bstrategy@entry=0x55dbdaac7f98, isTopLevel=isTopLevel@entry=true) at ./build/../src/backend/commands/vacuum.c:449 #8 0x000055dbd8f0777b in autovacuum_do_vac_analyze (bstrategy=0x55dbdaac7f98, tab=0x55dbdaac7e78) at ./build/../src/backend/postmaster/autovacuum.c:3137 #9 do_autovacuum () at ./build/../src/backend/postmaster/autovacuum.c:2467 #10 0x000055dbd8f07e2a in AutoVacWorkerMain (argv=0x0, argc=0) at ./build/../src/backend/postmaster/autovacuum.c:1694 #11 0x000055dbd91a753a in StartAutoVacWorker () at ./build/../src/backend/postmaster/autovacuum.c:1488 #12 0x000055dbd91b54ca in StartAutovacuumWorker () at ./build/../src/backend/postmaster/postmaster.c:5613 #13 sigusr1_handler (postgres_signal_arg=<optimized out>) at ./build/../src/backend/postmaster/postmaster.c:5320 #14 <signal handler called> #15 0x00007f6e1c80c0f7 in __GI___select (nfds=nfds@entry=10, readfds=readfds@entry=0x7fff6cf1c7f0, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff6cf1c750) at ../sysdeps/unix/sysv/linux/select.c:41 #16 0x000055dbd91b5759 in ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1703 #17 0x000055dbd91b75e3 in PostmasterMain (argc=17, argv=<optimized out>) at ./build/../src/backend/postmaster/postmaster.c:1412 #18 0x000055dbd8f0a3c8 in main (argc=17, argv=0x55dbdaa4aef0) at ./build/../src/backend/main/main.c:210 Fifteen minutes later process 10483 was still alive, but the backtrace looked a bit different (it finished vacuuming and was doing analyze): (gdb) bt #0 0x00007f6e1c80c0f7 in __GI___select (nfds=nfds@entry=0, readfds=readfds@entry=0x0, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff6cf1b480) at ../sysdeps/unix/sysv/linux/select.c:41 #1 0x000055dbd93ade2d in pg_usleep (microsec=<optimized out>) at ./build/../src/port/pgsleep.c:56 #2 0x000055dbd90b7543 in vacuum_delay_point () at ./build/../src/backend/commands/vacuum.c:2034 #3 0x000055dbd9037251 in acquire_sample_rows (onerel=onerel@entry=0x7f6e1f310070, elevel=elevel@entry=13, rows=rows@entry=0x7f6e1f2b8048, targrows=targrows@entry=30000, totalrows=totalrows@entry=0x7fff6cf1b6d8, totaldeadrows=totaldeadrows@entry=0x7fff6cf1b6e0) at ./build/../src/backend/commands/analyze.c:1079 #4 0x000055dbd9039d51 in do_analyze_rel (onerel=onerel@entry=0x7f6e1f310070, params=params@entry=0x55dbdaac7e7c, va_cols=va_cols@entry=0x0, acquirefunc=0x55dbd9037110 <acquire_sample_rows>, relpages=44248, inh=inh@entry=false, in_outer_xact=false, elevel=13) at ./build/../src/backend/commands/analyze.c:522 #5 0x000055dbd903b452 in analyze_rel (relid=<optimized out>, relation=<optimized out>, params=params@entry=0x55dbdaac7e7c, va_cols=0x0, in_outer_xact=<optimized out>, bstrategy=<optimized out>) at ./build/../src/backend/commands/analyze.c:263 #6 0x000055dbd90b6884 in vacuum (relations=0x55dbdab38588, params=params@entry=0x55dbdaac7e7c, bstrategy=<optimized out>, bstrategy@entry=0x55dbdaac7f98, isTopLevel=isTopLevel@entry=true) at ./build/../src/backend/commands/vacuum.c:466 #7 0x000055dbd8f0777b in autovacuum_do_vac_analyze (bstrategy=0x55dbdaac7f98, tab=0x55dbdaac7e78) at ./build/../src/backend/postmaster/autovacuum.c:3137 #8 do_autovacuum () at ./build/../src/backend/postmaster/autovacuum.c:2467 #9 0x000055dbd8f07e2a in AutoVacWorkerMain (argv=0x0, argc=0) at ./build/../src/backend/postmaster/autovacuum.c:1694 #10 0x000055dbd91a753a in StartAutoVacWorker () at ./build/../src/backend/postmaster/autovacuum.c:1488 #11 0x000055dbd91b54ca in StartAutovacuumWorker () at ./build/../src/backend/postmaster/postmaster.c:5613 #12 sigusr1_handler (postgres_signal_arg=<optimized out>) at ./build/../src/backend/postmaster/postmaster.c:5320 #13 <signal handler called> #14 0x00007f6e1c80c0f7 in __GI___select (nfds=nfds@entry=10, readfds=readfds@entry=0x7fff6cf1c7f0, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fff6cf1c750) at ../sysdeps/unix/sysv/linux/select.c:41 #15 0x000055dbd91b5759 in ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1703 #16 0x000055dbd91b75e3 in PostmasterMain (argc=17, argv=<optimized out>) at ./build/../src/backend/postmaster/postmaster.c:1412 #17 0x000055dbd8f0a3c8 in main (argc=17, argv=0x55dbdaa4aef0) at ./build/../src/backend/main/main.c:210 Eventually, after 20-25 minutes the process 10483 exited. I was able to reproduce it with 13.0 and 12.4, and I believe older versions are also affected. Regards, -- Alexander Kukushkin
pgsql-hackers by date: