Runaway backend at 100% CPU PostgreSQL v8.3.5 - Mailing list pgsql-hackers
From | Stephen R. van den Berg |
---|---|
Subject | Runaway backend at 100% CPU PostgreSQL v8.3.5 |
Date | |
Msg-id | 20090106231134.GA13379@cuci.nl Whole thread Raw |
Responses |
Re: Runaway backend at 100% CPU PostgreSQL v8.3.5
|
List | pgsql-hackers |
I'm running Debian PostgreSQL v8.3.5-1 on x86 in 32-bit mode. Every once in a while, some backends start taking 100% CPU, as can be seen below in the excerpt from the process table: 27256 ? Ss 0:04 /usr/lib/postgresql/8.3/bin/postgres -D /var/lib/post 27299 ? Ss 0:00 \_ postgres: writer process 27300 ? Ss 0:00 \_ postgres: wal writer process 27301 ? Ss 0:00 \_ postgres: autovacuum launcher process 27302 ? Ss 0:26 \_ postgres: stats collector process 12076 ? Rs 5681:50 \_ postgres: cms cms 10.0.0.5(59125) PARSE 12853 ? Rs 5549:25 \_ postgres: cms cms 10.0.0.5(48437) PARSE 25985 ? Ss 0:00 \_ postgres: cms cms 10.0.0.18(45807) idle 25986 ? Ss 0:00 \_ postgres: cms cms 10.0.0.18(45808) idle It seems that the backend is stuck in some kind of endless loop. Since it's a production Debian server, the backend is not compiled with debugging turned on. The best I can do is ltrace it, in hopes of someone recognising the infinite sequence. Ltracing the 12853 process reveals the following libc calls: Label a: [pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0 [pid 12853] strlen("ExecutorState") = 13 [pid 12853] strcpy(0x84fb0e8, "ExecutorState") = 0x84fb0e8 [pid 12853] malloc(8192) = 0x860da48 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fb060, "ExprContext") = 0x84fb060 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fbda0, "ExprContext") = 0x84fbda0 [pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498 [pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x85cd810, "ExprContext") = 0x85cd810 [pid 12853] malloc(8256) = 0x8600a28 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738 [pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0 [pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858 [pid 12853] strlen("SPI TupTable") = 12 [pid 12853] strcpy(0x85cd1b0, "SPI TupTable") = 0x85cd1b0 [pid 12853] malloc(8192) = 0x860fa50 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x84fb088) = 0x860e738 [pid 12853] free(0x8600a28) = <void> [pid 12853] free(0x860da48) = <void> [pid 12853] free(0x860fa50) = <void> [pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0 [pid 12853] strlen("ExecutorState") = 13 [pid 12853] strcpy(0x85cd1b0, "ExecutorState") = 0x85cd1b0 [pid 12853] malloc(8192) = 0x860da48 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fb0e8, "ExprContext") = 0x84fb0e8 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fb060, "ExprContext") = 0x84fb060 [pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498 [pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fbda0, "ExprContext") = 0x84fbda0 [pid 12853] malloc(8256) = 0x8600a28 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738 [pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0 [pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858 [pid 12853] strlen("SPI TupTable") = 12 [pid 12853] strcpy(0x85cd810, "SPI TupTable") = 0x85cd810 [pid 12853] malloc(8192) = 0x860fa50 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x85cd150) = 0x860e738 [pid 12853] free(0x8600a28) = <void> [pid 12853] free(0x860da48) = <void> [pid 12853] free(0x860fa50) = <void> [pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0 [pid 12853] strlen("ExecutorState") = 13 [pid 12853] strcpy(0x85cd810, "ExecutorState") = 0x85cd810 [pid 12853] malloc(8192) = 0x860da48 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x85cd1b0, "ExprContext") = 0x85cd1b0 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fb0e8, "ExprContext") = 0x84fb0e8 [pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498 [pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fb060, "ExprContext") = 0x84fb060 [pid 12853] malloc(8256) = 0x8600a28 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738 [pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0 [pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858 [pid 12853] strlen("SPI TupTable") = 12 [pid 12853] strcpy(0x84fbda0, "SPI TupTable") = 0x84fbda0 [pid 12853] malloc(8192) = 0x860fa50 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x85cd7b0) = 0x860e738 [pid 12853] free(0x8600a28) = <void> [pid 12853] free(0x860da48) = <void> [pid 12853] free(0x860fa50) = <void> [pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0 [pid 12853] strlen("ExecutorState") = 13 [pid 12853] strcpy(0x84fbda0, "ExecutorState") = 0x84fbda0 [pid 12853] malloc(8192) = 0x860da48 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x85cd810, "ExprContext") = 0x85cd810 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x85cd1b0, "ExprContext") = 0x85cd1b0 [pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498 [pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fb0e8, "ExprContext") = 0x84fb0e8 [pid 12853] malloc(8256) = 0x8600a28 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738 [pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0 [pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858 [pid 12853] strlen("SPI TupTable") = 12 [pid 12853] strcpy(0x84fb060, "SPI TupTable") = 0x84fb060 [pid 12853] malloc(8192) = 0x860fa50 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x84fbd40) = 0x860e738 [pid 12853] free(0x8600a28) = <void> [pid 12853] free(0x860da48) = <void> [pid 12853] free(0x860fa50) = <void> [pid 12853] __sigsetjmp(0xbfec7000, 0, 0x852e740, 0xb5b3fc16, 0x85e9228) = 0 [pid 12853] strlen("ExecutorState") = 13 [pid 12853] strcpy(0x84fb060, "ExecutorState") = 0x84fb060 [pid 12853] malloc(8192) = 0x860da48 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x84fbda0, "ExprContext") = 0x84fbda0 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x85cd810, "ExprContext") = 0x85cd810 [pid 12853] strncpy(0x860e498, "parent", 64) = 0x860e498 [pid 12853] strncpy(0x860e500, "sid", 64) = 0x860e500 [pid 12853] strlen("ExprContext") = 11 [pid 12853] strcpy(0x85cd1b0, "ExprContext") = 0x85cd1b0 [pid 12853] malloc(8256) = 0x8600a28 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0, 0x82eada9) = 0x860e738 [pid 12853] strncpy(0x860e7f0, "parent", 64) = 0x860e7f0 [pid 12853] strncpy(0x860e858, "sid", 64) = 0x860e858 [pid 12853] strlen("SPI TupTable") = 12 [pid 12853] strcpy(0x84fb0e8, "SPI TupTable") = 0x84fb0e8 [pid 12853] malloc(8192) = 0x860fa50 [pid 12853] memmove(0x860e738, 0x860e600, 40, 0x830fab4, 0x84fb000) = 0x860e738 [pid 12853] free(0x8600a28) = <void> [pid 12853] free(0x860da48) = <void> [pid 12853] free(0x860fa50) = <void> Jump back to label a:, repeat ad infinitum, chewing up 100% CPU Below a complete process list of all running backends: 27256 ? Ss 0:04 /usr/lib/postgresql/8.3/bin/postgres -D /var/lib/post 27299 ? Ss 0:00 \_ postgres: writer process 27300 ? Ss 0:00 \_ postgres: wal writer process 27301 ? Ss 0:00 \_ postgres: autovacuum launcher process 27302 ? Ss 0:26 \_ postgres: stats collector process 12076 ? Rs 5681:50 \_ postgres: cms cms 10.0.0.5(59125) PARSE 12853 ? Ts 5549:25 \_ postgres: cms cms 10.0.0.5(48437) PARSE 25985 ? Ss 0:00 \_ postgres: cms cms 10.0.0.18(45807) idle 25986 ? Ss 0:00 \_ postgres: cms cms 10.0.0.18(45808) idle 21100 ? Ss 0:00 \_ postgres: cms cms 10.0.0.12(45544) idle 21101 ? Ss 0:00 \_ postgres: cms cms 10.0.0.12(45545) idle 26777 ? Ss 0:00 \_ postgres: cms cms 10.0.0.12(43123) idle 24711 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(34684) idle 24715 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(34687) idle 24716 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(34688) idle 24721 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(34693) idle 8355 ? Ss 0:01 \_ postgres: cms cms10.0.0.5(59188) idle 13504 ? Ss 0:05 \_ postgres: cms cms 10.0.0.5(35244) idle 17568 ? Ss 0:01 \_ postgres: cms cms 10.0.0.5(50414) idle 17615 ? Ss 0:02 \_ postgres: cms cms 10.0.0.5(42313) idle 17617 ? Ss 0:01 \_ postgres: cms cms 10.0.0.5(42315) idle 20412 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43902) idle 20425 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43905) idle 20427 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43907) idle 20428 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43908) idle 20429 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43909) idle 20430 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43910) idle 20431 ? Ss 0:00 \_ postgres: cms cms 10.0.0.5(43911) idle -- Sincerely, Stephen R. van den Berg. "Very funny, Mr. Scott. Now beam down my clothes!"
pgsql-hackers by date: