RE: FW: query pg_stat_ssl hang 100%cpu - Mailing list pgsql-bugs
From | James Pang (chaolpan)
---|---
Subject | RE: FW: query pg_stat_ssl hang 100%cpu
Date |
Msg-id | PH0PR11MB5191FAF4D1FF00F918018F5ED6EEA@PH0PR11MB5191.namprd11.prod.outlook.com
In response to | RE: FW: query pg_stat_ssl hang 100%cpu ("James Pang (chaolpan)" <chaolpan@cisco.com>)
Responses | Re: FW: query pg_stat_ssl hang 100%cpu
List | pgsql-bugs
Yes, this backend has been on the same call stack, at 100% CPU, for tens of hours. It's still hung there now, but I cannot reproduce it in another, similar environment. From pg_stat_activity I found this query started a transaction ("xact_start" of "2023-09-03 02:36:23"), no idea why.

-----Original Message-----
From: Michael Paquier <michael@paquier.xyz>
Sent: Thursday, September 7, 2023 12:05 PM
To: James Pang (chaolpan) <chaolpan@cisco.com>
Cc: PostgreSQL mailing lists <pgsql-bugs@lists.postgresql.org>
Subject: Re: FW: query pg_stat_ssl hang 100%cpu

On Thu, Sep 07, 2023 at 01:35:00AM +0000, James Pang (chaolpan) wrote:
> PG v14.8, OS RHEL8, no SSL enabled in this database. We have a lot of
> client sessions that check their SSL state with this query; all other
> sessions complete very quickly, but one session has hung at 100% CPU
> for tens of hours, and even pg_terminate_backend does not stop it.
> It looks abnormal.
>
> select ssl from pg_stat_ssl where pid=pg_backend_pid();

This is hard to act on without more details or even a reproducible and self-contained test case. Even a Java script based on the JDBC driver would be OK for me, for example, if it helps digging into what you are seeing.

> #0 ensure_record_cache_typmod_slot_exists (typmod=0) at typcache.c:1714
> #1 0x000000000091185b in assign_record_type_typmod (tupDesc=<optimized out>, tupDesc@entry=0x27bc738) at typcache.c:2001
> #2 0x000000000091df03 in internal_get_result_type (funcid=<optimized out>, call_expr=<optimized out>, rsinfo=<optimized out>, resultTypeId=<optimized out>, resultTupleDesc=0x7ffc9dff8cd0) at funcapi.c:393
> #3 0x000000000091e263 in get_expr_result_type (expr=expr@entry=0x2792798, resultTypeId=resultTypeId@entry=0x7ffc9dff8ccc, resultTupleDesc=resultTupleDesc@entry=0x7ffc9dff8cd0) at funcapi.c:230
> #4 0x00000000006a2fa5 in ExecInitFunctionScan (node=node@entry=0x273afa8, estate=estate@entry=0x269e948, eflags=eflags@entry=16) at nodeFunctionscan.c:370
> #5 0x000000000069084e in ExecInitNode (node=node@entry=0x273afa8, estate=estate@entry=0x269e948, eflags=eflags@entry=16) at execProcnode.c:255
> #6 0x000000000068a96d in InitPlan (eflags=16, queryDesc=0x273b2d8) at execMain.c:936
> #7 standard_ExecutorStart (queryDesc=0x273b2d8, eflags=16) at execMain.c:263
> #8 0x00007f67c2821d5d in pgss_ExecutorStart (queryDesc=0x273b2d8, eflags=<optimized out>) at pg_stat_statements.c:965
> #9 0x00000000007fc226 in PortalStart (portal=portal@entry=0x26848b8, params=params@entry=0x0, eflags=eflags@entry=0, snapshot=snapshot@entry=0x0) at pquery.c:514
> #10 0x00000000007fa27f in exec_bind_message (input_message=0x7ffc9dff90d0) at postgres.c:1995
> #11 PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7ffc9dff9370, dbname=<optimized out>, username=<optimized out>) at postgres.c:4552
> #12 0x000000000077a4ea in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4537
> #13 BackendStartup (port=<optimized out>) at postmaster.c:4259
> #14 ServerLoop () at postmaster.c:1745
> #15 0x000000000077b363 in PostmasterMain (argc=argc@entry=5, argv=argv@entry=0x256abc0) at postmaster.c:1417
> #16 0x00000000004fec63 in main (argc=5, argv=0x256abc0) at main.c:209

This stack is referring to a code path where we are checking that some of the type-related data associated with a record is around, but it does not say exactly where the loop happens, so...
Are we looking at a loop in the execution of the function from which the information of pg_stat_ssl is retrieved (aka pg_stat_get_activity())? Or is the type cache somewhat broken because of the extended query protocol? It's not really possible to see any evidence based on the information provided, though it gives a few hints that can help.

FWIW, I've not heard about an issue like that in the field. The first thing I would do is update to 14.9, which is the latest version of Postgres available for this major version.
--
Michael
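[Editor's note: as a starting point for the self-contained test case Michael asks for, here is a minimal JDBC sketch of the workload James describes — several sessions repeatedly checking their own SSL state via PreparedStatement, which drives the extended-protocol bind path (exec_bind_message -> PortalStart) visible in the stack trace. The class name, connection URL, credentials, and loop counts are hypothetical placeholders, not from the thread.]

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PgStatSslRepro {
    // Hypothetical connection details; adjust for the environment under test.
    private static final String URL = "jdbc:postgresql://localhost:5432/postgres";
    private static final String USER = "postgres";
    private static final String PASSWORD = "secret";

    public static void main(String[] args) {
        // Spawn several client sessions, as in the reported workload.
        for (int i = 0; i < 8; i++) {
            new Thread(PgStatSslRepro::checkSslLoop).start();
        }
    }

    private static void checkSslLoop() {
        // PreparedStatement makes pgjdbc use the extended query protocol;
        // after a few executions it switches to a named server-side
        // prepared statement, exercising the bind path from the trace.
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             PreparedStatement ps = conn.prepareStatement(
                     "select ssl from pg_stat_ssl where pid = pg_backend_pid()")) {
            for (int i = 0; i < 1000; i++) {
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        System.out.println(Thread.currentThread().getName()
                                + ": ssl=" + rs.getBoolean(1));
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}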