Re: [PATCH] More docs on what to do and not do in extension code - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: [PATCH] More docs on what to do and not do in extension code |
Date | |
Msg-id | CALj2ACVcfNR9KOgNidR0g1M-d9r+xCKDRXBsOgjD8DENa1hKDg@mail.gmail.com Whole thread Raw |
In response to | [PATCH] More docs on what to do and not do in extension code (Craig Ringer <craig.ringer@enterprisedb.com>) |
Responses |
Re: [PATCH] More docs on what to do and not do in extension code
|
List | pgsql-hackers |
On Mon, Jan 18, 2021 at 1:27 PM Craig Ringer <craig.ringer@enterprisedb.com> wrote: > Hi folks > > The attached patch expands the xfunc docs and bgworker docs a little, providing a starting point for developers to learnhow to do some common tasks the Postgres Way. > > It mentions in brief these topics: > > * longjmp() based exception handling with elog(ERROR), PG_CATCH() and PG_RE_THROW() etc > * Latches, spinlocks, LWLocks, heavyweight locks, condition variables > * shm, DSM, DSA, shm_mq > * syscache, relcache, relation_open(), invalidations > * deferred signal handling, CHECK_FOR_INTERRUPTS() > * Resource cleanup hooks and callbacks like on_exit, before_shmem_exit, the resowner callbacks, etc > * signal handling in bgworkers > > All very superficial, but all things I really wish I'd known a little about, or even that I needed to learn about, whenI started working on postgres. > > I'm not sure it's in quite the right place. I wonder if there should be a separate part of xfunc.sgml that covers the slightlymore advanced bits of postgres backend and function coding like this, lists relevant README files in the source tree,etc. > > I avoided going into details like how resource owners work. I don't want the docs to have to cover all that in detail;what I hope to do is start providing people with clear references to the right place in the code, READMEs, etc tolook when they need to understand specific topics. Thanks for the patch. Here are some comments: [1] background worker's main function, and must be unblocked by it; this is to allow the process to customize its signal handlers, if necessary. - Signals can be unblocked in the new process by calling - <function>BackgroundWorkerUnblockSignals</function> and blocked by calling - <function>BackgroundWorkerBlockSignals</function>. + It is important that all background workers set up and unblock signal + handling before they enter their main loops. Signal handling in background + workers is discussed separately in <xref linkend="bgworker-signals"/>. </para> IMO, we can retain the statement about BackgroundWorkerUnblockSignals and BackgroundWorkerBlockSignals, but mention the link to "bgworker-signals" for more details and move the statement "it's important to unblock signals before enter their main loop" to "bgworker-signals" section and we can also reason there the consequences if not done. [2] + interupt-aware APIs</link> for the purpose. Do not <function>usleep()</function>, + <function>system()</function>, make blocking system calls, etc. + </para> Is it "Do not use <function>usleep()</function>, <function>system()</function> or make blocking system calls etc." ? [3] IMO, we can remove following from "bgworker-signals" if we retain it where currently it is, as discussed in [1]. + Signals can be unblocked in the new process by calling + <function>BackgroundWorkerUnblockSignals</function> and blocked by calling + <function>BackgroundWorkerBlockSignals</function>. [4] Can we say + The default signal handlers set up for background workers <emphasis>do + default background worker signal handlers, it should call instead of + The default signal handlers installed for background workers <emphasis>do + default background worker signal handling it should call [5] IMO, we can have something like below + request, etc. Set up these handlers before unblocking signals as + shown below: instead of + request, etc. To install these handlers, before unblocking interrupts + run: [6] I think logs and errors either elog() or ereport can be used, so how about + Use <function>elog()</function> or <function>ereport()</function> for + logging output or raising errors instead of any direct stdio calls. instead of + Use <function>elog()</function> and <function>ereport()</function> for + logging output and raising errors instead of any direct stdio calls. [7] Can we use child processes instead of subprocess ? If okay in other places in the patch as well. + and should only use the main thread. PostgreSQL generally uses child processes + that coordinate over shared memory instead of threads - for instance, see + <xref linkend="bgworker"/>. instead of + and should only use the main thread. PostgreSQL generally uses subprocesses + that coordinate over shared memory instead of threads - see + <xref linkend="bgworker"/>. [8] Why should file descriptor manager API be used to execute subprocesses/child processes? + To execute subprocesses and open files, use the routines provided by + the file descriptor manager like <function>BasicOpenFile</function> + and <function>OpenPipeStream</function> instead of a direct [9] "should always be"? even if it's a blocking extesion, does it work? If our intention is to recommend the developers, maybe we should avoid using the term "should" in the patch in other places as well. + Extension code should always be structured as a non-blocking [10] I think it is + you should avoid using <function>sleep()</function> or <function>usleep()</function> instead of + you should <function>sleep()</function> or <function>usleep()</function> [11] I think it is + block if this happens. So cleanup of resources is not entirely managed by PostgreSQL, it + must be handled using appropriate callbacks provided by PostgreSQL instead of + block if this happens. So all cleanup of resources not already + managed by the PostgreSQL runtime must be handled using appropriate [12] I think it is + found in corresponding PostgreSQL header and source files. instead of + found in the PostgreSQL headers and sources. [13] I think it is + Use PostgreSQL runtime concurrency and synchronisation primitives + between the PostgreSQL processes. These include signals and ProcSignal multiplexed instead of + Use the PostgreSQL runtime's concurrency and synchronisation primitives + between PostgreSQL processes. These include signals and ProcSignal multiplexed [14] Is it "relation/table based state management"? + Sometimes relation-based state management for extensions is not [15] I think it is + use PostgreSQL shared-memory based inter-process communication instead of + use PostgreSQL's shared-memory based inter-process communication [16] I think it is + or shared memory message queues (<acronym>shm_mq</acronym>). Examples + usage of some of these features can be found in the + <filename>src/test/modules/test_shm_mq/</filename> sample extension. Others instead of + or shared memory message queues (<acronym>shm_mq</acronym>). Examples + of the use of some these features can be found in the + <filename>src/test/modules/test_shm_mq/</filename> example extension. Others [17] I think it is + syscache entries, as this can cause subtle bugs. See instead of + syscache cache entries, as this can cause subtle bugs. See With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
pgsql-hackers by date: