Re: [HACKERS] Passing values to a dynamic background worker - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: [HACKERS] Passing values to a dynamic background worker |
Date | |
Msg-id | 20170418.181238.125943501.horiguchi.kyotaro@lab.ntt.co.jp |
In response to | [HACKERS] Passing values to a dynamic background worker (Keith Fiske <keith@omniti.com>) |
Responses | Re: [HACKERS] Passing values to a dynamic background worker |
List | pgsql-hackers |
Hello,

At Mon, 17 Apr 2017 16:19:13 -0400, Keith Fiske <keith@omniti.com> wrote in <CAG1_KcAFJ60pac_QnnZX0qeO12NENiPOcohuoQvs297WaT_ObQ@mail.gmail.com>
> So after reading a recent thread on the steep learning curve for PG
> internals [1], I figured I'd share where I've gotten stuck with this in a
> new thread vs hijacking that one.
>
> One of the goals I had with pg_partman was to see if I could get the
> partitioning python scripts redone as C functions using a dynamic
> background worker to be able to commit in batches with a single call. My
> thinking was to have a user-function that can accept arguments for things
> like the interval value, batch size, and other arguments to the python
> script, then start/stop a dynamic bgw up for each batch so it can commit
> after each one. The dynamic bgw would essentially just have to call the
> already existing partition_data() plpgsql function, but I have to be able
> to pass the argument values that the user gave down into the dynamic bgw.
>
> I've reached a roadblock in that bgw_main_arg can only accept a single
> argument that must be passed by value for a dynamic bgw. I already worked
> around this for passing the database name to my existing use of a bgw
> with doing partition maintenance (pass a simple integer to use as an index
> array value). But I'm not sure how to do this for passing multiple values
> in. I'm assuming this would be the place where I'd see about storing values
> in shared memory to be able to re-use later? I'm not even sure if that's
> the right approach, and if it is, where to even start to understand how to
> do that.

I think you are on the right track; shared memory is the way to do that.
There are two ways to acquire a shared memory area for this purpose. One
is static shared memory, which lives alongside shared_buffers, and the
other is dynamic shared memory (DSM). If you need a memory segment of a
fixed size, the former will work. If you need an amount that is not known
in advance, DSM will work.
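For the use case in the question (an interval, a batch size and similar arguments), the values can be copied into one fixed-size block placed in the shared area. The following is only a minimal sketch in plain C that does not depend on the PostgreSQL headers; the struct PartitionArgs, its fields, and the helpers copy_field() and pack_args() are hypothetical names chosen for illustration, not part of pg_partman or PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical argument block.  Strings are stored inline as fixed-size
 * arrays, so the block contains no pointers into backend-local memory.
 */
typedef struct PartitionArgs
{
    char    parent_table[128];
    char    batch_interval[64];
    int32_t batch_count;
} PartitionArgs;

/*
 * Copy src, including its terminating NUL, into a fixed-size field.
 * Returns 0 on success, or -1 if src does not fit.
 */
int
copy_field(char *dst, size_t dstlen, const char *src)
{
    size_t len = strlen(src);

    if (len >= dstlen)
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}

/*
 * Fill the block at shared_area with the caller's values.
 * shared_area must be at least sizeof(PartitionArgs) bytes and suitably
 * aligned.  Returns 0 on success, or -1 if a string is too long.
 */
int
pack_args(void *shared_area, const char *parent_table,
          const char *batch_interval, int32_t batch_count)
{
    PartitionArgs *args = shared_area;

    assert(shared_area != NULL);

    if (copy_field(args->parent_table, sizeof(args->parent_table),
                   parent_table) != 0 ||
        copy_field(args->batch_interval, sizeof(args->batch_interval),
                   batch_interval) != 0)
        return -1;

    args->batch_count = batch_count;
    return 0;
}
```

In an extension, shared_area would be the address handed back by whichever shared memory mechanism is chosen; for trying the sketch out, any PartitionArgs variable works as the target.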
You can see how to use (static) shared memory in the following section
of the documentation; pg_stat_statements.c is also a good reference.
This kind of shared memory is guaranteed to be mapped at the same
address in every process, so pointers into it can be used.

https://www.postgresql.org/docs/devel/static/xfunc-c.html#idp83376336

On the other hand, AFAICS, DSM doesn't seem to be well documented. I
managed to find a related document in the Postgres Wiki, but it seems a
bit old.

https://wiki.postgresql.org/wiki/Parallel_Internal_Sort

DSM is a little more complex than static shared memory, and it is *not*
guaranteed to be mapped at the same address among workers. You can see
an example in LaunchParallelWorkers() and the related functions in
parallel.c. The basics of its usage are as follows.

- Create a segment:

    dsm_segment *seg = dsm_create(size, 0);

- Send its handle via bgw_main_arg:

    worker.bgw_main_arg = UInt32GetDatum(dsm_segment_handle(seg));

- Attach the memory on the other side:

    dsm_segment *seg = dsm_attach(DatumGetUInt32(main_arg));

On both sides, the address of the attached shared memory is obtained
with dsm_segment_address(seg). dsm_detach(seg) detaches the segment.
Once all users of the segment have detached from it, it is destroyed.

You might need some locking or notification mechanism. Usually the
mechanisms named LWLock and Latch are used for this purpose.

> Let alone in the context of how that would interact with the
> background worker system. If you look at my existing C code, you can see
> it's very simple and doesn't do much more than the worker_spi example. I've
> yet to have to interact with any memory contexts or such things, and as the
> referenced thread below mentions, doing so is quite a steep learning curve.
>
> Any guidance for a newer internals dev here would be great.
>
> 1.
> https://www.postgresql.org/message-id/CAH%3Dt1kqwCBF7J1bP0RjgsTcp-SaJaHrF4Yhb1iiQZMe3W-FX2w%40mail.gmail.com
>
> --
> Keith Fiske
> Database Administrator
> OmniTI Computer Consulting, Inc.
> http://www.keithf4.com

Good luck!

--
Kyotaro Horiguchi
NTT Open Source Software Center
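
The point above that a DSM segment is not guaranteed to be mapped at the same address in every worker can be illustrated outside PostgreSQL. The following stand-alone sketch is an analogy only: it uses plain POSIX mmap() on a temporary file instead of the dsm_* functions, and maps the same memory twice in one process to stand in for two processes. SegHeader, SEG_SIZE and all function names are hypothetical. The data is located by an offset from the segment base, which stays valid whatever address the mapping ends up at.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SEG_SIZE 4096

/* Hypothetical header at the start of the segment: offsets, not pointers. */
typedef struct SegHeader
{
    size_t msg_offset;      /* where the message starts, from the base */
    size_t msg_len;         /* message length including the NUL */
} SegHeader;

/* Create an unlinked temporary file of SEG_SIZE bytes; returns fd or -1. */
int
create_backing_file(void)
{
    FILE *fp = tmpfile();
    int   fd;

    if (fp == NULL)
        return -1;
    fd = dup(fileno(fp));       /* keep the file alive after fclose() */
    fclose(fp);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, SEG_SIZE) != 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the file; every call creates a separate mapping of the same memory. */
void *
map_segment(int fd)
{
    void *base = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);

    return (base == MAP_FAILED) ? NULL : base;
}

/* "Creating" side: store the message and record its offset in the header. */
int
write_message(void *base, const char *msg)
{
    SegHeader *hdr = base;
    size_t     len = strlen(msg) + 1;

    if (sizeof(SegHeader) + len > SEG_SIZE)
        return -1;
    hdr->msg_offset = sizeof(SegHeader);
    hdr->msg_len = len;
    memcpy((char *) base + hdr->msg_offset, msg, len);
    return 0;
}

/* "Attaching" side: resolve the offset against this mapping's own base. */
const char *
read_message(const void *base)
{
    const SegHeader *hdr = base;

    return (const char *) base + hdr->msg_offset;
}
```

Had the header stored a char * computed on the writing side, that pointer would refer to the writer's mapping and could not be relied on in another process; an offset avoids the problem.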