Threads - Mailing list pgsql-hackers
From | Shridhar Daithankar |
---|---|
Subject | Threads |
Date | |
Msg-id | 200301032054.11125.shridhar_daithankar@persistent.co.in |
List | pgsql-hackers |
Hi all,

I am sure many of you would like to delete this message before reading it. Hold on. :-)

There is much talk about threading on this list, and the idea is always deferred for want of robust thread models across all supported platforms, and because the gains are not clearly worth the effort required.

I think threads are useful in two different situations, namely parallelising blocking conditions and using multiple CPUs.

Attached is a framework that I ported to C from a C++ server I have written. It has a thread pool and a threads implementation based on pthreads. This code expects only a minimal pthreads implementation and does not assume anything about the threads themselves (e.g. whether they are kernel threads or not).

I request hackers on this list to take a look at it. It should be easily pluggable into any source code and is released without any strings attached, for any use.

This framework allows the worker function and its argument to be plugged in on the fly. The threads created are sleeping by default and can be woken up as and when required.

I propose to use it incrementally in PostgreSQL. Let's start with I/O. When a block of data is being read, rather than blocking on the read, we can set up a producer-consumer link between two threads. That way we can utilize the I/O time in an overlapped fashion.

Further, threads can be useful when the server has more CPUs. CPU-intensive work such as index creation or sorting can be spread across different threads. This way we can utilise idle CPUs, which we cannot do as of now.

There are many advantages that I can see.

1) Threads can be optionally turned on/off depending upon the configuration. So we can keep the existing functionality entirely and convert it to threaded code one piece at a time.

2) For each piece of functionality we can have two code branches: one that does not use threads, i.e. the current code base, and one that can use threads. Agreed, the binary will be a bit bloated, but that would give enormous flexibility. If we find a threaded implementation buggy, we simply switch it off, either at compile time or in configuration.

3) Not much effort should be required to plug code into this model. The idea of using threads is to assign exclusive work to each thread, so that should not require much locking.

In the case of using multiple CPUs, separate functions need to be written that can handle things in a thread-safe fashion. A merger function would also be required to merge the results of the worker threads. That would be entirely additional code.

I would say two threads per CPU per back-end should be a reasonable default, as that would cover I/O blocking well. Unless, of course, threading is turned off in the build or in configuration.

Please note that I have tested the code in C++ and my C is rusty. Quite likely there are bugs in the code. I will stress test the code on Monday, but I would like to seek an opinion on this as soon as possible. (Hey, but it compiles clean..)

If required I can post example usage of this code, but I don't think that should be necessary. :-)

Bye
 Shridhar
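
For readers without the attachment, here is a minimal sketch of the kind of pool described above: workers sleep on a condition variable until a function and argument are handed to them. It is not the attached framework. Every name in it (pool_t, pool_init, pool_submit, pool_wait, pool_destroy, square_job) is hypothetical, and the single-slot hand-off is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef void (*work_fn)(void *arg);

/* Hypothetical pool: one pending-job slot shared by all workers. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  wake;     /* signalled when a job is posted or on shutdown */
    pthread_cond_t  idle;     /* signalled when the slot empties or a job ends */
    work_fn         fn;       /* pending job, NULL when the slot is empty */
    void           *arg;
    int             running;  /* jobs currently executing */
    int             shutdown;
    int             nthreads;
    pthread_t      *threads;
} pool_t;

/* Worker loop: sleep until a job is posted, run it outside the lock. */
static void *worker(void *p)
{
    pool_t *pool = p;
    pthread_mutex_lock(&pool->lock);
    for (;;) {
        while (pool->fn == NULL && !pool->shutdown)
            pthread_cond_wait(&pool->wake, &pool->lock);
        if (pool->fn == NULL)
            break;                      /* shutting down, nothing pending */
        work_fn fn = pool->fn;
        void *arg = pool->arg;
        pool->fn = NULL;                /* free the slot for the next submit */
        pool->running++;
        pthread_cond_broadcast(&pool->idle);
        pthread_mutex_unlock(&pool->lock);
        fn(arg);
        pthread_mutex_lock(&pool->lock);
        pool->running--;
        pthread_cond_broadcast(&pool->idle);
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

/* Wake all workers, let them drain any pending job, and join them. */
void pool_destroy(pool_t *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->wake);
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->nthreads; i++)
        pthread_join(pool->threads[i], NULL);
    pthread_cond_destroy(&pool->idle);
    pthread_cond_destroy(&pool->wake);
    pthread_mutex_destroy(&pool->lock);
    free(pool->threads);
}

/* Start nthreads sleeping workers. Returns 0 on success, -1 on failure. */
int pool_init(pool_t *pool, int nthreads)
{
    pool->fn = NULL;
    pool->arg = NULL;
    pool->running = 0;
    pool->shutdown = 0;
    pool->nthreads = 0;
    pool->threads = malloc(sizeof(pthread_t) * (size_t)nthreads);
    if (pool->threads == NULL)
        return -1;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->wake, NULL);
    pthread_cond_init(&pool->idle, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&pool->threads[i], NULL, worker, pool) != 0) {
            pool_destroy(pool);         /* joins the workers started so far */
            return -1;
        }
        pool->nthreads++;
    }
    return 0;
}

/* Plug in a worker function and argument; blocks while the slot is full. */
void pool_submit(pool_t *pool, work_fn fn, void *arg)
{
    assert(fn != NULL);
    pthread_mutex_lock(&pool->lock);
    while (pool->fn != NULL)
        pthread_cond_wait(&pool->idle, &pool->lock);
    pool->fn = fn;
    pool->arg = arg;
    pthread_cond_signal(&pool->wake);
    pthread_mutex_unlock(&pool->lock);
}

/* Block until no job is pending and none is executing. */
void pool_wait(pool_t *pool)
{
    pthread_mutex_lock(&pool->lock);
    while (pool->fn != NULL || pool->running > 0)
        pthread_cond_wait(&pool->idle, &pool->lock);
    pthread_mutex_unlock(&pool->lock);
}

/* Example job: each call owns its own int, so no locking is needed. */
void square_job(void *arg)
{
    int *v = arg;
    *v = *v * *v;
}
```

A caller would run pool_init once, call pool_submit for each piece of exclusive work, call pool_wait before reading the results, and finish with pool_destroy. Because each job touches only its own argument, the "merger" step is simply reading the results after pool_wait returns.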