PostgreSQL as advanced job queuing system - Mailing list pgsql-general
From: ushi
Subject: PostgreSQL as advanced job queuing system
Msg-id: 90ee725f-d003-4838-97ec-e7e4149ac75b@mailbox.org
List: pgsql-general
Hello List,

i am playing with the idea to implement a job queuing system using PostgreSQL. To meet my requirements, the system needs to offer some advanced features compared to "classic" queuing systems:

- users can create new queues at any time
- queues have priorities
- priorities of queues can change at any time
- jobs in queues with the highest priority should be processed first

A simple schema could look like this:

create table queues (
    id integer primary key,
    priority integer not null default 0
);

create table jobs (
    id serial primary key,
    queue_id integer not null references queues(id)
);

Right now i am stuck writing an efficient query for dequeuing. This is the test data i am working with:

insert into queues (id, priority) values (1, 0), (2, 1), (3, 1);
insert into jobs (queue_id) select 1 from generate_series(1, 1000000);
insert into jobs (queue_id) select 2 from generate_series(1, 1000000);
insert into jobs (queue_id) select 3 from generate_series(1, 1000000);

Here is a simple dequeuing query that is super efficient, but ignores priority:

select * from jobs limit 1 for update of jobs skip locked;
--  id | queue_id
-- ----+----------
--   1 |        1
-- (1 row)
--
-- Time: 2.641 ms

This is my naive query obeying priority. It is very slow, and i could not get it any faster:

select *
from jobs
join queues on queues.id = jobs.queue_id
order by priority desc
limit 1
for update of jobs skip locked;
--    id    | queue_id | id | priority
-- ---------+----------+----+----------
--  1000001 |        2 |  2 |        1
-- (1 row)
--
-- Time: 1116.631 ms (00:01.117)

Here is the query plan:

--                                                                QUERY PLAN
-- -----------------------------------------------------------------------------------------------------------------------------------------
--  Limit  (cost=66225.61..66225.63 rows=1 width=28) (actual time=1333.139..1333.142 rows=1 loops=1)
--    ->  LockRows  (cost=66225.61..103725.61 rows=3000000 width=28) (actual time=1333.137..1333.139 rows=1 loops=1)
--          ->  Sort  (cost=66225.61..73725.61 rows=3000000 width=28) (actual time=1333.110..1333.112 rows=1 loops=1)
--                Sort Key: queues.priority DESC
--                Sort Method: external merge  Disk: 111568kB
--                ->  Hash Join  (cost=60.85..51225.61 rows=3000000 width=28) (actual time=0.064..660.317 rows=3000000 loops=1)
--                      Hash Cond: (jobs.queue_id = queues.id)
--                      ->  Seq Scan on jobs  (cost=0.00..43275.00 rows=3000000 width=14) (actual time=0.027..253.868 rows=3000000 loops=1)
--                      ->  Hash  (cost=32.60..32.60 rows=2260 width=14) (actual time=0.021..0.022 rows=3 loops=1)
--                            Buckets: 4096  Batches: 1  Memory Usage: 33kB
--                            ->  Seq Scan on queues  (cost=0.00..32.60 rows=2260 width=14) (actual time=0.011..0.013 rows=3 loops=1)
--  Planning Time: 0.430 ms
--  Execution Time: 1347.448 ms
-- (13 rows)

Any ideas for a more efficient implementation? (Two rough directions i have been toying with myself are sketched in the PS below.)

Thank you for your time,

ushi
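PS: For discussion, here is a rough, unbenchmarked sketch of one direction: probe each queue with a LATERAL subquery, so the planner can take one indexed shortcut per queue instead of sorting all jobs. The composite index below is an assumption on top of the schema above, and its name is made up:

create index jobs_queue_id_id_idx on jobs (queue_id, id);

select j.id, j.queue_id
from queues q
cross join lateral (
    -- first unlocked job of this one queue, found via the index
    select id, queue_id
    from jobs
    where jobs.queue_id = q.id
    order by id
    limit 1
    for update skip locked
) j
order by q.priority desc
limit 1;

One caveat i can already see: this locks one candidate row per queue until the transaction ends, which could start to matter once there are many queues.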
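A second sketch, equally unbenchmarked: denormalize the priority onto the jobs table, so a single composite index can drive the dequeue directly (again, the column and index names are made up):

alter table jobs add column priority integer not null default 0;
create index jobs_priority_id_idx on jobs (priority desc, id);

-- dequeue becomes a plain index scan again
select *
from jobs
order by priority desc, id
limit 1
for update of jobs skip locked;

The obvious catch: changing a queue's priority then means "update jobs set priority = ... where queue_id = ...", i.e. rewriting every pending job in that queue.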