Thread: [HACKERS] WIP Patch: Precalculate stable functions
Hello everyone! Now in Postgresql only immutable functions are precalculated; stable functions are calculated for every row so in fact they don't differ from volatile functions. There's a proposal to precalculate stable and immutable functions (= calculate once for all output rows, but as many times as function is mentioned in query), if they don't return a set and their arguments are constants or recursively precalculated functions. The same for operators' functions, strict functions, tracking functions. It can be very effective, for example, there's a comparison for full text search in messages (Intel® Core™ i5-6500 CPU @ 3.20GHz × 4, RAM 8Gb): Without precalculation: EXPLAIN (ANALYZE TRUE, BUFFERS TRUE) SELECT COUNT(*) FROM messages WHERE body_tsvector @@ to_tsquery('postgres'); QUERY PLAN ------------------------------------------------------------------------------------------------------ ------------------------------------ Aggregate (cost=18714.82..18714.83 rows=1 width=8) (actual time=2275.334..2275.334 rows=1 loops=1) Buffers: shared hit=309234 read=184261 -> Bitmap Heap Scan on messages (cost=66.93..18702.34rows=4991 width=0) (actual time=70.661..224 7.462 rows=151967 loops=1) Recheck Cond: (body_tsvector @@ to_tsquery('postgres'::text)) Rows Removed byIndex Recheck: 118531 Heap Blocks: exact=56726 lossy=33286 Buffers: shared hit=309234 read=184261 -> Bitmap Index Scan on message_body_idx (cost=0.00..65.68 rows=4991 width=0) (actual time= 54.599..54.599 rows=151967 loops=1) Index Cond: (body_tsvector @@ to_tsquery('postgres'::text)) Buffers: shared hit=1 read=37 Planning time: 0.493 ms Execution time: 2276.412ms (12 rows) With precalculation: EXPLAIN (ANALYZE TRUE, BUFFERS TRUE) SELECT COUNT(*) FROM messages WHERE body_tsvector @@ to_tsquery('postgres'); QUERY PLAN ------------------------------------------------------------------------------------------------------ ---------------------------------------- Aggregate (cost=192269.70..192269.71 rows=1 width=8) (actual time=1458.679..1458.680 rows=1 loops=1) Buffers: shared hit=309234 read=184261 -> Bitmap Heap Scan on messages (cost=1445.68..191883.51 rows=154474 width=0) (actual time=70.069 ..1433.999 rows=151967 loops=1) Recheck Cond: (body_tsvector @@ to_tsquery('postgres'::text)) Rows Removedby Index Recheck: 118531 Heap Blocks: exact=56726 lossy=33286 Buffers: shared hit=309234 read=184261 -> Bitmap Index Scan on message_body_idx (cost=0.00..1406.81 rows=154474 width=0) (actual t ime=56.149..56.149 rows=151967 loops=1) Index Cond: (body_tsvector @@ to_tsquery('postgres'::text)) Buffers: shared hit=1 read=37 Planning time: 1.644 ms Execution time: 1459.836ms (12 rows) Patch is attached. It isn't done yet: - changing documentation (partly because of next lines); - precalculation of expressions IS DISTINCT FROM and NULLIF which use nonvolatile equality operators; - precalculation of expressions "scalar op ANY/ALL (array)" which use nonvolatile operators; - precalculation of row compare expressions which use nonvolatile operators. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com +7 926 92 00 265
Sorry, attached patch. -------- Исходное сообщение -------- Тема: WIP Patch: Precalculate stable functions Дата: 20-04-2017 19:56 От: Marina Polyakova <m.polyakova@postgrespro.ru> Кому: pgsql-hackers@postgresql.org Hello everyone! Now in Postgresql only immutable functions are precalculated; stable functions are calculated for every row so in fact they don't differ from volatile functions. There's a proposal to precalculate stable and immutable functions (= calculate once for all output rows, but as many times as function is mentioned in query), if they don't return a set and their arguments are constants or recursively precalculated functions. The same for operators' functions, strict functions, tracking functions. It can be very effective, for example, there's a comparison for full text search in messages (Intel® Core™ i5-6500 CPU @ 3.20GHz × 4, RAM 8Gb): Without precalculation: EXPLAIN (ANALYZE TRUE, BUFFERS TRUE) SELECT COUNT(*) FROM messages WHERE body_tsvector @@ to_tsquery('postgres'); QUERY PLAN ------------------------------------------------------------------------------------------------------ ------------------------------------ Aggregate (cost=18714.82..18714.83 rows=1 width=8) (actual time=2275.334..2275.334 rows=1 loops=1) Buffers: shared hit=309234 read=184261 -> Bitmap Heap Scan on messages (cost=66.93..18702.34 rows=4991 width=0) (actual time=70.661..224 7.462 rows=151967 loops=1) Recheck Cond: (body_tsvector @@ to_tsquery('postgres'::text)) Rows Removed by Index Recheck: 118531 Heap Blocks: exact=56726 lossy=33286 Buffers: shared hit=309234 read=184261 -> Bitmap Index Scan on message_body_idx (cost=0.00..65.68 rows=4991 width=0) (actual time= 54.599..54.599 rows=151967 loops=1) Index Cond: (body_tsvector @@ to_tsquery('postgres'::text)) Buffers: shared hit=1 read=37 Planning time: 0.493 ms Execution time: 2276.412 ms (12 rows) With precalculation: EXPLAIN (ANALYZE TRUE, BUFFERS TRUE) SELECT COUNT(*) FROM messages WHERE body_tsvector @@ to_tsquery('postgres'); QUERY PLAN ------------------------------------------------------------------------------------------------------ ---------------------------------------- Aggregate (cost=192269.70..192269.71 rows=1 width=8) (actual time=1458.679..1458.680 rows=1 loops=1) Buffers: shared hit=309234 read=184261 -> Bitmap Heap Scan on messages (cost=1445.68..191883.51 rows=154474 width=0) (actual time=70.069 ..1433.999 rows=151967 loops=1) Recheck Cond: (body_tsvector @@ to_tsquery('postgres'::text)) Rows Removed by Index Recheck: 118531 Heap Blocks: exact=56726 lossy=33286 Buffers: shared hit=309234 read=184261 -> Bitmap Index Scan on message_body_idx (cost=0.00..1406.81 rows=154474 width=0) (actual t ime=56.149..56.149 rows=151967 loops=1) Index Cond: (body_tsvector @@ to_tsquery('postgres'::text)) Buffers: shared hit=1 read=37 Planning time: 1.644 ms Execution time: 1459.836 ms (12 rows) Patch is attached. It isn't done yet: - changing documentation (partly because of next lines); - precalculation of expressions IS DISTINCT FROM and NULLIF which use nonvolatile equality operators; - precalculation of expressions "scalar op ANY/ALL (array)" which use nonvolatile operators; - precalculation of row compare expressions which use nonvolatile operators. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com +7 926 92 00 265 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Marina Polyakova <m.polyakova@postgrespro.ru> writes: > Now in Postgresql only immutable functions are precalculated; stable > functions are calculated for every row so in fact they don't differ from > volatile functions. > There's a proposal to precalculate stable and immutable functions (= > calculate once for all output rows, but as many times as function is > mentioned in query), if they don't return a set and their arguments are > constants or recursively precalculated functions. Have you looked at the previous efforts in this direction? The last discussion I can find is https://www.postgresql.org/message-id/flat/CABRT9RA-RomVS-yzQ2wUtZ%3Dm-eV61LcbrL1P1J3jydPStTfc6Q%40mail.gmail.com In particular, that relied on the planner to decide which subtrees were worth caching and insert marker nodes for the purpose. I'm not certain that that's better than putting the intelligence into execExpr.c, but I'm not sure it isn't either. In principle we could afford to spend more effort on making such determinations at plan time than we should do at executor startup. Also, the fundamental implementation seemed less invasive, in that only the marker node type had to know about the caching behavior, whereas I gather from your description that what you are doing is going to end up touching almost all node types. v10's new expression eval technology is sufficiently different that it may well be that that old approach isn't very relevant anymore. But it would be a good idea to look. regards, tom lane