Thread: Re: PoC. The saving of the compiled jit-code in the plan cache

Re: PoC. The saving of the compiled jit-code in the plan cache

From

Matheus Alcantara

Date:

07 March, 01:02:39

Hi,

I've spent some time learning more about jit in the last few weeks and I
think that this patch could be very useful, thanks for working on this!
I'm new on this subject but I would like to share some thoughts about
it.

> 1. Changes in jit-code generation.
>
> a) the load of the absolute address (as const) changed to the load of this
> address from a struct member:
>
If I understood correctly, this change is required to avoid making the
jit code points to a memory address that was reallocated (making it
invalid)? If that's the case, would be possible to split this change
into a separated patch? It would help a lot to review.

It's not clear to me what's the difference between jit_context being set
to NULL or CACHED_JITCONTEXT_EMPTY and how it changes from NULL to
CACHED_JITCONTEXT_EMPTY. I think some code comments will help on this,
specially on llvm_compile_expr around if/else blocks.

> Updated patch rebased to the current master. Also I resolved the problems
> with the lookup of the compiled expressions.
>  Cached jit compiles only expressions from cached plan - they are
> recognized by memory context - if Expr is not NULL and belong to the same
> memory context as cached plan, this expression is compiled to the function
> with expression address in the name (expression address casted to Size
> type).
>  All other expression (generated later than plan, f.e. expressions in
> aggregates, hash join, hash aggregates) are not compiled and are executed
> by standard expression interpreter.
>
What's stopping us from going back to the current code generation
without caching on these scenarios? I'm a bit concerned for never jit
compile these kind of expressions and have some kind of performance
issue.

What's the relation of this exec time generated expressions (hash join,
hash agg) problem with the function name lookup issue? It's seems
different problems to me but I may be missing something.

If these problems is not fully correlated I think that it would be
better to have some kind of hash map or use the number of the call to
connect expressions with functions (as you did on v4) to lookup the
cached compiled function. IMHO it sounds a bit strange to me to add the
function pointer on the function name and have this memory context
validation.

I've also executed make check-world and it seems that
test_misc/003_check_guc is falling:

[11:29:42.995](1.153s) not ok 1 - no parameters missing from
postgresql.conf.sample
[11:29:42.995](0.000s) #   Failed test 'no parameters missing from
postgresql.conf.sample'
#   at src/test/modules/test_misc/t/003_check_guc.pl line 85.
[11:29:42.995](0.000s) #          got: '1'
#     expected: '0'
[11:29:42.995](0.000s) ok 2 - no parameters missing from guc_tables.c
[11:29:42.995](0.000s) ok 3 - no parameters marked as NOT_IN_SAMPLE in
postgresql.conf.sample
found GUC jit_cached in guc_tables.c, missing from postgresql.conf.sample

-- 
Matheus Alcantara

Re: PoC. The saving of the compiled jit-code in the plan cache

From

Vladlen Popolitov

Date:

07 March, 13:43:54

Matheus Alcantara писал(а) 2025-03-07 05:02:

Hi!

  Thank for interest to the patch.

>> 1. Changes in jit-code generation.
>> 
>> a) the load of the absolute address (as const) changed to the load of 
>> this
>> address from a struct member:
>> 
> If I understood correctly, this change is required to avoid making the
> jit code points to a memory address that was reallocated (making it
> invalid)? If that's the case, would be possible to split this change
> into a separated patch? It would help a lot to review.
Yes, you correct. In this patch the value of the pointers initialised
every time, when code called for new structure, to reffer to new 
address.
I did not put it into separate patch, because now I see all changes
in the system are interconnected and require to change many files.
It is hard to maintain, at least now.

Also I did it to check the possibility to implement it and check, is it
really better. During implementation I understood, that the design must
be changed and clarified, it looks dirty now.

> 
> It's not clear to me what's the difference between jit_context being 
> set
> to NULL or CACHED_JITCONTEXT_EMPTY and how it changes from NULL to
> CACHED_JITCONTEXT_EMPTY. I think some code comments will help on this,
> specially on llvm_compile_expr around if/else blocks.
  When llvm_compile_expr is called, I need to know, is it compilation for
cache or usual compilation. I use jit_context for this. It is always
initialised by NULL in palloc0(). If plan is saved to cache,
jit_context assigned CACHED_JITCONTEXT_EMPTY - it means, jit also must
me saved in cache, but not saved yet.
> 
>> Updated patch rebased to the current master. Also I resolved the 
>> problems
>> with the lookup of the compiled expressions.
>>  Cached jit compiles only expressions from cached plan - they are
>> recognized by memory context - if Expr is not NULL and belong to the 
>> same
>> memory context as cached plan, this expression is compiled to the 
>> function
>> with expression address in the name (expression address casted to Size
>> type).
>>  All other expression (generated later than plan, f.e. expressions in
>> aggregates, hash join, hash aggregates) are not compiled and are 
>> executed
>> by standard expression interpreter.
>> 
> What's stopping us from going back to the current code generation
> without caching on these scenarios? I'm a bit concerned for never jit
> compile these kind of expressions and have some kind of performance
> issue.
> 
I thought about this. The reason to store the jit in cache - avoid
compiling every time when query is executed. If we anyway compile
the code even for 1 function, we lose benefits of cached jit code.
We can choose the standard jit compilation for other functions.

> What's the relation of this exec time generated expressions (hash join,
> hash agg) problem with the function name lookup issue? It's seems
> different problems to me but I may be missing something.
This question and next question are correlated.
> If these problems is not fully correlated I think that it would be
> better to have some kind of hash map or use the number of the call to
> connect expressions with functions (as you did on v4) to lookup the
> cached compiled function. IMHO it sounds a bit strange to me to add the
> function pointer on the function name and have this memory context
> validation.
  This function lookup the biggest challenge in this patch, I see
it correct.
  In current implementation: the function is generated for expression,
the name of the function is saved in the ExprState,
then function is compiled (when first expression is executed),
then the function is called,
lookup by saved address is done to find compiled code,
the compiled code address is saved in ExprState as new execution code,
the compiled code is called.


  In cached implementation when expression is called the next time,
we need to find the link among the expression and the compiled code.
I tried to find reliable way to connect them, and see now only one
version - make compilation only for expressions saved with the plan -
they have the same address across the time and the same memory context
as the saved plan. Expressions generated during execution like
aggregate expression and Hash join expression are allocated in
query execution context and always have different addresses, and I did
not find any way to connect them with cached plan. The have "expr"
member, but it can be NULL, T_List or T_Expr (or anything, what
Custom node can assign to it).
  Function name with expression address now looks as reliable way to find
compiled function, though not elegant. Standard implementation just 
stores
generated name, uses it and pfree() it.

> I've also executed make check-world and it seems that
> test_misc/003_check_guc is falling:
> 
> [11:29:42.995](1.153s) not ok 1 - no parameters missing from
> postgresql.conf.sample
> [11:29:42.995](0.000s) #   Failed test 'no parameters missing from
> postgresql.conf.sample'
> #   at src/test/modules/test_misc/t/003_check_guc.pl line 85.
> [11:29:42.995](0.000s) #          got: '1'
> #     expected: '0'
> [11:29:42.995](0.000s) ok 2 - no parameters missing from guc_tables.c
> [11:29:42.995](0.000s) ok 3 - no parameters marked as NOT_IN_SAMPLE in
> postgresql.conf.sample
> found GUC jit_cached in guc_tables.c, missing from 
> postgresql.conf.sample

I tested in with make check and make installcheck . In v8 I found bugs, 
but
not published fix yet.



-- 
Best regards,

Vladlen Popolitov.

Re: PoC. The saving of the compiled jit-code in the plan cache

From

Matheus Alcantara

Date:

18 March, 17:56:51

On Fri, Mar 7, 2025 at 7:43 AM Vladlen Popolitov
<v.popolitov@postgrespro.ru> wrote:
> >
> >> Updated patch rebased to the current master. Also I resolved the
> >> problems
> >> with the lookup of the compiled expressions.
> >>  Cached jit compiles only expressions from cached plan - they are
> >> recognized by memory context - if Expr is not NULL and belong to the
> >> same
> >> memory context as cached plan, this expression is compiled to the
> >> function
> >> with expression address in the name (expression address casted to Size
> >> type).
> >>  All other expression (generated later than plan, f.e. expressions in
> >> aggregates, hash join, hash aggregates) are not compiled and are
> >> executed
> >> by standard expression interpreter.
> >>
> > What's stopping us from going back to the current code generation
> > without caching on these scenarios? I'm a bit concerned for never jit
> > compile these kind of expressions and have some kind of performance
> > issue.
> >
> I thought about this. The reason to store the jit in cache - avoid
> compiling every time when query is executed. If we anyway compile
> the code even for 1 function, we lose benefits of cached jit code.
> We can choose the standard jit compilation for other functions.
>
I don't think that I get your point here - Why we would lose the
benefits? I think that I may not be clear with my question; My point
was, why we can't jit compile, without caching, these expressions that
are generated later on the plan? Like aggregates, hash join, as you
mention. I think that we may have scenarios that jit compile is worth
even if we will not cache the compiled code. I understand the limitation
of runtime generated ExprState being allocated in another memory
context, but why can't we follow the same path as we have today for
these scenarios?

> > What's the relation of this exec time generated expressions (hash join,
> > hash agg) problem with the function name lookup issue? It's seems
> > different problems to me but I may be missing something.
> This question and next question are correlated.
> > If these problems is not fully correlated I think that it would be
> > better to have some kind of hash map or use the number of the call to
> > connect expressions with functions (as you did on v4) to lookup the
> > cached compiled function. IMHO it sounds a bit strange to me to add the
> > function pointer on the function name and have this memory context
> > validation.
>   This function lookup the biggest challenge in this patch, I see
> it correct.
>   In current implementation: the function is generated for expression,
> the name of the function is saved in the ExprState,
> then function is compiled (when first expression is executed),
> then the function is called,
> lookup by saved address is done to find compiled code,
> the compiled code address is saved in ExprState as new execution code,
> the compiled code is called.
>
>
>   In cached implementation when expression is called the next time,
> we need to find the link among the expression and the compiled code.
> I tried to find reliable way to connect them, and see now only one
> version - make compilation only for expressions saved with the plan -
> they have the same address across the time and the same memory context
> as the saved plan. Expressions generated during execution like
> aggregate expression and Hash join expression are allocated in
> query execution context and always have different addresses, and I did
> not find any way to connect them with cached plan. The have "expr"
> member, but it can be NULL, T_List or T_Expr (or anything, what
> Custom node can assign to it).
>   Function name with expression address now looks as reliable way to find
> compiled function, though not elegant. Standard implementation just
> stores
> generated name, uses it and pfree() it.
>
Yeah, I see that this sounds a bit complicated. I'm wondering if this
logic of checking if the expr is already cached or not could be
implemented in another function or layer (maybe jit_compile_expr?). I
think that llvm_compile_expr is already complicated enough and it should
only care about emitting code, so we could have another layer that check
if the expr is already compiled or not and only call llvm_compile_expr
if it's not compiled yet, WYT?

> > I've also executed make check-world and it seems that
> > test_misc/003_check_guc is falling:
> >
> > [11:29:42.995](1.153s) not ok 1 - no parameters missing from
> > postgresql.conf.sample
> > [11:29:42.995](0.000s) #   Failed test 'no parameters missing from
> > postgresql.conf.sample'
> > #   at src/test/modules/test_misc/t/003_check_guc.pl line 85.
> > [11:29:42.995](0.000s) #          got: '1'
> > #     expected: '0'
> > [11:29:42.995](0.000s) ok 2 - no parameters missing from guc_tables.c
> > [11:29:42.995](0.000s) ok 3 - no parameters marked as NOT_IN_SAMPLE in
> > postgresql.conf.sample
> > found GUC jit_cached in guc_tables.c, missing from
> > postgresql.conf.sample
>
> I tested in with make check and make installcheck . In v8 I found bugs,
> but not published fix yet.
>
Do you have any intent to work on a new version for this?

--
Matheus Alcantara

Re: PoC. The saving of the compiled jit-code in the plan cache

From

Vladlen Popolitov

Date:

20 March, 13:40:51

Matheus Alcantara писал(а) 2025-03-18 21:56:
> On Fri, Mar 7, 2025 at 7:43 AM Vladlen Popolitov
> <v.popolitov@postgrespro.ru> wrote:
>> >
>> >> Updated patch rebased to the current master. Also I resolved the
>> >> problems
>> >> with the lookup of the compiled expressions.
>> >>  Cached jit compiles only expressions from cached plan - they are
>> >> recognized by memory context - if Expr is not NULL and belong to the
>> >> same
>> >> memory context as cached plan, this expression is compiled to the
>> >> function
>> >> with expression address in the name (expression address casted to Size
>> >> type).
>> >>  All other expression (generated later than plan, f.e. expressions in
>> >> aggregates, hash join, hash aggregates) are not compiled and are
>> >> executed
>> >> by standard expression interpreter.
>> >>
>> > What's stopping us from going back to the current code generation
>> > without caching on these scenarios? I'm a bit concerned for never jit
>> > compile these kind of expressions and have some kind of performance
>> > issue.
>> >
>> I thought about this. The reason to store the jit in cache - avoid
>> compiling every time when query is executed. If we anyway compile
>> the code even for 1 function, we lose benefits of cached jit code.
>> We can choose the standard jit compilation for other functions.
>> 
> I don't think that I get your point here - Why we would lose the
> benefits? I think that I may not be clear with my question; My point
> was, why we can't jit compile, without caching, these expressions that
> are generated later on the plan? Like aggregates, hash join, as you
> mention. I think that we may have scenarios that jit compile is worth
> even if we will not cache the compiled code. I understand the 
> limitation
> of runtime generated ExprState being allocated in another memory
> context, but why can't we follow the same path as we have today for
> these scenarios?
Yes,  I understood. It is possible to make compilation of the newly
created expressions. I tried to avoid additional compilation and chose
the implementation without compilation.
>> > What's the relation of this exec time generated expressions (hash join,
>> > hash agg) problem with the function name lookup issue? It's seems
>> > different problems to me but I may be missing something.
>> This question and next question are correlated.
>> > If these problems is not fully correlated I think that it would be
>> > better to have some kind of hash map or use the number of the call to
>> > connect expressions with functions (as you did on v4) to lookup the
>> > cached compiled function. IMHO it sounds a bit strange to me to add the
>> > function pointer on the function name and have this memory context
>> > validation.
>>   This function lookup the biggest challenge in this patch, I see
>> it correct.
>>   In current implementation: the function is generated for expression,
>> the name of the function is saved in the ExprState,
>> then function is compiled (when first expression is executed),
>> then the function is called,
>> lookup by saved address is done to find compiled code,
>> the compiled code address is saved in ExprState as new execution code,
>> the compiled code is called.
>> 
>> 
>>   In cached implementation when expression is called the next time,
>> we need to find the link among the expression and the compiled code.
>> I tried to find reliable way to connect them, and see now only one
>> version - make compilation only for expressions saved with the plan -
>> they have the same address across the time and the same memory context
>> as the saved plan. Expressions generated during execution like
>> aggregate expression and Hash join expression are allocated in
>> query execution context and always have different addresses, and I did
>> not find any way to connect them with cached plan. The have "expr"
>> member, but it can be NULL, T_List or T_Expr (or anything, what
>> Custom node can assign to it).
>>   Function name with expression address now looks as reliable way to 
>> find
>> compiled function, though not elegant. Standard implementation just
>> stores
>> generated name, uses it and pfree() it.
>> 
> Yeah, I see that this sounds a bit complicated. I'm wondering if this
> logic of checking if the expr is already cached or not could be
> implemented in another function or layer (maybe jit_compile_expr?). I
> think that llvm_compile_expr is already complicated enough and it 
> should
> only care about emitting code, so we could have another layer that 
> check
> if the expr is already compiled or not and only call llvm_compile_expr
> if it's not compiled yet, WYT?
I agree, it requires cleaning of the code. I am going to do it on the 
next
step.
> 
>> > I've also executed make check-world and it seems that
>> > test_misc/003_check_guc is falling:
>> >
>> > [11:29:42.995](1.153s) not ok 1 - no parameters missing from
>> > postgresql.conf.sample
>> > [11:29:42.995](0.000s) #   Failed test 'no parameters missing from
>> > postgresql.conf.sample'
>> > #   at src/test/modules/test_misc/t/003_check_guc.pl line 85.
>> > [11:29:42.995](0.000s) #          got: '1'
>> > #     expected: '0'
>> > [11:29:42.995](0.000s) ok 2 - no parameters missing from guc_tables.c
>> > [11:29:42.995](0.000s) ok 3 - no parameters marked as NOT_IN_SAMPLE in
>> > postgresql.conf.sample
>> > found GUC jit_cached in guc_tables.c, missing from
>> > postgresql.conf.sample
>> 
>> I tested in with make check and make installcheck . In v8 I found 
>> bugs,
>> but not published fix yet.
>> 

> Do you have any intent to work on a new version for this?
I am preparing the new version. I got comments with bugs and memory 
leaks,
I am fixing and testing the code.

  I run TPC-H benchmark and found the degradation of performance with
jit_cached (and with regular jit also in some queries). As I see,
parallel execution of the query creates a parallel worker, that starts,
gets the plan, compile all expressions again. Even if query has cached
jit code, all parallel workers compile code again.
  By the way, it is possible to send to the worker compiled jit code
together with plan (shared memory is used), but again arise the problem,
how to find the link between expression addresses in parallel worker
address space and jit code in shared memory.


-- 
Best regards,

Vladlen Popolitov.