Thread: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size
- Adds pg_total_relation_size_hook and pg_indexes_size_hook
- Modifies pg_total_relation_size and pg_indexes_size to call these hooks if they are set
- Adds necessary type definitions and extern declarations
Attachment
On 8/8/24 14:18, Abdoulaye Ba wrote: > Hello PostgreSQL Hackers, > > I am submitting a patch to add hooks for the functions > pg_total_relation_size and pg_indexes_size. These hooks allow for custom > behaviour to be injected into these functions, which can be useful for > extensions and other custom PostgreSQL modifications. > > Patch details: > > * Adds pg_total_relation_size_hook and pg_indexes_size_hook > * Modifies pg_total_relation_size and pg_indexes_size to call these > hooks if they are set > * Adds necessary type definitions and extern declarations > > This feature is useful because it allows for more flexible and > customizable behaviour in relation size calculations, which can be > particularly valuable for extensions that need to account for additional > storage outside of the standard PostgreSQL mechanisms. > > The patch is attached. > > Thank you for considering this patch. I look forward to your feedback. > Hi, Thanks for the patch. A couple comments: 1) Unfortunately, it doesn't compile - it fails with errors like this: In file included from ../../../../src/include/access/tableam.h:25, from detoast.c:18: ../../../../src/include/utils/rel.h:711:51: error: unknown type name ‘FunctionCallInfo’ 711 | typedef Datum (*Pg_total_relation_size_hook_type)(FunctionCallInfo fcinfo); which happens because rel.h references FunctionCallInfo without including fmgr.h. I wonder how you tested the patch ... 2) Also, I'm not sure why you have the "Pg_" at the beginning. 3) I'm not sure why the patch first changes the return to add +1 and then undoes that: PG_RETURN_INT64(size + 1); That seems quite unnecessary. I wonder how you created the patch, seems you just concatenated multiple patches. 4) The patch has a mix of tabs and spaces. We don't do that. 5) It would be really helpful if you could explain the motivation for this. Not just "make this customizable" but what exactly you want to do in the hooks and why (presumably you have an extension). 6) Isn't it a bit strange the patch modifies pg_total_relation_size() but not pg_relation_size() or pg_table_size()? Am I missing something? 7) You should add the patch to the next commitfest (2024-09) at https://commitfest.postgresql.org regards -- Tomas Vondra
On 8/8/24 14:18, Abdoulaye Ba wrote:
> Hello PostgreSQL Hackers,
>
> I am submitting a patch to add hooks for the functions
> pg_total_relation_size and pg_indexes_size. These hooks allow for custom
> behaviour to be injected into these functions, which can be useful for
> extensions and other custom PostgreSQL modifications.
>
> Patch details:
>
> * Adds pg_total_relation_size_hook and pg_indexes_size_hook
> * Modifies pg_total_relation_size and pg_indexes_size to call these
> hooks if they are set
> * Adds necessary type definitions and extern declarations
>
> This feature is useful because it allows for more flexible and
> customizable behaviour in relation size calculations, which can be
> particularly valuable for extensions that need to account for additional
> storage outside of the standard PostgreSQL mechanisms.
>
> The patch is attached.
>
> Thank you for considering this patch. I look forward to your feedback.
>
Hi,
Thanks for the patch. A couple comments:
1) Unfortunately, it doesn't compile - it fails with errors like this:
In file included from ../../../../src/include/access/tableam.h:25,
from detoast.c:18:
../../../../src/include/utils/rel.h:711:51: error: unknown type name
‘FunctionCallInfo’
711 | typedef Datum
(*Pg_total_relation_size_hook_type)(FunctionCallInfo fcinfo);
which happens because rel.h references FunctionCallInfo without
including fmgr.h. I wonder how you tested the patch ...
2) Also, I'm not sure why you have the "Pg_" at the beginning.
3) I'm not sure why the patch first changes the return to add +1 and
then undoes that:
PG_RETURN_INT64(size + 1);
That seems quite unnecessary. I wonder how you created the patch, seems
you just concatenated multiple patches.
4) The patch has a mix of tabs and spaces. We don't do that.
5) It would be really helpful if you could explain the motivation for
this. Not just "make this customizable" but what exactly you want to do
in the hooks and why (presumably you have an extension).
6) Isn't it a bit strange the patch modifies pg_total_relation_size()
but not pg_relation_size() or pg_table_size()? Am I missing something?
7) You should add the patch to the next commitfest (2024-09) at
https://commitfest.postgresql.org
regards
Hi all,
Here's my follow-up based on the feedback received:
Compilation Issue:
I didn’t encounter any errors when compiling on my machine, but I’ll review the environment and ensurefmgr.h
is included where necessary to avoid the issue.Prefix "Pg_":
I’ll remove the"Pg_"
prefix as I see now that it’s unnecessary.Return Value Change:
The+1
in the return value was part of a test that I forgot to remove. I’ll clean that up in the next version of the patch.Tabs and Spaces:
I’ll correct the indentation to use tabs consistently, as per the project’s coding standards.Motivation:
The hooks are intended to add support for calculating the Tantivy index size, in line with the need outlined in this issue. This will allow us to integrate additional index sizes into PostgreSQL’s built-in size functions.Additional Hooks:
I’ll add hooks forpg_relation_size()
andpg_table_size()
for consistency.I’ll make these changes and resubmit the patch soon. Thanks again for your guidance!
Best regards,
On 8/8/24 2:18 PM, Abdoulaye Ba wrote: > I am submitting a patch to add hooks for the functions > pg_total_relation_size and pg_indexes_size. These hooks allow for custom > behaviour to be injected into these functions, which can be useful for > extensions and other custom PostgreSQL modifications. What kind of extensions do you imagine would use this hook? If it is for custom index AMs I think adding this to the index AM interface would make more sense than just adding a generic callback hook. Andreas
On 8/8/24 2:18 PM, Abdoulaye Ba wrote:
> I am submitting a patch to add hooks for the functions
> pg_total_relation_size and pg_indexes_size. These hooks allow for custom
> behaviour to be injected into these functions, which can be useful for
> extensions and other custom PostgreSQL modifications.
What kind of extensions do you imagine would use this hook? If it is for
custom index AMs I think adding this to the index AM interface would
make more sense than just adding a generic callback hook.
Andreas
The primary use case for this hook is to allow extensions to account for additional storage mechanisms that are not covered by the default PostgreSQL relation size calculations. For instance, in our project, we are working with an external indexing system (Tantivy) that maintains additional data structures outside the standard PostgreSQL storage. This hook allows us to include the size of these additional structures in the total relation size calculations.
While I understand your suggestion about custom index AMs, the intent behind this hook is broader. It is not limited to custom index types but can also be used for other forms of external storage that need to be accounted for in relation size calculations. This is why a generic callback hook was chosen over extending the index AM interface.
However, if there is a consensus that such a hook would be better suited within the index AM interface for cases involving custom index storage, I'm open to discussing this further and exploring how it could be integrated more tightly with the existing PostgreSQL AM framework.
On 8/9/24 6:59 PM, Abdoulaye Ba wrote:> The primary use case for this hook is to allow extensions to account > for additional storage mechanisms that are not covered by the > default PostgreSQL relation size calculations. For instance, in our > project, we are working with an external indexing system (Tantivy) > that maintains additional data structures outside the standard > PostgreSQL storage. This hook allows us to include the size of these > additional structures in the total relation size calculations. > > While I understand your suggestion about custom index AMs, the > intent behind this hook is broader. It is not limited to custom > index types but can also be used for other forms of external storage > that need to be accounted for in relation size calculations. This is > why a generic callback hook was chosen over extending the index AM > interface. > > However, if there is a consensus that such a hook would be better > suited within the index AM interface for cases involving custom > index storage, I'm open to discussing this further and exploring how > it could be integrated more tightly with the existing PostgreSQL AM > framework. Yeah, I strongly suspected it was ParadeDB. :) I am only one developer but I really do not like solving this with a hook, instead I think the proper solution is to integrate this properly with custom AMs and storage managers. I think we should do it properly or not at all. Andreas
On 8/28/24 17:53, Andreas Karlsson wrote: > On 8/9/24 6:59 PM, Abdoulaye Ba wrote:> The primary use case for > this hook is to allow extensions to account >> for additional storage mechanisms that are not covered by the >> default PostgreSQL relation size calculations. For instance, in our >> project, we are working with an external indexing system (Tantivy) >> that maintains additional data structures outside the standard >> PostgreSQL storage. This hook allows us to include the size of these >> additional structures in the total relation size calculations. >> >> While I understand your suggestion about custom index AMs, the >> intent behind this hook is broader. It is not limited to custom >> index types but can also be used for other forms of external storage >> that need to be accounted for in relation size calculations. This is >> why a generic callback hook was chosen over extending the index AM >> interface. >> >> However, if there is a consensus that such a hook would be better >> suited within the index AM interface for cases involving custom >> index storage, I'm open to discussing this further and exploring how >> it could be integrated more tightly with the existing PostgreSQL AM >> framework. > > Yeah, I strongly suspected it was ParadeDB. :) > > I am only one developer but I really do not like solving this with a > hook, instead I think the proper solution is to integrate this properly > with custom AMs and storage managers. I think we should do it properly > or not at all. > Not sure. I'd agree if the index was something that could be implemented through index AM - then that'd be the way to go. It might require some improvements to the index AM to use the correct index size, haven't checked. But it seems pg_search (which AFAIK is what paradedb uses to integrate tantivy indexes) uses the term "index" for something very different. I'm not sure that's something that could be conveniently implemented as index AM, but I haven't checked. But that just raises the question why should that be included in pg_total_relation_size and pg_indexes_size at all, if it's not what we'd call an index. regards -- Tomas Vondra