Thread: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

From
Abdoulaye Ba
Date:
Hello PostgreSQL Hackers,

I am submitting a patch to add hooks for the functions pg_total_relation_size and pg_indexes_size. These hooks allow for custom behaviour to be injected into these functions, which can be useful for extensions and other custom PostgreSQL modifications.

Patch details: 
  • Adds pg_total_relation_size_hook and pg_indexes_size_hook 
  • Modifies pg_total_relation_size and pg_indexes_size to call these hooks if they are set 
  • Adds necessary type definitions and extern declarations
This feature is useful because it allows for more flexible and customizable behaviour in relation size calculations, which can be particularly valuable for extensions that need to account for additional storage outside of the standard PostgreSQL mechanisms.

The patch is attached. 

Thank you for considering this patch. I look forward to your feedback.

Kind regards,
Abdoulaye Ba
Attachment

Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

From
Tomas Vondra
Date:
On 8/8/24 14:18, Abdoulaye Ba wrote:
> Hello PostgreSQL Hackers,
> 
> I am submitting a patch to add hooks for the functions
> pg_total_relation_size and pg_indexes_size. These hooks allow for custom
> behaviour to be injected into these functions, which can be useful for
> extensions and other custom PostgreSQL modifications.
> 
> Patch details: 
> 
>   * Adds pg_total_relation_size_hook and pg_indexes_size_hook 
>   * Modifies pg_total_relation_size and pg_indexes_size to call these
>     hooks if they are set 
>   * Adds necessary type definitions and extern declarations
> 
> This feature is useful because it allows for more flexible and
> customizable behaviour in relation size calculations, which can be
> particularly valuable for extensions that need to account for additional
> storage outside of the standard PostgreSQL mechanisms.
> 
> The patch is attached. 
> 
> Thank you for considering this patch. I look forward to your feedback.
> 

Hi,

Thanks for the patch. A couple comments:

1) Unfortunately, it doesn't compile - it fails with errors like this:

In file included from ../../../../src/include/access/tableam.h:25,
                 from detoast.c:18:
../../../../src/include/utils/rel.h:711:51: error: unknown type name
‘FunctionCallInfo’
  711 | typedef Datum
(*Pg_total_relation_size_hook_type)(FunctionCallInfo fcinfo);

which happens because rel.h references FunctionCallInfo without
including fmgr.h. I wonder how you tested the patch ...

2) Also, I'm not sure why you have the "Pg_" at the beginning.

3) I'm not sure why the patch first changes the return to add +1 and
then undoes that:

   PG_RETURN_INT64(size + 1);

That seems quite unnecessary. I wonder how you created the patch, seems
you just concatenated multiple patches.

4) The patch has a mix of tabs and spaces. We don't do that.

5) It would be really helpful if you could explain the motivation for
this. Not just "make this customizable" but what exactly you want to do
in the hooks and why (presumably you have an extension).

6) Isn't it a bit strange the patch modifies pg_total_relation_size()
but not pg_relation_size() or pg_table_size()? Am I missing something?

7) You should add the patch to the next commitfest (2024-09) at

   https://commitfest.postgresql.org


regards

-- 
Tomas Vondra



Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

From
Abdoulaye Ba
Date:


On Thu, 8 Aug 2024 at 23:29, Tomas Vondra <tomas@vondra.me> wrote:
On 8/8/24 14:18, Abdoulaye Ba wrote:
> Hello PostgreSQL Hackers,
>
> I am submitting a patch to add hooks for the functions
> pg_total_relation_size and pg_indexes_size. These hooks allow for custom
> behaviour to be injected into these functions, which can be useful for
> extensions and other custom PostgreSQL modifications.
>
> Patch details: 
>
>   * Adds pg_total_relation_size_hook and pg_indexes_size_hook 
>   * Modifies pg_total_relation_size and pg_indexes_size to call these
>     hooks if they are set 
>   * Adds necessary type definitions and extern declarations
>
> This feature is useful because it allows for more flexible and
> customizable behaviour in relation size calculations, which can be
> particularly valuable for extensions that need to account for additional
> storage outside of the standard PostgreSQL mechanisms.
>
> The patch is attached. 
>
> Thank you for considering this patch. I look forward to your feedback.
>

Hi,

Thanks for the patch. A couple comments:

1) Unfortunately, it doesn't compile - it fails with errors like this:

In file included from ../../../../src/include/access/tableam.h:25,
                 from detoast.c:18:
../../../../src/include/utils/rel.h:711:51: error: unknown type name
‘FunctionCallInfo’
  711 | typedef Datum
(*Pg_total_relation_size_hook_type)(FunctionCallInfo fcinfo);

which happens because rel.h references FunctionCallInfo without
including fmgr.h. I wonder how you tested the patch ...

2) Also, I'm not sure why you have the "Pg_" at the beginning.

3) I'm not sure why the patch first changes the return to add +1 and
then undoes that:

   PG_RETURN_INT64(size + 1);

That seems quite unnecessary. I wonder how you created the patch, seems
you just concatenated multiple patches.

4) The patch has a mix of tabs and spaces. We don't do that.

5) It would be really helpful if you could explain the motivation for
this. Not just "make this customizable" but what exactly you want to do
in the hooks and why (presumably you have an extension).

6) Isn't it a bit strange the patch modifies pg_total_relation_size()
but not pg_relation_size() or pg_table_size()? Am I missing something?

7) You should add the patch to the next commitfest (2024-09) at

   https://commitfest.postgresql.org


regards


 

Hi all,

Here's my follow-up based on the feedback received:

  1. Compilation Issue:
    I didn’t encounter any errors when compiling on my machine, but I’ll review the environment and ensure fmgr.h is included where necessary to avoid the issue.

  2. Prefix "Pg_":
    I’ll remove the "Pg_" prefix as I see now that it’s unnecessary.

  3. Return Value Change:
    The +1 in the return value was part of a test that I forgot to remove. I’ll clean that up in the next version of the patch.

  4. Tabs and Spaces:
    I’ll correct the indentation to use tabs consistently, as per the project’s coding standards.

  5. Motivation:
    The hooks are intended to add support for calculating the Tantivy index size, in line with the need outlined in this issue. This will allow us to integrate additional index sizes into PostgreSQL’s built-in size functions.

  6. Additional Hooks:
    I’ll add hooks for pg_relation_size() and pg_table_size() for consistency.

I’ll make these changes and resubmit the patch soon. Thanks again for your guidance!

Best regards,


   

 

Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

From
Andreas Karlsson
Date:
On 8/8/24 2:18 PM, Abdoulaye Ba wrote:
> I am submitting a patch to add hooks for the functions 
> pg_total_relation_size and pg_indexes_size. These hooks allow for custom 
> behaviour to be injected into these functions, which can be useful for 
> extensions and other custom PostgreSQL modifications.

What kind of extensions do you imagine would use this hook? If it is for 
custom index AMs I think adding this to the index AM interface would 
make more sense than just adding a generic callback hook.

Andreas




Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

From
Abdoulaye Ba
Date:


On Fri, 9 Aug 2024 at 18:10, Andreas Karlsson <andreas@proxel.se> wrote:
On 8/8/24 2:18 PM, Abdoulaye Ba wrote:
> I am submitting a patch to add hooks for the functions
> pg_total_relation_size and pg_indexes_size. These hooks allow for custom
> behaviour to be injected into these functions, which can be useful for
> extensions and other custom PostgreSQL modifications.

What kind of extensions do you imagine would use this hook? If it is for
custom index AMs I think adding this to the index AM interface would
make more sense than just adding a generic callback hook.

Andreas
 
The primary use case for this hook is to allow extensions to account for additional storage mechanisms that are not covered by the default PostgreSQL relation size calculations. For instance, in our project, we are working with an external indexing system (Tantivy) that maintains additional data structures outside the standard PostgreSQL storage. This hook allows us to include the size of these additional structures in the total relation size calculations.

While I understand your suggestion about custom index AMs, the intent behind this hook is broader. It is not limited to custom index types but can also be used for other forms of external storage that need to be accounted for in relation size calculations. This is why a generic callback hook was chosen over extending the index AM interface.

However, if there is a consensus that such a hook would be better suited within the index AM interface for cases involving custom index storage, I'm open to discussing this further and exploring how it could be integrated more tightly with the existing PostgreSQL AM framework.

Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

From
Andreas Karlsson
Date:
On 8/9/24 6:59 PM, Abdoulaye Ba wrote:>     The primary use case for 
this hook is to allow extensions to account
>     for additional storage mechanisms that are not covered by the
>     default PostgreSQL relation size calculations. For instance, in our
>     project, we are working with an external indexing system (Tantivy)
>     that maintains additional data structures outside the standard
>     PostgreSQL storage. This hook allows us to include the size of these
>     additional structures in the total relation size calculations.
> 
>     While I understand your suggestion about custom index AMs, the
>     intent behind this hook is broader. It is not limited to custom
>     index types but can also be used for other forms of external storage
>     that need to be accounted for in relation size calculations. This is
>     why a generic callback hook was chosen over extending the index AM
>     interface.
> 
>     However, if there is a consensus that such a hook would be better
>     suited within the index AM interface for cases involving custom
>     index storage, I'm open to discussing this further and exploring how
>     it could be integrated more tightly with the existing PostgreSQL AM
>     framework.

Yeah, I strongly suspected it was ParadeDB. :)

I am only one developer but I really do not like solving this with a 
hook, instead I think the proper solution is to integrate this properly 
with custom AMs and storage managers. I think we should do it properly 
or not at all.

Andreas



Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

From
Tomas Vondra
Date:

On 8/28/24 17:53, Andreas Karlsson wrote:
> On 8/9/24 6:59 PM, Abdoulaye Ba wrote:>     The primary use case for
> this hook is to allow extensions to account
>>     for additional storage mechanisms that are not covered by the
>>     default PostgreSQL relation size calculations. For instance, in our
>>     project, we are working with an external indexing system (Tantivy)
>>     that maintains additional data structures outside the standard
>>     PostgreSQL storage. This hook allows us to include the size of these
>>     additional structures in the total relation size calculations.
>>
>>     While I understand your suggestion about custom index AMs, the
>>     intent behind this hook is broader. It is not limited to custom
>>     index types but can also be used for other forms of external storage
>>     that need to be accounted for in relation size calculations. This is
>>     why a generic callback hook was chosen over extending the index AM
>>     interface.
>>
>>     However, if there is a consensus that such a hook would be better
>>     suited within the index AM interface for cases involving custom
>>     index storage, I'm open to discussing this further and exploring how
>>     it could be integrated more tightly with the existing PostgreSQL AM
>>     framework.
> 
> Yeah, I strongly suspected it was ParadeDB. :)
> 
> I am only one developer but I really do not like solving this with a
> hook, instead I think the proper solution is to integrate this properly
> with custom AMs and storage managers. I think we should do it properly
> or not at all.
> 

Not sure. I'd agree if the index was something that could be implemented
through index AM - then that'd be the way to go. It might require some
improvements to the index AM to use the correct index size, haven't checked.

But it seems pg_search (which AFAIK is what paradedb uses to integrate
tantivy indexes) uses the term "index" for something very different. I'm
not sure that's something that could be conveniently implemented as
index AM, but I haven't checked. But that just raises the question why
should that be included in pg_total_relation_size and pg_indexes_size at
all, if it's not what we'd call an index.

regards

-- 
Tomas Vondra