On Tue, Jun 03, 2025 at 01:15:33PM -0400, Tom Lane wrote:
> Thanks for taking this on --- contrib/xml2 is really a mess so far as
> error handling goes. Your patch looks like an improvement, although
> I do have one concern: the routines in xml.c that use an xmlerrcxt
> seem to check xmlerrcxt->err_occurred pretty frequently, eg xmlStrdup
> is used like this:
>
> doc->encoding = xmlStrdup((const xmlChar *) "UTF-8");
> if (doc->encoding == NULL || xmlerrcxt->err_occurred)
> xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
> "could not allocate XML document");
>
> Not sure if that's needed here.
Yes, I was also wondering a bit about this part a couple of days ago
while looking at the module's code and xml.c. Looking at cacd42d62cb2
and its thread (particularly [1]), I think that we need to bite the
bullet or we are going to miss some error context.
PgXmlErrorContext can remain private in xml.c as we can reuse
pg_xml_error_occurred() to do the job for out-of-core code.
> There's more that could be looked at, if you feel like it:
Well, now that you mention these things, I do feel like it while I
have my hands on this area of the code.
> xml_encode_special_chars seems to need a PG_TRY block to avoid
> possibly leaking "tt". I also wonder if it's really not possible
> for xmlEncodeSpecialChars to fail and return NULL. (xmltext()
> doesn't believe that either, but maybe it's wrong too.)
Oh, missed this call. xmlEncodeSpecialChars() relies on
xmlEscapeText(), which can return NULL on allocation failures based
on a check in the upstream code (first argument of the routine is not
used, only second is). So xmltext() and xml_encode_special_chars()
are both wrong; we need a try/catch block for the edge case where
cstring_to_text_with_len() or cstring_to_text() could fail an
allocation or we would leak what could have been allocated by libxml.
> The usage of cleanup_workspace seems quite a mess: one caller
> uses a PG_TRY block to ensure it's called, but the rest don't.
> I also find it confusing that pgxml_xpath returns a value that is
> also referenced in the workspace and cleanup_workspace is responsible
> for freeing. I wonder if there's a better way to do that.
Yes, that's not optimal. This comes down to the fact that we need
workspace->res to not be free'd by libxml until the result of these
SQL functions is generated, and even with my previous patch we could
leak it if one of the allocations done for the function results fail.
I've been wondering about a few approaches, like adding the error
context to the workspace, but that did not feel right as we require
the error context before the try/catch block, and the workspace and
its internals allocated by libxml need to be fully handled in the
scope of the try/catch. So I've finished with the attached, where the
workspace and its cleanup routine gain volatile declarations, keeping
pg_xml_done() outside the workspace logic. This is a rather
mechanical change if done this way with the try/catch blocks moved one
level higher, but as long as we need to hold on the result inside
the workspace, I'm feeling that this is a cleaner approach in the
long-run for xml2, because it's impossible to miss what's in the scope
of the catch cleanup with the cleanup policy enforced in the
definition of cleanup_workspace().
> In the end of xslt_process, I wonder why
>
> if (resstr)
> xmlFree((xmlChar *) resstr);
>
> isn't done before the pg_xml_done call.
Good point. Fixed.
All that is rather unlikely going to be a problem in practice, so I
don't really feel a strong reason to backpatch any of that. v18 would
be OK, but we could just also wait for v19 based on how these are
unlikely going to be problems in the field.
Or it's worth worrying about a backpatch of the xml.c code paths which
are in the core backend? The xmltext() case is isolated, so this part
is not invasive.
[1]: https://www.postgresql.org/message-id/23388.1311118974%40sss.pgh.pa.us
--
Michael