Thread: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
[PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tommy Pavlicek
Date:
Hi All,
I've attached a couple of patches to allow ALTER OPERATOR to add commutators, negators, hashes and merges to operators that lack them.
The need for this arose adding hash functions to the ltree type after the operator had been created without hash support[1]. There are potential issues with modifying these attributes that have been discussed previously[2], but I understand that setting them, if they have not been set before, is ok.
I belatedly realised that it may not be desirable or necessary to allow adding commutators and negators in ALTER OPERATOR because the linkage can already be added when creating a new operator. I don't know what's best, so I thought I'd post this here and get feedback before removing anything.
The first patch is create_op_fixes_v1.patch and it includes some refactoring in preparation for the ALTER OPERATOR changes and fixes a couple of minor bugs in CREATE OPERATOR:
- prevents self negation when filling in/creating an existing shell operator
- remove reference to sort operator in the self negation error message as the sort attribute appears to be deprecated in Postgres 8.3
The second patch is alter_op_v1.patch which contains the changes to ALTER OPERATOR and depends on create_op_fixes_v1.patch.
Additionally, I wasn't sure whether it was preferred to fail or succeed on ALTERs that have no effect, such as adding hashes on an operator that already allows them or disabling hashes on one that does not. I chose to raise an error when this happens, on the thinking it was more explicit and made the code simpler, even though the end result would be what the user wanted.
Comments appreciated.
Thanks,
Tommy
[1] https://www.postgresql.org/message-id/flat/CAEhP-W9ZEoHeaP_nKnPCVd_o1c3BAUvq1gWHrq8EbkNRiS9CvQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/3348985.V7xMLFDaJO@dinodell
I've attached a couple of patches to allow ALTER OPERATOR to add commutators, negators, hashes and merges to operators that lack them.
The need for this arose adding hash functions to the ltree type after the operator had been created without hash support[1]. There are potential issues with modifying these attributes that have been discussed previously[2], but I understand that setting them, if they have not been set before, is ok.
I belatedly realised that it may not be desirable or necessary to allow adding commutators and negators in ALTER OPERATOR because the linkage can already be added when creating a new operator. I don't know what's best, so I thought I'd post this here and get feedback before removing anything.
The first patch is create_op_fixes_v1.patch and it includes some refactoring in preparation for the ALTER OPERATOR changes and fixes a couple of minor bugs in CREATE OPERATOR:
- prevents self negation when filling in/creating an existing shell operator
- remove reference to sort operator in the self negation error message as the sort attribute appears to be deprecated in Postgres 8.3
The second patch is alter_op_v1.patch which contains the changes to ALTER OPERATOR and depends on create_op_fixes_v1.patch.
Additionally, I wasn't sure whether it was preferred to fail or succeed on ALTERs that have no effect, such as adding hashes on an operator that already allows them or disabling hashes on one that does not. I chose to raise an error when this happens, on the thinking it was more explicit and made the code simpler, even though the end result would be what the user wanted.
Comments appreciated.
Thanks,
Tommy
[1] https://www.postgresql.org/message-id/flat/CAEhP-W9ZEoHeaP_nKnPCVd_o1c3BAUvq1gWHrq8EbkNRiS9CvQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/3348985.V7xMLFDaJO@dinodell
Attachment
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
Tommy Pavlicek <tommypav122@gmail.com> writes: > I've attached a couple of patches to allow ALTER OPERATOR to add > commutators, negators, hashes and merges to operators that lack them. Please add this to the upcoming commitfest [1], to ensure we don't lose track of it. > The first patch is create_op_fixes_v1.patch and it includes some > refactoring in preparation for the ALTER OPERATOR changes and fixes a > couple of minor bugs in CREATE OPERATOR: > - prevents self negation when filling in/creating an existing shell operator > - remove reference to sort operator in the self negation error message as > the sort attribute appears to be deprecated in Postgres 8.3 Hmm, yeah, I bet nobody has looked at those edge cases in awhile. > Additionally, I wasn't sure whether it was preferred to fail or succeed on > ALTERs that have no effect, such as adding hashes on an operator that > already allows them or disabling hashes on one that does not. I chose to > raise an error when this happens, on the thinking it was more explicit and > made the code simpler, even though the end result would be what the user > wanted. You could argue that both ways I guess. We definitely need to raise error if the command tries to change an existing nondefault setting, since that might break things as per previous discussion. But perhaps rejecting an attempt to set the existing setting is overly nanny-ish. Personally I think I'd lean to "don't throw an error if we don't have to", but I'm not strongly set on that position. (Don't we have existing precedents that apply here? I can't offhand think of any existing ALTER commands that would reject no-op requests, but maybe that's not a direct precedent.) regards, tom lane [1] https://commitfest.postgresql.org/43/
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Dagfinn Ilmari Mannsåker
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes: > Tommy Pavlicek <tommypav122@gmail.com> writes: > >> Additionally, I wasn't sure whether it was preferred to fail or succeed on >> ALTERs that have no effect, such as adding hashes on an operator that >> already allows them or disabling hashes on one that does not. I chose to >> raise an error when this happens, on the thinking it was more explicit and >> made the code simpler, even though the end result would be what the user >> wanted. > > You could argue that both ways I guess. We definitely need to raise error > if the command tries to change an existing nondefault setting, since that > might break things as per previous discussion. But perhaps rejecting > an attempt to set the existing setting is overly nanny-ish. Personally > I think I'd lean to "don't throw an error if we don't have to", but I'm > not strongly set on that position. > > (Don't we have existing precedents that apply here? I can't offhand > think of any existing ALTER commands that would reject no-op requests, > but maybe that's not a direct precedent.) Since it only supports adding these operations if they don't already exist, should it not be ALTER OPERATOR ADD <thing>, not SET <thing>? That makes it natural to add an IF NOT EXISTS clause, like ALTER TABLE ADD COLUMN has, to make it a no-op instead of an error. > regards, tom lane - ilmari
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?= <ilmari@ilmari.org> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> (Don't we have existing precedents that apply here? I can't offhand >> think of any existing ALTER commands that would reject no-op requests, >> but maybe that's not a direct precedent.) > Since it only supports adding these operations if they don't already > exist, should it not be ALTER OPERATOR ADD <thing>, not SET <thing>? > That makes it natural to add an IF NOT EXISTS clause, like ALTER TABLE > ADD COLUMN has, to make it a no-op instead of an error. Hmm, maybe. But it feels like choosing syntax and semantics based on what might be only a temporary implementation restriction. We certainly don't handle any other property-setting commands that way. Admittedly, "can't change an existing setting" is likely to be pretty permanent in this case, just because I don't see a use-case for it that'd justify the work involved. (My wife recently gave me a coffee cup that says "Nothing is as permanent as a temporary fix.") But still, if someone did show up and do that work, we'd regret this choice of syntax because it'd then be uselessly unlike every other ALTER command. regards, tom lane
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tommy Pavlicek
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes: > Please add this to the upcoming commitfest [1], to ensure we don't > lose track of it. I've added a single patch here: https://commitfest.postgresql.org/43/4389/ It wasn't obvious whether I should create a second commitfest entry because I've included 2 patches so I've just done 1 to begin with. On that note, is it preferred here to split patches of this size into separate patches, and if so, additionally, separate threads? Tom Lane <tgl@sss.pgh.pa.us> writes: > > Additionally, I wasn't sure whether it was preferred to fail or succeed on > > ALTERs that have no effect, such as adding hashes on an operator that > > already allows them or disabling hashes on one that does not. I chose to > > raise an error when this happens, on the thinking it was more explicit and > > made the code simpler, even though the end result would be what the user > > wanted. > > You could argue that both ways I guess. We definitely need to raise error > if the command tries to change an existing nondefault setting, since that > might break things as per previous discussion. But perhaps rejecting > an attempt to set the existing setting is overly nanny-ish. Personally > I think I'd lean to "don't throw an error if we don't have to", but I'm > not strongly set on that position. > > (Don't we have existing precedents that apply here? I can't offhand > think of any existing ALTER commands that would reject no-op requests, > but maybe that's not a direct precedent.) My initial thinking behind the error for a no-op was largely driven by the existence of 'DROP.. IF EXISTS'. However, I did some ad hoc testing on ALTER commands and it does seem that they mostly allow no-ops. I did find that renaming an object to the same name will fail due to the object already existing, but that seems to be more of a coincidence than a design decision to me. Given this, I also lean towards allowing the no-ops and will change it unless there are objections.
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
Tommy Pavlicek <tommypav122@gmail.com> writes: > I've added a single patch here: https://commitfest.postgresql.org/43/4389/ > It wasn't obvious whether I should create a second commitfest entry > because I've included 2 patches so I've just done 1 to begin with. On > that note, is it preferred here to split patches of this size into > separate patches, and if so, additionally, separate threads? No, our commitfest infrastructure is unable to deal with patches that have interdependencies unless they're presented in a single email. So just use one thread, and be sure to attach all the patches each time. (BTW, while you seem to have gotten away with it so far, it's usually advisable to name the patch files like 0001-foo.patch, 0002-bar.patch, etc, to make sure the cfbot understands what order to apply them in.) regards, tom lane
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tommy Pavlicek
Date:
On Fri, Jun 23, 2023 at 12:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Tommy Pavlicek <tommypav122@gmail.com> writes: > > I've added a single patch here: https://commitfest.postgresql.org/43/4389/ > > > It wasn't obvious whether I should create a second commitfest entry > > because I've included 2 patches so I've just done 1 to begin with. On > > that note, is it preferred here to split patches of this size into > > separate patches, and if so, additionally, separate threads? > > No, our commitfest infrastructure is unable to deal with patches that have > interdependencies unless they're presented in a single email. So just use > one thread, and be sure to attach all the patches each time. > > (BTW, while you seem to have gotten away with it so far, it's usually > advisable to name the patch files like 0001-foo.patch, 0002-bar.patch, > etc, to make sure the cfbot understands what order to apply them in.) > > regards, tom lane Thanks. I've attached a new version of the ALTER OPERATOR patch that allows no-ops. It should be ready to review now.
Attachment
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
jian he
Date:
/* * AlterOperator * routine implementing ALTER OPERATOR <operator> SET (option = ...). * * Currently, only RESTRICT and JOIN estimator functions can be changed. */ ObjectAddress AlterOperator(AlterOperatorStmt *stmt) The above comment needs to change, other than that, it passed the test, works as expected.
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
jian he
Date:
in doc/src/sgml/ref/alter_operator.sgml <varlistentry> <term><literal>HASHES</literal></term> <listitem> <para> Indicates this operator can support a hash join. Can only be set when the operator does not support a hash join. </para> </listitem> </varlistentry> <varlistentry> <term><literal>MERGES</literal></term> <listitem> <para> Indicates this operator can support a merge join. Can only be set when the operator does not support a merge join. </para> </listitem> </varlistentry> ------------------------ if the operator cannot support hash/merge join, it can't do ALTER OPERATOR oprname(left_arg, right_arg) SET (HASHES/MERGES = false); I think it means: Can only be set when the operator does support a hash/merge join. Once set to true, it cannot be reset to false.
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
Tommy Pavlicek <tommypav122@gmail.com> writes: > I've attached a new version of the ALTER OPERATOR patch that allows > no-ops. It should be ready to review now. I got around to looking through this finally (sorry about the delay). I'm mostly on board with the functionality, with the exception that I don't see why we should allow ALTER OPERATOR to cause new shell operators to be created. The argument for allowing that in CREATE OPERATOR was mainly to allow a linked pair of operators to be created without a lot of complexity (specifically, being careful to specify the commutator or negator linkage only in the second CREATE, which is a rule that requires another exception for a self-commutator). However, if you're using ALTER OPERATOR then you might as well create both operators first and then link them with an ALTER command. In fact, I don't really see a use-case where the operators wouldn't both exist; isn't this feature mainly to allow retrospective correction of omitted linkages? So I think allowing ALTER to create a second operator is more likely to allow mistakes to sneak by than to do anything useful --- and they will be mistakes you can't correct except by starting over. I'd even question whether we want to let ALTER establish a linkage to an existing shell operator, rather than insisting you turn it into a valid operator via CREATE first. If we implement it with that restriction then I don't think the refactorization done in 0001 is correct, or at least not ideal. (In any case, it seems like a bad idea that the command reference pages make no mention of this stuff about shell operators. It's explained in 38.15. Operator Optimization Information, but it'd be worth at least alluding to that section here. Or maybe we should move that info to CREATE OPERATOR?) More generally, you muttered something about 0001 fixing some existing bugs, but if so I can't see those trees for the forest of refactorization. I'd suggest splitting any bug fixes apart from the pure-refactorization step. regards, tom lane
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tommy Pavlicek
Date:
On Thu, Sep 28, 2023 at 9:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Tommy Pavlicek <tommypav122@gmail.com> writes: > > I've attached a new version of the ALTER OPERATOR patch that allows > > no-ops. It should be ready to review now. > > I got around to looking through this finally (sorry about the delay). > I'm mostly on board with the functionality, with the exception that > I don't see why we should allow ALTER OPERATOR to cause new shell > operators to be created. The argument for allowing that in CREATE > OPERATOR was mainly to allow a linked pair of operators to be created > without a lot of complexity (specifically, being careful to specify > the commutator or negator linkage only in the second CREATE, which > is a rule that requires another exception for a self-commutator). > However, if you're using ALTER OPERATOR then you might as well create > both operators first and then link them with an ALTER command. > In fact, I don't really see a use-case where the operators wouldn't > both exist; isn't this feature mainly to allow retrospective > correction of omitted linkages? So I think allowing ALTER to create a > second operator is more likely to allow mistakes to sneak by than to > do anything useful --- and they will be mistakes you can't correct > except by starting over. I'd even question whether we want to let > ALTER establish a linkage to an existing shell operator, rather than > insisting you turn it into a valid operator via CREATE first. > > If we implement it with that restriction then I don't think the > refactorization done in 0001 is correct, or at least not ideal. > > (In any case, it seems like a bad idea that the command reference > pages make no mention of this stuff about shell operators. It's > explained in 38.15. Operator Optimization Information, but it'd > be worth at least alluding to that section here. Or maybe we > should move that info to CREATE OPERATOR?) > > More generally, you muttered something about 0001 fixing some > existing bugs, but if so I can't see those trees for the forest of > refactorization. I'd suggest splitting any bug fixes apart from > the pure-refactorization step. > > regards, tom lane Thanks Tom. The rationale behind the shell operator and that part of section 38.15 of the documentation had escaped me, but what you're saying makes complete sense. Based on your comments, I've made some changes: 1. I've isolated the bug fixes (fixing the error message and disallowing self negation when filling in a shell operator) into 0001-bug-fixes-v3.patch. 2. I've scaled back most of the refactoring as I agree it no longer makes sense. 3. I updated the logic to prevent the creation of or linking to shell operators. 4. I made further updates to the documentation including referencing 38.15 directly in the CREATE and ALTER pages (It's easy to miss if only 38.14 is referenced) and moved the commentary about creating commutators and negators into the CREATE section as with the the ALTER changes it now seems specific to CREATE. I didn't move the rest of 38.15 as I think this applies to both CREATE and ALTER. I did notice one further potential bug. When creating an operator and adding a commutator, PostgreSQL only links the commutator back to the operator if the commutator has no commutator of its own, but the create operation succeeds regardless of whether this linkage happens. In other words, if A and B are a pair of commutators and one creates another operator, C, with A as its commutator, then C will link to A, but A will still link to B (and B to A). It's not clear to me if this in itself is a problem, but unless I've misunderstood something operator C must be the same as B so it implies an error by the user and there could also be issues if A was deleted since C would continue to refer to the deleted A. The same applies for negators and alter operations. Do you know if this behaviour is intentional or if I've missed something because it seems undesirable to me. If it is a bug, then I think I can see how to fix it, but wanted to ask before making any changes. On Mon, Sep 25, 2023 at 11:52 AM jian he <jian.universality@gmail.com> wrote: > > /* > * AlterOperator > * routine implementing ALTER OPERATOR <operator> SET (option = ...). > * > * Currently, only RESTRICT and JOIN estimator functions can be changed. > */ > ObjectAddress > AlterOperator(AlterOperatorStmt *stmt) > > The above comment needs to change, other than that, it passed the > test, works as expected. Thanks, added a comment. > Can only be set when the operator does support a hash/merge join. Once > set to true, it cannot be reset to false. Yes, I updated the wording. Is it clearer now?
Attachment
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
Tommy Pavlicek <tommypav122@gmail.com> writes: > I did notice one further potential bug. When creating an operator and > adding a commutator, PostgreSQL only links the commutator back to the > operator if the commutator has no commutator of its own, but the > create operation succeeds regardless of whether this linkage happens. > In other words, if A and B are a pair of commutators and one creates > another operator, C, with A as its commutator, then C will link to A, > but A will still link to B (and B to A). It's not clear to me if this > in itself is a problem, but unless I've misunderstood something > operator C must be the same as B so it implies an error by the user > and there could also be issues if A was deleted since C would continue > to refer to the deleted A. Yeah, it'd make sense to tighten that up. Per the discussion so far, we should not allow an operator's commutator/negator links to change once set, so overwriting the existing link with a different value would be wrong. But allowing creation of the new operator to proceed with a different outcome than expected isn't good either. I think we should start throwing an error for that. regards, tom lane
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tommy Pavlicek
Date:
On Tue, Oct 10, 2023 at 9:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Tommy Pavlicek <tommypav122@gmail.com> writes: > > I did notice one further potential bug. When creating an operator and > > adding a commutator, PostgreSQL only links the commutator back to the > > operator if the commutator has no commutator of its own, but the > > create operation succeeds regardless of whether this linkage happens. > > > In other words, if A and B are a pair of commutators and one creates > > another operator, C, with A as its commutator, then C will link to A, > > but A will still link to B (and B to A). It's not clear to me if this > > in itself is a problem, but unless I've misunderstood something > > operator C must be the same as B so it implies an error by the user > > and there could also be issues if A was deleted since C would continue > > to refer to the deleted A. > > Yeah, it'd make sense to tighten that up. Per the discussion so far, > we should not allow an operator's commutator/negator links to change > once set, so overwriting the existing link with a different value > would be wrong. But allowing creation of the new operator to proceed > with a different outcome than expected isn't good either. I think > we should start throwing an error for that. > > regards, tom lane Thanks. I've added another patch (0002-require_unused_neg_com-v1.patch) that prevents using a commutator or negator that's already part of a pair. The only other changes from my email yesterday are that in the ALTER command I moved the post alter hook to after OperatorUpd and the addition of tests to verify that we can't use an existing commutator or negator with the ALTER command. I believe this can all be looked at again. Cheers, Tommy
Attachment
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
jian he
Date:
errmsg("operator attribute \"negator\" cannot be changed if it has already been set"))); I feel like the above message is not very helpful. Something like the following may be more helpful for diagnosis. errmsg("operator %s's attribute \"negator\" cannot be changed if it has already been set", operatorname))); when I "git apply", I've noticed some minor whitespace warning. Other than that, it looks fine.
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
jian he <jian.universality@gmail.com> writes: > errmsg("operator attribute \"negator\" cannot be changed if it has > already been set"))); > I feel like the above message is not very helpful. I think it's okay to be concise about this as long as the operator we're referring to is the target of the ALTER. I agree that when we're complaining about some *other* operator, we'd better spell out which one we mean, and I made some changes to the patch to improve that. Pushed after a round of editorialization -- mostly cosmetic stuff, except for tweaking some error messages. I shortened the test cases a bit too, as I thought they were somewhat excessive to have as a permanent thing. regards, tom lane
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Christoph Berg
Date:
Re: Tommy Pavlicek > I've added another patch (0002-require_unused_neg_com-v1.patch) that > prevents using a commutator or negator that's already part of a pair. Hmm. I agree with the general idea of adding sanity checks, but this might be overzealous: This change is breaking pgsphere which has <@ @> operator pairs, but for historical reasons also includes alternative spellings of these operators (both called @ with swapped operand types) which now explodes because we can't add them with the "proper" commutator and negators declared (which point to the canonical <@ @> !<@ !@> operators). https://github.com/postgrespro/pgsphere/blob/master/pgs_moc_compat.sql.in We might be able to simply delete the @ operators, but doesn't this new check break the general possibility to have more than one spelling for the same operator? Christoph
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
Christoph Berg <myon@debian.org> writes: > This change is breaking pgsphere which has <@ @> operator pairs, but > for historical reasons also includes alternative spellings of these > operators (both called @ with swapped operand types) which now > explodes because we can't add them with the "proper" commutator and > negators declared (which point to the canonical <@ @> !<@ !@> > operators). Should have guessed that somebody might be depending on the previous squishy behavior. Still, I can't see how the above situation is a good idea. Commutators/negators should come in pairs, not have completely random links. I think it's only accidental that this setup isn't triggering other strange behavior. > We might be able to simply delete the @ operators, but doesn't this > new check break the general possibility to have more than one spelling > for the same operator? You can have more than one operator atop the same function. But why didn't you make the @ operators commutators of each other, rather than this mess? regards, tom lane
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Christoph Berg
Date:
Re: Tom Lane > > We might be able to simply delete the @ operators, but doesn't this > > new check break the general possibility to have more than one spelling > > for the same operator? > > You can have more than one operator atop the same function. > But why didn't you make the @ operators commutators of each other, > rather than this mess? Historical something. You are right that the commutators could be fixed that way, but the negators are a different question. There is no legacy spelling for these. Anyway, if this doesn't raise any "oh we didn't think of this" concerns, we'll just remove the old operators in pgsphere. Christoph
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
Christoph Berg <myon@debian.org> writes: > Anyway, if this doesn't raise any "oh we didn't think of this" > concerns, we'll just remove the old operators in pgsphere. Well, the idea was exactly to forbid that sort of setup. However, if we get sufficient pushback maybe we should reconsider --- for example, maybe it'd be sane to enforce the restriction in ALTER but not CREATE? I'm inclined to wait and see if there are more complaints. regards, tom lane
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tommy Pavlicek
Date:
On Fri, Oct 20, 2023 at 5:33 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Pushed after a round of editorialization -- mostly cosmetic > stuff, except for tweaking some error messages. I shortened the > test cases a bit too, as I thought they were somewhat excessive > to have as a permanent thing. Thanks Tom. On Tue, Oct 24, 2023 at 2:51 PM Christoph Berg <myon@debian.org> wrote: > > Re: Tommy Pavlicek > > I've added another patch (0002-require_unused_neg_com-v1.patch) that > > prevents using a commutator or negator that's already part of a pair. > > Hmm. I agree with the general idea of adding sanity checks, but this > might be overzealous: I can't add much beyond what Tom said, but I think this does go a bit beyond a sanity check. I forgot to mention it in my previous message, but the main reason I noticed this was because the DELETE operator code cleans up commutator and negator links to the operator being deleted and that code expects each to be part of exactly a pair.
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Christoph Berg
Date:
Re: Tom Lane > Well, the idea was exactly to forbid that sort of setup. Fwiw, pgsphere has remove the problematic operators now: https://github.com/postgrespro/pgsphere/commit/e810f5ddd827881b06a92a303c5c9fbf997b892e > However, if we get sufficient pushback maybe we should > reconsider --- for example, maybe it'd be sane to enforce > the restriction in ALTER but not CREATE? Hmm, that seems backwards, I'd expect that CREATE might have some checks that could circumvent using ALTER if I really insisted. If CREATE can create things that I can't reach by ALTERing existing other things, that's weird. Let's keep it like it is now in PG17. Christoph
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Christoph Berg
Date:
Re: To Tom Lane > Let's keep it like it is now in PG17. Late followup news: This feature has actually found a bug in postgresql-debversion: CREATE OPERATOR > ( LEFTARG = debversion, RIGHTARG = debversion, COMMUTATOR = <, - NEGATOR = >=, + NEGATOR = <=, RESTRICT = scalargtsel, JOIN = scalargtjoinsel ); https://salsa.debian.org/postgresql/postgresql-debversion/-/commit/8ef08ccbea1438468249b0e94048b1a8a25fc625#000e84a71f8a28b762658375c194b25d529336f3 So, thanks! Christoph
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Michael Banck
Date:
Hi, On Tue, Oct 24, 2023 at 11:42:15AM -0400, Tom Lane wrote: > Christoph Berg <myon@debian.org> writes: > > Anyway, if this doesn't raise any "oh we didn't think of this" > > concerns, we'll just remove the old operators in pgsphere. > > Well, the idea was exactly to forbid that sort of setup. > However, if we get sufficient pushback maybe we should > reconsider --- for example, maybe it'd be sane to enforce > the restriction in ALTER but not CREATE? > > I'm inclined to wait and see if there are more complaints. FWIW, rdkit also fails, but that seems to be an ancient thing as well: https://github.com/rdkit/rdkit/issues/7843 I guess there's no way to make that error a bit more helpful, like printing out the offenbding SQL command, presumably because we are loding an extension? Michael
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Christoph Berg
Date:
Re: Michael Banck > I guess there's no way to make that error a bit more helpful, like > printing out the offenbding SQL command, presumably because we are > loding an extension? I wish there was. The error reporting from failing extension scripts is really bad with no context at all, it has hit me a few times in the past already. Christoph
Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges
From
Tom Lane
Date:
Christoph Berg <myon@debian.org> writes: > I wish there was. The error reporting from failing extension scripts > is really bad with no context at all, it has hit me a few times in the > past already. Nobody's spent any work on that :-(. A really basic reporting facility is not hard to add, as in the attached finger exercise. The trouble with it can be explained by showing what I get after intentionally breaking a script file command: regression=# create extension cube; ERROR: syntax error at or near "CREAT" LINE 16: CREAT FUNCTION cube_send(cube) ^ QUERY: /* contrib/cube/cube--1.4--1.5.sql */ -- complain if script is sourced in psql, rather than via ALTER EXTENSION -- Remove @ and ~ DROP OPERATOR @ (cube, cube); DROP OPERATOR ~ (cube, cube); -- Add binary input/output handlers CREATE FUNCTION cube_recv(internal) RETURNS cube AS '$libdir/cube' LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; CREAT FUNCTION cube_send(cube) RETURNS bytea AS '$libdir/cube' LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; ALTER TYPE cube SET ( RECEIVE = cube_recv, SEND = cube_send ); CONTEXT: extension script file "/home/postgres/install/share/extension/cube--1.4--1.5.sql" So the first part of that is great, but if your script file is large you probably won't be happy about having the whole thing repeated in the "QUERY" field. So this needs some work on user-friendliness. I'm inclined to think that maybe we'd be best off keeping the server end of it straightforward, and trying to teach psql to abbreviate the QUERY field in a useful way. IIRC you get this same type of problem with very large SQL-language functions and suchlike. Also, I believe this doesn't help much for non-syntax errors (those that aren't reported with an error location). It might be interesting to use the RawStmt.stmt_location/stmt_len fields for the current parsetree to identify what to report, but I've not dug further than this. regards, tom lane diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..6af72ff422 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -107,6 +107,13 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* Info structure for script_error_callback() */ +typedef struct +{ + const char *sql; + const char *filename; +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +677,32 @@ read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + int syntaxerrposition; + + /* If it's a syntax error, convert to internal syntax error report */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->sql); + } + + errcontext("extension script file \"%s\"", callback_arg->filename); +} + /* * Execute given SQL string. * + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +712,25 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. */ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + error_context_stack = &scripterrcontext; + /* * Parse the SQL string into a list of raw parse trees. */ @@ -780,6 +823,8 @@ execute_sql_string(const char *sql) /* Be sure to advance the command counter after the last script command */ CommandCounterIncrement(); + + error_context_stack = scripterrcontext.previous; } /* @@ -1054,7 +1099,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control, /* And now back to C string */ c_sql = text_to_cstring(DatumGetTextPP(t_sql)); - execute_sql_string(c_sql); + execute_sql_string(c_sql, filename); } PG_FINALLY(); {
Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Christoph Berg
Date:
Re: Tom Lane > So the first part of that is great, but if your script file is > large you probably won't be happy about having the whole thing > repeated in the "QUERY" field. So this needs some work on > user-friendliness. Does this really have to be addressed? It would be way better than it is now, and errors during extension creation are rare and mostly for developers only, so it doesn't have to be pretty. > I'm inclined to think that maybe we'd be best off keeping the server > end of it straightforward, and trying to teach psql to abbreviate the > QUERY field in a useful way. IIRC you get this same type of problem > with very large SQL-language functions and suchlike. I'd treat this as a separate patch, if it's considered to be a good idea. Christoph
Christoph Berg <myon@debian.org> writes: > Re: Tom Lane >> So the first part of that is great, but if your script file is >> large you probably won't be happy about having the whole thing >> repeated in the "QUERY" field. So this needs some work on >> user-friendliness. > Does this really have to be addressed? It would be way better than it > is now, and errors during extension creation are rare and mostly for > developers only, so it doesn't have to be pretty. Perhaps. I spent a little more effort on this and added code to report errors that don't come with an error location. On those, we don't have any constraints about what to report in the QUERY field, so I made it trim the string to just the current query within the script, which makes things quite a bit better. You can see the results in the test_extensions regression test changes. (It might be worth some effort to trim away comments appearing just before a command, but I didn't tackle that here.) regards, tom lane diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..0fae1332d2 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -54,6 +54,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "nodes/queryjumble.h" #include "storage/fd.h" #include "tcop/utility.h" #include "utils/acl.h" @@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* + * Information for script_error_callback() + */ +typedef struct +{ + const char *sql; /* entire script file contents */ + const char *filename; /* script file pathname */ + ParseLoc stmt_location; /* current stmt start loc, or -1 if unknown */ + ParseLoc stmt_len; /* length in bytes; 0 means "rest of string" */ +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +682,60 @@ read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + int syntaxerrposition; + const char *lastslash; + + /* If it's a syntax error, convert to internal syntax error report */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + /* + * We must report the whole string because otherwise details such as + * psql's line number report would be wrong. + */ + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->sql); + } + else if (callback_arg->stmt_location >= 0) + { + /* + * Since no syntax cursor will be shown, it's okay and helpful to trim + * the reported query string to just the current statement. + */ + const char *query = callback_arg->sql; + int location = callback_arg->stmt_location; + int len = callback_arg->stmt_len; + + query = CleanQuerytext(query, &location, &len); + internalerrquery(pnstrdup(query, len)); + } + + /* + * Trim the reported file name to remove the path. We know that + * get_extension_script_filename() inserted a '/', regardless of whether + * we're on Windows. + */ + lastslash = strrchr(callback_arg->filename, '/'); + if (lastslash) + lastslash++; + else + lastslash = callback_arg->filename; /* shouldn't happen, but cope */ + errcontext("extension script file \"%s\"", lastslash); +} + /* * Execute given SQL string. * + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +745,27 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. */ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + callback_arg.stmt_location = -1; + callback_arg.stmt_len = -1; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + error_context_stack = &scripterrcontext; + /* * Parse the SQL string into a list of raw parse trees. */ @@ -709,6 +787,10 @@ execute_sql_string(const char *sql) List *stmt_list; ListCell *lc2; + /* Report location of this query for error context callback */ + callback_arg.stmt_location = parsetree->stmt_location; + callback_arg.stmt_len = parsetree->stmt_len; + /* * We do the work for each parsetree in a short-lived context, to * limit the memory used when there are many commands in the string. @@ -778,6 +860,8 @@ execute_sql_string(const char *sql) MemoryContextDelete(per_parsetree_context); } + error_context_stack = scripterrcontext.previous; + /* Be sure to advance the command counter after the last script command */ CommandCounterIncrement(); } @@ -1054,7 +1138,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control, /* And now back to C string */ c_sql = text_to_cstring(DatumGetTextPP(t_sql)); - execute_sql_string(c_sql); + execute_sql_string(c_sql, filename); } PG_FINALLY(); { diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index f357cc21aa..b6370b3b4c 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -295,6 +295,16 @@ CREATE FUNCTION ext_cor_func() RETURNS text CREATE EXTENSION test_ext_cor; -- fail ERROR: function ext_cor_func() is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +QUERY: /* src/test/modules/test_extensions/test_ext_cor--1.0.sql */ +-- complain if script is sourced in psql, rather than via CREATE EXTENSION + + +-- It's generally bad style to use CREATE OR REPLACE unnecessarily. +-- Test what happens if an extension does it anyway. + +CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text + AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql +CONTEXT: extension script file "test_ext_cor--1.0.sql" SELECT ext_cor_func(); ext_cor_func ------------------------ @@ -307,6 +317,9 @@ CREATE VIEW ext_cor_view AS CREATE EXTENSION test_ext_cor; -- fail ERROR: view ext_cor_view is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +QUERY: CREATE OR REPLACE VIEW ext_cor_view AS + SELECT 'ext_cor_view: from extension'::text AS col +CONTEXT: extension script file "test_ext_cor--1.0.sql" SELECT ext_cor_func(); ERROR: function ext_cor_func() does not exist LINE 1: SELECT ext_cor_func(); @@ -323,6 +336,11 @@ CREATE TYPE test_ext_type; CREATE EXTENSION test_ext_cor; -- fail ERROR: type test_ext_type is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +QUERY: -- These are for testing replacement of a shell type/operator, which works +-- enough like an implicit OR REPLACE to be important to check. + +CREATE TYPE test_ext_type AS ENUM('x', 'y') +CONTEXT: extension script file "test_ext_cor--1.0.sql" DROP TYPE test_ext_type; -- this makes a shell "point <<@@ polygon" operator too CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt, @@ -331,6 +349,9 @@ CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt, CREATE EXTENSION test_ext_cor; -- fail ERROR: operator <<@@(point,polygon) is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +QUERY: CREATE OPERATOR <<@@ ( PROCEDURE = pt_contained_poly, + LEFTARG = point, RIGHTARG = polygon ) +CONTEXT: extension script file "test_ext_cor--1.0.sql" DROP OPERATOR <<@@ (point, polygon); CREATE EXTENSION test_ext_cor; -- now it should work SELECT ext_cor_func(); @@ -379,37 +400,61 @@ CREATE COLLATION ext_cine_coll CREATE EXTENSION test_ext_cine; -- fail ERROR: collation ext_cine_coll is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: /* src/test/modules/test_extensions/test_ext_cine--1.0.sql */ +-- complain if script is sourced in psql, rather than via CREATE EXTENSION + + +-- +-- CREATE IF NOT EXISTS is an entirely unsound thing for an extension +-- to be doing, but let's at least plug the major security hole in it. +-- + +CREATE COLLATION IF NOT EXISTS ext_cine_coll + ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" ) +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP COLLATION ext_cine_coll; CREATE MATERIALIZED VIEW ext_cine_mv AS SELECT 11 AS f1; CREATE EXTENSION test_ext_cine; -- fail ERROR: materialized view ext_cine_mv is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE MATERIALIZED VIEW IF NOT EXISTS ext_cine_mv AS SELECT 42 AS f1 +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP MATERIALIZED VIEW ext_cine_mv; CREATE FOREIGN DATA WRAPPER dummy; CREATE SERVER ext_cine_srv FOREIGN DATA WRAPPER dummy; CREATE EXTENSION test_ext_cine; -- fail ERROR: server ext_cine_srv is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE SERVER IF NOT EXISTS ext_cine_srv FOREIGN DATA WRAPPER ext_cine_fdw +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP SERVER ext_cine_srv; CREATE SCHEMA ext_cine_schema; CREATE EXTENSION test_ext_cine; -- fail ERROR: schema ext_cine_schema is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE SCHEMA IF NOT EXISTS ext_cine_schema +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP SCHEMA ext_cine_schema; CREATE SEQUENCE ext_cine_seq; CREATE EXTENSION test_ext_cine; -- fail ERROR: sequence ext_cine_seq is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE SEQUENCE IF NOT EXISTS ext_cine_seq +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP SEQUENCE ext_cine_seq; CREATE TABLE ext_cine_tab1 (x int); CREATE EXTENSION test_ext_cine; -- fail ERROR: table ext_cine_tab1 is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE TABLE IF NOT EXISTS ext_cine_tab1 (x int) +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP TABLE ext_cine_tab1; CREATE TABLE ext_cine_tab2 AS SELECT 42 AS y; CREATE EXTENSION test_ext_cine; -- fail ERROR: table ext_cine_tab2 is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE TABLE IF NOT EXISTS ext_cine_tab2 AS SELECT 42 AS y +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP TABLE ext_cine_tab2; CREATE EXTENSION test_ext_cine; \dx+ test_ext_cine diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index b6135f0347..9620d3e9d2 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -3882,6 +3882,7 @@ saophash_hash save_buffer scram_state scram_state_enum +script_error_callback_arg security_class_t sem_t sepgsql_context_info_t
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Christoph Berg
Date:
Re: Tom Lane > Perhaps. I spent a little more effort on this and added code to > report errors that don't come with an error location. On those, > we don't have any constraints about what to report in the QUERY > field, so I made it trim the string to just the current query > within the script, which makes things quite a bit better. You > can see the results in the test_extensions regression test changes. That looks very good me, thanks! > (It might be worth some effort to trim away comments appearing > just before a command, but I didn't tackle that here.) The "error when psql" comments do look confusing, but I guess in other places the comment just before the query adds valuable context, so I'd say leaving the comments in is ok. Christoph
Christoph Berg <myon@debian.org> writes: > Re: Tom Lane >> (It might be worth some effort to trim away comments appearing >> just before a command, but I didn't tackle that here.) > The "error when psql" comments do look confusing, but I guess in other > places the comment just before the query adds valuable context, so I'd > say leaving the comments in is ok. It looks like if we did want to suppress that, the right fix is to make gram.y track statement start locations more honestly, as in 0002 attached (0001 is the same as before). This'd add a few cycles per grammar nonterminal reduction, which is kind of annoying but probably is negligible in the grand scheme of things. Still, I'd not propose it just for this. But if memory serves, we've had previous complaints about pg_stat_statements failing to strip leading comments from queries, and this'd fix that. I think it likely also improves error cursor positioning for cases involving empty productions --- I'm a tad surprised that no other regression cases changed. regards, tom lane diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..0fae1332d2 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -54,6 +54,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "nodes/queryjumble.h" #include "storage/fd.h" #include "tcop/utility.h" #include "utils/acl.h" @@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* + * Information for script_error_callback() + */ +typedef struct +{ + const char *sql; /* entire script file contents */ + const char *filename; /* script file pathname */ + ParseLoc stmt_location; /* current stmt start loc, or -1 if unknown */ + ParseLoc stmt_len; /* length in bytes; 0 means "rest of string" */ +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +682,60 @@ read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + int syntaxerrposition; + const char *lastslash; + + /* If it's a syntax error, convert to internal syntax error report */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + /* + * We must report the whole string because otherwise details such as + * psql's line number report would be wrong. + */ + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->sql); + } + else if (callback_arg->stmt_location >= 0) + { + /* + * Since no syntax cursor will be shown, it's okay and helpful to trim + * the reported query string to just the current statement. + */ + const char *query = callback_arg->sql; + int location = callback_arg->stmt_location; + int len = callback_arg->stmt_len; + + query = CleanQuerytext(query, &location, &len); + internalerrquery(pnstrdup(query, len)); + } + + /* + * Trim the reported file name to remove the path. We know that + * get_extension_script_filename() inserted a '/', regardless of whether + * we're on Windows. + */ + lastslash = strrchr(callback_arg->filename, '/'); + if (lastslash) + lastslash++; + else + lastslash = callback_arg->filename; /* shouldn't happen, but cope */ + errcontext("extension script file \"%s\"", lastslash); +} + /* * Execute given SQL string. * + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +745,27 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. */ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + callback_arg.stmt_location = -1; + callback_arg.stmt_len = -1; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + error_context_stack = &scripterrcontext; + /* * Parse the SQL string into a list of raw parse trees. */ @@ -709,6 +787,10 @@ execute_sql_string(const char *sql) List *stmt_list; ListCell *lc2; + /* Report location of this query for error context callback */ + callback_arg.stmt_location = parsetree->stmt_location; + callback_arg.stmt_len = parsetree->stmt_len; + /* * We do the work for each parsetree in a short-lived context, to * limit the memory used when there are many commands in the string. @@ -778,6 +860,8 @@ execute_sql_string(const char *sql) MemoryContextDelete(per_parsetree_context); } + error_context_stack = scripterrcontext.previous; + /* Be sure to advance the command counter after the last script command */ CommandCounterIncrement(); } @@ -1054,7 +1138,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control, /* And now back to C string */ c_sql = text_to_cstring(DatumGetTextPP(t_sql)); - execute_sql_string(c_sql); + execute_sql_string(c_sql, filename); } PG_FINALLY(); { diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index f357cc21aa..b6370b3b4c 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -295,6 +295,16 @@ CREATE FUNCTION ext_cor_func() RETURNS text CREATE EXTENSION test_ext_cor; -- fail ERROR: function ext_cor_func() is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +QUERY: /* src/test/modules/test_extensions/test_ext_cor--1.0.sql */ +-- complain if script is sourced in psql, rather than via CREATE EXTENSION + + +-- It's generally bad style to use CREATE OR REPLACE unnecessarily. +-- Test what happens if an extension does it anyway. + +CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text + AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql +CONTEXT: extension script file "test_ext_cor--1.0.sql" SELECT ext_cor_func(); ext_cor_func ------------------------ @@ -307,6 +317,9 @@ CREATE VIEW ext_cor_view AS CREATE EXTENSION test_ext_cor; -- fail ERROR: view ext_cor_view is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +QUERY: CREATE OR REPLACE VIEW ext_cor_view AS + SELECT 'ext_cor_view: from extension'::text AS col +CONTEXT: extension script file "test_ext_cor--1.0.sql" SELECT ext_cor_func(); ERROR: function ext_cor_func() does not exist LINE 1: SELECT ext_cor_func(); @@ -323,6 +336,11 @@ CREATE TYPE test_ext_type; CREATE EXTENSION test_ext_cor; -- fail ERROR: type test_ext_type is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +QUERY: -- These are for testing replacement of a shell type/operator, which works +-- enough like an implicit OR REPLACE to be important to check. + +CREATE TYPE test_ext_type AS ENUM('x', 'y') +CONTEXT: extension script file "test_ext_cor--1.0.sql" DROP TYPE test_ext_type; -- this makes a shell "point <<@@ polygon" operator too CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt, @@ -331,6 +349,9 @@ CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt, CREATE EXTENSION test_ext_cor; -- fail ERROR: operator <<@@(point,polygon) is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +QUERY: CREATE OPERATOR <<@@ ( PROCEDURE = pt_contained_poly, + LEFTARG = point, RIGHTARG = polygon ) +CONTEXT: extension script file "test_ext_cor--1.0.sql" DROP OPERATOR <<@@ (point, polygon); CREATE EXTENSION test_ext_cor; -- now it should work SELECT ext_cor_func(); @@ -379,37 +400,61 @@ CREATE COLLATION ext_cine_coll CREATE EXTENSION test_ext_cine; -- fail ERROR: collation ext_cine_coll is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: /* src/test/modules/test_extensions/test_ext_cine--1.0.sql */ +-- complain if script is sourced in psql, rather than via CREATE EXTENSION + + +-- +-- CREATE IF NOT EXISTS is an entirely unsound thing for an extension +-- to be doing, but let's at least plug the major security hole in it. +-- + +CREATE COLLATION IF NOT EXISTS ext_cine_coll + ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" ) +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP COLLATION ext_cine_coll; CREATE MATERIALIZED VIEW ext_cine_mv AS SELECT 11 AS f1; CREATE EXTENSION test_ext_cine; -- fail ERROR: materialized view ext_cine_mv is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE MATERIALIZED VIEW IF NOT EXISTS ext_cine_mv AS SELECT 42 AS f1 +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP MATERIALIZED VIEW ext_cine_mv; CREATE FOREIGN DATA WRAPPER dummy; CREATE SERVER ext_cine_srv FOREIGN DATA WRAPPER dummy; CREATE EXTENSION test_ext_cine; -- fail ERROR: server ext_cine_srv is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE SERVER IF NOT EXISTS ext_cine_srv FOREIGN DATA WRAPPER ext_cine_fdw +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP SERVER ext_cine_srv; CREATE SCHEMA ext_cine_schema; CREATE EXTENSION test_ext_cine; -- fail ERROR: schema ext_cine_schema is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE SCHEMA IF NOT EXISTS ext_cine_schema +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP SCHEMA ext_cine_schema; CREATE SEQUENCE ext_cine_seq; CREATE EXTENSION test_ext_cine; -- fail ERROR: sequence ext_cine_seq is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE SEQUENCE IF NOT EXISTS ext_cine_seq +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP SEQUENCE ext_cine_seq; CREATE TABLE ext_cine_tab1 (x int); CREATE EXTENSION test_ext_cine; -- fail ERROR: table ext_cine_tab1 is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE TABLE IF NOT EXISTS ext_cine_tab1 (x int) +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP TABLE ext_cine_tab1; CREATE TABLE ext_cine_tab2 AS SELECT 42 AS y; CREATE EXTENSION test_ext_cine; -- fail ERROR: table ext_cine_tab2 is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +QUERY: CREATE TABLE IF NOT EXISTS ext_cine_tab2 AS SELECT 42 AS y +CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP TABLE ext_cine_tab2; CREATE EXTENSION test_ext_cine; \dx+ test_ext_cine diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index b6135f0347..9620d3e9d2 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -3882,6 +3882,7 @@ saophash_hash save_buffer scram_state scram_state_enum +script_error_callback_arg security_class_t sem_t sepgsql_context_info_t diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index b1d4642c59..4d5f197a1c 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -67,39 +67,23 @@ /* - * Location tracking support --- simpler than bison's default, since we only - * want to track the start position not the end position of each nonterminal. + * Location tracking support. Unlike bison's default, we only want + * to track the start position not the end position of each nonterminal. + * Nonterminals that reduce to empty receive position "-1". */ #define YYLLOC_DEFAULT(Current, Rhs, N) \ do { \ - if ((N) > 0) \ - (Current) = (Rhs)[1]; \ - else \ - (Current) = (-1); \ + (Current) = (-1); \ + for (int _i_ = 1; _i_ <= (N); _i_++) \ + { \ + if ((Rhs)[_i_] >= 0) \ + { \ + (Current) = (Rhs)[_i_]; \ + break; \ + } \ + } \ } while (0) -/* - * The above macro assigns -1 (unknown) as the parse location of any - * nonterminal that was reduced from an empty rule, or whose leftmost - * component was reduced from an empty rule. This is problematic - * for nonterminals defined like - * OptFooList: / * EMPTY * / { ... } | OptFooList Foo { ... } ; - * because we'll set -1 as the location during the first reduction and then - * copy it during each subsequent reduction, leaving us with -1 for the - * location even when the list is not empty. To fix that, do this in the - * action for the nonempty rule(s): - * if (@$ < 0) @$ = @2; - * (Although we have many nonterminals that follow this pattern, we only - * bother with fixing @$ like this when the nonterminal's parse location - * is actually referenced in some rule.) - * - * A cleaner answer would be to make YYLLOC_DEFAULT scan all the Rhs - * locations until it's found one that's not -1. Then we'd get a correct - * location for any nonterminal that isn't entirely empty. But this way - * would add overhead to every rule reduction, and so far there's not been - * a compelling reason to pay that overhead. - */ - /* * Bison doesn't allocate anything that needs to live across parser calls, * so we can easily have it use palloc instead of malloc. This prevents @@ -930,7 +914,7 @@ parse_toplevel: | MODE_PLPGSQL_EXPR PLpgSQL_Expr { pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt($2, 0)); + list_make1(makeRawStmt($2, @2)); } | MODE_PLPGSQL_ASSIGN1 PLAssignStmt { @@ -938,7 +922,7 @@ parse_toplevel: n->nnames = 1; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } | MODE_PLPGSQL_ASSIGN2 PLAssignStmt { @@ -946,7 +930,7 @@ parse_toplevel: n->nnames = 2; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } | MODE_PLPGSQL_ASSIGN3 PLAssignStmt { @@ -954,18 +938,13 @@ parse_toplevel: n->nnames = 3; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } ; /* * At top level, we wrap each stmt with a RawStmt node carrying start location - * and length of the stmt's text. Notice that the start loc/len are driven - * entirely from semicolon locations (@2). It would seem natural to use - * @1 or @3 to get the true start location of a stmt, but that doesn't work - * for statements that can start with empty nonterminals (opt_with_clause is - * the main offender here); as noted in the comments for YYLLOC_DEFAULT, - * we'd get -1 for the location in such cases. + * and length of the stmt's text. * We also take care to discard empty statements entirely. */ stmtmulti: stmtmulti ';' toplevel_stmt @@ -976,14 +955,14 @@ stmtmulti: stmtmulti ';' toplevel_stmt updateRawStmtEnd(llast_node(RawStmt, $1), @2); } if ($3 != NULL) - $$ = lappend($1, makeRawStmt($3, @2 + 1)); + $$ = lappend($1, makeRawStmt($3, @3)); else $$ = $1; } | toplevel_stmt { if ($1 != NULL) - $$ = list_make1(makeRawStmt($1, 0)); + $$ = list_make1(makeRawStmt($1, @1)); else $$ = NIL; } @@ -1584,8 +1563,6 @@ CreateSchemaStmt: OptSchemaEltList: OptSchemaEltList schema_stmt { - if (@$ < 0) /* see comments for YYLLOC_DEFAULT */ - @$ = @2; $$ = lappend($1, $2); } | /* EMPTY */ diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index b6370b3b4c..f77760d58e 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -295,14 +295,7 @@ CREATE FUNCTION ext_cor_func() RETURNS text CREATE EXTENSION test_ext_cor; -- fail ERROR: function ext_cor_func() is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. -QUERY: /* src/test/modules/test_extensions/test_ext_cor--1.0.sql */ --- complain if script is sourced in psql, rather than via CREATE EXTENSION - - --- It's generally bad style to use CREATE OR REPLACE unnecessarily. --- Test what happens if an extension does it anyway. - -CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text +QUERY: CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql CONTEXT: extension script file "test_ext_cor--1.0.sql" SELECT ext_cor_func(); @@ -336,10 +329,7 @@ CREATE TYPE test_ext_type; CREATE EXTENSION test_ext_cor; -- fail ERROR: type test_ext_type is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. -QUERY: -- These are for testing replacement of a shell type/operator, which works --- enough like an implicit OR REPLACE to be important to check. - -CREATE TYPE test_ext_type AS ENUM('x', 'y') +QUERY: CREATE TYPE test_ext_type AS ENUM('x', 'y') CONTEXT: extension script file "test_ext_cor--1.0.sql" DROP TYPE test_ext_type; -- this makes a shell "point <<@@ polygon" operator too @@ -400,16 +390,7 @@ CREATE COLLATION ext_cine_coll CREATE EXTENSION test_ext_cine; -- fail ERROR: collation ext_cine_coll is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. -QUERY: /* src/test/modules/test_extensions/test_ext_cine--1.0.sql */ --- complain if script is sourced in psql, rather than via CREATE EXTENSION - - --- --- CREATE IF NOT EXISTS is an entirely unsound thing for an extension --- to be doing, but let's at least plug the major security hole in it. --- - -CREATE COLLATION IF NOT EXISTS ext_cine_coll +QUERY: CREATE COLLATION IF NOT EXISTS ext_cine_coll ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" ) CONTEXT: extension script file "test_ext_cine--1.0.sql" DROP COLLATION ext_cine_coll;
I wrote: > It looks like if we did want to suppress that, the right fix is to > make gram.y track statement start locations more honestly, as in > 0002 attached (0001 is the same as before). I did a little bit of further work on this: * I ran some performance checks and convinced myself that indeed the more complex definition of YYLLOC_DEFAULT has no measurable cost compared to the overall runtime of raw_parser(), never mind later processing. So I see no reason not to go ahead with that change. I swapped the two patches to make that 0001, and added a regression test illustrating its effect on pg_stat_statements. (Without the gram.y change, the added slash-star comment shows up in the pg_stat_statements output, which is surely odd.) * I concluded that the best way to report the individual statement when we're able to do that is to include it in an errcontext() message, similar to what spi.c's _SPI_error_callback does. Otherwise it interacts badly if some more-tightly-nested error context function has already set up an "internal error query", as for example SQL function errors will do if you enable check_function_bodies = on. So here's v3, now with commit messages and regression tests. I feel like this is out of the proof-of-concept stage and might now actually be committable. There's still a question of whether reporting the whole script as the query is OK when we have a syntax error, but I have no good ideas as to how to make that terser. And I think you're right that we shouldn't let perfection stand in the way of making things noticeably better. regards, tom lane From f6022f120c6c8e16f33a7b7d220d0f3a8b7ebf4e Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 28 Sep 2024 13:48:39 -0400 Subject: [PATCH v3 1/2] Improve parser's reporting of statement start locations. Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de --- .../pg_stat_statements/expected/select.out | 5 +- contrib/pg_stat_statements/sql/select.sql | 3 +- src/backend/nodes/queryjumblefuncs.c | 6 ++ src/backend/parser/gram.y | 66 +++++++------------ 4 files changed, 34 insertions(+), 46 deletions(-) diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out index dd6c756f67..e0e2fa265c 100644 --- a/contrib/pg_stat_statements/expected/select.out +++ b/contrib/pg_stat_statements/expected/select.out @@ -19,8 +19,9 @@ SELECT 1 AS "int"; 1 (1 row) +/* this comment should not appear in the output */ SELECT 'hello' - -- multiline + -- but this one will appear AS "text"; text ------- @@ -129,7 +130,7 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -------+------+------------------------------------------------------------------------------ 1 | 1 | PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3 4 | 4 | SELECT $1 + - | | -- multiline + + | | -- but this one will appear + | | AS "text" 2 | 2 | SELECT $1 + $2 3 | 3 | SELECT $1 + $2 + $3 AS "add" diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql index eb45cb81ad..e0be58d5e2 100644 --- a/contrib/pg_stat_statements/sql/select.sql +++ b/contrib/pg_stat_statements/sql/select.sql @@ -12,8 +12,9 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t; -- SELECT 1 AS "int"; +/* this comment should not appear in the output */ SELECT 'hello' - -- multiline + -- but this one will appear AS "text"; SELECT 'world' AS "text"; diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c index 129fb44709..f019f6388d 100644 --- a/src/backend/nodes/queryjumblefuncs.c +++ b/src/backend/nodes/queryjumblefuncs.c @@ -89,6 +89,12 @@ CleanQuerytext(const char *query, int *location, int *len) /* * Discard leading and trailing whitespace, too. Use scanner_isspace() * not libc's isspace(), because we want to match the lexer's behavior. + * + * Note: the parser now strips leading comments and whitespace from the + * reported stmt_location, so this first loop will only iterate in the + * unusual case that the location didn't propagate to here. But the + * statement length will extend to the end-of-string or terminating + * semicolon, so the second loop often does something useful. */ while (query_len > 0 && scanner_isspace(query[0])) query++, query_location++, query_len--; diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index b1d4642c59..c0faacfac0 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -67,39 +67,25 @@ /* - * Location tracking support --- simpler than bison's default, since we only - * want to track the start position not the end position of each nonterminal. + * Location tracking support. Unlike bison's default, we only want + * to track the start position not the end position of each nonterminal. + * Nonterminals that reduce to empty receive position "-1". Since a + * production's leading RHS nonterminal(s) may have reduced to empty, + * we have to scan to find the first one that's not -1. */ #define YYLLOC_DEFAULT(Current, Rhs, N) \ do { \ - if ((N) > 0) \ - (Current) = (Rhs)[1]; \ - else \ - (Current) = (-1); \ + (Current) = (-1); \ + for (int _i = 1; _i <= (N); _i++) \ + { \ + if ((Rhs)[_i] >= 0) \ + { \ + (Current) = (Rhs)[_i]; \ + break; \ + } \ + } \ } while (0) -/* - * The above macro assigns -1 (unknown) as the parse location of any - * nonterminal that was reduced from an empty rule, or whose leftmost - * component was reduced from an empty rule. This is problematic - * for nonterminals defined like - * OptFooList: / * EMPTY * / { ... } | OptFooList Foo { ... } ; - * because we'll set -1 as the location during the first reduction and then - * copy it during each subsequent reduction, leaving us with -1 for the - * location even when the list is not empty. To fix that, do this in the - * action for the nonempty rule(s): - * if (@$ < 0) @$ = @2; - * (Although we have many nonterminals that follow this pattern, we only - * bother with fixing @$ like this when the nonterminal's parse location - * is actually referenced in some rule.) - * - * A cleaner answer would be to make YYLLOC_DEFAULT scan all the Rhs - * locations until it's found one that's not -1. Then we'd get a correct - * location for any nonterminal that isn't entirely empty. But this way - * would add overhead to every rule reduction, and so far there's not been - * a compelling reason to pay that overhead. - */ - /* * Bison doesn't allocate anything that needs to live across parser calls, * so we can easily have it use palloc instead of malloc. This prevents @@ -930,7 +916,7 @@ parse_toplevel: | MODE_PLPGSQL_EXPR PLpgSQL_Expr { pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt($2, 0)); + list_make1(makeRawStmt($2, @2)); } | MODE_PLPGSQL_ASSIGN1 PLAssignStmt { @@ -938,7 +924,7 @@ parse_toplevel: n->nnames = 1; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } | MODE_PLPGSQL_ASSIGN2 PLAssignStmt { @@ -946,7 +932,7 @@ parse_toplevel: n->nnames = 2; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } | MODE_PLPGSQL_ASSIGN3 PLAssignStmt { @@ -954,19 +940,15 @@ parse_toplevel: n->nnames = 3; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } ; /* * At top level, we wrap each stmt with a RawStmt node carrying start location - * and length of the stmt's text. Notice that the start loc/len are driven - * entirely from semicolon locations (@2). It would seem natural to use - * @1 or @3 to get the true start location of a stmt, but that doesn't work - * for statements that can start with empty nonterminals (opt_with_clause is - * the main offender here); as noted in the comments for YYLLOC_DEFAULT, - * we'd get -1 for the location in such cases. - * We also take care to discard empty statements entirely. + * and length of the stmt's text. + * We also take care to discard empty statements entirely (which among other + * things dodges the problem of assigning them a location). */ stmtmulti: stmtmulti ';' toplevel_stmt { @@ -976,14 +958,14 @@ stmtmulti: stmtmulti ';' toplevel_stmt updateRawStmtEnd(llast_node(RawStmt, $1), @2); } if ($3 != NULL) - $$ = lappend($1, makeRawStmt($3, @2 + 1)); + $$ = lappend($1, makeRawStmt($3, @3)); else $$ = $1; } | toplevel_stmt { if ($1 != NULL) - $$ = list_make1(makeRawStmt($1, 0)); + $$ = list_make1(makeRawStmt($1, @1)); else $$ = NIL; } @@ -1584,8 +1566,6 @@ CreateSchemaStmt: OptSchemaEltList: OptSchemaEltList schema_stmt { - if (@$ < 0) /* see comments for YYLLOC_DEFAULT */ - @$ = @2; $$ = lappend($1, $2); } | /* EMPTY */ -- 2.43.5 From eb0f55f02fc54e4777cf84b92838c4cb9c749855 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat, 28 Sep 2024 14:59:26 -0400 Subject: [PATCH v3 2/2] Improve reporting of errors in extension script files. Previously, CREATE/ALTER EXTENSION gave basically no useful context about errors reported while executing script files. I think the idea was that you could run the same commands manually to see the error, but that's often quite inconvenient. Let's improve that. In the case of a syntax error (or anything reported with an error cursor), just report the whole script file as the "internal query". This might be annoyingly long, but it seems more difficult than it's worth to try to shorten it to something relevant. There are interactions with libpq's reportErrorPosition logic to worry about, for example: it seems like a good idea that the line number that that code reports match up with the true line number within the script. Possibly someone will come back to this area later. Without an error cursor, if we have gotten past raw parsing (which we probably have), we can report just the current SQL statement as an item of error context. In any case also report the script file name as error context, since it might not be entirely obvious which of a series of update scripts failed. The error-context code path is already exercised by some test_extensions test cases, but add tests for the syntax-error path. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de --- src/backend/commands/extension.c | 91 ++++++++++++++++++- src/test/modules/test_extensions/Makefile | 4 +- .../expected/test_extensions.out | 52 +++++++++++ src/test/modules/test_extensions/meson.build | 2 + .../test_extensions/sql/test_extensions.sql | 4 + .../test_ext7--2.0--2.1bad.sql | 11 +++ .../test_ext7--2.0--2.2bad.sql | 13 +++ src/tools/pgindent/typedefs.list | 1 + 8 files changed, 175 insertions(+), 3 deletions(-) create mode 100644 src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql create mode 100644 src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..ed5ab2c556 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -54,6 +54,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "nodes/queryjumble.h" #include "storage/fd.h" #include "tcop/utility.h" #include "utils/acl.h" @@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* + * Information for script_error_callback() + */ +typedef struct +{ + const char *sql; /* entire script file contents */ + const char *filename; /* script file pathname */ + ParseLoc stmt_location; /* current stmt start loc, or -1 if unknown */ + ParseLoc stmt_len; /* length in bytes; 0 means "rest of string" */ +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +682,63 @@ read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + int syntaxerrposition; + const char *lastslash; + + /* + * If there is a syntax error position, convert to internal syntax error; + * otherwise report the current query as an item of context stack + */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + /* + * We must report the whole string because otherwise details such as + * psql's line number report would be wrong. + */ + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(callback_arg->sql); + } + else if (callback_arg->stmt_location >= 0) + { + /* + * Since no syntax cursor will be shown, it's okay and helpful to trim + * the reported query string to just the current statement. + */ + const char *query = callback_arg->sql; + int location = callback_arg->stmt_location; + int len = callback_arg->stmt_len; + + query = CleanQuerytext(query, &location, &len); + errcontext("SQL statement \"%.*s\"", len, query); + } + + /* + * Trim the reported file name to remove the path. We know that + * get_extension_script_filename() inserted a '/', regardless of whether + * we're on Windows. + */ + lastslash = strrchr(callback_arg->filename, '/'); + if (lastslash) + lastslash++; + else + lastslash = callback_arg->filename; /* shouldn't happen, but cope */ + errcontext("extension script file \"%s\"", lastslash); +} + /* * Execute given SQL string. * + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +748,27 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. */ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + callback_arg.stmt_location = -1; + callback_arg.stmt_len = -1; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + error_context_stack = &scripterrcontext; + /* * Parse the SQL string into a list of raw parse trees. */ @@ -709,6 +790,10 @@ execute_sql_string(const char *sql) List *stmt_list; ListCell *lc2; + /* Report location of this query for error context callback */ + callback_arg.stmt_location = parsetree->stmt_location; + callback_arg.stmt_len = parsetree->stmt_len; + /* * We do the work for each parsetree in a short-lived context, to * limit the memory used when there are many commands in the string. @@ -778,6 +863,8 @@ execute_sql_string(const char *sql) MemoryContextDelete(per_parsetree_context); } + error_context_stack = scripterrcontext.previous; + /* Be sure to advance the command counter after the last script command */ CommandCounterIncrement(); } @@ -1054,7 +1141,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control, /* And now back to C string */ c_sql = text_to_cstring(DatumGetTextPP(t_sql)); - execute_sql_string(c_sql); + execute_sql_string(c_sql, filename); } PG_FINALLY(); { diff --git a/src/test/modules/test_extensions/Makefile b/src/test/modules/test_extensions/Makefile index 05272e6a40..1dbec14cba 100644 --- a/src/test/modules/test_extensions/Makefile +++ b/src/test/modules/test_extensions/Makefile @@ -13,7 +13,9 @@ EXTENSION = test_ext1 test_ext2 test_ext3 test_ext4 test_ext5 test_ext6 \ DATA = test_ext1--1.0.sql test_ext2--1.0.sql test_ext3--1.0.sql \ test_ext4--1.0.sql test_ext5--1.0.sql test_ext6--1.0.sql \ - test_ext7--1.0.sql test_ext7--1.0--2.0.sql test_ext8--1.0.sql \ + test_ext7--1.0.sql test_ext7--1.0--2.0.sql \ + test_ext7--2.0--2.1bad.sql test_ext7--2.0--2.2bad.sql \ + test_ext8--1.0.sql \ test_ext9--1.0.sql \ test_ext_cine--1.0.sql test_ext_cine--1.0--1.1.sql \ test_ext_cor--1.0.sql \ diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index f357cc21aa..dfe0596888 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -72,6 +72,32 @@ Objects in extension "test_ext7" type ext7_table2[] (4 rows) +-- test reporting of errors in extension scripts +alter extension test_ext7 update to '2.1bad'; +ERROR: syntax error at or near "FUNCTIN" +LINE 7: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE S... + ^ +QUERY: /* src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION + + +-- test reporting of a simple syntax error in an extension script +CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; + +CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; + +CONTEXT: extension script file "test_ext7--2.0--2.1bad.sql" +alter extension test_ext7 update to '2.2bad'; +ERROR: syntax error at or near "," +LINE 1: SELECT $1 + , 1 + ^ +QUERY: SELECT $1 + , 1 +CONTEXT: SQL statement "CREATE FUNCTION my_erroneous_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + , 1 $$" +extension script file "test_ext7--2.0--2.2bad.sql" -- test handling of temp objects created by extensions create extension test_ext8; -- \dx+ would expose a variable pg_temp_nn schema name, so we can't use it here @@ -295,6 +321,9 @@ CREATE FUNCTION ext_cor_func() RETURNS text CREATE EXTENSION test_ext_cor; -- fail ERROR: function ext_cor_func() is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +CONTEXT: SQL statement "CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text + AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql" +extension script file "test_ext_cor--1.0.sql" SELECT ext_cor_func(); ext_cor_func ------------------------ @@ -307,6 +336,9 @@ CREATE VIEW ext_cor_view AS CREATE EXTENSION test_ext_cor; -- fail ERROR: view ext_cor_view is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +CONTEXT: SQL statement "CREATE OR REPLACE VIEW ext_cor_view AS + SELECT 'ext_cor_view: from extension'::text AS col" +extension script file "test_ext_cor--1.0.sql" SELECT ext_cor_func(); ERROR: function ext_cor_func() does not exist LINE 1: SELECT ext_cor_func(); @@ -323,6 +355,8 @@ CREATE TYPE test_ext_type; CREATE EXTENSION test_ext_cor; -- fail ERROR: type test_ext_type is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +CONTEXT: SQL statement "CREATE TYPE test_ext_type AS ENUM('x', 'y')" +extension script file "test_ext_cor--1.0.sql" DROP TYPE test_ext_type; -- this makes a shell "point <<@@ polygon" operator too CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt, @@ -331,6 +365,9 @@ CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt, CREATE EXTENSION test_ext_cor; -- fail ERROR: operator <<@@(point,polygon) is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +CONTEXT: SQL statement "CREATE OPERATOR <<@@ ( PROCEDURE = pt_contained_poly, + LEFTARG = point, RIGHTARG = polygon )" +extension script file "test_ext_cor--1.0.sql" DROP OPERATOR <<@@ (point, polygon); CREATE EXTENSION test_ext_cor; -- now it should work SELECT ext_cor_func(); @@ -379,37 +416,52 @@ CREATE COLLATION ext_cine_coll CREATE EXTENSION test_ext_cine; -- fail ERROR: collation ext_cine_coll is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE COLLATION IF NOT EXISTS ext_cine_coll + ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" )" +extension script file "test_ext_cine--1.0.sql" DROP COLLATION ext_cine_coll; CREATE MATERIALIZED VIEW ext_cine_mv AS SELECT 11 AS f1; CREATE EXTENSION test_ext_cine; -- fail ERROR: materialized view ext_cine_mv is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE MATERIALIZED VIEW IF NOT EXISTS ext_cine_mv AS SELECT 42 AS f1" +extension script file "test_ext_cine--1.0.sql" DROP MATERIALIZED VIEW ext_cine_mv; CREATE FOREIGN DATA WRAPPER dummy; CREATE SERVER ext_cine_srv FOREIGN DATA WRAPPER dummy; CREATE EXTENSION test_ext_cine; -- fail ERROR: server ext_cine_srv is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE SERVER IF NOT EXISTS ext_cine_srv FOREIGN DATA WRAPPER ext_cine_fdw" +extension script file "test_ext_cine--1.0.sql" DROP SERVER ext_cine_srv; CREATE SCHEMA ext_cine_schema; CREATE EXTENSION test_ext_cine; -- fail ERROR: schema ext_cine_schema is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE SCHEMA IF NOT EXISTS ext_cine_schema" +extension script file "test_ext_cine--1.0.sql" DROP SCHEMA ext_cine_schema; CREATE SEQUENCE ext_cine_seq; CREATE EXTENSION test_ext_cine; -- fail ERROR: sequence ext_cine_seq is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE SEQUENCE IF NOT EXISTS ext_cine_seq" +extension script file "test_ext_cine--1.0.sql" DROP SEQUENCE ext_cine_seq; CREATE TABLE ext_cine_tab1 (x int); CREATE EXTENSION test_ext_cine; -- fail ERROR: table ext_cine_tab1 is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE TABLE IF NOT EXISTS ext_cine_tab1 (x int)" +extension script file "test_ext_cine--1.0.sql" DROP TABLE ext_cine_tab1; CREATE TABLE ext_cine_tab2 AS SELECT 42 AS y; CREATE EXTENSION test_ext_cine; -- fail ERROR: table ext_cine_tab2 is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE TABLE IF NOT EXISTS ext_cine_tab2 AS SELECT 42 AS y" +extension script file "test_ext_cine--1.0.sql" DROP TABLE ext_cine_tab2; CREATE EXTENSION test_ext_cine; \dx+ test_ext_cine diff --git a/src/test/modules/test_extensions/meson.build b/src/test/modules/test_extensions/meson.build index c5f3424da5..2b31cbf425 100644 --- a/src/test/modules/test_extensions/meson.build +++ b/src/test/modules/test_extensions/meson.build @@ -15,6 +15,8 @@ test_install_data += files( 'test_ext6.control', 'test_ext7--1.0--2.0.sql', 'test_ext7--1.0.sql', + 'test_ext7--2.0--2.1bad.sql', + 'test_ext7--2.0--2.2bad.sql', 'test_ext7.control', 'test_ext8--1.0.sql', 'test_ext8.control', diff --git a/src/test/modules/test_extensions/sql/test_extensions.sql b/src/test/modules/test_extensions/sql/test_extensions.sql index 642c82ff5d..b5878f6f80 100644 --- a/src/test/modules/test_extensions/sql/test_extensions.sql +++ b/src/test/modules/test_extensions/sql/test_extensions.sql @@ -28,6 +28,10 @@ create extension test_ext7; alter extension test_ext7 update to '2.0'; \dx+ test_ext7 +-- test reporting of errors in extension scripts +alter extension test_ext7 update to '2.1bad'; +alter extension test_ext7 update to '2.2bad'; + -- test handling of temp objects created by extensions create extension test_ext8; diff --git a/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql b/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql new file mode 100644 index 0000000000..910b1d4f38 --- /dev/null +++ b/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql @@ -0,0 +1,11 @@ +/* src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION test_ext7 UPDATE TO '2.1bad'" to load this file. \quit + +-- test reporting of a simple syntax error in an extension script +CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; + +CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; diff --git a/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql b/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql new file mode 100644 index 0000000000..19aa9bf264 --- /dev/null +++ b/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql @@ -0,0 +1,13 @@ +/* src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION test_ext7 UPDATE TO '2.2bad'" to load this file. \quit + +-- test reporting of a nested syntax error in an extension script +SET LOCAL check_function_bodies = on; + +CREATE FUNCTION my_erroneous_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + , 1 $$; + +CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index 5fabb127d7..964f51f429 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -3883,6 +3883,7 @@ saophash_hash save_buffer scram_state scram_state_enum +script_error_callback_arg security_class_t sem_t sepgsql_context_info_t -- 2.43.5
I wrote: > ... There's still a question > of whether reporting the whole script as the query is OK when > we have a syntax error, but I have no good ideas as to how to > make that terser. I had an idea about this: we can use a pretty simple heuristic such as "break at semicolon-newline sequences". That could fail and show you just a fragment of a statement, but that still seems better than showing a whole extension script. We can ameliorate the problem that we might not show enough to clearly identify what failed by including a separate line number counter. In the attached v4 I included that in the context line that reports the script file, eg +CONTEXT: SQL statement "CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text + AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql" +extension script file "test_ext_cor--1.0.sql", near line 8 This way seems a whole lot more usable when dealing with a large extension script. regards, tom lane From 6a8e6f66bd1abdb3c52b543651eec50ed4ceb071 Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Tue, 8 Oct 2024 14:36:15 -0400 Subject: [PATCH v4 1/2] Improve parser's reporting of statement start locations. Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de --- .../pg_stat_statements/expected/select.out | 5 +- contrib/pg_stat_statements/sql/select.sql | 3 +- src/backend/nodes/queryjumblefuncs.c | 6 ++ src/backend/parser/gram.y | 66 +++++++------------ 4 files changed, 34 insertions(+), 46 deletions(-) diff --git a/contrib/pg_stat_statements/expected/select.out b/contrib/pg_stat_statements/expected/select.out index dd6c756f67..e0e2fa265c 100644 --- a/contrib/pg_stat_statements/expected/select.out +++ b/contrib/pg_stat_statements/expected/select.out @@ -19,8 +19,9 @@ SELECT 1 AS "int"; 1 (1 row) +/* this comment should not appear in the output */ SELECT 'hello' - -- multiline + -- but this one will appear AS "text"; text ------- @@ -129,7 +130,7 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -------+------+------------------------------------------------------------------------------ 1 | 1 | PREPARE pgss_test (int) AS SELECT $1, $2 LIMIT $3 4 | 4 | SELECT $1 + - | | -- multiline + + | | -- but this one will appear + | | AS "text" 2 | 2 | SELECT $1 + $2 3 | 3 | SELECT $1 + $2 + $3 AS "add" diff --git a/contrib/pg_stat_statements/sql/select.sql b/contrib/pg_stat_statements/sql/select.sql index eb45cb81ad..e0be58d5e2 100644 --- a/contrib/pg_stat_statements/sql/select.sql +++ b/contrib/pg_stat_statements/sql/select.sql @@ -12,8 +12,9 @@ SELECT pg_stat_statements_reset() IS NOT NULL AS t; -- SELECT 1 AS "int"; +/* this comment should not appear in the output */ SELECT 'hello' - -- multiline + -- but this one will appear AS "text"; SELECT 'world' AS "text"; diff --git a/src/backend/nodes/queryjumblefuncs.c b/src/backend/nodes/queryjumblefuncs.c index 5e43fd9229..e8bf95690b 100644 --- a/src/backend/nodes/queryjumblefuncs.c +++ b/src/backend/nodes/queryjumblefuncs.c @@ -90,6 +90,12 @@ CleanQuerytext(const char *query, int *location, int *len) /* * Discard leading and trailing whitespace, too. Use scanner_isspace() * not libc's isspace(), because we want to match the lexer's behavior. + * + * Note: the parser now strips leading comments and whitespace from the + * reported stmt_location, so this first loop will only iterate in the + * unusual case that the location didn't propagate to here. But the + * statement length will extend to the end-of-string or terminating + * semicolon, so the second loop often does something useful. */ while (query_len > 0 && scanner_isspace(query[0])) query++, query_location++, query_len--; diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 4aa8646af7..4bab2117d9 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -67,39 +67,25 @@ /* - * Location tracking support --- simpler than bison's default, since we only - * want to track the start position not the end position of each nonterminal. + * Location tracking support. Unlike bison's default, we only want + * to track the start position not the end position of each nonterminal. + * Nonterminals that reduce to empty receive position "-1". Since a + * production's leading RHS nonterminal(s) may have reduced to empty, + * we have to scan to find the first one that's not -1. */ #define YYLLOC_DEFAULT(Current, Rhs, N) \ do { \ - if ((N) > 0) \ - (Current) = (Rhs)[1]; \ - else \ - (Current) = (-1); \ + (Current) = (-1); \ + for (int _i = 1; _i <= (N); _i++) \ + { \ + if ((Rhs)[_i] >= 0) \ + { \ + (Current) = (Rhs)[_i]; \ + break; \ + } \ + } \ } while (0) -/* - * The above macro assigns -1 (unknown) as the parse location of any - * nonterminal that was reduced from an empty rule, or whose leftmost - * component was reduced from an empty rule. This is problematic - * for nonterminals defined like - * OptFooList: / * EMPTY * / { ... } | OptFooList Foo { ... } ; - * because we'll set -1 as the location during the first reduction and then - * copy it during each subsequent reduction, leaving us with -1 for the - * location even when the list is not empty. To fix that, do this in the - * action for the nonempty rule(s): - * if (@$ < 0) @$ = @2; - * (Although we have many nonterminals that follow this pattern, we only - * bother with fixing @$ like this when the nonterminal's parse location - * is actually referenced in some rule.) - * - * A cleaner answer would be to make YYLLOC_DEFAULT scan all the Rhs - * locations until it's found one that's not -1. Then we'd get a correct - * location for any nonterminal that isn't entirely empty. But this way - * would add overhead to every rule reduction, and so far there's not been - * a compelling reason to pay that overhead. - */ - /* * Bison doesn't allocate anything that needs to live across parser calls, * so we can easily have it use palloc instead of malloc. This prevents @@ -930,7 +916,7 @@ parse_toplevel: | MODE_PLPGSQL_EXPR PLpgSQL_Expr { pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt($2, 0)); + list_make1(makeRawStmt($2, @2)); } | MODE_PLPGSQL_ASSIGN1 PLAssignStmt { @@ -938,7 +924,7 @@ parse_toplevel: n->nnames = 1; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } | MODE_PLPGSQL_ASSIGN2 PLAssignStmt { @@ -946,7 +932,7 @@ parse_toplevel: n->nnames = 2; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } | MODE_PLPGSQL_ASSIGN3 PLAssignStmt { @@ -954,19 +940,15 @@ parse_toplevel: n->nnames = 3; pg_yyget_extra(yyscanner)->parsetree = - list_make1(makeRawStmt((Node *) n, 0)); + list_make1(makeRawStmt((Node *) n, @2)); } ; /* * At top level, we wrap each stmt with a RawStmt node carrying start location - * and length of the stmt's text. Notice that the start loc/len are driven - * entirely from semicolon locations (@2). It would seem natural to use - * @1 or @3 to get the true start location of a stmt, but that doesn't work - * for statements that can start with empty nonterminals (opt_with_clause is - * the main offender here); as noted in the comments for YYLLOC_DEFAULT, - * we'd get -1 for the location in such cases. - * We also take care to discard empty statements entirely. + * and length of the stmt's text. + * We also take care to discard empty statements entirely (which among other + * things dodges the problem of assigning them a location). */ stmtmulti: stmtmulti ';' toplevel_stmt { @@ -976,14 +958,14 @@ stmtmulti: stmtmulti ';' toplevel_stmt updateRawStmtEnd(llast_node(RawStmt, $1), @2); } if ($3 != NULL) - $$ = lappend($1, makeRawStmt($3, @2 + 1)); + $$ = lappend($1, makeRawStmt($3, @3)); else $$ = $1; } | toplevel_stmt { if ($1 != NULL) - $$ = list_make1(makeRawStmt($1, 0)); + $$ = list_make1(makeRawStmt($1, @1)); else $$ = NIL; } @@ -1584,8 +1566,6 @@ CreateSchemaStmt: OptSchemaEltList: OptSchemaEltList schema_stmt { - if (@$ < 0) /* see comments for YYLLOC_DEFAULT */ - @$ = @2; $$ = lappend($1, $2); } | /* EMPTY */ -- 2.43.5 From ab3e000e063f5291aee1b1b8914d029bdf79b90e Mon Sep 17 00:00:00 2001 From: Tom Lane <tgl@sss.pgh.pa.us> Date: Tue, 8 Oct 2024 16:02:57 -0400 Subject: [PATCH v4 2/2] Improve reporting of errors in extension script files. Previously, CREATE/ALTER EXTENSION gave basically no useful context about errors reported while executing script files. I think the idea was that you could run the same commands manually to see the error, but that's often quite inconvenient. Let's improve that. If we get an error during raw parsing, we won't have a current statement identified by a RawStmt node, but we should always get a syntax error position. Show the portion of the script from the last semicolon-newline before the error position to the first one after it. There are cases where this might show only a fragment of a statement, but that should be uncommon, and it seems better than showing the whole script file. Without an error cursor, if we have gotten past raw parsing (which we probably have), we can report just the current SQL statement as an item of error context. In any case also report the script file name as error context, since it might not be entirely obvious which of a series of update scripts failed. We can also show an approximate script line number in case whatever we printed of the query isn't sufficiently identifiable. The error-context code path is already exercised by some test_extensions test cases, but add tests for the syntax-error path. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de --- src/backend/commands/extension.c | 167 +++++++++++++++++- src/test/modules/test_extensions/Makefile | 4 +- .../expected/test_extensions.out | 42 +++++ src/test/modules/test_extensions/meson.build | 2 + .../test_extensions/sql/test_extensions.sql | 4 + .../test_ext7--2.0--2.1bad.sql | 14 ++ .../test_ext7--2.0--2.2bad.sql | 13 ++ src/tools/pgindent/typedefs.list | 1 + 8 files changed, 244 insertions(+), 3 deletions(-) create mode 100644 src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql create mode 100644 src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c index fab59ad5f6..86ea9cd9da 100644 --- a/src/backend/commands/extension.c +++ b/src/backend/commands/extension.c @@ -54,6 +54,7 @@ #include "funcapi.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "nodes/queryjumble.h" #include "storage/fd.h" #include "tcop/utility.h" #include "utils/acl.h" @@ -107,6 +108,17 @@ typedef struct ExtensionVersionInfo struct ExtensionVersionInfo *previous; /* current best predecessor */ } ExtensionVersionInfo; +/* + * Information for script_error_callback() + */ +typedef struct +{ + const char *sql; /* entire script file contents */ + const char *filename; /* script file pathname */ + ParseLoc stmt_location; /* current stmt start loc, or -1 if unknown */ + ParseLoc stmt_len; /* length in bytes; 0 means "rest of string" */ +} script_error_callback_arg; + /* Local functions */ static List *find_update_path(List *evi_list, ExtensionVersionInfo *evi_start, @@ -670,9 +682,139 @@ read_extension_script_file(const ExtensionControlFile *control, return dest_str; } +/* + * error context callback for failures in script-file execution + */ +static void +script_error_callback(void *arg) +{ + script_error_callback_arg *callback_arg = (script_error_callback_arg *) arg; + const char *query = callback_arg->sql; + int location = callback_arg->stmt_location; + int len = callback_arg->stmt_len; + int syntaxerrposition; + const char *lastslash; + + /* + * If there is a syntax error position, convert to internal syntax error; + * otherwise report the current query as an item of context stack. + * + * Note: we'll provide no context except the filename if there's neither + * an error position nor any known current query. That shouldn't happen + * though: all errors reported during raw parsing should come with an + * error position. + */ + syntaxerrposition = geterrposition(); + if (syntaxerrposition > 0) + { + /* + * If we do not know the bounds of the current statement (as would + * happen for an error occurring during initial raw parsing), we have + * to use a heuristic to decide how much of the script to show. We'll + * also use the heuristic in the unlikely case that syntaxerrposition + * is outside what we think the statement bounds are. + */ + if (location < 0 || syntaxerrposition < location || + (len > 0 && syntaxerrposition > location + len)) + { + /* + * Our heuristic is pretty simple: look for semicolon-newline + * sequences, and break at the last one strictly before + * syntaxerrposition and the first one strictly after. It's + * certainly possible to fool this with semicolon-newline embedded + * in a string literal, but it seems better to do this than to + * show the entire extension script. + */ + int slen = strlen(query); + + location = len = 0; + for (int loc = 0; loc < slen; loc++) + { + if (query[loc] != ';') + continue; + if (query[loc + 1] == '\r') + loc++; + if (query[loc + 1] == '\n') + { + int bkpt = loc + 2; + + if (bkpt < syntaxerrposition) + location = bkpt; + else if (bkpt > syntaxerrposition) + { + len = bkpt - location; + break; /* no need to keep searching */ + } + } + } + } + + /* Trim leading/trailing whitespace, for consistency */ + query = CleanQuerytext(query, &location, &len); + + /* + * Adjust syntaxerrposition. It shouldn't be pointing into the + * whitespace we just trimmed, but cope if it is. + */ + syntaxerrposition -= location; + if (syntaxerrposition < 0) + syntaxerrposition = 0; + else if (syntaxerrposition > len) + syntaxerrposition = len; + + /* And report. */ + errposition(0); + internalerrposition(syntaxerrposition); + internalerrquery(pnstrdup(query, len)); + } + else if (location >= 0) + { + /* + * Since no syntax cursor will be shown, it's okay and helpful to trim + * the reported query string to just the current statement. + */ + query = CleanQuerytext(query, &location, &len); + errcontext("SQL statement \"%.*s\"", len, query); + } + + /* + * Trim the reported file name to remove the path. We know that + * get_extension_script_filename() inserted a '/', regardless of whether + * we're on Windows. + */ + lastslash = strrchr(callback_arg->filename, '/'); + if (lastslash) + lastslash++; + else + lastslash = callback_arg->filename; /* shouldn't happen, but cope */ + + /* + * If we have a location (which, as said above, we really always should) + * then report a line number to aid in localizing problems in big scripts. + */ + if (location >= 0) + { + int linenumber = 1; + + for (query = callback_arg->sql; *query; query++) + { + if (--location < 0) + break; + if (*query == '\n') + linenumber++; + } + errcontext("extension script file \"%s\", near line %d", + lastslash, linenumber); + } + else + errcontext("extension script file \"%s\"", lastslash); +} + /* * Execute given SQL string. * + * The filename the string came from is also provided, for error reporting. + * * Note: it's tempting to just use SPI to execute the string, but that does * not work very well. The really serious problem is that SPI will parse, * analyze, and plan the whole string before executing any of it; of course @@ -682,12 +824,27 @@ read_extension_script_file(const ExtensionControlFile *control, * could be very long. */ static void -execute_sql_string(const char *sql) +execute_sql_string(const char *sql, const char *filename) { + script_error_callback_arg callback_arg; + ErrorContextCallback scripterrcontext; List *raw_parsetree_list; DestReceiver *dest; ListCell *lc1; + /* + * Setup error traceback support for ereport(). + */ + callback_arg.sql = sql; + callback_arg.filename = filename; + callback_arg.stmt_location = -1; + callback_arg.stmt_len = -1; + + scripterrcontext.callback = script_error_callback; + scripterrcontext.arg = (void *) &callback_arg; + scripterrcontext.previous = error_context_stack; + error_context_stack = &scripterrcontext; + /* * Parse the SQL string into a list of raw parse trees. */ @@ -709,6 +866,10 @@ execute_sql_string(const char *sql) List *stmt_list; ListCell *lc2; + /* Report location of this query for error context callback */ + callback_arg.stmt_location = parsetree->stmt_location; + callback_arg.stmt_len = parsetree->stmt_len; + /* * We do the work for each parsetree in a short-lived context, to * limit the memory used when there are many commands in the string. @@ -778,6 +939,8 @@ execute_sql_string(const char *sql) MemoryContextDelete(per_parsetree_context); } + error_context_stack = scripterrcontext.previous; + /* Be sure to advance the command counter after the last script command */ CommandCounterIncrement(); } @@ -1054,7 +1217,7 @@ execute_extension_script(Oid extensionOid, ExtensionControlFile *control, /* And now back to C string */ c_sql = text_to_cstring(DatumGetTextPP(t_sql)); - execute_sql_string(c_sql); + execute_sql_string(c_sql, filename); } PG_FINALLY(); { diff --git a/src/test/modules/test_extensions/Makefile b/src/test/modules/test_extensions/Makefile index 05272e6a40..1dbec14cba 100644 --- a/src/test/modules/test_extensions/Makefile +++ b/src/test/modules/test_extensions/Makefile @@ -13,7 +13,9 @@ EXTENSION = test_ext1 test_ext2 test_ext3 test_ext4 test_ext5 test_ext6 \ DATA = test_ext1--1.0.sql test_ext2--1.0.sql test_ext3--1.0.sql \ test_ext4--1.0.sql test_ext5--1.0.sql test_ext6--1.0.sql \ - test_ext7--1.0.sql test_ext7--1.0--2.0.sql test_ext8--1.0.sql \ + test_ext7--1.0.sql test_ext7--1.0--2.0.sql \ + test_ext7--2.0--2.1bad.sql test_ext7--2.0--2.2bad.sql \ + test_ext8--1.0.sql \ test_ext9--1.0.sql \ test_ext_cine--1.0.sql test_ext_cine--1.0--1.1.sql \ test_ext_cor--1.0.sql \ diff --git a/src/test/modules/test_extensions/expected/test_extensions.out b/src/test/modules/test_extensions/expected/test_extensions.out index f357cc21aa..d5388a1fec 100644 --- a/src/test/modules/test_extensions/expected/test_extensions.out +++ b/src/test/modules/test_extensions/expected/test_extensions.out @@ -72,6 +72,22 @@ Objects in extension "test_ext7" type ext7_table2[] (4 rows) +-- test reporting of errors in extension scripts +alter extension test_ext7 update to '2.1bad'; +ERROR: syntax error at or near "FUNCTIN" +LINE 1: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE S... + ^ +QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; +CONTEXT: extension script file "test_ext7--2.0--2.1bad.sql", near line 10 +alter extension test_ext7 update to '2.2bad'; +ERROR: syntax error at or near "," +LINE 1: SELECT $1 + , 1 + ^ +QUERY: SELECT $1 + , 1 +CONTEXT: SQL statement "CREATE FUNCTION my_erroneous_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + , 1 $$" +extension script file "test_ext7--2.0--2.2bad.sql", near line 9 -- test handling of temp objects created by extensions create extension test_ext8; -- \dx+ would expose a variable pg_temp_nn schema name, so we can't use it here @@ -295,6 +311,9 @@ CREATE FUNCTION ext_cor_func() RETURNS text CREATE EXTENSION test_ext_cor; -- fail ERROR: function ext_cor_func() is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +CONTEXT: SQL statement "CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text + AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql" +extension script file "test_ext_cor--1.0.sql", near line 8 SELECT ext_cor_func(); ext_cor_func ------------------------ @@ -307,6 +326,9 @@ CREATE VIEW ext_cor_view AS CREATE EXTENSION test_ext_cor; -- fail ERROR: view ext_cor_view is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +CONTEXT: SQL statement "CREATE OR REPLACE VIEW ext_cor_view AS + SELECT 'ext_cor_view: from extension'::text AS col" +extension script file "test_ext_cor--1.0.sql", near line 11 SELECT ext_cor_func(); ERROR: function ext_cor_func() does not exist LINE 1: SELECT ext_cor_func(); @@ -323,6 +345,8 @@ CREATE TYPE test_ext_type; CREATE EXTENSION test_ext_cor; -- fail ERROR: type test_ext_type is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +CONTEXT: SQL statement "CREATE TYPE test_ext_type AS ENUM('x', 'y')" +extension script file "test_ext_cor--1.0.sql", near line 17 DROP TYPE test_ext_type; -- this makes a shell "point <<@@ polygon" operator too CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt, @@ -331,6 +355,9 @@ CREATE OPERATOR @@>> ( PROCEDURE = poly_contain_pt, CREATE EXTENSION test_ext_cor; -- fail ERROR: operator <<@@(point,polygon) is not a member of extension "test_ext_cor" DETAIL: An extension is not allowed to replace an object that it does not own. +CONTEXT: SQL statement "CREATE OPERATOR <<@@ ( PROCEDURE = pt_contained_poly, + LEFTARG = point, RIGHTARG = polygon )" +extension script file "test_ext_cor--1.0.sql", near line 19 DROP OPERATOR <<@@ (point, polygon); CREATE EXTENSION test_ext_cor; -- now it should work SELECT ext_cor_func(); @@ -379,37 +406,52 @@ CREATE COLLATION ext_cine_coll CREATE EXTENSION test_ext_cine; -- fail ERROR: collation ext_cine_coll is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE COLLATION IF NOT EXISTS ext_cine_coll + ( LC_COLLATE = "POSIX", LC_CTYPE = "POSIX" )" +extension script file "test_ext_cine--1.0.sql", near line 10 DROP COLLATION ext_cine_coll; CREATE MATERIALIZED VIEW ext_cine_mv AS SELECT 11 AS f1; CREATE EXTENSION test_ext_cine; -- fail ERROR: materialized view ext_cine_mv is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE MATERIALIZED VIEW IF NOT EXISTS ext_cine_mv AS SELECT 42 AS f1" +extension script file "test_ext_cine--1.0.sql", near line 13 DROP MATERIALIZED VIEW ext_cine_mv; CREATE FOREIGN DATA WRAPPER dummy; CREATE SERVER ext_cine_srv FOREIGN DATA WRAPPER dummy; CREATE EXTENSION test_ext_cine; -- fail ERROR: server ext_cine_srv is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE SERVER IF NOT EXISTS ext_cine_srv FOREIGN DATA WRAPPER ext_cine_fdw" +extension script file "test_ext_cine--1.0.sql", near line 17 DROP SERVER ext_cine_srv; CREATE SCHEMA ext_cine_schema; CREATE EXTENSION test_ext_cine; -- fail ERROR: schema ext_cine_schema is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE SCHEMA IF NOT EXISTS ext_cine_schema" +extension script file "test_ext_cine--1.0.sql", near line 19 DROP SCHEMA ext_cine_schema; CREATE SEQUENCE ext_cine_seq; CREATE EXTENSION test_ext_cine; -- fail ERROR: sequence ext_cine_seq is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE SEQUENCE IF NOT EXISTS ext_cine_seq" +extension script file "test_ext_cine--1.0.sql", near line 21 DROP SEQUENCE ext_cine_seq; CREATE TABLE ext_cine_tab1 (x int); CREATE EXTENSION test_ext_cine; -- fail ERROR: table ext_cine_tab1 is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE TABLE IF NOT EXISTS ext_cine_tab1 (x int)" +extension script file "test_ext_cine--1.0.sql", near line 23 DROP TABLE ext_cine_tab1; CREATE TABLE ext_cine_tab2 AS SELECT 42 AS y; CREATE EXTENSION test_ext_cine; -- fail ERROR: table ext_cine_tab2 is not a member of extension "test_ext_cine" DETAIL: An extension may only use CREATE ... IF NOT EXISTS to skip object creation if the conflicting object is one thatit already owns. +CONTEXT: SQL statement "CREATE TABLE IF NOT EXISTS ext_cine_tab2 AS SELECT 42 AS y" +extension script file "test_ext_cine--1.0.sql", near line 25 DROP TABLE ext_cine_tab2; CREATE EXTENSION test_ext_cine; \dx+ test_ext_cine diff --git a/src/test/modules/test_extensions/meson.build b/src/test/modules/test_extensions/meson.build index c5f3424da5..2b31cbf425 100644 --- a/src/test/modules/test_extensions/meson.build +++ b/src/test/modules/test_extensions/meson.build @@ -15,6 +15,8 @@ test_install_data += files( 'test_ext6.control', 'test_ext7--1.0--2.0.sql', 'test_ext7--1.0.sql', + 'test_ext7--2.0--2.1bad.sql', + 'test_ext7--2.0--2.2bad.sql', 'test_ext7.control', 'test_ext8--1.0.sql', 'test_ext8.control', diff --git a/src/test/modules/test_extensions/sql/test_extensions.sql b/src/test/modules/test_extensions/sql/test_extensions.sql index 642c82ff5d..b5878f6f80 100644 --- a/src/test/modules/test_extensions/sql/test_extensions.sql +++ b/src/test/modules/test_extensions/sql/test_extensions.sql @@ -28,6 +28,10 @@ create extension test_ext7; alter extension test_ext7 update to '2.0'; \dx+ test_ext7 +-- test reporting of errors in extension scripts +alter extension test_ext7 update to '2.1bad'; +alter extension test_ext7 update to '2.2bad'; + -- test handling of temp objects created by extensions create extension test_ext8; diff --git a/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql b/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql new file mode 100644 index 0000000000..a055a2dbb2 --- /dev/null +++ b/src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql @@ -0,0 +1,14 @@ +/* src/test/modules/test_extensions/test_ext7--2.0--2.1bad.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION test_ext7 UPDATE TO '2.1bad'" to load this file. \quit + +-- test reporting of a simple syntax error in an extension script +CREATE FUNCTION my_okay_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; + +CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; + +CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; diff --git a/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql b/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql new file mode 100644 index 0000000000..19aa9bf264 --- /dev/null +++ b/src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql @@ -0,0 +1,13 @@ +/* src/test/modules/test_extensions/test_ext7--2.0--2.2bad.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION test_ext7 UPDATE TO '2.2bad'" to load this file. \quit + +-- test reporting of a nested syntax error in an extension script +SET LOCAL check_function_bodies = on; + +CREATE FUNCTION my_erroneous_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + , 1 $$; + +CREATE FUNCTION my_other_func(int) RETURNS int LANGUAGE SQL +AS $$ SELECT $1 + 1 $$; diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list index a65e1c07c5..b9dde65150 100644 --- a/src/tools/pgindent/typedefs.list +++ b/src/tools/pgindent/typedefs.list @@ -3884,6 +3884,7 @@ saophash_hash save_buffer scram_state scram_state_enum +script_error_callback_arg security_class_t sem_t sepgsql_context_info_t -- 2.43.5
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Pavel Stehule
Date:
Hi
út 8. 10. 2024 v 22:18 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I wrote:
> ... There's still a question
> of whether reporting the whole script as the query is OK when
> we have a syntax error, but I have no good ideas as to how to
> make that terser.
I had an idea about this: we can use a pretty simple heuristic
such as "break at semicolon-newline sequences". That could fail
and show you just a fragment of a statement, but that still seems
better than showing a whole extension script. We can ameliorate
the problem that we might not show enough to clearly identify
what failed by including a separate line number counter.
In the attached v4 I included that in the context line that
reports the script file, eg
+CONTEXT: SQL statement "CREATE OR REPLACE FUNCTION ext_cor_func() RETURNS text
+ AS $$ SELECT 'ext_cor_func: from extension'::text $$ LANGUAGE sql"
+extension script file "test_ext_cor--1.0.sql", near line 8
This way seems a whole lot more usable when dealing with a
large extension script.
I tested it and it is working nicely. I tested it against Orafce and I found an interesting point. The body of plpgsql functions is not checked.
Do you know the reason?
Regards
Pavel
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes: > I tested it and it is working nicely. I tested it against Orafce and I > found an interesting point. The body of plpgsql functions is not checked. > Do you know the reason? In execute_extension_script(): /* * Similarly disable check_function_bodies, to ensure that SQL functions * won't be parsed during creation. */ if (check_function_bodies) (void) set_config_option("check_function_bodies", "off", PGC_USERSET, PGC_S_SESSION, GUC_ACTION_SAVE, true, 0, false); I wondered if we should reconsider that, but I'm afraid we'd be more likely to break working extensions than to do anything helpful. An extension author who wants that can do what I did in the patch's test cases: manually turn check_function_bodies on in the extension script. regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Pavel Stehule
Date:
pá 11. 10. 2024 v 18:08 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I tested it and it is working nicely. I tested it against Orafce and I
> found an interesting point. The body of plpgsql functions is not checked.
> Do you know the reason?
In execute_extension_script():
/*
* Similarly disable check_function_bodies, to ensure that SQL functions
* won't be parsed during creation.
*/
if (check_function_bodies)
(void) set_config_option("check_function_bodies", "off",
PGC_USERSET, PGC_S_SESSION,
GUC_ACTION_SAVE, true, 0, false);
I wondered if we should reconsider that, but I'm afraid we'd be more
likely to break working extensions than to do anything helpful.
An extension author who wants that can do what I did in the patch's
test cases: manually turn check_function_bodies on in the extension
script.
ok,
Pavel
regards, tom lane
On Wed, Oct 9, 2024 at 4:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > In the attached v4 in the upper code two branch, both will call CleanQuerytext so in script_error_callback + /* + * If we have a location (which, as said above, we really always should) + * then report a line number to aid in localizing problems in big scripts. + */ + if (location >= 0) + { + int linenumber = 1; + + for (query = callback_arg->sql; *query; query++) + { + if (--location < 0) + break; + if (*query == '\n') + linenumber++; + } + errcontext("extension script file \"%s\", near line %d", + lastslash, linenumber); + } + else + errcontext("extension script file \"%s\"", lastslash); + /* + * If we have a location (which, as said above, we really always should) + * then report a line number to aid in localizing problems in big scripts. + */ + if (location >= 0) so this part will always be true?
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Pavel Stehule
Date:
so 12. 10. 2024 v 9:33 odesílatel jian he <jian.universality@gmail.com> napsal:
On Wed, Oct 9, 2024 at 4:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> In the attached v4
in the upper code two branch, both will call CleanQuerytext
so in script_error_callback
+ /*
+ * If we have a location (which, as said above, we really always should)
+ * then report a line number to aid in localizing problems in big scripts.
+ */
+ if (location >= 0)
+ {
+ int linenumber = 1;
+
+ for (query = callback_arg->sql; *query; query++)
+ {
+ if (--location < 0)
+ break;
+ if (*query == '\n')
+ linenumber++;
+ }
+ errcontext("extension script file \"%s\", near line %d",
+ lastslash, linenumber);
+ }
+ else
+ errcontext("extension script file \"%s\"", lastslash);
+ /*
+ * If we have a location (which, as said above, we really always should)
+ * then report a line number to aid in localizing problems in big scripts.
+ */
+ if (location >= 0)
so this part will always be true?
yes, after CleanQuerytext the location should not be -1 ever
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes: > so 12. 10. 2024 v 9:33 odesílatel jian he <jian.universality@gmail.com> > napsal: >> + /* >> + * If we have a location (which, as said above, we really always should) >> + * then report a line number to aid in localizing problems in big scripts. >> + */ >> + if (location >= 0) >> so this part will always be true? > yes, after CleanQuerytext the location should not be -1 ever Right, but we might not have entered either of those previous if-blocks. The question here is whether the raw parser (gram.y) ever throws an error that doesn't include a cursor position. IMO it shouldn't, but a quick look through gram.y finds a few ereports that lack parser_errposition. We could go fix those, and probably should, but imagining that none will ever be introduced again seems like folly. regards, tom lane
jian he <jian.universality@gmail.com> writes: > So if we are in script_error_callback > `int location = callback_arg->stmt_location;` > location >= 0 will be always true? No, not if the grammar threw an error. regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Pavel Stehule
Date:
po 14. 10. 2024 v 5:38 odesílatel jian he <jian.universality@gmail.com> napsal:
On Mon, Oct 14, 2024 at 1:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Pavel Stehule <pavel.stehule@gmail.com> writes:
> > so 12. 10. 2024 v 9:33 odesílatel jian he <jian.universality@gmail.com>
> > napsal:
> >> + /*
> >> + * If we have a location (which, as said above, we really always should)
> >> + * then report a line number to aid in localizing problems in big scripts.
> >> + */
> >> + if (location >= 0)
> >> so this part will always be true?
>
> > yes, after CleanQuerytext the location should not be -1 ever
>
> Right, but we might not have entered either of those previous
> if-blocks.
in src/backend/parser/gram.y
your makeRawStmt changes (v4) seem to guarantee that
RawStmt->stmt_location >= 0.
other places {DefineView,DoCopy,PrepareQuery} use makeNode(RawStmt),
In these cases, I am not so sure RawStmt->stmt_location >=0 is always true.
in execute_sql_string
raw_parsetree_list = pg_parse_query(sql);
dest = CreateDestReceiver(DestNone);
foreach(lc1, raw_parsetree_list)
{
RawStmt *parsetree = lfirst_node(RawStmt, lc1);
MemoryContext per_parsetree_context,
oldcontext;
List *stmt_list;
ListCell *lc2;
callback_arg.stmt_location = parsetree->stmt_location;
callback_arg.stmt_len = parsetree->stmt_len;
per_parsetree_context =
AllocSetContextCreate(CurrentMemoryContext,
"execute_sql_string per-statement context",
ALLOCSET_DEFAULT_SIZES);
oldcontext = MemoryContextSwitchTo(per_parsetree_context);
CommandCounterIncrement();
stmt_list = pg_analyze_and_rewrite_fixedparams(parsetree,
sql,
NULL,
0,
NULL);
Based on the above code, we do
`callback_arg.stmt_location = parsetree->stmt_location;`
pg_parse_query(sql) doesn't use script_error_callback.
So if we are in script_error_callback
`int location = callback_arg->stmt_location;`
location >= 0 will be always true?
> The question here is whether the raw parser (gram.y)
> ever throws an error that doesn't include a cursor position. IMO it
> shouldn't, but a quick look through gram.y finds a few ereports that
> lack parser_errposition. We could go fix those, and probably should,
> but imagining that none will ever be introduced again seems like
> folly.
>
I don't know how to add the error position inside the function
insertSelectOptions.
maybe we can add
`parser_errposition(exprLocation(limitClause->limitCount))));`
but limitCount position is a nearby position.
I am also not sure about func mergeTableFuncParameters.
for other places in gram.y, I've added error positions for ereport
that lack it , please check the attached.
I think this can be a separate commitfest issue.
Regards
Pavel
jian he <jian.universality@gmail.com> writes: > On Mon, Oct 14, 2024 at 1:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Right, but we might not have entered either of those previous >> if-blocks. > in src/backend/parser/gram.y > your makeRawStmt changes (v4) seem to guarantee that > RawStmt->stmt_location >= 0. Yes, I would expect that any RawStmt we see here will have valid stmt_location. What you seem to be missing is that an error could be thrown from > raw_parsetree_list = pg_parse_query(sql); before execute_sql_string reaches its loop over RawStmts. In that case we'll reach script_error_callback with callback_arg.stmt_location still being -1. > pg_parse_query(sql) doesn't use script_error_callback. Eh? We've put that on the error context callback stack. It is not pg_parse_query's decision whether it will be called. regards, tom lane
jian he <jian.universality@gmail.com> writes: > just found out the"elog(INFO, "should not reached here");" part never reached. You didn't check any of the cases we were discussing I guess? (That is, places in gram.y that throw an error without a parser_errposition call.) Note that even if we fix all of those and keep them fixed, we still couldn't assume the case is unreachable, because gram.y isn't self-contained. For instance, if we hit out-of-memory during raw parsing, the OOM error out of mcxt.c isn't going to provide a syntax error position. I'm not too concerned about doing better than what the patch does now (i.e. nothing) in such edge cases, but we can't do worse. > i guess, we don't need performance in script_error_callback, > but in script_error_callback arrange code seperate syntax error(raw > parser) and other error seems good. > please check the attached minor refactor. I do not think that's an improvement. It's more complicated and less readable, and I don't see why we need to squeeze more performance out of this error-reporting path that should never be taken in production. regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Pavel Stehule
Date:
Hi
pá 11. 10. 2024 v 19:39 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
pá 11. 10. 2024 v 18:08 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:Pavel Stehule <pavel.stehule@gmail.com> writes:
> I tested it and it is working nicely. I tested it against Orafce and I
> found an interesting point. The body of plpgsql functions is not checked.
> Do you know the reason?
In execute_extension_script():
/*
* Similarly disable check_function_bodies, to ensure that SQL functions
* won't be parsed during creation.
*/
if (check_function_bodies)
(void) set_config_option("check_function_bodies", "off",
PGC_USERSET, PGC_S_SESSION,
GUC_ACTION_SAVE, true, 0, false);
I wondered if we should reconsider that, but I'm afraid we'd be more
likely to break working extensions than to do anything helpful.
An extension author who wants that can do what I did in the patch's
test cases: manually turn check_function_bodies on in the extension
script.ok,
I tested this patch and I didn't find any issue. The possibility to show errors inside extensions more precisely is very useful.
compilation without problems, all tests passed
I'll mark this patch as ready for committer.
Regards
Pavel
Pavel
regards, tom lane
Pavel Stehule <pavel.stehule@gmail.com> writes: > I'll mark this patch as ready for committer. Pushed then. Thanks for reviewing! regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Pavel Stehule
Date:
út 22. 10. 2024 v 17:37 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I'll mark this patch as ready for committer.
Pushed then. Thanks for reviewing!
Thank you for this patch. It is really practical
Regards
Pavel
regards, tom lane
In the no-good-deed-goes-unpunished department: buildfarm member hamerkop doesn't like this patch [1]. The diffs look like @@ -77,7 +77,7 @@ ERROR: syntax error at or near "FUNCTIN" LINE 1: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE S... ^ -QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL +QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL AS $$ SELECT $1 + 1 $$; CONTEXT: extension script file "test_ext7--2.0--2.1bad.sql", near line 10 alter extension test_ext7 update to '2.2bad'; It's hard to be totally sure from the web page, but I suppose what is happening is that the newline within the quoted query fragment is represented as "\r\n" not just "\n". (I wonder why the cfbot failed to detect this; there must be more moving parts involved than just "it's Windows".) The reason why this is happening seems fairly clear: extension.c's read_whole_file() opens the script file with PG_BINARY_R, preventing Windows' libc from hiding DOS-style newlines from us, even though the git checkout on that machine is evidently using DOS newlines. That seems like a rather odd choice. Elsewhere in the same module, parse_extension_control_file() opens control files with plain "r", so that those act differently. I checked the git history and did not learn much. The original extensions commit d9572c4e3 implemented reading with a call to read_binary_file(), and we seem to have just carried that behavioral decision forward through various refactorings. I don't recall if there was an intentional choice to use binary read or that was just a random decision to use an available function. So what I'd like to do to fix this is to change - if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL) + if ((file = AllocateFile(filename, "r")) == NULL) The argument against that is it creates a nonzero chance of breaking somebody's extension script file on Windows. But there's a counter-argument that it might *prevent* bugs on Windows, by making script behavior more nearly identical to what it is on not-Windows. So I think that's kind of a wash. Other approaches we could take with perhaps-smaller blast radii include making script_error_callback() trim \r's out of the quoted text (ugly) or creating a variant expected-file (hard to maintain, and I wonder if it'd survive git-related newline munging). Thoughts? regards, tom lane [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamerkop&dt=2024-10-23%2011%3A00%3A37
I wrote: > In the no-good-deed-goes-unpunished department: buildfarm member > hamerkop doesn't like this patch [1]. The diffs look like > ... > So what I'd like to do to fix this is to change > - if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL) > + if ((file = AllocateFile(filename, "r")) == NULL) Well, that didn't fix it :-(. I went so far as to extract the raw log files from the buildfarm database, and what they show is that there is absolutely no difference between the lines diff is claiming are different: -QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n +QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n It's the same both before and after 924e03917, which made the code change depicted above, so that didn't help. So I'm pretty baffled. I suppose the expected and result files must actually be different, and something in subsequent processing is losing the difference before it gets to the buildfarm database. But I don't have the ability to debug that from here. Does anyone with access to hamerkop want to poke into this? Without additional information, the only thing I can think of that I have any confidence will eliminate these failures is to reformat the affected test cases so that they produce just a single line of output. That's kind of annoying from a functionality-coverage point of view, but I'm not sure avoiding it is worth moving mountains for. In any case, I'm disinclined to revert 924e03917. It seems like a good change on balance, even if it failed to fix whatever is happening on hamerkop. regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Pavel Stehule
Date:
ne 27. 10. 2024 v 18:42 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I wrote:
> In the no-good-deed-goes-unpunished department: buildfarm member
> hamerkop doesn't like this patch [1]. The diffs look like
> ...
> So what I'd like to do to fix this is to change
> - if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
> + if ((file = AllocateFile(filename, "r")) == NULL)
Well, that didn't fix it :-(. I went so far as to extract the raw log
files from the buildfarm database, and what they show is that there is
absolutely no difference between the lines diff is claiming are
different:
-QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n
+QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n
It's the same both before and after 924e03917, which made the code
change depicted above, so that didn't help.
So I'm pretty baffled. I suppose the expected and result files must
actually be different, and something in subsequent processing is
losing the difference before it gets to the buildfarm database.
But I don't have the ability to debug that from here. Does anyone
with access to hamerkop want to poke into this?
Without additional information, the only thing I can think of that
I have any confidence will eliminate these failures is to reformat
the affected test cases so that they produce just a single line of
output. That's kind of annoying from a functionality-coverage point
of view, but I'm not sure avoiding it is worth moving mountains for.
In any case, I'm disinclined to revert 924e03917. It seems like a
good change on balance, even if it failed to fix whatever is
happening on hamerkop.
+1
This is very useful feature
Pavel
regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Alexander Lakhin
Date:
Hello Tom, 27.10.2024 20:41, Tom Lane wrote: > I wrote: >> In the no-good-deed-goes-unpunished department: buildfarm member >> hamerkop doesn't like this patch [1]. The diffs look like >> ... >> So what I'd like to do to fix this is to change >> - if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL) >> + if ((file = AllocateFile(filename, "r")) == NULL) > Well, that didn't fix it :-(. I went so far as to extract the raw log > files from the buildfarm database, and what they show is that there is > absolutely no difference between the lines diff is claiming are > different: > > -QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n > +QUERY: CREATE FUNCTIN my_erroneous_func(int) RETURNS int LANGUAGE SQL\r\n > > It's the same both before and after 924e03917, which made the code > change depicted above, so that didn't help. > > So I'm pretty baffled. I suppose the expected and result files must > actually be different, and something in subsequent processing is > losing the difference before it gets to the buildfarm database. > But I don't have the ability to debug that from here. Does anyone > with access to hamerkop want to poke into this? > > Without additional information, the only thing I can think of that > I have any confidence will eliminate these failures is to reformat > the affected test cases so that they produce just a single line of > output. That's kind of annoying from a functionality-coverage point > of view, but I'm not sure avoiding it is worth moving mountains for. > I've managed to reproduce the issue with the core.autocrlf=true git setting (which sets CR+LF line ending in test_ext7--2.0--2.1bad.sql) and with diff from msys: C:\msys64\usr\bin\diff.exe --version diff (GNU diffutils) 3.8 (Gnu/Win32 Diff [1] doesn't detect those EOL differences and thus the test doesn't fail.) I can really see different line endings with hexdump: hexdump -C ...testrun\test_extensions\regress\regression.diffs 00000230 20 20 20 20 20 20 20 20 5e 0a 2d 51 55 45 52 59 | ^.-QUERY| 00000240 3a 20 20 43 52 45 41 54 45 20 46 55 4e 43 54 49 |: CREATE FUNCTI| 00000250 4e 20 6d 79 5f 65 72 72 6f 6e 65 6f 75 73 5f 66 |N my_erroneous_f| 00000260 75 6e 63 28 69 6e 74 29 20 52 45 54 55 52 4e 53 |unc(int) RETURNS| 00000270 20 69 6e 74 20 4c 41 4e 47 55 41 47 45 20 53 51 | int LANGUAGE SQ| 00000280 4c 0a 2b 51 55 45 52 59 3a 20 20 43 52 45 41 54 |L.+QUERY: CREAT| 00000290 45 20 46 55 4e 43 54 49 4e 20 6d 79 5f 65 72 72 |E FUNCTIN my_err| 000002a0 6f 6e 65 6f 75 73 5f 66 75 6e 63 28 69 6e 74 29 |oneous_func(int)| 000002b0 20 52 45 54 55 52 4e 53 20 69 6e 74 20 4c 41 4e | RETURNS int LAN| 000002c0 47 55 41 47 45 20 53 51 4c 0d 0a 20 41 53 20 24 |GUAGE SQL.. AS $| hexdump -C .../testrun/test_extensions/regress/results/test_extensions.out | grep -C5 FUNCTIN 00000b80 20 5e 0d 0a 51 55 45 52 59 3a 20 20 43 52 45 41 | ^..QUERY: CREA| 00000b90 54 45 20 46 55 4e 43 54 49 4e 20 6d 79 5f 65 72 |TE FUNCTIN my_er| 00000ba0 72 6f 6e 65 6f 75 73 5f 66 75 6e 63 28 69 6e 74 |roneous_func(int| 00000bb0 29 20 52 45 54 55 52 4e 53 20 69 6e 74 20 4c 41 |) RETURNS int LA| 00000bc0 4e 47 55 41 47 45 20 53 51 4c 0d 0d 0a 41 53 20 |NGUAGE SQL...AS | whilst hexdump -C .../src/test/modules/test_extensions/expected/test_extensions.out | grep -C5 FUNCTIN 00000b80 20 5e 0d 0a 51 55 45 52 59 3a 20 20 43 52 45 41 | ^..QUERY: CREA| 00000b90 54 45 20 46 55 4e 43 54 49 4e 20 6d 79 5f 65 72 |TE FUNCTIN my_er| 00000ba0 72 6f 6e 65 6f 75 73 5f 66 75 6e 63 28 69 6e 74 |roneous_func(int| 00000bb0 29 20 52 45 54 55 52 4e 53 20 69 6e 74 20 4c 41 |) RETURNS int LA| 00000bc0 4e 47 55 41 47 45 20 53 51 4c 0d 0a 41 53 20 24 |NGUAGE SQL..AS $| It looks like --strip-trailing-cr doesn't work as desired for this diff version. I've also dumped buf in read_whole_file() and found that in both PG_BINARY_R and "r" modes the 0d 0a ending is preserved. But it changed to 0a with the "rt" mode (see [1]), and it makes the test (and the whole `meson test`) pass for me. [1] https://gnuwin32.sourceforge.net/packages/diffutils.htm [2] https://learn.microsoft.com/en-us/cpp/c-runtime-library/file-translation-constants?view=msvc-170 Best regards, Alexander
Alexander Lakhin <exclusion@gmail.com> writes: > 27.10.2024 20:41, Tom Lane wrote: >>> In the no-good-deed-goes-unpunished department: buildfarm member >>> hamerkop doesn't like this patch [1]. The diffs look like > I've managed to reproduce the issue with the core.autocrlf=true git setting > (which sets CR+LF line ending in test_ext7--2.0--2.1bad.sql) and with diff > from msys: > C:\msys64\usr\bin\diff.exe --version > diff (GNU diffutils) 3.8 > (Gnu/Win32 Diff [1] doesn't detect those EOL differences and thus the test > doesn't fail.) Thank you for looking at this! > I can really see different line endings with hexdump: > hexdump -C .../testrun/test_extensions/regress/results/test_extensions.out | grep -C5 FUNCTIN > 00000b80 20 5e 0d 0a 51 55 45 52 59 3a 20 20 43 52 45 41 | ^..QUERY: CREA| > 00000b90 54 45 20 46 55 4e 43 54 49 4e 20 6d 79 5f 65 72 |TE FUNCTIN my_er| > 00000ba0 72 6f 6e 65 6f 75 73 5f 66 75 6e 63 28 69 6e 74 |roneous_func(int| > 00000bb0 29 20 52 45 54 55 52 4e 53 20 69 6e 74 20 4c 41 |) RETURNS int LA| > 00000bc0 4e 47 55 41 47 45 20 53 51 4c 0d 0d 0a 41 53 20 |NGUAGE SQL...AS | Wow. How are we producing \r\r\n? I'm not hugely surprised that some versions of diff might see that as different from a single newline even with --strip-trailing-cr --- formally, I think that's supposed to make \r\n match \n, but the extra \r doesn't fit that pattern. > I've also dumped buf in read_whole_file() and found that in both > PG_BINARY_R and "r" modes the 0d 0a ending is preserved. But it changed > to 0a with the "rt" mode (see [1]), and it makes the test (and the whole > `meson test`) pass for me. Interesting. I believe we decided years ago that we didn't need to use "rt" mode because that was the default on Windows, but was that a misreading of the documentation? The link you provided doesn't give any hint that there are more than two behaviors. However ... the link you provided also mentions that text mode includes treating control-Z as EOF; which I'd forgotten, but it does make it less safe than I thought to use text mode for reading script files. What I'm now thinking is that we should revert 924e03917 after all (that is, go back to using PG_BINARY_R) and instead make read_whole_file manually squash \r\n to \n if we're on Windows. Ugly, but I have yet to find anything about that platform that isn't. regards, tom lane
Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)
From
Alexander Lakhin
Date:
28.10.2024 19:06, Tom Lane wrote: >> I've also dumped buf in read_whole_file() and found that in both >> PG_BINARY_R and "r" modes the 0d 0a ending is preserved. But it changed >> to 0a with the "rt" mode (see [1]), and it makes the test (and the whole >> `meson test`) pass for me. > Interesting. I believe we decided years ago that we didn't need to > use "rt" mode because that was the default on Windows, but was that > a misreading of the documentation? The link you provided doesn't > give any hint that there are more than two behaviors. > > However ... the link you provided also mentions that text mode > includes treating control-Z as EOF; which I'd forgotten, but it > does make it less safe than I thought to use text mode for > reading script files. I think that this other behavior can be explained by pgwin32_fopen()/ pgwin32_open() coding (O_TEXT assumed implicitly only #ifdef FRONTEND). Anyway, as you noticed, \x1A injected into test_ext....sql really leads to the file contents truncation on read (with "rt"), so I agree that using the text/translation mode here is not an improvement. > What I'm now thinking is that we should revert 924e03917 after > all (that is, go back to using PG_BINARY_R) and instead make > read_whole_file manually squash \r\n to \n if we're on Windows. > Ugly, but I have yet to find anything about that platform that > isn't. Yes, I think this should help. Best regards, Alexander