Thread: Re: NOT ENFORCED constraint feature
On 2025-Jan-31, Ashutosh Bapat wrote: > But if the constraint is NOT VALID and later marked as NOT ENFORCED, > what is expected behaviour while changing it to ENFORCED? I think what you want is a different mode that would be ENFORCED NOT VALID, which would be an extension of the standard, because the standard does not support the concept of NOT VALID. So while I think what you want is nice, I'm not sure that this patch necessarily must implement it. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On Fri, Jan 31, 2025 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2025-Jan-31, Ashutosh Bapat wrote: > > > But if the constraint is NOT VALID and later marked as NOT ENFORCED, > > what is expected behaviour while changing it to ENFORCED? > > I think what you want is a different mode that would be ENFORCED NOT > VALID, which would be an extension of the standard, because the standard > does not support the concept of NOT VALID. So while I think what you > want is nice, I'm not sure that this patch necessarily must implement > it. > Here is my understanding behind this feature implementation -- I am not claiming to be 100% correct, I am confident I am not entirely wrong either. Let me explain with an example: imagine a user adds a VALID constraint to a table that already has data, and the user is completely sure that all the data complies with the constraint. Even in this case, the system still runs a validation check. This is expected behavior because the system can't just take the user's word for it -- it needs to explicitly confirm that the data is valid through validation. Now, with a NOT ENFORCED constraint, it's almost like the constraint doesn't exist, because no checks are being performed and there is no visible effect for the user, even though the constraint is technically still there. So when the constraint is switched to ENFORCED, we should be careful not to automatically mark it as validated (regardless of its previous validate status) unless the data is actually checked against the constraint -- treat as adding a new VALID constraint. Even if the user is absolutely sure the data complies, we should still run the validation to ensure reliability. In response to Ashutosh’s point about the VALID/NOT ENFORCED scenario: if a constraint is initially VALID, then marked as NOT ENFORCED, and later switched back to ENFORCED -- IMO, it shouldn't automatically be considered VALID. I had similar thoughts when working on a patch before v5, but after further discussion, I now agree that it makes more sense for the system to keep it as NOT VALID until the data has been validated against the constraint. This is especially important when a user adds the constraint, is confident the data complies, and then needs to enforce it. Validating the data ensures the integrity and consistency of the constraint. In short, even if the user is 100% confident, skipping validation is not an option. We need to make sure the constraint’s VALID status is accurate and reliable before marking it as validated. Regards, Amul
On Mon, Feb 3, 2025 at 9:57 AM Amul Sul <sulamul@gmail.com> wrote: > > On Fri, Jan 31, 2025 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > On 2025-Jan-31, Ashutosh Bapat wrote: > > > > > But if the constraint is NOT VALID and later marked as NOT ENFORCED, > > > what is expected behaviour while changing it to ENFORCED? > > > > I think what you want is a different mode that would be ENFORCED NOT > > VALID, which would be an extension of the standard, because the standard > > does not support the concept of NOT VALID. So while I think what you > > want is nice, I'm not sure that this patch necessarily must implement > > it. > > This way allows VALID/NOT VALID and ENFORCED/NOT ENFORCED states to work together and also implement behaviour specified by the standard (ref. Peter's email). If there's some other way to implement the behaviour, that's fine too. > > Here is my understanding behind this feature implementation -- I am > not claiming to be 100% correct, I am confident I am not entirely > wrong either. Let me explain with an example: imagine a user adds a > VALID constraint to a table that already has data, and the user is > completely sure that all the data complies with the constraint. Even > in this case, the system still runs a validation check. This is > expected behavior because the system can't just take the user's word > for it -- it needs to explicitly confirm that the data is valid > through validation. > > Now, with a NOT ENFORCED constraint, it's almost like the constraint > doesn't exist, because no checks are being performed and there is no > visible effect for the user, even though the constraint is technically > still there. So when the constraint is switched to ENFORCED, we should > be careful not to automatically mark it as validated (regardless of > its previous validate status) unless the data is actually checked > against the constraint -- treat as adding a new VALID constraint. Even > if the user is absolutely sure the data complies, we should still run > the validation to ensure reliability. > > In response to Ashutosh’s point about the VALID/NOT ENFORCED scenario: > if a constraint is initially VALID, then marked as NOT ENFORCED, and > later switched back to ENFORCED -- IMO, it shouldn't automatically be > considered VALID. I am suggesting that when a constraint is changed from NOT ENFORCED to ENFORCED, if it's marked VALID - we run validation checks. Here's how I see the state conversions happening. NOT VALID, NOT ENFORCED changed to NOT_VALID, ENFORCED - no data validation required, constraint is enforced on the new tuples/changes NOT VALID, ENFORCED changed to NOT VALID, NOT ENFORCED - no data validation, constraint isn't enforced anymore VALID, NOT ENFORCED changed to VALID, ENFORCED - data validation required, constraint is enforced VALID, ENFORCED changed to VALID, NOT ENFORCED - no data validation required, constrain isn't enforced anymore, we rely on user to enforce the constraint on their side -- Best Wishes, Ashutosh Bapat
On Mon, Feb 3, 2025 at 10:49 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Mon, Feb 3, 2025 at 9:57 AM Amul Sul <sulamul@gmail.com> wrote: > > > > On Fri, Jan 31, 2025 at 7:10 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > > > On 2025-Jan-31, Ashutosh Bapat wrote: > > > > > > > But if the constraint is NOT VALID and later marked as NOT ENFORCED, > > > > what is expected behaviour while changing it to ENFORCED? > > > > > > I think what you want is a different mode that would be ENFORCED NOT > > > VALID, which would be an extension of the standard, because the standard > > > does not support the concept of NOT VALID. So while I think what you > > > want is nice, I'm not sure that this patch necessarily must implement > > > it. > > > > > This way allows VALID/NOT VALID and ENFORCED/NOT ENFORCED states to > work together and also implement behaviour specified by the standard > (ref. Peter's email). If there's some other way to implement the > behaviour, that's fine too. > > > > > Here is my understanding behind this feature implementation -- I am > > not claiming to be 100% correct, I am confident I am not entirely > > wrong either. Let me explain with an example: imagine a user adds a > > VALID constraint to a table that already has data, and the user is > > completely sure that all the data complies with the constraint. Even > > in this case, the system still runs a validation check. This is > > expected behavior because the system can't just take the user's word > > for it -- it needs to explicitly confirm that the data is valid > > through validation. > > > > Now, with a NOT ENFORCED constraint, it's almost like the constraint > > doesn't exist, because no checks are being performed and there is no > > visible effect for the user, even though the constraint is technically > > still there. So when the constraint is switched to ENFORCED, we should > > be careful not to automatically mark it as validated (regardless of > > its previous validate status) unless the data is actually checked > > against the constraint -- treat as adding a new VALID constraint. Even > > if the user is absolutely sure the data complies, we should still run > > the validation to ensure reliability. > > > > In response to Ashutosh’s point about the VALID/NOT ENFORCED scenario: > > if a constraint is initially VALID, then marked as NOT ENFORCED, and > > later switched back to ENFORCED -- IMO, it shouldn't automatically be > > considered VALID. > > I am suggesting that when a constraint is changed from NOT ENFORCED to > ENFORCED, if it's marked VALID - we run validation checks. > Ok. > Here's how I see the state conversions happening. > > NOT VALID, NOT ENFORCED changed to NOT_VALID, ENFORCED - no data > validation required, constraint is enforced on the new tuples/changes > NOT VALID, ENFORCED changed to NOT VALID, NOT ENFORCED - no data > validation, constraint isn't enforced anymore > VALID, NOT ENFORCED changed to VALID, ENFORCED - data validation > required, constraint is enforced > VALID, ENFORCED changed to VALID, NOT ENFORCED - no data validation > required, constrain isn't enforced anymore, we rely on user to enforce > the constraint on their side > Understood, thanks for the detailed explanation. This is what I had implemented in the v4 patch, and I agree with this. If we decide to go with this, I can revert the behavior to the v4 patch set. Regards, Amul
On 2025-Feb-03, Ashutosh Bapat wrote: > VALID, NOT ENFORCED changed to VALID, ENFORCED - data validation > required, constraint is enforced There's no such thing as a VALID NOT ENFORCED constraint. It just cannot exist. > NOT VALID, NOT ENFORCED changed to NOT_VALID, ENFORCED - no data > validation required, constraint is enforced on the new tuples/changes This may make sense, but it needs special nonstandard syntax. If you start with a NOT VALID NOT ENFORCED constraint (which is the only way to have a NOT ENFORCED constraint) and apply ALTER TABLE ALTER CONSTRAINT ENFORCE, you will end up with a VALID ENFORCED constraint, therefore validation must be run. If you wanted to add a nonstandard command ALTER TABLE ALTER CONSTRAINT ENFORCE NO VALIDATE then maybe the transition you suggest could be made. It should be a separate patch from regular ALTER CONSTRAINT ENFORCE though, just in case some problems with it emerge later and we're forced to revert it, we can still keep the standard command. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Use it up, wear it out, make it do, or do without"
On Mon, Feb 3, 2025 at 1:20 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2025-Feb-03, Ashutosh Bapat wrote: > > > VALID, NOT ENFORCED changed to VALID, ENFORCED - data validation > > required, constraint is enforced > > There's no such thing as a VALID NOT ENFORCED constraint. It just > cannot exist. The document in the patch says ``` If the constraint is <literal>NOT ENFORCED</literal>, the database system will not check the constraint. It is then up to the application code to ensure that the constraints are satisfied. The database system might still assume that the data actually satisfies the constraint for optimization decisions where this does not affect the correctness of the result. ``` If a constraint is NOT VALID, NOT ENFORCED it can't be used for optimization. Constraints which are VALID, NOT ENFORCED can be used for optimizatin. That's a correct state if the application is faithfully making sure that the constraint is satisfied, as suggested in our documentation. Otherwise, I don't see how NOT ENFORCED constraints would be useful. > > > NOT VALID, NOT ENFORCED changed to NOT_VALID, ENFORCED - no data > > validation required, constraint is enforced on the new tuples/changes > > This may make sense, but it needs special nonstandard syntax. If you > start with a NOT VALID NOT ENFORCED constraint (which is the only way to > have a NOT ENFORCED constraint) and apply ALTER TABLE ALTER CONSTRAINT > ENFORCE, you will end up with a VALID ENFORCED constraint, therefore > validation must be run. > > If you wanted to add a nonstandard command > ALTER TABLE ALTER CONSTRAINT ENFORCE NO VALIDATE Which state transition needs it? ALTER TABLE ALTER CONSTRAINT ENFORCE is enough to change NOT VALID, NOT ENFORCED constraint to NOT VALID, ENFORCED constraint; it does not need NO VALIDATE. -- Best Wishes, Ashutosh Bapat
On 2025-Feb-03, Ashutosh Bapat wrote: > ``` > If the > constraint is <literal>NOT ENFORCED</literal>, the database system will > not check the constraint. It is then up to the application code to > ensure that the constraints are satisfied. The database system might > still assume that the data actually satisfies the constraint for > optimization decisions where this does not affect the correctness of the > result. > ``` IMO the third sentence should be removed because it is bogus. There's no situation in which a not-enforced constraint can be used for any query optimizations -- you cannot know if a constraint remains valid after it's been turned NOT ENFORCED, because anyone could insert data that violates it milliseconds after it stops being enforced. I think the expectation that the application is going to correctly enforce the constraint after it's told the database server not to enforce it, is going to be problematic. As I recall, we already do this in FDWs for instance and it's already a problem. The second sentence is also somewhat bogus, but not as much, because it doesn't have any implications for the database system. I think we should simply say that no assumptions can be made about not enforced constraints from the server side and that if the user wants to enforce these at the application side, it's up to them to keep it all straight. IMO if the patch is letting constraints that are marked NOT ENFORCED continue to be used for query optimization, the patch is bogus and should be fixed. For instance, I think CheckConstraintFetch() should skip adding to TupleDesc->constr any constraints that are marked as not enforced. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On 03.02.25 06:19, Ashutosh Bapat wrote: > Here's how I see the state conversions happening. > > NOT VALID, NOT ENFORCED changed to NOT_VALID, ENFORCED - no data > validation required, constraint is enforced on the new tuples/changes > NOT VALID, ENFORCED changed to NOT VALID, NOT ENFORCED - no data > validation, constraint isn't enforced anymore > VALID, NOT ENFORCED changed to VALID, ENFORCED - data validation > required, constraint is enforced > VALID, ENFORCED changed to VALID, NOT ENFORCED - no data validation > required, constrain isn't enforced anymore, we rely on user to enforce > the constraint on their side This looks sensible to me.
On 03.02.25 08:50, Alvaro Herrera wrote: > On 2025-Feb-03, Ashutosh Bapat wrote: > >> VALID, NOT ENFORCED changed to VALID, ENFORCED - data validation >> required, constraint is enforced > There's no such thing as a VALID NOT ENFORCED constraint. It just > cannot exist. The way I interpret this is that the VALID flag is just recording what would happen if the constraint was enforced. So you you take a [NOT] VALID ENFORCED constraint and switch it to NOT ENFORCED and back and you get back to where you started.
On 03.02.25 13:19, Alvaro Herrera wrote: > On 2025-Feb-03, Ashutosh Bapat wrote: > >> ``` >> If the >> constraint is <literal>NOT ENFORCED</literal>, the database system will >> not check the constraint. It is then up to the application code to >> ensure that the constraints are satisfied. The database system might >> still assume that the data actually satisfies the constraint for >> optimization decisions where this does not affect the correctness of the >> result. >> ``` > > IMO the third sentence should be removed because it is bogus. There's > no situation in which a not-enforced constraint can be used for any > query optimizations -- you cannot know if a constraint remains valid > after it's been turned NOT ENFORCED, because anyone could insert data > that violates it milliseconds after it stops being enforced. I think > the expectation that the application is going to correctly enforce the > constraint after it's told the database server not to enforce it, is > going to be problematic. As I recall, we already do this in FDWs for > instance and it's already a problem. The database system could use the presence of a not enforced constraint for selectivity estimation, for example.
On 2025-Feb-04, Peter Eisentraut wrote: > On 03.02.25 08:50, Alvaro Herrera wrote: > > On 2025-Feb-03, Ashutosh Bapat wrote: > > > > > VALID, NOT ENFORCED changed to VALID, ENFORCED - data validation > > > required, constraint is enforced > > There's no such thing as a VALID NOT ENFORCED constraint. It just > > cannot exist. > > The way I interpret this is that the VALID flag is just recording what would > happen if the constraint was enforced. So you you take a [NOT] VALID > ENFORCED constraint and switch it to NOT ENFORCED and back and you get back > to where you started. I think it is dangerous. You can easily end up with undetected violating rows in the table, and then you won't be able to dump/restore it anymore. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was amazing when I first started using it at 7.2, and I'm continually astounded by learning new features and techniques made available by the continuing work of the development team." Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php
On Mon, Feb 10, 2025, at 7:03 AM, Amul Sul wrote: > Attached patch implemented this behaviour. To achieve this, we have to > revert (see 0007) some committed code and relax the restriction that > the NOT ENFORCED constraint must also be NOT VALID. Now, NOT ENFORCED > and NOT VALID are independent statuses, and the psql-meta meta command > will display NOT VALID alongside the NOT ENFORCED constraint. > Previously, we hid NOT VALID for NOT ENFORCED constraints, assuming it > would be implied, but that is no longer the case. I think this proposed state of affairs is problematic. Current queries that assume that pg_constraint.convalidated meansthat a constraint is validated would be broken. My suggestion at this point is that instead of adding a separate booleancolumn to pg_constraint we should be replacing `bool convalidated` with `char convalidity`, with new defines for allthe possible states we require: enforced-and-valid ("V"alid), enforced-not-validated ("i"nvalid), not-enforced-and-not-valid(terribly "I"nvalid or maybe "U"nenforced), not-enforced-but-was-valid-before-turning-unenforced("u"nenforced). Breaking user queries would make all apps reassess whatdo they actually want to know about the constraint without assumptions of how enforcement worked in existing Postgresreleases. This shouldn't be a very large change from what you currently have, I think. > • v13-0001-Add-AlterConstraintStmt-struct-for-ALTER-.-CONST.patch I think the new node name is wrong, because it makes the code looks as if we support ALTER CONSTRAINT as a statement forthis, which is false. (This is a repeat of ReplicaIdentityStmt, which I think is a mistake). I would suggest a namelike ATAlterConstraint instead. Perhaps we can use that in the patch for ALTER CONSTRAINT ... SET [NO] INHERIT as well,instead of (as I suggested Suraj) creating a separate subcommand number.
On Mon, 10 Feb 2025 at 13:48, Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
I think this proposed state of affairs is problematic. Current queries that assume that pg_constraint.convalidated means that a constraint is validated would be broken. My suggestion at this point is that instead of adding a separate boolean column to pg_constraint we should be replacing `bool convalidated` with `char convalidity`, with new defines for all the possible states we require: enforced-and-valid ("V"alid), enforced-not-validated ("i"nvalid), not-enforced-and-not-valid (terribly "I"nvalid or maybe "U"nenforced), not-enforced-but-was-valid-before-turning-unenforced ("u"nenforced). Breaking user queries would make all apps reassess what do they actually want to know about the constraint without assumptions of how enforcement worked in existing Postgres releases.
I'm having a lot of trouble understanding the operational distinction between your 'u' and 'U'. If it's not enforced, it cannot be assumed to be valid, regardless of whether it was valid in the past. I'm not sure what I think of a single character vs. 2 booleans, but there are only 3 sensible states either way: valid enforced, invalid enforced, and invalid unenforced.
Additionally, if there are officially 4 status possibilities then all code that looks for unenforced constraints has to look for both valid and invalid unenforced constraints if we use a char; it's not as bad with 2 booleans because one can just check the "enforced" boolean.
On 2025-Feb-10, Isaac Morland wrote: > I'm having a lot of trouble understanding the operational distinction > between your 'u' and 'U'. If it's not enforced, it cannot be assumed to be > valid, regardless of whether it was valid in the past. I'm not sure what I > think of a single character vs. 2 booleans, but there are only 3 sensible > states either way: valid enforced, invalid enforced, and invalid unenforced. I kinda agree with you and would prefer that things were that way as well. But look at the discussion starting at https://postgr.es/m/CAExHW5tV23Sw+Nznv0KpdNg_t7LrXY1WM9atiC=eKKSsKHSnuQ@mail.gmail.com whereby it was apparently established that if you have a NOT VALID NOT ENFORCED constraint, and you make it enforced, then you should somehow end up with a NOT VALID ENFORCED constraint, which says to me that we need to store the fact that the constraint was NOT VALID to start with; and correspondingly if it's VALID NOT ENFORCED and you enforce it, then it ends up VALID ENFORCED. If we take this view of the world (with which, I repeat, I disagree) then we must keep track of whether the constraint was valid or not valid to start with. And this means that we need to keep convalidated=true _regardless_ of whether conenforced is false. So in this view of the world there aren't three states but four. I would prefer there to be three states as well, but apparently I'm outvoted on this. > Additionally, if there are officially 4 status possibilities then all code > that looks for unenforced constraints has to look for both valid and > invalid unenforced constraints if we use a char; it's not as bad with 2 > booleans because one can just check the "enforced" boolean. Well, yes. You have kinda the same issue with any other system catalog column that's a 'char', I guess. Maybe this is more of a problem here because it's more user-visible than most other catalogs, not sure. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
On Tue, 11 Feb 2025 at 08:36, Álvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2025-Feb-10, Isaac Morland wrote:
> I'm having a lot of trouble understanding the operational distinction
> between your 'u' and 'U'. If it's not enforced, it cannot be assumed to be
> valid, regardless of whether it was valid in the past. I'm not sure what I
> think of a single character vs. 2 booleans, but there are only 3 sensible
> states either way: valid enforced, invalid enforced, and invalid unenforced.
I kinda agree with you and would prefer that things were that way as
well. But look at the discussion starting at
https://postgr.es/m/CAExHW5tV23Sw+Nznv0KpdNg_t7LrXY1WM9atiC=eKKSsKHSnuQ@mail.gmail.com
whereby it was apparently established that if you have a
NOT VALID NOT ENFORCED
constraint, and you make it enforced, then you should somehow end up
with a NOT VALID ENFORCED constraint, which says to me that we need to
store the fact that the constraint was NOT VALID to start with; and
correspondingly if it's VALID NOT ENFORCED and you enforce it, then it
ends up VALID ENFORCED. If we take this view of the world (with which,
I repeat, I disagree) then we must keep track of whether the constraint
was valid or not valid to start with. And this means that we need to
keep convalidated=true _regardless_ of whether conenforced is false.
So in this view of the world there aren't three states but four.
I would prefer there to be three states as well, but apparently I'm
outvoted on this.
Sounds like we agree. I think the problem is with the statement in the linked discussion that “If the constraint is VALID and later marked as NOT ENFORCED, changing it to ENFORCED should also keep it VALID.” This ignores that if it is changed to NOT ENFORCED that should immediately change it to NOT VALID if it is not already so.
Has anybody argued for how it makes any sense at all to have a constraint that is VALID (and therefore will be assumed to be true by the planner), yet NOT ENFORCED (and therefore may well not be true)? What next, a patch to the planner so that it only treats as true constraints that are both VALID and ENFORCED?
Re: the 3 or 4 values for the single character status, there is a similar issue with relkind, where one can imagine writing "relkind IN ('r')" when one meant "relkind IN ('r', 'v')" or something else; but on the other hand, one can easily imagine actually wanting the first one of those. But here, it's not at all clear to me when you would ever want to distinguish between 'u' and 'U', but it is clear to me that it would be natural to write "… = 'U'" when one actually needs to write "… IN ('u', 'U')", or perhaps "… ILIKE 'u'" (not what I would want to see).
On Tue, Feb 11, 2025 at 9:09 PM Isaac Morland <isaac.morland@gmail.com> wrote: > > On Tue, 11 Feb 2025 at 08:36, Álvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> >> On 2025-Feb-10, Isaac Morland wrote: >> >> > I'm having a lot of trouble understanding the operational distinction >> > between your 'u' and 'U'. If it's not enforced, it cannot be assumed to be >> > valid, regardless of whether it was valid in the past. I'm not sure what I >> > think of a single character vs. 2 booleans, but there are only 3 sensible >> > states either way: valid enforced, invalid enforced, and invalid unenforced. >> >> I kinda agree with you and would prefer that things were that way as >> well. But look at the discussion starting at >> https://postgr.es/m/CAExHW5tV23Sw+Nznv0KpdNg_t7LrXY1WM9atiC=eKKSsKHSnuQ@mail.gmail.com >> whereby it was apparently established that if you have a >> NOT VALID NOT ENFORCED >> constraint, and you make it enforced, then you should somehow end up >> with a NOT VALID ENFORCED constraint, which says to me that we need to >> store the fact that the constraint was NOT VALID to start with; and >> correspondingly if it's VALID NOT ENFORCED and you enforce it, then it >> ends up VALID ENFORCED. If we take this view of the world (with which, >> I repeat, I disagree) then we must keep track of whether the constraint >> was valid or not valid to start with. And this means that we need to >> keep convalidated=true _regardless_ of whether conenforced is false. >> So in this view of the world there aren't three states but four. >> >> I would prefer there to be three states as well, but apparently I'm >> outvoted on this. > > > Sounds like we agree. I think the problem is with the statement in the linked discussion that “If the constraint is VALIDand later marked as NOT ENFORCED, changing it to ENFORCED should also keep it VALID.” This ignores that if it is changedto NOT ENFORCED that should immediately change it to NOT VALID if it is not already so. > > Has anybody argued for how it makes any sense at all to have a constraint that is VALID (and therefore will be assumedto be true by the planner), yet NOT ENFORCED (and therefore may well not be true)? What next, a patch to the plannerso that it only treats as true constraints that are both VALID and ENFORCED? I have been asking a different question: What's the use of not-enforced constraints if we don't allow VALID, NOT ENFORCED state for them? OTOH, consider an application which "knows" that the constraint is valid for the data (either because of checks at application level, or because the data was replicated from some other system where the cosntraints were applied). It's a natural ask to use the constraints for, say optimization, but don't take unnecessary overhead of validating them. VALID, NOT ENFORCED state helps in such a scenario. Of course an application can misuse it (just like stable marking on a function), but well ... they will be penalised for their misuse. -- Best Wishes, Ashutosh Bapat
On 2025-Feb-12, Ashutosh Bapat wrote: > I have been asking a different question: What's the use of > not-enforced constraints if we don't allow VALID, NOT ENFORCED state > for them? That's a question for the SQL standards committee. They may serve schema documentation purposes, for example. https://www.postgresql.eu/events/pgconfeu2024/schedule/session/5677-exploring-postgres-databases-with-graphs/ > OTOH, consider an application which "knows" that the constraint is > valid for the data (either because of checks at application level, or > because the data was replicated from some other system where the > cosntraints were applied). It's a natural ask to use the constraints > for, say optimization, but don't take unnecessary overhead of > validating them. VALID, NOT ENFORCED state helps in such a scenario. > Of course an application can misuse it (just like stable marking on a > function), but well ... they will be penalised for their misuse. I disagree that we should see a VALID NOT ENFORCED constraint as one that can be used for query optimization purposes. This is only going to bring users pain, because it's far too easy to misuse and they will get wrong query results, possibly without knowing for who knows how long. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "El número de instalaciones de UNIX se ha elevado a 10, y se espera que este número aumente" (UPM, 1972)
On 11.02.25 14:36, Álvaro Herrera wrote: > On 2025-Feb-10, Isaac Morland wrote: > >> I'm having a lot of trouble understanding the operational distinction >> between your 'u' and 'U'. If it's not enforced, it cannot be assumed to be >> valid, regardless of whether it was valid in the past. I'm not sure what I >> think of a single character vs. 2 booleans, but there are only 3 sensible >> states either way: valid enforced, invalid enforced, and invalid unenforced. > > I kinda agree with you and would prefer that things were that way as > well. But look at the discussion starting at > https://postgr.es/m/CAExHW5tV23Sw+Nznv0KpdNg_t7LrXY1WM9atiC=eKKSsKHSnuQ@mail.gmail.com > whereby it was apparently established that if you have a > NOT VALID NOT ENFORCED > constraint, and you make it enforced, then you should somehow end up > with a NOT VALID ENFORCED constraint, which says to me that we need to > store the fact that the constraint was NOT VALID to start with; and > correspondingly if it's VALID NOT ENFORCED and you enforce it, then it > ends up VALID ENFORCED. If we take this view of the world (with which, > I repeat, I disagree) then we must keep track of whether the constraint > was valid or not valid to start with. And this means that we need to > keep convalidated=true _regardless_ of whether conenforced is false. > So in this view of the world there aren't three states but four. > > I would prefer there to be three states as well, but apparently I'm > outvoted on this. Just to make this a bit more confusing, here is another interpretation of the state NOT ENFORCED VALID (they call it DISABLE VALIDATE): https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/constraint.html#GUID-1055EA97-BA6F-4764-A15F-1024FD5B6DFE__I1002349 """ DISABLE VALIDATE disables the constraint and drops the index on the constraint, but keeps the constraint valid. This feature is most useful in data warehousing situations, because it lets you load large amounts of data while also saving space by not having an index. This setting lets you load data from a nonpartitioned table into a partitioned table using the exchange_partition_subpart clause of the ALTER TABLE statement or using SQL*Loader. All other modifications to the table (inserts, updates, and deletes) by other SQL statements are disallowed. """
On 12.02.25 12:13, Álvaro Herrera wrote: > On 2025-Feb-12, Ashutosh Bapat wrote: > >> I have been asking a different question: What's the use of >> not-enforced constraints if we don't allow VALID, NOT ENFORCED state >> for them? > > That's a question for the SQL standards committee. They may serve > schema documentation purposes, for example. > https://www.postgresql.eu/events/pgconfeu2024/schedule/session/5677-exploring-postgres-databases-with-graphs/ > >> OTOH, consider an application which "knows" that the constraint is >> valid for the data (either because of checks at application level, or >> because the data was replicated from some other system where the >> cosntraints were applied). It's a natural ask to use the constraints >> for, say optimization, but don't take unnecessary overhead of >> validating them. VALID, NOT ENFORCED state helps in such a scenario. >> Of course an application can misuse it (just like stable marking on a >> function), but well ... they will be penalised for their misuse. > > I disagree that we should see a VALID NOT ENFORCED constraint as one > that can be used for query optimization purposes. This is only going to > bring users pain, because it's far too easy to misuse and they will get > wrong query results, possibly without knowing for who knows how long. I've been digging into the ISO archives for some more background on the intended meaning of this feature. Result: "NOT ENFORCED" just means "off" or "disabled", "could contain anything". You can use this to do data loads, or schema surgery, or things like that. Or just if you want it for documentation. This idea that a not-enforced constraint should contain valid data anyway is not supported by anything I could find written down. I've heard that in discussions, but those could have been speculations. (I still think that could be a feature, but it's clearly not this one, at least not in its default state.) So considering that, I think a three-state system makes more sense. Something like: 1) NOT ENFORCED -- no data is checked 2) NOT VALID -- existing data is unchecked, new data is checked 3) ENFORCED -- all data is checked Transitions: (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2) (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ] -> (3) (2) - [ ALTER TABLE ... VALIDATE CONSTRAINT ... ] -> (3) (2|3) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ] -> (1) (3) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2)
On Wed, Feb 12, 2025 at 8:15 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 12.02.25 12:13, Álvaro Herrera wrote: > > On 2025-Feb-12, Ashutosh Bapat wrote: > > > >> I have been asking a different question: What's the use of > >> not-enforced constraints if we don't allow VALID, NOT ENFORCED state > >> for them? > > > > That's a question for the SQL standards committee. They may serve > > schema documentation purposes, for example. > > https://www.postgresql.eu/events/pgconfeu2024/schedule/session/5677-exploring-postgres-databases-with-graphs/ > > > >> OTOH, consider an application which "knows" that the constraint is > >> valid for the data (either because of checks at application level, or > >> because the data was replicated from some other system where the > >> cosntraints were applied). It's a natural ask to use the constraints > >> for, say optimization, but don't take unnecessary overhead of > >> validating them. VALID, NOT ENFORCED state helps in such a scenario. > >> Of course an application can misuse it (just like stable marking on a > >> function), but well ... they will be penalised for their misuse. > > > > I disagree that we should see a VALID NOT ENFORCED constraint as one > > that can be used for query optimization purposes. This is only going to > > bring users pain, because it's far too easy to misuse and they will get > > wrong query results, possibly without knowing for who knows how long. > > I've been digging into the ISO archives for some more background on the > intended meaning of this feature. > > Result: "NOT ENFORCED" just means "off" or "disabled", "could contain > anything". You can use this to do data loads, or schema surgery, or > things like that. Or just if you want it for documentation. Hmm, so one can convert an enforced constraint to a not-enforced constraint, load the data or make changes and then enforce it again. Makes sense. > > This idea that a not-enforced constraint should contain valid data > anyway is not supported by anything I could find written down. I've > heard that in discussions, but those could have been speculations. > > (I still think that could be a feature, but it's clearly not this one, > at least not in its default state.) Thanks for the background. > > So considering that, I think a three-state system makes more sense. > Something like: > > 1) NOT ENFORCED -- no data is checked > 2) NOT VALID -- existing data is unchecked, new data is checked > 3) ENFORCED -- all data is checked > > Transitions: > > (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2) Per your notation, this means the the constraint is not enforced but new data is checked - that seems a contradiction, how would we check the data when the constraint is not being enforced. Or do you suggest that we convert a NOT ENFORCED constraint to ENFORCED as a result of converting it to NOT VALID? > (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ] -> (3) Seems ok. > (2) - [ ALTER TABLE ... VALIDATE CONSTRAINT ... ] -> (3) As a result of this a not enforced constraint would turn into an enforced constraint. The user might have intended to just validate the data but not enforce it to avoid paying price for the checks on new data. > (2|3) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ] -> (1) Looks fine. > (3) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2) > This too seems ok assuming the constraint would remain enforced. I think, what you intend to say is clearer with 4 state system {NE, E} * {NV, V} = {(NE, NV), (NE, V), (E, NV), (E, V)} where (NE, V) is unreachable. Let's name them S1, S2, S3, S4 respectively. S1 -> [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> S1 - noop S3 -> [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> S3 - noop S4 -> [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> S3 S1->[ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ]->S4 S3->[ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ]->S3 - noop S4->[ ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED ]->S4 - noop S1->[ ALTER TABLE ... VALIDATE CONSTRAINT ... ]->S1 - but this is not noop - the existing data gets validated but no change happens to the state of the constraint - it is not enforced on the future data and it's not considered valid. This gives opportunity to the user to just validate the existing data but not enforce the constraint on new data thus avoiding some computation on the new data. Of course we will have to update the documentation to clearly specify the result. I think VALIDATE CONSTRAINT is postgresql extension so we are free to interpret it in the context of ENFORCED feature. S3->[ ALTER TABLE ... VALIDATE CONSTRAINT ... ]->S4 S4->[ ALTER TABLE ... VALIDATE CONSTRAINT ... ]->S4 - noop S1-[ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ]->S1 - noop S3-[ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ]->S1 S4-[[ ALTER TABLE ... ALTER CONSTRAINT ... NOT ENFORCED ]->S1 Notice that there are no edges to and from S2. -- Best Wishes, Ashutosh Bapat
On 2025-Feb-13, Ashutosh Bapat wrote: > > So considering that, I think a three-state system makes more sense. > > Something like: > > > > 1) NOT ENFORCED -- no data is checked > > 2) NOT VALID -- existing data is unchecked, new data is checked > > 3) ENFORCED -- all data is checked > > > > Transitions: > > > > (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2) > > Per your notation, this means the the constraint is not enforced but > new data is checked - that seems a contradiction, how would we check > the data when the constraint is not being enforced. Or do you suggest > that we convert a NOT ENFORCED constraint to ENFORCED as a result of > converting it to NOT VALID? I agree this one is a little weird. For this I would have the command be ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED NOT VALID this way it's explicit that what we want is flip the ENFORCED bit while leaving NOT VALID as-is. > > (2) - [ ALTER TABLE ... VALIDATE CONSTRAINT ... ] -> (3) > > As a result of this a not enforced constraint would turn into an > enforced constraint. The user might have intended to just validate the > data but not enforce it to avoid paying price for the checks on new > data. I'm not sure there's a use case for validating existing data without starting to enforce the constraint. The data can become invalid immediately after you've run the command, so why bother? > I think, what you intend to say is clearer with 4 state system {NE, E} > * {NV, V} = {(NE, NV), (NE, V), (E, NV), (E, V)} where (NE, V) is > unreachable. Let's name them S1, S2, S3, S4 respectively. [...] > Notice that there are no edges to and from S2. So why list it as a possible state? -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Las mujeres son como hondas: mientras más resistencia tienen, más lejos puedes llegar con ellas" (Jonas Nightingale, Leap of Faith)
On Thu, Feb 13, 2025 at 5:27 PM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2025-Feb-13, Ashutosh Bapat wrote: > > > > So considering that, I think a three-state system makes more sense. > > > Something like: > > > > > > 1) NOT ENFORCED -- no data is checked > > > 2) NOT VALID -- existing data is unchecked, new data is checked > > > 3) ENFORCED -- all data is checked > > > > > > Transitions: > > > > > > (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2) > > > > Per your notation, this means the the constraint is not enforced but > > new data is checked - that seems a contradiction, how would we check > > the data when the constraint is not being enforced. Or do you suggest > > that we convert a NOT ENFORCED constraint to ENFORCED as a result of > > converting it to NOT VALID? > > I agree this one is a little weird. For this I would have the command > be > ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED NOT VALID > this way it's explicit that what we want is flip the ENFORCED bit while > leaving NOT VALID as-is. > > > > (2) - [ ALTER TABLE ... VALIDATE CONSTRAINT ... ] -> (3) > > > > As a result of this a not enforced constraint would turn into an > > enforced constraint. The user might have intended to just validate the > > data but not enforce it to avoid paying price for the checks on new > > data. > > I'm not sure there's a use case for validating existing data without > starting to enforce the constraint. The data can become invalid > immediately after you've run the command, so why bother? Validating whole table at a time is cheaper than doing it for every row as it appears. So the ability to validate data in batches at regular intervals instead of validating every row has some attractiveness, esp in huge data/analytics cases. And we could implement it without much cost. But I don't have a concrete usecase. > > > I think, what you intend to say is clearer with 4 state system {NE, E} > > * {NV, V} = {(NE, NV), (NE, V), (E, NV), (E, V)} where (NE, V) is > > unreachable. Let's name them S1, S2, S3, S4 respectively. > [...] > > Notice that there are no edges to and from S2. > > So why list it as a possible state? For the sake of combinatorics. :) -- Best Wishes, Ashutosh Bapat
On Fri, 14 Feb 2025 at 10:11, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> > I think, what you intend to say is clearer with 4 state system {NE, E}
> > * {NV, V} = {(NE, NV), (NE, V), (E, NV), (E, V)} where (NE, V) is
> > unreachable. Let's name them S1, S2, S3, S4 respectively.
> [...]
> > Notice that there are no edges to and from S2.
>
> So why list it as a possible state?
For the sake of combinatorics. :)
Just because there are 2^n combinations of n boolean values does not mean there are 2^n actual meaningful states. That's why we have CHECK constraints.
On Fri, Feb 14, 2025 at 8:41 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Thu, Feb 13, 2025 at 5:27 PM Álvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > On 2025-Feb-13, Ashutosh Bapat wrote: > > > > > > So considering that, I think a three-state system makes more sense. > > > > Something like: > > > > > > > > 1) NOT ENFORCED -- no data is checked > > > > 2) NOT VALID -- existing data is unchecked, new data is checked > > > > 3) ENFORCED -- all data is checked > > > > > > > > Transitions: > > > > > > > > (1) - [ ALTER TABLE ... ALTER CONSTRAINT ... NOT VALID ] -> (2) > > > > > > Per your notation, this means the the constraint is not enforced but > > > new data is checked - that seems a contradiction, how would we check > > > the data when the constraint is not being enforced. Or do you suggest > > > that we convert a NOT ENFORCED constraint to ENFORCED as a result of > > > converting it to NOT VALID? > > > > I agree this one is a little weird. For this I would have the command > > be > > ALTER TABLE ... ALTER CONSTRAINT ... ENFORCED NOT VALID > > this way it's explicit that what we want is flip the ENFORCED bit while > > leaving NOT VALID as-is. > > > > > > (2) - [ ALTER TABLE ... VALIDATE CONSTRAINT ... ] -> (3) > > > > > > As a result of this a not enforced constraint would turn into an > > > enforced constraint. The user might have intended to just validate the > > > data but not enforce it to avoid paying price for the checks on new > > > data. > > > > I'm not sure there's a use case for validating existing data without > > starting to enforce the constraint. The data can become invalid > > immediately after you've run the command, so why bother? > > Validating whole table at a time is cheaper than doing it for every > row as it appears. So the ability to validate data in batches at > regular intervals instead of validating every row has some > attractiveness, esp in huge data/analytics cases. And we could > implement it without much cost. But I don't have a concrete usecase. > Well, I’m not sure if it’s worth validating data in batches when we don’t maintain their state, as this would lead to revalidating the same data in the next validation along with newly inserted records. Also, based on the current implementation, we can perform CHECK constraint validation, but not FOREIGN KEY constraint validation, since the necessary triggers for referential integrity checks haven’t been created for NOT ENFORCED. While we can create those triggers, it’s unclear whether we want to keep them around if they aren’t being used for NOT ENFORCED constraints. Regards, Amul