[PATCH] Proposal: Allow reads to proceed during FK/trigger drops by reducing relation-level lock from AccessExclusive to ShareRowExclusive - Mailing list pgsql-hackers

From Shayon Mukherjee
Subject [PATCH] Proposal: Allow reads to proceed during FK/trigger drops by reducing relation-level lock from AccessExclusive to ShareRowExclusive
Date
Msg-id CANqtF-rbqX6Q+_NroHCaopcvSZVX6ejFrHv-WLgOKj2RdMH3Hw@mail.gmail.com
Whole thread Raw
In response to Proposal for discussions: Allow reads to proceed during FK/trigger drops by reducing relation-level lock from AccessExclusive to ShareRowExclusive  (Shayon Mukherjee <shayonj@gmail.com>)
Responses Re: [PATCH] Proposal: Allow reads to proceed during FK/trigger drops by reducing relation-level lock from AccessExclusive to ShareRowExclusive
List pgsql-hackers
Hello,

Following up on the previous thread - I took a stab at trying to see what a full patch for the proposal to reduce lock levels during FK/trigger drops would look like, and this is what I ended up with.

Recap on the problem space:

When dropping a foreign key constraint (or a table that owns one), PostgreSQL currently takes AccessExclusiveLock on both the table being altered and the referenced table for FKs. This blocks all access, including plain SELECTs, on the other table during FK/trigger removal.

Concrete example:
- Setup: pktable(id PK), fktable(id PK, fk → pktable)
- Action: BEGIN; DROP TABLE fktable
- From another shell: SELECTs on pktable are blocked while fktable's RI action triggers are removed from pktable, even though pktable itself is not being dropped

Why it matters:
In busy production systems, this creates a brief but total read outage on the referenced table. We've seen cases where webhook handlers doing simple SELECTs hit lock_timeout and failed during a routine table drop, because the referenced table (tenants) was locked exclusively to remove FK triggers.

My hope with this patch is that it reduces the lock on the referenced table from AccessExclusive to ShareRowExclusive, allowing SELECTs to proceed while still blocking writers during the brief FK/trigger removal window.

What this patch does:
- Reduces the relation-level lock from AccessExclusive to ShareRowExclusive in RemoveConstraintById(), RemoveTriggerById(), and dropconstraint_internal()
- This allows SELECTs on the other table (e.g., the referenced table in an FK relationship) to proceed during FK/trigger removal
- The table being directly altered/dropped still gets AccessExclusive as before
- Writers remain blocked (ShareRowExclusive conflicts with RowExclusiveLock), so there's no window where DML can bypass enforcement

What it also includes:
- Lock changes in 3 backend files (pg_constraint.c, trigger.c, tablecmds.c)
- Updated isolation tests (detach-partition-concurrently-4 + new fk-drop-constraint-concurrency)
- Documentation updates (alter_table.sgml)
- 8 test permutations covering: regular FKs, DROP TABLE, self-referential FKs, ALTER COLUMN TYPE, prepared plans

All FK-related isolation tests pass. The changes are minimal and surgical—just lock level adjustments with updated comments.

I'd be curious to hear any thoughts/feedback on this, especially:
- Whether the ShareRowExclusive approach makes sense or if there are edge cases I'm missing
- Hot Standby implications (this only affects primary; standbys still use AccessExclusive during WAL replay)
- Any concerns around event triggers, extensions, or operational tooling that might have relied on the stronger lock behavior

Patch attached.

Thanks,
Shayon

On Tue, Oct 7, 2025 at 12:54 PM Shayon Mukherjee <shayonj@gmail.com> wrote:
Hello hackers,

I stumbled upon a case and wanted to raise a quick thread to get some feedback on this.

Basically, dropping a foreign key constraint or a table that owns an FK currently blocks reads on the other table due to AccessExclusive locks taken while removing the FK’s internal triggers and constraint metadata. In busy systems, this short full-read outage can cause user-visible timeouts for otherwise read-only traffic. Similar topic discussed here as well [1]

Example setup
- pktable(id primary key)
- fktable(id primary key, fk references pktable(id))

Where reads get blocked today
- ALTER TABLE fktable DROP CONSTRAINT fkname:
  - The FK lives on fktable, and its RI action triggers live on pktable.
  - Both tables see AccessExclusive-level effects during removal, so SELECTs on either can be blocked while the FK and triggers are dropped.

- DROP TABLE fktable:
  - fktable is dropped with AccessExclusive (expected).
  - While removing RI action triggers on pktable, pktable also sees an AccessExclusive lock, so SELECTs on pktable can be blocked even though pktable is not being dropped.

- DROP TABLE pktable CASCADE:
  - pktable is dropped with AccessExclusive (expected).
  - FK removal on fktable (check triggers) also introduces an AccessExclusive effect on fktable, so SELECTs on fktable can be blocked.

Proposal
I wanted to see if we could reduce the relation-level lock used specifically for FK/trigger removal from AccessExclusive to ShareRowExclusive. This keeps readers moving while still blocking writers during the small window where RI triggers/constraint rows are removed and relcache is invalidated. Ideally, with a change like this `ALTER TABLE fktable DROP CONSTRAINT fkname` should take ShareRowExclusive for fktable and pktable, and `DROP TABLE fktable` should take ShareRowExclusive for fktable and AccessExclusive for pktable.

Next, dropping fktable would take ShareRowExclusive for fktable and AccessExclusive for pktable. And DROP TABLE pktable CASCADE would take AccessExclusive for pktable and ShareRowExclusive for fktable.

My understanding is that the table being dropped still uses AccessExclusive; reads/writes on that table remain blocked as today while it updates relcache, updating metadata/catalog entries and other cleanup tasks.

Also, I believe ShareRowExclusive would still serialize writers, so there should be no window where DML can bypass enforcement during removal. So this change should not affect correctness?

I just attached a small patch to get the idea across. If folks think this direction makes sense, I am happy to work on a more complete patch as well. If I’ve missed any critical cases that truly need AccessExclusive at the relation level during FK/trigger removal, especially around partitions, pending trigger events, hot standby, logical decoding, event triggers or something else perhaps please let me know.

[1] https://www.postgresql.org/message-id/flat/2bdf0fe5679f44ebf56890dfd5c5ead21a0c0bc3.camel%40cybertec.at#c50baf640c680c9e3f4def34b565d3b9

Thanks for your time and feedback.
Shayon
Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Should we update the random_page_cost default value?
Next
From: Nathan Bossart
Date:
Subject: Re: [PATCH] Remove unused #include's in src/backend/commands/*