Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW - Mailing list pgsql-hackers

From Dharin Shah
Subject Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW
Date
Msg-id CAOj6k6e7fw8RjAWXc04_A=sg4=jsU0CK7Qi13fwkPW5hMz6a5w@mail.gmail.com
In response to Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW  (Dharin Shah <dharinshah95@gmail.com>)
List pgsql-hackers
(repro_include_issue.sql)

Typo fix: test_include_bug.sql (attached file)

Thanks,
Dharin

On Thu, Jan 15, 2026 at 7:46 PM Dharin Shah <dharinshah95@gmail.com> wrote:
Hey Adam,

Apologies for the delay. As promised on Discord, I did a review of the current patch (cf 6305) and wanted to share findings that line up with the thread's design discussion, plus one additional correctness bug that I could reproduce.

1. In the non-concurrent REFRESH ... WHERE ... path, the UPSERT SQL is built from the unique index metadata. The code currently uses indnatts when building the ON CONFLICT (...) target list. That count includes INCLUDE columns, so for an index like:

CREATE UNIQUE INDEX ON mv(id) INCLUDE (extra);
the generated statement becomes effectively ON CONFLICT (id, extra) ..., which fails with:
ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification

The fix appears straightforward: use indnkeyatts (key attributes only) when generating the conflict target, and also when deciding which columns count as "key" for the UPDATE SET clause. I've attached a minimal repro SQL script (repro_include_issue.sql).
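To make the off-by-one-field concrete, here is a minimal Python sketch (not the patch's C code; the field names just mirror pg_index, where indnatts counts all stored columns and indnkeyatts counts only key columns) of how the conflict target should be built:

```python
# Sketch of building the ON CONFLICT target from unique-index metadata.
# Hypothetical simplification of the patch's logic, not PostgreSQL source.
from dataclasses import dataclass

@dataclass
class UniqueIndexMeta:
    columns: list      # all indexed columns, key columns first
    indnatts: int      # total attributes, including INCLUDE columns
    indnkeyatts: int   # key attributes only

def conflict_target(idx: UniqueIndexMeta) -> str:
    # Correct: slice by indnkeyatts. The buggy variant sliced by indnatts,
    # pulling in INCLUDE columns and producing ON CONFLICT (id, extra),
    # which matches no unique constraint and raises the reported error.
    key_cols = idx.columns[:idx.indnkeyatts]
    return "ON CONFLICT (" + ", ".join(key_cols) + ")"

# CREATE UNIQUE INDEX ON mv(id) INCLUDE (extra);
idx = UniqueIndexMeta(columns=["id", "extra"], indnatts=2, indnkeyatts=1)
print(conflict_target(idx))  # ON CONFLICT (id)
```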

2. Another small test-quality issue: the regression script has a comment "Subqueries -> Error", but the expected output shows no error for the schema-qualified subquery. There is no explicit check forbidding subqueries in transformRefreshWhereClause(), so schema-qualified subqueries appear to be allowed.

Moving on to broader questions:
> I believe the issue is that the DELETE -> INSERT strategy leaves a consistency gap. Since we relied on ROW EXCLUSIVE locks to allow concurrent reads, the moment we delete the rows, we lose the physical lock on them. If a concurrent transaction inserts a colliding row during that gap, the materialized view ends up inconsistent with the base query (or hits a constraint violation).

The consistency gap in the non-concurrent mode matches what I'd expect: ROW EXCLUSIVE allows concurrent readers and writers, and a pure DELETE -> INSERT approach creates a window in which the old tuple is gone and a concurrent session can insert a conflicting logical row.
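The interleaving is easy to see in a toy simulation (assumed semantics, not PostgreSQL code; the dict stands in for the MV keyed by its unique column):

```python
# Toy interleaving of the DELETE -> INSERT window. Once the refresh deletes
# a row, the row lock dies with the tuple, so a concurrent writer can insert
# a colliding logical row before the refresh re-inserts its own version.
mv = {1: "old"}          # materialized view, keyed by unique id

# refresh session, step 1: delete rows matching the WHERE clause
del mv[1]                # row lock on id=1 is released with the tuple

# concurrent session: sneaks into the gap
mv[1] = "concurrent"     # nothing blocks this insert now

# refresh session, step 2: insert the freshly computed row
outcome = None
try:
    if 1 in mv:
        raise RuntimeError("duplicate key value violates unique constraint")
    mv[1] = "refreshed"
except RuntimeError as e:
    outcome = str(e)     # the refresh hits the constraint violation

print(outcome)
print(mv[1])             # the MV keeps the concurrent row, not the refresh's
```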

That said, I think it would help the patch to explicitly define the intended safety model:
1. Is the goal to be safe against concurrent DML on base tables only (i.e., refresh sees a snapshot and updates MV accordingly), or also to be safe against concurrent partial refreshes and direct writes to the MV (when maintenance is enabled)?
2. Should the non-concurrent partial refresh be “best effort” like normal DML (user coordinates), or should it be “maintenance-like” (serialized / logically safe by default)?

If the intent is "safe by default", I'd encourage documenting very clearly what is guaranteed, and adding regression tests and README-style notes for the known footguns.

From a reviewer standpoint, I think the feature concept is sound and valuable, but it needs a crisp statement of semantics and safety boundaries. The tricky part is exactly what you called out: incremental refresh implies concurrency questions that aren’t present with full rebuild + strong locks.

I’m happy to keep reviewing iterations (especially around the advisory lock approach), and I’ll attach the reproduction scripts and notes I used.

As a possible staging approach: it might be simplest to start with a conservative serialization model for non-concurrent WHERE (while still allowing readers), and then iterate toward finer-grained logical locking if/when needed for throughput.


Thanks,
Dharin


On Sun, Jan 4, 2026 at 3:56 AM Adam Brusselback <adambrusselback@gmail.com> wrote:
Hi all,

I've been running more concurrency tests against this patch (specifically looking for race conditions), and I found a flaw in the implementation of the REFRESH ... WHERE ... mode (without CONCURRENTLY).

I believe the issue is that the DELETE -> INSERT strategy leaves a consistency gap. Since we relied on ROW EXCLUSIVE locks to allow concurrent reads, the moment we delete the rows, we lose the physical lock on them. If a concurrent transaction inserts a colliding row during that gap, the materialized view ends up inconsistent with the base query (or hits a constraint violation).

I was initially using SELECT ... FOR UPDATE to lock the rows before modification, but that lock is (as I now realize) lost when the row is deleted.

My plan is to replace that row-locking strategy with transaction-level advisory locks inside the refresh logic:

Before the DELETE, run a SELECT pg_advisory_xact_lock(mv_oid, hashtext(ROW(unique_keys)::text)) for the rows matching the WHERE clause.

This effectively locks the "logical" ID of the row, preventing concurrent refreshes on the same ID even while the physical tuple is temporarily gone. Hash collisions should only cause unnecessary blocking, not correctness issues, as far as I can tell.
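The lock-key shape can be sketched as follows. This is a Python simulation of the proposal, not the patch: pg_advisory_xact_lock(int4, int4) takes two 32-bit keys, so the "logical row" is keyed by (mv_oid, a hash of the unique-key tuple folded into the signed int4 range, standing in for hashtext(ROW(...)::text)):

```python
# Sketch (assumptions, not PostgreSQL source) of the advisory-lock keying.
import hashlib

def logical_lock_key(mv_oid: int, unique_key: tuple):
    text = repr(unique_key)  # stand-in for ROW(unique_keys)::text
    h = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    # fold into the signed int4 range, matching hashtext's return type
    key2 = h - 2**32 if h >= 2**31 else h
    return (mv_oid, key2)

held = set()

def advisory_xact_lock(key):
    # In PostgreSQL the second taker would block until the holder commits;
    # here we just record the holder to show collisions serialize rather
    # than corrupt: two refreshes on the same logical row cannot interleave.
    if key in held:
        return "blocks"
    held.add(key)
    return "acquired"

k = logical_lock_key(16385, (1,))          # mv_oid here is made up
print(advisory_xact_lock(k))               # acquired
print(advisory_xact_lock(k))               # blocks
```

A hash collision between two different logical rows just means the second refresh waits on the first unnecessarily, which supports the claim above that collisions hurt throughput, not correctness.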

However, before I sink time into implementing that fix:

Is there general interest in having REFRESH MATERIALIZED VIEW ... WHERE ... in core?
If the community feels this feature is a footgun or conceptually wrong for Postgres, I'd rather know now before spending more time on this.

If the feature concept is sound, does the advisory lock approach seem like the right way to handle the concurrency safety here?

Thanks,
Adam Brusselback
