Thread: Index not used in certain nested views but not in others

Index not used in certain nested views but not in others

From
Markus Demleitner
Date:
Dear List,

I know how tedious mails with a subject of the type "I don't understand
what the planner does" are, but on this one I'm really stumped.
Regrettably, the situation is also a bit complex.  Hopefully, someone
will bear with me.

So, in a PostgreSQL 15.12 I have a view over a single table with ~20
columns (the only relevant columns here are the ones that somehow
contain "pub[lisher]_did", the others are just there for context; I'm
going to call these "pubdids" from here on in the prose):

  CREATE OR REPLACE VIEW ivoa.obs_radio AS
   SELECT main.obs_publisher_did,
      main.s_resolution_min,
      main.s_resolution_max,
      NULL::real AS s_fov_min,
      [...]
     FROM emi.main

(emi.main is a physical table).

There is another view made up of about 20 tables, looking somewhat
like this:

CREATE OR REPLACE VIEW ivoa.obscore AS
 SELECT 'image'::text AS dataproduct_type,
    NULL::text AS dataproduct_subtype,
    2::smallint AS calib_level,
    'PPAKM31'::text AS obs_collection,
        [...]
    'ivo://org.gavo.dc/~?'::text || gavo_urlescape(maps.accref) AS obs_publisher_did,
   [...]
   FROM ppakm31.maps
UNION ALL
    [lots of similar definitions]
UNION ALL
 SELECT ssa.ssa_dstype AS dataproduct_type,
    NULL::text AS dataproduct_subtype,
    [...]
    ssa.ssa_pubdid AS obs_publisher_did,
    [...]
   FROM dfbsspec.ssa
UNION ALL
    [and still more]

The dfbsspec.ssa in this definition is another view:

CREATE OR REPLACE VIEW dfbsspec.ssa AS
 SELECT q.accref,
    q.owner,
    [...]
    q.ssa_pubdid,
    [...]
   FROM ( SELECT raw_spectra.accref,
           [...]
           raw_spectra.pub_did AS ssa_pubdid,
              [...]
           FROM dfbsspec.raw_spectra
             LEFT JOIN dfbsspec.platemeta ON platemeta.plateid = raw_spectra.plate) q

raw_spectra finally is a physical table that has an index:

    "raw_spectra_pub_did" btree (pub_did)

The first view, ivoa.obs_radio, is just a few hundred records,
dfbsspec.raw_spectra is about 23 Megarows, the total ivoa.obscore is
about 100 MRows which occasionally change, so materialising it is
*really* unattractive.  The pubdids are strings of about 40 characters.

You may argue that this whole system looks a bit insane, but of course
this is part of a large metadata handling suite, and all these views
are, in some sense, more or less automatic adaptations to different
metadata schemes, and dramatic simplifications are at least not entriely
trivial.  So, can you assume for the moment that I can't get rid of the
nested views?

Now, when I say

  EXPLAIN ANALYZE SELECT COUNT(*)
    FROM ivoa.obscore
    JOIN ivoa.obs_radio
    USING (obs_publisher_did);

I get:

      Finalize Aggregate  (cost=5114082.70..5114082.71 rows=1 width=8) (actual time=22595.715..22731.950 rows=1
loops=1)
    [...]
                          ->  Parallel Append  (cost=0.56..4800918.33 rows=19267799 width=40) (actual
time=1.566..18985.964rows=15410027 loops=5)
 
                                ->  Parallel Index Only Scan using phot_r_pkey on phot_r  (cost=0.56..754384.72
rows=5118036width=32) (actual time=0.854..7995.762 rows=10197024 loops=2)
 
                                      Heap Fetches: 0

    [...and  lot more of these that have simple pubdid indexes on plain
    tables, the point being: Postgres *does* use pubdid indexes...]

                                ->  Subquery Scan on "*SELECT* 13"  (cost=0.00..2685028.32 rows=5803266 width=58)
(actualtime=0.142..7554.269 rows=4642657 loops=5)
 
                                      ->  Parallel Seq Scan on raw_spectra  (cost=0.00..2626995.66 rows=5803266
width=756)(actual time=0.137..6841.379 rows=4642657 loops=5)
 
    [... and a few more seqscans where there's no index on the pubdid
    because they are small, and one or two similar cases]

My problem is: I can't seem to figure out why Postgres chooses to ignore
the pubdid index on raw_spectra.pub_did and instead does the
time-consuming seqscan.

I thought maybe the genetic optimiser has kicked in because of the large
number of tables and SELECTs in there and chose a suboptimal plan.  But
switching off the genetic optimiser doesn't change the plan.

Trying to investigate more closely, I wanted to simplify the
situation and created a view like ivoa.obscore but only having the
evil table in it:

CREATE TEMPORARY VIEW bla AS (SELECT
                       [...]
                       CAST(ssa_pubdid AS text) AS obs_publisher_did,
                       [...]
FROM dfbsspec.ssa)

When I then say

EXPLAIN ANALYZE SELECT COUNT(*)
  FROM ivoa.obs_radio
  JOIN bla USING (obs_publisher_did);

the query plan looks like this:

 Aggregate  (cost=4873.00..4873.01 rows=1 width=8) (actual time=2.484..2.486 rows=1 loops=1)
   ->  Nested Loop  (cost=0.56..4871.60 rows=561 width=0) (actual time=2.478..2.479 rows=0 loops=1)
         ->  Seq Scan on main  (cost=0.00..52.61 rows=561 width=48) (actual time=0.011..0.317 rows=561 loops=1)
         ->  Index Scan using raw_spectra_pub_did on raw_spectra  (cost=0.56..8.58 rows=1 width=66) (actual
time=0.003..0.003rows=0 loops=561)
 
               Index Cond: (pub_did = main.obs_publisher_did)
 Planning Time: 5.386 ms
 Execution Time: 2.750 ms

-- exactly as it should.

So, when the SELECT statement on dfbsspec.ssa stands along in the view
definition, Postgres does the right thing; when the exact same query
stands in a UNION ALL with other tables, Postgres doesn't use the
index.  Hu?

Is there anything that would explain that behaviour given I've switched
off the genetic optimiser and postgres has hopefully exhaustively
searched the space of plans in both cases?

Thanks a lot!

         -- Markus




Re: Index not used in certain nested views but not in others

From
Ron Johnson
Date:
On Wed, Apr 30, 2025 at 3:51 PM Markus Demleitner <msdemlei@ari.uni-heidelberg.de> wrote:
Dear List,

I know how tedious mails with a subject of the type "I don't understand
what the planner does" are, but on this one I'm really stumped.
Regrettably, the situation is also a bit complex.  Hopefully, someone
will bear with me.

So, in a PostgreSQL 15.12 I have a view over a single table with ~20
columns (the only relevant columns here are the ones that somehow
contain "pub[lisher]_did", the others are just there for context; I'm
going to call these "pubdids" from here on in the prose):
[snip] 
Is there anything that would explain that behaviour given I've switched
off the genetic optimiser and postgres has hopefully exhaustively
searched the space of plans in both cases?

Are the tables regularly analyzed and vacuumed?  (The default autovacuum analyze threshold of 20% is pretty high.)

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
Markus Demleitner <msdemlei@ari.uni-heidelberg.de> writes:
> So, when the SELECT statement on dfbsspec.ssa stands along in the view
> definition, Postgres does the right thing; when the exact same query
> stands in a UNION ALL with other tables, Postgres doesn't use the
> index.  Hu?

It's hard to be sure when you've shown us no table definitions and
only fragments of the view definitions.  But I suspect what is
happening here is that the view's UNIONs are causing a data type
coercion of raw_spectra.pub_did before it gets to the top level
of the view output.  That might interfere with the planner's ability
to see that the outer query's join operator is compatible with
the table's index.

            regards, tom lane



Re: Index not used in certain nested views but not in others

From
Laurenz Albe
Date:
On Wed, 2025-04-30 at 17:02 -0400, Tom Lane wrote:
> Markus Demleitner <msdemlei@ari.uni-heidelberg.de> writes:
> > So, when the SELECT statement on dfbsspec.ssa stands along in the view
> > definition, Postgres does the right thing; when the exact same query
> > stands in a UNION ALL with other tables, Postgres doesn't use the
> > index.  Hu?
>
> It's hard to be sure when you've shown us no table definitions and
> only fragments of the view definitions.  But I suspect what is
> happening here is that the view's UNIONs are causing a data type
> coercion of raw_spectra.pub_did before it gets to the top level
> of the view output.  That might interfere with the planner's ability
> to see that the outer query's join operator is compatible with
> the table's index.

For a more detailed description of that problem, see
https://www.cybertec-postgresql.com/en/union-all-data-types-performance/

Yours,
Laurenz Albe



Re: Index not used in certain nested views but not in others

From
"Peter J. Holzer"
Date:
On 2025-04-30 17:16:47 +0200, Markus Demleitner wrote:
> There is another view made up of about 20 tables, looking somewhat
> like this:
[...]
> The first view, ivoa.obs_radio, is just a few hundred records,
> dfbsspec.raw_spectra is about 23 Megarows,
[...]
>                                       ->  Parallel Seq Scan on raw_spectra  (cost=0.00..2626995.66 rows=5803266
width=756)(actual time=0.137..6841.379 rows=4642657 loops=5) 
[...]
>
> My problem is: I can't seem to figure out why Postgres chooses to ignore
> the pubdid index on raw_spectra.pub_did and instead does the
> time-consuming seqscan.

It estimates that it has to read 5803266 of those 23000000 rows. That's
25 %! I'm not surprised that it thinks just reading the whole table is
faster than doing almost 6 million index lookups (The actual count is
4642657, so that estimate wasn't totally off).


> Trying to investigate more closely, I wanted to simplify the
> situation and created a view like ivoa.obscore but only having the
> evil table in it:
[...]
>    ->  Nested Loop  (cost=0.56..4871.60 rows=561 width=0) (actual time=2.478..2.479 rows=0 loops=1)
>          ->  Seq Scan on main  (cost=0.00..52.61 rows=561 width=48) (actual time=0.011..0.317 rows=561 loops=1)
>          ->  Index Scan using raw_spectra_pub_did on raw_spectra  (cost=0.56..8.58 rows=1 width=66) (actual
time=0.003..0.003rows=0 loops=561) 
>                Index Cond: (pub_did = main.obs_publisher_did)

Here you select only 561 rows. That's just a tiny fraction of the whole
table, so the optimizer estimates that doing a few hundred index lookups
is faster than reading the whole table.

> So, when the SELECT statement on dfbsspec.ssa stands along in the view
> definition, Postgres does the right thing; when the exact same query
> stands in a UNION ALL with other tables, Postgres doesn't use the
> index.  Hu?

It is obviously not the exact same query if one of them need to read
10000 times as many rows.

        hjp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment
"Peter J. Holzer" <hjp-pgsql@hjp.at> writes:
> On 2025-04-30 17:16:47 +0200, Markus Demleitner wrote:
>> dfbsspec.raw_spectra is about 23 Megarows,
>> ->  Parallel Seq Scan on raw_spectra  (cost=0.00..2626995.66 rows=5803266 width=756) (actual time=0.137..6841.379
rows=4642657loops=5) 

> It estimates that it has to read 5803266 of those 23000000 rows.

No, you're misreading that (I admit it's confusing).  The rows report
is the average per parallel worker, and the loops count indicates we
had 5 workers.  So actually this parallel seqscan emitted 4642657*5
= 23213285 rows, or the whole table, which is what should be expected
given it has no filter condition and no LIMIT.

(I am wondering why the estimate is only 5803266 rows, because I don't
think that number is scaled for the number of workers...)

>> ->  Nested Loop  (cost=0.56..4871.60 rows=561 width=0) (actual time=2.478..2.479 rows=0 loops=1)
>>       ->  Seq Scan on main  (cost=0.00..52.61 rows=561 width=48) (actual time=0.011..0.317 rows=561 loops=1)
>>       ->  Index Scan using raw_spectra_pub_did on raw_spectra  (cost=0.56..8.58 rows=1 width=66) (actual
time=0.003..0.003rows=0 loops=561) 
>>             Index Cond: (pub_did = main.obs_publisher_did)

> Here you select only 561 rows. That's just a tiny fraction of the whole
> table, so the optimizer estimates that doing a few hundred index lookups
> is faster than reading the whole table.

The point here is that the chosen plan shape allows pushing the join
qual "raw_spectra.pub_did = main.obs_publisher_did" down to be an
index condition, which is exactly what we have to do if we want to
avoid reading all of raw_spectra.  What Markus is complaining about
is that that fails to happen if there's a UNION ALL in the way.
Postgres is capable of doing that in other cases, so it's a fair
question.

            regards, tom lane