Re: [WIP PATCH] Index scan offset optimisation using visibility map - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [WIP PATCH] Index scan offset optimisation using visibility map
Date
Msg-id 2c171aa4-3df4-3c90-c649-be6f9d1d16dc@iki.fi
Whole thread Raw
In response to Re: [WIP PATCH] Index scan offset optimisation using visibility map  (Michail Nikolaev <michail.nikolaev@gmail.com>)
Responses Re: [WIP PATCH] Index scan offset optimisation using visibility map
List pgsql-hackers
On 21/05/18 18:43, Michail Nikolaev wrote:
> Hello everyone.
> This letter related to “Extended support for index-only-scan” from my
> previous message in the thread.
> 
> WIP version of the patch is ready for a while now and I think it is time to
> resume the work on the feature. BTW, I found a small article about Oracle
> vs Postgres focusing this issue -
> https://blog.dbi-services.com/postgres-vs-oracle-access-paths-viii/
> 
> Current WIP version of the patch is located here -
>
https://github.com/postgres/postgres/compare/88ba0ae2aa4aaba8ea0d85c0ff81cc46912d9308...michail-nikolaev:index_only_fetch,
> passing all checks. In addition, patch includes small optimization for
> caching of amcostestimate results.

Please submit an actual path, extracted e.g. with "git format-patch 
-n1", rather than a link to an external site. That is a requirement for 
archival purposes, so that people reading the email archives later on 
can see what was being discussed. (And that link doesn't return a proper 
diff, anyway.)

> For now, I decide to name the plan as “Index Only Fetch Scan”. Therefore:
> * In case of “Index Scan” – we touch the index and heap for EVERY tuple we
> need to test
> * For “Index Only Scan” – we touch the index for every tuple and NEVER
> touch the heap
> * For “Index Only Fetch Scan” – we touch the index for every tuple and
> touch the heap for those tuples we need to RETURN ONLY.

Hmm. IIRC there was some discussion on doing that, when index-only scans 
were implemented. It's not generally OK to evaluate expressions based on 
data that has already been deleted from the table. As an example of the 
kind of problems you might get:

Imagine that a user does "DELETE * FROM table WHERE div = 0; SELECT * 
FROM table WHERE 100 / div < 10". Would you expect the query to throw a 
"division by zero error"? If there was an index on 'div', you might 
evaluate the "100 / div" expression based on values from the index, 
which still includes entries for the just-deleted tuples with div = 0. 
They would be filtered out later, after performing the visibility 
checks, but it's too late if you already threw an error.

Now, maybe there's some way around that, but I don't know what. Without 
some kind of a solution, this won't work.

- Heikki


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: partition pruning doesn't work with IS NULL clause in multikeyrange partition case
Next
From: "David G. Johnston"
Date:
Subject: Re: pgsql: Fix parallel index and index-only scans to fall back to serial.