Re: incorrect results and different plan with 2 very similar queries - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: incorrect results and different plan with 2 very similar queries
Date
Msg-id 750b4884-7276-49d5-b577-364f1f3061a2@enterprisedb.com
Whole thread Raw
In response to Re: incorrect results and different plan with 2 very similar queries  (Dave Cramer <davecramer@gmail.com>)
List pgsql-hackers

On 3/27/24 23:10, Dave Cramer wrote:
> Dave Cramer
> 
> 
> On Wed, 27 Mar 2024 at 17:57, David Rowley <dgrowleyml@gmail.com> wrote:
> 
>> On Thu, 28 Mar 2024 at 10:33, Dave Cramer <davecramer@gmail.com> wrote:
>>> There is a report on the pgjdbc github JDBC Driver shows erratic
>> behavior when filtering on CURRENT_DATE · pgjdbc/pgjdbc · Discussion #3184 (
>> github.com)
>>>
>>> Here are the plans.
>>>
>>> JDBC - Nested Loop (incorrect result)
>>>
>>>                     Index Cond: (mutation >= ((CURRENT_DATE -
>> '1971-12-31'::date) - 28))
>>
>>> JDBC - Hash Right (correct result)
>>>
>>>                     Recheck Cond: (mutation >= ((CURRENT_DATE -
>> '1971-12-31'::date) - 29))
>>
>> I don't see any version details or queries, but going by the
>> conditions above, the queries don't appear to be the same, so
>> different results aren't too surprising and not a demonstration that
>> there's any sort of bug.
>>
> 
> Sorry, you are correct. Version is 12.14. Here is the query
> 
> SELECT
>   p.partseqno_i
> , p.partno
> , p.partmatch
> , pfe.average_price
> , pfe.sales_price
> , pfe.purch_price
> , pfe.average_price_2
> , pfe.avg_repair_cost
> , pfe.average_price_func
> , pfe.fsv
> , pfe.fsv_func
> , p.status
> 
> FROM part p
> LEFT JOIN part_fa_entity pfe ON (p.partno = pfe.partno)
> WHERE 1=1
> AND (p.mutation >= (CURRENT_DATE - '1971-12-31'::date)-27) ORDER BY p.partno
> 

I guess the confusing bit is that the report does not claim that those
queries are expected to produce the same result, but that the parameter
value affects which plan gets selected, and one of those plans produces
incorrect result.

I think the simplest explanation might be that one of the indexes on
part_fa_entity is corrupted and fails to lookup some rows by partno.
That would explain why the plan with seqscan works fine, but nestloop
with index scan is missing these rows.

They might try a couple things:

1) set enable_nestloop=off, see if the results get correct

2) try bt_index_check on i_39773, might notice some corruption

3) rebuild the index

If it's not this, they'll need to build a reproducer. It's really
difficult to deduce what's going on just from query plans for different
parameter values.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: "Amonson, Paul D"
Date:
Subject: RE: Popcount optimization using AVX512
Next
From: Alvaro Herrera
Date:
Subject: Re: Popcount optimization using AVX512