Getting back to my original point - you pointed out that for queries that need a decent % of the table it will be cheaper to do a scan, which is exactly what the query planner does for the relational version. If it only needs a small % of the values it uses the index, and for a large % it goes for a scan (it also puts everything in shared buffers and is lightning quick!). Is this just a lack of maturity in the jsonb planner, or am I missing something?
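For concreteness, this is the behaviour I mean on the relational side (a minimal sketch; the table `events` and its column are made up for illustration):

    -- Hypothetical table with a plain btree index on a scalar column.
    CREATE TABLE events (id serial PRIMARY KEY, status int);
    CREATE INDEX events_status_idx ON events (status);
    ANALYZE events;
    -- A predicate matching few rows should use the index...
    EXPLAIN SELECT * FROM events WHERE status = 12345;
    -- ...while one matching most of the table should flip to a seq scan.
    EXPLAIN SELECT * FROM events WHERE status > 0;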
Hi Anton,
Good selectivity estimators exist only for scalar data types. For complex data types such as json/jsonb, building a reasonable selectivity estimator is a very complicated task, so the database can only guess in these cases. In your case the database guessed the number of returned rows with roughly a three-orders-of-magnitude error (estimated 3716 rows, actual 1417152 rows).
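If you want to see the gap on your own data, compare the planner's "rows=" estimate with the "actual rows" figure that EXPLAIN ANALYZE prints (a sketch; the table `docs` and the query value are hypothetical):

    -- Hypothetical jsonb table with a GIN index supporting @> containment.
    CREATE TABLE docs (id serial PRIMARY KEY, body jsonb);
    CREATE INDEX docs_body_idx ON docs USING gin (body);
    ANALYZE docs;
    -- For @> on jsonb the planner has no per-value statistics, so the
    -- "rows=" estimate is a generic guess; "actual rows" shows the truth.
    EXPLAIN ANALYZE SELECT * FROM docs WHERE body @> '{"status": "active"}';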
Personally, I don't expect serious progress on json/jsonb selectivity estimators in the near future, so it's better to avoid low-selectivity queries (ones that match a large fraction of the table) against indexed json/jsonb fields.
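One mitigation, if the hot keys are known in advance: expose them to the planner with an expression index. PostgreSQL collects statistics on indexed expressions during ANALYZE, so estimates for that key become statistics-based rather than guesses (a sketch reusing the hypothetical `docs` table above):

    -- Expression index on a frequently queried key; ANALYZE gathers real
    -- statistics for the expression, so the planner can sensibly choose
    -- between an index scan and a seq scan for this predicate.
    CREATE INDEX docs_status_idx ON docs ((body->>'status'));
    ANALYZE docs;
    EXPLAIN ANALYZE SELECT * FROM docs WHERE body->>'status' = 'active';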
"People problems are solved with people. If people cannot solve the problem, try technology. People will then wish they'd listened at the first stage."