Re: [GENERAL] ERROR: functions in index expression must be markedIMMUTABLE - Mailing list pgsql-general
From | Sven R. Kunze |
---|---|
Subject | Re: [GENERAL] ERROR: functions in index expression must be markedIMMUTABLE |
Date | |
Msg-id | 44736291-2981-4b4a-fa4c-32325655334b@mail.de Whole thread Raw |
In response to | Re: [GENERAL] ERROR: functions in index expression must be marked IMMUTABLE (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [GENERAL] ERROR: functions in index expression must be marked IMMUTABLE
|
List | pgsql-general |
Hi Geoff, Adrian and Tom,
thanks for your responses so far. Excuse my late response. I will respond to Tom's mail as it covers most points:
On 26.02.2017 17:50, Tom Lane wrote:
thanks for your responses so far. Excuse my late response. I will respond to Tom's mail as it covers most points:
On 26.02.2017 17:50, Tom Lane wrote:
There are multiple reasons why the text-to-datetime conversion functions are not immutable: * some of them depend on the current timezone (but I don't believe date_in does); * all of them depend on the current datestyle setting, eg to resolve '02/03/2017'; * all of them accept strings with time-varying values, such as 'now' or 'today'. You could get around the second and third points with to_timestamp(), but since the only variant of that is one that yields timestamptz and hence is affected by the timezone setting, it's still not immutable.
I understand that timezone settings can have serious consequences when parsing text to datetime.
My conceptual issue is that wrapping an "unsafe" operation up into a function and **marking** it as "safe" is not making things safer. Basically by-passing security guards.
So, what can I do to parse texts to date(times) in a safe manner?
I'd like to do it the right way. I can safely provide the timezone for those dates but it won't be in the jsonb data.
I'm not entirely sure why the OP feels he needs an index on this expression. If he's willing to restrict the column to have the exact format 'YYYY-MM-DD', then a regular textual index would sort the same anyway. Perhaps what's needed is just to add a CHECK constraint verifying that the column has that format.
These were my reasons:
1) sanity checks (already noted)
2) index date ranges (using gist)
3) maybe performance (comparing texts vs comparing dates) but I couldn't think of ways to test this
That's the current schema:
Table "public.docs"
Column | Type | Modifiers
--------+---------+---------------------------------------------------
id | integer | not null default nextval('docs_id_seq'::regclass)
meta | jsonb |
Indexes:
"docs_pkey" PRIMARY KEY, btree (id)
"docs_address_idx" gin (to_tsvector('english'::regconfig, meta ->> 'address'::text))
"docs_address_trgm_idx" gin ((meta ->> 'address'::text) gin_trgm_ops)
"docs_birthdate_idx" btree ((meta ->> 'birthdate'::text))
"docs_meta_idx" gin (meta jsonb_path_ops)
"docs_name_idx" gin (to_tsvector('english'::regconfig, meta ->> 'name'::text))
Thanks to the ISO date format, I got by with a btree index on birthdate as Tom suggested.
The index supports queries like the following (although 22secs still is not great on 10M rows)
explain analyze select meta->>'birthdate' from docs where meta->>'birthdate' > '2000-01-01' and meta->>'birthdate' < '2000-12-31' order by meta->>'birthdate';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Index Scan using docs_birthdate_idx on docs (cost=0.43..46067.43 rows=50000 width=136) (actual time=2.118..22177.710 rows=209955 loops=1)
Index Cond: (((meta ->> 'birthdate'::text) > '2000-01-01'::text) AND ((meta ->> 'birthdate'::text) < '2000-12-31'::text))
Planning time: 0.205 ms
Execution time: 22229.615 ms
Regard,
Sven
pgsql-general by date: