Thread: Select "todays" timestamps in an index friendly way
Hi, I am looking for a way to select all timestamps that are "today" in an index friendly way. This select should not depend on the concrete value of "today". Given a table create temporary table t ( id SERIAL primary key, ts timestamp not null default now() ); with some data insert into t (ts) select ts from generate_series( '2018-01-01T00:00:01'::timestamp, '2018-12-31T23:59:59'::timestamp, '2 minutes') as ts; and an index create index on t (ts, id); I can of course make an explicit select for `ts` values that are "today": select ts, id from t where ts >= '2018-10-23T00:00:00'::timestamp and ts <= '2018-10-23T23:59:59'::timestamp; This uses an Bitmap Index Scan on `t_ts_id_idx`. Good. But the where conditions depends on concrete values of "today" which will not return the intended result if I execute it tomorrow. I will have to change the where condition. Not good. I am looking for a way to make the where condition independed of the date of execution. I can create a function create function is_today(timestamp) returns boolean as $$ select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD'); $$ language sql; that converts the timestamps to text. But using this function select * from t where is_today(ts); will not benefit from the existing index. A Seq Scan on `t` will be used. Not good. Is there a way to have both: be independed of the concrete value of "today" *and* use the index on the timestamp column? Lutz
Hi Lutz. On Tue, Oct 23, 2018 at 11:38 AM, Lutz Horn <lutz.horn@posteo.de> wrote: > I am looking for a way to select all timestamps that are "today" in an > index friendly way. This select should not depend on the concrete value > of "today". > Given a table > create temporary table t ( > id SERIAL primary key, > ts timestamp not null default now() > ); > > with some data > > insert into t (ts) > select ts > from generate_series( > '2018-01-01T00:00:01'::timestamp, > '2018-12-31T23:59:59'::timestamp, > '2 minutes') > as ts; > > and an index > > create index on t (ts, id); > > I can of course make an explicit select for `ts` values that are > "today": > > select ts, id > from t > where ts >= '2018-10-23T00:00:00'::timestamp > and ts <= '2018-10-23T23:59:59'::timestamp; 1st remark. Do NOT use closed interval for timestamps. Always use half-open or you'll run into problems ( i.e., you are going to miss 2018-10-23T23:59:59.25 in that query ). For real like things ( which timestamps are, they identify a point on the time line ) use half-open ( you can cover a line with non-overlapping half-open segments, not with closed ones ). I.e., your query will better be stated as where ts >= '2018-10-23T00:00:00'::timestamp and ts < '2018-10-24T00:00:00'::timestamp; Which, as a nice bonus, can rely on the time part defaulting to 0: where ts >= '2018-10-23'::timestamp and ts < '2018-10-24'::timestamp; and then be expressed in other ways, like where ts >= '2018-10-23'::timestamp and ts < ('2018-10-23'::timestamp + '1 day'::interval) > This uses an Bitmap Index Scan on `t_ts_id_idx`. Good. > > But the where conditions depends on concrete values of "today" which > will not return the intended result if I execute it tomorrow. I will > have to change the where condition. Not good. > > I am looking for a way to make the where condition independed of the > date of execution. I can create a function > > create function is_today(timestamp) returns boolean as $$ > select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD'); > $$ language sql; This is not a good way to deal with timestamp values, they are just numbers, play with them as such. Try using something like date_trunc('day',now()) = date_trunc('day',$1) which states your purposes more clearly. > that converts the timestamps to text. But using this function > > select * from t where is_today(ts); > > will not benefit from the existing index. A Seq Scan on `t` will be > used. Not good. > > Is there a way to have both: be independed of the concrete value of > "today" *and* use the index on the timestamp column? Well, if your definition of today is 'same value as now() when truncated to days' we can use part of what I've written above, 1st calculate today and tomorrow with same timestamp arithmetic and date_trunc: select now(), date_trunc('day',now()) as today, date_trunc('day',now()+'1 day') as tomorrow; now | today | tomorrow -------------------------------+------------------------+------------------------ 2018-10-23 11:58:01.699407+02 | 2018-10-23 00:00:00+02 | 2018-10-24 00:00:00+02 (1 row) Then plug that result in your query ( using the half-open technique ) described above: where ts >= date_trunc('day',now()) and ts < date_trunc('day',now()+'1 day') as tomorrow; IIRC this should use the index, you can RTFM in case you prefer using current_timestamp and her cousins, but bear in mind if you use something like current_date you should convert it to timestamp, not convert ts to date, to get easy index usage. Francisco Olarte.
Lutz Horn schrieb am 23.10.2018 um 11:38: > I can of course make an explicit select for `ts` values that are > "today": > > select ts, id > from t > where ts >= '2018-10-23T00:00:00'::timestamp > and ts <= '2018-10-23T23:59:59'::timestamp; > > This uses an Bitmap Index Scan on `t_ts_id_idx`. Good. > > But the where conditions depends on concrete values of "today" which > will not return the intended result if I execute it tomorrow. I will > have to change the where condition. Not good. > > I am looking for a way to make the where condition independed of the > date of execution. I can create a function > > create function is_today(timestamp) returns boolean as $$ > select to_char(now(), 'YYYY-MM-DD') = to_char($1, 'YYYY-MM-DD'); > $$ language sql; > > that converts the timestamps to text. But using this function > > select * from t where is_today(ts); > > will not benefit from the existing index. A Seq Scan on `t` will be > used. Not good. > > Is there a way to have both: be independed of the concrete value of > "today" *and* use the index on the timestamp column? I typically use: where ts >= date '2018-10-23' and ts < date '2018-10-23' + 1
Hi Francisco, On Tue, Oct 23, 2018 at 12:05:17PM +0200, Francisco Olarte wrote: > 1st remark. Do NOT use closed interval for timestamps. Always use > half-open or you'll run into problems Good point, thanks. > where ts >= date_trunc('day',now()) > and ts < date_trunc('day',now()+'1 day') as tomorrow; > > IIRC this should use the index And it does! Thanks! Lutz
Hi Thomas, On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote: > I typically use: > > where ts >= date '2018-10-23' > and ts < date '2018-10-23' + 1 But here the date is an explicit value. Francisco reworded my question: > if your definition of today is 'same value as now() when truncated to > days' That's what I am (was, thanks to Francisco) looking for. Lutz
Lutz Horn schrieb am 23.10.2018 um 12:19: > Hi Thomas, > > On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote: >> I typically use: >> >> where ts >= date '2018-10-23' >> and ts < date '2018-10-23' + 1 > > But here the date is an explicit value. Francisco reworded my question: > >> if your definition of today is 'same value as now() when truncated to >> days' > > That's what I am (was, thanks to Francisco) looking for. Then use current_date: where ts >= current_date and ts < current_date + 1
Hi
El mar., 23 de oct. de 2018 a la(s) 05:41, Thomas Kellerer (spam_eater@gmx.net) escribió:
Lutz Horn schrieb am 23.10.2018 um 12:19:
> Hi Thomas,
>
> On Tue, Oct 23, 2018 at 12:11:55PM +0200, Thomas Kellerer wrote:
>> I typically use:
>>
>> where ts >= date '2018-10-23'
>> and ts < date '2018-10-23' + 1
>
> But here the date is an explicit value. Francisco reworded my question:
>
>> if your definition of today is 'same value as now() when truncated to
>> days'
>
> That's what I am (was, thanks to Francisco) looking for.
Then use current_date:
where ts >= current_date
and ts < current_date + 1
this is equally valid?
where ts >= current_date
cordialmente:
Hellmuth Vargas
On Tue, Oct 23, 2018 at 3:05 PM, Hellmuth Vargas <hivs77@gmail.com> wrote: ... >> Then use current_date: >> >> where ts >= current_date >> and ts < current_date + 1 > > this is equally valid? > > where ts >= current_date It'is not as the problem was stated. Although ts defaulted to now(), and it is probably defaulted, nothing prohibits him from inserting timestamps in the future. Also, I'll point the table used in the sample ( bigserial+timestamp) does not seem like a real one and the "timestamps in today" pattern is commonly used in calendaring applications, which usually insert appointments in the future and recover this way to print "todays schedule". Francisco Olarte.
On Tue, Oct 23, 2018 at 03:50:14PM +0200, Francisco Olarte wrote: > It'is not as the problem was stated. Although ts defaulted to now(), > and it is probably defaulted, nothing prohibits him from inserting > timestamps in the future. Yes, this table is only used as an example for the technical question. In my real use case there are columns like "due_date" which usually contain future dates inserted by application code. > the "timestamps in today" pattern is commonly used in calendaring > applications, which usually insert appointments in the future and > recover this way to print "todays schedule". Exactly. The application must be able to execute queries like "give me all my tasks due today" without having to use a concrete value for "today". Lutz
On Tue, Oct 23, 2018 at 3:57 PM, Lutz Horn <lutz.horn@posteo.de> wrote: > On Tue, Oct 23, 2018 at 03:50:14PM +0200, Francisco Olarte wrote: >> It'is not as the problem was stated. Although ts defaulted to now(), >> and it is probably defaulted, nothing prohibits him from inserting >> timestamps in the future. > Yes, this table is only used as an example for the technical question. > In my real use case there are columns like "due_date" which usually > contain future dates inserted by application code. If your real table uses dates instead of timestamps modify the code accordingly, they are not the same ( dates are countable, instants in time are not (they are in the computer, with finite precision, but you see the difference )) Although I supose they really are timestamps, or you would have just used "date_column=current_date". >> the "timestamps in today" pattern is commonly used in calendaring >> applications, which usually insert appointments in the future and >> recover this way to print "todays schedule". > Exactly. The application must be able to execute queries like "give me > all my tasks due today" without having to use a concrete value for > "today". Been there, done that. With an IBM 84 ( instructional use. It was, not surprissingly, easier but slower, ). Happy hacking. Francisco Olarte.
On Tue, Oct 23, 2018 at 5:38 AM Lutz Horn <lutz.horn@posteo.de> wrote: > > Hi, > > I am looking for a way to select all timestamps that are "today" in an > index friendly way. This select should not depend on the concrete value > of "today". > Per TFM, https://www.postgresql.org/docs/10/static/datatype-datetime.html on table 8.13, you can use special input values: SELECT * FROM t WHERE ts >= 'today'::timestamp AND ts < 'tomorrow'::timestamp; HTH, -- Mike Rylander | Executive Director | Equinox Open Library Initiative | phone: 1-877-OPEN-ILS (673-6457) | email: miker@equinoxinitiative.org | web: http://equinoxinitiative.org
On 24 October 2018 at 07:14, Mike Rylander <mrylander@gmail.com> wrote: > > On Tue, Oct 23, 2018 at 5:38 AM Lutz Horn <lutz.horn@posteo.de> wrote: > > I am looking for a way to select all timestamps that are "today" in an > > index friendly way. This select should not depend on the concrete value > > of "today". > > Per TFM, https://www.postgresql.org/docs/10/static/datatype-datetime.html > on table 8.13, you can use special input values: > > SELECT * FROM t WHERE ts >= 'today'::timestamp AND ts < 'tomorrow'::timestamp; Of course, you'd need to be careful never to use that in a view or even a PREPAREd statement. Those abbreviations are evaluated when the query is parsed. In those cases, you'd just get the results for whatever day you did CREATE VIEW or PREPARE. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
> create temporary table t ( > id SERIAL primary key, > ts timestamp not null default now() > ); * add date( ts ) as a field and index date = now()::date. * Depending on the amount of data in your table the date may not be seletive enough to be worth using, at which point the index may be present and ignored. Only way to be sure is analyze it. * Might be worth looking at a partial index using >= 00:00:00 and < 24:00:00 (PG grocks the 2400 notation for "midnight at the end of today) or where ts && tsrange( ... 00:00:00, ... 24:00:00, '[)] ) Nice thing about the partial index is that you can create it on all of the non-ts fields for fast lookup by whatever and only index the portion for today. * Think about using a materialized view rather than a temp table. May prove simpler to query. -- Steven Lembark 3920 10th Ave South Workhorse Computing Birmingham, AL 35222 lembark@wrkhors.com +1 888 359 3508