Thread: design for multiple time series

design for multiple time series

From

Seb

Date:

12 December 2013, 18:46:01

Hi,

I'm working on the design of a database for time series data collected
by a variety of meteorological sensors.  Many sensors share the same
sampling scheme, but not all.  I initially thought it would be a good
idea to have a table identifying each parameter (variable) that the
sensors report on:

CREATE TABLE parameters (
    parameter_id serial PRIMARY KEY,
    parameter_name character_varying(200) NOT NULL,
    ...
)

and then store the data in a table referencing it:

CREATE TABLE series (
    record_id serial PRIMARY KEY,
    parameter_id integer REFERENCES parameters,
    reading ????
    ...
)

but of course, the data type for the parameters may vary, so it's
impossible to assign a data type to the "reading" column.  The number of
variables measured by the sensors is quite large and may grow or
decrease over time, and grouping them into subjects (tables) is not
clear, so it's not simple to just assign them to different columns.

I've been trying to search for solutions in various sources, but am
having trouble finding relevant material.  I'd appreciate any advice.

Cheers,

--
Seb

Re: design for multiple time series

From

Jayadevan M

Date:

13 December 2013, 11:05:11

On Fri, Dec 13, 2013 at 12:15 AM, Seb <spluque@gmail.com> wrote:

Hi,

I'm working on the design of a database for time series data collected
by a variety of meteorological sensors. Many sensors share the same
sampling scheme, but not all. I initially thought it would be a good
idea to have a table identifying each parameter (variable) that the
sensors report on:

CREATE TABLE parameters (
parameter_id serial PRIMARY KEY,
parameter_name character_varying(200) NOT NULL,
...
)

and then store the data in a table referencing it:

CREATE TABLE series (
record_id serial PRIMARY KEY,
parameter_id integer REFERENCES parameters,
reading ????
...
)

but of course, the data type for the parameters may vary, so it's
impossible to assign a data type to the "reading" column. The number of
variables measured by the sensors is quite large and may grow or
decrease over time, and grouping them into subjects (tables) is not
clear, so it's not simple to just assign them to different columns.

I've been trying to search for solutions in various sources, but am
having trouble finding relevant material. I'd appreciate any advice.

Cheers,

--
Seb

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

If you are not keen on using PostgreSQL, you could have a look at
http://opentsdb.net/

That was one project we found interesting when we were faced with a similar problem a couple of years ago. In the end, many other factors made us opt for Cassandra. We started with PostgreSQL. But our requirements included, among others, ability to add new devices/parameters quickly. So the persistence layer was mostly a data sink and we planned to move cleansed/aggregated data to PostgreSQL for analysis. Most of the master data was also in PostgreSQL - devicies, parameters, units.

Re: design for multiple time series

From

rob stone

Date:

13 December 2013, 12:47:03

O
n Thu, 2013-12-12 at 12:45 -0600, Seb wrote:I
'm working on the design of a database for time series data collected
--
Et in Arcadia, ego.
Floripa -- city of Land Rovers and alligators swimming in creeks.

> sampling scheme, but not all.  I initially thought it would be a good
> idea to have a table identifying each parameter (variable) that the
> sensors report on:
>
> CREATE TABLE parameters (
>     parameter_id serial PRIMARY KEY,
>     parameter_name character_varying(200) NOT NULL,
>     ...
> )
>
> and then store the data in a table referencing it:
>
> CREATE TABLE series (
>     record_id serial PRIMARY KEY,
>     parameter_id integer REFERENCES parameters,
>     reading ????
>     ...
> )
>
> but of course, the data type for the parameters may vary, so it's
> impossible to assign a data type to the "reading" column.  The number of
> variables measured by the sensors is quite large and may grow or
> decrease over time, and grouping them into subjects (tables) is not
> clear, so it's not simple to just assign them to different columns.
>
> I've been trying to search for solutions in various sources, but am
> having trouble finding relevant material.  I'd appreciate any advice.
>
> Cheers,
>
> --
> Seb
>
>

Hello Seb,
I am not a meteorologist and don't know "how" your sensors function, so
please bear with me. I am assuming each sensor reading consists of an
identifier, a timestamp and a numeric value.
As a first cut:-
1) a table to hold type of sensor and its reading metric. E.g, degrees
celsius, kph, litres/minute, etc. Maybe also hold min and max ranges for
validation purposes. E.g. wind direction would have a range of zero to
359.
2) a table to hold geographic location of sensor with a FK pointing to
its type. You could hold its latitude and longitude. Its "identifier"
matches the identifier returned by a reading.
3) a table to hold the readings with a FK pointing to its geographical
location with the actual reading held in a NUMBER(7,3) column, say?
4) a view over these tables using straightforward cartesian joins.

Use the view for analysis.
E.g., if you were going to build an airfield you know the location so
you can graph your wind rose by obtaining wind direction and velocity
each day for 'n' days.

The only fly in the ointment with this is a rain gauge. If you don't
empty it each day the actual rainfall is the difference between
readings.

HTH.

Cheers,
Robert

Re: design for multiple time series

From

John R Pierce

Date:

13 December 2013, 18:53:31

On 12/13/2013 4:46 AM, rob stone wrote:
> The only fly in the ointment with this is a rain gauge. If you don't
> empty it each day the actual rainfall is the difference between
> readings.

(somewhat off topic)

The electronic rain gauges I've seen have all been tip-bucket. they
measure each 0.01" (or equiv metric unit) at a time, and the rain total
is counter based, the weather station software automatically rolls over
daily, weekly, monthly totals, also tracks 'last 24 hours' which is a
rolling total.

the one I have has a 'teeter totter' rocker, each side of it is a
'bucket', when 0.01" of precip. comes down the funnel into the currently
high side of the teeter, it flips over to the other side, dumping that
0.01" and a magnet on the side of the teeter registers a pulse on a
nearby coil.  now the other side fills, and it tips back

we've had 1.40" total at my house since the rainy season started in
August 1.    we should be getting that much or more weekly this time of
year.    this will be the 3rd year of draught on the central coast ;(((

best design for multiple time series *I've* seen is RRDTOOL, which is an
updated version of the classic MRTG.   this is /not/ SQL.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Re: design for multiple time series

From

salah jubeh

Date:

13 December 2013, 19:18:11

>On 12/13/2013 4:46 AM, rob stone wrote:
>> The only fly in the ointment with this is a rain gauge. If you don't
>> empty it each day the actual rainfall is the difference between
>> readings.

>(somewhat off topic)

>The electronic rain gauges I've seen have all been tip-bucket. they
>measure each 0.01" (or equiv metric unit) at a time, and the rain total
>is counter based, the weather station software automatically rolls over
>daily, weekly, monthly totals, also tracks 'last 24 hours' which is a
>rolling total.

>the one I have has a 'teeter totter' rocker, each side of it is a
>'bucket', when 0.01" of precip. comes down the funnel into the currently
>high side of the teeter, it flips over to the other side, dumping that
>0.01" and a magnet on the side of the teeter registers a pulse on a
>nearby coil. now the other side fills, and it tips back

>we've had 1.40" total at my house since the rainy season started in
>August 1. we should be getting that much or more weekly this time of
>year. this will be the 3rd year of draught on the central coast ;(((

>best design for multiple time series *I've* seen is RRDTOOL, which is an
>updated version of the classic MRTG. this is /not/ SQL.

Hello,

(out of the topic also)

NetCDF , SCIDB

Regards

On Friday, December 13, 2013 7:55 PM, John R Pierce <pierce@hogranch.com> wrote:

On 12/13/2013 4:46 AM, rob stone wrote:
> The only fly in the ointment with this is a rain gauge. If you don't
> empty it each day the actual rainfall is the difference between
> readings.

(somewhat off topic)

The electronic rain gauges I've seen have all been tip-bucket. they
measure each 0.01" (or equiv metric unit) at a time, and the rain total
is counter based, the weather station software automatically rolls over
daily, weekly, monthly totals, also tracks 'last 24 hours' which is a
rolling total.

the one I have has a 'teeter totter' rocker, each side of it is a
'bucket', when 0.01" of precip. comes down the funnel into the currently
high side of the teeter, it flips over to the other side, dumping that
0.01" and a magnet on the side of the teeter registers a pulse on a
nearby coil. now the other side fills, and it tips back

we've had 1.40" total at my house since the rainy season started in
August 1. we should be getting that much or more weekly this time of
year. this will be the 3rd year of draught on the central coast ;(((

best design for multiple time series *I've* seen is RRDTOOL, which is an
updated version of the classic MRTG. this is /not/ SQL.

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general