Re: cast bytea to double precision[] - Mailing list pgsql-novice
From | Mathieu Dubois |
---|---|
Subject | Re: cast bytea to double precision[] |
Date | |
Msg-id | 4E2EF890.9090901@limsi.fr |
In response to | Re: cast bytea to double precision[] (Merlin Moncure <mmoncure@gmail.com>) |
Responses | Re: cast bytea to double precision[] |
List | pgsql-novice |
On 07/26/2011 04:30 PM, Merlin Moncure wrote:
> On Tue, Jul 26, 2011 at 2:45 AM, Mathieu Dubois <mathieu.dubois@limsi.fr> wrote:
>> Hello,
>>
>> On 25/07/2011 17:58, Mathieu Dubois wrote:
>>> On 07/25/2011 05:54 PM, Merlin Moncure wrote:
>>>> On Sun, Jul 24, 2011 at 2:03 PM, Mathieu Dubois <mathieu.dubois@limsi.fr> wrote:
>>>>> I have found a solution by myself for the conversion:
>>>>> regexp_split_to_array(sig_vector, E',')::double precision[]
>>>>> (the elements are in fact separated by commas).
>>>>>
>>>>> To convert my column I used:
>>>>> ALTER TABLE my_table ALTER sig_vector TYPE double precision[]
>>>>> USING regexp_split_to_array(sig_vector, E',')::double precision[];
>>>>>
>>>>> Is that correct?
>>>>> Is it correct to pass the column name to regexp_split_to_array()?
>>>> Yeah -- you are just passing a column's data into a function as an
>>>> argument -- standard practice. This will work -- your bytea is really
>>>> a text column, so it's just a matter of breaking up the string.
>>>> regexp_* functions are great for that.
>>> Thank you very much for your reply.
>>>
>>> I will launch the conversion right now.
>>>
>> The main reason to do this was to have smaller backups.
>> The size of a compressed backup was around 1 GB with bytea.
>> I have converted the columns (on a copy of the database), but the
>> expected gains are not there!
>> With double precision[] it is still around 1 GB (a little smaller, but
>> only by a few MB).
>>
>> The size on disk is not smaller either.
>> I listed the contents of /var/lib/postgres/8.4/main/base with du, and
>> the two versions have the same size (3.1 GB).
>>
>> Does that make sense?
>> My hypothesis is that the compression algorithm finds regularities in
>> the data, so it finds the same regularities in bytea as in double
>> precision[].
>>
>> Is there any advantage to using double precision[] over bytea in my case?
> probably not -- arrays can be significantly smaller than a set of
> individual tuples each holding one value because of the tuple
> overhead, but you still have to pay for the array header and a 4 byte
> length/null indicator per element.
>
> A packed string is often the smallest way to store data, although not
> necessarily the best. A double precision[] comes with a lot of syntax
> advantages.

Thanks for your advice!

I find the result surprising, because each float is encoded with a lot of
characters (something like 20) while a double is 8 bytes.

I tried to run VACUUM, but it changed nothing...

All of my code is based on strings, so I won't take the time to modify it
if there is no gain.

Mathieu

> merlin
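A quick way to sanity-check the cast discussed in the thread is to run it on a literal first. This is a minimal sketch; the comma-separated value here is made up, not data from the thread:

```sql
-- Split a comma-separated string and cast the result to double precision[].
-- The literal is an illustrative example, not real data.
SELECT regexp_split_to_array('1.5,2.25,3.0', E',')::double precision[];
-- Returns: {1.5,2.25,3}
```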
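Merlin's point about storage overhead can be measured directly with pg_column_size(). A minimal sketch, using 100 synthetic doubles generated with random() rather than the thread's actual data (the column aliases text_bytes and array_bytes are just illustrative names):

```sql
-- Compare the size of the same 100 doubles stored as a comma-separated
-- string versus as a double precision[]. pg_column_size() reports the
-- number of bytes used to store each value.
SELECT
    pg_column_size(array_to_string(array_agg(r), ',')) AS text_bytes,
    pg_column_size(array_agg(r))                       AS array_bytes
FROM (SELECT random() AS r FROM generate_series(1, 100)) AS t;
```

On uncompressed values the array side costs 8 bytes per element plus a small fixed header, while the text side pays one byte per printed character plus a separator. Compression in pg_dump tends to close much of that gap, which is consistent with the nearly identical backup sizes reported above.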
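On the disk-size question: a plain VACUUM only marks dead space as reusable and generally does not shrink the underlying files, so du alone can be misleading. The table's footprint can also be checked from inside the database; a sketch, reusing the hypothetical my_table name from the thread:

```sql
-- Report the total on-disk size of the converted table,
-- including its TOAST data and indexes.
SELECT pg_size_pretty(pg_total_relation_size('my_table'));
```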