Thread: advice on how to store variable attributes
Hi all,i need a little of advice on what could be the best way to store this information. We need to calculate the difference in costs for our operations, we are already storing our vendor invoices in the database so calculate the monetary change it is a no-brainer but we need to store special attributes for any of the invoices that we need to compare too, for example:-electric provider: total Kw.-water provider: total m3.-car maintenance: kilometersof the car.-mobile phones provider: international call minutes, national minutes, number of sms, etc.. And much more types/variables, the number of variables can change, not every day but still can change, i would like that they can be defined/changed from our application, so alter table to add columns don't seem the best way (still an option though). We will have "generic" reports that will show us changes in costs and specific reports for the types with "extended attributes" that we want to compare. To compare values from this "extended attributes" i think we have two ways:1- have them in columns and use standard SQL.2-create the columns with a function that reads this attrs and create the columns. So far i thin we have this options:1- a bunch of columns that would be null except when the type of the invoice uses them.2- a table related with the vendor invoices table for every type of invoice with his specifics columns.3- a key/value in a separate table related with the vendor invoices table where i store the extended attrs of every invoice that needs them.4- use a hstore column in the vendor invoces table to store thisattrs. The first two have the problem of probably changes to the number of attributes of every type and give a more closed solution, apart from that 1- seems to be a bit awkward and 2- would need the application that creates the query to know with what table should join for every type (other point we will need to change if we want to create new invoices types). The last two have his own problems too, with 3 i will need to create a function that return rows as columns to compare them, with 4- given that i will store the attrs of every type in the database anyway i can use the operator -> (with a CASE using operator ? returning 0 if the searched attr it is not in the hstore) but still don't seem a clean solution for me. For me it seems i am missing something, probably any of you have a much more elegant (or correct) way to handle this situation, what would be your advice? Thanks. Regards, Miguel Angel
On Oct 22, 2011, at 6:41, Linos <info@linos.es> wrote: > Hi all, > i need a little of advice on what could be the best way to store this information. > > We need to calculate the difference in costs for our operations, we are already > storing our vendor invoices in the database so calculate the monetary change it > is a no-brainer but we need to store special attributes for any of the invoices > that we need to compare too, for example: > -electric provider: total Kw. > -water provider: total m3. > -car maintenance: kilometers of the car. > -mobile phones provider: international call minutes, national minutes, number > of sms, etc.. > > And much more types/variables, the number of variables can change, not every day > but still can change, i would like that they can be defined/changed from our > application, so alter table to add columns don't seem the best way (still an > option though). We will have "generic" reports that will show us changes in > costs and specific reports for the types with "extended attributes" that we want > to compare. > > To compare values from this "extended attributes" i think we have two ways: > 1- have them in columns and use standard SQL. > 2- create the columns with a function that reads this attrs and create the columns. > > So far i thin we have this options: > 1- a bunch of columns that would be null except when the type of the invoice > uses them. > 2- a table related with the vendor invoices table for every type of invoice > with his specifics columns. > 3- a key/value in a separate table related with the vendor invoices table where > i store the extended attrs of every invoice that needs them. > 4- use a hstore column in the vendor invoces table to store this attrs. > > The first two have the problem of probably changes to the number of attributes > of every type and give a more closed solution, apart from that 1- seems to be a > bit awkward and 2- would need the application that creates the query to know > with what table should join for every type (other point we will need to change > if we want to create new invoices types). > > The last two have his own problems too, with 3 i will need to create a function > that return rows as columns to compare them, with 4- given that i will store the > attrs of every type in the database anyway i can use the operator -> (with a > CASE using operator ? returning 0 if the searched attr it is not in the hstore) > but still don't seem a clean solution for me. > > For me it seems i am missing something, probably any of you have a much more > elegant (or correct) way to handle this situation, what would be your advice? > Thanks. > > Create a table with a single numeric column and multiple category columns. ( amount_value, amount_unit, amount_category, vendor_id ) If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., you cannotsubtract text or Boolean values). You are , in effect, creating multiple tables but combining them into one and using the category column to distinguish betweenthem. David J.
2011/10/22 David Johnston <polobo@yahoo.com>: > On Oct 22, 2011, at 6:41, Linos <info@linos.es> wrote: > >> Hi all, >> i need a little of advice on what could be the best way to store this information. >> >> We need to calculate the difference in costs for our operations, we are already >> storing our vendor invoices in the database so calculate the monetary change it >> is a no-brainer but we need to store special attributes for any of the invoices >> that we need to compare too, for example: >> -electric provider: total Kw. >> -water provider: total m3. >> -car maintenance: kilometers of the car. >> -mobile phones provider: international call minutes, national minutes, number >> of sms, etc.. >> >> And much more types/variables, the number of variables can change, not every day >> but still can change, i would like that they can be defined/changed from our >> application, so alter table to add columns don't seem the best way (still an >> option though). We will have "generic" reports that will show us changes in >> costs and specific reports for the types with "extended attributes" that we want >> to compare. >> >> To compare values from this "extended attributes" i think we have two ways: >> 1- have them in columns and use standard SQL. >> 2- create the columns with a function that reads this attrs and create the columns. >> >> So far i thin we have this options: >> 1- a bunch of columns that would be null except when the type of the invoice >> uses them. >> 2- a table related with the vendor invoices table for every type of invoice >> with his specifics columns. >> 3- a key/value in a separate table related with the vendor invoices table where >> i store the extended attrs of every invoice that needs them. >> 4- use a hstore column in the vendor invoces table to store this attrs. >> >> The first two have the problem of probably changes to the number of attributes >> of every type and give a more closed solution, apart from that 1- seems to be a >> bit awkward and 2- would need the application that creates the query to know >> with what table should join for every type (other point we will need to change >> if we want to create new invoices types). >> >> The last two have his own problems too, with 3 i will need to create a function >> that return rows as columns to compare them, with 4- given that i will store the >> attrs of every type in the database anyway i can use the operator -> (with a >> CASE using operator ? returning 0 if the searched attr it is not in the hstore) >> but still don't seem a clean solution for me. >> >> For me it seems i am missing something, probably any of you have a much more >> elegant (or correct) way to handle this situation, what would be your advice? >> Thanks. >> >> > > Create a table with a single numeric column and multiple category columns. > > ( amount_value, amount_unit, amount_category, vendor_id ) This is EAV model - is good for smaller datasets, for larger datasets is problematic. There is second possibility - using a "hstore" contrib module - that emulates HASH table - It has better for larger datasets. Regards Pavel Stehule > > If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., you cannotsubtract text or Boolean values). > > You are , in effect, creating multiple tables but combining them into one and using the category column to distinguishbetween them. > > David J. > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql >
2011/10/22 David Johnston <polobo@yahoo.com>: > On Oct 22, 2011, at 10:07, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> 2011/10/22 David Johnston <polobo@yahoo.com>: >>> On Oct 22, 2011, at 6:41, Linos <info@linos.es> wrote: >>> >>>> Hi all, >>>> i need a little of advice on what could be the best way to store this information. >>>> >>>> We need to calculate the difference in costs for our operations, we are already >>>> storing our vendor invoices in the database so calculate the monetary change it >>>> is a no-brainer but we need to store special attributes for any of the invoices >>>> that we need to compare too, for example: >>>> -electric provider: total Kw. >>>> -water provider: total m3. >>>> -car maintenance: kilometers of the car. >>>> -mobile phones provider: international call minutes, national minutes, number >>>> of sms, etc.. >>>> >>>> And much more types/variables, the number of variables can change, not every day >>>> but still can change, i would like that they can be defined/changed from our >>>> application, so alter table to add columns don't seem the best way (still an >>>> option though). We will have "generic" reports that will show us changes in >>>> costs and specific reports for the types with "extended attributes" that we want >>>> to compare. >>>> >>>> To compare values from this "extended attributes" i think we have two ways: >>>> 1- have them in columns and use standard SQL. >>>> 2- create the columns with a function that reads this attrs and create the columns. >>>> >>>> So far i thin we have this options: >>>> 1- a bunch of columns that would be null except when the type of the invoice >>>> uses them. >>>> 2- a table related with the vendor invoices table for every type of invoice >>>> with his specifics columns. >>>> 3- a key/value in a separate table related with the vendor invoices table where >>>> i store the extended attrs of every invoice that needs them. >>>> 4- use a hstore column in the vendor invoces table to store this attrs. >>>> >>>> The first two have the problem of probably changes to the number of attributes >>>> of every type and give a more closed solution, apart from that 1- seems to be a >>>> bit awkward and 2- would need the application that creates the query to know >>>> with what table should join for every type (other point we will need to change >>>> if we want to create new invoices types). >>>> >>>> The last two have his own problems too, with 3 i will need to create a function >>>> that return rows as columns to compare them, with 4- given that i will store the >>>> attrs of every type in the database anyway i can use the operator -> (with a >>>> CASE using operator ? returning 0 if the searched attr it is not in the hstore) >>>> but still don't seem a clean solution for me. >>>> >>>> For me it seems i am missing something, probably any of you have a much more >>>> elegant (or correct) way to handle this situation, what would be your advice? >>>> Thanks. >>>> >>>> >>> >>> Create a table with a single numeric column and multiple category columns. >>> >>> ( amount_value, amount_unit, amount_category, vendor_id ) >> >> >> This is EAV model - is good for smaller datasets, for larger datasets >> is problematic. There is second possibility - using a "hstore" contrib >> module - that emulates HASH table - It has better for larger datasets. >> >> Regards >> >> Pavel Stehule > > Store was mentioned by the OP. > > Any suggestions on where the line between small and large is drawn? > > Partitions could help in the larger cases. > > My personal first choice is to use separate tables. If going EAV route plan on eventually moving to the separate tableroute and at least try to make migration relatively easy. > > Since both models capture the same data the decision at least partially rests upon the immediate use-cases for the data. Lacking use-cases and size parameters recommendations are difficult to make. > > I have not used hstore so my ability to recommend it is limited. My main "concern" is the fact that the data is less visibleand always stored as text. I see hstore as being useful for situations where the attributes are variable but thisuse case seems to have fairly well-defined attributes. > ook - it is true Pavel > David J. > > >> >>> >>> If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., youcannot subtract text or Boolean values). >>> >>> You are , in effect, creating multiple tables but combining them into one and using the category column to distinguishbetween them. >>> >>> David J. >>> -- >>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) >>> To make changes to your subscription: >>> http://www.postgresql.org/mailpref/pgsql-sql >>> >> >> -- >> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-sql >
El 22/10/11 14:53, David Johnston escribió: > On Oct 22, 2011, at 6:41, Linos <info@linos.es> wrote: > >> Hi all, >> i need a little of advice on what could be the best way to store this information. >> >> We need to calculate the difference in costs for our operations, we are already >> storing our vendor invoices in the database so calculate the monetary change it >> is a no-brainer but we need to store special attributes for any of the invoices >> that we need to compare too, for example: >> -electric provider: total Kw. >> -water provider: total m3. >> -car maintenance: kilometers of the car. >> -mobile phones provider: international call minutes, national minutes, number >> of sms, etc.. >> >> And much more types/variables, the number of variables can change, not every day >> but still can change, i would like that they can be defined/changed from our >> application, so alter table to add columns don't seem the best way (still an >> option though). We will have "generic" reports that will show us changes in >> costs and specific reports for the types with "extended attributes" that we want >> to compare. >> >> To compare values from this "extended attributes" i think we have two ways: >> 1- have them in columns and use standard SQL. >> 2- create the columns with a function that reads this attrs and create the columns. >> >> So far i thin we have this options: >> 1- a bunch of columns that would be null except when the type of the invoice >> uses them. >> 2- a table related with the vendor invoices table for every type of invoice >> with his specifics columns. >> 3- a key/value in a separate table related with the vendor invoices table where >> i store the extended attrs of every invoice that needs them. >> 4- use a hstore column in the vendor invoces table to store this attrs. >> >> The first two have the problem of probably changes to the number of attributes >> of every type and give a more closed solution, apart from that 1- seems to be a >> bit awkward and 2- would need the application that creates the query to know >> with what table should join for every type (other point we will need to change >> if we want to create new invoices types). >> >> The last two have his own problems too, with 3 i will need to create a function >> that return rows as columns to compare them, with 4- given that i will store the >> attrs of every type in the database anyway i can use the operator -> (with a >> CASE using operator ? returning 0 if the searched attr it is not in the hstore) >> but still don't seem a clean solution for me. >> >> For me it seems i am missing something, probably any of you have a much more >> elegant (or correct) way to handle this situation, what would be your advice? >> Thanks. >> >> > > Create a table with a single numeric column and multiple category columns. > > ( amount_value, amount_unit, amount_category, vendor_id ) > > If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., you cannotsubtract text or Boolean values). > > You are , in effect, creating multiple tables but combining them into one and using the category column to distinguishbetween them. > > David J. Yeah, thanks for the advice David, if i understand you. this is (much better explained) my option 3, i have used this format any times for configuration tables with great success. I am not speaking about much data, maybe 200~300 invoices every month so this should not be a problem in a long long time, the problem i see with this way of doing things it is how should i in the report queries get this values as columns to compare change in time? something like this? SELECT inv.invoice_id, inv.total, (SELECT amount_value FROM vendor_invoices_attrs WHERE invoice_id = inv.invoice_id AND amount_category = 'international call minutes'), (SELECTamount_value FROM vendor_invoices_attrs WHERE invoice_id = inv.invoice_id AND amount_category = 'national call minutes') FROM vendor_invoices AS inv With hstore i think that could be: SELECT inv.invoice_id, inv.total, inv.store -> 'international call minutes', inv.store-> 'national call minutes' FROM vendor_invoices For me the last option seems preferable but maybe i am missing a better way to get the first result or an added problem of hstore (other than datatypes stored that i think i could check for validity in application or database regardless of not been integrated in hstore). Regards, Miguel Angel.
> > Yeah, thanks for the advice David, if i understand you. this is (much better > explained) my option 3, i have used this format any times for configuration > tables with great success. > > I am not speaking about much data, maybe 200~300 invoices every month so this > should not be a problem in a long long time, the problem i see with this way of > doing things it is how should i in the report queries get this values as columns > to compare change in time? something like this? > > SELECT inv.invoice_id, > inv.total, > (SELECT amount_value > FROM vendor_invoices_attrs > WHERE invoice_id = inv.invoice_id > AND amount_category = 'international call minutes'), > (SELECT amount_value > FROM vendor_invoices_attrs > WHERE invoice_id = inv.invoice_id > AND amount_category = 'national call minutes') > FROM vendor_invoices AS inv This should be rewritten little bit more readable SELECT inv.invoice_id, inv.total, a1.amount_value, a2.amount_value FROM vendor_invoices inv, vendor_invoices_attrs a1, vendor_invoices_attrs a2 WHERE a1.id = inv.id AND a2.id = inv.id AND a1.amount_category = 'international call minutes' AND a2. amount_category = 'national call minutes' But it feature of EAV and similar models - you has a less database objects and more complex queries. > > With hstore i think that could be: > SELECT inv.invoice_id, > inv.total, > inv.store -> 'international call minutes', > inv.store -> 'national call minutes' > FROM vendor_invoices > > For me the last option seems preferable but maybe i am missing a better way to > get the first result or an added problem of hstore (other than datatypes stored > that i think i could check for validity in application or database regardless of > not been integrated in hstore). > > Regards, > Miguel Angel. > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql >
I would opt for a hybrid of 1 & 3. I imagine that many of your costs are the same month in, month out. Those I would put in specific columns, and then retrieve the varying columns from an EAV model. If you only run invoicing monthly then I'd be tempted to store this varying EAV structure in a series of month-specific views, e.g. for October 2011: create view invoice_2011_10 as select inv.invoice_id, inv.total, inv.electricity, inv.mileage, ( select via.value from vendor_invoice_attrs via wherevia.invoice_id = inv.invoice_id and via.amount_category = 'international call minutes') as intl_call_minutes, ( select via.value from vendor_invoice_attrs via where via.invoice_id = inv.invoice_id and via.amount_category= 'national call minutes') as dom_call_minutes from vendor_invoices inv; Since you know that at some point you'll be asked about last June. Scott On Sat, Oct 22, 2011 at 8:39 AM, Linos <info@linos.es> wrote: > El 22/10/11 14:53, David Johnston escribió: >> On Oct 22, 2011, at 6:41, Linos <info@linos.es> wrote: >> >>> Hi all, >>> i need a little of advice on what could be the best way to store this information. >>> >>> We need to calculate the difference in costs for our operations, we are already >>> storing our vendor invoices in the database so calculate the monetary change it >>> is a no-brainer but we need to store special attributes for any of the invoices >>> that we need to compare too, for example: >>> -electric provider: total Kw. >>> -water provider: total m3. >>> -car maintenance: kilometers of the car. >>> -mobile phones provider: international call minutes, national minutes, number >>> of sms, etc.. >>> >>> And much more types/variables, the number of variables can change, not every day >>> but still can change, i would like that they can be defined/changed from our >>> application, so alter table to add columns don't seem the best way (still an >>> option though). We will have "generic" reports that will show us changes in >>> costs and specific reports for the types with "extended attributes" that we want >>> to compare. >>> >>> To compare values from this "extended attributes" i think we have two ways: >>> 1- have them in columns and use standard SQL. >>> 2- create the columns with a function that reads this attrs and create the columns. >>> >>> So far i thin we have this options: >>> 1- a bunch of columns that would be null except when the type of the invoice >>> uses them. >>> 2- a table related with the vendor invoices table for every type of invoice >>> with his specifics columns. >>> 3- a key/value in a separate table related with the vendor invoices table where >>> i store the extended attrs of every invoice that needs them. >>> 4- use a hstore column in the vendor invoces table to store this attrs. >>> >>> The first two have the problem of probably changes to the number of attributes >>> of every type and give a more closed solution, apart from that 1- seems to be a >>> bit awkward and 2- would need the application that creates the query to know >>> with what table should join for every type (other point we will need to change >>> if we want to create new invoices types). >>> >>> The last two have his own problems too, with 3 i will need to create a function >>> that return rows as columns to compare them, with 4- given that i will store the >>> attrs of every type in the database anyway i can use the operator -> (with a >>> CASE using operator ? returning 0 if the searched attr it is not in the hstore) >>> but still don't seem a clean solution for me. >>> >>> For me it seems i am missing something, probably any of you have a much more >>> elegant (or correct) way to handle this situation, what would be your advice? >>> Thanks. >>> >>> >> >> Create a table with a single numeric column and multiple category columns. >> >> ( amount_value, amount_unit, amount_category, vendor_id ) >> >> If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., youcannot subtract text or Boolean values). >> >> You are , in effect, creating multiple tables but combining them into one and using the category column to distinguishbetween them. >> >> David J. > > Yeah, thanks for the advice David, if i understand you. this is (much better > explained) my option 3, i have used this format any times for configuration > tables with great success. > > I am not speaking about much data, maybe 200~300 invoices every month so this > should not be a problem in a long long time, the problem i see with this way of > doing things it is how should i in the report queries get this values as columns > to compare change in time? something like this? > > SELECT inv.invoice_id, > inv.total, > (SELECT amount_value > FROM vendor_invoices_attrs > WHERE invoice_id = inv.invoice_id > AND amount_category = 'international call minutes'), > (SELECT amount_value > FROM vendor_invoices_attrs > WHERE invoice_id = inv.invoice_id > AND amount_category = 'national call minutes') > FROM vendor_invoices AS inv > > With hstore i think that could be: > SELECT inv.invoice_id, > inv.total, > inv.store -> 'international call minutes', > inv.store -> 'national call minutes' > FROM vendor_invoices > > For me the last option seems preferable but maybe i am missing a better way to > get the first result or an added problem of hstore (other than datatypes stored > that i think i could check for validity in application or database regardless of > not been integrated in hstore). > > Regards, > Miguel Angel. > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql >
On Oct 22, 2011, at 11:39, Linos <info@linos.es> wrote: > El 22/10/11 14:53, David Johnston escribió: >> On Oct 22, 2011, at 6:41, Linos <info@linos.es> wrote: >> >>> Hi all, >>> i need a little of advice on what could be the best way to store this information. >>> >>> We need to calculate the difference in costs for our operations, we are already >>> storing our vendor invoices in the database so calculate the monetary change it >>> is a no-brainer but we need to store special attributes for any of the invoices >>> that we need to compare too, for example: >>> -electric provider: total Kw. >>> -water provider: total m3. >>> -car maintenance: kilometers of the car. >>> -mobile phones provider: international call minutes, national minutes, number >>> of sms, etc.. >>> >>> And much more types/variables, the number of variables can change, not every day >>> but still can change, i would like that they can be defined/changed from our >>> application, so alter table to add columns don't seem the best way (still an >>> option though). We will have "generic" reports that will show us changes in >>> costs and specific reports for the types with "extended attributes" that we want >>> to compare. >>> >>> To compare values from this "extended attributes" i think we have two ways: >>> 1- have them in columns and use standard SQL. >>> 2- create the columns with a function that reads this attrs and create the columns. >>> >>> So far i thin we have this options: >>> 1- a bunch of columns that would be null except when the type of the invoice >>> uses them. >>> 2- a table related with the vendor invoices table for every type of invoice >>> with his specifics columns. >>> 3- a key/value in a separate table related with the vendor invoices table where >>> i store the extended attrs of every invoice that needs them. >>> 4- use a hstore column in the vendor invoces table to store this attrs. >>> >>> The first two have the problem of probably changes to the number of attributes >>> of every type and give a more closed solution, apart from that 1- seems to be a >>> bit awkward and 2- would need the application that creates the query to know >>> with what table should join for every type (other point we will need to change >>> if we want to create new invoices types). >>> >>> The last two have his own problems too, with 3 i will need to create a function >>> that return rows as columns to compare them, with 4- given that i will store the >>> attrs of every type in the database anyway i can use the operator -> (with a >>> CASE using operator ? returning 0 if the searched attr it is not in the hstore) >>> but still don't seem a clean solution for me. >>> >>> For me it seems i am missing something, probably any of you have a much more >>> elegant (or correct) way to handle this situation, what would be your advice? >>> Thanks. >>> >>> >> >> Create a table with a single numeric column and multiple category columns. >> >> ( amount_value, amount_unit, amount_category, vendor_id ) >> >> If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., youcannot subtract text or Boolean values). >> >> You are , in effect, creating multiple tables but combining them into one and using the category column to distinguishbetween them. >> >> David J. > > Yeah, thanks for the advice David, if i understand you. this is (much better > explained) my option 3, i have used this format any times for configuration > tables with great success. > > I am not speaking about much data, maybe 200~300 invoices every month so this > should not be a problem in a long long time, the problem i see with this way of > doing things it is how should i in the report queries get this values as columns > to compare change in time? something like this? > > SELECT inv.invoice_id, > inv.total, > (SELECT amount_value > FROM vendor_invoices_attrs > WHERE invoice_id = inv.invoice_id > AND amount_category = 'international call minutes'), > (SELECT amount_value > FROM vendor_invoices_attrs > WHERE invoice_id = inv.invoice_id > AND amount_category = 'national call minutes') > FROM vendor_invoices AS inv > > With hstore i think that could be: > SELECT inv.invoice_id, > inv.total, > inv.store -> 'international call minutes', > inv.store -> 'national call minutes' > FROM vendor_invoices > > For me the last option seems preferable but maybe i am missing a better way to > get the first result or an added problem of hstore (other than datatypes stored > that i think i could check for validity in application or database regardless of > not been integrated in hstore). > > Regards, > Miguel Angel. SELECT ... FROM vendor_invoices JOIN ( SELECT ... FROM eav WHERE category = '' ) icm USING ( invoice_id ) Add a sub-query join for each desired category. I agree HStore looks cleaner for the query example you give. One apparent advantage is that since each record contains asingle value you can use additional columns to store properties (e.g. Units) whereas the hstore makes that more difficult- though not impossible - since the keys are bot naturally related to each other. From an output perspective hstore looks good. If there are no difficulties on the data input process you might as well justgo with it. Given your volume and use-case it really isn't going to matter which option you pick; though I'd suggestproviding views related to each major category with only the appropriate values and no access to the hstore column. David J.
El 22/10/11 18:10, David Johnston escribió: > > > SELECT ... > FROM vendor_invoices > JOIN ( SELECT ... FROM eav WHERE category = '' ) icm USING ( invoice_id ) > > Add a sub-query join for each desired category. > > I agree HStore looks cleaner for the query example you give. One apparent advantage is that since each record containsa single value you can use additional columns to store properties (e.g. Units) whereas the hstore makes that moredifficult - though not impossible - since the keys are bot naturally related to each other. > > From an output perspective hstore looks good. If there are no difficulties on the data input process you might as welljust go with it. Given your volume and use-case it really isn't going to matter which option you pick; though I'd suggestproviding views related to each major category with only the appropriate values and no access to the hstore column. > > David J. > > I will try with hstore and EAV both and see what seems more natural for me, i have to take account how easy/clean it is to work with both from the Qt application, i think eav would be much better to edit in a table. Thanks to all for the replies. Regards, Miguel Angel.
On Oct 22, 2011, at 10:07, Pavel Stehule <pavel.stehule@gmail.com> wrote: > 2011/10/22 David Johnston <polobo@yahoo.com>: >> On Oct 22, 2011, at 6:41, Linos <info@linos.es> wrote: >> >>> Hi all, >>> i need a little of advice on what could be the best way to store this information. >>> >>> We need to calculate the difference in costs for our operations, we are already >>> storing our vendor invoices in the database so calculate the monetary change it >>> is a no-brainer but we need to store special attributes for any of the invoices >>> that we need to compare too, for example: >>> -electric provider: total Kw. >>> -water provider: total m3. >>> -car maintenance: kilometers of the car. >>> -mobile phones provider: international call minutes, national minutes, number >>> of sms, etc.. >>> >>> And much more types/variables, the number of variables can change, not every day >>> but still can change, i would like that they can be defined/changed from our >>> application, so alter table to add columns don't seem the best way (still an >>> option though). We will have "generic" reports that will show us changes in >>> costs and specific reports for the types with "extended attributes" that we want >>> to compare. >>> >>> To compare values from this "extended attributes" i think we have two ways: >>> 1- have them in columns and use standard SQL. >>> 2- create the columns with a function that reads this attrs and create the columns. >>> >>> So far i thin we have this options: >>> 1- a bunch of columns that would be null except when the type of the invoice >>> uses them. >>> 2- a table related with the vendor invoices table for every type of invoice >>> with his specifics columns. >>> 3- a key/value in a separate table related with the vendor invoices table where >>> i store the extended attrs of every invoice that needs them. >>> 4- use a hstore column in the vendor invoces table to store this attrs. >>> >>> The first two have the problem of probably changes to the number of attributes >>> of every type and give a more closed solution, apart from that 1- seems to be a >>> bit awkward and 2- would need the application that creates the query to know >>> with what table should join for every type (other point we will need to change >>> if we want to create new invoices types). >>> >>> The last two have his own problems too, with 3 i will need to create a function >>> that return rows as columns to compare them, with 4- given that i will store the >>> attrs of every type in the database anyway i can use the operator -> (with a >>> CASE using operator ? returning 0 if the searched attr it is not in the hstore) >>> but still don't seem a clean solution for me. >>> >>> For me it seems i am missing something, probably any of you have a much more >>> elegant (or correct) way to handle this situation, what would be your advice? >>> Thanks. >>> >>> >> >> Create a table with a single numeric column and multiple category columns. >> >> ( amount_value, amount_unit, amount_category, vendor_id ) > > > This is EAV model - is good for smaller datasets, for larger datasets > is problematic. There is second possibility - using a "hstore" contrib > module - that emulates HASH table - It has better for larger datasets. > > Regards > > Pavel Stehule Store was mentioned by the OP. Any suggestions on where the line between small and large is drawn? Partitions could help in the larger cases. My personal first choice is to use separate tables. If going EAV route plan on eventually moving to the separate table routeand at least try to make migration relatively easy. Since both models capture the same data the decision at least partially rests upon the immediate use-cases for the data. Lacking use-cases and size parameters recommendations are difficult to make. I have not used hstore so my ability to recommend it is limited. My main "concern" is the fact that the data is less visibleand always stored as text. I see hstore as being useful for situations where the attributes are variable but thisuse case seems to have fairly well-defined attributes. David J. > >> >> If necessary each "amount_value" data type should have it's own table since the processing logic will vary (I.e., youcannot subtract text or Boolean values). >> >> You are , in effect, creating multiple tables but combining them into one and using the category column to distinguishbetween them. >> >> David J. >> -- >> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-sql >> > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql