Thread: Index on a Decrypt / Bytea2Text Function
Hi all, We tend to do a lot of lookups on our database that look something like: select e.id from employee e ,app_user au where au.id=user_id and au.corporation_id=$1 and e.ssn is not null and e.ssn!=' ' and e.ssn!='' and e.deleted='N'and bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), 'bf'))=$2 The analyze here looks like: > explain analyze select e.id from employee e ,app_user au where au.id=user_id and au.corporation_id=41197 and e.ssn is not null and e.ssn!=' ' and e.ssn!='' and e.deleted='N'and bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), 'bf'))='188622250'; QUERY PLAN -------------------------------------------------------------------------- Nested Loop (cost=0.00..19282.05 rows=122 width=8) (actual time=24.591..192.435 rows=1 loops=1) -> Index Scan using emp_del on employee e (cost=0.00..18625.99 rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1) Index Cond: (deleted = 'N'::bpchar) Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <> ''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text), text2bytea((id)::text), 'bf'::text)) = '188622250'::text)) -> Index Scan using app_user_pkey on app_user au (cost=0.00..5.36 rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1) Index Cond: (au.id = e.user_id) Filter: (au.corporation_id = 41197) Total runtime: 192.565 ms (8 rows) It would appear that almost 100% of this time is taken up by doing the bytea2text and decrypt() functions. How would I create an index based on the results of the decrypt and bytea2text function to improve this select statement? Thanks! -- Anthony
In response to Anthony Presley <anthony@resolution.com>: > Hi all, > > We tend to do a lot of lookups on our database that look something like: > > select > e.id > from > employee e ,app_user au > where > au.id=user_id and > au.corporation_id=$1 and > e.ssn is not null and > e.ssn!=' ' and > e.ssn!='' and > e.deleted='N'and > bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), > 'bf'))=$2 > > The analyze here looks like: > > > explain analyze select e.id from employee e ,app_user au where > au.id=user_id and au.corporation_id=41197 and e.ssn is not null and > e.ssn!=' ' and e.ssn!='' and e.deleted='N'and > bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), > 'bf'))='188622250'; > > QUERY > PLAN > -------------------------------------------------------------------------- > Nested Loop (cost=0.00..19282.05 rows=122 width=8) (actual > time=24.591..192.435 rows=1 loops=1) > -> Index Scan using emp_del on employee e (cost=0.00..18625.99 > rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1) > Index Cond: (deleted = 'N'::bpchar) > Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <> > ''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text), > text2bytea((id)::text), 'bf'::text)) = '188622250'::text)) > -> Index Scan using app_user_pkey on app_user au (cost=0.00..5.36 > rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1) > Index Cond: (au.id = e.user_id) > Filter: (au.corporation_id = 41197) > Total runtime: 192.565 ms > (8 rows) > > It would appear that almost 100% of this time is taken up by doing the > bytea2text and decrypt() functions. > > How would I create an index based on the results of the decrypt and > bytea2text function to improve this select statement? The best way would be to unencrypt the column and use a normal index. Since you're simply using a value in another column as the key anyway, your design has created all the performance headaches of encryption with no actual security. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/
On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote: > Hi all, > > We tend to do a lot of lookups on our database that look something like: > > select > e.id > from > employee e ,app_user au > where > au.id=user_id and > au.corporation_id=$1 and > e.ssn is not null and > e.ssn!=' ' and > e.ssn!='' and > e.deleted='N'and > bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), > 'bf'))=$2 > > The analyze here looks like: > >> explain analyze select e.id from employee e ,app_user au where > au.id=user_id and au.corporation_id=41197 and e.ssn is not null and > e.ssn!=' ' and e.ssn!='' and e.deleted='N'and > bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), > 'bf'))='188622250'; > > QUERY > PLAN > -------------------------------------------------------------------------- > Nested Loop (cost=0.00..19282.05 rows=122 width=8) (actual > time=24.591..192.435 rows=1 loops=1) > -> Index Scan using emp_del on employee e (cost=0.00..18625.99 > rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1) > Index Cond: (deleted = 'N'::bpchar) > Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <> > ''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text), > text2bytea((id)::text), 'bf'::text)) = '188622250'::text)) > -> Index Scan using app_user_pkey on app_user au (cost=0.00..5.36 > rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1) > Index Cond: (au.id = e.user_id) > Filter: (au.corporation_id = 41197) > Total runtime: 192.565 ms > (8 rows) > > It would appear that almost 100% of this time is taken up by doing the > bytea2text and decrypt() functions. > > How would I create an index based on the results of the decrypt and > bytea2text function to improve this select statement? > > Thanks! > > > -- > Anthony > > Would the following work?: CREATE INDEX idx_employee_functional ON employee (bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS text)),'bf')) Thom
On 14 July 2010 20:32, Bill Moran <wmoran@potentialtech.com> wrote: > In response to Anthony Presley <anthony@resolution.com>: > >> Hi all, >> >> We tend to do a lot of lookups on our database that look something like: >> >> select >> e.id >> from >> employee e ,app_user au >> where >> au.id=user_id and >> au.corporation_id=$1 and >> e.ssn is not null and >> e.ssn!=' ' and >> e.ssn!='' and >> e.deleted='N'and >> bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), >> 'bf'))=$2 >> >> The analyze here looks like: >> >> > explain analyze select e.id from employee e ,app_user au where >> au.id=user_id and au.corporation_id=41197 and e.ssn is not null and >> e.ssn!=' ' and e.ssn!='' and e.deleted='N'and >> bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), >> 'bf'))='188622250'; >> >> QUERY >> PLAN >> -------------------------------------------------------------------------- >> Nested Loop (cost=0.00..19282.05 rows=122 width=8) (actual >> time=24.591..192.435 rows=1 loops=1) >> -> Index Scan using emp_del on employee e (cost=0.00..18625.99 >> rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1) >> Index Cond: (deleted = 'N'::bpchar) >> Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <> >> ''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text), >> text2bytea((id)::text), 'bf'::text)) = '188622250'::text)) >> -> Index Scan using app_user_pkey on app_user au (cost=0.00..5.36 >> rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1) >> Index Cond: (au.id = e.user_id) >> Filter: (au.corporation_id = 41197) >> Total runtime: 192.565 ms >> (8 rows) >> >> It would appear that almost 100% of this time is taken up by doing the >> bytea2text and decrypt() functions. >> >> How would I create an index based on the results of the decrypt and >> bytea2text function to improve this select statement? > > The best way would be to unencrypt the column and use a normal index. > > Since you're simply using a value in another column as the key anyway, > your design has created all the performance headaches of encryption > with no actual security. > > -- > Bill Moran > http://www.potentialtech.com > http://people.collaborativefusion.com/~wmoran/ > > -- Yes, I immediately thought about what's actually happening as soon as I sent the last message. Forget the functional index. Thom
Thom Brown <thombrown@gmail.com> writes: > On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote: >> select >> e.id >> from >> employee e ,app_user au >> where >> au.id=user_id and >> au.corporation_id=$1 and >> e.ssn is not null and >> e.ssn!=' ' and >> e.ssn!='' and >> e.deleted='N'and >> bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), >> 'bf'))=$2 >> >> How would I create an index based on the results of the decrypt and >> bytea2text function to improve this select statement? > Would the following work?: > CREATE INDEX idx_employee_functional ON employee > (bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS > text)),'bf')) That would work as far as speeding up the query goes. However, as Bill Moran points out nearby, the query reveals a totally incompetent security design. There is no value to speak of in encrypting a data value and then storing the decryption key right beside it. Perhaps the excuse is to not have the SSN in cleartext on disk, nevermind whether a halfway competent attacker could get it back --- but even with that barely-useful goal, you do *not* want an index like this, because all the index entries will be cleartext SSNs. What you really need is to take two steps back and figure out why you want to encrypt this data and what threats you intend to protect against. It's probably possible to make a credibly-secure design that runs faster than this does, but there's no point at all in improving the performance of a fundamentally broken design. regards, tom lane
On Wed, 2010-07-14 at 15:56 -0400, Tom Lane wrote: > Thom Brown <thombrown@gmail.com> writes: > > On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote: > >> select > >> e.id > >> from > >> employee e ,app_user au > >> where > >> au.id=user_id and > >> au.corporation_id=$1 and > >> e.ssn is not null and > >> e.ssn!=' ' and > >> e.ssn!='' and > >> e.deleted='N'and > >> bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), > >> 'bf'))=$2 > >> > >> How would I create an index based on the results of the decrypt and > >> bytea2text function to improve this select statement? > > > Would the following work?: > > > CREATE INDEX idx_employee_functional ON employee > > (bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS > > text)),'bf')) > > That would work as far as speeding up the query goes. However, as Bill > Moran points out nearby, the query reveals a totally incompetent > security design. There is no value to speak of in encrypting a data > value and then storing the decryption key right beside it. Perhaps the > excuse is to not have the SSN in cleartext on disk, nevermind whether a > halfway competent attacker could get it back --- but even with that > barely-useful goal, you do *not* want an index like this, because all > the index entries will be cleartext SSNs. > > What you really need is to take two steps back and figure out why you > want to encrypt this data and what threats you intend to protect > against. It's probably possible to make a credibly-secure design that > runs faster than this does, but there's no point at all in improving > the performance of a fundamentally broken design. > > regards, tom lane > Yes, you are right ... the security here serves no purpose other than to not have SSN's stored on disk in an un-encrypted way. Unfortunately, we need to be able to easily, and quickly, reverse the security, so that we can get access to unencrypted data ... because our application does export to payroll providers, and many of them still use SSN's as keys. IE, storing the SSN in an encrypted manner (or using a one-way salt), won't work. IE, the reality is that our application has to be able to show / hide the SSN, so someone breaking into the application (it is likely easier to steal your manager's password than it would be to hack into the server), would be able to access the data. Even if we wanted to tackle *real* security here, I'm not sure how we'd go about it. Encrypting any data on a web app would require that the encryption key and/or salt be stored in some combination of the database, or app code, which is all vulnerable if someone breaks into and/or steals the server. There isn't a "client" piece, like you'd have with Carbonite, etc... -- Anthony
On Wed, 2010-07-14 at 20:32 +0100, Thom Brown wrote: > On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote: > > Hi all, > > > > We tend to do a lot of lookups on our database that look something like: > > > > select > > e.id > > from > > employee e ,app_user au > > where > > au.id=user_id and > > au.corporation_id=$1 and > > e.ssn is not null and > > e.ssn!=' ' and > > e.ssn!='' and > > e.deleted='N'and > > bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), > > 'bf'))=$2 > > > > The analyze here looks like: > > > >> explain analyze select e.id from employee e ,app_user au where > > au.id=user_id and au.corporation_id=41197 and e.ssn is not null and > > e.ssn!=' ' and e.ssn!='' and e.deleted='N'and > > bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), > > 'bf'))='188622250'; > > > > QUERY > > PLAN > > -------------------------------------------------------------------------- > > Nested Loop (cost=0.00..19282.05 rows=122 width=8) (actual > > time=24.591..192.435 rows=1 loops=1) > > -> Index Scan using emp_del on employee e (cost=0.00..18625.99 > > rows=122 width=16) (actual time=24.556..192.398 rows=1 loops=1) > > Index Cond: (deleted = 'N'::bpchar) > > Filter: ((ssn IS NOT NULL) AND (ssn <> ' '::text) AND (ssn <> > > ''::text) AND (bytea2text(decrypt(decode(ssn, 'hex'::text), > > text2bytea((id)::text), 'bf'::text)) = '188622250'::text)) > > -> Index Scan using app_user_pkey on app_user au (cost=0.00..5.36 > > rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1) > > Index Cond: (au.id = e.user_id) > > Filter: (au.corporation_id = 41197) > > Total runtime: 192.565 ms > > (8 rows) > > > > It would appear that almost 100% of this time is taken up by doing the > > bytea2text and decrypt() functions. > > > > How would I create an index based on the results of the decrypt and > > bytea2text function to improve this select statement? > > > > Thanks! > > > > > > -- > > Anthony > > > > > > Would the following work?: > > CREATE INDEX idx_employee_functional ON employee > (bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS > text)),'bf')) > > Thom Unfortunately, that doesn't work: # CREATE INDEX idx_employee_functional ON employee (bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS text)),'bf'))); ERROR: functions in index expression must be marked IMMUTABLE Guess we'll need to come up with something else. -- Anthony
In response to Anthony Presley <anthony@resolution.com>: > On Wed, 2010-07-14 at 15:56 -0400, Tom Lane wrote: > > Thom Brown <thombrown@gmail.com> writes: > > > On 14 July 2010 20:23, Anthony Presley <anthony@resolution.com> wrote: > > >> select > > >> e.id > > >> from > > >> employee e ,app_user au > > >> where > > >> au.id=user_id and > > >> au.corporation_id=$1 and > > >> e.ssn is not null and > > >> e.ssn!=' ' and > > >> e.ssn!='' and > > >> e.deleted='N'and > > >> bytea2text(DECRYPT(decode(e.ssn,'hex'), text2bytea(cast(e.id as text)), > > >> 'bf'))=$2 > > >> > > >> How would I create an index based on the results of the decrypt and > > >> bytea2text function to improve this select statement? > > > > > Would the following work?: > > > > > CREATE INDEX idx_employee_functional ON employee > > > (bytea2text(DECRYPT(DECODE(ssn,'hex'), text2bytea(CAST(id AS > > > text)),'bf')) > > > > That would work as far as speeding up the query goes. However, as Bill > > Moran points out nearby, the query reveals a totally incompetent > > security design. There is no value to speak of in encrypting a data > > value and then storing the decryption key right beside it. Perhaps the > > excuse is to not have the SSN in cleartext on disk, nevermind whether a > > halfway competent attacker could get it back --- but even with that > > barely-useful goal, you do *not* want an index like this, because all > > the index entries will be cleartext SSNs. > > > > What you really need is to take two steps back and figure out why you > > want to encrypt this data and what threats you intend to protect > > against. It's probably possible to make a credibly-secure design that > > runs faster than this does, but there's no point at all in improving > > the performance of a fundamentally broken design. > > > > regards, tom lane > > > > Yes, you are right ... the security here serves no purpose other than to > not have SSN's stored on disk in an un-encrypted way. Unfortunately, we > need to be able to easily, and quickly, reverse the security, so that we > can get access to unencrypted data ... because our application does > export to payroll providers, and many of them still use SSN's as keys. > IE, storing the SSN in an encrypted manner (or using a one-way salt), > won't work. > > IE, the reality is that our application has to be able to show / hide > the SSN, so someone breaking into the application (it is likely easier > to steal your manager's password than it would be to hack into the > server), would be able to access the data. > > Even if we wanted to tackle *real* security here, I'm not sure how we'd > go about it. Encrypting any data on a web app would require that the > encryption key and/or salt be stored in some combination of the > database, or app code, which is all vulnerable if someone breaks into > and/or steals the server. There isn't a "client" piece, like you'd have > with Carbonite, etc... You need to do more research on this. I understand that bosses are arbitrarily requiring "encrypt the SSN" without understanding what they're asking for, but doing it half-assed like this is irresponsible to the point of being criminal. As Tom says, first identify what attack vectors you're protecting from. If you just want to protect the data if the server if physically stolen, disk encryption of the partition where PG has it's data files is probably your best bet, and on most OSen is pretty easy to set up. The only headache there is someone has to manually enter the disk passphrase any time the system is rebooted. If you want to protect from other attack vectors, such as SQL injections, it's a little trickier, but still doable. The complexity depends on the rules of your access model. In the simplest case, you generate a shared secret that every SSN is encrypted with. You don't keep the shared secret anywhere in the DB. But when you create users that need access to the SSNs, you create a copy of the shared secret that's encrypted with their password. Now, when they log in, they can decrypt the shared secret and get access to the SSNs. It even makes your application look better, because when they go to access the SSNs, a screen pops up saying "please enter your password again to verify that your session hasn't been hijacked" ... which you can tout as a benefit to the users. If your access rules are more complex (which it doesn't seem that they are, but I only know so much) the implementation gets more complex, with hierarchies of shared secrets and interesting grant/revoke routines -- maybe even PKI -- but at it's core, it's the same setup as what I just described with additional layers. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/