Thread: CTE and function
Hello, I'm trying to convert a select after a CTE into a function for generic use. The CTE is normally a complex query but I want to capsulate then the calculation of the Gini coefficient it into a function: Based on: http://www.heckler.com.br/blog/2010/06/15/gini-coeficient-having-fun-in-both-sql-and-python/ Details at: https://en.wikipedia.org/wiki/Gini_coefficient ================================================================================================================================================================ = OK ================================================================================================================================================================ WITH tab AS ( SELECT unnest(ARRAY[1,2,3,4]) AS col ) SELECT ((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient FROM ( SELECT SUM(PiXi) AS PiXi_sum, COUNT(*) AS N, (SELECT AVG(col) FROM tab) AS u FROM ( SELECT row_number() OVER() * col AS PiXi FROM (SELECT col FROM tab ORDER BY col DESC) t1 ) t2 ) t3 ; ================================================================================================================================================================ = OK: Create function ================================================================================================================================================================ CREATE OR REPLACE FUNCTION gini_coefficient(IN table_name text, IN column_name text, OUT gini_coefficient DOUBLE PRECISION) AS $$ BEGIN EXECUTE format(' SELECT ((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient FROM ( SELECT SUM(PiXi) AS PiXi_sum, COUNT(*) AS N, (SELECT AVG(%s) FROM %s) AS u FROM ( SELECT row_number() OVER() * col AS PiXi FROM (SELECT %s FROM %s ORDER BY %s DESC) t1 ) t2 ) t3 ; ', column_name, table_name, column_name, table_name, column_name) INTO gini_coefficient; END $$ LANGUAGE plpgsql; ================================================================================================================================================================ = NOT OK: ================================================================================================================================================================ WITH tab AS ( SELECT unnest(ARRAY[1,2,3,4]) AS col ) SELECT gini_coefficient('tab', 'col'); ERROR: relation "tab" does not exist LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1 ================================================================================================================================================================ = NOT OK: ================================================================================================================================================================ WITH tab AS ( SELECT unnest(ARRAY[1,2,3,4]) AS col ) SELECT * FROM gini_coefficient('tab', 'col'); ERROR: relation "tab" does not exist LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1 So it looks like the table tab from the CTE is not available in the function. Any ideas how to solve it and an explaination would be fine? Thank you. Ciao, Gerhard -- https://www.wiesinger.com/
This is my understanding, hopefully someone will chime in if I'm off. Using EXECUTE, the SQL is executed in a separate context than the current statement. So it's checking for a table with the name you pass, not aliases within the current statement. Giving the function another parameter to allow an expression lets this run. That said, running user-given SQL is very insecure, and this should never be run in a live database. I changed the formatting to make it easier (for me) to read. Note the two table aliases ("as tab") where the table is passed in. CREATE OR REPLACE FUNCTION gini_coefficient(IN table_name text ,IN table_expression text ,IN column_name text ,OUT gini_coefficient DOUBLE PRECISION ) AS $$ BEGIN EXECUTE format('SELECT ((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient FROM (SELECT SUM(PiXi) AS PiXi_sum , COUNT(*) AS N , (SELECT AVG(%s) FROM %s as tab) AS u FROM (SELECT row_number() OVER() * col AS PiXi FROM (SELECT %s FROM %s as tab ORDER BY %s DESC) t1 ) t2 ) t3;' , column_name , COALESCE (table_name ,'(' || table_expression || ')' ) , column_name , COALESCE (table_name ,'(' || table_expression || ')' ) , column_name ) INTO gini_coefficient; END; $$ LANGUAGE plpgsql; SELECT gini_coefficient(null ,'SELECT unnest(ARRAY[1,2,3,4]) AS col' ,'col' ); A better solution is (using your original definition for gini_coefficient): create view tab as SELECT unnest(ARRAY[1,2,3,4]) AS col; SELECT gini_coefficient('tab','col'); Ben -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Gerhard Wiesinger Sent: Thursday, February 25, 2016 5:32 AM To: pgsql-general@postgresql.org Subject: [GENERAL] CTE and function Hello, I'm trying to convert a select after a CTE into a function for generic use. The CTE is normally a complex query but I want to capsulate then the calculation of the Gini coefficient it into a function: Based on: http://www.heckler.com.br/blog/2010/06/15/gini-coeficient-having-fun-in-both-sql-and-python/ Details at: https://en.wikipedia.org/wiki/Gini_coefficient ================================================================================================================================================================ = OK ================================================================================================================================================================ WITH tab AS ( SELECT unnest(ARRAY[1,2,3,4]) AS col ) SELECT ((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient FROM ( SELECT SUM(PiXi) AS PiXi_sum, COUNT(*) AS N, (SELECT AVG(col) FROM tab) AS u FROM ( SELECT row_number() OVER() * col AS PiXi FROM (SELECT col FROM tab ORDER BY col DESC) t1 ) t2 ) t3 ; ================================================================================================================================================================ = OK: Create function ================================================================================================================================================================ CREATE OR REPLACE FUNCTION gini_coefficient(IN table_name text, IN column_name text, OUT gini_coefficient DOUBLE PRECISION) AS $$ BEGIN EXECUTE format(' SELECT ((N+1.0)/(N-1.0)) - ((PiXi_sum/(N*(N-1)*u))*2.0) AS gini_coefficient FROM ( SELECT SUM(PiXi) AS PiXi_sum, COUNT(*) AS N, (SELECT AVG(%s) FROM %s) AS u FROM ( SELECT row_number() OVER() * col AS PiXi FROM (SELECT %s FROM %s ORDER BY %s DESC) t1 ) t2 ) t3 ; ', column_name, table_name, column_name, table_name, column_name) INTO gini_coefficient; END $$ LANGUAGE plpgsql; ================================================================================================================================================================ = NOT OK: ================================================================================================================================================================ WITH tab AS ( SELECT unnest(ARRAY[1,2,3,4]) AS col ) SELECT gini_coefficient('tab', 'col'); ERROR: relation "tab" does not exist LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1 ================================================================================================================================================================ = NOT OK: ================================================================================================================================================================ WITH tab AS ( SELECT unnest(ARRAY[1,2,3,4]) AS col ) SELECT * FROM gini_coefficient('tab', 'col'); ERROR: relation "tab" does not exist LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1 So it looks like the table tab from the CTE is not available in the function. Any ideas how to solve it and an explaination would be fine? Thank you. Ciao, Gerhard -- https://www.wiesinger.com/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
================================================================================================================================================================
= NOT OK:
================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT gini_coefficient('tab', 'col');
ERROR: relation "tab" does not exist
LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1
================================================================================================================================================================
= NOT OK:
================================================================================================================================================================
WITH tab AS (
SELECT unnest(ARRAY[1,2,3,4]) AS col
)
SELECT * FROM gini_coefficient('tab', 'col');
ERROR: relation "tab" does not exist
LINE 13: (SELECT col FROM tab ORDER BY col DESC) t1
So it looks like the table tab from the CTE is not available in the function.
Any ideas how to solve it and an explaination would be fine?
Not tested but:
CREATE TEMP TABLE tab AS SELECT ... AS col;
SELECT * FROM gini_ciefficient('tab','col');
A function is able to access (session) global objects and whatever data is passed in to it via is parameters.
I don't know if there is any fundamental reason the contents of a CTE cannot be seen by a function executing in the same context but that is not how it works today.
So turn the CTE into its own standalone TABLE and you should be able to then refer to it by name in subsequent queries. It works for actual queries and so functions should be no different.
David J.