Home > mailing lists

Re: total and partial sums in the same query?? - Mailing list pgsql-sql

From	Iuri Sampaio
Subject	Re: total and partial sums in the same query??
Date	October 10, 2020 15:28:20
Msg-id	ACAD364B-3DB0-49B1-835B-67F8C877A746@gmail.com Whole thread
In response to	Re: total and partial sums in the same query?? ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses	Re: total and partial sums in the same query??
List	pgsql-sql

Tree view

David,

Going further in our conversation. I went to PG's documentation and did a better research in order to come out with better solution/approaches.

Furthermore, based on your previous email, your words were “key” in the process. Thanks a lot!

“ … you should construct a simplified self-contained example and,… “

I reviewed the original query:

SELECT split_part(v.description, ' ', 25) AS type, t.partial, COUNT(1) AS total

FROM qt_vehicle_ti v

RIGHT OUTER JOIN (

SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial

FROM qt_vehicle_ti

WHERE EXTRACT(MONTH FROM creation_date) = 10

GROUP BY type1) AS t

ON t.type1 = split_part(v.description, ' ', 25)

GROUP BY type, partial

and rewrote it to the following one:

WITH 

 cte1 AS (SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type1), 

 cte2 AS (SELECT split_part(description, ' ', 25) AS type2, COUNT(1) AS total FROM qt_vehicle_ti GROUP BY type2) 

SELECT type1, total, partial FROM cte1 JOIN cte2 ON cte1.type1 = cte2.type2;

Indeed! Performance is way better now. As well as readability, and less code written!

Nevertheless, I’m still reluctant to the necessity of using v_normalized. On the other hand my reluctancy comes from not entirely understanding this approach and the benefits of it.

Your words were:  

“ 
A simple conditional (filter) count would be much easier to understand and should be much faster:

Select type, count(*) as total_count, count(*) filter (where month=10) as m10_count from v_normalized_data group by type;

https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES

I added v_normalized because the expressions the decompose your data tend to be better placed in a view and your main queries focus just on their purpose and not structural data manipulation.  Especially something expensive like duplicating split_part.
"

So, feel free whether to go deeper continuing this discussion 

Best wishes,

I

On Saf. 23, 1442 AH, at 00:58, David G. Johnston <david.g.johnston@gmail.com> wrote:

On Friday, October 9, 2020, Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
Hi David,

RIGHT OUTER JOIN is the key!

TOTAL
SELECT split_part(description, ' ', 25) AS type, COUNT(1) AS total FROM qt_vehicle_ti GROUP BY type

OCTOBER
SELECT split_part(description, ' ', 25) AS type, COUNT(1) AS total FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type

FINAL
SELECT split_part(v.description, ' ', 25) AS type, t.partial, COUNT(1) AS total FROM qt_vehicle_ti v RIGHT OUTER JOIN ( SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type1) AS t ON t.type1 = split_part(v.description, ' ', 25) GROUP BY type, partial

Let me know if you would use a different approach

The convention I try to observe when using outer joins is to use left join, not right (outer is implied). That said, you seem to have written a left join query since the totals, a superset of october, are on the left. Also, count(*) is my learned convention instead of count(1).

A simple conditional (filter) count would be much easier to understand and should be much faster:

Select type, count(*) as total_count, count(*) filter (where month=10) as m10_count from v_normalized_data group by type;

https://www.postgresql.org/docs/13/sql-expressions.html#SYNTAX-AGGREGATES

I added v_normalized because the expressions the decompose your data tend to be better placed in a view and your main queries focus just on their purpose and not structural data manipulation. Especially something expensive like duplicating split_part.

David J.

pgsql-sql by date:

From: "David G. Johnston"
Date: 10 October 2020, 03:58:19
Subject: Re: total and partial sums in the same query??

From: "David G. Johnston"
Date: 10 October 2020, 17:55:20
Subject: Re: total and partial sums in the same query??

Re: total and partial sums in the same query?? - Mailing list pgsql-sql

Previous

Next