Re: total and partial sums in the same query?? - Mailing list pgsql-sql
From | Iuri Sampaio |
---|---|
Subject | Re: total and partial sums in the same query?? |
Date | |
Msg-id | ACAD364B-3DB0-49B1-835B-67F8C877A746@gmail.com Whole thread Raw |
In response to | Re: total and partial sums in the same query?? ("David G. Johnston" <david.g.johnston@gmail.com>) |
Responses |
Re: total and partial sums in the same query??
|
List | pgsql-sql |
David,
Going further in our conversation. I went to PG's documentation and did a better research in order to come out with better solution/approaches.
Furthermore, based on your previous email, your words were “key” in the process. Thanks a lot!
“ … you should construct a simplified self-contained example and,… “
I reviewed the original query:
SELECT split_part(v.description, ' ', 25) AS type, t.partial, COUNT(1) AS total
FROM qt_vehicle_ti v
RIGHT OUTER JOIN (
SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial
FROM qt_vehicle_ti
WHERE EXTRACT(MONTH FROM creation_date) = 10
GROUP BY type1) AS t
ON t.type1 = split_part(v.description, ' ', 25)
GROUP BY type, partial
and rewrote it to the following one:
WITH
cte1 AS (SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type1),
cte2 AS (SELECT split_part(description, ' ', 25) AS type2, COUNT(1) AS total FROM qt_vehicle_ti GROUP BY type2)
SELECT type1, total, partial FROM cte1 JOIN cte2 ON cte1.type1 = cte2.type2;
Indeed! Performance is way better now. As well as readability, and less code written!
Nevertheless, I’m still reluctant to the necessity of using v_normalized. On the other hand my reluctancy comes from not entirely understanding this approach and the benefits of it.
Your words were:
“
A simple conditional (filter) count would be much easier to understand and should be much faster:
Select type, count(*) as total_count, count(*) filter (where month=10) as m10_count from v_normalized_data group by type;
I added v_normalized because the expressions the decompose your data tend to be better placed in a view and your main queries focus just on their purpose and not structural data manipulation. Especially something expensive like duplicating split_part.
"
So, feel free whether to go deeper continuing this discussion
Best wishes,
I
On Saf. 23, 1442 AH, at 00:58, David G. Johnston <david.g.johnston@gmail.com> wrote:On Friday, October 9, 2020, Iuri Sampaio <iuri.sampaio@gmail.com> wrote:Hi David,RIGHT OUTER JOIN is the key!TOTALSELECT split_part(description, ' ', 25) AS type, COUNT(1) AS total FROM qt_vehicle_ti GROUP BY typeOCTOBERSELECT split_part(description, ' ', 25) AS type, COUNT(1) AS total FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY typeFINALSELECT split_part(v.description, ' ', 25) AS type, t.partial, COUNT(1) AS total FROM qt_vehicle_ti v RIGHT OUTER JOIN ( SELECT split_part(description, ' ', 25) AS type1, COUNT(1) AS partial FROM qt_vehicle_ti WHERE EXTRACT(MONTH FROM creation_date) = 10 GROUP BY type1) AS t ON t.type1 = split_part(v.description, ' ', 25) GROUP BY type, partialLet me know if you would use a different approachThe convention I try to observe when using outer joins is to use left join, not right (outer is implied). That said, you seem to have written a left join query since the totals, a superset of october, are on the left. Also, count(*) is my learned convention instead of count(1).A simple conditional (filter) count would be much easier to understand and should be much faster:Select type, count(*) as total_count, count(*) filter (where month=10) as m10_count from v_normalized_data group by type;I added v_normalized because the expressions the decompose your data tend to be better placed in a view and your main queries focus just on their purpose and not structural data manipulation. Especially something expensive like duplicating split_part.David J.