Re: [HACKERS] Counting bool flags in a complex query - Mailing list pgsql-sql
From | Michael Richards |
---|---|
Subject | Re: [HACKERS] Counting bool flags in a complex query |
Date | |
Msg-id | Pine.BSF.4.10.9907160447220.38362-100000@scifair.acadiau.ca Whole thread Raw |
Responses |
Re: [HACKERS] Counting bool flags in a complex query
Re: [SQL] Re: [HACKERS] Counting bool flags in a complex query Re: [HACKERS] Counting bool flags in a complex query |
List | pgsql-sql |
On Thu, 15 Jul 1999, Tom Lane wrote: > Michael Richards <miker@scifair.acadiau.ca> writes: > > I'm not sure this is correct, but I think I see a bug of some sort... > > I committed a fix last night; it will be in 6.5.1. I've found what I believe is another set of bugs: This is my monster query again... My folder numbers are: negative numbers are system folders such as New mail, trash, drafts and sentmail. I wanted to order the tuples so that the folderids were sorted from -1 to -4, then 1 to x. This way the system folders would always appear first in the list. This may not be valid SQL, as none of my books mention it. Is it possible to order by an expression? Here are some examples which some some odd behaviour. My suspected bug findings are at the end: SELECT folderid,foldername,count(*) as "messgaes",sum(bool2int(flagnew)) as "newmessages",sum(contentlength) as "size" FROM usermail,folders WHERE usermail.loginid='michael' and folders.loginid=usermail.loginid AND usermail.folder = folders.folderid GROUP BY folderid,foldername UNION SELECT folderid,foldername,0,0,0 FROM folders WHERE loginid='michael' AND NOT EXISTS (SELECT folder FROM usermail WHERE loginid='michael' AND folder=folderid) order by (folderid>0); folderid|foldername |messgaes|newmessages| size --------+----------------+--------+-----------+------- -4|Deleted Messages| 110| 50| 245627 -2|Sent Mail | 7| 2| 10878 -1|New Mail Folder | 73| 1|8831226 1|OOL | 7| 0| 8470 2|suggestions | 26| 0| 35433 3|Acadia | 5| 0| 17703 4|advertising | 4| 2| 5394 5|dealt with | 3| 0| 2883 36|dauphne | 9| 0| 66850 -3|Saved Drafts | 0| 0| 0 (10 rows) It looks like the order by is only being applied to the original select, not the unioned select. Some authority should check on it, but by thought it that a union does not necessarily maintain the order, so the entire select should be applied to the order. I'm not so good at interpreting the query plan, but here it is: Unique (cost=8.10 rows=0 width=0) -> Sort (cost=8.10 rows=0 width=0) -> Append (cost=8.10 rows=0 width=0) -> Aggregate (cost=6.05 rows=1 width=49) -> Group (cost=6.05 rows=1 width=49) -> Sort (cost=6.05 rows=1 width=49) -> Nested Loop (cost=6.05 rows=1 width=49) -> Index Scan using usermail_pkey on usermail (cost=2.05 rows=2 width=21) -> Index Scan using folders_pkey on folders (cost=2.00 rows=8448 width=28) -> Index Scan using folders_pkey on folders (cost=2.05 rows=2 width=16) SubPlan ->Index Scan using usermail_pkey on usermail (cost=2.05 rows=1 width=4) I would have expected the folderid -3 to appear as the 3rd one in this case. I'm probably going to change the numbering scheme of the system folders so they will sort correctly without a kluge such as: create function ordfolderid(int) returns int as 'select $1*-1 where $1<0 union select $1+1*10 where $1>=0' language 'sql'; Then running the order clause as: order by (folderid<0),ordfolderid(folderid) My thought behind this kludge is that the table should first be ordered by the t/f value of the fact folderid<0, then within each of the true and false sortings, subsort those by the value of folderid. Complicated enough for you? Well, in my playing I notice what appears to be more of a bug... SELECT folderid,foldername,count(*) as "messages",sum(bool2int(flagnew)) as "newmessages",sum(contentlength) as "size" FROM usermail,folders WHERE usermail.loginid='michael' and folders.loginid=usermail.loginid AND usermail.folder = folders.folderid GROUP BY folderid,foldername UNION SELECT folderid,foldername,0,0,0 FROM folders WHERE loginid='michael' AND NOT EXISTS (SELECT folder FROM usermail WHERE loginid='michael' AND folder=folderid) order by (folderid<0); folderid|foldername |messgaes|newmessages| size --------+----------------+--------+-----------+------- 1|OOL | 7| 0| 8470 2|suggestions | 26| 0| 35433 3|Acadia | 5| 0| 17703 4|advertising | 4| 2| 5394 5|dealt with | 3| 0| 2883 36|dauphne | 9| 0| 66850 -4|Deleted Messages| 110| 50| 245627 -2|Sent Mail | 7| 2| 10878 -1|New Mail Folder | 73| 1|8831226 -3|Saved Drafts | 0| 0| 0 (10 rows) SELECT folderid,foldername,count(*) as "messages",sum(bool2int(flagnew)) as "newmessages",sum(contentlength) as "size" FROM usermail,folders WHERE usermail.loginid='michael' and folders.loginid=usermail.loginid AND usermail.folder = folders.folderid GROUP BY folderid,foldername UNION SELECT folderid,foldername,0,0,0 FROM folders WHERE loginid='michael' AND NOT EXISTS (SELECT folder FROM usermail WHERE loginid='michael' AND folder=folderid) order by (messages<10); ERROR: attribute 'messages' not found Using a column name within an expression in the order by does not seem to work... Or a much simpler example to illustrate the bug: fastmail=> select 1 as "test" order by (test<9); ERROR: attribute 'test' not found fastmail=> select 1 as "test" order by test; test ---- 1 (1 row) I was almost able to make it work properly aside from the sorting issue with my kludged up routine... This is so nasty that I most definitely don't want to put it into production: SELECT folderid,foldername,count(*) as "messages",sum(bool2int(flagnew)) as "newmessages",sum(contentlength) as "size",(folderid>=0) FROM usermail,folders WHERE usermail.loginid='michael' and folders.loginid=usermail.loginid AND usermail.folder = folders.folderid GROUP BY folderid,foldername UNION SELECT folderid,foldername,0,0,0,(folderid>=0) FROM folders WHERE loginid='michael' AND NOT EXISTS (SELECT folder FROM usermail WHERE loginid='michael' AND folder=folderid) order by 6,ordfolderid(folderid); folderid|foldername |messages|newmessages| size|?column? --------+----------------+--------+-----------+-------+-------- -1|New Mail Folder | 73| 1|8831226|f -2|Sent Mail | 7| 2| 10878|f -4|Deleted Messages| 110| 50| 245627|f -3|Saved Drafts | 0| 0| 0|f 1|OOL | 7| 0| 8470|t 2|suggestions | 26| 0| 35433|t 3|Acadia | 5| 0| 17703|t 4|advertising | 4| 2| 5394|t 5|dealt with | 3| 0| 2883|t 36|dauphne | 9| 0| 66850|t (10 rows) Do I need outer joins to make this work instead of the screwed up union method I'm trying here, or is it just a series of bugs? -Michael