Thread: Strange behavior of the random() function
Hello, Can anybody explain me that strange behavior? select random() as "rnd", random() as "rnd1", random() as "rnd2" from generate_series( 1, 5 ); --- 0.948556384071708 0.769186236895621 0.381390900816768 0.684488877654076 0.594888080842793 0.214444123208523 0.668408528901637 0.452859566546977 0.924952184315771 All of the values are different. select random() as "rnd", random() as "rnd1", random() as "rnd2" from generate_series( 1, 3 ) order by random(); --- 0.166236794553697 0.166236794553697 0.166236794553697 0.629051928408444 0.629051928408444 0.629051928408444 0.698161978274584 0.698161978274584 0.698161978274584 All values in any row are the same. select random() as "rnd", random() as "rnd1", random() as "rnd2" from generate_series( 1, 3 ) order by random() || 'test'; --- 0.569052047561854 0.531697370111942 0.338135569822043 0.880468992516398 0.172520748339593 0.0296505615115166 0.862012444529682 0.119546371512115 0.21626165881753 All of the values are different again. Is it a bug?
r.zharkov@postgrespro.ru writes: > Can anybody explain me that strange behavior? It's a squishiness in the SQL language, if you ask me. Consider this simplified query: select random() from generate_series(1, 3) order by random(); Would you expect the output of this query to appear ordered, or not? There's an argument to be made that the two instances of random() ought to be executed independently for each row, so that the output would (probably) seem unordered. But practical usage generally expects that we unify the two textually-identical expressions, so that the behavior is the same as select random() from generate_series(1, 3) order by 1; > select random() as "rnd", random() as "rnd1", random() as "rnd2" > from generate_series( 1, 3 ) > order by random(); > All values in any row are the same. Here, we've unified *all* the appearances of the ORDER BY expression in the SELECT-list. Maybe that's wrong, but it's hard to make a principled argument why we shouldn't do it. If only one of them should be unified with the ORDER BY expression, then which one? > select random() as "rnd", random() as "rnd1", random() as "rnd2" > from generate_series( 1, 3 ) > order by random() || 'test'; Here, the ORDER BY expression is not a match to any SELECT-list expression, so they don't get unified with it. If you'd like more control over this sort of thing, I'd recommend using a sub-select, ie select x,y,z from (select random() as x, random() as y, random() as z from generate_series(1,3)) ss order by x; which makes it clear what your intention is. There's some interesting related behaviors: compare the outputs of select random() as x, random() as y, random() as z from generate_series(1,3) order by 1; select random() as x, random() as y, random() as z from generate_series(1,3) order by 1,2; select random() as x, random() as y, random() as z from generate_series(1,3) order by 1,2,3; I could buy the idea that there's a bug involved in that; but again the question is exactly which textually identical expressions should get unified and why. regards, tom lane
> 26 сент. 2018 г., в 6:35, Tom Lane <tgl@sss.pgh.pa.us> написал(а): > > r.zharkov@postgrespro.ru writes: >> Can anybody explain me that strange behavior? > > It's a squishiness in the SQL language, if you ask me. Consider this > simplified query: > > select random() from generate_series(1, 3) order by random(); > > Would you expect the output of this query to appear ordered, or not? > There's an argument to be made that the two instances of random() ought > to be executed independently for each row, so that the output would > (probably) seem unordered. But practical usage generally expects that > we unify the two textually-identical expressions, so that the behavior > is the same as > > select random() from generate_series(1, 3) order by 1; > >> select random() as "rnd", random() as "rnd1", random() as "rnd2" >> from generate_series( 1, 3 ) >> order by random(); >> All values in any row are the same. > > Here, we've unified *all* the appearances of the ORDER BY expression in > the SELECT-list. Maybe that's wrong, but it's hard to make a principled > argument why we shouldn't do it. Because random() is volatile function, but «unified» in such way can be only stable expressions. That the «volatile» functionbehaves like the «stable» function this is obviously is a bug.