Thanks for running those tests. I had a quick look at the results and I don't think it's quite right to say that all 4 are better. One is actually a tiny bit slower and one is only faster due to a plan change.
Yes, thanks for pointing that out.
Q18 uses a result cache for 2 nested loop joins and has a 0% hit ratio. The execution time is reduced to 91% of the original time only because the planner uses a different plan, which just happens to be faster.
Q20 uses a result cache for the subplan and has a 0% hit ratio. The execution time is 100.27% of the original time. There are 8620 cache misses.
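For reference, a rough sketch of how the two figures quoted for Q20 can be derived. The function names and the timing values below are made up for illustration; only the 8620 misses and the 0% hit ratio (so zero hits) come from the numbers above.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of cache lookups that were served from the cache."""
    lookups = hits + misses
    return hits / lookups if lookups else 0.0


def relative_time(patched_ms: float, baseline_ms: float) -> float:
    """Patched execution time as a percentage of the baseline time."""
    return 100.0 * patched_ms / baseline_ms


# Q20: a 0% hit ratio with 8620 misses implies zero hits.
q20_ratio = hit_ratio(hits=0, misses=8620)
print(f"Q20 hit ratio: {q20_ratio:.0%}")

# Hypothetical timings, chosen only to illustrate a 100.27% result.
q20_relative = relative_time(patched_ms=1002.7, baseline_ms=1000.0)
print(f"Q20 relative time: {q20_relative:.2f}%")
```

With a 0% hit ratio every lookup pays the cost of maintaining the cache without ever getting a cached result back, which fits a run time slightly above 100% of the original.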
It looks like the cause here is either a statistics issue or a cost model issue. I'd
like to look into that further. But before that, I have uploaded the steps[1] I used