Thread: sort performance is better with little memory than with big memory
Hello,
I am running a sorting SQL query on my machine. The test data is the TPC-H 100GB dataset (tpch100g), and the query is:

explain analyze verbose select l_shipdate, l_orderkey from lineitem_0 order by l_shipdate, l_orderkey desc;
I found that when I set work_mem to 65MB, the sort method is an external merge using disk, which takes 50s on my server.
When I set work_mem to 6GB, the sort method is an in-memory quicksort, which takes 78s on the same server.
It is strange that more memory brings worse performance. I used perf and found that with work_mem at 6GB, L1-dcache-load-misses is much higher than with 64MB during qsort and tuplesort_gettuple_common.
So, could we try splitting the memory into pieces, quicksorting each piece, and then merging them all in memory? I have tried this in my local code and got about a 12% improvement when memory is sufficient.
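(For illustration only, not the poster's actual patch: a minimal standalone C sketch of the chunk-and-merge idea, sorting plain int64 keys with a made-up CHUNK_ELEMS constant. The real change would live in tuplesort.c and operate on SortTuples with the query's comparator.)

/*
 * Sketch of "sort in cache-sized chunks, then merge in memory".
 * CHUNK_ELEMS is a hypothetical constant standing in for a tunable size.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_ELEMS (256 * 1024)    /* 256K int64 keys = ~2MB per chunk */

static int
cmp_int64(const void *a, const void *b)
{
    int64_t x = *(const int64_t *) a;
    int64_t y = *(const int64_t *) b;

    return (x > y) - (x < y);
}

/*
 * Quicksort each chunk independently (so each qsort call works on a
 * cache-sized slice), then do a simple k-way merge of the sorted chunks
 * into 'out'.
 */
static void
chunked_sort(int64_t *in, int64_t *out, size_t n)
{
    size_t  nchunks = (n + CHUNK_ELEMS - 1) / CHUNK_ELEMS;
    size_t *pos = calloc(nchunks, sizeof(size_t));  /* read offset per chunk */
    size_t *end = malloc(nchunks * sizeof(size_t)); /* end offset per chunk */

    /* Phase 1: sort each chunk in place. */
    for (size_t c = 0; c < nchunks; c++)
    {
        size_t start = c * CHUNK_ELEMS;
        size_t len = (start + CHUNK_ELEMS <= n) ? CHUNK_ELEMS : n - start;

        qsort(in + start, len, sizeof(int64_t), cmp_int64);
        pos[c] = start;
        end[c] = start + len;
    }

    /*
     * Phase 2: naive k-way merge (linear scan for the smallest head).
     * A real implementation would use a binary heap, as tuplesort's
     * merge step does.
     */
    for (size_t i = 0; i < n; i++)
    {
        size_t best = nchunks;      /* sentinel: no chunk chosen yet */

        for (size_t c = 0; c < nchunks; c++)
        {
            if (pos[c] < end[c] &&
                (best == nchunks || in[pos[c]] < in[pos[best]]))
                best = c;
        }
        out[i] = in[pos[best]++];
    }

    free(pos);
    free(end);
}

int
main(void)
{
    size_t  n = 4 * 1024 * 1024;
    int64_t *in = malloc(n * sizeof(int64_t));
    int64_t *out = malloc(n * sizeof(int64_t));

    srand(42);
    for (size_t i = 0; i < n; i++)
        in[i] = (int64_t) rand() * rand();

    chunked_sort(in, out, n);

    /* Sanity check: output must be non-decreasing. */
    for (size_t i = 1; i < n; i++)
        if (out[i - 1] > out[i])
        {
            fprintf(stderr, "not sorted at %zu\n", i);
            return 1;
        }
    printf("sorted %zu keys in %zu chunks\n",
           n, (n + CHUNK_ELEMS - 1) / CHUNK_ELEMS);

    free(in);
    free(out);
    return 0;
}

One way to check whether the cache effect shows up on a given machine is to build this with gcc -O2 -std=c99 and time chunked_sort() against a single qsort() over the whole array, while watching L1-dcache-load-misses with perf.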
On Sun, 4 Dec 2022 at 00:14, yang zhao <azurezhao8369@gmail.com> wrote:
> So, could we try splitting the memory into pieces, quicksorting each piece, and then merging them all in memory? I have tried this in my local code and got about a 12% improvement when memory is sufficient.

We're not very good at CPU cache awareness. The hard part here would be to figure out how large to make those pieces. Maybe the value could be a GUC which can be tuned similarly to work_mem. One problem with that might be that L3 is shared with other processes, and it does not seem unreasonable that smaller-than-L3 sizes will be better when the system is busy. However, it also seems possible that sorting in L3-sized chunks might still be faster even when the system is busy with many other tasks.

The best thing to do here is to post your proposal, along with a patch and the performance tests you've done, to the pgsql-hackers mailing list. That way people can test this for themselves and see if they get similar results. Patches such as this one need to be tested on lots of different hardware, and pgsql-hackers is the place for discussions about that. It's full of people who might be willing to apply your patch and check whether the performance also improves on their hardware.

David
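(Also for illustration only: one hypothetical way a work_mem-style knob could bound the chunk size discussed above. The names sort_chunk_kb and choose_chunk_elems are invented for this sketch; no such GUC exists today, and a real setting would have to go through the usual guc.c machinery.)

#include <stddef.h>
#include <stdio.h>

/* Imagined knob, in kilobytes; 0 means "use a built-in guess". */
static int sort_chunk_kb = 0;

/*
 * Pick how many sort elements go into one chunk: never more than work_mem,
 * and never below a small floor so that tiny sorts stay a single qsort.
 */
static size_t
choose_chunk_elems(size_t elem_size, size_t work_mem_kb)
{
    /* Guess: a few MB, i.e. one process's slice of a shared L3 cache. */
    size_t chunk_kb = (sort_chunk_kb > 0) ? (size_t) sort_chunk_kb : 4096;
    size_t elems;

    if (chunk_kb > work_mem_kb)
        chunk_kb = work_mem_kb;
    elems = (chunk_kb * 1024) / elem_size;
    return (elems < 4096) ? 4096 : elems;
}

int
main(void)
{
    /* Example: 24-byte elements (roughly SortTuple-sized), work_mem = 64MB. */
    printf("chunk elems = %zu\n", choose_chunk_elems(24, 64 * 1024));
    return 0;
}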