Thanks again. For some reason I thought that each page had to be fsynced separately...
I am running ZFS and am going to try the following config on a 32GB server:
shared_buffers = 512MB (previously 6GB)
max_wal_size = 8GB
zfs_arc_max = 24GB
i.e. run with minimal shared buffers and do all the caching in ZFS. As I understand it now, such a config can give better results, since most data will be cached only once (in ZFS) rather than in both shared_buffers and the ZFS cache.
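
A rough sanity check of the memory budget for this config, as a sketch only. It assumes binary units (1GB = 1024^3 bytes) and that zfs_arc_max has to be given in bytes, as it is for the OpenZFS module parameter on Linux; the function name memory_budget is made up for illustration.

```python
# Sketch: how much RAM is left after shared_buffers and the ZFS ARC cap.
# Assumption: sizes are binary (1 GB = 1024**3 bytes).
GIB = 1024 ** 3
MIB = 1024 ** 2


def memory_budget(total_ram, shared_buffers, arc_max):
    """Return bytes left for the OS, backends, work_mem, etc."""
    return total_ram - shared_buffers - arc_max


total_ram = 32 * GIB        # server RAM from the post
shared_buffers = 512 * MIB  # shared_buffers = 512MB
arc_max = 24 * GIB          # zfs_arc_max = 24GB

headroom = memory_budget(total_ram, shared_buffers, arc_max)

# zfs_arc_max expressed in bytes (assumed to be the unit ZFS expects)
print(f"zfs_arc_max in bytes: {arc_max}")
print(f"headroom: {headroom / GIB:.1f} GiB")
```

That headroom still has to cover backend processes, work_mem, maintenance_work_mem and the OS itself, so whether 24GB for the ARC is too much depends on the connection count and workload.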