Re: rand48 replacement - Mailing list pgsql-hackers
From | Yura Sokolov |
---|---|
Subject | Re: rand48 replacement |
Date | |
Msg-id | 3136e2672d5ab395521bae05a80df469@postgrespro.ru Whole thread Raw |
In response to | Re: rand48 replacement (Fabien COELHO <coelho@cri.ensmp.fr>) |
Responses |
Re: rand48 replacement
|
List | pgsql-hackers |
Fabien COELHO писал 2021-07-04 23:29: >> The important property of determinism is that if I set a seed, and >> then make an identical set of calls to the random API, the results >> will be identical every time, so that it's possible to write tests >> with predictable/repeatable results. > > Hmmm… I like my stronger determinism definition more than this one:-) > >>> I would work around that by deriving another 128 bit generator >>> instead of splitmix 64 bit, but that is overkill. >> >> Not really relevant now, but I'm pretty sure that's impossible to do. >> You might try it as an interesting academic exercise, but I believe >> it's a logical impossibility. > > Hmmm… I was simply thinking of seeding a new pg_prng_state from the > main pg_prng_state with some transformation, and then iterate over the > second PRNG, pretty much like I did with splitmix, but with 128 bits > so that your #states argument does not apply, i.e. something like: > > /* select in a range with bitmask rejection */ > uint64 pg_prng_u64_range(pg_prng_state *state, uint64 range) > { > /* always advance state once */ > uint64 next = xoroshiro128ss(state); > uint64 val; > > if (range >= 2) > { > uint64 mask = mask_u64(range-1); > > val = next & mask; > > if (val >= range) > { > /* copy and update current prng state */ > pg_prng_state iterator = *state; > > iterator.s0 ^= next; > iterator.s1 += UINT64CONST(0x9E3779B97f4A7C15); > > /* iterate till val in [0, range) */ > while ((val = xoroshiro128ss(&iterator) & mask) >= range) > ; > } > } > else > val = 0; > > return val; > } > > The initial pseudo-random sequence is left to proceed, and a new PRNG > is basically forked for iterating on the mask, if needed. I believe most "range" values are small, much smaller than UINT32_MAX. In this case, according to [1] fastest method is Lemire's one (I'd take original version from [2]) Therefore combined method pg_prng_u64_range could branch on range value uint64 pg_prng_u64_range(pg_prng_state *state, uint64 range) { uint64 val = xoroshiro128ss(state); uint64 m; if ((range & (range-1) == 0) /* handle all power 2 cases */ return range != 0 ? val & (range-1) : 0; if (likely(range < PG_UINT32_MAX/32) { /* * Daniel Lamire's unbiased range random algorithm based on rejection sampling * https://lemire.me/blog/2016/06/30/fast-random-shuffling/ */ m = (uint32)val * range; if ((uint32)m < range) { uint32 t = -range % range; while ((uint32)m < t) m = (uint32)xoroshiro128ss(state) * range; } return m >> 32; } /* Apple's mask method */ m = mask_u64(range-1); val &= m; while (val >= range) val = xoroshiro128ss(state) & m; return val; } Mask method could also be faster when range is close to mask. For example, fast check for "range is within 1/1024 to mask" is range < (range/512 + (range&(range*2))) And then method choose could like: if (likely(range < UINT32_MAX/32 && range > (range/512 + (range&(range*2))))) But I don't know does additional condition worth difference or not. [1] https://www.pcg-random.org/posts/bounded-rands.html [2] https://lemire.me/blog/2016/06/30/fast-random-shuffling/ regards, Sokolov Yura
pgsql-hackers by date: