"Devulapalli, Raghuveer" <raghuveer.devulapalli@intel.com> writes:
> Great catch! From the intrinsic manual:
>
> Cast vector of type __m128i to type __m512i; the upper 384 bits of the
> result are undefined.
Just curious: what kind of optimization (such as what -O2 does) could
mask this issue?
> Replacing that with _mm512_zextsi128_si512 fixes the problem.
Congratulations!
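
For the archives, here is a minimal sketch of the zero-extending variant
as I understand it from the intrinsic description quoted above. The
function names are hypothetical and not taken from the patch, and it
assumes GCC or Clang on x86-64 (for the target attribute and
__builtin_cpu_supports). The helper returns -1 when the CPU lacks
AVX-512F, so it can be built without -mavx512f.

```c
#include <immintrin.h>
#include <stdint.h>

/*
 * Hypothetical helper: widen a 128-bit vector with the zero-extending
 * intrinsic and report whether the upper 384 bits are all zero.
 */
__attribute__((target("avx512f")))
static int
upper_bits_zero_after_zext(void)
{
    __m128i  narrow = _mm_set1_epi32(-1);   /* all 128 bits set */
    __m512i  wide = _mm512_zextsi128_si512(narrow);
    uint64_t lanes[8];

    _mm512_storeu_si512(lanes, wide);

    /* lanes[0..1] hold the original 128 bits */
    if (lanes[0] != UINT64_MAX || lanes[1] != UINT64_MAX)
        return 0;

    /* lanes[2..7] are the upper 384 bits; zext defines them as zero */
    for (int i = 2; i < 8; i++)
    {
        if (lanes[i] != 0)
            return 0;
    }
    return 1;
}

/* Returns 1 if the upper bits are zero, 0 if not, -1 if unsupported. */
int
check_zext(void)
{
    if (!__builtin_cpu_supports("avx512f"))
        return -1;
    return upper_bits_zero_after_zext();
}
```

With _mm512_castsi128_si512 in place of the zext call, the same check
would be reading bits the manual describes as undefined, so a passing
result there would not prove anything.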
--
Best Regards
Andy Fan