more simd this morning.
-
i am doing variable-length integers, for which i need to measure how much space they need. the usual tricks do not work for 64-bit integers on avx2, so i've had to roll my own.
it's 16 cycles for 4 integers, vs the stackoverflow suggestion coming out at about 20? if i didn't know CLZ was still 4/4 on zen4 that would look kinda terrible, but we still have to pay the several cycles for a simd load to do it at all, so it might not be a win on avx2. nonetheless it's pleasing to beat the "experts" at their own game.
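for reference, a minimal sketch of the scalar CLZ baseline any simd version has to beat. this is not the avx2 routine described above. it assumes a LEB128-style encoding with 7 payload bits per byte (the post doesn't say which varint format is in use), and the names `varint_len` and `varint_len4` are made up for illustration. `__builtin_clzll` is a gcc/clang builtin.

```c
#include <stdint.h>

/* bytes needed to store x as a 7-bits-per-byte varint
   (assumed LEB128-style encoding; hypothetical helper name) */
static inline unsigned varint_len(uint64_t x)
{
    /* x | 1 keeps clz defined for x == 0 and does not change
       the bit width of any nonzero x */
    unsigned bits = 64u - (unsigned)__builtin_clzll(x | 1);

    /* round up to a whole number of 7-bit groups */
    return (bits + 6u) / 7u;
}

/* total bytes for a batch of four values, i.e. the same amount of
   work as one 256-bit register holding four 64-bit lanes */
static inline unsigned varint_len4(const uint64_t v[4])
{
    return varint_len(v[0]) + varint_len(v[1])
         + varint_len(v[2]) + varint_len(v[3]);
}
```

under that assumption, values up to 127 take 1 byte, 128 takes 2, and anything with the top bit set takes 10.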
-
@dysfun You want to find the highest set bit of each 64-bit value in a packed SIMD register?