For the below code from parallel-murmur3:
for (int i = 0; i < SizeWords; ++i) {
    __m256i k = _mm256_loadu_si256((__m256i*) (keys + i * 8));
Each iteration uses a single 256-bit load to fetch one 32-bit word from each of the 8 keys (the first word of every key when i = 0). Does this mean the Nth words of all 8 keys must already be stored next to each other in memory?
Even a columnar data format like Parquet only guarantees that the values of one column are stored contiguously in a byte array. It does not interleave the bytes or words of separate keys in a way that the code above could load directly.
This seems to be a very strict data layout requirement.
Could you please confirm whether this is the case?
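To make the layout I am asking about concrete, here is a minimal sketch of the rearrangement that I assume a caller would need to do before the loop above. It assumes the keys start out stored one after another (row-major, `size_words` 32-bit words per key) and that the hash expects word `i` of all 8 keys at `keys + i * 8`. The names `interleave_keys` and `kLanes` are my own and are not part of parallel-murmur3; whether the library really requires this layout is exactly what I would like confirmed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of parallel-murmur3.
// Number of keys hashed together: 8 x 32-bit lanes in one 256-bit register.
constexpr std::size_t kLanes = 8;

// Input:  8 keys stored back to back, each `size_words` 32-bit words long,
//         so word w of key k is at row_major[k * size_words + w].
// Output: the same words interleaved, so word w of key k is at
//         interleaved[w * kLanes + k]. With this layout, the 8 words at
//         offset w * 8 are the w-th words of all 8 keys.
std::vector<std::uint32_t> interleave_keys(
        const std::vector<std::uint32_t>& row_major,
        std::size_t size_words) {
    std::vector<std::uint32_t> interleaved(kLanes * size_words);
    for (std::size_t key = 0; key < kLanes; ++key) {
        for (std::size_t w = 0; w < size_words; ++w) {
            interleaved[w * kLanes + key] = row_major[key * size_words + w];
        }
    }
    return interleaved;
}
```

If this is right, any caller holding keys in an ordinary contiguous buffer pays for an extra transpose pass like this one before each batch of 8 keys can be hashed.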