smhasher test results for funny hash show failures #1

demerphq · 2018-01-11T17:53:37Z

https://github.com/demerphq/smhasher/blob/master/doc/FunnyHash64-2.128.128.64.txt

FunnyHash64-2 128 128 64 FAILED 17 of 187 tests. Avalanche: 17-24, OverAll: 187, Sanity: 4-5, TwoBytes: 59, 61, 63, Zeroes: 166-167, 170

Just a heads up.

funny-falcon · 2018-01-11T19:02:22Z

Thank you for test and reminding.

I've looked inattentive before, and i've seen only avalanche one short keys. But seems like there are more. I'll investigate.

BTW, is non-zero seed used for failed tests? FunnyHash is known to be bad with all-zero seed.

demerphq · 2018-01-11T20:14:58Z

On 11 January 2018 at 20:02, Sokolov Yura ***@***.***> wrote: Thank you for test and reminding. I've looked inattentive before, and i've seen only avalanche one short keys. But seems like there are more. I'll investigate. BTW, is non-zero seed used for failed tests? FunnyHash is known to be bad with all-zero seed.

Yes, it includes some zero-seed tests. But you are failing other tests too. From a testing standpoint the hash is expected to deal with zero seeds itself. You could for instance do something like: seed ^= len; seed += !seed; to ensure it is nonzero at the start of your hash. Yves

funny-falcon · 2018-01-11T20:38:29Z

FunnyHash is intended for usage with hidden random-like seed. Zero seed is certainly neither hidden nor random-like.

May I send pull request with change of seeding for funny_hash? Like:

*(uint64_t *) out = fh64_string_hash2(key, len, s64[0] ^ 0x12345678, s64[1] ^ 0xabcdef12 );

I'll investigate issues that wouldn't be fixed with this.

demerphq · 2018-01-11T20:44:34Z

On 11 January 2018 at 21:14, demerphq ***@***.***> wrote: On 11 January 2018 at 20:02, Sokolov Yura ***@***.***> wrote: > > Thank you for test and reminding. > > I've looked inattentive before, and i've seen only avalanche one short > keys. But seems like there are more. I'll investigate. > > BTW, is non-zero seed used for failed tests? FunnyHash is known to be bad > with all-zero seed. Yes, it includes some zero-seed tests. But you are failing other tests too. From a testing standpoint the hash is expected to deal with zero seeds itself. You could for instance do something like: seed ^= len; seed += !seed; to ensure it is nonzero at the start of your hash.

Looking at the result more closely, there are three main sets of results: * zero seed vulnerabilities in various tests. * avalanche tests show weakness in the high bits of the seed for short keys. Consider your hash is purely multiplicative, so the higher bits of the seed have less effect on the result. Consider that (1<<63) * odd_integer == (1<<63) under modulo 2**64, for all odd integers. (I hope you get what I mean by that - changing the high bit of the state has relatively less effect on the hash compared to lower bits) * twobytes test, this looks for collisions/distribution from strings with various multiples of two bytes long strings, for all two byte non-zero values. I imagine you have collisions there because of the way you read the last K bytes twice, although I am not sure of the details. Yves

demerphq · 2018-01-11T21:04:30Z

On 11 January 2018 at 21:38, Sokolov Yura ***@***.***> wrote: FunnyHash is intended for usage with hidden random-like seed. Zero seed is certainly neither hidden nor random-like.

No, that is wrong. when choosing a number from 0 to 2**64-1, 0 is no less random than 0x0123456789ABCDEF. Put another way, pretty much any proper RNG produces 0 exactly as often as it produces every other integer. May I send pull request with change of seeding for funny_hash? Like:

*(uint64_t *) out = fh64_string_hash2(key, len, s64[0] ^ 0x12345678, s64[1] ^ 0xabcdef12 );

How would that help? That changes your magic "bad seed" from {0,0} to {0x12345678, 0xabcdef12}. You might pass some tests, but it would leave the core weakness in the hash. Anyway, you can send me whatever pull requests you like, I don't mind really, its your hash. :-) I think there is an alternative. Your hash is used similarly to Perl's, and my fork of smhasher was designed to test perls hash functions, which means it supports a distinct interface for initializing the state before doing any hashing. The function takes the seed and then returns a frozen "state" which is passed through to the hashing functiion. You can define a state initialization function and then do whatever seed guards you need there. In fact, i think right now there is already one of these functions, and you just need to educate it as you wish. I would say you library should define such state initialization code itself, so users do not have to be aware of any issues like this. This would also give you an opportunity to pre-mix your seeds, possible together, so the state is better mixed when you start hashing.

I'll investigate issues that wouldn't be fixed with this.

Sure. Really its your call. I have a personal interest in this subject, and I consider all of these tests sane minimum requirements for any hash. If you don't agree that is fine with me. :-) FWIW, I think your hash is kinda interesting. It is really two hashes run in parallel, each of which has an easy multicollision vulnerability which is ameliorated by the fact that the input vector that breaks one does not break the other, and vice versa. It is a bit scary actually. I kinda wonder what a cryptographer would make of it. cheers, Yves

funny-falcon · 2018-01-11T21:11:24Z

I don't care for zero seed at all. I'll put pull request for seed xoring as in comment above.

No, that is wrong. when choosing a number from 0 to 2**64-1, 0 is no less random than 0x0123456789ABCDEF.

Yes. But seed is 128bit. And all-zero seed has probability is 2^-128 . It is same probability there is another UUIDv4 matching my UUIDv4.
So, I don't care.

I don't care much for avalanche for short keys. I know how to fix it: just add another one multiplication round. It will slow function for short keys. (That is why it 8 byte keys are good: there are one more multplication round compared to 7 byte).
Especially I don't care much for this avalanche dependency on seed value. Even your test results shows there is always at least 70 bits of seed value that produces good avalanche, and that is more than enough for FunnyHash purposes (ie hash for internal hash table without exposed seed or hashsum value).

But I care for twobytes. I'll certainly will try to understand whats happening.

btw, can you help me to compile benchmark? I do:

$ git clone https://github.com/demerphq/smhasher.git
$ cd smhasher
$ mkdir build
$ cd build
$ cp ../../funny_hash/funny_hash.h .
$ cmake ..
$ # VERSION.h not found. What should I install?
$ make
....
/home/yura/Project/smhasher-demerphq/TAP.cpp:5:21: fatal error: VERSION.h: No such file or directory
 #include "VERSION.h"
                     ^
compilation terminated.

funny-falcon · 2018-01-11T21:14:04Z

btw: did you try https://github.com/funny-falcon/fanom_hash ? I'll check it as well and will send PR for it.

demerphq · 2018-01-11T21:18:03Z

On 11 January 2018 at 22:11, Sokolov Yura ***@***.***> wrote: I don't care for zero seed at all. I'll put pull request for seed xoring as in comment above. No, that is wrong. when choosing a number from 0 to 2**64-1, 0 is no less random than 0x0123456789ABCDEF. Yes. But seed is 128bit. And all-zero seed has probability is 2^-128 . It is same probability there is another UUIDv4 matching my UUIDv4. So, I don't care.

Shrug. That is your call.

I don't care much for avalanche for short keys. I know how to fix it: just add another one multiplication round. It will slow function for short keys. (That is why it 8 byte keys are good: there are one more multplication round compared to 7 byte). Especially I don't care much for this avalanche dependency on seed value. Even your test results shows there is always at least 70 bits of seed value that produces good avalanche, and that is more than enough for FunnyHash purposes (ie hash for internal hash table without exposed seed or hashsum value).

That is your call again. I suspect that it means you leak seed state, so attacking the hash is a lot easier than you think, but I cant prove it right now. Also, dont you have a 64 bit version? But I care for twobytes. I'll certainly will try to understand whats

happening.

As you wish.

btw, can you help me to compile benchmark? I do: $ git clone https://github.com/demerphq/smhasher.git $ cd smhasher $ mkdir build $ cd build $ cp ../../funny_hash/funny_hash.h . $ cmake .. $ # VERSION.h not found. What should I install? $ make .... /home/yura/Project/smhasher-demerphq/TAP.cpp:5:21: fatal error: VERSION.h: No such file or directory #include "VERSION.h" ^ compilation terminated.

Run ./make_smhasher, the file will be created - sorry about that, I will add that to the docs. cheers, Yves

…

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq · 2018-01-11T21:20:53Z

On 11 January 2018 at 22:14, Sokolov Yura ***@***.***> wrote: btw: did you try https://github.com/funny-falcon/fanom_hash ? I'll check it as well and will send PR for it.

Sure. Sounds good. cheers, Yves

demerphq · 2018-01-11T21:30:59Z

On 11 January 2018 at 22:18, demerphq ***@***.***> wrote: On 11 January 2018 at 22:11, Sokolov Yura ***@***.***> wrote: > > I don't care for zero seed at all. I'll put pull request for seed xoring > as in comment above. > > No, that is wrong. when choosing a number from 0 to 2**64-1, 0 is no less > random than 0x0123456789ABCDEF. > > Yes. But seed is 128bit. And all-zero seed has probability is 2^-128 . It > is same probability there is another UUIDv4 matching my UUIDv4. > So, I don't care. Shrug. That is your call.

Having said that, I don't get it. How hard is it to define a function that ensures your state has reasonable mixed version of its input state? I mean if it gets executed exactly once per seed, and you normally use the same seed per process why would you not? It just seems irresponsible, not matter how unlikely it is. Maybe I should give some context on my view of this. Perl used to use a hash function with the same problem, and with some builds it would end up seeded as zero. Thus there was a security issue and etc. Even if it is unlikely under totally correct usage, it is still a design flaw. Yves

…

-- perl -Mre=debug -e "/just|another|perl|hacker/"

funny-falcon · 2018-01-11T21:31:42Z

That is your call again. I suspect that it means you leak seed state, so
attacking the hash is a lot easier than you think, but I cant prove it
right now.

FunnyHash is only for use-cases where neither seed, nor result hash-sum is exposed (in any way, except timing for insertion/lookup into hash-table) (and seed is strongly random).
For any other use case there is no any promises.
70 bits of seed is more than enough to protect against timing investigations.
If it were avalanche fail on keybits (not on seed), or if there were only 30bits of seed value that produces good avalanche, I'd bother more.

funny-falcon · 2018-01-11T21:33:39Z

Maybe I should give some context on my view of this. Perl used to use
a hash function with the same problem, and with some builds it would
end up seeded as zero. Thus there was a security issue and etc. Even
if it is unlikely under totally correct usage, it is still a design
flaw.

Any function with known seed is vulnurable to bruteforce searching of collisioned keys.
It doesn't matter, have you SHA-1, SHA-2, SHA-3 or BLAKE-2 or SIPHASH.
If you know seed, you easily generate collisioned keys.

That is why I don't bother for known seed. And zero-seed is known seed.

funny-falcon · 2018-01-11T21:34:25Z

If you know seed, you easily generate collisioned keys.

I mean, collisioned into same bucket of hash table.

funny-falcon · 2018-01-11T21:37:25Z

Look at Microsoft Marvin32 hash function. It is awesome. (and patented). But it is also dumb with zero-seed.

demerphq · 2018-01-11T21:51:55Z

On 11 January 2018 at 22:37, Sokolov Yura ***@***.***> wrote: Look at Microsoft Marvin32 hash function. It is awesome. (and patented). But it is also dumb with zero-seed.

We have radically different definitions of awesome, Marvin32 doesn't do well in the smhasher tests either. Anyway, good luck. Ill look out for your PR's. cheers, Yves

…

-- perl -Mre=debug -e "/just|another|perl|hacker/"

funny-falcon · 2018-01-11T22:00:21Z

Marvin32 doesn't do well in the smhasher tests either.

Just try it with pre-xor-ed seed (so it is non-zero).

demerphq · 2018-01-11T22:02:41Z

On 11 January 2018 at 22:33, Sokolov Yura ***@***.***> wrote: Maybe I should give some context on my view of this. Perl used to use a hash function with the same problem, and with some builds it would end up seeded as zero. Thus there was a security issue and etc. Even if it is unlikely under totally correct usage, it is still a design flaw. Any function with known seed is vulnurable to bruteforce searching of collisioned keys.

Which for any decent hash is itself a difficult task. It doesn't matter, have you SHA-1, SHA-2, SHA-3 or BLAKE-2 or SIPHASH.

If you know seed, you easily generate collisioned keys.

Nope. Sorry. It took Google level resources and a major research to find any collisions at all in SHA1, which by the way has a fixed seed. SHA-2 and SHA-3 have fixed seeds. I dont know much about BLAKE-2, but i think its a SHA-3 candidate, so it would have a fixed seed as well. Siphash is seeded, and has no zero vulnerabilities, as it uses a 128 seed to initialize a 256 bit state in way that zeros are impossible. (It is quite elegant actually.) Anyway, listen, like I said, I don't care what you do with this hash really, it wont affect me at all. I am in this to learn, and to share what I have learned, hence the reason I have pursued this at all. cheers, Yves

demerphq · 2018-01-11T22:03:21Z

On 11 January 2018 at 23:00, Sokolov Yura ***@***.***> wrote: Marvin32 doesn't do well in the smhasher tests either. Just try it with pre-xor-ed seed (so it is non-zero).

It doesn't mix its seed properly, and fails avalanche tests. Yves

funny-falcon · 2018-01-12T08:05:42Z

SHA-2 and SHA-3 have fixed seeds. I dont know much about BLAKE-2, but i
think its a SHA-3 candidate, so it would have a fixed seed as well.

Both BLAKE-2 and SHA3 are keyed hash functions, ie they have arbitrary seed.

Nope. Sorry. It took Google level resources and a major research to find
any collisions at all in SHA1, which by the way has a fixed seed.

I didn't mean full hashsum collision. Only collision to same bucket of hash-table. If hash-table has 100_000 buckets, it is enough to examine 10_000*100_000 = 1_000_000_000 keys to find collision chain longer than 50_000, if seed is known. That is just a day of GPU even with newest SHA-3.

That is why it is important to generate secure random seed and keep it secret.

That is why language/db runtime designer have to be sure seed may be all-zero only in strict random case, so attacker cannot rely on it.

funny-falcon · 2018-01-12T08:36:12Z

I've tested funny_hash with xored seed (as I suggest above). It passes all tests except avalanche for short keys. I will send PR for this.

funny-falcon · 2018-01-12T21:55:17Z

Thank you for this issue.

I found way to feed 64bit seed into 128bit state so even no avalanche errors happens (at a cost of halving seed).
And I accept your idea of "preparing state" function. I will add this "prepare" to funny_hash.h and will make PR to your tests for "funny_hash with 64bit seed and 128bit state".

Also I will PR with fanom hash. It is quite fast, and looks to pass tests (after little tweak I will publish soon). Though I believe its complexity makes it less attractive than funny_hash for language implementations.

funny_hash has good property of easy chaining. It has same 128 intermediate state equal to (random/expanded) seed.
fanom_hash has 256bit intermediate state with 128bit initial seed, and it is less convenient to use in language implementations, where state should be passed while deep structure hashed.

funny-falcon · 2018-01-12T22:27:45Z

Damn, fanom_hash didn't pass differential "up-to-3-bit differentials in 256-bit keys -> 64 bit hashes" ... pitty...

funny-falcon · 2018-01-13T08:38:52Z

No, there was just small error. FanomHash passes tests.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

smhasher test results for funny hash show failures #1

smhasher test results for funny hash show failures #1

demerphq commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

demerphq commented Jan 11, 2018 via email

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 12, 2018

funny-falcon commented Jan 12, 2018

funny-falcon commented Jan 12, 2018

funny-falcon commented Jan 12, 2018

funny-falcon commented Jan 13, 2018

smhasher test results for funny hash show failures #1

smhasher test results for funny hash show failures #1

Comments

demerphq commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

demerphq commented Jan 11, 2018 via email

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 11, 2018

demerphq commented Jan 11, 2018 via email

demerphq commented Jan 11, 2018 via email

funny-falcon commented Jan 12, 2018

funny-falcon commented Jan 12, 2018

funny-falcon commented Jan 12, 2018

funny-falcon commented Jan 12, 2018

funny-falcon commented Jan 13, 2018