-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
smhasher test results for funny hash show failures #1
Comments
Thank you for test and reminding. I've looked inattentive before, and i've seen only avalanche one short keys. But seems like there are more. I'll investigate. BTW, is non-zero seed used for failed tests? FunnyHash is known to be bad with all-zero seed. |
On 11 January 2018 at 20:02, Sokolov Yura ***@***.***> wrote:
Thank you for test and reminding.
I've looked inattentive before, and i've seen only avalanche one short
keys. But seems like there are more. I'll investigate.
BTW, is non-zero seed used for failed tests? FunnyHash is known to be bad
with all-zero seed.
Yes, it includes some zero-seed tests. But you are failing other tests too.
From a testing standpoint the hash is expected to deal with zero seeds
itself.
You could for instance do something like:
seed ^= len;
seed += !seed;
to ensure it is nonzero at the start of your hash.
Yves
|
FunnyHash is intended for usage with hidden random-like seed. Zero seed is certainly neither hidden nor random-like. May I send pull request with change of seeding for funny_hash? Like: *(uint64_t *) out = fh64_string_hash2(key, len, s64[0] ^ 0x12345678, s64[1] ^ 0xabcdef12 ); I'll investigate issues that wouldn't be fixed with this. |
On 11 January 2018 at 21:14, demerphq ***@***.***> wrote:
On 11 January 2018 at 20:02, Sokolov Yura ***@***.***> wrote:
>
> Thank you for test and reminding.
>
> I've looked inattentive before, and i've seen only avalanche one short
> keys. But seems like there are more. I'll investigate.
>
> BTW, is non-zero seed used for failed tests? FunnyHash is known to be bad
> with all-zero seed.
Yes, it includes some zero-seed tests. But you are failing other tests too.
From a testing standpoint the hash is expected to deal with zero seeds
itself.
You could for instance do something like:
seed ^= len;
seed += !seed;
to ensure it is nonzero at the start of your hash.
Looking at the result more closely, there are three main sets of results:
* zero seed vulnerabilities in various tests.
* avalanche tests show weakness in the high bits of the seed for short
keys. Consider your hash is purely multiplicative, so the higher bits
of the seed have less effect on the result. Consider that (1<<63) *
odd_integer == (1<<63) under modulo 2**64, for all odd integers. (I
hope you get what I mean by that - changing the high bit of the state
has relatively less effect on the hash compared to lower bits)
* twobytes test, this looks for collisions/distribution from strings
with various multiples of two bytes long strings, for all two byte
non-zero values. I imagine you have collisions there because of the
way you read the last K bytes twice, although I am not sure of the
details.
Yves
|
On 11 January 2018 at 21:38, Sokolov Yura ***@***.***> wrote:
FunnyHash is intended for usage with hidden random-like seed. Zero seed is
certainly neither hidden nor random-like.
No, that is wrong. when choosing a number from 0 to 2**64-1, 0 is no less
random than 0x0123456789ABCDEF.
Put another way, pretty much any proper RNG produces 0 exactly as often as
it produces every other integer.
May I send pull request with change of seeding for funny_hash? Like:
*(uint64_t *) out = fh64_string_hash2(key, len, s64[0] ^ 0x12345678, s64[1] ^ 0xabcdef12 );
How would that help? That changes your magic "bad seed" from {0,0} to
{0x12345678, 0xabcdef12}.
You might pass some tests, but it would leave the core weakness in the hash.
Anyway, you can send me whatever pull requests you like, I don't mind
really, its your hash. :-)
I think there is an alternative. Your hash is used similarly to Perl's, and
my fork of smhasher was designed to test perls hash functions, which means
it supports a distinct interface for initializing the state before doing
any hashing. The function takes the seed and then returns a frozen "state"
which is passed through to the hashing functiion. You can define a state
initialization function and then do whatever seed guards you need there. In
fact, i think right now there is already one of these functions, and you
just need to educate it as you wish. I would say you library should define
such state initialization code itself, so users do not have to be aware of
any issues like this. This would also give you an opportunity to pre-mix
your seeds, possible together, so the state is better mixed when you start
hashing.
I'll investigate issues that wouldn't be fixed with this.
Sure. Really its your call. I have a personal interest in this subject, and
I consider all of these tests sane minimum requirements for any hash. If
you don't agree that is fine with me. :-)
FWIW, I think your hash is kinda interesting. It is really two hashes run
in parallel, each of which has an easy multicollision vulnerability which
is ameliorated by the fact that the input vector that breaks one does not
break the other, and vice versa. It is a bit scary actually. I kinda wonder
what a cryptographer would make of it.
cheers,
Yves
|
I don't care for zero seed at all. I'll put pull request for seed xoring as in comment above.
Yes. But seed is 128bit. And all-zero seed has probability is 2^-128 . It is same probability there is another UUIDv4 matching my UUIDv4. I don't care much for avalanche for short keys. I know how to fix it: just add another one multiplication round. It will slow function for short keys. (That is why it 8 byte keys are good: there are one more multplication round compared to 7 byte). But I care for twobytes. I'll certainly will try to understand whats happening. btw, can you help me to compile benchmark? I do:
|
btw: did you try https://github.com/funny-falcon/fanom_hash ? I'll check it as well and will send PR for it. |
On 11 January 2018 at 22:11, Sokolov Yura ***@***.***> wrote:
I don't care for zero seed at all. I'll put pull request for seed xoring
as in comment above.
No, that is wrong. when choosing a number from 0 to 2**64-1, 0 is no less
random than 0x0123456789ABCDEF.
Yes. But seed is 128bit. And all-zero seed has probability is 2^-128 . It
is same probability there is another UUIDv4 matching my UUIDv4.
So, I don't care.
Shrug. That is your call.
I don't care much for avalanche for short keys. I know how to fix it: just
add another one multiplication round. It will slow function for short keys.
(That is why it 8 byte keys are good: there are one more multplication
round compared to 7 byte).
Especially I don't care much for this avalanche dependency on seed value.
Even your test results shows there is always at least 70 bits of seed value
that produces good avalanche, and that is more than enough for FunnyHash
purposes (ie hash for internal hash table without exposed seed or hashsum
value).
That is your call again. I suspect that it means you leak seed state, so
attacking the hash is a lot easier than you think, but I cant prove it
right now.
Also, dont you have a 64 bit version?
But I care for twobytes. I'll certainly will try to understand whats
happening.
As you wish.
btw, can you help me to compile benchmark? I do:
$ git clone https://github.com/demerphq/smhasher.git
$ cd smhasher
$ mkdir build
$ cd build
$ cp ../../funny_hash/funny_hash.h .
$ cmake ..
$ # VERSION.h not found. What should I install?
$ make
....
/home/yura/Project/smhasher-demerphq/TAP.cpp:5:21: fatal error: VERSION.h: No such file or directory
#include "VERSION.h"
^
compilation terminated.
Run ./make_smhasher, the file will be created - sorry about that, I will
add that to the docs.
cheers,
Yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
On 11 January 2018 at 22:14, Sokolov Yura ***@***.***> wrote:
btw: did you try https://github.com/funny-falcon/fanom_hash ? I'll check
it as well and will send PR for it.
Sure. Sounds good.
cheers,
Yves
|
On 11 January 2018 at 22:18, demerphq ***@***.***> wrote:
On 11 January 2018 at 22:11, Sokolov Yura ***@***.***> wrote:
>
> I don't care for zero seed at all. I'll put pull request for seed xoring
> as in comment above.
>
> No, that is wrong. when choosing a number from 0 to 2**64-1, 0 is no less
> random than 0x0123456789ABCDEF.
>
> Yes. But seed is 128bit. And all-zero seed has probability is 2^-128 . It
> is same probability there is another UUIDv4 matching my UUIDv4.
> So, I don't care.
Shrug. That is your call.
Having said that, I don't get it. How hard is it to define a function
that ensures your state has reasonable mixed version of its input
state? I mean if it gets executed exactly once per seed, and you
normally use the same seed per process why would you not? It just
seems irresponsible, not matter how unlikely it is.
Maybe I should give some context on my view of this. Perl used to use
a hash function with the same problem, and with some builds it would
end up seeded as zero. Thus there was a security issue and etc. Even
if it is unlikely under totally correct usage, it is still a design
flaw.
Yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
FunnyHash is only for use-cases where neither seed, nor result hash-sum is exposed (in any way, except timing for insertion/lookup into hash-table) (and seed is strongly random). |
Any function with known seed is vulnurable to bruteforce searching of collisioned keys. That is why I don't bother for known seed. And zero-seed is known seed. |
I mean, collisioned into same bucket of hash table. |
Look at Microsoft Marvin32 hash function. It is awesome. (and patented). But it is also dumb with zero-seed. |
On 11 January 2018 at 22:37, Sokolov Yura ***@***.***> wrote:
Look at Microsoft Marvin32 hash function. It is awesome. (and patented).
But it is also dumb with zero-seed.
We have radically different definitions of awesome, Marvin32 doesn't do
well in the smhasher tests either.
Anyway, good luck. Ill look out for your PR's.
cheers,
Yves
…--
perl -Mre=debug -e "/just|another|perl|hacker/"
|
Just try it with pre-xor-ed seed (so it is non-zero). |
On 11 January 2018 at 22:33, Sokolov Yura ***@***.***> wrote:
Maybe I should give some context on my view of this. Perl used to use
a hash function with the same problem, and with some builds it would
end up seeded as zero. Thus there was a security issue and etc. Even
if it is unlikely under totally correct usage, it is still a design
flaw.
Any function with known seed is vulnurable to bruteforce searching of
collisioned keys.
Which for any decent hash is itself a difficult task.
It doesn't matter, have you SHA-1, SHA-2, SHA-3 or BLAKE-2 or SIPHASH.
If you know seed, you easily generate collisioned keys.
Nope. Sorry. It took Google level resources and a major research to find
any collisions at all in SHA1, which by the way has a fixed seed.
SHA-2 and SHA-3 have fixed seeds. I dont know much about BLAKE-2, but i
think its a SHA-3 candidate, so it would have a fixed seed as well.
Siphash is seeded, and has no zero vulnerabilities, as it uses a 128 seed
to initialize a 256 bit state in way that zeros are impossible. (It is
quite elegant actually.)
Anyway, listen, like I said, I don't care what you do with this hash
really, it wont affect me at all. I am in this to learn, and to share what
I have learned, hence the reason I have pursued this at all.
cheers,
Yves
|
On 11 January 2018 at 23:00, Sokolov Yura ***@***.***> wrote:
Marvin32 doesn't do well in the smhasher tests either.
Just try it with pre-xor-ed seed (so it is non-zero).
It doesn't mix its seed properly, and fails avalanche tests.
Yves
|
Both BLAKE-2 and SHA3 are keyed hash functions, ie they have arbitrary seed.
I didn't mean full hashsum collision. Only collision to same bucket of hash-table. If hash-table has That is why it is important to generate secure random seed and keep it secret. That is why language/db runtime designer have to be sure seed may be all-zero only in strict random case, so attacker cannot rely on it. |
I've tested funny_hash with xored seed (as I suggest above). It passes all tests except avalanche for short keys. I will send PR for this. |
Thank you for this issue. I found way to feed 64bit seed into 128bit state so even no avalanche errors happens (at a cost of halving seed). Also I will PR with fanom hash. It is quite fast, and looks to pass tests (after little tweak I will publish soon). Though I believe its complexity makes it less attractive than funny_hash for language implementations. funny_hash has good property of easy chaining. It has same 128 intermediate state equal to (random/expanded) seed. |
Damn, fanom_hash didn't pass differential "up-to-3-bit differentials in 256-bit keys -> 64 bit hashes" ... pitty... |
No, there was just small error. FanomHash passes tests. |
https://github.com/demerphq/smhasher/blob/master/doc/FunnyHash64-2.128.128.64.txt
FunnyHash64-2 128 128 64 FAILED 17 of 187 tests. Avalanche: 17-24, OverAll: 187, Sanity: 4-5, TwoBytes: 59, 61, 63, Zeroes: 166-167, 170
Just a heads up.
The text was updated successfully, but these errors were encountered: