Serialize / Deserialize Xor8 type. #1
Comments
Hi @prataprc, I would like to write the persistence (SerDes) function :)
There are many serialization formats. IMHO, SerDe aims to serialize any Rust type into any of those formats. In this case, I think we only need binary serialization. So to begin with, we can implement a simple encode()/decode() API and add SerDe support at a later point? And thanks for the offer.
https://lemire.me/blog/2019/12/19/xor-filters-faster-and-smaller-than-bloom-filters/
FWIW, I have another implementation of xor filters in Rust, with optional serialization/deserialization via serde behind a feature flag: https://github.com/ayazhafiz/xorf. Feel free to use that implementation, or we can even merge these two libraries. Let me know what you think.
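A minimal sketch of what such a binary encode()/decode() pair could look like. The `Filter` struct, its field names (`seed`, `block_length`, `finger_prints`) and the big-endian layout are assumptions made for illustration, not the crate's actual definition or on-disk format.

```rust
/// Hypothetical stand-in for the filter state needed at query time.
/// Field names and types are assumptions for this sketch.
#[derive(Debug, PartialEq)]
pub struct Filter {
    pub seed: u64,
    pub block_length: u32,
    pub finger_prints: Vec<u8>,
}

impl Filter {
    /// Encode as: 8-byte seed, 4-byte block length (both big-endian),
    /// followed by the raw fingerprint bytes.
    pub fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(12 + self.finger_prints.len());
        buf.extend_from_slice(&self.seed.to_be_bytes());
        buf.extend_from_slice(&self.block_length.to_be_bytes());
        buf.extend_from_slice(&self.finger_prints);
        buf
    }

    /// Decode the layout written by `encode`. Returns `None` when the
    /// buffer is too short to hold the 12-byte header.
    pub fn decode(buf: &[u8]) -> Option<Filter> {
        if buf.len() < 12 {
            return None;
        }
        let mut seed = [0u8; 8];
        seed.copy_from_slice(&buf[0..8]);
        let mut block_length = [0u8; 4];
        block_length.copy_from_slice(&buf[8..12]);
        Some(Filter {
            seed: u64::from_be_bytes(seed),
            block_length: u32::from_be_bytes(block_length),
            finger_prints: buf[12..].to_vec(),
        })
    }
}
```

Fixing the byte order in the format keeps the encoded bytes identical across platforms, which matters once the bytes are written to disk.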
@ayazhafiz,
@ayazhafiz thanks for the offer, will give a shout-out when the need arises. Cheers,
For the new filter data structure, I would add an upgraded version of the persistence function that can also save the new attributes (keys and hash_builder).
IMHO, in the case of Xor8, serialization / de-serialization is only applicable to the bitmap-index and its associated fields. That is, we only need the fields required to execute the "contain()" API. I have tried to scope the problem of handling a really large set of keys in #9.
@prataprc
* Now includes `hash_builder` field as part of Xor8 serialization.
* Test cases for TL1 (backward compatibility) and TL2.
* File version moves from `TL1` to `TL2`.
* METADATA includes length of the serialized `hash_builder`.
* Shape of the serialized file has changed.
* `Xor8::write_file`, `Xor8::read_file`, `Xor8::to_bytes`, `Xor8::from_bytes` methods expect that the type parameter implements the `Default`, `Clone`, `From<Vec<u8>>`, `Into<Vec<u8>>` traits.
* Having said this, the new change is backward compatible: `Xor8::read_file` and `Xor8::from_bytes` can de-serialize Xor8 from the previous version (TL1).
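The versioning idea in the notes above can be sketched as follows. The byte layout here (a 3-byte version tag, then for `TL2` a 4-byte big-endian `hash_builder` length, the `hash_builder` bytes, and the remaining payload) is a hypothetical one chosen for illustration; the real file shape is not described in this thread. `Decoded`, `to_bytes_tl2` and `from_bytes` are illustrative names, not the crate's API.

```rust
/// Result of decoding either version; `hash_builder` is empty for TL1.
#[derive(Debug, PartialEq)]
pub struct Decoded {
    pub hash_builder: Vec<u8>,
    pub payload: Vec<u8>,
}

/// Write the hypothetical TL2 layout: tag, hash_builder length,
/// hash_builder bytes, then the rest of the filter data.
pub fn to_bytes_tl2(hash_builder: &[u8], payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(b"TL2");
    buf.extend_from_slice(&(hash_builder.len() as u32).to_be_bytes());
    buf.extend_from_slice(hash_builder);
    buf.extend_from_slice(payload);
    buf
}

/// Read either version by dispatching on the leading tag, so older
/// TL1 buffers remain readable.
pub fn from_bytes(buf: &[u8]) -> Result<Decoded, String> {
    if buf.len() < 3 {
        return Err("buffer too short for version tag".to_string());
    }
    match &buf[..3] {
        // TL1 carried no hash_builder: everything after the tag is payload.
        b"TL1" => Ok(Decoded {
            hash_builder: Vec::new(),
            payload: buf[3..].to_vec(),
        }),
        b"TL2" => {
            if buf.len() < 7 {
                return Err("buffer too short for TL2 metadata".to_string());
            }
            let mut len = [0u8; 4];
            len.copy_from_slice(&buf[3..7]);
            let hb_len = u32::from_be_bytes(len) as usize;
            // Reject lengths that run past the end of the buffer.
            let end = 7usize
                .checked_add(hb_len)
                .filter(|e| *e <= buf.len())
                .ok_or_else(|| "truncated hash_builder".to_string())?;
            Ok(Decoded {
                hash_builder: buf[7..end].to_vec(),
                payload: buf[end..].to_vec(),
            })
        }
        _ => Err("unknown version tag".to_string()),
    }
}
```

Storing the `hash_builder` length in the metadata lets the reader skip or extract it without knowing the concrete hasher type in advance.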
So that it can be persisted onto disk and retrieved later for membership checks.
Update1: Serializing and de-serializing Xor8::build_hasher() is now more challenging. The std documentation has cautions for both RandomState and DefaultHasher when either is used as the BuildHasher. So unless we have a BuildHasher type that is stable across releases and across instances, we may not be able to provide a stable serialization and de-serialization API.
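One way to get such stability is a hand-written BuildHasher whose algorithm and seed are fixed by the crate itself. The sketch below uses 64-bit FNV-1a purely as an example of a fully specified hash; `FixedSeedBuilder` and `Fnv1a` are hypothetical names, and nothing here is the crate's actual choice. It also implements the `Default`, `Clone`, `From<Vec<u8>>` and `Into<Vec<u8>>` traits mentioned earlier in the thread.

```rust
use std::hash::{BuildHasher, Hasher};

/// Hypothetical builder whose only state is a seed that can be
/// written to and read back from bytes.
#[derive(Clone, Debug, PartialEq)]
pub struct FixedSeedBuilder {
    seed: u64,
}

/// 64-bit FNV-1a state.
pub struct Fnv1a(u64);

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for b in bytes {
            self.0 ^= u64::from(*b);
            // FNV-1a 64-bit prime.
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

impl BuildHasher for FixedSeedBuilder {
    type Hasher = Fnv1a;

    fn build_hasher(&self) -> Fnv1a {
        // FNV-1a 64-bit offset basis, perturbed by the stored seed.
        Fnv1a(0xcbf2_9ce4_8422_2325 ^ self.seed)
    }
}

impl Default for FixedSeedBuilder {
    fn default() -> Self {
        FixedSeedBuilder { seed: 0 }
    }
}

impl From<Vec<u8>> for FixedSeedBuilder {
    /// Expects exactly 8 big-endian bytes; any other length falls
    /// back to the default seed (an arbitrary choice for this sketch).
    fn from(bytes: Vec<u8>) -> Self {
        if bytes.len() != 8 {
            return FixedSeedBuilder::default();
        }
        let mut seed = [0u8; 8];
        seed.copy_from_slice(&bytes);
        FixedSeedBuilder { seed: u64::from_be_bytes(seed) }
    }
}

impl From<FixedSeedBuilder> for Vec<u8> {
    fn from(b: FixedSeedBuilder) -> Vec<u8> {
        b.seed.to_be_bytes().to_vec()
    }
}
```

Two builders restored from the same bytes produce hashers that agree on the same byte input. Note that this only pins down the hasher: the bytes a key's `Hash` implementation feeds into it (for example, native-endian integers) are a separate source of instability that a persisted filter would also need to control.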