Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(p2p): redefine the gossip message id and contents #18

Merged
merged 1 commit into from
Jan 18, 2024

Conversation

dancoombs
Copy link
Contributor

p2p-specs/p2p-interface.md Outdated Show resolved Hide resolved
* Otherwise, set `message-id` to the first 20 bytes of the `SHA256` hash of
the concatenation of `MESSAGE_DOMAIN_INVALID_SNAPPY` with the raw message data,
i.e. `SHA256(MESSAGE_DOMAIN_INVALID_SNAPPY + message.data)[:20]`.
* If `message.data` has a valid snappy decompression, set `message-id` to the first 20 bytes of the `SHA256` hash of the concatenation of the following data: `MESSAGE_DOMAIN_VALID_SNAPPY`, the length of the topic byte string (encoded as little-endian `uint64`), the topic byte string, and the snappy decompressed message data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand the rationale: in any case, the message ID is a hash based on the real (uncompressed) message data - which makes sense.
why do we encode the transport validity (snappy) in it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done for continuity with the eth2 spec https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#topics-and-messages https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/p2p-interface.md#topics-and-messages

It looks to be for:

Note: The above logic handles two exceptional cases: (1) multiple snappy data can decompress to the same value, and (2) some message data can fail to snappy decompress altogether.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand this funny encoding. maybe they had a "hidden agenda" feature, of supporting an older encoding scheme

  • compression is a transport-layer feature, that could be done at a lower layer. message-id should (and it does) depend on the actual (uncompressed) content.
  • why should we care if different snappy data decompress to the same value? they are thus the same message. The only possible reason I can think of is if there are 2 ids in the system: one is message-id based on our (uncompressed) and topic, and another raw id which is based on raw data. I think that such raw id should be removed completely, instead of patching the protocol to support it.
  • encoding the "success/failure" of decompression into the message-id, implies it is expected that different decompress implementations get different results - which is strange.
  • if snappy decompress fails, then it must be an invalid message and should be ignored completely. we don't
    handle failed compression (or raw, uncompressed) messages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The need for MESSAGE_DOMAIN_INVALID_SNAPPY is likely due to how libp2p implementations handle assigning message ids.

The go implementation doesn't let you return an error when you calculate message id, and it looks to happen on the post-compressed value (hence the need to decompress to calculate msg id, which I thought was odd): https://github.com/libp2p/go-libp2p-pubsub/blob/b5ee222289aabef29ebf90647d7c0d99d5c8ee19/pubsub.go#L343

How Prysm uses it: https://github.com/prysmaticlabs/prysm/blob/c3dbfa66d090ac40818f5ddc5a229599c78db8ab/beacon-chain/p2p/message_id.go#L55

Its likely that this is used within the libp2p library where the message is already being sent and needs to be assigned an ID, and there isn't error handling. So they need to assign something, and chose a format.

Lighthouse never uses this value as it calculates message id on the decompressed data directly https://github.com/sigp/lighthouse/blob/441fc1691b69f9edc4bbdc6665f3efab16265c9b/beacon_node/lighthouse_network/src/config.rs#L486

p2p-specs/p2p-interface.md Outdated Show resolved Hide resolved
p2p-specs/p2p-interface.md Outdated Show resolved Hide resolved
p2p-specs/p2p-interface.md Show resolved Hide resolved
p2p-specs/p2p-interface.md Show resolved Hide resolved
@dancoombs dancoombs force-pushed the danc/gossip branch 3 times, most recently from 3e848d7 to 271f70c Compare December 22, 2023 17:31
@0xSulpiride 0xSulpiride mentioned this pull request Jan 4, 2024
8 tasks
@dancoombs dancoombs force-pushed the danc/gossip branch 2 times, most recently from 25bb077 to 261bba0 Compare January 9, 2024 17:09
@@ -202,6 +213,11 @@ minimumStake: '0.0'
```
The `mempool-id` of the canonical mempool is `TBD` (IPFS hash of the yaml/JSON file).

#### Canonical Mempools

There will be a published list of canonical mempools maintained by the bundler community. This list represents mempools that support the full [ERC-7562](https://github.com/ethereum/ERCs/pull/105) validation rules as well as certain mempool configuration parameters and a specific entry point contract. All bundlers SHOULD support these mempools. User operations that do not require access to alternative mempools will be supported by at least one of these canonical mempools.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@drortirosh drortirosh merged commit c249743 into eth-infinitism:main Jan 18, 2024
0 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants