feat(p2p): redefine the gossip message id and contents #18

dancoombs · 2023-11-30T17:10:14Z

Change is described here: https://hackmd.io/@dancoombs/r1u4CcB-p#Gossip

p2p-specs/p2p-interface.md

drortirosh · 2023-12-13T10:20:20Z

p2p-specs/p2p-interface.md

-* Otherwise, set `message-id` to the first 20 bytes of the `SHA256` hash of
-  the concatenation of `MESSAGE_DOMAIN_INVALID_SNAPPY` with the raw message data,
-  i.e. `SHA256(MESSAGE_DOMAIN_INVALID_SNAPPY + message.data)[:20]`.
+* If `message.data` has a valid snappy decompression, set `message-id` to the first 20 bytes of the `SHA256` hash of the concatenation of the following data: `MESSAGE_DOMAIN_VALID_SNAPPY`, the length of the topic byte string (encoded as little-endian `uint64`), the topic byte string, and the snappy decompressed message data


I'm trying to understand the rationale. In any case, the message ID is a hash based on the real (uncompressed) message data, which makes sense.
Why do we encode the transport validity (snappy) in it?

This was done for continuity with the eth2 spec: https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/p2p-interface.md#topics-and-messages and https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/p2p-interface.md#topics-and-messages

It looks to be for:

> Note: The above logic handles two exceptional cases: (1) multiple snappy data can decompress to the same value, and (2) some message data can fail to snappy decompress altogether.
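For readers following the thread, the rule in the diff can be sketched roughly as below. This is a minimal illustration, not the spec's reference code. The `decompress` parameter is a hypothetical stand-in for a snappy decompressor that raises on invalid input (snappy is not in the Python standard library). The domain values are the ones the eth2 consensus spec defines and are assumed to carry over. The invalid-snappy branch follows the Altair form (with the topic included), which is also an assumption, since the replacement text for that branch is not quoted in this thread.

```python
import hashlib
from typing import Callable

# Domain values from the eth2 consensus spec; assumed to carry over unchanged.
MESSAGE_DOMAIN_INVALID_SNAPPY = bytes.fromhex("00000000")
MESSAGE_DOMAIN_VALID_SNAPPY = bytes.fromhex("01000000")


def compute_message_id(
    topic: bytes,
    data: bytes,
    decompress: Callable[[bytes], bytes],
) -> bytes:
    """Return a 20-byte message-id for a gossip message.

    `decompress` is a hypothetical snappy decompressor that raises an
    exception when `data` is not valid compressed input.
    """
    topic_len = len(topic).to_bytes(8, "little")  # little-endian uint64
    try:
        payload = decompress(data)
        domain = MESSAGE_DOMAIN_VALID_SNAPPY
    except Exception:
        # Fallback: hash the raw bytes under a separate domain.
        # Including the topic here mirrors the Altair form (assumption).
        payload = data
        domain = MESSAGE_DOMAIN_INVALID_SNAPPY
    return hashlib.sha256(domain + topic_len + topic + payload).digest()[:20]
```

Because the function always returns an id, it fits a callback that has no way to signal failure. The separate domain keeps an undecodable payload from ever sharing an id with a valid message.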

I still don't understand this funny encoding. Maybe they had a "hidden agenda" feature of supporting an older encoding scheme.

Compression is a transport-layer feature that could be done at a lower layer. The message-id should (and does) depend on the actual (uncompressed) content.

Why should we care if different snappy data decompress to the same value? They are thus the same message. The only possible reason I can think of is if there are 2 ids in the system: one is the message-id based on our (uncompressed) data and topic, and another is a raw id based on the raw data. I think such a raw id should be removed completely, instead of patching the protocol to support it.

Encoding the "success/failure" of decompression into the message-id implies that different decompress implementations are expected to get different results, which is strange.

If snappy decompression fails, then it must be an invalid message and should be ignored completely. We don't handle messages that failed decompression (or raw, uncompressed messages).
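The "different encodings, same message" case can be illustrated with a small sketch. It uses `zlib` purely as a stand-in for snappy (which is not in the Python standard library), and the helper names `raw_id` and `content_id` are hypothetical. The point carries over to any codec in which one payload has more than one valid compressed form.

```python
import hashlib
import zlib

payload = b"user operation bytes"

# Two different wire encodings of the same payload (zlib stands in for snappy).
fast = zlib.compress(payload, 1)
best = zlib.compress(payload, 9)


def raw_id(data: bytes) -> bytes:
    """Hypothetical id derived from the bytes as they appear on the wire."""
    return hashlib.sha256(data).digest()[:20]


def content_id(data: bytes) -> bytes:
    """Hypothetical id derived from the decompressed content."""
    return hashlib.sha256(zlib.decompress(data)).digest()[:20]


different_on_wire = fast != best
same_content_id = content_id(fast) == content_id(best)
same_raw_id = raw_id(fast) == raw_id(best)
```

An id based on raw bytes treats the two encodings as distinct messages, while an id based on decompressed content deduplicates them.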

The need for MESSAGE_DOMAIN_INVALID_SNAPPY is likely due to how libp2p implementations handle assigning message ids.

The Go implementation doesn't let you return an error when you calculate the message id, and the calculation appears to run on the compressed value (hence the need to decompress to calculate the msg id, which I thought was odd): https://github.com/libp2p/go-libp2p-pubsub/blob/b5ee222289aabef29ebf90647d7c0d99d5c8ee19/pubsub.go#L343

How Prysm uses it: https://github.com/prysmaticlabs/prysm/blob/c3dbfa66d090ac40818f5ddc5a229599c78db8ab/beacon-chain/p2p/message_id.go#L55

It's likely that this is used within the libp2p library at a point where the message is already being sent and needs to be assigned an ID, and there is no error handling. So they need to assign something, and chose a format.

Lighthouse never uses this value, as it calculates the message id on the decompressed data directly: https://github.com/sigp/lighthouse/blob/441fc1691b69f9edc4bbdc6665f3efab16265c9b/beacon_node/lighthouse_network/src/config.rs#L486

p2p-specs/p2p-interface.md

ch4r10t33r · 2024-01-09T18:40:36Z

p2p-specs/p2p-interface.md

@@ -202,6 +213,11 @@ minimumStake: '0.0'
 ```
 The `mempool-id` of the canonical mempool is `TBD` (IPFS hash of the yaml/JSON file).

+#### Canonical Mempools
+
+There will be a published list of canonical mempools maintained by the bundler community. This list represents mempools that support the full [ERC-7562](https://github.com/ethereum/ERCs/pull/105) validation rules as well as certain mempool configuration parameters and a specific entry point contract. All bundlers SHOULD support these mempools. User operations that do not require access to alternative mempools will be supported by at least one of these canonical mempools.


Can we pls link to this file? https://github.com/eth-infinitism/bundler-spec/pull/22/files

ch4r10t33r suggested changes Dec 12, 2023


p2p-specs/p2p-interface.md (outdated)

drortirosh reviewed Dec 13, 2023


dancoombs force-pushed the danc/gossip branch 3 times, most recently from 3e848d7 to 271f70c on December 22, 2023 17:31

0xSulpiride mentioned this pull request Jan 4, 2024

p2p spec updates etherspot/skandha#139

Merged

8 tasks

dancoombs force-pushed the danc/gossip branch 2 times, most recently from 25bb077 to 261bba0 on January 9, 2024 17:09

feat(p2p): redefine the gossip message id and contents

d66407f

dancoombs force-pushed the danc/gossip branch from 261bba0 to d66407f on January 9, 2024 17:17

ch4r10t33r approved these changes Jan 9, 2024


drortirosh merged commit c249743 into eth-infinitism:main Jan 18, 2024
0 of 2 checks passed


dancoombs commented Nov 30, 2023

drortirosh Dec 13, 2023

dancoombs Dec 22, 2023

drortirosh Jan 16, 2024

dancoombs Jan 18, 2024

ch4r10t33r Jan 9, 2024
