* wip - Implement offloading of literals
* Fix use of metadata bucket prefix
* Fix repeated use of uri
* Add temporary representation for offloaded literal
* Add one unit test
* Add another test
* Stylistic changes to the two tests
* Add test for min offloading threshold set to 1MB
* Pick a unique engine-dir for tests
* s/new_outputs/literal_map_copy/
* Remove unused constant
* Use output_prefix in definition of offloaded literals
* Add initial version of pbhash.py
* Add tests to verify that overriding the hash is carried over to offloaded literals
* Add a few more tests
* Always import ParamSpec from `typing_extensions`
* Fix lint warnings
* Set inferred_type using the task type interface
* Add comment about offloaded literals files and how they are uploaded to the metadata bucket
* Add offloading_enabled
* Add more unit tests including a negative test
* Fix bad merge
* Incorporate feedback.
* Fix image name (unrelated to this PR - just a nice-to-have to decrease flakiness)
* Add `is_map_task` to `_dispatch_execute`

---------

Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>
1 parent 61c066c, commit 01c51b9
Showing 9 changed files with 679 additions and 18 deletions.
@@ -0,0 +1,39 @@
# This is a module that provides hashing utilities for Protobuf objects.
import base64
import hashlib
import json

from google.protobuf import json_format
from google.protobuf.message import Message


def compute_hash(pb: Message) -> bytes:
    """
    Computes a deterministic hash in bytes for the Protobuf object.
    """
    try:
        pb_dict = json_format.MessageToDict(pb)
        # json.dumps with sorted keys to ensure stability
        stable_json_str = json.dumps(
            pb_dict, sort_keys=True, separators=(",", ":")
        )  # separators to ensure no extra spaces
    except Exception as e:
        raise ValueError(f"Failed to marshal Protobuf object {pb} to JSON with error: {e}")

    try:
        # Deterministically hash the JSON object to a byte array. Using SHA-256 for hashing here,
        # assuming it provides a consistent hash output.
        hash_obj = hashlib.sha256(stable_json_str.encode("utf-8"))
    except Exception as e:
        raise ValueError(f"Failed to hash JSON for Protobuf object {pb} with error: {e}")

    # The digest is guaranteed to be 32 bytes long
    return hash_obj.digest()


def compute_hash_string(pb: Message) -> str:
    """
    Computes a deterministic hash in base64 encoded string for the Protobuf object
    """
    hash_bytes = compute_hash(pb)
    return base64.b64encode(hash_bytes).decode("utf-8")
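
The hashing approach in this file can be illustrated without Protobuf installed. The sketch below is not part of the commit: `stable_hash` and `stable_hash_string` are hypothetical helpers that apply the same `json.dumps(sort_keys=True, separators=(",", ":"))` plus SHA-256 steps to plain dicts, and the sample dicts are made-up stand-ins for what `json_format.MessageToDict` might return.

```python
import base64
import hashlib
import json


def stable_hash(obj: dict) -> bytes:
    """Hash a JSON-serializable dict independently of key insertion order."""
    # sort_keys fixes the key order; compact separators remove whitespace variance.
    stable_json_str = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(stable_json_str.encode("utf-8")).digest()


def stable_hash_string(obj: dict) -> str:
    """Base64-encode the digest, mirroring compute_hash_string in the diff."""
    return base64.b64encode(stable_hash(obj)).decode("utf-8")


# Hypothetical dicts with the same content but different key order.
a = {"scalar": {"primitive": {"integer": "1"}}, "hash": "abc"}
b = {"hash": "abc", "scalar": {"primitive": {"integer": "1"}}}

same_digest = stable_hash(a) == stable_hash(b)
digest_len = len(stable_hash(a))
encoded_len = len(stable_hash_string(a))
```

Under these assumptions the two dicts hash identically, the SHA-256 digest is 32 bytes, and its padded base64 form is 44 characters long.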