Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement callbacks from vertices when serving is used as source #2311

Merged
merged 35 commits into from
Jan 21, 2025

Conversation

BulkBeing
Copy link
Contributor

@BulkBeing BulkBeing commented Jan 7, 2025

Implements callbacks to source vertex when Serving source is used. #2318

@vigith vigith changed the title Implement callbacks from vertices when serving is used as source feat: implement callbacks from vertices when serving is used as source Jan 8, 2025
Comment on lines 253 to 255
if let Some(ref callback_handler) = this.callback_handler {
let metadata = message.metadata.ok_or_else(|| {
Error::Source(format!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should write the callback after pafs are resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update. I wrongly assumed .resolve_pafs() would wait for them to resolve.

Copy link

codecov bot commented Jan 13, 2025

Codecov Report

Attention: Patch coverage is 91.41304% with 79 lines in your changes missing coverage. Please review.

Project coverage is 67.93%. Comparing base (05cd792) to head (8ccb4f1).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
rust/numaflow-core/src/tracker.rs 95.37% 18 Missing ⚠️
rust/serving/src/callback.rs 94.35% 14 Missing ⚠️
rust/numaflow-core/src/config/monovertex.rs 31.25% 11 Missing ⚠️
rust/numaflow-core/src/config/pipeline.rs 65.62% 11 Missing ⚠️
rust/numaflow/src/main.rs 0.00% 10 Missing ⚠️
rust/numaflow-core/src/mapper/map.rs 88.23% 6 Missing ⚠️
rust/numaflow-core/src/pipeline.rs 83.33% 3 Missing ⚠️
rust/numaflow-core/src/message.rs 50.00% 2 Missing ⚠️
rust/serving/src/app.rs 88.88% 2 Missing ⚠️
rust/numaflow-core/src/config.rs 0.00% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2311      +/-   ##
==========================================
+ Coverage   67.47%   67.93%   +0.45%     
==========================================
  Files         351      352       +1     
  Lines       45822    46528     +706     
==========================================
+ Hits        30918    31608     +690     
- Misses      13828    13849      +21     
+ Partials     1076     1071       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

pub(crate) async fn update(&self, message: &Message) -> Result<()> {
let offset = message.id.offset.clone();
let mut responses: Option<Vec<String>> = None;
if self.enable_callbacks {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

match, to avoid mut

@BulkBeing
Copy link
Contributor Author

The build-rust-amd64 stage was building a dynamically linked binary in this PR: https://github.com/numaproj/numaflow/actions/runs/12869012445/job/35877121103?pr=2311.

./numaflow-rs-linux-amd64: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=6ba3d19f2e1988adad1260005d261e44da230496, for GNU/Linux 3.2.0, not stripped

In all other PRs

./numaflow-rs-linux-amd64: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), static-pie linked, BuildID[sha1]=903cf4c1a25be91fe8c8d4d133416e02f16c005d, for GNU/Linux 3.2.0, not stripped

This was causing numa container to fail in Monovertex integration tests:

Command:
    /bin/numaflow-rs
Args:
    --rust
State:          Waiting
    Reason:       CrashLoopBackOff
Last State:     Terminated
    Reason:       Error
    Exit Code:    139

Unfortunately, no other info was coming in kubectl logs or kubectl logs --previous:

kubectl -n numaflow-system logs -l 'app.kubernetes.io/name=transformer-mono-vertex' -c numa
kubectl -n numaflow-system logs -p -l 'app.kubernetes.io/name=transformer-mono-vertex' -c numa

Downloaded the binary from the assets of the PR build, and ran it on an Intel Ubuntu 24.04 VM:

$ ./numaflow-rs-linux-amd64
Segmentation fault

Somehow binary is linked with libssl:

$ ldd numaflow-rs-linux-amd64
        libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007ffffe89c000)
        libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007ffffe389000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffffe177000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ffffffc6000)

Tried building static binary on an Intel Ubuntu 24.04 VM (same as Github CI runner, verified glibc version also). And this builds a static binary successfully.

Disabled default features in reqwest and builds in CI started to work:

reqwest = { version = "0.12.12", default-features = false, features = [
    "http2",
    "rustls-tls",
    "json",
] }

@vigith
Copy link
Member

vigith commented Jan 21, 2025

how come we didn't see this problem when running locally? perhaps we should see whether we can reproduce this in cloud-workspaces?

@BulkBeing
Copy link
Contributor Author

how come we didn't see this problem when running locally? perhaps we should see whether we can reproduce this in cloud-workspaces?

I'm couldn't find the exact reason for this. It looks like sometimes the option to build static binary (RUSTFLAGS='-C target-feature=+crt-static') is silently ignored if some of the libraries pass dynamically linking flags to rustc. rust-lang/rust#39998

Another reference: https://github.com/rust-lang/rust/pull/71586/files#diff-6e9a08e032c83ccfc4b967a192b6d38c932eaaf6a153027bd79365830e4d7139

WARNING! If some libraries (native or Rust crates) in the dependency tree exist only in the
dynamic variant they may be linked dynamically breaking the self-contained-ness property,
re-adding the INTERP header for ELF executables, and possibly working incorrectly due to mismatch
between dynamically linked libraries and injected startup objects specific to static linking.
A request to turn this case into an error is tracked in
issue #39998.

https://doc.rust-lang.org/reference/linkage.html#r-link.crt.ineffective

Targets which do not support switching between linkage of the C runtime will ignore this flag. It’s recommended to inspect the resulting binary to ensure that it’s linked as you would expect after the compiler succeeds.

Our target (x86_64-unknown-linux-gnu) supports linking C runtime statically, but I've added a step to validate the binary is static in the CI pipeline.

@@ -304,6 +305,7 @@ impl MapHandle {
Ok(())
});

tracing::info!("Returning output_rx stream");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

@@ -286,9 +296,15 @@ impl PipelineConfig {
.map(|(key, val)| (key.into(), val.into()))
.filter(|(key, _val)| {
// FIXME(cr): this filter is non-exhaustive, should we invert?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

@@ -343,14 +346,16 @@ impl MapHandle {
match receiver.await {
Ok(Ok(mut mapped_messages)) => {
// update the tracker with the number of messages sent and send the mapped messages
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change

@@ -605,6 +605,9 @@ pub(crate) async fn start_metrics_https_server(
addr: SocketAddr,
metrics_state: UserDefinedContainerState,
) -> crate::Result<()> {
// Setup the CryptoProvider (controls core cryptography used by rustls) for the process
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this?

@yhl25 yhl25 marked this pull request as ready for review January 21, 2025 16:38
@yhl25 yhl25 requested a review from whynowy as a code owner January 21, 2025 16:38
Signed-off-by: Vigith Maurice <[email protected]>
@vigith vigith enabled auto-merge (squash) January 21, 2025 16:39
Signed-off-by: Vigith Maurice <[email protected]>
@vigith vigith merged commit c492520 into main Jan 21, 2025
28 checks passed
@vigith vigith deleted the serving-callbacks branch January 21, 2025 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants