
Initial code and CI for trace data pipeline #368

Merged
bantonsson merged 7 commits into main from ban/cleaned-up-data-pipeline-poc on Apr 10, 2024

Conversation

@bantonsson (Contributor) commented Mar 25, 2024

What does this PR do?

This PR contains the work on a shared data pipeline component done during the R&D week.

Motivation

Add a base build and CI that we can continue to iterate on to refine the API and functionality of the shared data pipeline component.

Additional Notes

Please don't be too hard on the public APIs and the implementation. They will evolve over time, but in the interest of not having a ginormous PR from a feature branch, we would like to get the machinery into main.

How to test the change?

Describe here in detail how the change can be validated.

For Reviewers

  • If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

@bantonsson bantonsson force-pushed the ban/cleaned-up-data-pipeline-poc branch 4 times, most recently from e72e1a4 to ad57c31 on March 27, 2024 15:29
@bantonsson bantonsson changed the title from "[WIP] Cleaned up POC branch" to "Initial code and CI for trace data pipeline" on Mar 27, 2024
@bantonsson bantonsson requested a review from a team March 27, 2024 15:36
@bantonsson bantonsson marked this pull request as ready for review March 27, 2024 15:45
@bantonsson bantonsson requested review from a team as code owners March 27, 2024 15:45
@paullegranddc (Contributor):

Are we going to ship different shared libraries for data-pipeline and profiling?

@bantonsson (Contributor, Author):

We are not there yet, and in some languages we definitely will, since they don't use the profiler in libdatadog.

@ivoanjo (Member) commented Mar 27, 2024

Not a blocker at this point, but more of an observation: note that shipping profiling AND data-pipeline separately for the same language will probably increase the binary size a lot more than if a unified libdatadog were produced ;)

@bantonsson (Contributor, Author) commented Mar 27, 2024

Yes, we are aware of the added binary size and are going to work on reducing it and combining the features into a single binary where it makes sense.

@bantonsson bantonsson force-pushed the ban/cleaned-up-data-pipeline-poc branch from 10654c7 to 824fa21 on March 28, 2024 08:12
@paullegranddc (Contributor):

> Not a blocker at this point, but more of an observation: note that shipping profiling AND data-pipeline separately for the same language will probably increase the binary size a lot more than if a unified libdatadog were produced ;)

It's also going to be hard to load both of them in the same process because of symbols exposed in ddcommon being in both shared libraries.

@ivoanjo (Member) commented Mar 28, 2024

> It's also going to be hard to load both of them in the same process because of symbols exposed in ddcommon being in both shared libraries.

Good point! For Ruby we explicitly load the profiling native extension with RTLD_LOCAL, and I believe that's a potential solution to this issue.

I'm saying this more out of curiosity / sharing notes :)
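
As a side note (not part of this PR), here is a minimal sketch of what that RTLD_LOCAL approach could look like from Rust via the libc crate; the library file names are placeholders, not the actual artifact names shipped by libdatadog:

```rust
use std::ffi::{CStr, CString};

// Load a shared library with RTLD_LOCAL so its symbols stay out of the global
// namespace; a second library exporting the same ddcommon symbols can then be
// loaded into the same process without clashing.
fn load_local(path: &str) -> *mut libc::c_void {
    let c_path = CString::new(path).expect("path contains a NUL byte");
    let handle = unsafe { libc::dlopen(c_path.as_ptr(), libc::RTLD_NOW | libc::RTLD_LOCAL) };
    if handle.is_null() {
        // dlerror describes the most recent dlopen failure.
        let err = unsafe { CStr::from_ptr(libc::dlerror()) };
        panic!("dlopen({path}) failed: {}", err.to_string_lossy());
    }
    handle
}

fn main() {
    // Placeholder file names: substitute the real shared library paths.
    let _profiling = load_local("libdatadog_profiling.so");
    let _data_pipeline = load_local("libdata_pipeline.so");
}
```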

@bantonsson bantonsson force-pushed the ban/cleaned-up-data-pipeline-poc branch from 824fa21 to 782ae18 on March 28, 2024 16:17
set -eu

destdir="$1"

Contributor:

Do we want to fix the build logic before adding the tracing logic, or after?

Contributor (Author):

Could you please elaborate a bit on what this would mean?

Contributor:

This relates to the RFC and issue #353.
The idea would be to have a single build where we can select what needs to be built.
A good first step could be to unify the build scripts.
Ideally, all of the build logic would be embedded in cargo in the long term:
cargo build --features trace-pipeline,profiling,telemetry
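
For illustration only (a sketch, not code from this PR; the feature names are hypothetical, and `profiling_ffi` is a stand-in, while the other crate names follow the ones visible elsewhere in this PR), a unified FFI crate's lib.rs could gate each component's re-exports behind Cargo features so that a single cargo build selects what gets compiled in:

```rust
// Hypothetical lib.rs of a unified FFI crate. Each component crate is an
// optional dependency tied to a Cargo feature, so e.g.
//   cargo build --features trace-pipeline,profiling,telemetry
// compiles in exactly those components.

#[cfg(feature = "profiling")]
pub use profiling_ffi::*;

#[cfg(feature = "trace-pipeline")]
pub use data_pipeline_ffi::*;

#[cfg(feature = "telemetry")]
pub use ddtelemetry_ffi::*;
```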

Contributor:

Also, I don't want to block the PR if you feel it is already valuable to have a build out there.
We should mainly do this to ensure we can load everything together (and to have a clean build process).

Contributor:

Fixed in the latest commit. As discussed internally, the profiling library will be the main artifact and the data-pipeline symbols will be exported there.

Contributor:

Why do we export the symbols here?

Contributor:

I forgot the last commit (I pushed it, but the prompt asked for credentials while I was AFK). Now it's complete.

@bantonsson bantonsson force-pushed the ban/cleaned-up-data-pipeline-poc branch 5 times, most recently from ee61413 to e18be6b on April 4, 2024 12:23
@gleocadie (Contributor) left a comment:

LGTM.
Can you please remove this change to windows/libdatadog.csproj? It will generate a +300MB nuget package. We need to rethink how we package this on Windows. In CI (dd-trace-dotnet, for example), this can be a problem on the runners: not enough disk space.

@bantonsson bantonsson requested a review from a team as a code owner April 8, 2024 10:20
@github-actions github-actions bot added the "profiling" label (Relates to the profiling* modules) on Apr 8, 2024
@@ -18,6 +18,10 @@ pub use crashtracker::*;
#[cfg(feature = "ddtelemetry-ffi")]
pub use ddtelemetry_ffi::*;

#[cfg(feature = "data-pipeline-ffi")]
#[allow(unused_imports)]
pub use data_pipeline_ffi::*;
Contributor:

👍

@hoolioh hoolioh force-pushed the ban/cleaned-up-data-pipeline-poc branch from 4a6a251 to 4b42ada on April 10, 2024 12:51
@hoolioh hoolioh force-pushed the ban/cleaned-up-data-pipeline-poc branch from 4b42ada to e80bf85 on April 10, 2024 13:30
hoolioh and others added 7 commits on April 10, 2024 16:19

* This is to avoid building a 300MB+ nuget package.
* Add data-pipeline-ffi symbols in order to treat the profiling shared library as the main artifact.
* Add the "data-pipeline-ffi" feature flag so data-pipeline items are included based on it.
* Add a flag to handle features in build-profiling-ffi.sh.
* Remove the data-pipeline-ffi script.
* Profiling tests are built with the data-pipeline feature enabled, hence the trace_exporter source file is built linking against the profiling library.
* Update the LICENSE-3rdparty file.
@bantonsson bantonsson force-pushed the ban/cleaned-up-data-pipeline-poc branch from e80bf85 to d8dccce on April 10, 2024 14:26
ddog_CharSlice language_version = DDOG_CHARSLICE_C("10.0");
ddog_CharSlice language_interpreter = DDOG_CHARSLICE_C("X");
ddog_TraceExporter* trace_exporter = ddog_trace_exporter_new(
host,
Contributor:

We have an endpoint object that could be useful to share.

@@ -97,7 +97,7 @@ jobs:
- name: "Generate profiling FFI"
Contributor:

Minor: I think we could have a general all-features build.
The profiling workflow currently uses GitLab to release.

Contributor:

The important part is having at least one build per delivered combination.

@r1viollet (Contributor) left a comment:

LGTM
Minor suggestions. Exciting to see this progress 👏

@bantonsson bantonsson merged commit c461065 into main Apr 10, 2024
20 checks passed
@bantonsson bantonsson deleted the ban/cleaned-up-data-pipeline-poc branch April 10, 2024 14:55
Labels: ci-build, data-pipeline, profiling