Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROF-9476] Add experimental profiling managed string storage #725

Open
wants to merge 57 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
57 commits
Select commit Hold shift + click to select a range
6e847e3
Introduce StringId copies of all functions that deal with Sample
ivoanjo Nov 5, 2024
c8a2e6e
Introduce StringStorage implementation
ivoanjo Nov 5, 2024
795d25f
Introduce StringStorage instance into Profile
ivoanjo Nov 5, 2024
8d68513
Ran `cargo fmt`
ivoanjo Nov 5, 2024
d3c7d6d
Introduce use of PersistentStringId in StringId* variants
ivoanjo Nov 5, 2024
7716d56
Ran `cargo fmt`
ivoanjo Nov 5, 2024
0570625
Fix label validation
ivoanjo Nov 5, 2024
2887d2f
Fix incorrect error message
ivoanjo Nov 5, 2024
81fb1ff
Remove duplicate comments from StringId variants
ivoanjo Nov 5, 2024
cc6b6c0
Extract method to avoid redundancy in `add_sample`
ivoanjo Nov 5, 2024
8fa0653
Introduce `PersistentStringId::new` to make it easier to create struc…
ivoanjo Nov 5, 2024
9413b95
Import + wire up ffi changes for string storage
ivoanjo Nov 6, 2024
82b1593
Fix incorrect operation names on errors
ivoanjo Nov 6, 2024
a65c31d
Isolate `SimpleStringStorage` into seprate module
ivoanjo Nov 6, 2024
36d5d32
Mark `last_usage_gen` as not in use
ivoanjo Nov 6, 2024
64d6277
Add TODO to ManagedStringStorage_intern
ivoanjo Nov 6, 2024
624ebd0
Make sure header gets refreshed when things change
ivoanjo Nov 6, 2024
30d9a65
Fix `ddog_prof_Profile_with_string_storage` not being correctly publi…
ivoanjo Nov 6, 2024
f9f3703
Remove unused includes
ivoanjo Nov 6, 2024
4aa5f44
Remove unused struct
ivoanjo Nov 6, 2024
0a99208
Add short-circuit for looking up empty strings
ivoanjo Nov 8, 2024
e105dd1
Replace `ManagedStringStorageResult` with equivalent `MaybeError`
ivoanjo Nov 11, 2024
04d76d2
Add TODO to `get_string` API
ivoanjo Nov 11, 2024
b273e47
Have intern receive a CharSlice unconditionally
ivoanjo Nov 11, 2024
560316f
Introduce `ManagedStringStorageNewResult`
ivoanjo Nov 11, 2024
66cf6fb
Add TODO about shape of ManagedStringStorage argument
ivoanjo Nov 11, 2024
08f5760
Rename `PersistentStringId` -> `ManagedStringId`
ivoanjo Nov 11, 2024
621ed3a
Also expose `ManagedStringId` to ffi instead of u32
ivoanjo Nov 11, 2024
4be030b
Avoid using `expect` in string storage ffi APIs
ivoanjo Nov 11, 2024
95555be
Panic on `get_seq_num` == 0 to make sure callers never do it
ivoanjo Nov 26, 2024
6a658b4
Disallow attempts to unintern with id == 0
ivoanjo Nov 26, 2024
00a76f2
Fix incorrect comment
ivoanjo Nov 26, 2024
e1c237d
Merge branch 'main' into ivoanjo/prof-9476-managed-string-storage-try…
ivoanjo Nov 26, 2024
0b3f57d
Add missing license headers to new files
ivoanjo Nov 26, 2024
3cb2cb0
Merge branch 'main' into ivoanjo/prof-9476-managed-string-storage-try…
ivoanjo Nov 28, 2024
172a0a2
Revert "Make sure header gets refreshed when things change"
ivoanjo Jan 3, 2025
e291371
Merge branch 'main' into ivoanjo/prof-9476-managed-string-storage-try…
ivoanjo Jan 3, 2025
d414999
Remove redundant must_use as pointed out by clippy
ivoanjo Jan 3, 2025
fdfaa1f
Apply another clippy suggestion
ivoanjo Jan 3, 2025
c634964
More linting... Why can't all of this be applied in one step... :/
ivoanjo Jan 3, 2025
b335d14
Merge branch 'main' into ivoanjo/prof-9476-managed-string-storage-try…
morrisonlevi Jan 3, 2025
d4194a2
Improve API description when reading string from managed string storage
ivoanjo Jan 6, 2025
28782b5
Remove `StringStorage` / `SimpleStringStorage` experimental trait/imp…
ivoanjo Jan 6, 2025
fe29cf9
Cargo fmt
ivoanjo Jan 6, 2025
03e7649
Use `NonZeroU32` in managed string table functions that shouldn't be …
ivoanjo Jan 7, 2025
79901eb
Minor tweak to comment
ivoanjo Jan 7, 2025
1f2b953
Simplify cached field into tuple, as both are related
ivoanjo Jan 7, 2025
2f1dc92
Remove `last_usage_gen` from string storage since we're not using it …
ivoanjo Jan 7, 2025
3e94997
Check for string storage initialization to make check inside resolve …
ivoanjo Jan 7, 2025
8a1ebcc
Document that failure to acquire lock is unlikely
ivoanjo Jan 7, 2025
800d5ee
Allow intern to fail cleanly rather than blowing up
ivoanjo Jan 7, 2025
9211511
Refactor: Allow Profile::resolve to fail
ivoanjo Jan 7, 2025
8387b85
Properly handle invalid ManagedStringIds by returning errors
ivoanjo Jan 7, 2025
c2be336
Avoid updating reference counts on the empty string
ivoanjo Jan 8, 2025
ce63b4e
Add fast-path for empty string interning
ivoanjo Jan 8, 2025
20b9469
Make `intern_new` private
ivoanjo Jan 13, 2025
3d22213
Merge branch 'main' into ivoanjo/prof-9476-managed-string-storage-try…
ivoanjo Jan 14, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions profiling-ffi/cbindgen.toml
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,8 @@ renaming_overrides_prefixing = true
"SendResult" = "ddog_prof_Exporter_SendResult"
"SerializeResult" = "ddog_prof_Profile_SerializeResult"
"Slice_File" = "ddog_prof_Exporter_Slice_File"
"ManagedStringStorage" = "ddog_prof_ManagedStringStorage"
"ManagedStringId" = "ddog_prof_ManagedStringId"

[export.mangle]
rename_types = "PascalCase"
Expand Down
1 change: 1 addition & 0 deletions profiling-ffi/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ pub use symbolizer_ffi::*;

mod exporter;
mod profiles;
mod string_storage;

// re-export crashtracker ffi
#[cfg(feature = "crashtracker-ffi")]
Expand Down
165 changes: 149 additions & 16 deletions profiling-ffi/src/profiles.rs
Original file line number Diff line number Diff line change
@@ -1,8 +1,11 @@
// Copyright 2021-Present Datadog, Inc. https://www.datadoghq.com/
// SPDX-License-Identifier: Apache-2.0

use crate::string_storage::get_inner_string_storage;
use crate::string_storage::ManagedStringStorage;
use anyhow::Context;
use datadog_profiling::api;
use datadog_profiling::api::ManagedStringId;
use datadog_profiling::internal;
use datadog_profiling::internal::ProfiledEndpointsStats;
use ddcommon_ffi::slice::{AsBytes, CharSlice, Slice};
Expand Down Expand Up @@ -128,9 +131,11 @@ pub struct Period<'a> {
#[derive(Copy, Clone, Default)]
pub struct Label<'a> {
pub key: CharSlice<'a>,
pub key_id: ManagedStringId,

/// At most one of the following must be present
pub str: CharSlice<'a>,
pub str_id: ManagedStringId,
ivoanjo marked this conversation as resolved.
Show resolved Hide resolved
pub num: i64,

/// Should only be present when num is present.
Expand All @@ -141,35 +146,29 @@ pub struct Label<'a> {
/// units and units like "seconds" and "nanoseconds" as time units,
/// and apply appropriate unit conversions to these.
pub num_unit: CharSlice<'a>,
pub num_unit_id: ManagedStringId,
}

#[repr(C)]
#[derive(Copy, Clone, Default)]
pub struct Function<'a> {
/// Name of the function, in human-readable form if available.
pub name: CharSlice<'a>,
pub name_id: ManagedStringId,

/// Name of the function, as identified by the system.
/// For instance, it can be a C++ mangled name.
pub system_name: CharSlice<'a>,
pub system_name_id: ManagedStringId,

/// Source file containing the function.
pub filename: CharSlice<'a>,
pub filename_id: ManagedStringId,

/// Line number in source file.
pub start_line: i64,
}

#[repr(C)]
#[derive(Copy, Clone)]
pub struct Line<'a> {
/// The corresponding profile.Function for this line.
pub function: Function<'a>,

/// Line number in source code.
pub line: i64,
}

#[repr(C)]
#[derive(Copy, Clone, Default)]
pub struct Location<'a> {
Expand Down Expand Up @@ -202,11 +201,13 @@ pub struct Mapping<'a> {
/// disk for the main binary and shared libraries, or virtual
/// abstractions like "[vdso]".
pub filename: CharSlice<'a>,
pub filename_id: ManagedStringId,

/// A string that uniquely identifies a particular program version
/// with high probability. E.g., for binaries generated by GNU tools,
/// it could be the contents of the .note.gnu.build-id field.
pub build_id: CharSlice<'a>,
pub build_id_id: ManagedStringId,
}

#[repr(C)]
Expand Down Expand Up @@ -244,6 +245,22 @@ impl<'a> TryFrom<&'a Mapping<'a>> for api::Mapping<'a> {
}
}

impl<'a> TryFrom<&'a Mapping<'a>> for api::StringIdMapping {
type Error = Utf8Error;

fn try_from(mapping: &'a Mapping<'a>) -> Result<Self, Self::Error> {
let filename = ManagedStringId::new(mapping.filename_id.value);
let build_id = ManagedStringId::new(mapping.build_id_id.value);
Ok(Self {
memory_start: mapping.memory_start,
memory_limit: mapping.memory_limit,
file_offset: mapping.file_offset,
filename,
build_id,
})
}
}

impl<'a> From<&'a ValueType<'a>> for api::ValueType<'a> {
fn from(vt: &'a ValueType<'a>) -> Self {
Self::new(
Expand Down Expand Up @@ -278,13 +295,18 @@ impl<'a> TryFrom<&'a Function<'a>> for api::Function<'a> {
}
}

impl<'a> TryFrom<&'a Line<'a>> for api::Line<'a> {
impl<'a> TryFrom<&'a Function<'a>> for api::StringIdFunction {
type Error = Utf8Error;

fn try_from(line: &'a Line<'a>) -> Result<Self, Self::Error> {
fn try_from(function: &'a Function<'a>) -> Result<Self, Self::Error> {
let name = ManagedStringId::new(function.name_id.value);
let system_name = ManagedStringId::new(function.system_name_id.value);
let filename = ManagedStringId::new(function.filename_id.value);
Ok(Self {
function: api::Function::try_from(&line.function)?,
line: line.line,
name,
system_name,
filename,
start_line: function.start_line,
})
}
}
Expand All @@ -304,6 +326,21 @@ impl<'a> TryFrom<&'a Location<'a>> for api::Location<'a> {
}
}

impl<'a> TryFrom<&'a Location<'a>> for api::StringIdLocation {
type Error = Utf8Error;

fn try_from(location: &'a Location<'a>) -> Result<Self, Self::Error> {
let mapping = api::StringIdMapping::try_from(&location.mapping)?;
let function = api::StringIdFunction::try_from(&location.function)?;
Ok(Self {
mapping,
function,
address: location.address,
line: location.line,
})
}
}

impl<'a> TryFrom<&'a Label<'a>> for api::Label<'a> {
type Error = Utf8Error;

Expand All @@ -327,6 +364,33 @@ impl<'a> TryFrom<&'a Label<'a>> for api::Label<'a> {
}
}

impl<'a> TryFrom<&'a Label<'a>> for api::StringIdLabel {
type Error = Utf8Error;

fn try_from(label: &'a Label<'a>) -> Result<Self, Self::Error> {
let key = ManagedStringId::new(label.key_id.value);
let str = label.str_id.value;
let str = if str == 0 {
None
} else {
Some(ManagedStringId::new(str))
};
let num_unit = label.num_unit_id.value;
let num_unit = if num_unit == 0 {
None
} else {
Some(ManagedStringId::new(num_unit))
};

Ok(Self {
key,
str,
num: label.num,
num_unit,
})
}
}

impl<'a> TryFrom<Sample<'a>> for api::Sample<'a> {
type Error = Utf8Error;

Expand All @@ -352,6 +416,31 @@ impl<'a> TryFrom<Sample<'a>> for api::Sample<'a> {
}
}

impl TryFrom<Sample<'_>> for api::StringIdSample {
type Error = Utf8Error;

fn try_from(sample: Sample<'_>) -> Result<Self, Self::Error> {
let mut locations: Vec<api::StringIdLocation> = Vec::with_capacity(sample.locations.len());

for location in sample.locations.as_slice().iter() {
locations.push(location.try_into()?)
}

let values: Vec<i64> = sample.values.into_slice().to_vec();

let mut labels: Vec<api::StringIdLabel> = Vec::with_capacity(sample.labels.len());
for label in sample.labels.as_slice().iter() {
labels.push(label.try_into()?);
}

Ok(Self {
locations,
values,
labels,
})
}
}

/// Create a new profile with the given sample types. Must call
/// `ddog_prof_Profile_drop` when you are done with the profile.
///
Expand Down Expand Up @@ -380,6 +469,30 @@ pub unsafe extern "C" fn ddog_prof_Profile_new(
ProfileNewResult::Ok(ffi_profile)
}

/// Same as `ddog_profile_new` but also configures a `string_storage` for the profile.
#[no_mangle]
#[must_use]
/// TODO: @ivoanjo Should this take a `*mut ManagedStringStorage` like Profile APIs do?
pub unsafe extern "C" fn ddog_prof_Profile_with_string_storage(
sample_types: Slice<ValueType>,
period: Option<&Period>,
start_time: Option<&Timespec>,
string_storage: ManagedStringStorage,
) -> ProfileNewResult {
let types: Vec<api::ValueType> = sample_types.into_slice().iter().map(Into::into).collect();
let start_time = start_time.map_or_else(SystemTime::now, SystemTime::from);
let period = period.map(Into::into);
let string_storage = match get_inner_string_storage(string_storage, true) {
Ok(string_storage) => string_storage,
Err(err) => return ProfileNewResult::Err(err.into()),
};

let internal_profile =
internal::Profile::with_string_storage(start_time, &types, period, string_storage);
let ffi_profile = Profile::new(internal_profile);
ProfileNewResult::Ok(ffi_profile)
}

/// # Safety
/// The `profile` can be null, but if non-null it must point to a Profile
/// made by this module, which has not previously been dropped.
Expand Down Expand Up @@ -433,8 +546,16 @@ pub unsafe extern "C" fn ddog_prof_Profile_add(
) -> ProfileResult {
(|| {
let profile = profile_ptr_to_inner(profile)?;
let sample = sample.try_into()?;
profile.add_sample(sample, timestamp)
let uses_string_ids = sample
.labels
.first()
.is_some_and(|label| label.key.is_empty() && label.key_id.value > 0);

if uses_string_ids {
profile.add_string_id_sample(sample.try_into()?, timestamp)
} else {
profile.add_sample(sample.try_into()?, timestamp)
}
})()
.context("ddog_prof_Profile_add failed")
.into()
Expand Down Expand Up @@ -808,8 +929,11 @@ mod tests {
mapping,
function: Function {
name: "{main}".into(),
name_id: ManagedStringId { value: 0 },
system_name: "{main}".into(),
system_name_id: ManagedStringId { value: 0 },
filename: "index.php".into(),
filename_id: ManagedStringId { value: 0 },
start_line: 0,
},
..Default::default()
Expand Down Expand Up @@ -870,8 +994,11 @@ mod tests {
mapping,
function: Function {
name: "{main}".into(),
name_id: ManagedStringId { value: 0 },
system_name: "{main}".into(),
system_name_id: ManagedStringId { value: 0 },
filename: "index.php".into(),
filename_id: ManagedStringId { value: 0 },
start_line: 0,
},
..Default::default()
Expand All @@ -880,8 +1007,11 @@ mod tests {
mapping,
function: Function {
name: "test".into(),
name_id: ManagedStringId { value: 0 },
system_name: "test".into(),
system_name_id: ManagedStringId { value: 0 },
filename: "index.php".into(),
filename_id: ManagedStringId { value: 0 },
start_line: 3,
},
line: 4,
Expand All @@ -890,9 +1020,12 @@ mod tests {
let values: Vec<i64> = vec![1];
let labels = vec![Label {
key: Slice::from("pid"),
key_id: ManagedStringId { value: 0 },
str: Slice::from(""),
str_id: ManagedStringId { value: 0 },
num: 101,
num_unit: Slice::from(""),
num_unit_id: ManagedStringId { value: 0 },
}];

let main_sample = Sample {
Expand Down
Loading
Loading