Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(propagation): add DD_TRACE_PROPAGATION_BEHAVIOR_EXTRACT to handle x-org propagation #11631

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

ZStriker19
Copy link
Contributor

@ZStriker19 ZStriker19 commented Dec 5, 2024

Configuration: DD_TRACE_PROPAGATION_BEHAVIOR_EXTRACT
Values:
continue (default): The tracing library continues the trace from the incoming headers, if present. Also, incoming baggage is propagated.

restart: The tracing library always starts a new trace with a new trace-id. If there are distributed tracing headers, a span link will be added to reference the trace context. Also, incoming baggage is propagated.

ignore: The tracing library always starts a new trace with a new trace-id without creating any span links. Also, incoming baggage is dropped.
Note: We do not need to implement this at the moment because the workaround we have in place DD_TRACE_PROPAGATION_STYLE_EXTRACT=none performs this exact function of completely ignoring incoming trace context headers and baggage headers. If we receive feedback, we can add this new option that we maintain.

Result
✅ Head services: Customers can configure these services with the restart configuration so that they are always the root span, regardless of incoming distributed tracing headers.
✅ Internal services: By default, internal services will remain unchanged and continue to report complete traces to Datadog. If a user has configured an internal service with the restart configuration, this will lead to partial traces. Thankfully with Span Links we can easily direct the customer to the upstream trace (if it was sampled).
❌ Sometimes-head services: This solution does not solve for services that are sometimes head services and sometimes not. The best solution for these services is to set the restart configuration so that a new trace can begin and sampling priority can be recalculated, and in the case that this was not a head service then the upstream service can be found with a span link.
❌ Reentrant systems: This solution does not improve this scenario no matter how the reentrant service is configured. Under the continue configuration, the reentrant service is part of the same trace as the original root span but it is orphaned. Under the restart configuration, a new trace will be started and the span will be decorated with a span link containing the original trace-id but the contained span-id would not point to a span tracked in their Datadog organization.

Note: This contains a fix, allowing only baggage to be propagated.

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

Copy link
Contributor

github-actions bot commented Dec 5, 2024

CODEOWNERS have been resolved as:

releasenotes/notes/propagation_behavior_extract-3d16765cfd07485b.yaml   @DataDog/apm-python
.github/workflows/system-tests.yml                                      @DataDog/python-guild @DataDog/apm-core-python
ddtrace/_trace/tracer.py                                                @DataDog/apm-sdk-api-python
ddtrace/contrib/trace_utils.py                                          @DataDog/apm-core-python @DataDog/apm-idm-python
ddtrace/internal/constants.py                                           @DataDog/apm-core-python
ddtrace/propagation/http.py                                             @DataDog/apm-sdk-api-python
ddtrace/settings/config.py                                              @DataDog/python-guild @DataDog/apm-sdk-api-python
pyproject.toml                                                          @DataDog/python-guild
tests/telemetry/test_writer.py                                          @DataDog/apm-python
tests/tracer/test_propagation.py                                        @DataDog/apm-sdk-api-python
tests/utils.py                                                          @DataDog/python-guild

# loop through all extract propagation styles
else:
contexts, styles_w_ctx = HTTPPropagator._extract_configured_contexts_avail(normalized_headers)
if config._propagation_behavior_extract == _PROPAGATION_BEHAVIOR_CONTINUE:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Code Quality Violation

Found too many nested ifs within this condition (...read more)

Too many nested loops make the code hard to read and understand. Simplify your code by removing nesting levels and separate code in small units.

View in Datadog  Leave us feedback  Documentation

@pr-commenter
Copy link

pr-commenter bot commented Dec 5, 2024

Benchmarks

Benchmark execution time: 2025-01-09 21:43:21

Comparing candidate commit 57b3239 in PR branch zachg/handle_cross_org_propagation with baseline commit 04ee68f in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 394 metrics, 2 unstable metrics.

@ZStriker19 ZStriker19 force-pushed the zachg/handle_cross_org_propagation branch from 5d2512b to ef045c8 Compare December 6, 2024 22:53
context = propagator._extract(normalized_headers)
rec_context = propagator._extract(normalized_headers)
# if configured to restart, append our first context to the current context as a span link
if rec_context:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly changed behavior here so that the context never is set to None. If this goes against spec I can change back.

erikayasuda and others added 3 commits December 16, 2024 12:57
…1723)

This PR updates the `version_schema` in the `../pyproject.toml` file for
the 2.18 release branch from `release-branch-semver` to
`guess-next-dev`. This is to ensure that system tests work as intended
with backports to this release branch.

IMPORTANT: This PR needs to be merged before the first backport is
created for 2.18.Otherwise, system tests will not work as expected.

- [x] Reviewer check
…ncions [backport 2.18] (#11756)

Backport a4f0e38 from #11752 to 2.18.

This fix resolves an issue where AppSec was using a patched request and
builtins functions, creating telemetry errors.
## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

Co-authored-by: Alberto Vara <[email protected]>
@ZStriker19 ZStriker19 force-pushed the zachg/handle_cross_org_propagation branch 2 times, most recently from e21c4a3 to 1c4b37e Compare December 19, 2024 04:08
@ZStriker19 ZStriker19 force-pushed the zachg/handle_cross_org_propagation branch 5 times, most recently from e2911cd to 1568543 Compare December 19, 2024 23:22
@ZStriker19 ZStriker19 force-pushed the zachg/handle_cross_org_propagation branch from 0fb3045 to e1b10df Compare December 19, 2024 23:29
@datadog-dd-trace-py-rkomorn
Copy link

datadog-dd-trace-py-rkomorn bot commented Dec 20, 2024

Datadog Report

Branch report: zachg/handle_cross_org_propagation
Commit report: 57b3239
Test service: dd-trace-py

✅ 0 Failed, 87 Passed, 1468 Skipped, 4m 3.25s Total duration (35m 36.02s time saved)

@@ -20,6 +20,7 @@ jobs:
with:
persist-credentials: false
repository: 'DataDog/system-tests'
ref: 'zach.montoya/poc/propagation-behavior'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to remove these, just doing because struggling to build tracer locally

@@ -577,7 +575,10 @@ def activate_distributed_headers(tracer, int_config=None, request_headers=None,
# app = Flask(__name__) # Traced via Flask instrumentation
# app = DDWSGIMiddleware(app) # Extra layer on top for WSGI
current_context = tracer.current_trace_context()
if current_context and current_context.trace_id == context.trace_id:

# We accept incoming contexts with only baggage, however if we
Copy link
Contributor Author

@ZStriker19 ZStriker19 Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We accept contexts lacking trace_id to cover the following cases:

  1. Baggage only extraction
  2. DD_TRACE_PROPAGATION_BEHAVIOR_EXTRACT=restart which can result in span_link only extraction:

Unfortunately we can't compare baggages to tell if they're the same as the original headers, since the original incoming baggage could've been modified before we try to activate new headers in a nested framework. However, an alternative to tossing out an incoming baggage after we have an active context would be to merge them in some way, however I think that this is worse since it's possible customers may want to delete items from baggage.

I think the same applies to how span links could be modified as well.

@ZStriker19 ZStriker19 force-pushed the zachg/handle_cross_org_propagation branch from 5904cad to 4436896 Compare January 9, 2025 19:17
@ZStriker19
Copy link
Contributor Author

Ignore the failing system-test, it's just due to telemetry config differences since we added a new config.

@ZStriker19 ZStriker19 marked this pull request as ready for review January 9, 2025 20:15
@ZStriker19 ZStriker19 requested review from a team as code owners January 9, 2025 20:15
links = context._span_links if not parent else []

if trace_id:
if trace_id or links:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we now count just span_links as a non-empty context, we need to make sure we initialize the span with them if they're available. This seems a little worrisome because we rely on the span init to make the right choices when we pass in None for some of these values. However, it does work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants