Skip to content

[do not merge] feat: Span streaming & new span API#5551

Draft
sentrivana wants to merge 168 commits intomasterfrom
feat/span-first
Draft

[do not merge] feat: Span streaming & new span API#5551
sentrivana wants to merge 168 commits intomasterfrom
feat/span-first

Conversation

@sentrivana
Copy link
Contributor

Introduce a new start_span() API with a simpler and more intuitive signature to eventually replace the original start_span() and start_transaction() APIs.

Additionally, introduce a new streaming mode (sentry_sdk.init(_experiments={"trace_lifecycle": "stream"})) that will send spans as they finish, rather than by transaction.

import sentry_sdk

sentry_sdk.init(
    _experiments={"trace_lifecycle": "stream"},
)

with sentry_sdk.traces.start_span(name="my_span"):
    ...

The new API MUST be used with the new streaming mode, and the old API MUST be used in the legacy non-streaming (static) mode.

Migration guide: getsentry/sentry-docs#16072

Notes

  • The diff is huge mostly because I've optimized for easy removal of legacy code in the next major, deliberately duplicating a lot. I'll of course split it up to reviewable PRs once ready.
    • Chose to go with a new file and a new span class so that we can just remove the old Span and drop the new StreamedSpan in tracing.py as a replacement.
  • The batcher for spans is a bit different from the logs and metrics batchers because it needs to batch by trace_id (we can't send spans from different traces in the same envelope).

Release Plan

  • There will be prereleases for internal testing.
  • We'll release the new API in a minor version as opt-in.
  • In the next major, we'll drop the legacy API.

Project

https://linear.app/getsentry/project/span-first-sdk-python-727da28dd037/overview

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing try/finally causes span leak when Redis command raises exception (sentry_sdk/integrations/redis/_async_common.py:137)

In _sentry_execute_command, the async version does not wrap the await old_execute_command() call in a try/finally block, unlike the sync version in _sync_common.py. If the Redis command raises an exception, db_span.__exit__() and cache_span.__exit__() will never be called, causing the spans to remain unclosed. This could lead to resource leaks and corrupted tracing state.

Scope corruption when real_putrequest raises exception in streaming mode (sentry_sdk/integrations/stdlib.py:127)

In the span streaming code path (lines 109-127), span.start() is called which sets the span as active on the scope and saves the old span in _context_manager_state. If real_putrequest() at line 148 raises an exception, span.end() in getresponse is never called, leaving the scope's span attribute pointing to an orphaned span and never restoring the previous span. This corrupts the scope state for subsequent operations in the same request/thread.

Dict rules with unrecognized keys in ignore_spans config silently ignore ALL spans (sentry_sdk/tracing_utils.py:1498)

When ignore_spans contains a dict with only unrecognized keys (e.g., a typo like {"nme": "/health"} instead of {"name": "/health"}), both name_matches and attributes_match default to True, causing the rule to match ALL spans. This could silently drop all trace data due to a simple configuration mistake.

Identified by Warden find-bugs

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SpanBatcher lacks unit tests for new size-based flush logic (sentry_sdk/_span_batcher.py:15)

The new SpanBatcher class introduces size-based flush limits (MAX_BYTES_BEFORE_FLUSH, _running_size, _estimate_size) and batching logic, but there are no unit tests for this component. Key untested scenarios include: size-based flush triggering, the envelope splitting logic when exceeding MAX_ENVELOPE_SIZE (1000 spans), and the queue overflow handling.

_finished check placed too late allows double-counting of lost events (sentry_sdk/traces.py:441)

The _finished check at line 441 occurs after the lost event recording for discarded spans (lines 425-438). If _end() is called multiple times on the same span (e.g., explicit end() call followed by context manager __exit__), the span will record lost events to the transport before detecting it's already finished. This causes incorrect telemetry data.

Identified by Warden code-review

This reverts commit f00ac21.
if operation_name:
_graphql_span.set_attribute("graphql.operation.name", operation_name)
_graphql_span.set_attribute("graphql.operation.type", operation_type)
_graphql_span.start()
Copy link
Contributor

@github-actions github-actions bot Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StreamedSpan.start() method does not exist - will cause AttributeError at runtime

The code calls _graphql_span.start() on line 158, but StreamedSpan does not have a public start() method. It only has a private _start() method that is automatically called in __init__. When span streaming is enabled, this will raise an AttributeError: 'StreamedSpan' object has no attribute 'start' at runtime, breaking GraphQL operation tracing entirely.

Verification

Read sentry_sdk/traces.py lines 199-538 and verified StreamedSpan class methods. Confirmed only _start() (private, called in init), end(), and finish() exist - no public start() method. Grep for def start\( in traces.py returned no matches.

Suggested fix: Remove the explicit .start() call since start_span() returns an already-started span. The _start() method is called automatically in StreamedSpan's __init__. Use the span as a context manager or remove the manual start call.

Also found at 2 additional locations
  • sentry_sdk/integrations/rust_tracing.py:221-221
  • sentry_sdk/integrations/stdlib.py:127-127

Identified by Warden code-review, find-bugs · B8S-FLX

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix attempt detected (commit a77f45c)

The commit added code that calls the non-existent start() method on StreamedSpan (line 158), which will cause AttributeError at runtime since StreamedSpan only has a private _start() method called automatically in init, not a public start() method.

The original issue appears unresolved. Please review and try again.

Evaluated by Warden

import copy
import sentry_sdk
from sentry_sdk._lru_cache import LRUCache
from sentry_sdk.tracing import Span
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feature flags silently ignored when using streaming span mode

The new StreamedSpan class does not implement the set_flag() method, and the isinstance(span, Span) check explicitly excludes it. When users enable streaming mode (_experiments={"trace_lifecycle": "stream"}), feature flag evaluations recorded via add_feature_flag() will not be attached to spans, causing silent data loss. This is a side effect that affects users who migrate to the new API.

Verification

Read sentry_sdk/feature_flags.py lines 64-66 showing the isinstance(span, Span) check before calling set_flag(). Grepped sentry_sdk/traces.py for 'set_flag' - no matches found. Read StreamedSpan class definition (lines 199-498) and confirmed no set_flag method exists. Read get_current_span() in tracing_utils.py:1099-1107 which shows it returns Union[Span, StreamedSpan].

Also found at 1 additional location
  • sentry_sdk/integrations/redis/_sync_common.py:148-148

Identified by Warden code-review · BUS-8HA

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants