feat: add OTLP export and W3C trace context propagation to tracing #9414
I have read the CLA document and I hereby sign the CLA.
No issues found across 6 files
Architecture diagram
sequenceDiagram
participant Env as Environment
participant Init as _set_tracer_provider()
participant Mid as OpenTelemetryMiddleware
participant Prop as W3C Propagator
participant Exp as SpanExporter
participant Ext as OTLP Collector / Disk
Note over Env, Ext: Initialization (Server Startup)
Init->>Env: Read MARIMO_TRACING
opt MARIMO_TRACING is true
Init->>Env: NEW: Read OTEL_EXPORTER_OTLP_ENDPOINT
alt NEW: OTLP Endpoint exists AND grpc installed
Init->>Env: NEW: Read OTEL_SERVICE_NAME & OTEL_RESOURCE_ATTRIBUTES
Init->>Init: NEW: _build_resource()
Init->>Exp: NEW: Initialize OTLPSpanExporter (gRPC)
else NEW: Fallback or No OTLP
Init->>Exp: CHANGED: Initialize FileExporter (Local JSONL)
end
end
Note over Env, Ext: Request Handling Flow
rect rgb(240, 240, 240)
Env->>Mid: Incoming Request (with traceparent header)
opt TRACING enabled
Mid->>Prop: NEW: extract(carrier=request.headers)
Prop-->>Mid: Return W3C Trace Context
Mid->>Mid: CHANGED: start_as_current_span(context=ctx)
Note right of Mid: Span is now child of external caller
end
Mid->>Mid: Process Request
opt TRACING enabled
Mid->>Exp: Batch spans for export
alt NEW: Using OTLP
Exp->>Ext: Export via gRPC (OTLP Collector)
else CHANGED: Using File
Exp->>Ext: Append to spans.jsonl (Local Disk)
end
end
end
Mid-->>Env: Response (200 OK)
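The "extract" step in the request flow above can be illustrated without the OpenTelemetry SDK. The following is a minimal standard-library sketch of what a W3C propagator reads from a `traceparent` header; it is not the middleware's code. The names `parse_traceparent` and `RemoteParent` are hypothetical, the header lookup assumes lowercase keys, and only the four-field layout (`version-traceid-parentid-flags`) is accepted.

```python
import re
from typing import Mapping, NamedTuple, Optional

# Four lowercase-hex fields: version, trace-id, parent-id, trace-flags.
_TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class RemoteParent(NamedTuple):
    """Hypothetical container for the caller's trace context."""

    trace_id: str
    parent_span_id: str
    sampled: bool


def parse_traceparent(headers: Mapping[str, str]) -> Optional[RemoteParent]:
    """Return the caller's context, or None if the header is absent or invalid."""
    match = _TRACEPARENT_RE.match(headers.get("traceparent", "").strip())
    if match is None or match["version"] == "ff":
        return None  # malformed value, or the reserved invalid version
    if set(match["trace_id"]) == {"0"} or set(match["parent_id"]) == {"0"}:
        return None  # all-zero IDs are not valid
    sampled = bool(int(match["flags"], 16) & 0x01)  # lowest bit is the sampled flag
    return RemoteParent(match["trace_id"], match["parent_id"], sampled)
```

When a valid parent is returned, the server span can reuse its `trace_id` and record `parent_span_id` as its parent, which is what makes the span a child of the external caller. When `None` is returned, the span starts a new trace.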
there are many unrelated tests breaking on main, but this should be fixed shortly
@tigretigre, you can rebase now off our
When OTEL_EXPORTER_OTLP_ENDPOINT is set, the tracer provider now exports spans via gRPC OTLP instead of the local JSONL file. This lets marimo participate in distributed tracing stacks (Jaeger, Grafana Tempo, GCP Cloud Trace, etc.) without any code changes to notebooks.

The OpenTelemetryMiddleware now extracts incoming W3C TraceContext (traceparent header) so server spans become children of the caller trace, enabling end-to-end distributed trace visibility.

- New otel optional-dependencies group
- _set_tracer_provider branches on OTEL_EXPORTER_OTLP_ENDPOINT
- Resource built from OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES
- Graceful fallback to file export if grpc exporter not installed
- Tests for tracer provider selection and middleware propagation

Made-with: Cursor
for more information, see https://pre-commit.ci
8461886 to 5ad3e07
@cubic-dev-ai re-review
@mscolnick I have started the AI code review. It will take a few minutes to complete.
@tigretigre, this looks great, thank you for the changes!
2 issues found across 6 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="tests/test_tracer.py">
<violation number="1" location="tests/test_tracer.py:55">
P2: Importing `marimo._tracer` inside the patched block runs its module-level tracer setup, so this mock can be called twice before the assertion. That makes the `assert_called_once_with()` check order-dependent.</violation>
</file>
<file name="marimo/_tracer.py">
<violation number="1" location="marimo/_tracer.py:166">
P2: Explicitly passing `service.name` to `Resource.create()` silently overrides any user-provided `OTEL_SERVICE_NAME` or `OTEL_RESOURCE_ATTRIBUTES` environment variables. Merge the fallback resource instead.</violation>
</file>
Architecture diagram
sequenceDiagram
participant Caller as External Caller / Client
participant MW as OpenTelemetryMiddleware
participant OTel as OpenTelemetry SDK
participant Tracer as marimo._tracer
participant Collector as OTLP Collector (Jaeger/Tempo)
participant FS as Local Filesystem (~/.marimo/traces)
Note over Caller,FS: Runtime Request Flow with Distributed Tracing
Caller->>MW: Request (Headers: traceparent)
opt MARIMO_TRACING is enabled
MW->>OTel: NEW: extract(request.headers)
OTel-->>MW: Remote Trace Context
MW->>MW: NEW: start_as_current_span(context=extracted_ctx)
Note over Tracer: _set_tracer_provider logic (CHANGED)
alt OTEL_EXPORTER_OTLP_ENDPOINT is set
Tracer->>Tracer: Check OTEL_EXPORTER_OTLP_PROTOCOL
alt Protocol is grpc OR http/protobuf
alt Required libraries installed (marimo[otel])
Tracer->>Collector: NEW: Export Batch via OTLP
else Missing grpc/http library
Tracer->>Tracer: Log Warning (Fallback)
Tracer->>FS: CHANGED: Export to spans.jsonl
end
else Unsupported protocol
Tracer->>FS: CHANGED: Export to spans.jsonl
end
else Default (No OTLP Endpoint)
Tracer->>FS: Export to spans.jsonl
end
end
MW->>MW: Process Request logic
MW-->>Caller: Response
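The exporter-selection branches in the diagram above can be read as a small decision function. This is a sketch under stated assumptions, not the code in `marimo/_tracer.py`: `choose_exporter`, its `installed` argument, and the `default_protocol` value are hypothetical (the thread does not say which protocol applies when `OTEL_EXPORTER_OTLP_PROTOCOL` is unset), and it assumes tracing is already enabled.

```python
import logging
from typing import Mapping, Set

LOGGER = logging.getLogger(__name__)

# The two protocols named in the diagram.
SUPPORTED_PROTOCOLS = {"grpc", "http/protobuf"}


def choose_exporter(
    env: Mapping[str, str],
    installed: Set[str],
    default_protocol: str = "grpc",  # assumption: the default is not stated in the thread
) -> str:
    """Return "file" or "otlp:<protocol>" following the diagram's branches.

    `installed` lists the protocols whose exporter libraries can be imported.
    """
    if not env.get("OTEL_EXPORTER_OTLP_ENDPOINT"):
        return "file"  # default: no OTLP endpoint configured
    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL", default_protocol)
    if protocol not in SUPPORTED_PROTOCOLS:
        return "file"  # unsupported protocol
    if protocol not in installed:
        LOGGER.warning("OTLP exporter for %s not installed; using file export", protocol)
        return "file"  # graceful fallback when the library is missing
    return f"otlp:{protocol}"
```

Keeping the decision separate from exporter construction means each branch can be tested with plain dictionaries instead of patched environment variables.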
@@ -0,0 +1,258 @@
# Copyright 2026 Marimo. All rights reserved.
P2: Importing marimo._tracer inside the patched block runs its module-level tracer setup, so this mock can be called twice before the assertion. That makes the assert_called_once_with() check order-dependent.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/test_tracer.py, line 55:
<comment>Importing `marimo._tracer` inside the patched block runs its module-level tracer setup, so this mock can be called twice before the assertion. That makes the `assert_called_once_with()` check order-dependent.</comment>
<file context>
@@ -0,0 +1,258 @@
+ "opentelemetry.exporter.otlp.proto.http.trace_exporter.OTLPSpanExporter",
+ mock_exporter_cls,
+ ):
+ from marimo._tracer import _set_tracer_provider
+
+ _set_tracer_provider()
</file context>
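The order dependence described in this comment can be reproduced with the standard library alone. The sketch below uses hypothetical stand-ins (`demo_exporter_pkg`, `demo_tracer`, `calls_seen_by_mock`) rather than the real `marimo._tracer` module. It assumes only what the comment states: the module runs its setup function once at import time.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for the exporter package; not a real library.
exporter_pkg = types.ModuleType("demo_exporter_pkg")
exporter_pkg.Exporter = object
sys.modules["demo_exporter_pkg"] = exporter_pkg

# Hypothetical tracer module that runs its setup at import time.
TRACER_SOURCE = """
import demo_exporter_pkg

def set_tracer_provider():
    demo_exporter_pkg.Exporter()

set_tracer_provider()  # module-level setup: runs on first import only
"""


def import_demo_tracer() -> types.ModuleType:
    """Import demo_tracer, executing its body only on the first import."""
    if "demo_tracer" not in sys.modules:
        module = types.ModuleType("demo_tracer")
        sys.modules["demo_tracer"] = module
        exec(TRACER_SOURCE, module.__dict__)
    return sys.modules["demo_tracer"]


def calls_seen_by_mock(import_first: bool) -> int:
    """Count how often the patched exporter is constructed in one test run."""
    sys.modules.pop("demo_tracer", None)  # simulate a fresh test process
    if import_first:
        import_demo_tracer()  # import-time setup hits the real class, not the mock
    with mock.patch("demo_exporter_pkg.Exporter") as mock_cls:
        tracer = import_demo_tracer()
        tracer.set_tracer_provider()
        return mock_cls.call_count
```

If the module is first imported inside the patched block, the mock records both the import-time call and the explicit call, so `assert_called_once_with()` fails. If an earlier test already imported the module, it records one call. Importing before patching, or calling `reset_mock()` after the import, makes the count independent of test order.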
resource = Resource.create(
    {
        "service.name": "marimo",
    },
)
P2: Explicitly passing service.name to Resource.create() silently overrides any user-provided OTEL_SERVICE_NAME or OTEL_RESOURCE_ATTRIBUTES environment variables. Merge the fallback resource instead.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At marimo/_tracer.py, line 166:
<comment>Explicitly passing `service.name` to `Resource.create()` silently overrides any user-provided `OTEL_SERVICE_NAME` or `OTEL_RESOURCE_ATTRIBUTES` environment variables. Merge the fallback resource instead.</comment>
<file context>
@@ -103,37 +135,80 @@ def _set_tracer_provider() -> None:
+ )
+
+ if OTLPSpanExporter is not None:
+ resource = Resource.create(
+ {
+ "service.name": "marimo",
</file context>
- resource = Resource.create(
-     {
-         "service.name": "marimo",
-     },
- )
+ resource = Resource.create()
+ if str(resource.attributes.get("service.name", "")).startswith("unknown_service"):
+     resource = resource.merge(Resource({"service.name": "marimo"}))
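The precedence this suggestion aims for can be sketched in plain Python, without the OpenTelemetry SDK. The function `resolve_resource_attributes` is hypothetical and only models the intended outcome: user-provided `OTEL_SERVICE_NAME` and `OTEL_RESOURCE_ATTRIBUTES` win, and `"marimo"` is used only when no service name was supplied. It assumes the usual comma-separated `key=value` form with percent-encoded values.

```python
from typing import Dict, Mapping
from urllib.parse import unquote


def resolve_resource_attributes(
    env: Mapping[str, str],
    fallback_service_name: str = "marimo",
) -> Dict[str, str]:
    """Model which resource attributes should end up on exported spans."""
    attributes: Dict[str, str] = {}
    # OTEL_RESOURCE_ATTRIBUTES is a comma-separated list of key=value pairs.
    for pair in env.get("OTEL_RESOURCE_ATTRIBUTES", "").split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            attributes[key.strip()] = unquote(value.strip())
    # OTEL_SERVICE_NAME takes precedence over service.name in the list above.
    if env.get("OTEL_SERVICE_NAME"):
        attributes["service.name"] = env["OTEL_SERVICE_NAME"]
    # The fallback applies only when the user supplied no service name at all.
    attributes.setdefault("service.name", fallback_service_name)
    return attributes
```

The original code corresponds to assigning `"marimo"` unconditionally in the last step, which is why a user-provided service name would be lost.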
📝 Summary
When OTEL_EXPORTER_OTLP_ENDPOINT is set, the tracer provider now exports spans via gRPC OTLP instead of the local JSONL file. This lets marimo participate in distributed tracing stacks (Jaeger, Grafana Tempo, GCP Cloud Trace, etc.) without any code changes to notebooks.
The OpenTelemetryMiddleware now extracts incoming W3C TraceContext (traceparent header) so server spans become children of the caller trace, enabling end-to-end distributed trace visibility.
Made-with: Cursor