Support policy-based transport header extraction and propagation in OtapPdata #2508
Description
Pre-filing checklist
- I searched existing issues and didn't find a duplicate
Component(s)
Rust OTAP dataflow (rust/otap-dataflow/)
Objective
Introduce a policy-based transport header capability that allows header-capable receivers to extract selected inbound transport headers into OtapPdata context, and allows header-capable exporters to propagate selected stored headers on egress.
Rationale
Protocols such as OTLP and OTAP can carry important request-scoped metadata outside the payload itself, for example in HTTP headers or gRPC metadata. That metadata can represent trace context, tenant, auth, routing, correlation, or policy hints that need to survive the pipeline.
OtapPdata already has context, but it does not currently preserve transport headers. This makes end-to-end propagation and policy-driven use of request metadata difficult.
A shared policy-based model is the preferred direction because multiple receivers and exporters are expected to need the same capability. Receivers and exporters can still expose node-specific config (by overriding the policy locally), but that config should primarily select or override shared transport-header policy rules rather than redefine the feature independently at every node.
Scope
Add a new transport_headers policy family and a protocol-neutral transport header abstraction in OtapPdata context.
The transport header abstraction should preserve the semantics needed across protocols:
- duplicate header names must be preserved
- matching should use a normalized logical name
- the original wire name should remain available
- values should support both text and binary
- values should be stored as raw bytes, not as protocol-specific header objects
A conceptual TransportHeader entry should contain:
- name: normalized logical name used for matching and policy lookup
- wire_name: original header or metadata name observed on ingress
- value_kind: text or binary
- value: raw bytes
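A minimal Rust sketch of the conceptual entry described above may help make the semantics concrete. The type and method names (`TransportHeaders`, `normalize_name`, `get_all`) are hypothetical and are not part of the existing OtapPdata API. ASCII lowercasing is assumed as the normalization rule, which this issue does not specify.

```rust
/// Whether a header value is textual or opaque binary.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValueKind {
    Text,
    Binary,
}

/// One captured transport header (hypothetical layout mirroring the fields above).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TransportHeader {
    /// Normalized logical name used for matching and policy lookup.
    pub name: String,
    /// Original header or metadata name observed on ingress.
    pub wire_name: String,
    pub value_kind: ValueKind,
    /// Raw bytes, independent of any protocol-specific header type.
    pub value: Vec<u8>,
}

/// Ordered collection. A Vec is used instead of a map so that
/// duplicate header names are preserved in arrival order.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct TransportHeaders {
    entries: Vec<TransportHeader>,
}

/// Assumed normalization rule: trim and ASCII-lowercase the wire name.
pub fn normalize_name(wire_name: &str) -> String {
    wire_name.trim().to_ascii_lowercase()
}

impl TransportHeaders {
    /// Appends an entry, keeping the wire name alongside the normalized name.
    pub fn push(&mut self, wire_name: &str, value_kind: ValueKind, value: &[u8]) {
        self.entries.push(TransportHeader {
            name: normalize_name(wire_name),
            wire_name: wire_name.to_string(),
            value_kind,
            value: value.to_vec(),
        });
    }

    /// Returns every entry whose normalized name matches, duplicates included.
    pub fn get_all<'a>(&'a self, name: &str) -> impl Iterator<Item = &'a TransportHeader> + 'a {
        let wanted = normalize_name(name);
        self.entries.iter().filter(move |h| h.name == wanted)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn is_empty(&self) -> bool {
        self.entries.is_empty()
    }
}
```

With this shape, pushing `X-Tenant-Id` and `x-tenant-id` yields two entries that both match a lookup for `x-tenant-id`, while each keeps its own wire name.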
The new transport_headers policy family should define:
- extraction rules
- propagation rules
- shared limits and failure handling
Receivers and exporters should activate policy rules from node config. A receiver should be able to opt into extraction rules, and an exporter should be able to opt into propagation rules. Processors should be transport-header transparent unless they explicitly add support for reading or mutating this context.
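As a rough illustration of the extraction rules and shared limits listed above, the following Rust sketch applies a set of capture rules to inbound headers. Every name here (`CaptureLimits`, `CaptureRule`, `capture`) is hypothetical. It assumes that oversized headers are dropped (the proposed `on_error: drop` default), that a missing `store_as` falls back to the first `match_names` entry in lowercase, and that a `-bin` name suffix marks a binary value, following the gRPC metadata convention.

```rust
/// Limits taken from the `defaults` block of the proposed capture policy.
pub struct CaptureLimits {
    pub max_entries: usize,
    pub max_name_bytes: usize,
    pub max_value_bytes: usize,
}

/// One extraction rule (hypothetical shape).
pub struct CaptureRule {
    pub match_names: Vec<String>,
    pub store_as: Option<String>,
    pub sensitive: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ValueKind {
    Text,
    Binary,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Captured {
    pub name: String,
    pub wire_name: String,
    pub value_kind: ValueKind,
    pub sensitive: bool,
    pub value: Vec<u8>,
}

/// Applies opt-in capture rules to inbound (wire_name, value) pairs.
pub fn capture(
    inbound: &[(&str, &[u8])],
    rules: &[CaptureRule],
    limits: &CaptureLimits,
) -> Vec<Captured> {
    let mut out = Vec::new();
    for (wire_name, value) in inbound {
        if out.len() >= limits.max_entries {
            break; // entry limit reached, stop capturing
        }
        let normalized = wire_name.to_ascii_lowercase();
        // Opt-in: headers that match no rule are never captured.
        let Some(rule) = rules
            .iter()
            .find(|r| r.match_names.iter().any(|m| m.eq_ignore_ascii_case(&normalized)))
        else {
            continue;
        };
        // Assumed `on_error: drop`: skip only the offending header.
        if wire_name.len() > limits.max_name_bytes || value.len() > limits.max_value_bytes {
            continue;
        }
        let name = match &rule.store_as {
            Some(stored) => stored.clone(),
            None => rule.match_names[0].to_ascii_lowercase(),
        };
        // Assumed inference: gRPC-style `-bin` suffix marks binary values.
        let value_kind = if normalized.ends_with("-bin") {
            ValueKind::Binary
        } else {
            ValueKind::Text
        };
        out.push(Captured {
            name,
            wire_name: wire_name.to_string(),
            value_kind,
            sensitive: rule.sensitive,
            value: value.to_vec(),
        });
    }
    out
}
```

Because matching is rule-driven, a header such as `user-agent` that no rule names is silently ignored, which is consistent with the requirement that extraction be explicit and opt-in.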
The design should be protocol-neutral. OTLP/gRPC and OTAP/gRPC are suitable initial targets, but the abstraction should also cover other header-capable protocols.
Acceptance Criteria
- OtapPdata can carry transport headers in context without losing duplicate-name or binary-value semantics.
- A shared transport_headers policy family exists in the config model.
- Receivers can activate extraction rules from policy through node config.
- Exporters can activate propagation rules from policy through node config.
- A normal in-pipeline path preserves transport headers through at least one processor.
- An end-to-end pipeline demonstrates extraction on receiver:otlp and receiver:otap, preservation through a basic processor, and propagation on exporter:otap.
- Extraction and propagation are explicit and opt-in. The default behavior is not to forward all inbound headers.
- Tests cover extraction, propagation, filtering, limits, and invalid-header handling.
Dependencies or Blockers
No response
Additional Context
Proposed policy shape:
version: otel_dataflow/v1
groups:
  default:
    pipelines:
      ingest:
        policies:
          header_capture:
            defaults:
              max_entries: 32        # default: 32 captured headers per message
              max_name_bytes: 128    # default: 128 bytes per header name
              max_value_bytes: 4096  # default: 4096 bytes per header value
              on_error: drop         # default: drop the offending captured header
            # Per-header defaults:
            # - store_as: first matched name, normalized
            # - value_kind: text unless protocol-specific inference marks it binary
            # - sensitive: false
            headers:
              - match_names: ["x-tenant-id"]
                store_as: tenant_id
              - match_names: ["x-request-id"]  # store_as defaults to the extracted header name
              - match_names: ["authorization"]
                sensitive: true
          header_propagation:
            default:
              selector: all_captured  # default: all_captured
              action: propagate       # default: propagate
              name: preserve          # default: preserve stored header name
              on_error: drop          # default: drop the offending outbound header
            # Per-override defaults:
            # - action: propagate
            # - name: preserve
            # - on_error: drop
            overrides:
              - match:
                  stored_names: ["authorization"]
                action: drop
        nodes:
          otlp_ingest:
            type: receiver:otlp
            config:
              protocols:
                grpc:
                  listening_addr: "0.0.0.0:4317"
          otap_ingest:
            type: receiver:otap
            header_capture:  # node-level override
              headers:
                - match_names: ["x-request-id"]
                  store_as: request_id
            config:
              listening_addr: "0.0.0.0:50051"
          batch:
            type: processor:batch
            config: {}
          otap_export:
            type: exporter:otap
            config:
              grpc_endpoint: "http://127.0.0.1:60051"
        connections:
          - from: otlp_ingest
            to: batch
          - from: otap_ingest
            to: batch
          - from: batch
            to: otap_export
Behavior in this example:
- otlp_ingest uses the pipeline header_capture policy and captures x-tenant-id (stored as tenant_id), x-request-id, and authorization.
- otap_ingest overrides header_capture, so it captures only x-request-id, stored as request_id. The tighter limits intended for this node are not shown in the example above.
- batch preserves the captured headers unchanged.
- otap_export propagates all captured headers by default.
- authorization is dropped on egress by the propagation override.
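The egress behavior described in this example can be sketched in Rust as follows. The types (`PropagationPolicy`, `Override`, `StoredHeader`) are hypothetical, and the sketch covers only the `action` field with `name: preserve` semantics, meaning the outbound header uses the stored name. The first matching override is assumed to win, which the proposal does not state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Propagate,
    Drop,
}

/// One entry of the `overrides` list (hypothetical shape).
pub struct Override {
    pub stored_names: Vec<String>,
    pub action: Action,
}

/// Egress policy: a default action plus per-name overrides.
pub struct PropagationPolicy {
    pub default_action: Action,
    pub overrides: Vec<Override>,
}

/// A header previously captured into pipeline context.
pub struct StoredHeader {
    pub name: String,
    pub value: Vec<u8>,
}

impl PropagationPolicy {
    /// Resolves the action for a stored name. Assumption: first matching override wins.
    pub fn action_for(&self, stored_name: &str) -> Action {
        self.overrides
            .iter()
            .find(|o| o.stored_names.iter().any(|n| n.as_str() == stored_name))
            .map(|o| o.action)
            .unwrap_or(self.default_action)
    }

    /// Returns the (name, value) pairs an exporter would emit,
    /// keeping stored order and duplicate names.
    pub fn outbound<'a>(&self, stored: &'a [StoredHeader]) -> Vec<(&'a str, &'a [u8])> {
        stored
            .iter()
            .filter(|h| self.action_for(&h.name) == Action::Propagate)
            .map(|h| (h.name.as_str(), h.value.as_slice()))
            .collect()
    }
}
```

With a default of `Action::Propagate` and a single override that drops `authorization`, stored headers `tenant_id`, `request_id`, and `authorization` produce only the first two on egress.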