Production-ready distributed tracing for AI agents and LLM applications
Traccia is a lightweight, high-performance Python SDK for observability and tracing of AI agents, LLM applications, and complex distributed systems. Built on OpenTelemetry standards with specialized instrumentation for AI workloads.
Traccia is available on PyPI.
- **Automatic Instrumentation**: Auto-patch OpenAI, Anthropic, requests, and HTTP libraries
- **Framework Integrations**: Support for LangChain, CrewAI, and the OpenAI Agents SDK
- **LLM-Aware Tracing**: Track tokens, costs, prompts, and completions automatically
- **OpenTelemetry Metrics**: Emit OTEL-compliant metrics for accurate cost/token tracking (independent of sampling)
- **Zero-Config Start**: Simple `init()` call with automatic config discovery
- **Decorator-Based**: Trace any function with the `@observe` decorator
- **Multiple Exporters**: OTLP (compatible with Grafana Tempo, Jaeger, Zipkin), Console, or File
- **Production-Ready**: Rate limiting, error handling, config validation, robust flushing
- **Guardrail Detection**: Passive, zero-overhead detection of guardrails in traces (explicit, provider-native, and heuristic)
- **Type-Safe**: Full Pydantic validation for configuration
- **High Performance**: Efficient batching, async support, minimal overhead
- **Secure**: No secrets in logs, configurable data truncation
```bash
pip install traccia
```

```python
from traccia import init, observe

# Initialize (auto-loads from traccia.toml if present)
init()

# Trace any function
@observe()
def my_function(x, y):
    return x + y

# That's it! Traces are automatically created and exported
result = my_function(2, 3)
```

```python
from traccia import init, observe
from openai import OpenAI

init()  # Auto-patches OpenAI
client = OpenAI()

@observe(as_type="llm")
def generate_text(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

# Automatically tracks: model, tokens, cost, prompt, completion, latency
text = generate_text("Write a haiku about Python")
```

Create a callback handler and pass it to `config={"callbacks": [traccia_handler]}`. Install the optional extra: `pip install traccia[langchain]`.
```python
from traccia import init
from traccia.integrations.langchain import CallbackHandler  # or TracciaCallbackHandler
from langchain_openai import ChatOpenAI

init()

# Create Traccia handler (no args)
traccia_handler = CallbackHandler()

# Use with any LangChain runnable
llm = ChatOpenAI(model="gpt-4o-mini")
result = llm.invoke(
    "Tell me a joke",
    config={"callbacks": [traccia_handler]}
)
```

Spans for LLM/chat model runs are created automatically, with the same attributes as direct OpenAI instrumentation (model, prompt, usage, cost).
Note: `pip install traccia[langchain]` installs traccia plus `langchain-core`; you need this extra to use the callback handler. If you already have `langchain-core` (e.g. from `langchain` or `langchain-openai`), a base `pip install traccia` may be enough at runtime, but `traccia[langchain]` is the supported way to get a compatible dependency set.
Traccia automatically detects and instruments the OpenAI Agents SDK when installed. No extra code needed:
```python
from traccia import init
from agents import Agent, Runner

init()  # Automatically enables Agents SDK tracing

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant"
)
result = Runner.run_sync(agent, "Write a haiku about recursion")
```

Configuration: Auto-enabled by default when `openai-agents` is installed. To disable:

```python
init(openai_agents=False)  # Explicit parameter
# OR set environment variable: TRACCIA_OPENAI_AGENTS=false
# OR in traccia.toml under [instrumentation]: openai_agents = false
```

Compatibility: If you have `openai-agents` installed but don't use it (e.g., using LangChain or pure OpenAI instead), the integration is registered but never invoked: no overhead, no extra spans.
Traccia automatically instruments CrewAI when it is installed in your environment.
```python
from traccia import init
from crewai import Agent, Task, Crew, Process

init()  # Auto-enables CrewAI tracing when CrewAI is installed

researcher = Agent(role="Research Analyst", goal="Research a topic", llm="gpt-4o-mini")
task = Task(description="Research Shawn Michaels", agent=researcher)
crew = Crew(agents=[researcher], tasks=[task], process=Process.sequential, verbose=True)
result = crew.kickoff()
```

Traccia will create spans for the crew (`crewai.crew.kickoff`), each task (`crewai.task.*`), agents (`crewai.agent.*`), and underlying LLM calls, which nest under the existing OpenAI spans.

Configuration: Auto-enabled by default when `crewai` is installed. To disable:

```python
init(crewai=False)  # Explicit parameter
# OR set environment variable: TRACCIA_CREWAI=false
# OR in traccia.toml under [instrumentation]: crewai = false
```

Traccia includes a passive guardrail detection engine that runs as a span processor: no runtime enforcement, no changes required to existing agent code. It inspects every span as it ends, classifies guardrail signals into structured findings, and writes results back onto spans so they appear in any configured exporter.
Detection is automatic once `traccia.init()` is called. The processor runs three tiers against every span's attributes:
| Tier | Source | Confidence | How |
|---|---|---|---|
| A (Explicit) | `explicit` | high | `@observe(as_type="guardrail")` or `guardrail_span()` |
| B (Provider-native) | `provider_native` | high/medium | LLM finish reason, stop reason, safety ratings |
| C (Heuristic) | `heuristic` | always low | Denial keywords in tool error messages |
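As a rough illustration of Tier C, the heuristic amounts to low-confidence keyword matching on tool error text. This is a sketch with a made-up keyword list, not the SDK's actual internals:

```python
# Hypothetical keyword list for illustration; the SDK's real list is internal.
DENIAL_KEYWORDS = ("permission denied", "not allowed", "blocked by policy", "refused")

def looks_like_guardrail_denial(error_message: str) -> bool:
    """Tier C sketch: low-confidence match on denial phrasing in a tool error."""
    msg = error_message.lower()
    return any(kw in msg for kw in DENIAL_KEYWORDS)

print(looks_like_guardrail_denial("Tool call blocked by policy: external URLs"))  # True
print(looks_like_guardrail_denial("Connection timed out"))  # False
```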
At the end of each trace (when the root span ends), the processor evaluates which guardrail categories should exist given the agent's observed capabilities (LLM calls, tool use, user-provided text) and reports which are missing.
Option 1: `guardrail_span` context manager (recommended for inline checks)

```python
from traccia.guardrails import guardrail_span

with guardrail_span("pii_check", category="pii", enforcement_mode="warn") as span:
    result = run_pii_check(user_input)
    span.set_attribute("guardrail.triggered", result.found_pii)
```

Option 2: `@observe(as_type="guardrail")` decorator (for function-level guardrails)
If your guardrail function returns a bool, `guardrail.triggered` is set automatically:

```python
from traccia import observe

@observe(
    as_type="guardrail",
    attributes={
        "guardrail.name": "prompt_injection_check",
        "guardrail.category": "prompt_injection",
        "guardrail.enforcement_mode": "block",
    }
)
def check_injection(text: str) -> bool:
    return any(kw in text.lower() for kw in INJECTION_KEYWORDS)
# True -> triggered (blocked), False -> not triggered
```

For non-bool returns, set `guardrail.triggered` manually on the current span.
Batch pipelines or internal-only agents often get flagged for `prompt_injection` / `input_validation` because they make LLM calls with `llm.prompt`. Suppress specific categories for a run:

```python
from traccia.guardrails import guardrail_span, ATTR_GUARDRAIL_SUPPRESS_MISSING

# Convenience: suppress_missing on guardrail_span
with guardrail_span("root", category="unknown", suppress_missing=["prompt_injection", "input_validation"]):
    run_batch_pipeline()

# Or directly on any span
span.set_attribute(ATTR_GUARDRAIL_SUPPRESS_MISSING, ["prompt_injection"])
```

Per span (when a guardrail signal is found):
- `guardrail.finding.count`: number of findings on this span
- `guardrail.findings`: JSON array of `GuardrailFinding` objects
On the root span (aggregated summary of the entire run):
- `guardrail.summary`: full `GuardrailSummary` JSON
- `guardrail.summary.coverage_confidence`: `"high"`, `"medium"`, or `"low"`
- `guardrail.summary.missing_count`: number of expected-but-missing guardrail categories
- `guardrail.summary.detected_categories`: list of detected category strings
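For example, reading the aggregated summary back off an exported root span. The attribute dict below is a generic illustration of the shape, not a guaranteed schema:

```python
import json

# Illustrative root-span attributes as they might appear in an exporter
root_span_attributes = {
    "guardrail.summary": json.dumps({
        "coverage_confidence": "medium",
        "missing": ["output_moderation"],
        "detected": ["pii", "prompt_injection"],
    }),
    "guardrail.summary.missing_count": 1,
    "guardrail.summary.coverage_confidence": "medium",
}

summary = json.loads(root_span_attributes["guardrail.summary"])
print(summary["coverage_confidence"])                           # medium
print(root_span_attributes["guardrail.summary.missing_count"])  # 1
```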
These signals are detected automatically from LLM span attributes; no annotation is required:
| Provider | Signal | Attribute |
|---|---|---|
| OpenAI | `finish_reason = "content_filter"` | `llm.finish_reason` |
| Azure OpenAI | `finish_reason = "content_filtered"` | `llm.finish_reason` |
| Google GenAI | `finish_reason = "SAFETY"` | `llm.finish_reason` |
| Anthropic | `stop_reason = "content_filter"` | `llm.stop_reason` |
| Anthropic | Policy violation error message | `error.message` + `llm.vendor=anthropic` |
| Google/LangChain | `"blocked": true` or `"probability": "HIGH"` in safety ratings | `llm.safety_ratings` |
Tier A/B stay on. Turn off tool-error keyword heuristics if you do not want Tier C at all:

```python
init(guardrail_heuristics=False)
```

Or set `TRACCIA_GUARDRAIL_HEURISTICS=false`, or in `traccia.toml` under `[instrumentation]`: `guardrail_heuristics = false`.
- Guardrails running outside the traced process (API gateways, proxies, external validators) are invisible unless they write span attributes.
- A guardrail that exists but never fires cannot be distinguished from a missing guardrail. Only explicit annotation proves presence.
- Prompt injection detection requires an explicit span; the model's output alone cannot reliably indicate whether a check ran.
- Capability inference (`handles_user_text`) relies on `llm.prompt` being present; batch pipelines and user-facing agents look the same. Use suppression to opt out.
Traccia merges configuration from multiple sources with the following priority (highest to lowest):
1. Explicit parameters: `init(endpoint="...", agent_id="...")` or `start_tracing(...)`
2. Environment variables: `TRACCIA_ENDPOINT`, `TRACCIA_AGENT_ID`, etc.
3. Config file: `traccia.toml` (current directory) or `~/.traccia/config.toml`
4. Defaults: built-in SDK defaults
Example: If you set TRACCIA_ENDPOINT in your environment and pass endpoint=... to init(), the explicit parameter wins.
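The merge order can be sketched as a plain dict update with later sources winning. This is an illustration of the precedence rule, not the SDK's actual loader:

```python
def merge_config(defaults, file_cfg, env_cfg, explicit):
    """Later sources win: defaults < config file < env vars < explicit params."""
    merged = {}
    for source in (defaults, file_cfg, env_cfg, explicit):
        merged.update({k: v for k, v in source.items() if v is not None})
    return merged

cfg = merge_config(
    defaults={"endpoint": "https://api.traccia.ai/v2/traces", "sample_rate": 1.0},
    file_cfg={"sample_rate": 0.5},                             # traccia.toml
    env_cfg={"endpoint": "http://tempo:4318/v1/traces"},       # TRACCIA_ENDPOINT
    explicit={"endpoint": "http://localhost:4318/v1/traces"},  # init(endpoint=...)
)
print(cfg["endpoint"])     # http://localhost:4318/v1/traces (explicit wins)
print(cfg["sample_rate"])  # 0.5 (from the config file)
```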
Create a `traccia.toml` file in your project root:
```bash
traccia config init
```

This creates a template config file:
```toml
[tracing]
# API key: required for the Traccia platform, not needed for local OTLP backends
api_key = ""

# Endpoint URL for OTLP trace ingestion (default: Traccia platform)
# For local OTLP backends use e.g. endpoint = "http://localhost:4318/v1/traces"
endpoint = "https://api.traccia.ai/v2/traces"

sample_rate = 1.0        # 0.0 to 1.0
auto_start_trace = true  # Auto-start root trace on init
auto_trace_name = "root" # Name for auto-started trace
use_otlp = true          # Use OTLP exporter
# service_name = "my-app" # Optional service name
# service_role has no env var: pass via init(service_role="orchestrator")

[exporters]
# Only enable ONE exporter at a time
enable_console = false   # Print traces to console
enable_file = false      # Write traces to file
file_exporter_path = "traces.jsonl"
reset_trace_file = false # Reset file on initialization

[instrumentation]
enable_patching = true        # Auto-patch libraries (OpenAI, Anthropic, requests)
enable_token_counting = true  # Count tokens for LLM calls
enable_costs = true           # Calculate costs
openai_agents = true          # Auto-enable OpenAI Agents SDK integration
crewai = true                 # Auto-enable CrewAI integration
guardrail_heuristics = true   # Tier C heuristic guardrail detection (tool error keywords)
auto_instrument_tools = false # Auto-instrument tool calls (experimental)
max_tool_spans = 100          # Max tool spans to create
max_span_depth = 10           # Max nested span depth

[rate_limiting]
# Optional: limit spans per second
# max_spans_per_second = 100.0
max_queue_size = 5000         # Max buffered spans
max_block_ms = 100            # Max ms to block before dropping
max_export_batch_size = 512   # Spans per export batch
schedule_delay_millis = 5000  # Delay between batches

[metrics]
enable_metrics = true         # Enable OpenTelemetry metrics
# metrics_endpoint = ""       # Defaults to {traces_base}/v2/metrics
metrics_sample_rate = 1.0     # Metrics sampling rate (1.0 = 100%)

[runtime]
# Optional runtime metadata (agent identity: prefer init(agent_id=..., agent_name=..., env=...) or TRACCIA_* env)
# session_id = ""
# user_id = ""
# tenant_id = ""
# project_id = ""
# agent_id = ""   # Single-agent: set in code or TRACCIA_AGENT_ID
# agent_name = ""
# env = ""        # e.g. production, staging, dev

[logging]
debug = false               # Enable debug logging
enable_span_logging = false # Enable span-level logging

[advanced]
# attr_truncation_limit = 1000 # Max attribute value length
```

If you do not set `endpoint` (in config, environment, or when calling `init()` / `start_tracing()`), the SDK uses the Traccia platform by default (`https://api.traccia.ai/v2/traces`). You can override it to send traces to your own OTLP-compatible backend.
The default is defined in `traccia.config` as `DEFAULT_OTLP_TRACE_ENDPOINT`. The alias `DEFAULT_ENDPOINT` is kept for backward compatibility (same value).
Traccia is fully OTLP-compatible and works with:
- Grafana Tempo: `http://tempo:4318/v1/traces`
- Jaeger: `http://jaeger:4318/v1/traces`
- Zipkin: configure via the OTLP endpoint
- SigNoz: self-hosted observability platform
- Traccia Platform: `https://api.traccia.ai/v2/traces` (requires API key)
All config parameters can be set via environment variables with the TRACCIA_ prefix:
- **Tracing**: `TRACCIA_API_KEY`, `TRACCIA_ENDPOINT`, `TRACCIA_SAMPLE_RATE`, `TRACCIA_AUTO_START_TRACE`, `TRACCIA_AUTO_TRACE_NAME`, `TRACCIA_USE_OTLP`, `TRACCIA_SERVICE_NAME`
- **Exporters**: `TRACCIA_ENABLE_CONSOLE`, `TRACCIA_ENABLE_FILE`, `TRACCIA_FILE_PATH`, `TRACCIA_RESET_TRACE_FILE`
- **Instrumentation**: `TRACCIA_ENABLE_PATCHING`, `TRACCIA_ENABLE_TOKEN_COUNTING`, `TRACCIA_ENABLE_COSTS`, `TRACCIA_AUTO_INSTRUMENT_TOOLS`, `TRACCIA_MAX_TOOL_SPANS`, `TRACCIA_MAX_SPAN_DEPTH`, `TRACCIA_OPENAI_AGENTS`, `TRACCIA_CREWAI`, `TRACCIA_GUARDRAIL_HEURISTICS`
- **Rate Limiting**: `TRACCIA_MAX_SPANS_PER_SECOND`, `TRACCIA_MAX_QUEUE_SIZE`, `TRACCIA_MAX_BLOCK_MS`, `TRACCIA_MAX_EXPORT_BATCH_SIZE`, `TRACCIA_SCHEDULE_DELAY_MILLIS`
- **Runtime**: `TRACCIA_SESSION_ID`, `TRACCIA_USER_ID`, `TRACCIA_TENANT_ID`, `TRACCIA_PROJECT_ID`, `TRACCIA_AGENT_ID`, `TRACCIA_AGENT_NAME`, `TRACCIA_ENV` (legacy alias: `TRACCIA_PROJECT` maps to `project_id`)
- **Logging**: `TRACCIA_DEBUG`, `TRACCIA_ENABLE_SPAN_LOGGING`
- **Advanced**: `TRACCIA_ATTR_TRUNCATION_LIMIT`
```python
from traccia import init

# Override config programmatically (including agent identity for single-agent services)
init(
    endpoint="http://tempo:4318/v1/traces",
    sample_rate=0.5,
    enable_costs=True,
    max_spans_per_second=100.0,
    agent_id="my-agent",
    agent_name="My Agent",
    env="production",
)
```

For services that orchestrate many logical agents in one process, set a service role and scope per-run identity:
```python
from traccia import init, runtime_config

init(
    service_name="my-multi-agent-api",
    service_role="orchestrator",
    auto_start_trace=False,
)

with runtime_config.run_identity(agent_id="billing-agent", agent_name="Billing Agent", env="production"):
    # run one logical agent task
    ...
```

This prevents the host service from being registered as a synthetic agent in the Traccia platform.
If one process runs many agents concurrently (for example, an API server or orchestrator), use this pattern:
- Call `init()` once per process (for example at startup), not per request.
- Wrap each logical "run" in `runtime_config.run_identity(...)` to set agent id/name/env for that run.
- Do not call `stop_tracing()` per request; use `force_flush()` to flush spans/metrics after a run without shutting down the provider.
```python
from traccia import init, span, force_flush, runtime_config

init(service_name="multi-agent-service", auto_start_trace=False)

def run_agent(agent_id: str, env: str, payload: dict):
    # Scope identity to this run only
    with runtime_config.run_identity(agent_id=agent_id, agent_name=agent_id, env=env):
        with span("agent.run") as root:
            root.set_attribute("agent.id", agent_id)
            root.set_attribute("agent.run.mode", "api")
            # ... your agent logic here ...
    # Flush without tearing down the global provider
    force_flush(5.0)
```

The `@observe` decorator is the primary way to instrument your code:
```python
from traccia import observe

# Basic usage
@observe()
def process_data(data):
    return transform(data)

# Custom span name
@observe(name="data_pipeline")
def process_data(data):
    return transform(data)

# Add custom attributes
@observe(attributes={"version": "2.0", "env": "prod"})
def process_data(data):
    return transform(data)

# Specify span type
@observe(as_type="llm")  # "span", "llm", "tool"
def call_llm():
    pass

# Skip capturing specific arguments
@observe(skip_args=["password", "secret"])
def authenticate(username, password):
    pass

# Skip capturing result (for large returns)
@observe(skip_result=True)
def fetch_large_dataset():
    return huge_data
```

Available Parameters:
- `name` (str, optional): Custom span name (defaults to function name)
- `attributes` (dict, optional): Initial span attributes
- `as_type` (str): Span type: `"span"`, `"llm"`, `"tool"`, or `"guardrail"`
- `skip_args` (list, optional): Argument names to skip capturing
- `skip_result` (bool): Skip capturing the return value
`@observe` works seamlessly with async functions:
```python
import asyncio
from traccia import observe

@observe()
async def async_task(x):
    await asyncio.sleep(1)
    return x * 2

result = await async_task(5)
```

For more control, create spans manually:
```python
from traccia import get_tracer, span

# Using the convenience function
with span("operation_name") as s:
    s.set_attribute("key", "value")
    s.add_event("checkpoint_reached")
    do_work()

# Using a tracer directly
tracer = get_tracer("my_service")
with tracer.start_as_current_span("operation") as s:
    s.set_attribute("user_id", 123)
    do_work()
```

Traccia automatically captures and records errors:
```python
@observe()
def failing_function():
    raise ValueError("Something went wrong")

# Span will contain:
# - error.type: "ValueError"
# - error.message: "Something went wrong"
# - error.stack_trace: (truncated stack trace)
# - span status: ERROR
```

Spans are automatically nested based on call hierarchy:
```python
@observe()
def parent_operation():
    child_operation()
    return "done"

@observe()
def child_operation():
    grandchild_operation()

@observe()
def grandchild_operation():
    pass

# Creates nested span hierarchy:
# parent_operation
# └── child_operation
#     └── grandchild_operation
```

Traccia includes a powerful CLI for configuration and diagnostics:
Create a new traccia.toml configuration file:
```bash
traccia config init
traccia config init --force  # Overwrite existing
```

Validate configuration and diagnose issues:
```bash
traccia doctor
# Output:
# 🩺 Running Traccia configuration diagnostics...
#
# ✅ Found config file: ./traccia.toml
# ✅ Configuration is valid
#
# 📋 Configuration summary:
#   • API Key: Not set (optional)
#   • Endpoint: https://api.traccia.ai/v2/traces
#   • Sample Rate: 1.0
#   • OTLP Exporter: ✅ Enabled
```

Test connectivity to your exporter endpoint:
```bash
traccia check
traccia check --endpoint http://tempo:4318/v1/traces
```

Manage the local LLM pricing snapshot used for cost estimation:
```bash
# Show current pricing source, age, and model count
traccia pricing status

# Download the latest pricing: tries the Traccia platform first,
# automatically falls back to the upstream pricing source if the platform
# is unreachable (e.g. no Traccia account, network error)
traccia pricing refresh

# Force-fetch directly from the upstream pricing source (skips the platform)
traccia pricing refresh --source upstream

# Remove local cache: reverts to the bundled snapshot shipped with the SDK
traccia pricing clear
```

Traccia computes estimated LLM costs locally at span-end time using a pricing table that resolves in the following order (highest precedence first):
| Priority | Source | Set via |
|---|---|---|
| 1 | Programmatic override | start_tracing(pricing_override={...}) |
| 2 | Env var override | TRACCIA_PRICING_OVERRIDE_JSON='{"model": {"prompt": x, "completion": y}}' |
| 3 | Local cache | traccia pricing refresh (stored in ~/.cache/traccia/pricing.json) |
| 4 | Bundled snapshot | Ships inside the SDK wheel (generated from LiteLLM at release time) |
The bundled snapshot is generated at SDK release time from an upstream pricing database (2500+ models). It is refreshed automatically on every SDK release via the CI/CD pipeline, so `pip install --upgrade traccia` always brings a reasonably current snapshot.

Between releases, you can pull the latest pricing without upgrading the SDK:

```bash
traccia pricing refresh  # platform first, then upstream fallback
```

The SDK warns you when the active snapshot is outdated:
- More than 7 days old: `logger.info` reminder to refresh
- More than 30 days old: `logger.warning` that includes the exact command
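The threshold behavior can be sketched as follows (illustrative; the real check lives inside the SDK):

```python
import logging
from datetime import datetime, timezone

logger = logging.getLogger("traccia.pricing")

def snapshot_age_days(generated_at, now=None):
    """Return snapshot age in days, logging at info (>7d) or warning (>30d)."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - generated_at).days
    if age_days > 30:
        logger.warning("Pricing snapshot is %d days old; run `traccia pricing refresh`", age_days)
    elif age_days > 7:
        logger.info("Pricing snapshot is %d days old; consider refreshing", age_days)
    return age_days

age = snapshot_age_days(
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    now=datetime(2024, 1, 20, tzinfo=timezone.utc),
)
print(age)  # 19 -> info-level reminder
```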
Every span also carries these attributes so the platform can always see how fresh the SDK's pricing was at emit time:
| Attribute | Description |
|---|---|
| `llm.pricing.generated_at` | ISO timestamp of the pricing snapshot |
| `llm.pricing.age_days` | Integer age in days at emit time |
| `llm.pricing.source` | One of `bundled`, `local_cache`, `env`, `override` |
If you use the Traccia platform, it maintains its own pricing sync (every 6 hours by default). Costs shown in the platform UI are recomputed using the platform's authoritative snapshot, which is always fresher than the SDK's bundled table:

```text
org override > platform snapshot > bundled fallback
```
When the platform recomputes a cost, it writes `platform_cost_usd` alongside the original `llm.cost.usd` on each span; the SDK-reported value is never overwritten. In the span detail panel you will see both values clearly labeled:

- "Traccia recomputed cost": what the platform computed, with a tooltip showing whether it came from an automatic pricing update or an org override.
- "SDK-side pricing context (at emit time)": the `llm.pricing.*` attributes, collapsed by default, showing what prices the SDK saw when the span was emitted.
Use `traccia pricing refresh` to sync the platform's snapshot into your SDK's local cache; `llm.cost.usd` and `platform_cost_usd` will then converge.
You may occasionally see a muted "SDK reported $X.XX" line on the costs page. This happens when the two sources used different rates, for three legitimate reasons:
| Reason | How to resolve |
|---|---|
| SDK's local cache or bundled snapshot is slightly older than the platform's current snapshot | Temporary: the platform catches up within 6 hours, or click "Sync now" in Settings → Pricing. Running `traccia pricing refresh` also syncs the SDK. |
| SDK-level override set via `pricing_override=...` or `TRACCIA_PRICING_OVERRIDE_JSON` | Intentional: the SDK override applies only to `llm.cost.usd` for that process. The platform uses its own snapshot plus org overrides for `platform_cost_usd`. |
| Org override set in Settings → Pricing on the platform | Intentional: org overrides apply to `platform_cost_usd` for all agents in the org. They do not change what SDK processes report. |
The platform never uses the SDK's local cache as input to its own recomputation. Each agent process may have a different cached version, and the platform needs one consistent source for cross-agent reporting. SDK pricing affects what gets emitted in spans; platform pricing determines what gets shown in aggregate dashboards.
All pricing overrides, whether set programmatically, via environment variable, or via the Traccia platform settings, use the same JSON shape:
```json
{
  "<model-name>": {
    "prompt": 0.005,
    "completion": 0.015,
    "cache_write": 0.001,
    "cached_prompt": 0.0005
  }
}
```

All rates are USD per 1,000 tokens. `cache_write` and `cached_prompt` are optional (for models with prompt-caching pricing, e.g. Claude 3.5).
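To make the per-1,000-token arithmetic concrete, here is how rates of this shape turn into a single-call cost. The helper below is hypothetical, purely to illustrate the units:

```python
def estimate_cost_usd(pricing, model, prompt_tokens, completion_tokens):
    """Apply USD-per-1,000-token rates of the override shape to one call."""
    rates = pricing[model]
    cost = (prompt_tokens / 1000) * rates["prompt"] \
         + (completion_tokens / 1000) * rates["completion"]
    return round(cost, 6)

pricing = {"gpt-4o": {"prompt": 0.005, "completion": 0.015}}
print(estimate_cost_usd(pricing, "gpt-4o", 1200, 300))  # 1.2*0.005 + 0.3*0.015 = 0.0105
```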
Overrides always win over any other source:

```python
# Programmatic (per-process): both span attributes and cost metrics will use this
traccia.start_tracing(
    pricing_override={
        "gpt-4o": {"prompt": 0.005, "completion": 0.015},
        "my-fine-tuned-model": {"prompt": 0.01, "completion": 0.02},
    }
)
```

```bash
# Environment variable (persistent across restarts)
export TRACCIA_PRICING_OVERRIDE_JSON='{"gpt-4o": {"prompt": 0.005, "completion": 0.015}}'
```

Deprecation notice: `AGENT_DASHBOARD_PRICING_JSON` is accepted as a back-compat alias for `TRACCIA_PRICING_OVERRIDE_JSON` but will be removed in a future minor version. Rename the variable in your environment.
Platform overrides (org-level): Org admins can set pricing overrides in Settings → Pricing on the Traccia platform. These apply to the platform-recomputed cost (`platform_cost_usd`) for all agents in the org. They do not change `llm.cost.usd` on existing spans retroactively unless you explicitly enable the "Also recompute past traces" option in the save dialog.
Protect your infrastructure with built-in rate limiting:
```toml
[rate_limiting]
max_spans_per_second = 100.0 # Limit to 100 spans/sec
max_queue_size = 5000        # Max buffered spans
max_block_ms = 100           # Block up to 100ms before dropping
```

Behavior:
1. Try to acquire capacity immediately
2. If unavailable, block for up to `max_block_ms`
3. If still unavailable, drop the span and log a warning
When spans are dropped due to rate limiting, warnings are logged to help you monitor and adjust limits.
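The acquire/block/drop sequence can be sketched as a token bucket. This is an illustrative model with an injected clock for reproducibility, not the SDK's implementation:

```python
class FakeClock:
    """Deterministic clock so the sketch is reproducible."""
    def __init__(self):
        self.t = 0.0
    def monotonic(self):
        return self.t
    def sleep(self, seconds):
        self.t += seconds

class TokenBucket:
    """Try to acquire; block up to max_block_ms; otherwise drop."""
    def __init__(self, max_spans_per_second, max_block_ms, clock):
        self.rate = max_spans_per_second
        self.max_block_s = max_block_ms / 1000.0
        self.tokens = float(max_spans_per_second)
        self.clock = clock
        self.last = clock.monotonic()

    def _refill(self):
        now = self.clock.monotonic()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self):
        deadline = self.clock.monotonic() + self.max_block_s
        while True:
            self._refill()
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True          # capacity acquired
            if self.clock.monotonic() >= deadline:
                return False         # still no capacity: drop span, log warning
            self.clock.sleep(0.001)  # brief block before retrying

clock = FakeClock()
bucket = TokenBucket(max_spans_per_second=100.0, max_block_ms=5, clock=clock)
accepted = sum(bucket.try_acquire() for _ in range(200))
print(accepted, 200 - accepted)  # part of the burst beyond capacity is dropped
```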
Control trace volume with sampling:
```python
# Sample 10% of traces
init(sample_rate=0.1)

# Sampling is applied at trace creation time
# Traces are either fully included or fully excluded
```

Token counting and cost tracking are automatic for supported LLM providers (OpenAI, Anthropic):
```python
@observe(as_type="llm")
def call_openai(prompt):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

# Span automatically includes:
# - llm.token.prompt_tokens
# - llm.token.completion_tokens
# - llm.token.total_tokens
# - llm.cost.total (in USD)
```

Traccia emits OTEL-compliant metrics for accurate cost and token tracking, independent of trace sampling.
With trace sampling (e.g., `sample_rate=0.1`), only 10% of traces are exported, so costs calculated from traces alone are underestimated by roughly 10x. Metrics solve this by recording data for every LLM call, regardless of sampling.
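A quick numerical sketch of the undercount: with 1,000 calls at $0.01 each and a 10% trace sample, summing costs from exported traces recovers only about a tenth of the true spend, while a metric recorded on every call stays exact:

```python
import random

random.seed(7)  # deterministic illustration
sample_rate = 0.1
calls = [0.01] * 1000  # $0.01 per LLM call

# Metrics path: record every call regardless of sampling
metric_total = sum(calls)

# Traces path: only sampled traces are exported
traced_total = sum(c for c in calls if random.random() < sample_rate)

print(round(metric_total, 2))                 # 10.0
print(round(traced_total / metric_total, 2))  # close to the 0.1 sample rate
```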
Traccia automatically emits these metrics:
| Metric | Type | Unit | Description |
|---|---|---|---|
| `gen_ai.client.token.usage` | Histogram | `{token}` | Input/output tokens per call |
| `gen_ai.client.operation.duration` | Histogram | `s` | LLM operation duration |
| `gen_ai.client.operation.cost` | Histogram | `usd` | Cost per call (USD) |
| `gen_ai.client.completions.exceptions` | Counter | `1` | Exception count |
| `gen_ai.agent.runs` | Counter | `1` | Agent runs (CrewAI, OpenAI Agents) |
| `gen_ai.agent.turns` | Counter | `1` | Agent turns |
| `gen_ai.agent.execution_time` | Histogram | `s` | Agent execution time |
Attributes: `gen_ai.system` (`openai`, `anthropic`), `gen_ai.request.model`, `gen_ai.agent.id`, `gen_ai.agent.name`
```python
from traccia import init

init(
    enable_metrics=True,  # Default: True
    metrics_endpoint="https://your-backend.com/v2/metrics",
    metrics_sample_rate=1.0,  # Default: 1.0 (100%)
)
```

Or via `traccia.toml`:

```toml
[metrics]
enable_metrics = true
metrics_endpoint = "https://your-backend.com/v2/metrics"
metrics_sample_rate = 1.0
```

Or via environment variables:

```bash
export TRACCIA_ENABLE_METRICS=true
export TRACCIA_METRICS_ENDPOINT=https://your-backend.com/v2/metrics
export TRACCIA_METRICS_SAMPLE_RATE=1.0
```

Record your own metrics:
```python
from traccia.metrics import record_counter, record_histogram

# Record a counter
record_counter("my_custom_events", 1, {"event_type": "user_action"})

# Record a histogram
record_histogram("my_custom_latency", 0.123, {"service": "api"}, unit="s")
```

Agent-level metrics (such as `gen_ai.agent.runs` and `gen_ai.agent.execution_time`) are only emitted when Traccia can see a real agent lifecycle (for example, CrewAI crews or OpenAI Agents SDK runs). For plain OpenAI/Anthropic calls and most simple LangChain usages, you still get full LLM metrics (`gen_ai.client.*`), but no agent metrics unless you build an explicit agent abstraction on top.
```python
import logging
logging.basicConfig(level=logging.DEBUG)

# Or via config
init(debug=True)

# Or via env var:
# TRACCIA_DEBUG=1 python your_script.py
```

- Check connectivity: `traccia check`
- Validate config: `traccia doctor`
- Enable debug logging
- Verify the endpoint is correct and accessible
- Reduce `max_queue_size` in the rate limiting config
- Lower `sample_rate` to reduce volume
- Enable rate limiting with `max_spans_per_second`
- Check rate limiter logs for warnings
- Increase `max_spans_per_second` if set
- Increase `max_queue_size` if spans are queued
- Check `traccia doctor` output
Initialize the Traccia SDK. All parameters are optional; configuration is merged from `traccia.toml` → env vars → explicit parameters (highest wins).
Parameters:
Tracing
- `endpoint` (str): OTLP endpoint URL (default: `https://api.traccia.ai/v2/traces`)
- `api_key` (str): API key for the Traccia platform
- `sample_rate` (float): Sampling rate 0.0-1.0 (default: 1.0)
- `auto_start_trace` (bool): Auto-start a root trace on init (default: True)
- `auto_trace_name` (str): Name for the auto-started trace (default: `"root"`)
- `use_otlp` (bool): Use OTLP exporter (default: True)
- `service_name` (str): Service name (auto-detected if not set)
- `service_role` (str): `"orchestrator"` to prevent this service being registered as an agent
- `config_file` (str): Path to a custom `traccia.toml`
Exporters
- `enable_console_exporter` (bool): Print spans to stdout (default: False)
- `enable_file_exporter` (bool): Write spans to file (default: False)
- `file_exporter_path` (str): Path for file exporter (default: `"traces.jsonl"`)
- `reset_trace_file` (bool): Clear file on init (default: False)
Instrumentation
- `enable_patching` (bool): Auto-patch OpenAI, Anthropic, requests (default: True)
- `enable_token_counting` (bool): Count tokens (default: True)
- `enable_costs` (bool): Calculate costs (default: True)
- `pricing_override` (dict): Per-model pricing override; always wins over all other sources. See the Pricing section.
- `openai_agents` (bool): Auto-enable OpenAI Agents SDK integration (default: True)
- `crewai` (bool): Auto-enable CrewAI integration (default: True)
- `guardrail_heuristics` (bool): Enable Tier C heuristic guardrail detection (default: True)
- `auto_instrument_tools` (bool): Experimental tool auto-instrumentation (default: False)
- `max_tool_spans` (int): Max tool spans per trace (default: 100)
- `max_span_depth` (int): Max nested span depth (default: 10)
Agent identity (single-agent services)
- `agent_id` (str): Logical agent identifier
- `agent_name` (str): Human-readable agent name
- `env` (str): Deployment environment, e.g. `"production"`, `"staging"`
Runtime metadata
- `session_id` (str): Session identifier
- `user_id` (str): User identifier
- `tenant_id` (str): Tenant / org identifier
- `project_id` (str): Project identifier
Metrics
- `enable_metrics` (bool): Emit OTEL metrics (default: True)
- `metrics_endpoint` (str): Metrics endpoint (derived from the tracing endpoint if not set)
- `metrics_sample_rate` (float): Metrics sampling rate (default: 1.0)
Rate limiting
- `max_spans_per_second` (float): Rate limit in spans/sec (default: None = unlimited)
- `max_block_ms` (int): Max ms to block before dropping a span (default: 100)
- `max_queue_size` (int): Max buffered spans (default: 5000)
- `max_export_batch_size` (int): Spans per export batch (default: 512)
- `schedule_delay_millis` (int): Batch export interval in ms (default: 5000)
Misc
- `debug` (bool): Enable debug logging (default: False)
- `attr_truncation_limit` (int): Max attribute value length (default: None)
Returns: TracerProvider instance
Stop tracing and flush pending spans.
Parameters:
- `flush_timeout` (float): Max seconds to wait for flush
Get a tracer instance.
Parameters:
- `name` (str): Tracer name (typically module/service name)
Returns: Tracer instance
Create a span context manager.
Parameters:
- `name` (str): Span name
- `attributes` (dict, optional): Initial attributes
Returns: Span context manager
`@observe(name=None, *, attributes=None, tags=None, as_type="span", skip_args=None, skip_result=False)`
Decorate a function to create spans automatically.
Parameters:
- `name` (str, optional): Span name (default: function name)
- `attributes` (dict, optional): Initial attributes
- `tags` (list[str], optional): User-defined identifiers for the observed function
- `as_type` (str): Span type (`"span"`, `"llm"`, `"tool"`, `"guardrail"`)
- `skip_args` (list, optional): Arguments to skip capturing
- `skip_result` (bool): Skip capturing the return value
Load and validate configuration.
Parameters:
- `config_file` (str, optional): Path to config file
- `overrides` (dict, optional): Override values
Returns: Validated TracciaConfig instance
Raises: ConfigError if invalid
Validate configuration without loading.
Returns: Tuple of (is_valid, message, config_or_none)
```text
Application Code (@observe)
        ↓
Span Creation
        ↓
Processors (token counting, cost, enrichment)
        ↓
Rate Limiter (optional)
        ↓
Batch Processor (buffering)
        ↓
Exporter (OTLP/Console/File)
        ↓
Backend (Grafana Tempo / Jaeger / Zipkin / etc.)
```
- `traccia.instrumentation.*`: Infrastructure and vendor instrumentation.
  - HTTP client/server helpers (including FastAPI middleware).
  - Vendor SDK hooks and monkey patching (e.g., OpenAI, Anthropic, `requests`).
  - Decorators and utilities used for auto-instrumenting arbitrary functions.
- `traccia.integrations.*`: AI/agent framework integrations.
  - Adapters that plug into higher-level frameworks via their official extension points (e.g., LangChain callbacks).
  - These work at the level of chains, tools, agents, and workflows rather than raw HTTP or SDK calls.
Contributions are welcome! Whether it's bug fixes, new features, documentation improvements, or examples, we appreciate your help.
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Make your changes and add tests
4. Run tests: `pytest traccia/tests/`
5. Lint your code: `ruff check traccia/`
6. Commit: `git commit -m "Add amazing feature"`
7. Push: `git push origin feature/amazing-feature`
8. Open a Pull Request
```bash
# Clone the repository (Python SDK)
git clone https://github.com/traccia-ai/traccia-py.git
cd traccia-py

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in editable mode with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest traccia/tests/ -v

# Run with coverage
pytest traccia/tests/ --cov=traccia --cov-report=html
```

- Follow PEP 8
- Use type hints where appropriate
- Add docstrings for public APIs
- Write tests for new features
- Keep PRs focused and atomic
- Integrations: Add support for more LLM providers (Cohere, AI21, local models)
- Backends: Test and document setup with different OTLP backends
- Examples: Real-world examples of agent instrumentation
- Documentation: Tutorials, guides, video walkthroughs
- Performance: Optimize hot paths, reduce overhead
- Testing: Improve test coverage, add integration tests
Apache License 2.0 - see LICENSE for full terms and conditions.
Built with:
- OpenTelemetry - Vendor-neutral observability framework
- Pydantic - Data validation
- tiktoken - Token counting
Inspired by observability tools in the ecosystem and designed to work seamlessly with the OTLP standard.
- Issues: GitHub Issues - Report bugs or request features
- Discussions: GitHub Discussions - Ask questions, share ideas
Made with ❤️ for the AI agent community