Session replay: discovery, fetch, analyzer, and bundle analytics (044) #193

Merged
jaredmixpanel merged 19 commits into main from 044-session-replay
Jun 5, 2026

@jaredmixpanel jaredmixpanel commented Jun 5, 2026

TL;DR — this PR is large on paper, small to review

52 files, +14,714 lines. But only 2,647 of those lines (18%) are executable source code,
the part that needs real scrutiny. Here's where the rest goes:

| Bucket | Lines | % |
|---|---:|---:|
| Executable source code | 2,647 | 18% |
| Tests | 5,158 | 35% |
| Docstrings (this repo mandates one per function) | 1,948 | 13% |
| Specs / planning | 2,388 | 16% |
| Planning scratch: context/session-replay-plan.md (skip it) | 1,429 | 10% |
| Shipped docs | 495 | 3% |
| Comments + blank in source | 644 | 4% |
| Config | 5 | <1% |

Test logic (4,803 lines, excluding JSON fixtures) outweighs executable code ~1.8:1, and across
the source files the mandated docstrings are nearly as large as the code. If you only have
30 minutes, read the ~2,600 lines of logic and the "How to review this" section below.

What it adds

A first-class session-replay surface: discover a user's rrweb recordings, sign and pull the raw
files, normalize them into an action timeline, and aggregate across sessions — from Python or the
mp CLI.

Python — 10 methods on Workspace, 6 result types, 5 exceptions, 3 label helpers:

import mixpanel_headless as mp
ws = mp.Workspace()

# one-call: discover + fetch + analyze a user's sessions
bundle = ws.replays_for_user("user-42", from_date="2026-05-20", to_date="2026-05-27")

bundle.df                      # one row per session (clicks, errors, duration, …)
bundle.rage_clicks(threshold=3, window_ms=1000)
bundle.error_sessions().replays
bundle.where(contains_url="/checkout").filter(lambda r: r.duration_seconds > 60)

replay = ws.fetch_replay("r-19221397401184")
print(replay.summary_markdown)         # LLM-facing action timeline
replay.to_rrweb_player_json()          # hand off to the rrweb JS player
  • Discovery / fetch: list_replays, sign_replay(s), fetch_replay(s), stream_replay,
    events_for_replay(s), replays_for_user, analyze_replay.
  • Types: ReplaySummary, SignedReplay, Replay, ReplayBundle, UserAction,
    ReplayEvent — all with long-format pandas DataFrames (sessions_df, actions_df,
    events_df, mixpanel_df, elements_df).
  • Aggregations / filters: top_clicks, rage_clicks, long_pauses, error_sessions,
    chainable where/filter/find_pattern, and compare for action-frequency diffs.
  • Label helpers (mixpanel_headless.replay_labels): default_label_fn, selector_label_fn,
    url_normalizer — stable cross-session activity labels.
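
The description does not define what `rage_clicks(threshold=3, window_ms=1000)` counts. As a rough sketch only, assuming a rage click means at least `threshold` clicks on the same target within `window_ms`, the detection could look like this. `Click` and `find_rage_clicks` are hypothetical names for illustration, not part of `mixpanel_headless`.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    """Hypothetical stand-in for a click-type UserAction."""

    timestamp_ms: int
    target: str


def find_rage_clicks(clicks, threshold=3, window_ms=1000):
    """Return (target, burst_start_ms) pairs for detected bursts.

    Assumption: a burst is `threshold` clicks on one target whose first
    and last timestamps are at most `window_ms` apart.
    """
    recent = defaultdict(deque)  # target -> timestamps still inside the window
    bursts = []
    for click in sorted(clicks, key=lambda c: c.timestamp_ms):
        window = recent[click.target]
        window.append(click.timestamp_ms)
        # Drop clicks that have fallen out of the sliding window.
        while click.timestamp_ms - window[0] > window_ms:
            window.popleft()
        if len(window) >= threshold:
            bursts.append((click.target, window[0]))
            window.clear()  # start fresh so one burst is reported once
    return bursts


clicks = [Click(t, "button#pay") for t in (0, 200, 400, 5000)]
clicks.append(Click(300, "a#help"))
rage = find_rage_clicks(clicks)
```

Keeping one deque per target means interleaved clicks on other elements do not reset a burst in progress.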

CLI (mp replays {list, events, sign, fetch, analyze, for-user}):

mp replays list --user user-42 --from 2026-05-20 --to 2026-05-27
mp replays fetch r-19221397401184 -o recording.json
mp replays analyze r-19221397401184
mp replays for-user user-42 --from 2026-05-20 --to 2026-05-27 --include analyze --out-dir ./out/

Architecture

One pipeline, reusing the existing Workspace + Insights query path for discovery and a new async
CDN walker for the bytes:

flowchart LR
  U["distinct_id + date window"] --> D["discover<br/>Insights query on session-record events"]
  D --> S["sign<br/>POST /replays/sign/bulk<br/>~5-min bearer URL"]
  S --> F["fetch<br/>async CDN walk (batched GETs)<br/>404 = end · 403 = re-sign once"]
  F --> A["analyze<br/>rrweb analyzer: DOMTracker + EventAnalyzer<br/>single pass"]
  A --> R["Replay / ReplayBundle<br/>pandas DataFrames + aggregations"]
  • discover (_internal/services/replays.py::discover) — one Insights query grouped on
    replay id + retention; returns lightweight ReplaySummary handles, no bytes.
  • sign (api_client.py::sign_replays) — POST /app/projects/<id>/replays/sign/bulk. Returns
    SignedReplay, a ~5-minute bearer URL.
  • fetch (services/replays.py::walk_cdn_async) — concurrent batched GETs; 404 on file 0 →
    ReplayNotFoundError, later 404 → clean end-of-recording, 403 → one transparent re-sign.
  • analyze (_internal/replays/rrweb_analyzer.py) — a fork adapted from a production-tested
    internal Mixpanel analyzer, now owned and evolving in this repo. A single pass over the event
    stream maintains DOM state (DOMTracker) and emits UserAction records plus a markdown
    timeline (EventAnalyzer). Pure stdlib, no native deps.
  • aggregate (_internal/replays/aggregators.py) — ReplayBundle DataFrame projections and
    the cross-session aggregations.
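
The fetch step's sentinel rules are easier to check against a small model. The following is a minimal sketch, not the code in `services/replays.py`: `walk_files`, `fetch`, and `re_sign` are hypothetical names, the HTTP layer is replaced by injected callables so it runs offline, and it assumes only 200, 403, and 404 statuses occur.

```python
import asyncio


class ReplayNotFoundError(Exception):
    """Local stand-in for the PR's exception of the same name."""


class SignedURLExpiredError(Exception):
    """Local stand-in for the PR's exception of the same name."""


async def walk_files(fetch, re_sign, concurrency=4, max_files=1000):
    """Yield each file's events in file-number order.

    `fetch(n)` returns (status, events) and `re_sign()` refreshes the
    signed URL; both are hypothetical callables injected for the sketch.
    """
    re_signed = False
    start = 0
    while start < max_files:
        stop = min(start + concurrency, max_files)
        # Fetch one batch concurrently; gather preserves argument order.
        results = await asyncio.gather(*(fetch(n) for n in range(start, stop)))
        for n, (status, events) in zip(range(start, stop), results):
            if status == 403:
                if re_signed:
                    raise SignedURLExpiredError(f"file {n}: 403 after re-sign")
                await re_sign()
                re_signed = True
                start = n  # refetch this file and the rest of the batch
                break
            if status == 404:
                if n == 0:
                    raise ReplayNotFoundError("no file 0 for this replay")
                return  # a later 404 is the clean end-of-recording sentinel
            yield events
        else:
            start = stop  # whole batch consumed without a 403


async def _demo():
    files = {n: [f"e{n}"] for n in range(5)}

    async def fetch(n):
        return (200, files[n]) if n in files else (404, [])

    async def re_sign():
        pass

    return [events async for events in walk_files(fetch, re_sign, concurrency=2)]


collected = asyncio.run(_demo())
```

Walking results in file-number order is what lets a 404 inside a batch act as the end marker even when later requests in that batch have already completed.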

How to review this

Reading order (≈90 min for the careful pass):

  1. specs/044-session-replay/spec.md + plan.md — what and why (~10 min).
  2. src/mixpanel_headless/types.py — the 6 result types. Focus on SignedReplay masking and the
    Replay/ReplayBundle DataFrame contracts. (Biggest file, but ~40% is docstrings.)
  3. src/mixpanel_headless/exceptions.py — the SessionReplayError hierarchy (base + 4 leaves).
  4. src/mixpanel_headless/_internal/services/replays.py (load-bearing): walk_cdn_async
    batching + 404/403 sentinels, discover/_parse_summaries Insights parsing, and credential
    redaction in the error path.
  5. src/mixpanel_headless/workspace.py — thin facades that delegate to the service.
  6. src/mixpanel_headless/_internal/api_client.py::sign_replays — the HTTP boundary + the
    SESSION_RECORDING_SENSITIVE_DATA 403 → SessionReplayAccessError mapping.
  7. src/mixpanel_headless/_internal/replays/rrweb_analyzer.py — the analyzer (561 code lines).
    DOMTracker (+ the MAX_NODES guard) and EventAnalyzer's single pass.
  8. _internal/replays/aggregators.py + replay_labels.py — small, pure.
  9. src/mixpanel_headless/cli/commands/replays.py — the 6 subcommands; masking +
    --reveal-signed-urls.
  10. Tests alongside each module to confirm the contracts.

Scrutinize — the security/correctness boundaries:

  • Credential masking, 3 sites: SignedReplay.__repr__/__str__ (types.py), error scrubbing in
    services/replays.py, and the CLI --reveal-signed-urls gating (cli/commands/replays.py).
  • DOMTracker.MAX_NODES = 50_000 (rrweb_analyzer.py) — bounds memory on huge/hostile snapshots.
  • CDN walk sentinels (services/replays.py) — 404-on-first-file vs later-404 vs 403-re-sign.
  • _parse_summaries: $overall rollup handling + retention default-with-UserWarning.
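
For reviewers weighing the node cap listed above, here is a simplified sketch of what such a guard might look like. `BoundedNodeTracker` is a hypothetical stand-in, not the PR's `DOMTracker`. The cap value, the `reached_max_nodes` flag, the `"element"` fallback, and the DEBUG log level come from the PR text; everything else is assumed.

```python
import logging

logger = logging.getLogger(__name__)


class BoundedNodeTracker:
    """Hypothetical, simplified stand-in for the PR's DOMTracker."""

    MAX_NODES = 50_000  # value taken from the PR description

    def __init__(self):
        self.nodes = {}
        self.reached_max_nodes = False

    def add_node(self, node_id, description):
        """Store a node unless the cap is hit; return True if stored."""
        is_new = node_id not in self.nodes
        if is_new and len(self.nodes) >= self.MAX_NODES:
            if not self.reached_max_nodes:
                # Log once, at DEBUG, so large but normal sessions stay quiet.
                logger.debug("node cap of %d reached", self.MAX_NODES)
            self.reached_max_nodes = True
            return False
        self.nodes[node_id] = description  # new node or update of a known one
        return True

    def describe(self, node_id):
        """Fall back to a generic label for nodes that were never stored."""
        return self.nodes.get(node_id, "element")


tracker = BoundedNodeTracker()
tracker.MAX_NODES = 3  # shrink the cap on this instance for the demo
stored = [tracker.add_node(n, f"node {n}") for n in range(5)]
```

Updates to already-tracked nodes still go through after the cap, so memory stays bounded without freezing descriptions of nodes the tracker already knows.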

Skim — boilerplate / standard patterns: the Workspace replay methods (delegation), the
DataFrame projection properties (plain pandas), CLI formatting (existing patterns).

Skip — not the shipped surface:

  • context/session-replay-plan.md — planning scratch, 1,429 lines.
  • specs/044-session-replay/* beyond spec + plan — reference contracts.
  • tests/fixtures/rrweb/*.json — sample recordings.
Full line-count breakdown (how the 14,714 lines split)
| Bucket | Lines | % |
|---|---:|---:|
| Source: executable code | 2,647 | 18% |
| Source: docstrings | 1,948 | 13% |
| Source: comments + blank | 644 | 4% |
| Tests: logic | 4,803 | 33% |
| Tests: JSON fixtures | 355 | 2% |
| Specs / planning | 2,388 | 16% |
| Docs: shipped | 495 | 3% |
| Docs: planning scratch (context/) | 1,429 | 10% |
| Config | 5 | <1% |

Largest source files (code / docstring): types.py 698/581, rrweb_analyzer.py 561/231,
services/replays.py 441/400, workspace.py 301/314, cli/commands/replays.py 367/104.
In workspace.py, exceptions.py, and replay_labels.py the docstrings outweigh the code.

How the 19 commits cluster (5 logical phases)
  1. Build the surface (1–6): foundation (spec, exceptions, api client, types) → service +
    Workspace methods → CLI → analyzer + ReplayBundle + aggregations + analyzer tests.
  2. Harden from live QA (7–9): parse the discovery series (not the lossy .df); scope the
    events window; batch the fetch-time Insights queries.
  3. Trim scope (10–12): drop the pm4py/tslearn process-mining + clustering extras; cut the
    graph/tree/path projections that went degenerate on real SPA sessions; harden the core.
  4. Document (13): README, docs site, plugin skill.
  5. Resolve review (14–19): two human rounds + Copilot — public label helpers, docstrings,
    typed CLI env, selector_label_fn fix, DOMTracker MAX_NODES correctness, strip phase
    residue — plus a typed UnsupportedReplayFormatError (19) so a mobile-session attempt returns a
    clean CLI message + exit 1 instead of a leaked traceback.

Security

  • Signed URLs are time-bounded (~5 min) bearer credentials, handled in-process only — never
    written to disk.
    Masked in repr/str, scrubbed from error messages, never logged at any level.
  • to_dict() keeps the full credential (round-trip) but stamps a _warning marker.
  • CLI redacts by default; --reveal-signed-urls is the single opt-in and prints a stderr warning
    on every use.
  • SESSION_RECORDING_SENSITIVE_DATA 403 → SessionReplayAccessError with structured details
    naming the permission to request.
  • DOMTracker caps at 50,000 nodes to bound memory.
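
The masking contract above can be illustrated with a small frozen dataclass. This is a sketch under assumptions: `MaskedSignedURL` and its field names other than `query_string` are hypothetical, and the `<redacted N chars>` wording is borrowed from the CLI's default output described in the commit messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True, repr=False)
class MaskedSignedURL:
    """Hypothetical stand-in for SignedReplay; field names are assumptions."""

    replay_id: str
    base_url: str
    query_string: str  # the short-lived bearer credential

    def __repr__(self):
        masked = f"<redacted {len(self.query_string)} chars>"
        return (
            f"MaskedSignedURL(replay_id={self.replay_id!r}, "
            f"base_url={self.base_url!r}, query_string={masked!r})"
        )

    __str__ = __repr__  # default logging and print() see the masked form

    def to_dict(self):
        """Round-trip form: keeps the credential, flags it with a warning."""
        return {
            "replay_id": self.replay_id,
            "base_url": self.base_url,
            "query_string": self.query_string,
            "_warning": "query_string is a live bearer credential",
        }


signed = MaskedSignedURL(
    replay_id="r-19221397401184",
    base_url="https://cdn.example.invalid/replays/r-1",
    query_string="token=abcdef0123456789abcdef0123456789&expires=1780000000",
)
```

A check in the spirit of the PR's leak invariant slides a 12-character window over `query_string` and asserts that no chunk appears in `repr(signed)` or `str(signed)`.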

Post-QA hardening

Live QA against a real high-volume SPA project drove a round of cuts and fixes:

  • Cut the graph/tree/path-mining projections (page_graph, element_graph, path_tree,
    transitions_df, pages_df, top_paths, top_pages, dead_clicks) — empty or degenerate on
    real SPA sessions (page_graph = 0 nodes, path_tree = a linear chain, dead_clicks flagged
    44% of actions).
  • Fixed repr: Replay/ReplayBundle no longer dump the full rrweb stream (bundle repr ~69 MB
    → ~85 chars).
  • Fixed distinct_id: threaded from discovery into fetched replays, so sessions_df
    identifies the user.
  • Fixed summary_markdown: the analyzer computed the rich description and fetch_replay
    discarded it; now stored on UserAction, rendered, with consecutive duplicates collapsed.
  • Reworked top_clicks / elements_df: exclude focus-only interactions, normalize URLs.

Testing

just check green: 6,689 passed, 1 skipped, 91.89% coverage, mypy --strict clean (317 files), ruff clean, docstring coverage 99.4%, build OK.
Coverage floor 90%; mutation floor 80% on the four replay modules. Verified live against a real
project (discovery, fetch, analyze, CLI render). Perf targets met: list_replays(7d) ≤ 2s,
fetch_replay(30 MB) ≤ 5s, stream_replay first event ≤ 1s, bundle.actions_df(100) ≤ 10s.

🤖 Generated with Claude Code

jaredmixpanel and others added 12 commits June 4, 2026 16:51
Lays the spec + foundational layer for session-replay support:

Spec (specs/044-session-replay/):
- spec.md, plan.md, tasks.md, data-model.md, research.md, quickstart.md
- contracts/python-api.md, cli-commands.md, error-messages.md
- checklists/requirements.md (16/16 complete)
- 3-PR phased rollout: P1 discovery+fetch, P2 analyzer+bundle, P3 pm4py+tslearn

Exception hierarchy (exceptions.py):
- SessionReplayError(APIError) base + 3 leaf classes
  (SessionReplayAccessError, SignedURLExpiredError, ReplayNotFoundError)
- Re-exported from package root

API client (_internal/api_client.py):
- MixpanelAPIClient.sign_replays(ids, env) POSTs to
  /app/projects/<id>/replays/sign/bulk, returns raw decoded results
- _handle_response: 403 bodies mentioning SESSION_RECORDING_SENSITIVE_DATA
  now map to SessionReplayAccessError with structured details
  (project_id, flag, permission_required)

Result types (types.py):
- ReplaySummary, SignedReplay, ReplayEvent, UserAction (Phase 1 placeholder),
  Replay — frozen dataclasses inheriting ResultWithDataFrame where applicable
- SignedReplay masks query_string in __repr__/__str__ so default logging
  cannot leak the 5-minute bearer credential; to_dict() is the documented
  escape hatch and carries a _warning key
- Replay.events_df / actions_df / mixpanel_df / pages_df lazy projections
  matching documented schemas; analyzer-dependent accessors raise
  NotImplementedError until Phase 2 wires the vendored analyzer

Tests (91 new, all green; 6533 total / 0 failed / 92.08% coverage):
- tests/unit/test_exceptions_session_replay.py (17 tests)
- tests/unit/_internal/test_api_client_sign_replays.py (10 tests)
- tests/unit/test_types_replay_summary.py (14 tests)
- tests/unit/test_types_signed_replay.py (16 tests, includes credential-
  leak invariant: no 12-char chunk of query_string appears in repr/str)
- tests/unit/test_types_replay_event.py (10 tests)
- tests/unit/test_types_replay.py (14 tests)
- tests/fixtures/rrweb/sample-replay-001.json + README — hand-built
  20-event login→navigate→click stream for Phase 1 unit tests

Status: PR 1 of 3 partially landed. Remaining for PR 1:
T015-T019 (service/workspace tests), T022-T028.5 (ReplaysService +
Workspace.list_replays/sign_replay(s)/fetch_replay/stream_replay/
events_for_replay(s)), T033-T042 (CLI `mp replays list/events/sign/fetch`).

just check passes (lint + format + typecheck + tests + coverage + build).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes Phase 3 (User Story 1) of the session-replay rollout: a Workspace
can now discover, sign, and pull raw rrweb bytes end-to-end.

ReplaysService (_internal/services/replays.py, 750 LoC):
- sign(replay_ids, env) wraps api_client.sign_replays + attaches signed_at
  before the network call so callers' expiry arithmetic stays conservative
- walk_cdn_async: async generator over parallel httpx.AsyncClient GETs.
  Algorithm — batches of `concurrency`, asyncio.gather per batch, walk
  results in file-number order. 404 on file 0 → ReplayNotFoundError;
  404 mid-walk is the clean end-of-replay sentinel; 403 triggers a single
  re-sign retry when re_sign_on_expiry=True, raises SignedURLExpiredError
  when False or when the retry also 403s. max_files caps the walk.
  Mobile-replay detection on the first event of the walk raises
  NotImplementedError per error-messages.md §9.
- fetch_files: asyncio.run-driven buffered wrapper over walk_cdn_async.
- discover / events_for: Insights queries against $mp_session_record
  grouped on $mp_replay_id (+ $mp_replay_retention_period for discover).
  Take query_fn as a Callable to avoid a circular Workspace dependency;
  workspace constructs the service with `query_fn=self.query`. Missing
  $mp_replay_retention_period defaults to 30 with a UserWarning.

Workspace methods (workspace.py, +400 LoC):
- list_replays: XOR validation, date-window requirement, delegates to
  ReplaysService.discover
- events_for_replay / events_for_replays: ≤5 event_properties cap
  (ValueError with the catalog message)
- sign_replay / sign_replays
- fetch_replay: optional retention auto-discover, parallel CDN fetch,
  optional Mixpanel-event join; Phase 1 invariant `actions=[]`
- stream_replay: sync iterator wrapper around walk_cdn_async via a
  private event loop with aclose() cleanup in finally
- replays_for_user: Phase 1 stub raising NotImplementedError until US2
  ships ReplayBundle (T062)
- _replays_service property: lazy ReplaysService construction, cleared
  on every `use(...)` axis switch alongside the other lazy services
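
The `stream_replay` bullet above describes driving an async generator from synchronous code. A minimal sketch of that pattern follows; `iter_sync` and `fake_events` are hypothetical names, and the real method's signature is not shown in this PR text.

```python
import asyncio


def iter_sync(async_gen):
    """Drive an async generator from synchronous code.

    Uses a private event loop and always closes the async generator, so
    closing this iterator early still runs the generator's cleanup.
    """
    loop = asyncio.new_event_loop()
    try:
        while True:
            try:
                yield loop.run_until_complete(async_gen.__anext__())
            except StopAsyncIteration:
                break
    finally:
        try:
            loop.run_until_complete(async_gen.aclose())
        finally:
            loop.close()


cleanup_log = []


async def fake_events():
    """Stand-in for an async CDN walk that must release resources."""
    try:
        for n in range(5):
            await asyncio.sleep(0)
            yield n
    finally:
        cleanup_log.append("closed")


stream = iter_sync(fake_events())
first_two = []
for event in stream:
    first_two.append(event)
    if len(first_two) == 2:
        break
stream.close()  # triggers the finally block above deterministically
```

Calling `close()` explicitly avoids relying on garbage collection to finalize the abandoned iterator.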

Tests (33 new in this commit; 97 total for US1; 6566 / 0 failed):
- tests/unit/_internal/test_replays_service.py (11): sign wrapping,
  CDN walker happy path, file naming, max_files bound, 404 sentinel,
  ReplayNotFoundError on file 0, 403 re-sign retry, expired error,
  mobile detection, discover-without-query_fn guards
- tests/unit/test_workspace_replays.py (19): list_replays validation,
  query-call shape, missing-retention UserWarning, event_properties
  cap on both events_for variants, fetch_replay flow (retention skip
  vs discover, with/without Mixpanel-event join), replays_for_user stub,
  sign_replay(s) wiring
- tests/pbt/test_cdn_walker_pbt.py (3): walker terminates at 404 (or
  max_files), never re-fetches the sentinel, returns timestamp-sorted
  events regardless of in-batch fetch order; first-file 404 always
  raises ReplayNotFoundError
- tests/integration/test_replays_live.py (4 — marked @pytest.mark.live,
  deselected by default; set MP_LIVE_TESTS=1 + fixture env vars to run)

Deferred:
- T031 mutation testing on _internal/services/replays.py — slow run;
  gate to verify before PR 1 ships.

just check green: lint, format, typecheck, 6566 pass, ≥90% coverage, build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the PR 1 scope of the session-replay rollout — `mp replays
{list,events,sign,fetch}` lets operators discover, sign, and pull raw
rrweb bytes from the shell without writing Python.

CLI commands (cli/commands/replays.py, ~320 LoC):
- list: --user / --replay-id (XOR), --from / --to, --limit; delegates
  to Workspace.list_replays; empty result is exit 0 not error
- events: positional REPLAY_ID, --properties (comma-separated; >5 →
  ValueError → exit 3 via handle_errors)
- sign: variadic REPLAY_IDs, --env, --reveal-signed-urls. Default JSON
  output masks query_string as `<redacted N chars>` and exposes
  expires_at; --reveal-signed-urls uses SignedReplay.to_dict() AND
  prints the bearer-credential warning to stderr on every invocation
  per contracts/cli-commands.md §4
- fetch: positional REPLAY_ID, -o/--output, --env, --include-events,
  --max-files. With -o writes a timestamp-sorted JSON array directly
  compatible with the rrweb JS player; without -o prints a one-line
  summary (event count + duration + retention)

handle_errors (cli/utils.py):
- SessionReplayAccessError → exit 2 (AUTH_ERROR), caught BEFORE the
  generic MixpanelHeadlessError handler so the access-denied case lands
  on the right code with the catalog message wording
- ReplayNotFoundError → exit 4 (NOT_FOUND)
- SignedURLExpiredError falls through to GENERAL_ERROR (exit 1) per
  the contract
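
Why handler order matters can be shown with a tiny model of the mapping. This is a sketch, not the code in `cli/utils.py`: the classes are local stand-ins, `exit_code_for` is a hypothetical helper, and the assumption that `APIError` derives from `MixpanelHeadlessError` is inferred from the wording above rather than confirmed.

```python
class MixpanelHeadlessError(Exception):
    """Local stand-in for the package's generic error."""


class APIError(MixpanelHeadlessError):
    """Assumed to derive from the generic error."""


class SessionReplayError(APIError):
    pass


class SessionReplayAccessError(SessionReplayError):
    pass


class ReplayNotFoundError(SessionReplayError):
    pass


class SignedURLExpiredError(SessionReplayError):
    pass


# Checked top to bottom, so specific subclasses must precede the generic base.
EXIT_CODES = (
    (SessionReplayAccessError, 2),  # AUTH_ERROR
    (ReplayNotFoundError, 4),       # NOT_FOUND
    (ValueError, 3),                # e.g. more than 5 --properties
    (MixpanelHeadlessError, 1),     # GENERAL_ERROR
)


def exit_code_for(exc):
    """Return the exit code of the first matching entry (hypothetical helper)."""
    for exc_type, code in EXIT_CODES:
        if isinstance(exc, exc_type):
            return code
    return 1


codes = [
    exit_code_for(e)
    for e in (SessionReplayAccessError(), ReplayNotFoundError(), SignedURLExpiredError())
]
```

If the generic entry came first, every replay error would collapse to exit 1, which is the failure mode the ordering note guards against.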

Tests (14 new):
- tests/unit/cli/test_replays_cli.py covering --help discovery, list
  happy path + empty result, events JSON output + >5-properties cap,
  sign default masking + --reveal-signed-urls disclosure + stderr
  warning, fetch -o file vs one-line summary, and exit-code mapping for
  both new exception subclasses

Security audit (T042) clean:
- No literal `Signature=` / `URLPrefix=` / `Expires=` anywhere in src/
- query_string only appears in intentional contexts (SignedReplay
  storage, __repr__ mask, validation, to_dict escape hatch, doc
  examples). No print/logger call references the credential field.

PR 1 status: T001–T042 complete, T031 mutation gate + T040 live smoke
deferred to pre-merge. Tests 6580 / 0 failed / ≥90% coverage / mypy
clean / ruff clean / build OK.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the remaining session-replay scope: vendored-style rrweb analyzer,
ReplayBundle with the full cross-session projection / aggregation /
filter surface, pm4py + tslearn optional adapters, and the Phase 2 CLI
commands.

Phase 2 / US2:

- _internal/replays/rrweb_analyzer.py (~360 LoC): pragmatic from-scratch
  rrweb event-stream analyzer producing normalized UserAction records +
  markdown timeline. Handles click/input/scroll/navigate/viewport_resize/
  console_error event families plus a DOM tracker for human-readable
  target descriptions (`button "Sign in"`, `input[type=email]`, etc.).
  Pure stdlib; quarterly upstream-diff cadence in the module docstring.
  NOTE: The spec called for vendoring analytics/backend/replays/
  rrweb_analyzer.py; that source isn't reachable from this repo, so the
  analyzer is implemented against the rrweb-types spec + the sample-001
  fixture. Re-port once the monorepo path becomes accessible.

- _internal/replays/labels.py: default_label_fn, selector_label_fn,
  url_normalizer. URL normalizer strips query strings and replaces
  numeric / hex path segments with `:id` so parameterized URLs
  aggregate cleanly.

- _internal/replays/aggregators.py: top_paths / top_clicks / top_pages /
  dead_clicks / rage_clicks / long_pauses / error_sessions.

- ReplayBundle (types.py, ~800 LoC): seven DataFrame projections
  (sessions / actions / events / mixpanel / pages / elements /
  transitions); two graph projections (page_graph, element_graph —
  networkx.DiGraph); one tree projection (path_tree — anytree.AnyNode);
  event_log() with pm4py wrapping when available; seven aggregations
  surfaced as methods; six chainable filters (filter / where /
  find_pattern / error_sessions / head / sample); join_mixpanel_events;
  summary_markdown; compare; cluster (delegates to ml_adapter).
  Immutable semantics — filters return new bundles, original unchanged.

- Workspace.fetch_replays (ThreadPoolExecutor across replays + per-replay
  async CDN walk), Workspace.replays_for_user (list_replays +
  fetch_replays composition, defaults include_mixpanel_events=True),
  Workspace.analyze_replay (sugar over fetch_replay.summary_markdown).

- Replay's Phase 1 NotImplementedError raises on summary_markdown /
  errors / clicks_on are replaced with real implementations driven by the
  populated action stream.
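
The URL normalization rule from labels.py above (strip the query string, replace numeric / hex path segments with `:id`) can be sketched as follows. `normalize_url` is a hypothetical name, and the exact definition of an id-like segment is an assumption, since the commit message does not state it.

```python
import re
from urllib.parse import urlsplit

# Assumption: a segment is "id-like" if it is all digits, or 8+ characters
# of hex digits and dashes starting with a hex digit (so UUIDs match).
_ID_SEGMENT = re.compile(r"^(?:\d+|[0-9a-fA-F][0-9a-fA-F-]{7,})$")


def normalize_url(url):
    """Drop query and fragment, and replace id-like path segments with ':id'."""
    parts = urlsplit(url)
    segments = [
        ":id" if _ID_SEGMENT.match(segment) else segment
        for segment in parts.path.split("/")
    ]
    path = "/".join(segments) or "/"
    prefix = f"{parts.scheme}://{parts.netloc}" if parts.netloc else ""
    return prefix + path


normalized = normalize_url("https://shop.example/orders/12345/items?ref=mail#top")
```

With this rule, `/orders/12345/items` and `/orders/67890/items` collapse to the same label, which is what lets parameterized URLs aggregate across sessions.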

Phase 2 CLI (US3):

- mp replays analyze: markdown timeline default, --format json for the
  raw action list.
- mp replays for-user --include analyze --include events --out-dir DIR:
  writes per-replay markdown + index.json, prints a one-line summary
  with action/click/error totals.

Phase 3 / US4 (gated on demand per the source plan):

- pyproject.toml extras: replay-mining (pm4py>=2.7), replay-ml
  (tslearn>=0.6), replay-all (both). networkx + anytree are core deps
  in this repo, so they're not re-listed.
- _internal/replays/pm4py_adapter.py: wrap_event_log_dataframe(df) →
  pm4py.objects.log.obj.EventLog via pm4py.format_dataframe. Lazy import.
- _internal/replays/ml_adapter.py: cluster_bundle(bundle, n, features,
  seed) — DTW k-means via tslearn.clustering.TimeSeriesKMeans. Each
  replay gains a cluster_label attribute.
- ReplayBundle.event_log() / cluster() now delegate via importlib so
  mypy doesn't flag the not-yet-installed optional modules; missing
  extras surface as the canonical ImportError with the install hint.

Tests (75 new in this batch; 6614 total / 0 failed / ≥90% coverage):

- tests/unit/test_us2_replay_bundle.py: consolidated US2 verification
  covering labels, analyzer (against sample-replay-001), ReplayBundle
  projections / aggregations / filters / import errors. Replaces the
  individual T045-T052 test files with one focused suite.
- tests/unit/test_types_replay.py: updated Phase 1 analyzer-accessor
  tests to verify the new empty-actions fallback behavior instead of
  NotImplementedError.
- tests/unit/test_workspace_replays.py: replays_for_user test now
  verifies the empty-window short-circuit (returns an empty bundle).
- tests/unit/cli/test_replays_cli.py: added test_analyze_prints_markdown
  and test_for_user_writes_to_out_dir.

Polish:

- CHANGELOG.md created with PR 1 / 2 / 3 entries under Unreleased.
- handle_errors: SessionReplayAccessError → exit 2, ReplayNotFoundError
  → exit 4 (added in the Phase 1 CLI commit; reaffirmed here).
- Final security audit: zero literal Signature= / URLPrefix= / Expires=
  in src/; query_string only appears in validation, masking, the
  documented to_dict() escape hatch, and doc examples.

Deferred (pre-merge polish, NOT blocking):

- T031 mutation gate on _internal/services/replays.py + the four new
  pure modules — run pre-merge.
- T045 upstream test_rrweb_analyzer.py port — needs the monorepo
  source to be reachable.
- T074-T075 / T083-T085 pm4py + tslearn skipif tests — need the
  optional extras installed in a CI matrix.
- T087-T088 mixpanel-plugin help.py + skill updates — outside main
  package surface.
- T090 version bumps — release decision per PR.
- T040 / T071 / T085 quickstart smoke-tests — need live fixture project.

just check green throughout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the from-scratch Phase 2 analyzer with a richer implementation
that handles console errors correctly and debounces high-frequency
sources. Initial code structure (DOM tracker, debouncing thresholds,
mouse-interaction naming, console-plugin filtering) was modeled after
a similar analyzer used internally inside Mixpanel; from this point on
it lives entirely in this repo and evolves on its own cadence.

Two real bugs fixed:

1. Console errors silently dropped. The previous code keyed on
   `IncrementalSource.LOG = 11`, which isn't standard rrweb. Real
   detection is `EventType.PLUGIN = 6` + `plugin.startswith(
   "rrweb/console@")`. Without this fix every recording's console
   errors would have been invisible to `ReplayBundle.error_sessions`
   and the analyzer's markdown timeline.

2. No debouncing on scroll / input / selection. Continuous scrolling
   during a 30-second recording would emit hundreds of duplicate
   scroll actions and swamp `ReplayBundle.actions_df`. Now debounced
   at 1s per source (input is per-node).
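
A minimal sketch of the debouncing in item 2, assuming a suppressed event does not extend the quiet period (the commit message does not say either way). `Debouncer` is a hypothetical name; the key is the source alone for scroll and `(source, node_id)` for input.

```python
class Debouncer:
    """Suppress repeats of the same key within `interval_ms`."""

    def __init__(self, interval_ms=1000):
        self.interval_ms = interval_ms
        self._last_emitted = {}  # key -> timestamp of the last emitted event

    def should_emit(self, key, timestamp_ms):
        last = self._last_emitted.get(key)
        if last is not None and timestamp_ms - last < self.interval_ms:
            return False  # still inside the quiet period for this key
        self._last_emitted[key] = timestamp_ms
        return True


debouncer = Debouncer(interval_ms=1000)
# (timestamp_ms, source, node_id); node_id only matters for input events.
events = [
    (0, "scroll", None),
    (300, "scroll", None),
    (400, "input", 7),
    (600, "input", 9),
    (900, "input", 7),
    (1500, "scroll", None),
]
kept = [
    event
    for event in events
    if debouncer.should_emit((event[1], event[2]), event[0])
]
```

Keying input by node keeps typing in two different fields from cancelling each other out, while scroll collapses per source.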

Additional behavioral improvements:
- Richer DOMTracker with ancestor traversal up to 3 levels, descriptive
  attribute extraction (aria-label, title, alt, placeholder, href, id,
  type), text extraction from interactive tags, mutation handling for
  adds / removes / text / attribute changes.
- More mouse-interaction types: dbl_click, right_click, focus,
  touch_start. All click-family interactions collapse to action="click"
  so ReplayBundle aggregations stay schema-stable; the original
  interaction is preserved in metadata["interaction"].
- Selection event handling with text excerpt extraction.

Bridge layer keeps the public `UserAction(timestamp, action,
target_node_id, target_desc, url, metadata)` surface that ReplayBundle
aggregations depend on. Each action emission produces both a structured
UserAction and a (timestamp, description) tuple for the markdown
reporter.

Markdown format is now `{timestamp_seconds}: {description}` per line
(one line per action). Updated `test_us2_replay_bundle.py` to match.

events_for query (services/replays.py):
- Now queries `$all_events` (was `$mp_session_record`) so callers see
  the actual product events that happened during the replay window,
  not just the recording-start event itself.
- Group keys now include `$time` and `$event_name` (was just
  `$mp_replay_id`) so multiple events per replay don't collapse.
- Adds `$event_name != "$mp_session_record"` filter so the recording
  event doesn't shadow real events.
- Results sorted by event_time per replay before return.

Test surface unchanged for events_for — existing unit tests mock the
query_fn so they exercised wiring rather than query shape. This is a
behavioral correction that surfaces on live data.

Coverage of the analyzer module is currently below the 90% gate; the
next commit adds targeted analyzer tests using the hand-built
`sample-replay-001.json` fixture to bring it back.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers the analyzer paths that test_us2_replay_bundle.py's single
fixture-based test doesn't exercise:

- analyze_events() validation: empty list / non-list → ValueError
- Console errors: rrweb/console@ + level=error → console_error action;
  non-error levels / unrelated plugins / empty payloads → ignored
- Debouncing: scroll within 1s collapses; per-node input debounce;
  re-fire after gap; checkbox + no-text-no-check fallback paths
- Mouse interactions: parametrized over all five types (click /
  right_click / dbl_click / focus / touch_start) confirming the
  documented verb appears in markdown AND the structured action carries
  the right literal ("click" for first four, "touch_start" for tap);
  unknown click type → ignored; click with no node_id → dropped;
  click on unknown but non-zero node_id → "Clicked element" fallback
- Selection events: text-excerpt extraction from a known node; empty
  ranges → no action; unknown node → "Selected text" fallback
- Mutations: adds (post-snapshot node becomes clickable); removes
  (clicked-after-remove falls back to "element"); text changes
  (description updates and cache invalidates); attribute changes
  (aria-label added via mutation shows up in click description)
- Description fallback priorities: aria-label / title / alt / text /
  placeholder / id (parametrized); anchor with http href appends path;
  input with type=email; ancestor-context fallback when a span has no
  description but lives inside a described button
- DOMTracker direct API: _sanitize_value rules, unknown-node lookup
  returns "element" sentinel, MAX_NODES limit sets reached_max_nodes
- MarkdownReporter: empty list → "No user actions recorded." sentinel;
  ms → seconds division; multi-line join
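
The console-error rule exercised by these tests can be sketched as a small predicate. The event type `6` and the `rrweb/console@` prefix come from the earlier commit message; the payload shape (`level` plus a `payload` list) is an assumption about the rrweb console plugin, and `console_error_text` is a hypothetical name.

```python
EVENT_TYPE_PLUGIN = 6  # rrweb EventType.PLUGIN, per the commit message
CONSOLE_PLUGIN_PREFIX = "rrweb/console@"


def console_error_text(event):
    """Return the error text for a console-error event, else None.

    Assumed event shape:
    {"type": 6, "data": {"plugin": "rrweb/console@1",
                         "payload": {"level": "error", "payload": [...]}}}
    """
    if event.get("type") != EVENT_TYPE_PLUGIN:
        return None
    data = event.get("data") or {}
    if not str(data.get("plugin", "")).startswith(CONSOLE_PLUGIN_PREFIX):
        return None  # unrelated plugin
    payload = data.get("payload") or {}
    if payload.get("level") != "error":
        return None  # warn / log / info are ignored
    parts = payload.get("payload") or []
    return " ".join(str(part) for part in parts) or None  # empty payload -> None


error_event = {
    "type": 6,
    "timestamp": 1780000000000,
    "data": {
        "plugin": "rrweb/console@1",
        "payload": {"level": "error", "payload": ["TypeError:", "x is undefined"]},
    },
}
text = console_error_text(error_event)
```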

All fixtures are hand-built. No external data.

Coverage on _internal/replays/rrweb_analyzer.py is now 90% (was 55%
after removing the upstream-fixture tests); project-wide coverage is
at the 90% gate. 6657 tests / 0 failed / mypy clean / build OK.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
discover() and events_for() parsed result.df and looked up columns by
property name ($mp_replay_id, $time, ...). The real Insights .df only
flattens one segment level and names group axes segment/date, so both
silently returned empty against live data — list_replays found 0 replays
for users with hundreds. The unit tests passed only because they mocked a
.df column shape the API never returns.

- Parse result.series directly (skipping $overall rollups) via a shared
  _flatten_series helper; events_for handles arbitrary event-property
  nesting depth this way.
- Discovery reads start_time from a min aggregation (math="min",
  math_property="$time") — one compact value per replay, no per-second
  time buckets and no Insights result-cap risk. Note the property is
  "$time"; plain "time" silently returns an empty series.
- Default retention to 30 + UserWarning when $mp_replay_retention_period
  is absent (FR-005).
- Remove the now-dead _pick_column.
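
To make the series-parsing approach concrete, here is a sketch of a recursive flattener that skips `$overall` rollups at any depth. `flatten_series` is a hypothetical name (the PR's helper is `_flatten_series`, whose signature is not shown), and the nested shape of `series` below is an assumed illustration, not a captured API response.

```python
def flatten_series(node, path=()):
    """Yield (path, value) for each leaf of a nested mapping.

    Keys named "$overall" are rollups and are skipped wherever they occur.
    """
    for key, value in node.items():
        if key == "$overall":
            continue
        if isinstance(value, dict):
            yield from flatten_series(value, path + (key,))
        else:
            yield path + (key,), value


# Assumed shape: replay id -> retention period -> aggregate value.
series = {
    "$overall": {"all": 3},
    "r-1": {"$overall": {"all": 2}, "90": {"all": 1780000000}},
    "r-2": {"30": {"all": 1780000500}},
}
rows = list(flatten_series(series))
```

Because the recursion does not assume a fixed depth, the same walk handles extra event-property group levels without a separate code path.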

Rewrite the discovery/events_for tests to mock the real series shape (the
fake-.df mocks are what hid the bug), add a _flatten_series PBT, and amend
FR-003 + data-model to describe the series-parsing / min($time) approach.

Verified live against Mixpanel project 3: list_replays(distinct_id=...)
returns summaries (was 0), replay_ids hydrate, events_for returns
time-sorted events. just check green (6669 passed, 90.69% coverage).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live QA against project 3 surfaced three issues the synthetic-fixture tests
could not catch:

#2 events_for / events_for_replays queried $all_events with no date window,
   falling back to Workspace.query's default last=30 — silently missing
   events for replays 31–90 days old (still within retention). Now defaults
   to a 90-day lookback (= max retention) and accepts an explicit from/to;
   fetch_replay scopes the join to the replay's own day(s) from its bytes.

#3 replays_for_user timed out / hard-failed on active users. fetch_replays
   re-raised the first per-replay failure (one stalled CDN read sank the
   whole bundle) and replays_for_user defaulted limit=100. Now, mirroring the
   reference MCP server: fetch_replays skips per-replay failures
   (continue-on-error), raising only if every replay fails; replays_for_user
   defaults limit=20; the CDN per-request timeout drops from a flat 120s to
   connect=10s / read=30s so one stall can't hang for two minutes.

#4 the DOMTracker MAX_NODES warning fired on normal large sessions (complex
   SPA full-snapshots routinely exceed 50k nodes). It degrades gracefully and
   matches the upstream analyzer, so it is now DEBUG, not WARNING.
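
The continue-on-error behavior described in #3 can be sketched with a thread pool. `fetch_all` and `fetch_one` are hypothetical names, and raising `RuntimeError` when everything fails is an assumption; the message does not say which exception the real code raises.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(replay_ids, fetch_one, max_workers=4):
    """Fetch every replay, skipping failures; raise only if all fail.

    Returns (results, failures), both keyed by replay id.
    """
    replay_ids = list(replay_ids)
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {rid: pool.submit(fetch_one, rid) for rid in replay_ids}
        for rid, future in futures.items():
            try:
                results[rid] = future.result()
            except Exception as exc:  # one bad replay must not sink the bundle
                failures[rid] = exc
    if replay_ids and not results:
        first_error = next(iter(failures.values()))
        raise RuntimeError(f"all {len(failures)} replay fetches failed") from first_error
    return results, failures


def fake_fetch(replay_id):
    """Stand-in for a per-replay CDN fetch; 'bad' simulates a stalled read."""
    if replay_id == "bad":
        raise TimeoutError("CDN read stalled")
    return f"bytes-{replay_id}"


results, failures = fetch_all(["a", "bad", "c"], fake_fetch)
```

Returning the failures alongside the results lets a caller report partial bundles instead of silently dropping sessions.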

just check green (mypy --strict, ruff, >=90% coverage). Verified:
replays_for_user completes with partial results instead of failing;
events_for honors the scoped window; the node-cap log is silent at WARNING.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
fetch_replays fired one Insights query per replay — a retention-discovery
query (fetch_replay with retention_days=None) plus, with
include_mixpanel_events, an events query. replays_for_user(20) thus issued
~40 queries and exhausted the Insights rate limit (132s with backoff). The
reference MCP server batches instead.

- 5a: fetch_replays accepts retention_by_id and passes it to each
  fetch_replay, skipping per-replay retention discovery. replays_for_user
  threads the retention it already got from list_replays.
- 5b: fetch_replays now fetches bytes with include_mixpanel_events=False and
  joins Mixpanel events in ONE events_for_replays call across all replays
  (combined min-start..max-end window), attaching them via dataclasses.replace.

Result, live against project 3: replays_for_user(20) drops from 132s +
rate-limited to 6.6s, 20 replays + 824 events joined. The single
fetch_replay path is unchanged (one replay, one events query).
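
The 5b batching idea (one events query over the combined window, then attach per replay) might look like the sketch below. The pared-down `Replay` dataclass, `join_events`, and `fake_query` are hypothetical; only the min-start..max-end window and the use of dataclasses.replace come from the description above.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Replay:
    # Hypothetical, pared-down stand-in for the real Replay type.
    replay_id: str
    start: int  # epoch seconds
    end: int
    events: tuple[dict, ...] = field(default=())


def join_events(replays, query_events):
    """Attach events to replays using ONE query over the combined window."""
    if not replays:
        return []
    window_start = min(r.start for r in replays)
    window_end = max(r.end for r in replays)
    ids = {r.replay_id for r in replays}
    by_id = {}
    # Single call covering min-start..max-end for every replay at once.
    for event in query_events(ids, window_start, window_end):
        by_id.setdefault(event["replay_id"], []).append(event)
    # Frozen dataclasses are copied, not mutated, via dataclasses.replace.
    return [replace(r, events=tuple(by_id.get(r.replay_id, ()))) for r in replays]


calls = []


def fake_query(ids, start, end):
    # Stand-in for the events query; records each call for inspection.
    calls.append((start, end))
    return [
        {"replay_id": "a", "name": "Page View"},
        {"replay_id": "b", "name": "Click"},
    ]


joined = join_events([Replay("a", 100, 200), Replay("b", 150, 400)], fake_query)
```

The query count stays at one regardless of how many replays are in the bundle, which is the property that avoids the rate limit.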

just check green (mypy --strict, ruff, >=90% coverage).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ypy (044, US4)

Installed [replay-mining] + [replay-ml] and exercised the real paths against
project 3 (the prior QA only hit the not-installed gating). tslearn cluster()
works (actions + pages, labels in range, original unmutated); label_fn flows
through; pm4py consumes the event log (inductive miner produced a net). Three
issues surfaced that the extras-absent tests could not catch:

#6 event_log() returns a DataFrame, not a pm4py EventLog, even with pm4py
   installed — the adapter calls pm4py.format_dataframe (which standardizes
   columns and returns a DataFrame in pm4py 2.x; convert_to_event_log is what
   yields an EventLog). The docstring + spec US4 promised an EventLog. Decision:
   fix the docs, not the code — pm4py 2.7+ treats a formatted DataFrame as a
   first-class event log (the miners accept it directly). Corrected the
   event_log/adapter docstrings, spec FR-025 + US4, data-model, quickstart,
   CHANGELOG.

#7 cluster() crashed (ValueError: array of sample points is empty) when any
   replay had an empty feature sequence — e.g. features="pages" on a clicks-only
   session, or a replay with no actions. tslearn's resampler can't handle a
   zero-length series. Fixed by encoding an empty sequence as a single sentinel
   token so it clusters by emptiness instead of crashing.

#8 the inline `# type: ignore[import-not-found]` comments on the pm4py/tslearn
   imports were wrong once the extras are installed (mypy then reports
   import-untyped),
   which would have failed CI — ci.yml runs `uv sync --all-extras`, so CI always
   has the extras, and this never-pushed branch never hit it. Replaced with
   `ignore_missing_imports` mypy overrides (the same pattern networkx/anytree
   use), which is correct whether or not the extras are installed.

Tests: added skipif-gated present-path suites (test_pm4py_adapter,
test_ml_adapter) that run when the extras are installed (i.e. in CI), and
converted the extras-absent tests to simulate absence via sys.modules so the
ImportError hint (SC-006) and the pm4py-absent fallback stay covered in CI too.

just check green against the --all-extras CI config (mypy --strict, ruff,
>=90% coverage).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…y (044)

The Phase-3 / US4 capabilities — process-mining event logs
(ReplayBundle.event_log → pm4py, [replay-mining]) and DTW sequence clustering
(ReplayBundle.cluster → tslearn, [replay-ml]) — are over-engineering and not a
correct use of those libraries, so they're removed entirely. This reverses
commit 13b5da6 and the original Phase-3 surface.

Code:
- Delete _internal/replays/pm4py_adapter.py and ml_adapter.py.
- Remove ReplayBundle.event_log() and ReplayBundle.cluster() from types.py.

Config:
- Drop the replay-mining / replay-ml / replay-all optional extras and the
  pm4py.* / tslearn.* mypy overrides from pyproject; regenerate uv.lock
  (drops pm4py, tslearn, scipy, scikit-learn, numba, ...).

Collateral (the graph survivors): page_graph / element_graph / path_tree
pointed their docstrings and ImportError messages at the now-deleted
[replay-all] extra, but networkx and anytree are core dependencies — so those
lazy-import guards were dead code. Simplified them to direct lazy imports and
fixed the stale "[replay-all]" framing (class docstring + labels.py example).

Kept (unaffected): the rrweb analyzer, the label functions
(default_label_fn / selector_label_fn / url_normalizer, still used by
top_paths / find_pattern), all aggregations, the 7 DataFrame projections,
the networkx/anytree graph + tree projections, and the rest of ReplayBundle.

Tests: delete test_pm4py_adapter.py / test_ml_adapter.py and the event_log /
cluster tests in test_us2_replay_bundle.py.

Docs (authoritative + spec-kit): pruned User Story 4, FR-025 / FR-041..043,
SC-005 / SC-006, the pm4py/tslearn assumptions, the event_log/cluster API +
error-message contracts, and the Phase-3 sections of plan / research / tasks.
The original context/session-replay-plan.md brainstorm is left as history.

just check green (mypy --strict, ruff, >=90% coverage) against the
uv sync --all-extras CI config, which now installs no replay extras.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live QA against a real high-volume SPA project showed the session-replay
process-mining projections produce empty or degenerate output on real data,
while the deterministic session layer is genuinely useful.

Cut (not demoted): page_graph, element_graph, path_tree, transitions_df,
pages_df (both Replay and ReplayBundle), top_paths, top_pages, dead_clicks,
their tests, and the aggregator functions. page_path() now reads navigate
actions.

Harden the core:
- Custom __repr__ on Replay/ReplayBundle: bundle repr drops from ~69 MB
  (it dumped every rrweb event) to ~85 chars.
- Thread distinct_id from discovery through fetch_replay / fetch_replays /
  replays_for_user so sessions_df identifies the user (was always None).
- Restore summary_markdown to MCP parity. The vendored analyzer already
  computed the rich description ("Clicked X", "Scrolled") but fetch_replay
  discarded it; store it on UserAction, render from it, and collapse
  consecutive duplicates into a count suffix. actions_df gains a
  description column.
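
Collapsing consecutive duplicate descriptions into a count suffix could be done along these lines. A sketch only: the helper name and the " (xN)" suffix format are assumptions, not the shipped summary_markdown rendering.

```python
from itertools import groupby


def collapse_consecutive(descriptions: list[str]) -> list[str]:
    """Collapse runs of identical adjacent lines into one line with a count.

    The " (xN)" suffix format is an assumption, not the shipped rendering.
    """
    lines = []
    for text, run in groupby(descriptions):
        count = sum(1 for _ in run)
        lines.append(text if count == 1 else f"{text} (x{count})")
    return lines


timeline = collapse_consecutive(
    ["Scrolled", "Scrolled", "Scrolled", "Clicked Buy", "Scrolled"]
)
```

Only adjacent repeats are merged, so a later "Scrolled" after a click stays a separate timeline entry.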

Rework: top_clicks and elements_df exclude focus-only interactions (no more
focus+click double-count); elements_df groups by URL-normalized path.

Docs pruned to match (spec FRs, data-model, contracts, quickstart, plan,
tasks, CLAUDE.md). networkx/anytree stay as deps (flow/schema results use
them).

just check green: 6664 passed, 91.70% coverage, mypy --strict clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented Jun 5, 2026

Confidence Score: 5/5

Safe to merge — the credential-masking, CDN sentinel, and MAX_NODES guard paths are all correct, and the three blocking issues from earlier review rounds are fully resolved.

All load-bearing correctness points — 403/404 sentinel logic, re-sign flow, credential scrubbing in error paths, and the MAX_NODES bug — are clean. The two comments left are style-level suggestions with no impact on runtime behavior.

No files require special attention; the two suggestions in rrweb_analyzer.py are non-blocking.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `src/mixpanel_headless/_internal/services/replays.py` | Core CDN walker and discovery service; 404/403 sentinels, re-sign logic, and credential scrubbing all look correct; 90-day fallback for windowless queries is properly applied |
| `src/mixpanel_headless/_internal/replays/rrweb_analyzer.py` | Single-pass analyzer with fixed MAX_NODES guard; DOMTracker BFS uses list.pop(0) (O(n)); analyze_events() raises on empty while RrwebAnalyzer.analyze() returns empty (a minor API inconsistency) |
| `src/mixpanel_headless/types.py` | SignedReplay credential masking in repr/str is correct; to_dict() includes full bearer + _warning key; Replay/ReplayBundle DataFrame projections look sound |
| `src/mixpanel_headless/workspace.py` | Thin facades delegating to ReplaysService; fetch_replays thread-pool isolation is correct; replays_for_user passes retention_by_id to skip per-replay re-discovery |
| `src/mixpanel_headless/cli/commands/replays.py` | Credential masking correct; --mixpanel-events/--no-mixpanel-events defaults to True matching Python API; --reveal-signed-urls emits stderr warning on every invocation |
| `src/mixpanel_headless/_internal/api_client.py` | SESSION_RECORDING_SENSITIVE_DATA 403 mapped to SessionReplayAccessError; body text check is safe against both dict and string response bodies |
| `src/mixpanel_headless/_internal/replays/aggregators.py` | rage_clicks sliding-window logic is correct; focus-only click filtering matches analyzer mapping; empty-bundle guard on every aggregator |
| `src/mixpanel_headless/exceptions.py` | Four-class SessionReplayError hierarchy added cleanly; _DEFAULT_CODE and _DEFAULT_STATUS overrides are correct for each subclass |
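
For readers unfamiliar with the rage_clicks sliding-window logic mentioned in the overview, a rough sketch follows. It assumes the threshold=3 / window_ms=1000 parameters shown in the PR description; the function name, the merge-overlapping-bursts rule, and the (start, end) return shape are illustrative assumptions rather than the aggregator's actual contract.

```python
from collections import deque


def rage_click_bursts(
    click_times_ms: list[int], threshold: int = 3, window_ms: int = 1000
) -> list[tuple[int, int]]:
    """Return (start_ms, end_ms) for each burst of >= threshold clicks
    that all fall within window_ms of each other."""
    bursts: list[tuple[int, int]] = []
    window: deque[int] = deque()
    for t in sorted(click_times_ms):
        window.append(t)
        # Slide the window: drop clicks older than window_ms before t.
        while t - window[0] > window_ms:
            window.popleft()
        if len(window) >= threshold:
            if bursts and bursts[-1][1] >= window[0]:
                # Overlaps the previous burst, so extend it.
                bursts[-1] = (bursts[-1][0], t)
            else:
                bursts.append((window[0], t))
    return bursts


bursts = rage_click_bursts([0, 300, 600, 5000, 9000, 9100, 9200, 9300])
```

Focus-only interactions would be filtered out before this step, per the focus-only click filtering noted in the same row.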

Sequence Diagram

sequenceDiagram
    participant C as Caller
    participant W as Workspace
    participant RS as ReplaysService
    participant API as MixpanelAPIClient
    participant CDN as CDN (httpx)
    participant A as RrwebAnalyzer

    C->>W: replays_for_user(user, from, to)
    W->>RS: discover(distinct_id, from_date, to_date)
    RS->>API: "query($mp_session_record, group_by=[replay_id, retention])"
    API-->>RS: QueryResult.series
    RS-->>W: list[ReplaySummary]

    W->>RS: sign([replay_id, ...])
    RS->>API: POST /replays/sign/bulk
    API-->>RS: "[{url, query_string}]"
    Note over RS: 403+SESSION_RECORDING_SENSITIVE_DATA → SessionReplayAccessError

    RS-->>W: list[SignedReplay]

    W->>RS: fetch_files(signed, retention_days)
    loop batch of 50 CDN files
        RS->>CDN: "GET {url}{N:04d}-{retention}.json?{qs}"
        CDN-->>RS: 200/404/403
        Note over RS: first-file 404 → ReplayNotFoundError, later 404 → end sentinel, 403 → re-sign once
    end
    RS-->>W: list[rrweb_event]

    W->>A: RrwebAnalyzer().analyze(rrweb_events)
    A-->>W: AnalyzerResult(actions, markdown, pages, errors)

    W-->>C: ReplayBundle
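
The sentinel rules in the diagram's loop note (first-file 404 means not found, a later 404 ends the walk, a 403 triggers one re-sign) can be sketched sequentially as below. Assumptions: file numbering starts at 0, files are fetched one at a time rather than in batches of 50, and `walk_cdn`, `ReplayNotFound`, and `fake_fetch` are hypothetical names.

```python
from typing import Callable


class ReplayNotFound(Exception):
    """Hypothetical stand-in for ReplayNotFoundError."""


def walk_cdn(
    fetch: Callable[[int, str], tuple[int, list]],
    resign: Callable[[], str],
    qs: str,
) -> list:
    """Walk numbered CDN files until the 404 end sentinel."""
    events: list = []
    n = 0  # assumed first file index
    resigned = False
    while True:
        status, body = fetch(n, qs)
        if status == 403 and not resigned:
            qs = resign()  # credential likely expired: re-sign once, retry
            resigned = True
            continue
        if status == 404:
            if n == 0:
                raise ReplayNotFound("first file missing")
            return events  # a later 404 is the end-of-replay sentinel
        if status != 200:
            raise RuntimeError(f"unexpected CDN status {status} for file {n:04d}")
        events.extend(body)
        n += 1


FILES = {0: [{"t": 1}], 1: [{"t": 2}, {"t": 3}]}


def fake_fetch(n: int, qs: str) -> tuple[int, list]:
    # Stand-in CDN: rejects the stale signature, 404s past the last file.
    if qs != "sig=fresh":
        return 403, []
    if n not in FILES:
        return 404, []
    return 200, FILES[n]


events = walk_cdn(fake_fetch, lambda: "sig=fresh", "sig=stale")
```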

Reviews (7): Last reviewed commit: "Type mobile-replay rejection as Unsuppor..."

Comment thread src/mixpanel_headless/workspace.py
Comment thread src/mixpanel_headless/_internal/services/replays.py
Comment thread src/mixpanel_headless/cli/commands/replays.py
Contributor

Copilot AI left a comment


Pull request overview

This PR adds a first-class session replay surface to mixpanel-headless, spanning discovery (Insights query path), signing (bulk replay CDN signing), fetching (parallel CDN walk), and analysis (vendored rrweb analyzer → normalized actions + bundle-level aggregations), plus a new mp replays CLI group and accompanying specs/tests.

Changes:

  • Add session replay library API: replay discovery, signing, raw rrweb fetch/stream, analysis output (Replay, ReplayBundle, SignedReplay, ReplaySummary, ReplayEvent, UserAction) and new replay exception hierarchy.
  • Add mp replays CLI commands (list, events, sign, fetch, analyze, for-user) plus exit-code mappings for replay-specific errors.
  • Add extensive unit/PBT/live integration coverage and feature documentation/spec artifacts.

Reviewed changes

Copilot reviewed 42 out of 44 changed files in this pull request and generated 8 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `uv.lock` | Updates lock metadata (exclude-newer) for dependency resolution. |
| `src/mixpanel_headless/_internal/services/replays.py` | Implements replay signing orchestration and CDN walking/fetching (sync + async). |
| `src/mixpanel_headless/_internal/api_client.py` | Adds sign_replays() and maps sensitive-data 403 to SessionReplayAccessError. |
| `src/mixpanel_headless/_internal/replays/rrweb_analyzer.py` | Vendored rrweb analyzer producing normalized UserAction streams + markdown summaries. |
| `src/mixpanel_headless/_internal/replays/labels.py` | Adds URL normalization + label functions for action aggregation keys. |
| `src/mixpanel_headless/_internal/replays/aggregators.py` | Adds bundle-level aggregations (top_clicks, rage_clicks, long_pauses, etc.). |
| `src/mixpanel_headless/_internal/replays/__init__.py` | Declares the internal replays subpackage. |
| `src/mixpanel_headless/workspace.py` | Adds public Workspace replay methods (discover/sign/fetch/stream/bundle/analyze). |
| `src/mixpanel_headless/types.py` | Adds typed replay result classes + DataFrame projections and bundle utilities. |
| `src/mixpanel_headless/exceptions.py` | Adds replay-specific exception hierarchy under APIError. |
| `src/mixpanel_headless/__init__.py` | Re-exports new replay types/exceptions + label helpers as public API. |
| `src/mixpanel_headless/cli/main.py` | Registers the new replays Typer command group. |
| `src/mixpanel_headless/cli/commands/replays.py` | Implements mp replays commands, including signed URL redaction + opt-in disclosure. |
| `src/mixpanel_headless/cli/utils.py` | Adds CLI exit-code mapping for SessionReplayAccessError and ReplayNotFoundError. |
| `tests/unit/test_workspace_replays.py` | Unit tests for Workspace replay wiring/validation and bundle composition. |
| `tests/unit/test_us2_replay_bundle.py` | End-to-end unit coverage for analyzer outputs, labels, aggregations, and bundle filters. |
| `tests/unit/test_types_signed_replay.py` | Locks down signed URL credential masking (repr/str) + TTL arithmetic + validation. |
| `tests/unit/test_types_replay.py` | Validates Replay projections and convenience accessors (including actionless defaults). |
| `tests/unit/test_types_replay_summary.py` | Validates ReplaySummary construction, serialization, and .df projection. |
| `tests/unit/test_types_replay_event.py` | Validates ReplayEvent construction, serialization, and .df projection. |
| `tests/unit/test_exceptions_session_replay.py` | Tests replay exception hierarchy and stable message/detail behaviors. |
| `tests/unit/cli/test_replays_cli.py` | CLI behavior tests including redaction/warnings and exit-code mapping. |
| `tests/unit/_internal/test_api_client_sign_replays.py` | Verifies sign endpoint URL/body shape and 403 sensitive-data mapping behavior. |
| `tests/pbt/test_replays_series_pbt.py` | Property-based tests for Insights series flattening behavior. |
| `tests/pbt/test_cdn_walker_pbt.py` | Property-based tests for CDN walker termination, ordering, and not-found behavior. |
| `tests/integration/test_replays_live.py` | Live-gated integration tests for list/sign/fetch against real projects/CDN. |
| `tests/fixtures/rrweb/sample-replay-001.json` | Adds a small rrweb fixture stream for analyzer and projection tests. |
| `tests/fixtures/rrweb/README.md` | Documents rrweb fixture semantics and event-shape reference. |
| `specs/044-session-replay/spec.md` | Feature spec and acceptance scenarios for session replay. |
| `specs/044-session-replay/plan.md` | Implementation plan and PR phasing notes for the feature. |
| `specs/044-session-replay/research.md` | Captures design decisions and rejected alternatives. |
| `specs/044-session-replay/quickstart.md` | Reviewer/user walkthrough for discovery/sign/fetch/analyze/CLI usage. |
| `specs/044-session-replay/data-model.md` | Data model and state transition documentation for replay pipeline. |
| `specs/044-session-replay/contracts/python-api.md` | Public Python API contract for new replay surface. |
| `specs/044-session-replay/contracts/cli-commands.md` | CLI contract for mp replays commands and options. |
| `specs/044-session-replay/contracts/error-messages.md` | Stable error message catalog for replay-specific failures. |
| `specs/044-session-replay/checklists/requirements.md` | Spec quality checklist for the feature spec. |
| `CHANGELOG.md` | Adds unreleased changelog entries for Phase 1/2 session replay surface. |
| `CLAUDE.md` | Updates the “current plan” pointer and repo-level feature notes for 044. |
| `.specify/feature.json` | Switches active feature directory to specs/044-session-replay. |

Comment thread src/mixpanel_headless/_internal/services/replays.py
Comment thread src/mixpanel_headless/_internal/replays/aggregators.py Outdated
Comment thread src/mixpanel_headless/types.py Outdated
Comment thread src/mixpanel_headless/types.py Outdated
Comment thread src/mixpanel_headless/types.py Outdated
Comment thread specs/044-session-replay/contracts/cli-commands.md Outdated
Comment thread specs/044-session-replay/contracts/python-api.md Outdated
Comment thread CHANGELOG.md Outdated
jaredmixpanel and others added 2 commits June 4, 2026 19:03
…044)

Session replay shipped in 044 but was never written up in any user-facing
surface. Document it at parity with the other query engines:

- README: capability paragraph + key-features bullet, the mp replays CLI
  reference, a replays_for_user Python example, and a docs link.
- docs/guide/session-replay.md (new): discovery, fetch, streaming, the five
  DataFrame projections, the action timeline, aggregations, filters,
  Mixpanel-event correlation, signed-URL safety, and the CLI. Registered in
  the mkdocs nav + index.
- docs/api: Session Replay Types (Replay, ReplayBundle, ReplaySummary,
  SignedReplay, UserAction, ReplayEvent + label functions) and Session Replay
  Exceptions sections; a Session Replay subsection on the Workspace page.
- CLAUDE.md: capability areas, package structure, CLI command list.
- mixpanelyst skill: a Session Replay section + trigger keywords.

cli/commands.md (mkdocs-typer) and help.py (introspection) already cover
replay automatically. All content describes the post-hardening surface.

mkdocs build --strict passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address all 11 review threads from greptile + copilot on the session-replay PR.

Behavioral:
- discover() uses a 90-day default lookback when no window is given, mirroring
  events_for. Fixes a false ReplayNotFoundError for 31-90-day-old replays whose
  retention was silently defaulting to 30 via _resolve_retention.
- Redact the signed query_string bearer credential from CDN-fetch exception
  messages (the URL could leak through httpx error stringification).
- rage_clicks() drops focus-only click actions (same predicate real_clicks uses)
  so bursts are not inflated by analyzer-mapped focus events.
- Drop the doubled "UserWarning:" prefix on the retention-default warning.
- for-user CLI: add --mixpanel-events/--no-mixpanel-events (default on) to match
  Workspace.replays_for_user; retire the --include events token.
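
Redacting the signed query string from exception text, as in the second bullet, might be approached like this. A minimal sketch using only the standard library: `redact_signed_url`, `safe_error_message`, the "<redacted>" placeholder, and the example URL are all hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_signed_url(url: str) -> str:
    """Replace the query string (the bearer credential) with a placeholder."""
    parts = urlsplit(url)
    if not parts.query:
        return url
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, "<redacted>", parts.fragment)
    )


def safe_error_message(exc: Exception, signed_url: str) -> str:
    """Scrub any occurrence of the signed URL from an exception's text."""
    return str(exc).replace(signed_url, redact_signed_url(signed_url))


# Made-up URL for illustration; real CDN hosts and parameters differ.
url = "https://cdn.example.com/replays/r-1/0000-30.json?sig=SECRET&exp=123"
message = safe_error_message(ConnectionError(f"GET {url} failed"), url)
```

Scrubbing at the point where the exception is stringified keeps the path visible for debugging while dropping the credential.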

Docs:
- Refresh stale Phase 1/2 docstrings on Replay/UserAction (the analyzer now
  populates actions on fetch) plus matching internal comments in types.py.
- Spec/CHANGELOG accuracy: list format default json (not table), label-helper
  re-export surface (top-level package), and the real ReplayBundle surface
  (5 projections / 3 aggregations / 6 filters; graph/tree/cut aggregations gone).

Tests: 6672 passed, 91.82% coverage, mypy --strict + ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jaredmixpanel jaredmixpanel added claude-review Trigger Claude Code PR review codex-review Trigger Codex PR Review labels Jun 5, 2026
@github-actions

This comment was marked as resolved.

@claude

This comment was marked as resolved.

…nv (044)

- Move label helpers (default_label_fn, selector_label_fn, url_normalizer)
  out of _internal/replays/labels.py into a public replay_labels.py module so
  the public API no longer leaks _internal. Repoint __init__, the types.py
  lazy imports, the test import, and the docs (CLAUDE.md, python-api contract,
  CHANGELOG, spec mutation-gate refs) at the public location.
- Fill in complete Summary/Args/Returns/Example docstrings for the
  ReplayBundle methods flagged as one-liners (top_clicks, rage_clicks,
  long_pauses, filter, error_sessions, head, to_dict) plus the df and
  summary_markdown properties.
- Replace the two '# type: ignore[arg-type]' on env in 'mp replays sign/fetch'
  with cast(Literal['prod','dev'], env) after the runtime guard.
- Validate 'mp replays for-user --include': reject unsupported values with
  typer.BadParameter (fail fast, before workspace resolution) and add CLI
  tests for the rejected-typo and accepted-value paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment thread src/mixpanel_headless/_internal/replays/rrweb_analyzer.py
jaredmixpanel and others added 2 commits June 4, 2026 22:05
…ps (044)

Fixes the valid findings from the second review pass.

Bug:
- selector_label_fn was documented-but-broken: the analyzer never put DOM
  attributes on UserAction.metadata, so it always fell through to the URL.
  The DOM tracker now captures every data-* selector (new _selector_attrs +
  node['selectors'] + get_node_selectors) and click/input emit merges them
  into metadata. Two regression tests cover propagation and the end-to-end
  selector_label_fn label.

Cleanups:
- analyze_replay: fix the false 'skips the analyzer' docstring and route
  'mp replays analyze' (default path) through it so it is no longer dead code.
- CLI: replace bare print() with console/err_console (markup/highlight off,
  soft_wrap on) so JSON/markdown output is not mangled and NO_COLOR is honored.
- env: use Typer-native Literal['prod','dev'] on sign/fetch, dropping the
  manual guard and the earlier cast; add --env rejection tests.
- _render_markdown: drop the unused 'pages' parameter.
- node_id checks: 'is None'/'is not None' instead of truthy (handles id 0).
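
The node_id cleanup matters because 0 is a valid id but falsy in Python. A small illustration, with hypothetical lookup helpers rather than the analyzer's own code:

```python
from typing import Optional


def lookup_truthy(nodes: dict[int, str], node_id: Optional[int]) -> Optional[str]:
    # Buggy pattern: id 0 is falsy, so a valid node is treated as missing.
    if node_id:
        return nodes.get(node_id)
    return None


def lookup_explicit(nodes: dict[int, str], node_id: Optional[int]) -> Optional[str]:
    # Fixed pattern: only a genuinely absent id is skipped.
    if node_id is not None:
        return nodes.get(node_id)
    return None


nodes = {0: "html", 7: "button#buy"}
missed = lookup_truthy(nodes, 0)    # None: node 0 is silently dropped
found = lookup_explicit(nodes, 0)   # "html"
```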

Docs / robustness:
- Tighten walk_cdn_async / stream_replay ordering docstrings to
  '(file-number, then in-file timestamp)'; derive Replay start/end via
  min/max so they hold regardless of yield order.
- Dedup the two ReplayNotFoundError sites into replay_not_found_error().
- Document the running-event-loop constraint and the concurrency x
  cdn_concurrency connection floor on the fetch methods.
- Add --from/--to to 'mp replays events' (+ test); document find_pattern([])
  as match-all; add a PII callout to the session-replay guide.

just check: 6680 passed, 1 skipped, 91.88% coverage, build OK.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e (044)

greptile P1 (PR #193): reached_max_nodes was ANDed into the skip condition,
so once the flag flipped on the first over-limit node, the condition could
never be true again — every subsequent new node bypassed the 'continue' and
was added, letting self.nodes grow past MAX_NODES (and descend into subtrees
it should have skipped). The flag was only ever de-duping the debug log.

Move the flag check inside the skip block so the skip fires for every new node
at the cap while the log still emits once. Updates to already-tracked nodes
(node_id in self.nodes) still fall through, so no node-refresh regression.

Add two regression tests: one adds 7 nodes with cap=2 and asserts len==2 (the
old code left 6); one asserts re-adding a known node at the cap still updates
in place. The previous test only checked the flag flipped, which is why the
bug shipped.
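
The before/after shape of the guard can be reproduced in a few lines. This is a simplified sketch, not the DOMTracker source: the class names, the `add` method, and the dict-based node store are illustrative assumptions.

```python
import logging

logger = logging.getLogger("dom_tracker_sketch")


class BuggyTracker:
    def __init__(self, max_nodes: int) -> None:
        self.max_nodes = max_nodes
        self.nodes: dict[int, str] = {}
        self.reached_max_nodes = False

    def add(self, node_id: int, data: str) -> None:
        # Bug: the flag is ANDed into the skip condition, so the skip
        # fires only once; every later new node falls through and is added.
        if (
            node_id not in self.nodes
            and len(self.nodes) >= self.max_nodes
            and not self.reached_max_nodes
        ):
            self.reached_max_nodes = True
            logger.debug("node cap reached")
            return
        self.nodes[node_id] = data


class FixedTracker(BuggyTracker):
    def add(self, node_id: int, data: str) -> None:
        # Fix: skip every new node at the cap; the flag only de-dupes the log.
        if node_id not in self.nodes and len(self.nodes) >= self.max_nodes:
            if not self.reached_max_nodes:
                self.reached_max_nodes = True
                logger.debug("node cap reached")
            return
        self.nodes[node_id] = data  # known nodes still update in place


buggy, fixed = BuggyTracker(max_nodes=2), FixedTracker(max_nodes=2)
for i in range(7):
    buggy.add(i, f"node-{i}")
    fixed.add(i, f"node-{i}")
fixed.add(0, "updated")
```

With a cap of 2 and 7 adds, the buggy variant ends with 6 nodes and the fixed one with 2, matching the regression test described above.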

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 50 out of 52 changed files in this pull request and generated 8 comments.

Comment thread src/mixpanel_headless/cli/utils.py
Comment thread src/mixpanel_headless/cli/utils.py
Comment thread src/mixpanel_headless/cli/commands/replays.py Outdated
Comment thread src/mixpanel_headless/cli/commands/replays.py Outdated
Comment thread docs/guide/session-replay.md Outdated
Comment thread tests/unit/test_types_signed_replay.py Outdated
Comment thread tests/unit/test_replay_bundle.py
Comment thread tests/unit/test_types_replay.py
…044)

Resolve the 8 unresolved GitHub Copilot review comments:
- CLI imports and handles SignedURLExpiredError with the canonical
  "signed URL expired (5-minute TTL)" message, exit 1 (error-messages.md §2)
- `mp replays events` validates the >5 event-property cap in-CLI, emitting the
  stable message and exit 3 before touching the workspace (fires without auth)
- replays CLI module docstring drops the "Phase 1 / analyze+for-user later" claim
- docs/guide/session-replay.md: `events`/`sign` examples use positional IDs
- froze time in the signed-URL expiry test to remove a 1s wall-clock flake
- corrected the long-pause fixture docstring (~1000s gap, not "near-1s")
- test_types_replay module docstring updated to shipped analyzer behavior

Strip the fictional Phase 1/Phase 2 split and US#/T0NN task IDs (the feature
shipped in one pass): docstrings and comments across the replay surface, the
CHANGELOG collapsed to a single "Session Replay (044)" entry, renamed
test_us2_replay_bundle.py -> test_replay_bundle.py and the TestReplayPhase1Empty
/ TestReplaysForUserUS2 classes, and removed the never-landed "Phase 2 fixtures"
note from the rrweb fixtures README. The `mp replays --help` test now asserts
all six subcommands. Codebase-wide (Phase 0NN) feature-provenance labels are
intentionally left untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jaredmixpanel jaredmixpanel marked this pull request as ready for review June 5, 2026 18:19
Comment thread context/session-replay-plan.md
joshua-koehler previously approved these changes Jun 5, 2026

@joshua-koehler joshua-koehler left a comment


Great improvement here, and just the beginning of what we can do with replays!

The CDN walker raised a builtin NotImplementedError when a replay's first
event wasn't rrweb-shaped (mobile / non-web format). That type isn't in the
CLI handle_errors map, so `mp replays analyze|fetch` on a mobile replay
leaked an uncaught traceback instead of a clean message.

Add UnsupportedReplayFormatError(SessionReplayError) (status 501), raise it
from walk_cdn_async instead of the builtin, and map it in handle_errors to a
curated one-liner + exit 1. Batch paths (fetch_replays / replays_for_user)
already isolate it per-replay, so mixed web+mobile bundles still drop the
mobile sessions with a warning.
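
The pattern here (a typed exception plus a CLI mapping to a one-liner and exit code) can be sketched as follows. Assumptions are flagged in comments: the base class is simplified (the real hierarchy sits under APIError), the rrweb shape check and the message text are illustrative, and `run_cli` stands in for handle_errors.

```python
class SessionReplayError(Exception):
    """Simplified hypothetical base; the real one sits under APIError."""

    status = 500


class UnsupportedReplayFormatError(SessionReplayError):
    status = 501  # status value as stated in the commit message


def ensure_rrweb(first_event: dict) -> None:
    # Assumed shape check: rrweb events carry an integer "type" and a
    # "timestamp" key. The shipped check may differ.
    if not (isinstance(first_event.get("type"), int) and "timestamp" in first_event):
        raise UnsupportedReplayFormatError("replay is not in rrweb (web) format")


def run_cli(command) -> tuple[int, str]:
    """Map typed errors to a one-line message and exit code, not a traceback."""
    try:
        command()
    except UnsupportedReplayFormatError as exc:
        return 1, f"error: {exc}"
    return 0, "ok"


web = run_cli(lambda: ensure_rrweb({"type": 4, "timestamp": 1717600000000}))
mobile = run_cli(lambda: ensure_rrweb({"kind": "mobile-frame"}))
```

A builtin NotImplementedError would bypass the `except` clause above and surface as a traceback, which is the failure mode this commit removes.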

Resolves PR #193 review thread on mobile sessions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jaredmixpanel jaredmixpanel merged commit d869c8e into main Jun 5, 2026
14 checks passed
@jaredmixpanel jaredmixpanel deleted the 044-session-replay branch June 5, 2026 22:43
jaredmixpanel added a commit that referenced this pull request Jun 5, 2026
Minor release covering everything merged since 0.1.1:
- Session replay (044): discovery, sign, CDN fetch, rrweb analyzer,
  ReplayBundle analytics (#193)
- schema_graph(): full Lexicon graph with event<->property relationships (#190)
- Lexicon definitions write display_name + example_value (#189)
- Workspace auto-resolves from /me with metadata fallback (#188)
- activity_feed migrated stream/query -> stream/bookmark (#187)
- Rate-limit-increase lead form on hard 429 (#192)
- JQL removed (breaking) (#185)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Labels

claude-review Trigger Claude Code PR review codex-review Trigger Codex PR Review

3 participants