[codex] prune raw train rollout payloads #2839

Draft
samsja wants to merge 1 commit into fix/orchestrator-release-batch-refs from fix/orchestrator-prune-raw-rollout-payload

samsja commented Jun 18, 2026

Member

Summary

Prunes duplicate raw train rollout payloads after trainer samples are built:

  • projects raw token dicts into a small allowlisted compact payload
  • keeps only token lengths and routed-expert shape metadata in the raw trajectory
  • drops token fields that are not needed after TrainingSample construction
  • keeps trainer-bound TrainingSample payloads intact
  • lets filters read generated token/logprob data from samples when raw tokens are pruned
  • keeps token length penalty, turn-count metrics, and monitor logging compatible with compacted raw payloads
  • keeps scalar temperature changes out of this PR
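
The projection and the filter fallback listed above could look roughly like the sketch below. This is illustrative only: the key names (`prompt_ids`, `completion_ids`, `completion_logprobs`, `routed_experts`, `logprobs`) and the helpers `shape_of`, `compact_tokens`, and `completion_logprobs` are assumptions made for the example, not the identifiers used in `prime_rl`.

```python
from __future__ import annotations

from typing import Any


def shape_of(nested: Any) -> tuple[int, ...]:
    """Return the shape of a rectangular nested list without copying it."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(dims)


def compact_tokens(raw: dict[str, Any]) -> dict[str, Any]:
    """Project a raw token dict onto a small allowlisted payload.

    Only lengths and routed-expert shape metadata survive; the token ids,
    logprobs, and routing tensors themselves are dropped.
    (Hypothetical field names, chosen for illustration.)
    """
    compact: dict[str, Any] = {
        "prompt_len": len(raw.get("prompt_ids", [])),
        "completion_len": len(raw.get("completion_ids", [])),
    }
    experts = raw.get("routed_experts")
    if experts is not None:
        compact["routed_experts_shape"] = shape_of(experts)
    return compact


def completion_logprobs(
    raw_tokens: dict[str, Any] | None, sample: dict[str, Any]
) -> list[float]:
    """Prefer raw logprobs; fall back to the sample once raw tokens are pruned."""
    if raw_tokens and "completion_logprobs" in raw_tokens:
        return raw_tokens["completion_logprobs"]
    return sample["logprobs"]
```

The fallback keeps filters working on both full and compacted payloads: a compacted dict simply lacks the logprob key, so the read is served from the sample instead.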

Ablation

Same fast no-inference routed-payload run used for the memory stack: 5 steps, batch=64, inflight=128, seq_len=32768, turns=50, max_off_policy=4, delay mean 0s, std 3.33s.

| layer | post-step RSS totals (GiB) | post-step peak (GiB) | external peak (GiB) | runtime |
| --- | --- | --- | --- | --- |
| trim + GC + release refs | 4.378, 6.056, 7.571, 9.283, 9.741 | 9.741 | 15.838 | 1m33s |
| + raw pruning | 4.099, 5.746, 6.729, 8.377, 9.276 | 9.276 | 12.508 | 1m35s |

Takeaway: raw pruning reduces external peak RSS by about 3.33 GiB (15.838 to 12.508); post-step retained RSS drops by about 0.47 GiB (9.741 to 9.276), with runtime essentially unchanged (1m33s vs 1m35s).
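
For anyone re-checking the takeaway, the deltas can be recomputed from the ablation figures with a few lines of Python. The numbers are copied from the two rows above; the variable names are just for this snippet.

```python
# Post-step RSS per step and external peak, in GiB, copied from the ablation rows.
baseline_post_step = [4.378, 6.056, 7.571, 9.283, 9.741]
pruned_post_step = [4.099, 5.746, 6.729, 8.377, 9.276]
baseline_external_peak = 15.838
pruned_external_peak = 12.508

# Savings at the peaks, as quoted in the takeaway.
external_saving = baseline_external_peak - pruned_external_peak
post_step_saving = max(baseline_post_step) - max(pruned_post_step)

# Per-step savings show how the gap evolves over the 5 steps.
per_step_saving = [
    round(b - p, 3) for b, p in zip(baseline_post_step, pruned_post_step)
]

print(f"external peak saving: {external_saving:.3f} GiB")
print(f"post-step peak saving: {post_step_saving:.3f} GiB")
print(f"per-step savings: {per_step_saving}")
```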

Validation

  • uv run ruff check src/prime_rl/orchestrator/trajectories.py src/prime_rl/orchestrator/filters.py src/prime_rl/orchestrator/train_sink.py src/prime_rl/orchestrator/types.py src/prime_rl/orchestrator/utils.py src/prime_rl/orchestrator/metrics.py src/prime_rl/orchestrator/orchestrator.py src/prime_rl/utils/monitor/prime.py src/prime_rl/utils/monitor/wandb.py tests/unit/orchestrator/test_trajectories.py tests/unit/orchestrator/test_filters.py tests/unit/orchestrator/test_advantage.py
  • uv run ruff format --check src/prime_rl/orchestrator/trajectories.py src/prime_rl/orchestrator/filters.py src/prime_rl/orchestrator/train_sink.py src/prime_rl/orchestrator/types.py src/prime_rl/orchestrator/utils.py src/prime_rl/orchestrator/metrics.py src/prime_rl/orchestrator/orchestrator.py src/prime_rl/utils/monitor/prime.py src/prime_rl/utils/monitor/wandb.py tests/unit/orchestrator/test_trajectories.py tests/unit/orchestrator/test_filters.py tests/unit/orchestrator/test_advantage.py
  • uv run pytest tests/unit/orchestrator/test_trajectories.py tests/unit/orchestrator/test_filters.py tests/unit/orchestrator/test_advantage.py tests/unit/orchestrator/test_batch.py tests/unit/train/rl/test_packer.py
  • git diff --check

samsja force-pushed the fix/orchestrator-prune-raw-rollout-payload branch 2 times, most recently from 3455a57 to 1182bcc (June 19, 2026 01:17)

mikasenghaas left a comment

Member


wouldn't merge this imo

samsja force-pushed the fix/orchestrator-release-batch-refs branch from bd1b10a to 82e4eff (June 22, 2026 18:51)
samsja force-pushed the fix/orchestrator-prune-raw-rollout-payload branch from 1182bcc to 11b33d5 (June 22, 2026 18:51)