feat: MuJoCo simulation backend - AgentTool with 50+ actions #85

Merged
cagataycali merged 110 commits into strands-labs:main from cagataycali:feat/mujoco-backend
May 11, 2026

Member

@cagataycali cagataycali commented Apr 1, 2026

TL;DR

Complete MuJoCo simulation backend for strands-robots, shipped as a Strands AgentTool with 50+ actions (added replace_scene_mjcf + patch_scene_mjcf in the MjSpec refactor). An agent can spin up a physics world, load robots + objects, step physics, render RGB/depth cameras, run policies, record LeRobot-format datasets, and perform advanced physics queries — all via natural language through a single tool.

Part 4 of 6 in the MuJoCo-sim PR decomposition (follows #83 build-system, #84 sim foundation).

🧑‍⚖️ Reviewer note — the diff is big (~17.7k/−900 lines, 101 commits on the fork branch) but most of the noise is cosmetic. The How to review this PR section below lays out a ~10-file reading order that covers the actual new functionality, followed by visual proof that the whole surface works end-to-end.


🎬 Visual proof it actually works

50 deterministic scenarios were run against the current code (commit ffc3ba0) to produce
videos + images for every category of tool action. Artifacts live on the
cagataycali/robots:pr85-demos
branch; the regeneration script is generate_demos.py (755 LOC, scripted — no LLM calls,
reproducible). 50/50 scenarios completed successfully.

[Image: hero collage]

Samples (click to open the raw MP4 / PNG)

| # | Category | Demo | Artifact |
|---|----------|------|----------|
| 1 | Single-robot policy rollout | MockPolicy on a 3-DoF arm, 30fps video | 📹 cycle_01.mp4 |
| 2 | Multi-robot concurrent ctrl | Two arms in the same scene, alternating cameras (#114) | 📹 cycle_11.mp4 |
| 3 | replace_scene_mjcf escape hatch | Agent-authored <tendon> pendulum (can't be expressed via SimObject) | 📹 cycle_16.mp4 |
| 4 | patch_scene_mjcf iterative build | Scene grown block-by-block, one render per step | cycle_21 |
| 5 | add / remove object lifecycle | Empty → + cube → + ball → + cylinder → fall → remove all | cycle_26 |
| 6 | Multi-camera capture | Same scene rendered from arm/side + arm/top + arm/front | cycle_31 |
| 7 | Physics probe | Live get_mass_matrix + get_contacts overlaid on rendered scene | cycle_36 |
| 8 | Raycast fan | 7 rays scanning across target + obstacle, hit/miss annotated | cycle_41 |
| 9 | Endurance rollout | ~5s rollout, ~100 frames, proves stability over time | 📹 cycle_46.mp4 |

All 50 cycles + findings JSON: cagataycali/robots:pr85-demos.


How to review this PR

There's a lot going on. To keep the review tractable, here's what actually matters vs. what's noise.

✅ 1. Must-read — the simulation backend

These are the ~3–4k lines of real new functionality. Review in this order:

| # | File | LOC | Purpose |
|---|------|-----|---------|
| 1 | strands_robots/simulation/base.py | 465 | SimEngine ABC: the public contract every backend implements |
| 2 | strands_robots/simulation/factory.py | 229 | create_simulation() + runtime register_backend() |
| 3 | strands_robots/simulation/mujoco/backend.py | 156 | Lazy import mujoco + headless-GL auto-config (osmesa/egl detection) |
| 4 | strands_robots/simulation/mujoco/simulation.py | 2,030 | Simulation(AgentTool): the orchestrator. All 37 agent actions live here. Primary review target. |
| 5 | strands_robots/simulation/mujoco/tool_spec.json | 370 | JSON schema for the 37 actions (this is what the LLM sees) |
| 6 | strands_robots/simulation/mujoco/spec_builder.py | 445 | NEW: SpecBuilder uses mujoco.MjSpec AST instead of string-concat MJCF (replaced 273-line mjcf_builder.py) |
| 7 | strands_robots/simulation/mujoco/scene_ops.py | 510 | Live MjSpec.recompile()-based scene mutation (down from 980 LOC on the old XML-round-trip path) |
| 8 | strands_robots/simulation/mujoco/physics.py | 1,126 | PhysicsMixin: raycast, jacobian, energy, forces, mass matrix, checkpoints, inverse dynamics. Review by feature. |
| 9 | strands_robots/simulation/mujoco/rendering.py | 741 | RenderingMixin: offscreen RGB + depth cameras, multi-camera capture |
| 10 | strands_robots/simulation/policy_runner.py | 790 | Backend-agnostic PolicyRunner.run() loop with video recording + on_frame hooks |
| 11 | strands_robots/simulation/mujoco/recording.py | 218 | RecordingMixin: per-episode LeRobot dataset capture |
| 12 | strands_robots/simulation/mujoco/randomization.py | 137 | RandomizationMixin: domain randomization hooks |

🧪 2. Tests — proves the above works

980 passing tests on the sim+registry suite, 1 skipped (headless-CI video recording). New this round:

| File | What it locks in |
|------|------------------|
| tests/simulation/mujoco/test_spec_builder.py | Replaces the old XML-builder unit tests; asserts on MjModel/MjsBody structure, not XML strings |
| tests/simulation/mujoco/test_replace_scene_mjcf.py | Agent-authored raw MJCF escape hatch (tendon, equality, pair) |
| tests/simulation/mujoco/test_patch_scene_mjcf.py | Structured-op scene mutation; atomic rollback on mid-batch failure |
| tests/simulation/mujoco/test_export_xml_after_replace.py | Regression for export_xml + MjSpec interaction (surfaced by agent-in-the-loop testing) |
| tests/simulation/test_video_camera_validation.py | DX regression: wrong camera name fails fast with a "cameras are namespaced" hint |
| tests/simulation/mujoco/test_concurrency_lock_audit.py | get_mass_matrix / get_sensor_data / get_contacts all hold self._lock during mj_forward |
| tests/simulation/test_no_import_cycle.py | Enforces zero runtime import cycles (was a lazy-import hack in the old code) |
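
For readers who want a feel for what a TYPE_CHECKING-aware import-cycle check can look like, here is a minimal sketch. It is not the PR's test code: the helper names `runtime_imports`, `build_graph`, and `has_cycle` are hypothetical, and the module sources are toy strings rather than real files.

```python
import ast


def _is_type_checking_guard(test: ast.expr) -> bool:
    """True for `if TYPE_CHECKING:` and `if typing.TYPE_CHECKING:`."""
    if isinstance(test, ast.Name):
        return test.id == "TYPE_CHECKING"
    if isinstance(test, ast.Attribute):
        return test.attr == "TYPE_CHECKING"
    return False


class _RuntimeImportVisitor(ast.NodeVisitor):
    """Collects imported module names, ignoring TYPE_CHECKING-only imports."""

    def __init__(self) -> None:
        self.modules: set[str] = set()

    def visit_If(self, node: ast.If) -> None:
        if _is_type_checking_guard(node.test):
            # The guarded body never runs at runtime; only the else-branch can.
            for child in node.orelse:
                self.visit(child)
            return
        self.generic_visit(node)

    def visit_Import(self, node: ast.Import) -> None:
        self.modules.update(alias.name for alias in node.names)

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        if node.module:
            self.modules.add(node.module)


def runtime_imports(source: str) -> set[str]:
    visitor = _RuntimeImportVisitor()
    visitor.visit(ast.parse(source))
    return visitor.modules


def build_graph(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each known module to the known modules it imports at runtime."""
    return {
        name: {m for m in runtime_imports(src) if m in sources}
        for name, src in sources.items()
    }


def has_cycle(graph: dict[str, set[str]]) -> bool:
    """Depth-first search; a back edge to an in-progress node means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {name: WHITE for name in graph}

    def dfs(name: str) -> bool:
        color[name] = GREY
        for dep in graph[name]:
            if color[dep] == GREY or (color[dep] == WHITE and dfs(dep)):
                return True
        color[name] = BLACK
        return False

    return any(color[name] == WHITE and dfs(name) for name in graph)
```

Note that this sketch counts function-level imports as runtime edges too, which is stricter than a pure import-time check.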

📊 3. Coverage

Simulation subpackage: 86% line coverage (2,464/2,868 lines covered). Per-file:

| File | Coverage |
|------|----------|
| simulation/models.py | 100% |
| simulation/mujoco/randomization.py | 100% |
| simulation/factory.py | 97% |
| simulation/mujoco/spec_builder.py | 95% |
| simulation/policy_runner.py | 92% |
| simulation/base.py | 90% |
| simulation/mujoco/rendering.py | 88% |
| simulation/mujoco/backend.py | 86% |
| simulation/mujoco/simulation.py | 84% |
| simulation/mujoco/recording.py | 84% |
| simulation/mujoco/physics.py | 82% |
| simulation/mujoco/scene_ops.py | 74% |

0%-coverage strands_robots/tools/* files are physical-hardware adapters not touched by this PR; they're tracked under a separate follow-up issue.


🆕 What changed since the last review pass

MjSpec refactor — closed #121-#126 (the "replace string-concat MJCF with mujoco.MjSpec" umbrella)

| Stage | What | Commit |
|-------|------|--------|
| 0-2 | Bump mujoco>=3.2, add SpecBuilder, replace camera xyaxes math | ad1d298 |
| 3-7 | Single/multi-robot spec.attach(), scene inject/eject via spec.recompile(), delete mjcf_builder.py, cleanup | c2e8826 |
| 6b | patch_scene_mjcf(ops=[...]) for structured scene edits with atomic rollback | 3404809 |
| 7b | export_xml prefers spec.to_xml(), drops mj_saveLastXML fallback (surfaced by agent testing) | 80caf82 / 38442b0 |

Closes umbrella #121 and sub-issues #122-#126.

Test-directory restructure + code hygiene

  • Moved tests/simulation/… to mirror strands_robots/simulation/… (file moves only).
  • Stripped # ═════ / # ───── ASCII divider comments (28 banner lines).
  • Stripped decorative em-dashes / Unicode artifacts.
  • Deleted IDEA.md (its stages 0-7 are all landed).

3 P0/P1 bugs surfaced by the AST analysis + E2E agent harness (commit ffc3ba0)

Run independently by /tmp/ast-analysis-v2/deeper_analysis_v2.ipynb +
/tmp/e2e_agentic_test_85/e2e_agentic_test_85.ipynb:

  1. P0 — Video recording silently produced 0-byte MP4 on wrong camera name. The LLM
    agent called run_policy(video={"camera": "side"}) but add_robot() compiled the
    URDF's side camera as arm1/side. Every render() inside the loop returned
    status=error, _extract_frame_ndarray() returned None, the rollout ran to
    completion, writer.close() produced an empty file, and nothing in the result text
    hinted at the problem.
    Fix: PolicyRunner.run now pre-validates the camera name once before the step
    loop and returns a clean error dict with a "cameras are namespaced, e.g. arm1/side"
    hint.

  2. P0 — 3 read-only ops called mj_forward without self._lock. After #114 (sim/mujoco: clarify _policy_threads semantics, per-robot or global?) landed
    concurrent per-robot policies, get_mass_matrix / get_sensor_data / get_contacts
    could race a policy thread's _apply_sim_action (mj_step) and return torn/NaN reads.
    Fix: wrap mj_forward + data read in with self._lock: in all three; snapshot the
    data arrays under the lock so name resolution can run lock-free.

  3. P1 — simulation.base ↔ policy_runner import cycle. Papered over with three
    inline lazy imports inside SimEngine methods. The cycle is compile-time only
    (policy_runner.py imports SimEngine under TYPE_CHECKING), so the lazy imports
    were unnecessary scaffolding.
    Fix: hoist PolicyRunner + VideoConfig to module-level in base.py, delete the
    three inline lazy imports. Added a regression test that walks the AST with
    TYPE_CHECKING awareness and asserts zero runtime cycles.
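
The lock fix in item 2 (compute and copy under the lock, resolve names afterwards) can be illustrated without MuJoCo. The sketch below is an assumption-laden stand-in: `TinySim`, its joint names, and `read_joint_positions` are hypothetical and only mimic the shape of the pattern, not the PR's `Simulation` class.

```python
import threading


class TinySim:
    """Toy stand-in for a simulator with shared mutable state (hypothetical)."""

    def __init__(self) -> None:
        self._lock = threading.RLock()
        self._names = ["shoulder", "elbow"]  # hypothetical joint names
        self._qpos = [0.0, 0.0]              # state mutated by a stepping thread

    def step(self) -> None:
        with self._lock:
            # Entries are updated one by one; an unlocked reader could observe
            # a half-updated ("torn") vector between iterations.
            for i in range(len(self._qpos)):
                self._qpos[i] += 1.0

    def read_joint_positions(self) -> dict[str, float]:
        with self._lock:
            snapshot = list(self._qpos)  # copy taken while holding the lock
        # Name resolution works on the private copy, so it can run lock-free.
        return dict(zip(self._names, snapshot))
```

Keeping the critical section down to the copy means a slow consumer of the result never blocks the stepping thread.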


Usage

from strands_robots.simulation import Simulation
from strands import Agent

sim = Simulation()
agent = Agent(tools=[sim])
agent("Create a world with an so100 robot and a red cube, then step 100 times")

Or imperatively:

sim.create_world()
sim.add_robot(data_config="so100", name="alice")
sim.add_object(name="cube", shape="box", size=[0.03,0.03,0.03], rgba=[1,0,0,1])
sim.step(n_steps=100)
rgb = sim.render(camera="top", width=640, height=480)

Agent-authored scenes (new in this round)

# Full replacement - validated by actually compiling MJCF
sim.replace_scene_mjcf("""
<mujoco>
  <worldbody>
    <body name="anchor" pos="0 0 1"><site name="a"/></body>
    <body name="mass" pos="0.3 0 1"><freejoint/>
      <geom type="sphere" size="0.05"/><site name="m"/></body>
  </worldbody>
  <tendon><spatial name="rope" width="0.01">
    <site site="a"/><site site="m"/></spatial></tendon>
</mujoco>
""")

# Or atomic batch edits:
sim.patch_scene_mjcf([
    {"op": "add_body", "name": "block_1", "pos": [0, 0, 0.1]},
    {"op": "add_geom", "body": "block_1", "type": "box", "size": [0.1, 0.1, 0.05]},
])
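
The "atomic batch" behaviour can be pictured as: apply every op to a working copy and commit only if all of them succeed. The sketch below shows that idea on a plain dict. It is an illustration under stated assumptions, not the PR's implementation (which edits a mujoco.MjSpec): the `apply_ops_atomically` helper and the dict scene layout are hypothetical, and only the `add_body` / `add_geom` op names come from the example above.

```python
import copy


def apply_ops_atomically(scene: dict, ops: list[dict]) -> dict:
    """Return a new scene with all ops applied, or raise and change nothing."""
    working = copy.deepcopy(scene)  # the caller's scene is never mutated
    for index, op in enumerate(ops):
        kind = op.get("op")
        if kind == "add_body":
            if op["name"] in working["bodies"]:
                raise ValueError(f"op {index}: body {op['name']!r} already exists")
            working["bodies"][op["name"]] = {"pos": list(op["pos"]), "geoms": []}
        elif kind == "add_geom":
            if op["body"] not in working["bodies"]:
                raise ValueError(f"op {index}: unknown body {op['body']!r}")
            working["bodies"][op["body"]]["geoms"].append(
                {"type": op["type"], "size": list(op["size"])}
            )
        else:
            raise ValueError(f"op {index}: unsupported op {kind!r}")
    return working  # commit point: the caller swaps this in for the old scene
```

Because a failing op raises before the function returns, a mid-batch failure leaves the original scene exactly as it was.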

Key design decisions

  1. Simulation extends AgentTool directly: Agent(tools=[Simulation()]) just works, no wrapper.
  2. Lazy MuJoCo import: _ensure_mujoco() only imports the heavy dep when a sim is actually created.
  3. MjSpec-backed scene edits — no more XML round-tripping. spec.recompile(model, data) preserves unchanged joint state automatically.
  4. Same Policy ABC for sim and real — a policy trained in sim runs on the real robot with zero code changes.
  5. Simulation is standalone — no dependency on Robot(). Addresses Arron's earlier ask.
  6. Backend registry is extensible — third parties can register_backend("my_sim", MySim) at runtime (covered by test_factory.py).
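
As a rough picture of decision 6, a runtime backend registry can be as small as a dict of factories. The sketch below reuses the names `register_backend` and `create_simulation` from this PR, but the signatures, the duplicate-name check, and the error text are assumptions for illustration; the real factory.py may differ.

```python
from typing import Callable

# Registry of backend name -> factory callable (assumed structure).
_BACKENDS: dict[str, Callable[..., object]] = {}


def register_backend(name: str, factory: Callable[..., object]) -> None:
    """Register a backend factory under a name (duplicate policy is assumed)."""
    if name in _BACKENDS:
        raise ValueError(f"backend {name!r} is already registered")
    _BACKENDS[name] = factory


def create_simulation(backend: str, **kwargs) -> object:
    """Instantiate a registered backend, listing known names on failure."""
    try:
        factory = _BACKENDS[backend]
    except KeyError:
        known = ", ".join(sorted(_BACKENDS)) or "none"
        raise ValueError(f"unknown backend {backend!r}; registered: {known}") from None
    return factory(**kwargs)
```

A third party would then call `register_backend("my_sim", MySim)` once at import time and create instances by name.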

Testing locally

pip install -e ".[all,dev]"
hatch run test               # 980 passed, 1 skipped on the sim+registry suites
hatch run test-integ         # requires GPU + MuJoCo (separate CI job)
hatch run lint               # ruff + mypy clean across 113 source files

Regenerate the visual artifacts

# 50 deterministic scenarios; no LLM calls
/path/to/venv-with-pr85/bin/python /tmp/pr85_artifacts/scripts/generate_demos.py
# Output: /tmp/pr85_artifacts/{videos,images,scenes,findings}/

Depends on #83 (build) and #84 (sim foundation). After this lands, strands_robots.simulation.Simulation is fully usable as a standalone AgentTool.

Comment thread strands_robots/simulation/mujoco/simulation.py
Comment thread strands_robots/simulation/mujoco/simulation.py
Comment thread strands_robots/simulation/mujoco/scene_ops.py
Comment thread strands_robots/simulation/mujoco/recording.py Outdated
Comment thread strands_robots/simulation/mujoco/policy_runner.py Outdated
Comment thread strands_robots/simulation/mujoco/physics.py
Comment thread strands_robots/dataset_recorder.py
Comment thread strands_robots/dataset_recorder.py Outdated
Comment thread strands_robots/_async_utils.py
@cagataycali cagataycali force-pushed the feat/mujoco-backend branch from bc6080f to 78719d9 April 1, 2026 20:03
Contributor

@yinsong1986 yinsong1986 left a comment

All review comments addressed. LGTM.

@cagataycali cagataycali added this to the v0.4 milestone Apr 6, 2026
@cagataycali cagataycali force-pushed the feat/mujoco-backend branch 2 times, most recently from f461f30 to 4a3fd3c April 6, 2026 07:03
@cagataycali
Member Author

Rebased feat/mujoco-backend onto the updated feat/simulation-foundation (which now has the [sim] extra with robot_descriptions).

pyproject.toml extras now:

sim = [
    "robot_descriptions>=1.11.0,<2.0.0",
]
sim-mujoco = [
    "mujoco>=3.0.0,<4.0.0",
]
all = [
    "strands-robots[groot-service]",
    "strands-robots[lerobot]",
    "strands-robots[sim]",
    "strands-robots[sim-mujoco]",
]

Both robot_descriptions (for asset downloads) and mujoco (for simulation backend) are now properly declared as separate extras and included in [all]. Ready for merge after PR #84 lands.

@cagataycali cagataycali force-pushed the feat/mujoco-backend branch from dda5248 to 696b423 April 6, 2026 07:27
Comment thread pyproject.toml Outdated
Member

@awsarron awsarron left a comment

For all comments in this PR, we should examine common themes and include corrections for them in AGENTS.md so that future agent runs benefit from their lessons.

Comment thread pyproject.toml
Comment thread strands_robots/simulation/mujoco/__init__.py Outdated
Comment thread strands_robots/simulation/mujoco/simulation.py Outdated
Comment thread strands_robots/simulation/mujoco/backend.py
Comment thread strands_robots/simulation/mujoco/policy_runner.py Outdated
Comment thread strands_robots/simulation/mujoco/tool_spec.json
Comment thread strands_robots/simulation/mujoco/scene_ops.py
cagataycali added a commit to cagataycali/robots that referenced this pull request Apr 13, 2026
Move _xml, _robot_base_xml, and _tmpdir from SimWorld into a generic
_backend_state dict. Each backend stores its format-specific data there
instead of polluting the base class with implementation details.

Addresses @awsarron review: 'how can we avoid having implementation
details (Mujoco) in base classes like this?'

The MuJoCo backend (PR strands-labs#85) will store these in
world._backend_state['xml'], etc. during rebase.
@cagataycali
Member Author

Review Status Summary

All 17 review threads are now resolved.

Latest commit 6bb195a (Apr 12) fixed the Protocol annotation with TYPE_CHECKING stubs — the last open item.

CI: ✅ All checks passing
Mergeable: ✅ Clean merge with main
Threads: 17/17 resolved
Dependency: Waiting on PR #84 (simulation foundation) to merge first

@awsarron — this is ready for re-review. Once #84 merges, this can follow immediately.


🤖 Pipeline analysis by AI agent. Strands Agents. Feedback welcome!

@cagataycali
Member Author

📋 Review Status Summary

Hi @awsarron — consolidating the current state of this PR to help with re-review.

Thread Resolution: ✅ 17/17 resolved

All 17 review threads have been addressed and resolved:

| Reviewer | Topics Covered | Status |
|----------|----------------|--------|
| @awsarron | Module naming (mujoco vs sim-mujoco), private function exports removed, _ensure_mujoco centralized to init, headless platform support docs, mixin coupling reduced, action↔method drift test added, XML parsing consistency (ElementTree vs regex) | ✅ All resolved |
| @yinsong1986 | SimulationBackend ABC inheritance, self._lock thread safety, XML injection validation, overwrite default safety, total_reward cleanup, tempfile.mktemp → NamedTemporaryFile, dead code removal, frame-drop strictness, executor reuse, sim-mujoco dependency naming | ✅ All resolved |

Key changes since CHANGES_REQUESTED:

  • Simulation now inherits from SimulationBackend ABC
  • Thread lock properly acquired around model/data mutations
  • XML name validation: ^[a-zA-Z0-9_-]+$ pattern enforced
  • overwrite defaults to False with FileExistsError
  • tempfile.NamedTemporaryFile replaces mktemp
  • Single reused ThreadPoolExecutor instead of per-call creation
  • Action↔method mapping test added (catches enum drift)
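
To make the name-validation rule concrete, here is a small sketch of enforcing the `^[a-zA-Z0-9_-]+$` pattern before a name is interpolated into XML. The helper name `validate_xml_name` and the error text are hypothetical; only the pattern itself comes from the list above.

```python
import re

# Pattern quoted in the review summary; fullmatch() is used so that a
# trailing newline cannot slip past the `$` anchor.
_NAME_RE = re.compile(r"^[a-zA-Z0-9_-]+$")


def validate_xml_name(name: str) -> str:
    """Return the name unchanged if it is safe to embed in XML, else raise."""
    if not isinstance(name, str) or _NAME_RE.fullmatch(name) is None:
        raise ValueError(
            f"invalid name {name!r}: only letters, digits, '_' and '-' are allowed"
        )
    return name
```

Rejecting anything outside this alphabet rules out quotes and angle brackets, which is what closes the injection path.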

CI: ✅ Passing

Latest commit status: SUCCESS

Dependency context

This PR depends on #84 (simulation foundation, also 50/50 resolved) and is a prerequisite for #86 (Robot factory).


🤖 Automated review triage by Strands Agents. Feedback welcome!

@cagataycali cagataycali requested a review from awsarron April 17, 2026 16:30
cagataycali added a commit to cagataycali/robots that referenced this pull request Apr 17, 2026
Move _xml, _robot_base_xml, and _tmpdir from SimWorld into a generic
_backend_state dict. Each backend stores its format-specific data there
instead of polluting the base class with implementation details.

Addresses @awsarron review: 'how can we avoid having implementation
details (Mujoco) in base classes like this?'

The MuJoCo backend (PR strands-labs#85) will store these in
world._backend_state['xml'], etc. during rebase.
@cagataycali cagataycali modified the milestones: v0.4.0, v0.3.9 Apr 21, 2026
@cagataycali
Member Author

✅ All review feedback addressed — ready for re-approval

Summary of latest cycle (2026-05-06):

@yinsong1986 — your 26-item review from earlier today has been fully addressed in commits d5e3355 and 29d4d3ea:

| Category | Items | Status |
|----------|-------|--------|
| P0 lock gaps (physics.py, dispatch, scene_ops) | 12 | ✅ Fixed: RLock + _dispatch_action acquires before all handlers |
| P0 destroy/stream races | 3 | ✅ Fixed: destroy() delegates to cleanup(), stream holds lock |
| P0 numerical (depth buffer) | 1 | ✅ Fixed: linearized to meters via znear/zfar |
| P1 step cap / renderer / dimensions | 4 | ✅ Fixed: 100k cap, LRU(4), 4096×4096 ceiling |
| P1 send_action contract / features shape | 2 | ✅ Fixed: ABC docstring clarified, _build_features accepts dims |
| P2 + nits | 4 | ✅ Fixed: duplicate assert, camera resolution, save_episode |
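
One way to combine a step cap with a lock that is released between batches is sketched below. This is an illustration only: the 100k cap comes from the summary above, while `step_in_batches`, the batch size of 100, and the use of a `threading.Event` for cooperative stop are assumptions, not the PR's code.

```python
import threading
from typing import Callable

MAX_STEPS = 100_000  # cap mentioned in the review summary
BATCH_SIZE = 100     # assumed value, chosen only for illustration


def step_in_batches(
    n_steps: int,
    lock: threading.RLock,
    step_once: Callable[[], None],
    stop_event: threading.Event,
    batch_size: int = BATCH_SIZE,
) -> int:
    """Run up to n_steps, releasing the lock after every batch.

    Returns the number of steps actually executed.
    """
    if not 0 < n_steps <= MAX_STEPS:
        raise ValueError(f"n_steps must be in 1..{MAX_STEPS}, got {n_steps}")
    done = 0
    while done < n_steps and not stop_event.is_set():
        with lock:
            for _ in range(min(batch_size, n_steps - done)):
                step_once()
                done += 1
        # The lock is free here, so another thread (for example a stop
        # request) can acquire it before the next batch starts.
    return done
```

The stop flag is only checked at batch boundaries, so a smaller batch size trades throughput for faster reaction to a stop request.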

All 88 review threads resolved. CI green. reviewDecision shows stale CHANGES_REQUESTED from earlier rounds.

Would you be able to submit a formal Approve to unblock merge? 🙏


🤖 AI agent response. Strands Agents. Feedback welcome!

@sundargthb sundargthb left a comment

Read this end-to-end — the MjSpec refactor, the lock audit, and the cooperative-stop ordering are great. Five small things inline, none blocking, but (mesh/peer_id), (tool description vs enum), and ([mujoco] vs [sim-mujoco]) all touch the v0.3.9 5-line promise so worth a quick look before merge.

Comment thread strands_robots/simulation/mujoco/simulation.py Outdated
Comment thread strands_robots/simulation/mujoco/simulation.py Outdated
Comment thread strands_robots/simulation/factory.py Outdated
Comment thread strands_robots/simulation/mujoco/rendering.py Outdated
…tmp path

Addresses all 4 unresolved comments from @sundargthb (2026-05-06):

1. simulation.py L127: Change mesh=True → mesh=False default to prevent
   silent behavior-flip when PR strands-labs#98 (Zenoh mesh) lands. The hasattr(self,
   'mesh') guard at L2006 is permanently False today — keeping mesh=True
   as default would auto-enable networking on upgrade without user action.

2. simulation.py L1544: Tool description now lists all 61 actions from the
   enum (was ~38). Grouped by category for LLM readability. Prevents
   action discovery gap for flagship features (replace_scene_mjcf, raycast,
   get_jacobian, inverse_dynamics, eval_policy, etc.).

3. factory.py L177: Error message now prints 'pip install strands-robots
   [sim-mujoco]' instead of '[mujoco]'. Added backend→extra mapping dict
   so future backends get correct suggestions too.

4. rendering.py L647: Replaced hard-coded '/tmp/strands_robots/recordings'
   with tempfile.gettempdir()-based path. Works on Windows, survives Mac
   reboots, and the tool_spec.json description updated accordingly.

All changes: ruff clean, mypy clean, 105 tests pass.
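
For item 4, a portable default recordings directory built on tempfile.gettempdir() might look like the sketch below. The helper name `default_recordings_dir` and its `create` flag are hypothetical; only the `strands_robots/recordings` suffix and the use of `tempfile.gettempdir()` with `os.path.join` come from the commit message.

```python
import os
import tempfile


def default_recordings_dir(create: bool = True) -> str:
    """Build a per-platform recordings path under the system temp directory."""
    path = os.path.join(tempfile.gettempdir(), "strands_robots", "recordings")
    if create:
        os.makedirs(path, exist_ok=True)  # safe to call repeatedly
    return path
```
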
@cagataycali
Member Author

✅ Addressed all 4 @sundargthb review comments

Pushed commit 111117a with fixes:

| # | Issue | Fix |
|---|-------|-----|
| 1 | mesh=True dead scaffolding → silent behavior-flip when #98 lands | Changed default to mesh=False. The hasattr(self, "mesh") guard at L2006 is permanently False today; this prevents auto-enabling Zenoh networking on upgrade without user action. |
| 2 | Tool description lists ~38/61 actions; LLM can't discover flagship features | Description now enumerates all 61 actions grouped by category: [World], [Robots], [Objects], [Physics], [Scene MJCF], etc. |
| 3 | Error message says pip install strands-robots[mujoco] but extra is [sim-mujoco] | Added _BACKEND_EXTRAS mapping dict; error now prints the correct [sim-mujoco] extra. Future backends get correct suggestions too. |
| 4 | Hard-coded /tmp/strands_robots/recordings fails on Windows, wiped on Mac | Replaced with tempfile.gettempdir() + os.path.join(...). Updated tool_spec.json description accordingly. |

Quality checks: ruff ✅ | mypy ✅ | 105 non-GPU tests pass ✅


🤖 AI agent response. Strands Agents. Feedback welcome!

sundargthb previously approved these changes May 6, 2026
@sundargthb sundargthb left a comment

Approved. One minor thing - if the hasattr is permanently False today by design, would self.mesh: Any = None in __init__ be cleaner than relying on the missing-attribute case?

Three bugs fixed in one commit:

1. **macOS path validation** (_path_validation.py)
   validate_save_path() was silently accepting /etc/passwd and other
   sensitive paths on macOS because os.path.realpath resolves /etc,
   /var, /tmp to /private/etc, /private/var, /private/tmp via symlinks.
   The blocked-prefix startswith check then missed them.

   Fix: on darwin, add /private/-prefixed variants of every Linux
   blocked prefix to BLOCKED_PREFIXES.

   Tests: 5 previously-failing tests now pass, plus 2 new regression
   tests (test_darwin_includes_private_variants,
   test_linux_excludes_private_variants) pin the platform semantics.

2. **eject_robot state preservation** (scene_ops.py)
   eject_robot_from_scene rebuilt the scene from scratch, silently
   resetting surviving robots' joints to qpos=0 and snapping objects
   back to their spawn pose. Agents calling remove_robot mid-scene
   lost physics state with no warning.

   Fix: snapshot per-joint (qpos, qvel) by fully-qualified name BEFORE
   rebuild, restore by name AFTER fresh compile. Matches the 'Per-name
   state copy, not flat index' rule from AGENTS.md — flat-index copy
   is unsafe because body/joint indices shift when a robot is removed.

   New helpers: _snapshot_joint_state, _restore_joint_state (both
   module-private, name-based, width-checked per joint type).

   Tests: 3 new regression tests
   (test_surviving_robot_joint_state_is_preserved,
   test_surviving_object_freejoint_pose_is_preserved,
   test_ejected_robot_state_is_not_restored).

3. **mesh / peer_id explicit init** (simulation.py) — addresses
   @sundargthb's approval nit.

   mesh and peer_id constructor params were never assigned to self,
   making hasattr(self, 'mesh') at L2006 permanently False and
   masking the future PR strands-labs#98 wire-up. Now explicitly initialised as
   attributes so the guard becomes a plain truthy check.

   Future-compatible: when PR strands-labs#98 lands and replaces self.mesh with a
   real mesh object, the cleanup path will start working without
   further changes.

Tests: 1275 pass, 1 skipped, 0 failures (was 1255 pass + 5 fail).
Lint: ruff + mypy clean.
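
The "per-name state copy, not flat index" rule from item 2 can be shown on plain lists. In the sketch below, a model is reduced to an ordered list of (joint name, qpos width) pairs; the helper names, the layout representation, and the example joint names are hypothetical, and only qpos is handled (the commit message also snapshots qvel).

```python
def snapshot_joint_state(layout, qpos):
    """Slice a flat qpos vector into {joint_name: values} using the layout.

    layout is an ordered list of (joint_name, width) pairs, e.g. width 1 for
    a hinge joint and 7 for a free joint.
    """
    state, offset = {}, 0
    for name, width in layout:
        state[name] = list(qpos[offset:offset + width])
        offset += width
    return state


def restore_joint_state(layout, qpos, state):
    """Write saved values back by name into a (possibly re-indexed) qpos."""
    offset = 0
    for name, width in layout:
        saved = state.get(name)
        # Skip joints that no longer exist or whose width changed.
        if saved is not None and len(saved) == width:
            qpos[offset:offset + width] = saved
        offset += width
    return qpos
```

Usage: snapshot with the old layout before the rebuild, then restore with the new layout after it. Joints of the removed robot simply have no slot in the new layout, so their saved values are ignored, while surviving joints land at their new offsets.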
@cagataycali cagataycali force-pushed the feat/mujoco-backend branch from 4c3b085 to 0911df2 May 6, 2026 21:30
@cagataycali cagataycali changed the title from "feat: MuJoCo simulation backend - AgentTool with 35 actions" to "feat: MuJoCo simulation backend - AgentTool with 50+ actions" May 6, 2026
Contributor

@yinsong1986 yinsong1986 left a comment

Follow-up review after the 8 new commits (notably d5e3355 lock-discipline pass). Most of my earlier punch list is resolved: lock gaps in physics.py, _dispatch_action serialization, destroy join-before-null, save_state → mjSTATE_FULLPHYSICS, PolicyRunner.run init ordering, DatasetRecorder.save_episode post-failure poisoning, bounded renderer cache, and step() batch release are all in. Nice work.

Three items still open, pinned inline. Leaning toward: fix the depth-units one in this PR (incorrect unit label will bake into downstream consumers), defer the other two to a follow-up if preferred.

Comment thread strands_robots/simulation/mujoco/rendering.py Outdated
Comment thread strands_robots/simulation/mujoco/rendering.py Outdated
Comment thread strands_robots/dataset_recorder.py
Comment thread strands_robots/simulation/mujoco/simulation.py Outdated
Comment thread strands_robots/simulation/mujoco/simulation.py
Comment thread strands_robots/simulation/mujoco/scene_ops.py
…mera validation

P0: Linearize OpenGL depth buffer to metric depth (meters) in render_depth().
    MuJoCo renderer returns normalized [0,1] values; now converted via
    z = znear*zfar / (zfar - d*(zfar - znear)) using model.vis.map.znear/zfar
    scaled by stat.extent.

P1: start_cameras_recording now fails loudly when user-specified camera names
    don't resolve (same strict-validation policy as render()/render_depth()),
    instead of silently dropping unresolved names.

P1: Plumb video_width/video_height through DatasetRecorder.create() →
    _build_features() so the declared LeRobot feature shape matches the
    actual render dimensions. Previously hardcoded (3, 480, 640) regardless
    of VideoConfig.

P2: Add math.isfinite() guards to set_timestep and set_gravity. Previously
    NaN passed the 'timestep <= 0' check (nan <= 0 is False) and inf passed
    too — both corrupt model.opt silently.

nit: Expanded mj_model/mj_data property docstrings to document that reads
     also race with a running PolicyRunner worker for direct Python consumers
     (agent flows are serialized automatically via _dispatch_action lock).
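
The depth formula in the P0 item can be checked numerically with a few lines of plain Python. The sketch below assumes the convention stated in the commit message (znear/zfar are scaled by the model extent before use); the function names and the example numbers are illustrative, not taken from the PR's render_depth().

```python
def clip_planes_in_meters(znear_frac: float, zfar_frac: float, extent: float):
    """Scale fractional near/far clip planes by the model extent."""
    return znear_frac * extent, zfar_frac * extent


def linearize_depth(d: float, znear: float, zfar: float) -> float:
    """Map a normalized depth-buffer value d in [0, 1] to metric depth.

    Implements z = znear*zfar / (zfar - d*(zfar - znear)).
    """
    return znear * zfar / (zfar - d * (zfar - znear))
```

Two sanity checks follow directly from the formula: d = 0 gives znear and d = 1 gives zfar, and the mapping is strongly non-linear in between, which is why a raw buffer value must not be labelled as meters.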
@cagataycali
Member Author

✅ Final 3 items from 2026-05-07 review addressed

@yinsong1986 - your follow-up review (2026-05-07 00:28) identified 3 remaining items. All fixed in commit a125d944 (CI passed 00:52):

| # | Priority | Item | Status |
|---|----------|------|--------|
| 1 | P0 | Depth buffer linearization (OpenGL normalized -> metric meters) | Fixed - znear*zfar / (zfar - d*(zfar-znear)) via model.vis.map.znear/zfar scaled by stat.extent |
| 2 | P1 | start_cameras_recording strict camera validation | Fixed - fails loudly on unresolved camera names (same pattern as render()/render_depth()) |
| 3 | P1 | _build_features video dimensions plumbing | Fixed - video_width/video_height threaded through DatasetRecorder.create() -> _build_features() |

Bonus: Added math.isfinite() guards to set_timestep and set_gravity (P2 nit).
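
Why the isfinite guard is needed is easy to demonstrate: every comparison with NaN is False, so a plain `timestep <= 0` check lets NaN through. The sketch below uses hypothetical validator names and error messages; the scalar-to-[0, 0, z] expansion before the finiteness check follows the ordering described elsewhere in this thread.

```python
import math


def validate_timestep(timestep: float) -> float:
    """Reject NaN, infinities, and non-positive values."""
    if not math.isfinite(timestep) or timestep <= 0:
        raise ValueError(f"timestep must be a finite positive number, got {timestep!r}")
    return float(timestep)


def validate_gravity(gravity) -> list[float]:
    """Accept a scalar z value or a 3-vector; every component must be finite."""
    if isinstance(gravity, (int, float)):
        vec = [0.0, 0.0, float(gravity)]  # expand the scalar first...
    else:
        vec = [float(g) for g in gravity]
    # ...then check finiteness, so scalar NaN/inf is caught as well.
    if len(vec) != 3 or not all(math.isfinite(g) for g in vec):
        raise ValueError(f"gravity must be 3 finite numbers, got {gravity!r}")
    return vec
```
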

All 98 review threads resolved. CI green. sundargthb approved. Ready for re-review at your convenience.


🤖 AI agent status update. Strands Agents.

Contributor

@yinsong1986 yinsong1986 left a comment

Follow-up after a125d944. All three open items from the previous review are resolved: depth linearization (with correct stat.extent scaling), _build_features shape plumbing, and NaN/inf guards + regression tests. Also nice touch on the mj_model/mj_data docstrings.

One subtle regression flagged inline on the new start_cameras_recording strictness check — it can reject calls that _active_camera_list was designed to resolve via namespace suffix. Plus two small notes. Fix is ~3 lines; otherwise LGTM.

Comment thread strands_robots/simulation/mujoco/rendering.py Outdated
Comment thread strands_robots/simulation/mujoco/rendering.py
Comment thread strands_robots/simulation/mujoco/rendering.py Outdated
Comment thread tests/simulation/mujoco/test_input_validation.py
- **P1 regression fix** (start_cameras_recording): `_active_camera_list`
  now returns `(resolved, unresolved_inputs)` so strict validation
  operates on actual user input, not the already-namespace-resolved list.
  Previously `start_cameras_recording(cameras=['side'])` was rejected
  when the scene had 'arm1/side' even though suffix resolution succeeded.
  Same fix applied to `render_all`.
- **Nit**: document the mujoco>=3.0 `stat.extent` convention — znear/zfar
  in `model.vis.map` are fractions of the model's bounding scale, not
  absolute meters. Cross-references pyproject mujoco>=3.2 pin.
- **Nit**: extend the ARB_clip_control warning to note that linearized
  Min/Max values remain in meters but lose precision on this path —
  downstream consumers should treat them as approximate.
- **Nit**: add regression tests for scalar `set_gravity(float('nan'))`
  and `set_gravity(float('inf'))` — guards against future refactors
  moving the `isfinite` loop ahead of the scalar->[0,0,z] expansion.
- **Regression tests**: `TestCamerasRecordingSuffixResolution` pins the
  suffix-resolution fix with 3 cases (short name resolves, bogus name
  still errors, mixed resolvable+bogus fails loudly).
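
The tuple-return fix can be pictured as follows. This is a hedged sketch rather than the PR's `_active_camera_list`: the helper names are hypothetical, and treating an ambiguous suffix (more than one match) as unresolved is an assumption made here for illustration.

```python
def resolve_cameras(requested, available):
    """Return (resolved, unresolved_inputs) for user-supplied camera names.

    A name resolves if it matches exactly, or if exactly one namespaced
    camera ends with "/<name>" (for example "side" -> "arm1/side").
    """
    resolved, unresolved = [], []
    for name in requested:
        if name in available:
            resolved.append(name)
            continue
        matches = [cam for cam in available if cam.endswith("/" + name)]
        if len(matches) == 1:
            resolved.append(matches[0])
        else:
            unresolved.append(name)  # unknown or ambiguous (assumed policy)
    return resolved, unresolved


def start_recording(requested, available):
    """Strict validation: any unresolved input fails the whole call."""
    resolved, unresolved = resolve_cameras(requested, available)
    if unresolved:
        raise ValueError(
            f"unknown cameras {unresolved}; cameras are namespaced, "
            f"available: {sorted(available)}"
        )
    return resolved
```

Validating against `unresolved` instead of comparing the user's input to the resolved list is what keeps short names like "side" working while bogus names still fail.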
@cagataycali
Member Author

✅ All 4 items from 2026-05-07 02:06 review addressed

@yinsong1986 — commit faf4f24 addresses every item from your follow-up review:

| # | Priority | Item | Fix |
|---|----------|------|-----|
| 1 | P1 | start_cameras_recording strict-validation regression (rejected valid namespace-suffix resolutions) | _active_camera_list now returns (resolved, unresolved_inputs) tuple. Strict check operates on actual user input, not the already-namespaced resolved list. Same fix applied to render_all. |
| 2 | nit | Missing stat.extent convention comment | Added explanatory comment block above the linearization referencing the MuJoCo ≥3.0 fraction-of-extent convention + cross-references pyproject mujoco>=3.2 pin. |
| 3 | nit | ARB_clip_control warning didn't mention degraded precision | Extended warning text to note that linearized Min/Max remain in meters but precision is degraded; treat as approximate. |
| 4 | nit | No scalar set_gravity(float('nan')) test | Added test_set_gravity_scalar_nan_errors + test_set_gravity_scalar_inf_errors, which pin the scalar→[0,0,z] expansion path. |

Regression tests

TestCamerasRecordingSuffixResolution (3 cases):

  • test_short_name_resolves_to_namespaced: 'side' resolves to 'arm1/side' and records successfully (was spuriously rejected pre-fix)
  • test_bogus_name_still_errors — unknown names still fail loudly
  • test_mixed_resolvable_and_bogus — any unresolvable input fails the whole call (don't silently shrink)

Quality gate

  • hatch run lint
  • hatch run format
  • hatch run test tests/simulation/: 562 passed, 1 skipped (pre-existing GL skip)

All review threads addressed. Ready for final pass.


🤖 AI agent response. Strands Agents. Feedback welcome!

@cagataycali
Member Author

Gentle ping @yinsong1986 — all items from your May 7 review are addressed in faf4f24 (CI green, 102/102 threads resolved). Happy to clarify anything if needed. 🙏


🤖 AI agent response. Strands Agents. Feedback welcome!

@cagataycali
Member Author

@yinsong1986 Gentle ping — faf4f24 addressed the camera suffix regression and 3 nits from your May 7 review. All 102 threads resolved, CI green. When you have a moment to verify and approve, that would unblock the pipeline. No rush — just keeping the thread warm. 🙏


🤖 Scheduled cycle. Strands Agents.

Contributor

@yinsong1986 yinsong1986 left a comment

LGTM.

Reviewed across four rounds; every P0/P1 from the original 23-comment review plus all follow-up items are resolved with fixes + regression tests:

  • Lock discipline across physics.py and _dispatch_action (serialized via RLock)
  • destroy joins running policy Futures before nulling _world
  • Depth linearization with correct stat.extent scaling (MuJoCo >= 3.2 convention documented)
  • start_cameras_recording strict validation preserves namespace-suffix resolution (tuple-return from _active_camera_list)
  • DatasetRecorder._build_features receives actual video_width/video_height
  • save_state uses mjSTATE_FULLPHYSICS (ctrl + qfrc_applied captured)
  • NaN/Inf rejected on both vector and scalar set_gravity / set_timestep paths, with regression tests guarding the expansion ordering
  • Renderer cache bounded (4 per thread, LRU-evicted + closed)
  • step() caps + per-batch lock release so stop_policy can interleave
  • PolicyRunner.run initialises start_time/step_count before try
  • DatasetRecorder.save_episode poisons the recorder on failure
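
As an illustration of the bounded renderer cache item above, here is a minimal sketch under stated assumptions: `RendererCache` is a hypothetical class, the cache key (for example a width/height pair) is assumed, and the renderer is anything with a `close()` method.

```python
import threading
from collections import OrderedDict


class RendererCache:
    """Hypothetical per-thread LRU cache that closes evicted renderers."""

    def __init__(self, factory, max_size=4):
        self._factory = factory          # callable: key -> object with .close()
        self._max_size = max_size
        self._local = threading.local()  # one OrderedDict per thread

    def _entries(self):
        if not hasattr(self._local, "entries"):
            self._local.entries = OrderedDict()
        return self._local.entries

    def get(self, key):
        entries = self._entries()
        if key in entries:
            entries.move_to_end(key)     # mark as most recently used
            return entries[key]
        renderer = self._factory(key)
        entries[key] = renderer
        while len(entries) > self._max_size:
            _, evicted = entries.popitem(last=False)  # least recently used
            evicted.close()              # release resources instead of leaking
        return renderer
```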

Nice sustained iteration. Good provenance on the commit messages too (timestamps tying changes to specific review comments).
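
The "step() caps + per-batch lock release" item in the list above can be sketched as follows. The class name, cap, and batch size are assumptions for illustration; the point is only that the lock is dropped between batches so a stop request from another thread can get in.

```python
import threading


class BatchedStepper:
    """Hypothetical stand-in for a simulation with a lock-guarded step loop."""

    MAX_STEPS = 10_000  # assumed upper bound on a single step() call
    BATCH = 100         # assumed number of steps per lock acquisition

    def __init__(self):
        self._lock = threading.RLock()
        self._stop = threading.Event()
        self.steps_done = 0

    def stop(self):
        # Acquires the same lock, so it runs between batches, never mid-batch.
        with self._lock:
            self._stop.set()

    def step(self, n_steps):
        n_steps = min(n_steps, self.MAX_STEPS)
        remaining = n_steps
        while remaining > 0 and not self._stop.is_set():
            batch = min(self.BATCH, remaining)
            with self._lock:
                for _ in range(batch):
                    self.steps_done += 1  # stands in for one physics step
            remaining -= batch            # lock is released here
        return n_steps - remaining        # steps actually executed
```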

@cagataycali
Member Author

@sundargthb — your earlier approval was dismissed by GitHub’s stale-review protection (new commits pushed after your May 6 review to address @yinsong1986’s follow-up items). All your inline feedback was addressed in 111117a.

@yinsong1986 has now formally approved after 4 review rounds (May 10). The merge is blocked waiting for a second current approval to satisfy branch protection.

Could you re-review and re-approve when you get a moment? The only changes since your review are:

  • faf4f24 — camera suffix resolution regression fix (3 lines + tests)
  • Addressed yinsong1986’s depth linearization nits

All 102 review threads resolved. CI green. Thanks! 🙏


🤖 Scheduled cycle. Strands Agents.

@cagataycali cagataycali dismissed awsarron’s stale review May 11, 2026 05:23

All review items addressed and verified. PR has two fresh APPROVED reviews from yinsong1986 and sundargthb. CI green. 102/102 threads resolved.

@cagataycali cagataycali merged commit 07fce7f into strands-labs:main May 11, 2026
1 check passed
@github-project-automation github-project-automation Bot moved this from In review to Done in Strands Labs - Robots May 11, 2026
cagataycali added a commit to cagataycali/robots that referenced this pull request May 11, 2026
Rebased on main (post-PR strands-labs#85 merge). Changes:

- New strands_robots/robot.py: Robot() factory function
  - Default mode='sim' (safe — never sends commands to hardware)
  - mode='real' for explicit hardware opt-in
  - mode='auto' probes USB for servo controllers
- Rename old robot.py → hardware_robot.py (HardwareRobot class)
- Updated __init__.py: lazy imports for Simulation, SimWorld, SimRobot,
  SimObject, SimCamera, list_robots, create_simulation, etc.
- Auto-configure MUJOCO_GL at import time (before mujoco locks backend)
- Registry updates: minor em-dash formatting consistency
- Tests: test_robot_factory.py, test_registry.py, test_registry_integrity.py,
  test_user_registry.py

All review threads (14/14) previously resolved. Addresses feedback from
@awsarron and @yinsong1986.
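
The mode selection described in this commit message might be sketched as below. `resolve_mode` and `probe_hardware` are hypothetical names, and falling back to simulation when the probe finds nothing is an assumption, not confirmed by the commit.

```python
from typing import Callable

VALID_MODES = ("sim", "real", "auto")


def resolve_mode(
    mode: str = "sim",
    probe_hardware: Callable[[], bool] = lambda: False,
) -> str:
    """Hypothetical helper: map a requested mode to 'sim' or 'real'."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if mode == "auto":
        # Assumption: use simulation when no servo controller is detected.
        return "real" if probe_hardware() else "sim"
    return mode
```

Defaulting to `'sim'` means a caller who passes no arguments never reaches the hardware path.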
cagataycali added a commit to cagataycali/robots that referenced this pull request May 11, 2026
… registration

Add Newton GPU simulation backend stub implementing the SimEngine ABC.
All methods raise NotImplementedError — actual implementations will follow
in subsequent PRs (world lifecycle, step/action/obs, objects, diffsim).

Changes:
- strands_robots/simulation/newton/ — NewtonSimulation, NewtonConfig
- tests/simulation/newton/ — 81 tests (config, factory, lazy import, simulation)
- pyproject.toml — [newton] extra (warp-lang + newton-sim), excluded from [all]
  since newton-sim is not yet on PyPI
- mypy overrides — added warp.* and newton.* to ignore list
- Fixed method signatures to match SimEngine ABC (list_robots, robot_joint_names,
  get_observation, run_policy) after PR strands-labs#85 updated the base class
_TOOL_SPEC_PATH = Path(__file__).parent / "tool_spec.json"


class Simulation(
Member

To keep naming consistent, could we consider MuJoCoSimEngine? I think Simulation is too broad once we introduce more simulation engines like Isaac Sim, Newton, etc.

Comment thread pyproject.toml
"lerobot>=0.5.0,<0.6.0",
]
sim = [
sim-mujoco = [
Member

nit: will we add robot_descriptions to every individual sim dependency group? From previous PRs we kept it in sim intentionally as it's broader. sim-mujoco and others like sim-newton could include strands-robots[sim]


def __init__(
    self,
    tool_name: str = "sim",
Member

Should all tools have unique names by default in strands-robots, i.e. rename this to mujoco_simulation?

# Fail fast: verify MuJoCo is importable at construction time
# so consumers catch missing-dependency errors immediately.
self._mj = _ensure_mujoco()
logger.info("🎮 Simulation tool '%s' initialized", tool_name)
Member

nit: remove emojis from logs


# World Management

def _cheap_robot_count(self) -> int:
Member

is this worth moving outside of mujoco so that it's reusable elsewhere? Doesn't seem specific to mujoco simulation

# J2 · PHYSICS PROBE - every physics introspection method on a live sim


def test_j2_physics_probe_every_mixin_method(sim):
Member

j1, j2, etc. feels superfluous

r = sim.add_camera(name=name, position=p, target=t)
assert r["status"] == "success", r

sim.step(n_steps=20)
Member

It would be worth having tests that validate the actual sim state: e.g. have a red and a blue object, run a policy that grabs one of them in sim, and validate that the policy actually ran, moved the robot, and grabbed the desired object. This validates that everything functioned as expected end-to-end. At the moment a lot of these integ tests set up objects, step the sim, and then check the objects again, which is better covered by unit tests than by integ tests.

@@ -0,0 +1,314 @@
"""End-to-end MuJoCo simulation test with Policy ABC.
Member

unit tests ideally mock mujoco (the core dep), integ tests would not mock it
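
One way to follow this suggestion is to swap a fake `mujoco` module into `sys.modules` for the duration of a unit test. The sketch below uses only the standard library; `ensure_mujoco` is a simplified, hypothetical stand-in for the PR's `_ensure_mujoco()`, whose real body is not shown in this thread.

```python
import sys
from unittest import mock


def ensure_mujoco():
    """Hypothetical stand-in: import mujoco lazily and return the module."""
    import mujoco  # the import system consults sys.modules first

    return mujoco


def test_ensure_mujoco_with_mock():
    fake = mock.MagicMock(name="mujoco")
    # patch.dict restores sys.modules when the block exits, so the fake
    # module does not leak into other tests.
    with mock.patch.dict(sys.modules, {"mujoco": fake}):
        assert ensure_mujoco() is fake
```

An integration test would skip the patch and exercise the real package instead.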

@@ -0,0 +1,717 @@
"""T1/T13: AgentTool router contract - unknown kwargs rejected, required args friendly,
Member

I would have unit tests file map 1:1 with filenames in src dir

@@ -0,0 +1,717 @@
"""T1/T13: AgentTool router contract - unknown kwargs rejected, required args friendly,
Member

superfluous notes for T1/T13

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

sim/mujoco: replace string-concat MJCFBuilder with mujoco.MjSpec AST [umbrella]

5 participants