Genentech/Orion

Orion: Towards Lab Automation with Computer-Using Agents

Website  |  Paper  |  Data  |  Setup


About

Biomedical imaging produces rich visual data, but analysis is bottlenecked by labor-intensive workflows that depend on expert use of multiple GUI-driven software packages and on integrating knowledge from many sources. Orion is an autonomous computer-using agent for biomedical image analysis that combines large language models with terminal execution, GUI control, and adaptive multi-step reasoning inside a shared computing environment. A single Orion agent can inspect visual data, drive standard scientific software (CellProfiler, QuPath, napari, …), mine web resources, and run end-to-end analysis and interpretation workflows without bespoke setup per tool.

[Figure: Orion overview (planning, file browsing, code execution, software use, self-checking, post-processing, report generation)]

Under the hood, Orion drives a real, running Ubuntu desktop (clicking through GUIs, running shell commands, browsing the web, and reading multi-channel microscopy and pathology images), so one agent can take on biology lab tasks end-to-end instead of being limited to text or single-tool calls.

Repository layout

Orion/
├── sweagent/                 # Agent runtime (built on SWE-agent, with Orion-specific additions)
│   ├── run/                  # Entry points: run_orion.py + 5 run_orion_<task>.py
│   ├── agent/                # orionagents.py
│   ├── deployment/           # swe-rex deployment for vmware & docker ...
│   ├── environment/          # swe-rex environment for vmware & docker ...
│   └── operator/             # OS support for GUI
├── config/                   # Per-task YAML agent configs (templates, step caps, memory toggles)
├── tools/                    # Agent-callable tools (cell_image_tools, cellprofiler, jump, web_browser, napari_viewer_tools, …)
├── subagents/                # Agent-callable GUI sub-agent prompts / configs (figure interpreter, web browser, image viewer, …)
├── scripts/                  # Bash launchers — set env vars and call `python -m sweagent.run.run_orion_<task>`
├── baselines/                # Non-Orion comparison runners (microscopy_baselines, deepresearch_benchmarks, JUMP_discovery)
├── communication_test/       # `vmrun` smoke test for the host ↔ VM swerex pipeline
├── workflow_data/            # Persisted per-task workflow memory
├── setup.md                  # Full installation guide (host conda env + VMware Ubuntu guest)
├── pyproject.toml            # Project metadata + dependencies (single source of truth)
├── requirements.txt          # Flat dependency list (kept as a pinning fallback)
└── LICENSE                   # MIT

Datasets

The supported tasks are backed by five datasets. Orion-hosted datasets live under the orion-agent Hugging Face organization; external benchmarks keep their original locations.

  • Microscopy-Dataset (orion-agent/Microscopy-Dataset). CellProfiler-Training and -Testing problems (segmentation / counting) for the microscopy image processing task.
  • Pathology-Dataset (orion-agent/Pathology-Dataset). Whole-slide images + QuPath workflows for digital pathology analysis (Tutorial / Segmentation / tp53_index splits).
  • JUMPDiscovery (orion-agent/JUMPDiscovery). JUMP Cell Painting morphological profiles for the discovery task. All figures and reports generated by Orion are available at orion-agent/JUMPDiscovery_results.
  • LAB-Bench & MicroVQA: data is downloaded directly from Hugging Face at run time.

The guest Ubuntu image (with CellProfiler / QuPath / napari pre-installed) is set up as described in VM_SETUP.md. A pre-installed version for arm64 machines is available at orion-agent/orion-vm-arm.

Each dataset contains subfolders, one per independent task. The ProblemStatement.md in each task subfolder holds the instructions given to the Orion agent. Test suites and ground-truth labels are provided for each task (except discovery tasks, which require manual evaluation).

JUMPDiscovery and Pathology-Dataset both contain large files that are slow to download and to move from the host to the guest machine. We therefore provide download scripts that can be handed directly to the agent, so it downloads files only when it needs them.
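As a rough illustration of the layout described above, the sketch below enumerates the task subfolders of a locally downloaded dataset by looking for ProblemStatement.md. The helper name list_tasks is hypothetical and not part of the Orion codebase, and it assumes the statement sits directly inside each immediate subfolder of the dataset directory.

```python
from pathlib import Path


def list_tasks(dataset_dir: str) -> dict[str, str]:
    """Map each task subfolder name to the text of its ProblemStatement.md.

    Assumption (for illustration only): every task is an immediate subfolder
    of dataset_dir and keeps its statement at <task>/ProblemStatement.md.
    Subfolders without a statement file are skipped.
    """
    tasks = {}
    for sub in sorted(Path(dataset_dir).iterdir()):
        statement = sub / "ProblemStatement.md"
        if sub.is_dir() and statement.is_file():
            tasks[sub.name] = statement.read_text(encoding="utf-8")
    return tasks
```

Calling list_tasks("./JUMPDiscovery") after a download would then give a quick overview of which tasks are present; if the real datasets nest tasks more deeply, the directory walk would need to be adjusted.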

Quick start

Full instructions live in setup.md, with VM-specific details in VM_SETUP.md. The short version, once you have VMware Fusion / Workstation (vmrun on PATH) and the OSWorld Ubuntu guest image:

# 1. Install Orion on the host (see setup.md for the install options)
git clone <orion-repo>
cd Orion

pip install -e .

# 2. Configure API keys + VM paths (see setup.md § "Running the agent")
cp .env.example .env
$EDITOR .env

# 3. Smoke-test the host ↔ VM channel (one-shot)
python communication_test/test_vmware_deployment.py

# 4. Download a dataset (example: JUMPDiscovery into ./JUMPDiscovery)
hf download orion-agent/JUMPDiscovery --repo-type dataset --local-dir ./JUMPDiscovery

# 5. Run a task
bash scripts/run_orion_jump_discovery.sh       # or any of the 4 launchers

Equivalent CLI forms after pip install -e .:

orion jump-discovery                                   # via the `orion` console script
python -m sweagent jump-discovery                      # via the package dispatcher
python -m sweagent.run.run_orion_jump_discovery        # direct module call

All three resolve to the same run_orion_jump_discovery.main() entry point. Task settings (resume mode, output directory, benchmark split, etc.) are read from environment variables exported by the launcher scripts — see Supported tasks for the full list of launchers and the configs each one uses.
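To make the naming convention concrete, here is a minimal sketch of how a task name such as jump-discovery could map onto its module path and how settings might be pulled from the environment. Both helper names (module_for_task, read_settings) and the fallback output directory are hypothetical; only the module naming pattern and the RESUME_OUTPUT_DIR variable come from this README.

```python
import os


def module_for_task(task: str) -> str:
    """Translate a CLI task name into the module path pattern used here.

    Example: "jump-discovery" -> "sweagent.run.run_orion_jump_discovery".
    """
    return "sweagent.run.run_orion_" + task.replace("-", "_")


def read_settings(env=None) -> dict:
    """Collect run settings from environment variables.

    RESUME_OUTPUT_DIR is named in this README; the "./outputs" fallback is
    an assumption made purely for illustration.
    """
    env = os.environ if env is None else env
    return {"output_dir": env.get("RESUME_OUTPUT_DIR", "./outputs")}
```

With Orion installed, importlib.import_module(module_for_task("jump-discovery")).main() would be expected to reach the same entry point, assuming main() takes no required arguments.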

How a run looks

  1. The launcher script sources .env, exports task-specific env vars, and invokes the matching run_orion_<task> module.
  2. sweagent.run.run_orion.build_deployment(...) boots the VMware guest via vmrun, starts swerex-remote on port 4000, and opens a session.
  3. reset_desktop_for_problem(...) pushes the host's tools/napari_viewer_tools/napari_server/ onto the guest at /home/user/server/napari_server, launches the napari helper service, and resets the DesktopEnv.
  4. The agent loads its YAML config from config/, reads the problem statement from inside the VM, and proceeds turn-by-turn: think → act → observe, with screenshots, accessibility trees, and tool calls routed through swerex-remote (port 4000) and the DesktopEnv HTTP server (port 5000) running inside the guest.
  5. Trajectories, intermediate files, and the final answer are written under RESUME_OUTPUT_DIR (a host path you set in the launcher).
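The think → act → observe cycle in step 4 can be pictured with a toy loop like the one below. This is a simplified sketch, not Orion's actual agent code: the function run_episode, the "submit" sentinel, and the toy callbacks are all hypothetical, and max_steps merely stands in for a step cap such as max_trajectory_length.

```python
from typing import Callable


def run_episode(
    think: Callable[[str], str],
    act: Callable[[str], str],
    first_observation: str,
    max_steps: int,
) -> list[tuple[str, str]]:
    """Run a think -> act -> observe loop and return the trajectory.

    think maps the latest observation to an action string; act executes the
    action and returns the next observation. The loop ends when think
    returns "submit" (a hypothetical sentinel) or after max_steps actions.
    """
    trajectory = []
    observation = first_observation
    for _ in range(max_steps):
        action = think(observation)
        if action == "submit":
            break
        observation = act(action)
        trajectory.append((action, observation))
    return trajectory


# Toy stand-ins: the "model" submits once it has seen the word "done".
def toy_think(observation: str) -> str:
    return "submit" if "done" in observation else "ls"


def toy_act(action: str) -> str:
    return "done"


print(run_episode(toy_think, toy_act, "start", max_steps=5))
# prints [('ls', 'done')]
```

In the real system the observation would carry screenshots, accessibility trees, and tool output rather than a plain string, but the control flow (bounded iteration with an early exit) follows the same shape.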

While the run is in progress you can watch it happen in real time on the Ubuntu machine — the cursor moves, applications open, and files appear / update under the working directory as the agent acts.
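If nothing appears to move on the guest desktop, one quick diagnostic is to check whether the two guest-side services mentioned above accept TCP connections. The sketch below is a generic socket probe, not an Orion utility; the helper names are hypothetical, the guest address is something you would have to supply, and whether the ports are reachable directly from the host depends on your VM network setup.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_guest(host: str) -> dict[str, bool]:
    """Probe the two guest-side ports named in this README."""
    return {
        "swerex-remote (4000)": port_open(host, 4000),
        "DesktopEnv HTTP server (5000)": port_open(host, 5000),
    }
```

A False entry only says that the TCP connection failed; it does not distinguish between a service that has not started and a network path that is blocked.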

Configuration

  • Agent prompts & limits: config/ (README). One YAML per task; tune max_trajectory_length, system_template, instance_template, and the workflow-memory toggles.
  • Launcher env vars: scripts/run_orion_*.sh. Each script documents its inputs (mode, benchmark split, resume path, problem source, …) inline.
  • Host .env: collects API keys (ANTHROPIC_API_KEY, AZURE_OPENAI_API_KEY_EUS2, GEMINI_API_KEY, HF_API_TOKEN, GITHUB_API_KEY) and VM/host paths (HOST_VM_FILE, VM_USER, VM_PASSWORD, VM_WORKING_DIR, ENV_NAME, …). See setup.md § "Running the agent" for the full list.
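Before launching a run it can be handy to confirm that the host .env actually defines the variables you expect. The following is a minimal sketch under stated assumptions: the helpers parse_env and missing_keys are hypothetical, the parser only understands plain KEY=VALUE lines (no export prefixes or multi-line values), and the EXPECTED_KEYS subset is an illustrative guess, since which keys a given run needs depends on the model backend and is documented in setup.md.

```python
from pathlib import Path

# Keys named in this README. Which of them a particular run really needs is
# an assumption here; setup.md has the authoritative list.
EXPECTED_KEYS = [
    "ANTHROPIC_API_KEY",
    "HOST_VM_FILE",
    "VM_USER",
    "VM_PASSWORD",
    "VM_WORKING_DIR",
]


def parse_env(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(path: str, expected=EXPECTED_KEYS) -> list[str]:
    """Return the expected keys that are absent or empty in the .env file."""
    values = parse_env(path)
    return [key for key in expected if not values.get(key)]
```

Running missing_keys(".env") from the repository root returns an empty list when every expected key has a non-empty value, and otherwise names the ones still to be filled in.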

Supported tasks

Four task families ship out of the box. Each pairs a bash launcher under scripts/ with one or more YAML agent configs under config/.

Baselines

baselines/ contains comparison runners (LLM-only, search-augmented, and agent-style) for each task family. They have their own .env and dependency layer:

$EDITOR baselines/.env
python baselines/deepresearch_benchmarks/run_biomni_benchmark.py --benchmarks MicroVQA

See baselines/README.md for the per-folder layout (microscopy_baselines/, deepresearch_benchmarks/, JUMP_discovery/) and the env-var reference.

Acknowledgements

Orion builds on the open-source SWE-agent agent-computer interface and SWE-rex remote runtime by Yang, Jimenez, Wettig, Lieret, Yao, Narasimhan, and Press; on the work of the OSWorld team, for both the Ubuntu guest image and the DesktopEnv code; and on community tools and benchmarks including CellProfiler, QuPath, napari, the JUMP Cell Painting consortium dataset, LAB-Bench, and MicroVQA.

License

MIT — see LICENSE. Copyright © 2026 Genentech, Inc.
