Biomedical imaging produces rich visual data, but analysis is bottlenecked by labor-intensive workflows that depend on expert use of multiple GUI-driven software packages and on integrating knowledge from many sources. Orion is an autonomous computer-using agent for biomedical image analysis that combines large language models with terminal execution, GUI control, and adaptive multi-step reasoning inside a shared computing environment. A single Orion agent can inspect visual data, drive standard scientific software (CellProfiler, QuPath, napari, …), mine web resources, and run end-to-end analysis and interpretation workflows without bespoke setup per tool.
Under the hood, Orion drives a real Ubuntu desktop: it clicks through GUIs, runs shell commands, browses the web, and reads multi-channel microscopy and pathology images, so one agent can take on biology lab tasks end-to-end instead of being limited to text or single-tool calls.
Orion/
├── sweagent/ # Agent runtime (built on SWE-agent, customized for Orion)
│ ├── run/ # Entry points: run_orion.py + 5 run_orion_<task>.py
│ ├── agent/ # orionagents.py
│ ├── deployment/ # swe-rex deployment for vmware & docker ...
│ ├── environment/ # swe-rex environment for vmware & docker ...
│ └── operator/ # OS support for GUI
├── config/ # Per-task YAML agent configs (templates, step caps, memory toggles)
├── tools/ # Agent-callable tools (cell_image_tools, cellprofiler, jump, web_browser, napari_viewer_tools, …)
├── subagents/ # Agent-callable GUI sub-agent prompts / configs (figure interpreter, web browser, image viewer, …)
├── scripts/ # Bash launchers — set env vars and call `python -m sweagent.run.run_orion_<task>`
├── baselines/ # Non-Orion comparison runners (microscopy_baselines, deepresearch_benchmarks, JUMP_discovery)
├── communication_test/ # `vmrun` smoke test for the host ↔ VM swerex pipeline
├── workflow_data/ # Persisted per-task workflow memory
├── setup.md # Full installation guide (host conda env + VMware Ubuntu guest)
├── pyproject.toml # Project metadata + dependencies (single source of truth)
├── requirements.txt # Flat dependency list (kept as a pinning fallback)
└── LICENSE # MIT
The supported tasks are backed by five datasets. Orion-hosted datasets live under the orion-agent Hugging Face organization; external benchmarks keep their original locations.
- Microscopy-Dataset: orion-agent/Microscopy-Dataset. CellProfiler-Training and -Testing problems (segmentation / counting) for the microscopy image processing task.
- Pathology-Dataset: orion-agent/Pathology-Dataset. Whole-slide images and QuPath workflows for digital pathology analysis (Tutorial / Segmentation / tp53_index splits).
- JUMPDiscovery: orion-agent/JUMPDiscovery. JUMP Cell Painting morphological profiles for the discovery task. All figures and reports generated by Orion are available at orion-agent/JUMPDiscovery_results.
- LAB-Bench & MicroVQA: data is downloaded directly from Hugging Face at run time.
The guest Ubuntu image (with CellProfiler / QuPath / napari pre-installed) is set up as described in VM_SETUP.md. A pre-installed image for arm64 machines is available at orion-agent/orion-vm-arm.
Each dataset contains subfolders, each corresponding to an independent task. The ProblemStatement.md file in each task subfolder holds the instructions given to the Orion agent. Test suites and ground-truth labels are provided for each task, except discovery tasks, which require manual evaluation.
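To illustrate the layout described above, here is a minimal sketch that walks a downloaded dataset and collects each task's problem statement. Only the `ProblemStatement.md` file name comes from the dataset description; the function name `list_tasks` and the returned dictionary shape are assumptions made for this example, not part of Orion's code.

```python
from pathlib import Path

# File name taken from the dataset description above.
PROBLEM_FILE = "ProblemStatement.md"


def list_tasks(dataset_root):
    """Return {task_folder_name: problem_statement_text}.

    A subfolder counts as a task only if it contains ProblemStatement.md;
    stray files and folders without a statement are skipped.
    """
    tasks = {}
    for sub in sorted(Path(dataset_root).iterdir()):
        statement = sub / PROBLEM_FILE
        if sub.is_dir() and statement.is_file():
            tasks[sub.name] = statement.read_text(encoding="utf-8")
    return tasks
```

Calling `list_tasks("./Microscopy-Dataset")` on a local copy would then give one entry per task folder, which is a convenient way to check that a download is complete before launching a run.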
JUMPDiscovery and Pathology-Dataset both contain large files that are slow to download and to copy from the host to the guest machine. We therefore provide download scripts that can be handed directly to the agent, so it fetches files only as needed.
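The fetch-on-demand idea can be sketched as follows. This is not the download script shipped with the datasets; it is an illustrative helper that assumes the `huggingface_hub` package is installed and uses its `hf_hub_download` function. The helper name `fetch_if_missing` and the `downloader` parameter (added so the logic can be exercised offline) are hypothetical.

```python
from pathlib import Path


def fetch_if_missing(repo_id, filename, local_dir, downloader=None):
    """Download one dataset file only if it is not already present locally."""
    target = Path(local_dir) / filename
    if target.exists():
        return target  # already fetched, so no network access is needed

    if downloader is None:
        # Imported lazily so the helper can be read and tested
        # without the dependency installed.
        from huggingface_hub import hf_hub_download
        downloader = hf_hub_download

    path = downloader(
        repo_id=repo_id,
        filename=filename,
        repo_type="dataset",
        local_dir=str(local_dir),
    )
    return Path(path)
```

A call such as `fetch_if_missing("orion-agent/JUMPDiscovery", "<path inside the repo>", "./JUMPDiscovery")` would download a single file on first use and return the cached copy afterwards; the file path inside the repository depends on the dataset layout and is left as a placeholder here.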
Full instructions live in setup.md, with VM-specific details in
VM_SETUP.md. The short version, once you
have VMware Fusion / Workstation (vmrun on PATH) and the OSWorld Ubuntu
guest image:
# 1. Install Orion on the host (pick ONE of A / B)
git clone <orion-repo>
cd Orion
pip install -e .
# 2. Configure API keys + VM paths (see setup.md § "Running the agent")
cp .env.example .env
$EDITOR .env
# 3. Smoke-test the host ↔ VM channel (one-shot)
python communication_test/test_vmware_deployment.py
# 4. Download a dataset (example: JUMPDiscovery into ./JUMPDiscovery)
hf download orion-agent/JUMPDiscovery --repo-type dataset --local-dir ./JUMPDiscovery
# 5. Run a task
bash scripts/run_orion_jump_discovery.sh # or any of the other launchers

Equivalent CLI forms after `pip install -e .`:
orion jump-discovery # via the `orion` console script
python -m sweagent jump-discovery # via the package dispatcher
python -m sweagent.run.run_orion_jump_discovery # direct module call

All three resolve to the same `run_orion_jump_discovery.main()` entry point. Task
settings (resume mode, output directory, benchmark split, etc.) are read from
environment variables exported by the launcher scripts — see
Supported tasks for the full list of launchers and the
configs each one uses.
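One way to picture how a task name on the command line could resolve to a single `main()` is the sketch below. It is an assumption-laden illustration, not Orion's actual dispatcher: the hyphen-to-underscore rule is inferred from the one example above (`jump-discovery` and `run_orion_jump_discovery`), and the names `module_for_task` and `run_task` are hypothetical.

```python
import importlib

# Module prefix taken from the CLI forms shown above.
MODULE_PREFIX = "sweagent.run.run_orion_"


def module_for_task(task):
    """Map a CLI task name such as 'jump-discovery' to a module path.

    Assumes hyphens in the task name become underscores in the module name.
    """
    return MODULE_PREFIX + task.replace("-", "_")


def run_task(task, importer=importlib.import_module):
    """Import the task module and call its main() entry point."""
    module = importer(module_for_task(task))
    return module.main()
```

Routing every form through one function like this keeps the console script, the package dispatcher, and the direct module call from drifting apart, since none of them carries task logic of its own.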
- The launcher script sources `.env`, exports task-specific env vars, and invokes the matching `run_orion_<task>` module.
- `sweagent.run.run_orion.build_deployment(...)` boots the VMware guest via `vmrun`, starts `swerex-remote` on port 4000, and opens a session.
- `reset_desktop_for_problem(...)` pushes the host's `tools/napari_viewer_tools/napari_server/` onto the guest at `/home/user/server/napari_server`, launches the napari helper service, and resets the DesktopEnv.
- The agent loads its YAML config from `config/`, reads the problem statement from inside the VM, and proceeds turn by turn (think → act → observe), with screenshots, accessibility trees, and tool calls routed through `swerex-remote` (port 4000) and the DesktopEnv HTTP server (port 5000) running inside the guest.
- Trajectories, intermediate files, and the final answer are written under `RESUME_OUTPUT_DIR` (a host path you set in the launcher).
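Because the steps above depend on two services inside the guest, a quick reachability check can help when a run stalls at startup. The sketch below uses only the Python standard library; the port numbers come from the description above, while the helper names and the guest address you pass in are assumptions for illustration.

```python
import socket

# Ports taken from the run description above.
GUEST_PORTS = {"swerex-remote": 4000, "DesktopEnv HTTP server": 5000}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_guest(host, ports=GUEST_PORTS):
    """Return {service_name: reachable} for each expected guest service."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

For example, `check_guest("<guest-ip>")` returns a dictionary showing which of the two services accepts connections. An open port only shows that something is listening; it does not confirm that the service is healthy.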
While the run is in progress you can watch it happen in real time on the Ubuntu machine — the cursor moves, applications open, and files appear / update under the working directory as the agent acts.
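The think → act → observe cycle with a step cap can be sketched in a few lines. This is a simplified illustration rather than the code in `orionagents.py`: the `policy` and `env` interfaces, the `Trajectory` class, and the action dictionary format are all hypothetical, and only the `max_trajectory_length` name is borrowed from the YAML configs.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    steps: list = field(default_factory=list)  # (action, observation) pairs
    answer: object = None                      # set when the policy submits


def run_episode(policy, env, max_trajectory_length):
    """Loop think -> act -> observe until submission or the step cap."""
    traj = Trajectory()
    observation = env.reset()
    for _ in range(max_trajectory_length):
        action = policy(observation, traj.steps)   # think
        if action["type"] == "submit":
            traj.answer = action["value"]
            break
        observation = env.step(action)             # act, then observe
        traj.steps.append((action, observation))
    return traj
```

If the cap is reached before the policy submits, `answer` stays `None`, which makes truncated runs easy to tell apart from completed ones when reading trajectories back.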
- Agent prompts & limits: `config/` (README). One YAML per task; tune `max_trajectory_length`, `system_template`, `instance_template`, and the workflow-memory toggles.
- Launcher env vars: `scripts/run_orion_*.sh`. Each script documents its inputs (mode, benchmark split, resume path, problem source, …) inline.
- Host `.env`: collects API keys (`ANTHROPIC_API_KEY`, `AZURE_OPENAI_API_KEY_EUS2`, `GEMINI_API_KEY`, `HF_API_TOKEN`, `GITHUB_API_KEY`) and VM/host paths (`HOST_VM_FILE`, `VM_USER`, `VM_PASSWORD`, `VM_WORKING_DIR`, `ENV_NAME`, …). See setup.md § "Running the agent" for the full list.
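A small pre-flight check on the host `.env` can catch a missing key before the VM boots. The sketch below parses simple `KEY=value` lines and reports which names are absent or empty. The variable names come from the list above, but treating all of them as required is an assumption: a given task may need only some of the API keys, and setup.md remains the authoritative list. The function names are hypothetical.

```python
# Names taken from the list above; which ones a given run truly needs
# is an assumption, so pass your own list where appropriate.
EXPECTED_KEYS = [
    "ANTHROPIC_API_KEY", "AZURE_OPENAI_API_KEY_EUS2", "GEMINI_API_KEY",
    "HF_API_TOKEN", "GITHUB_API_KEY",
    "HOST_VM_FILE", "VM_USER", "VM_PASSWORD", "VM_WORKING_DIR", "ENV_NAME",
]


def parse_env(text):
    """Parse KEY=value lines, ignoring blanks, comments, and 'export '."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("export "):
            line = line[len("export "):]
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, expected=EXPECTED_KEYS):
    """Return the expected names that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in expected if not values.get(key)]
```

Running `missing_keys(open(".env").read())` before a launch gives a short list to fix. The parser is deliberately simple and does not handle multi-line values or variable interpolation.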
Four task families ship out of the box. Each pairs a bash launcher under
scripts/ with one or more YAML agent configs under config/:
- Microscopy image processing: `scripts/run_orion_cellprofiler.sh` (`vmware_cell.yaml`, `vmware_cell_test.yaml`). Build, debug, and run CellProfiler pipelines on segmentation / counting problems.
- Digital pathology analysis: `scripts/run_orion_qupath.sh` (`vmware_learn_qupath.yaml`, `vmware_qupath_metastasis.yaml`, `vmware_qupath_tma.yaml`). Learn from QuPath tutorials, then run pathology segmentation on lymph-node / TMA / Ki67 slides.
- Deep research: covers both literature/database QA (LAB-Bench DbQA / LitQA2 / FigQA via `scripts/run_orion_labbench.sh` with `vmware_labbench.yaml`) and multi-question visual QA on microscopy images (MicroVQA via `scripts/run_orion_microvqa.sh` with `vmware_microvqa.yaml`). The agent searches the web, reads papers/figures, and answers multiple-choice biomedical questions.
- Discovery: `scripts/run_orion_jump_discovery.sh` (`vmware_jumpdiscovery.yaml`). Mine the JUMP Cell Painting morphological-profile dataset for novel perturbation–phenotype hypotheses (up to 500 agent steps per run).
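The launcher-to-config pairing listed above can also be written down as data, which makes it easy to verify a checkout before starting a long run. In this sketch the script and YAML file names are copied from the list, while the short dictionary keys, the helper names, and the assumption that every config sits directly under `config/` are illustrative choices rather than facts about the repository.

```python
from pathlib import Path

# Launchers and configs as listed above; the short keys are hypothetical.
TASKS = {
    "cellprofiler": ("scripts/run_orion_cellprofiler.sh",
                     ["vmware_cell.yaml", "vmware_cell_test.yaml"]),
    "qupath": ("scripts/run_orion_qupath.sh",
               ["vmware_learn_qupath.yaml", "vmware_qupath_metastasis.yaml",
                "vmware_qupath_tma.yaml"]),
    "labbench": ("scripts/run_orion_labbench.sh", ["vmware_labbench.yaml"]),
    "microvqa": ("scripts/run_orion_microvqa.sh", ["vmware_microvqa.yaml"]),
    "jump_discovery": ("scripts/run_orion_jump_discovery.sh",
                       ["vmware_jumpdiscovery.yaml"]),
}


def launcher_command(task):
    """Return the argv list that starts the given task's launcher."""
    launcher, _configs = TASKS[task]
    return ["bash", launcher]


def missing_files(repo_root, task):
    """List launcher/config paths for a task that are absent on disk.

    Assumes configs live directly under config/ in the repository root.
    """
    launcher, configs = TASKS[task]
    expected = [launcher] + ["config/" + name for name in configs]
    return [p for p in expected if not (Path(repo_root) / p).is_file()]
```

`launcher_command("qupath")` can be passed straight to `subprocess.run`, and an empty result from `missing_files(".", "qupath")` suggests the checkout has the files that task expects.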
baselines/ contains comparison runners (LLM-only, search-augmented, and
agent-style) for each task family. They have their own .env and dependency
layer:
$EDITOR baselines/.env
python baselines/deepresearch_benchmarks/run_biomni_benchmark.py --benchmarks MicroVQA

See baselines/README.md for the per-folder layout
(microscopy_baselines/, deepresearch_benchmarks/, JUMP_discovery/)
and the env-var reference.
Orion builds on the open-source SWE-agent agent-computer interface and SWE-rex remote runtime by Yang, Jimenez, Wettig, Lieret, Yao, Narasimhan, and Press; on the Ubuntu guest image and DesktopEnv code from the OSWorld team; and on community tools and benchmarks including CellProfiler, QuPath, napari, the JUMP Cell Painting consortium dataset, LAB-Bench, and MicroVQA.
MIT — see LICENSE. Copyright © 2026 Genentech, Inc.

