Apply the same augmentation chain to all camera images by 0xadvait · Pull Request #970 · Physical-Intelligence/openpi

0xadvait · 2026-06-12T00:18:12Z

Problem

preprocess_observation passes the same rng for every camera, which suggests augmentation parameters are meant to be consistent across cameras within a frame. They are not: augmax.Chain splits its key once per sub-transform, and the base camera chain has 4 transforms while wrist chains have 1, so ColorJitter draws different subkeys for base vs wrist. The same physical frame gets visibly different hue/brightness/contrast on the base camera vs the wrist cameras (#859 has a visual repro).
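
The key divergence described above can be reproduced with `jax.random` alone. This is a minimal sketch, not the augmax source: `chain_subkeys` is a hypothetical helper that assumes the chain splits the incoming key into one subkey per sub-transform, as stated above, and the brightness draw is purely illustrative.

```python
import jax
import numpy as np

# Assumption: a chain hands one subkey to each sub-transform by splitting
# the incoming key into len(transforms) pieces, as the description states.
def chain_subkeys(rng, num_transforms):
    return jax.random.split(rng, num_transforms)

rng = jax.random.PRNGKey(0)  # the single per-frame key shared by all cameras

# Base camera: crop, resize, rotate, jitter -> ColorJitter sits at index 3.
base_jitter_key = chain_subkeys(rng, 4)[3]
# Wrist camera: jitter only -> ColorJitter sits at index 0 of a 1-way split.
wrist_jitter_key = chain_subkeys(rng, 1)[0]

keys_differ = not np.array_equal(
    np.asarray(base_jitter_key), np.asarray(wrist_jitter_key)
)

# Illustrative only: draw one brightness offset from each subkey.
base_brightness = float(jax.random.uniform(base_jitter_key, minval=-0.3, maxval=0.3))
wrist_brightness = float(jax.random.uniform(wrist_jitter_key, minval=-0.3, maxval=0.3))
print(keys_differ, base_brightness, wrist_brightness)
```

Passing the same `rng` to both chains is therefore not enough; the subkey that reaches ColorJitter also depends on the chain length and on the transform's position in it.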

Why this divergence matters

The pi0.5 paper (arXiv 2504.16054, Appendix E) describes the training recipe as applying the full chain to every input image:

"We apply image augmentation (random crop, resizing, rotation, and color jittering) to all input images using the following hyper-parameters and in this order"

followed by exactly the RandomCrop(0.95) -> Resize -> Rotate(+-5) -> ColorJitter(0.3, 0.4, 0.5) chain this file uses. The wrist special case deviates from the recipe used to train the released checkpoints, and as a side effect breaks cross-camera color consistency through the Chain key-splitting described above.

Fix

Apply the full augmentation chain to every camera image. With identical chains and the shared per-frame rng, all cameras receive identical augmentation parameters for a given frame, restoring cross-camera color consistency and matching the published recipe.

If the wrist exclusion from geometric augmentation was intentional, the minimal alternative is to keep the wrist chain geometric-free but give ColorJitter a dedicated key shared across cameras. Happy to switch this PR to that variant.
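
A rough sketch of that alternative, assuming a per-frame split into one geometric key and one color key. Everything here is hypothetical and simplified: `augment_frame`, `jitter_brightness`, the camera names, and the horizontal shift standing in for crop/resize/rotate are illustrations, not openpi or augmax code.

```python
import jax
import jax.numpy as jnp

def jitter_brightness(color_key, image, max_delta=0.3):
    # Hypothetical stand-in for color jitter: one brightness offset per key.
    delta = jax.random.uniform(color_key, (), minval=-max_delta, maxval=max_delta)
    return jnp.clip(image + delta, -1.0, 1.0)

def augment_frame(rng, images, geometric_cameras=("base",)):
    # Split once per frame: one key for geometry, one dedicated key for color.
    geom_key, color_key = jax.random.split(rng)
    out = {}
    for name, image in images.items():
        if name in geometric_cameras:
            # Placeholder for crop/resize/rotate: a random horizontal shift.
            shift = int(jax.random.randint(geom_key, (), -2, 3))
            image = jnp.roll(image, shift, axis=1)
        # Every camera uses the same color key, whatever its geometric chain.
        out[name] = jitter_brightness(color_key, image)
    return out

frame = {name: jnp.zeros((8, 8, 3)) for name in ("base", "left_wrist", "right_wrist")}
augmented = augment_frame(jax.random.PRNGKey(0), frame)
```

Because the color key no longer depends on how many geometric transforms precede the jitter, wrist cameras can skip geometry and still share the base camera's color parameters.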

Test

Adds test_preprocess_observation_train_augmentations_consistent_across_cameras. The test feeds identical images to all three cameras with a fixed key, then asserts that the augmented outputs are bitwise identical across cameras, differ from the input, and stay within [-1, 1], and that train=False passes images through unchanged. Runs on CPU.
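
The shape of those assertions can be sketched against a toy function. `toy_preprocess` below is a hypothetical stand-in for `preprocess_observation` (whose real signature is not shown here); it only mimics "same chain, same key, every camera" with a shift and a brightness offset.

```python
import jax
import jax.numpy as jnp
import numpy as np

def toy_preprocess(rng, images, train):
    # Hypothetical stand-in: apply the same chain with the same key per camera.
    if not train:
        return dict(images)
    out = {}
    for name, image in images.items():
        shift_key, jitter_key = jax.random.split(rng)
        shift = int(jax.random.randint(shift_key, (), 1, 4))  # stand-in for crop/rotate
        image = jnp.roll(image, shift, axis=1)
        delta = jax.random.uniform(jitter_key, (), minval=0.1, maxval=0.3)  # stand-in for jitter
        out[name] = jnp.clip(image + delta, -1.0, 1.0)
    return out

def check_consistency():
    rng = jax.random.PRNGKey(0)
    image = jnp.linspace(-1.0, 1.0, 8 * 8 * 3).reshape(8, 8, 3)
    images = {name: image for name in ("base", "left_wrist", "right_wrist")}

    augmented = toy_preprocess(rng, images, train=True)
    outputs = [np.asarray(v) for v in augmented.values()]
    # Identical inputs and a shared key must give bitwise-identical outputs.
    assert all(np.array_equal(outputs[0], o) for o in outputs[1:])
    # The augmentation must actually change the image.
    assert not np.array_equal(outputs[0], np.asarray(image))
    # Outputs must stay in the normalized range.
    assert outputs[0].min() >= -1.0 and outputs[0].max() <= 1.0
    # With train=False, images pass through unchanged.
    passthrough = toy_preprocess(rng, images, train=False)
    assert all(
        np.array_equal(np.asarray(passthrough[k]), np.asarray(images[k])) for k in images
    )
    return True
```

Using identical images on every camera is what makes the bitwise comparison meaningful: any difference between outputs can only come from differing augmentation parameters.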

Fixes #859

augmax.Chain splits its rng once per sub-transform, so with 4 transforms on the base camera and 1 on wrist cameras, ColorJitter drew different parameters for base vs wrist views of the same frame even though the same rng is passed for every camera. The pi0.5 paper (Appendix E) applies the full crop/resize/rotate/jitter chain to all input images, so this change applies the full chain to every camera, which also makes augmentation parameters consistent across cameras within a frame. Adds a regression test for cross-camera consistency. Fixes Physical-Intelligence#859

0xadvait requested a review from kvablack as a code owner June 12, 2026 00:18

0xadvait wants to merge 1 commit into Physical-Intelligence:main from 0xadvait:fix/consistent-image-augmentation
