feat: Add `json+cuda_ipc` array encoding for GPU-direct tensor transfer by dionhaefner · Pull Request #588 · pasteurlabs/tesseract-core

dionhaefner · 2026-05-11T12:13:08Z

Summary

- New `cuda_ipc` array encoding that passes CUDA IPC memory handles instead of serialized tensor bytes
- Framework-agnostic: works with any GPU array that implements `__cuda_array_interface__` (PyTorch, CuPy, JAX, Numba)
- Uses ctypes calls to libcudart directly, so the encode path has no PyTorch or CuPy dependency
- Decode path requires CuPy (returns `cupy.ndarray`); consumers convert via `torch.as_tensor()` or DLPack as needed
- Containers launched with `json+cuda_ipc` automatically get `--ipc=host`
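To make the encode path in the summary concrete, here is a minimal sketch of how an IPC handle could be obtained with ctypes and wrapped in a JSON-friendly payload. It is not the code from this PR: the function names, the payload keys (`encoding`, `shape`, `dtype`, `handle`), and the base64 wrapping of the handle are assumptions made for illustration. Only `cudaIpcGetMemHandle`, the 64-byte `cudaIpcMemHandle_t`, and the `__cuda_array_interface__` keys come from the CUDA runtime and the interface spec.

```python
import base64
import ctypes
from typing import Any, Callable

CUDA_IPC_HANDLE_SIZE = 64  # size in bytes of cudaIpcMemHandle_t in the CUDA runtime API


class CudaIpcMemHandle(ctypes.Structure):
    """Mirror of the opaque cudaIpcMemHandle_t struct (64 reserved bytes)."""

    _fields_ = [("reserved", ctypes.c_char * CUDA_IPC_HANDLE_SIZE)]


def get_ipc_handle(device_ptr: int, lib_name: str = "libcudart.so") -> bytes:
    """Ask the CUDA runtime for an IPC handle to the allocation behind device_ptr."""
    cudart = ctypes.CDLL(lib_name)  # raises OSError if the CUDA runtime is missing
    cudart.cudaIpcGetMemHandle.argtypes = [
        ctypes.POINTER(CudaIpcMemHandle),
        ctypes.c_void_p,
    ]
    cudart.cudaIpcGetMemHandle.restype = ctypes.c_int
    handle = CudaIpcMemHandle()
    status = cudart.cudaIpcGetMemHandle(ctypes.byref(handle), ctypes.c_void_p(device_ptr))
    if status != 0:  # 0 is cudaSuccess
        raise RuntimeError(f"cudaIpcGetMemHandle failed with CUDA error {status}")
    return bytes(handle)  # raw 64 bytes, including any embedded zero bytes


def encode_cuda_ipc(
    array: Any, handle_getter: Callable[[int], bytes] = get_ipc_handle
) -> dict:
    """Build a metadata-only payload (hypothetical layout) for a GPU array."""
    iface = array.__cuda_array_interface__
    if iface.get("strides") is not None:
        # strides=None means C-contiguous; other layouts are out of scope here.
        raise ValueError("only C-contiguous arrays are handled in this sketch")
    device_ptr, _read_only = iface["data"]
    # Assumption: device_ptr is the start of its allocation. A pointer into a
    # pooled allocation would likely also need an offset, which is omitted here.
    return {
        "encoding": "cuda_ipc",
        "shape": list(iface["shape"]),
        "dtype": iface["typestr"],
        "handle": base64.b64encode(handle_getter(device_ptr)).decode("ascii"),
    }
```

The `handle_getter` parameter exists only so the payload logic can be exercised on a machine without a GPU; with a real CuPy array or CUDA tensor the default getter would call into libcudart.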

What problem does this solve?

Tesseract's data path is CPU-bound. Array encoding serializes via JSON, base64, or binref, and copies data to the CPU at every boundary. For tight composition loops (optimization, MCMC), the GPU→CPU→serialize→network→CPU→GPU round-trip dominates wall time.

With cuda_ipc, tensors stay on the GPU and the CPU handles only metadata such as shape and dtype. This is essentially binref for GPU memory.
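As a rough back-of-the-envelope illustration of the "metadata only" point, the sketch below compares the number of characters a base64-encoded tensor occupies on the wire with the size of a base64-encoded 64-byte CUDA IPC handle. The helper names are hypothetical, and encoding the handle as base64 is an assumption, not something this PR states.

```python
def base64_len(nbytes: int) -> int:
    """Length of the padded base64 encoding of nbytes raw bytes."""
    return 4 * ((nbytes + 2) // 3)


def wire_sizes(num_elements: int, itemsize: int = 4) -> dict:
    """Compare payload sizes for a tensor sent as base64 vs. as an IPC handle."""
    tensor_bytes = num_elements * itemsize
    return {
        "base64": base64_len(tensor_bytes),  # grows with the tensor
        "cuda_ipc_handle": base64_len(64),   # constant: cudaIpcMemHandle_t is 64 bytes
    }


# 10 million float32 values: roughly 53 MB of base64 text vs. 88 characters.
print(wire_sizes(10_000_000))
```

The handle size does not depend on the tensor, which is why the CPU-side cost stays flat as arrays grow.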

Usage

from tesseract_core import Tesseract

# Container path
t = Tesseract.from_image("my_gpu_tesseract", gpus=["0"], output_format="json+cuda_ipc")
t.serve()
# cupy_array: a cupy.ndarray, torch tensor, or any __cuda_array_interface__ object
result = t.apply({"x": cupy_array})

# Local path
t = Tesseract.from_tesseract_api("tesseract_api.py", output_format="json+cuda_ipc")

Requirements

- CUDA runtime (`libcudart.so`) on both producer and consumer
- CuPy for decoding (`pip install cupy-cuda12x`)
- `--ipc=host` for cross-container IPC (handled automatically by `engine.py`)
- Both processes must see the same physical GPU
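A quick way to check the first two requirements before switching formats might look like the following. This is a hypothetical helper, not part of tesseract-core, and it uses only the standard library; note that `ctypes.util.find_library` may miss a libcudart that was installed via a pip wheel rather than system-wide.

```python
import ctypes.util
import importlib.util


def check_cuda_ipc_requirements() -> dict:
    """Report whether the CUDA runtime and CuPy appear to be available."""
    return {
        # Searches the system's usual library locations for libcudart.
        "libcudart": ctypes.util.find_library("cudart") is not None,
        # Checks that CuPy is importable without actually importing it.
        "cupy": importlib.util.find_spec("cupy") is not None,
    }


if __name__ == "__main__":
    for name, available in check_cuda_ipc_requirements().items():
        print(f"{name}: {'found' if available else 'missing'}")
```

The remaining requirements (`--ipc=host` and a shared physical GPU) depend on how the containers are launched and cannot be verified from inside a single process this way.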

codecov · 2026-05-11T12:15:24Z

Codecov Report

❌ Patch coverage is 21.77419% with 97 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.39%. Comparing base (54d5c74) to head (1889fd0).

Files with missing lines	Patch %	Lines
tesseract_core/runtime/array_encoding.py	21.35%	80 Missing and 1 partial ⚠️
tesseract_core/sdk/tesseract.py	26.66%	10 Missing and 1 partial ⚠️
tesseract_core/sdk/engine.py	0.00%	2 Missing and 1 partial ⚠️
tesseract_core/runtime/file_interactions.py	33.33%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #588      +/-   ##
==========================================
+ Coverage   67.22%   75.39%   +8.17%     
==========================================
  Files          32       32              
  Lines        4519     4638     +119     
  Branches      743      765      +22     
==========================================
+ Hits         3038     3497     +459     
+ Misses       1237      831     -406     
- Partials      244      310      +66


PasteurBot · 2026-05-11T12:20:14Z

Benchmark Results

Benchmarks use a no-op Tesseract to measure pure framework overhead.

🚀 0 faster, ⚠️ 0 slower, ✅ 36 unchanged

✅ No significant performance changes detected.

Full results

Benchmark	Baseline	Current	Change	Status
`api/apply_1,000`	0.402ms	0.393ms	-2.2%	✅
`api/apply_100,000`	0.408ms	0.400ms	-1.9%	✅
`api/apply_10,000,000`	0.407ms	0.397ms	-2.5%	✅
`cli/apply_1,000`	1672.317ms	1648.496ms	-1.4%	✅
`cli/apply_100,000`	1702.285ms	1680.354ms	-1.3%	✅
`cli/apply_10,000,000`	1766.081ms	1730.622ms	-2.0%	✅
`decoding/base64_1,000`	0.027ms	0.027ms	-1.5%	✅
`decoding/base64_100,000`	0.751ms	0.758ms	+0.9%	✅
`decoding/base64_10,000,000`	139.384ms	139.584ms	+0.1%	✅
`decoding/binref_1,000`	0.170ms	0.172ms	+1.3%	✅
`decoding/binref_100,000`	0.262ms	0.264ms	+0.8%	✅
`decoding/binref_10,000,000`	27.732ms	28.095ms	+1.3%	✅
`decoding/json_1,000`	0.089ms	0.089ms	-0.0%	✅
`decoding/json_100,000`	8.255ms	8.280ms	+0.3%	✅
`decoding/json_10,000,000`	1116.804ms	1121.226ms	+0.4%	✅
`encoding/base64_1,000`	0.034ms	0.034ms	-0.7%	✅
`encoding/base64_100,000`	0.204ms	0.206ms	+0.7%	✅
`encoding/base64_10,000,000`	66.182ms	65.200ms	-1.5%	✅
`encoding/binref_1,000`	0.236ms	0.242ms	+2.2%	✅
`encoding/binref_100,000`	0.410ms	0.412ms	+0.5%	✅
`encoding/binref_10,000,000`	30.667ms	30.439ms	-0.7%	✅
`encoding/json_1,000`	0.117ms	0.119ms	+1.2%	✅
`encoding/json_100,000`	10.918ms	11.576ms	+6.0%	✅
`encoding/json_10,000,000`	1296.752ms	1317.191ms	+1.6%	✅
`http/apply_1,000`	2.856ms	2.910ms	+1.9%	✅
`http/apply_100,000`	8.743ms	9.234ms	+5.6%	✅
`http/apply_10,000,000`	930.312ms	941.730ms	+1.2%	✅
`roundtrip/base64_1,000`	0.071ms	0.070ms	-1.0%	✅
`roundtrip/base64_100,000`	1.122ms	1.127ms	+0.4%	✅
`roundtrip/base64_10,000,000`	206.272ms	208.074ms	+0.9%	✅
`roundtrip/binref_1,000`	0.421ms	0.418ms	-0.7%	✅
`roundtrip/binref_100,000`	0.661ms	0.662ms	+0.2%	✅
`roundtrip/binref_10,000,000`	59.083ms	59.370ms	+0.5%	✅
`roundtrip/json_1,000`	0.220ms	0.217ms	-1.1%	✅
`roundtrip/json_100,000`	18.543ms	17.888ms	-3.5%	✅
`roundtrip/json_10,000,000`	2413.511ms	2415.169ms	+0.1%	✅

Runner: Linux 6.17.0-1010-azure x86_64

The 'Merge branch main into dion/gpu-go-brr' conflict resolution left output_format/timeout params at 7-space indent and an over-length is_leaf lambda. ruff-format clean now so the pre-commit/CI gate passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

add cuda_ipc encoding

67c3150

dionhaefner assigned jpbrodrick89 May 22, 2026

jpbrodrick89 and others added 2 commits May 29, 2026 16:30

Merge branch 'main' into dion/gpu-go-brr

3f018ff

dionhaefner wants to merge 3 commits into main from dion/gpu-go-brr


3 participants
