Skip to content

fix(BA-6156): rewire TUS PATCH/HEAD to chunk-based store#11767

Closed
jopemachine wants to merge 8 commits into
fix/BA-6155-tus-upload-session-classfrom
fix/BA-6156-rewire-tus-handlers
Closed

fix(BA-6156): rewire TUS PATCH/HEAD to chunk-based store#11767
jopemachine wants to merge 8 commits into
fix/BA-6155-tus-upload-session-classfrom
fix/BA-6156-rewire-tus-handlers

Conversation

@jopemachine

@jopemachine jopemachine commented May 22, 2026

Copy link
Copy Markdown
Member

📚 Stacked PRs

This PR is part of a 5-PR stack implementing BA-3974 (epic: BA-6153). Merge in order:

  1. ⬇️ feat(BA-6155): add metadata-driven chunk-store upload session engine #11766feat(BA-6155): add metadata-driven chunk-store upload session engine
  2. 👉 fix(BA-6156): rewire TUS PATCH/HEAD to chunk-based store #11767fix(BA-6156): rewire TUS PATCH/HEAD to chunk-based store (actual user-visible fix) ← you are here
  3. ⬇️ test(BA-6157): add multi-proxy NFS race regression test #11768test(BA-6157): add multi-proxy NFS race regression test
  4. ⬇️ feat(BA-6158): support TUS Checksum extension #11769feat(BA-6158): support TUS Checksum extension
  5. ⬇️ feat(BA-6159): add /upload/status endpoint and progress headers #11770feat(BA-6159): add /upload/status endpoint + progress headers

Summary

  • Replace the legacy single-file append in `tus_upload_part` with `TusUploadSession`-backed per-offset chunk commits. Concurrent PATCH requests from multiple Storage Proxy replicas on a shared NFS mount no longer corrupt each other.
  • `tus_check_session` now returns `Upload-Offset` from `SessionState.committed_offset` instead of a raw `stat()` of the temp file.
  • `prepare_upload` (vfs) creates the session directory instead of touching an empty file, matching the new on-disk layout.
  • Handler tests rewritten to drive the real `TusUploadSession` through `tmp_path`.

Wire format on `/upload` (PATCH/HEAD/OPTIONS) stays TUS 1.0.0-compatible; existing tus-js-client based callers continue to work unchanged.

Resolves BA-6156. Part of epic BA-6153 (implements BA-3974).

@github-actions github-actions Bot added size:XL 500~ LoC comp:storage-proxy Related to Storage proxy component labels May 22, 2026
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from dada14e to 0d1bf13 Compare May 26, 2026 05:55
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 2a9d0eb to 193aece Compare May 26, 2026 08:29
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch 2 times, most recently from 7dcbab8 to 84e303d Compare May 26, 2026 08:50
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 193aece to 31eb871 Compare May 26, 2026 08:50
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from 84e303d to fc4dc36 Compare May 26, 2026 09:04
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 31eb871 to 0c2b3f8 Compare May 26, 2026 09:04
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from fc4dc36 to 968d6c3 Compare May 26, 2026 09:11
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 764ffad to 89e52ef Compare May 26, 2026 09:13
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch 2 times, most recently from b40ebc8 to a94454f Compare May 26, 2026 09:19
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch 2 times, most recently from 4499f92 to b84c12f Compare May 26, 2026 09:31
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch 2 times, most recently from 583cb92 to 1f2fc01 Compare May 28, 2026 05:02
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from b84c12f to c0c0db8 Compare May 28, 2026 05:10
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from 1f2fc01 to 4c07f47 Compare May 28, 2026 05:10
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from c0c0db8 to 4d1cb89 Compare May 28, 2026 06:01
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch 2 times, most recently from 58a1a95 to 681019d Compare May 28, 2026 08:41
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 4d1cb89 to 5c675ef Compare May 28, 2026 08:41
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 940cceb to a3f366f Compare June 1, 2026 06:44
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from 51181ab to fccee9f Compare June 1, 2026 06:44
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from a3f366f to e971cb9 Compare June 1, 2026 06:52
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from fccee9f to ee1e510 Compare June 1, 2026 06:52
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from e971cb9 to 69963e3 Compare June 1, 2026 07:29
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from ee1e510 to 087b43d Compare June 1, 2026 07:29
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 69963e3 to 4a3ff2e Compare June 1, 2026 07:33
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch 2 times, most recently from 5dd870e to c9c67a8 Compare June 1, 2026 07:45
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 4a3ff2e to 0d58813 Compare June 1, 2026 07:45
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from c9c67a8 to e35aee1 Compare June 1, 2026 07:55
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 0d58813 to 9a8919e Compare June 1, 2026 07:55
Comment thread src/ai/backend/storage/api/client.py Outdated
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 9a8919e to 9225be5 Compare June 1, 2026 08:00
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from e35aee1 to 8482a07 Compare June 1, 2026 08:00
Comment thread src/ai/backend/storage/volumes/vfs/__init__.py
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 9225be5 to 355cff2 Compare June 1, 2026 08:19
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch from 8482a07 to b761e68 Compare June 1, 2026 08:19
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 355cff2 to 52073e4 Compare June 1, 2026 08:27
@jopemachine jopemachine force-pushed the fix/BA-6156-rewire-tus-handlers branch 2 times, most recently from 9bbfaef to a3a6412 Compare June 1, 2026 08:40
jopemachine and others added 8 commits June 4, 2026 14:04
This is the user-visible bug fix for BA-3974: replace the legacy single
temp-file append in tus_upload_part with the metadata-driven chunk store
introduced in BA-6155.

  - tus_upload_part now streams each PATCH body into a unique temp chunk
    file (no shared writer), then commits through TusUploadSession under
    a brief metadata lock. The committer whose write closes the
    contiguous prefix to total_size is the sole one to assemble + clean
    up.
  - tus_check_session returns Upload-Offset / Upload-Length from
    SessionState.committed_offset rather than from a raw stat() of the
    temp file.
  - prepare_upload (vfs) now creates the session directory instead of
    touching an empty file, matching the new layout.
  - Handler tests rewritten to drive the real TusUploadSession through
    tmp_path: missing/invalid/out-of-range Upload-Offset, missing
    session, happy path (single + multi-chunk), idempotent replay,
    and over-large chunk rejection.

The wire format on /upload (PATCH/HEAD/OPTIONS) stays TUS 1.0.0
compatible — existing tus-js-client based callers (Web UI included)
work unchanged.

Note: the news fragment is staged as changes/BA-6156.fix.md and should
be renamed to changes/<PR#>.fix.md when this PR is opened.

Resolves BA-6156. Part of epic BA-6153 (implements BA-3974).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wrap the raw TUS PATCH body in TusChunkUploadStreamReader (in storage/types.py,
mirroring MultipartFileUploadStreamReader) so it conforms to the common
StreamReader abstraction, and drain it through TusUploadSession.write_temp_chunk
instead of a one-off reader protocol + free function. This keeps the reader a
pure byte source and concentrates all chunk file I/O in the session, consistent
with the existing storage upload/download readers.

Also update the completed-upload assertion to match the race-safe cleanup: the
session directory and its info.json marker are intentionally kept while chunk
data is reclaimed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pass ctx.valkey_tus_client into TusUploadSessionArgs at the PATCH/HEAD handler
call sites so the Valkey-backed engine has its metadata store + lock. Rework the
handler tests to run against a real Valkey (redis container) with a unique
per-test session id.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ValkeyTusClient now delegates its per-session lock to RedisLock over a
dedicated lock connection, so the handler-test fixture provisions a
RedisConnectionInfo from the redis container and hands it to .create().

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
UploadTokenData.session is now typed as the common TusSessionId NewType so the
session id flows type-safely from the token through the handlers into the
Valkey-backed engine.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The TUS HEAD/PATCH handlers previously decided "is this upload session valid?"
by checking whether `session_dir` exists on disk. That is a leftover from the
pre-Valkey design where the session *was* the directory. In the new model,
session metadata lives in Valkey and is the source of truth — the directory is
just where chunk payloads happen to sit. The two could diverge (TTL-expired
Valkey state but the directory lingers; externally-deleted directory while
state is still alive) and the disk check would lie in either direction.

Switch the check to a Valkey lookup via the new `TusUploadSession.exists()`. To
make HEAD on a freshly prepared session still return 200 (it has no chunks
yet), `create_upload_session` now writes an empty `SessionState` to Valkey
immediately after `prepare_upload`, so the registry exists from the moment the
JWT is issued. The PATCH handler drops the now-redundant `ensure_initialized`
call — the existence check has already verified state is present, and any
late-arriving PATCH whose state has TTL-expired should 404 rather than be
silently revived.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jopemachine

Copy link
Copy Markdown
Member Author

This PR has become stale, and I don't believe it is necessary at this time. If it is determined to be needed in the future, I will revisit and implement it then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:storage-proxy Related to Storage proxy component size:XL 500~ LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant