
feat(BA-6155): add metadata-driven chunk-store upload session engine#11766

Draft
jopemachine wants to merge 1 commit into main from fix/BA-6155-tus-upload-session-class

@jopemachine jopemachine commented May 22, 2026

📚 Stacked PRs

This PR is part of a 5-PR stack implementing BA-3974 (epic: BA-6153). Merge in order:

  1. 👉 feat(BA-6155): add metadata-driven chunk-store upload session engine #11766 ← you are here
  2. ⬇️ fix(BA-6156): rewire TUS PATCH/HEAD to chunk-based store #11767 (actual user-visible fix)
  3. ⬇️ test(BA-6157): add multi-proxy NFS race regression test #11768
  4. ⬇️ feat(BA-6158): support TUS Checksum extension #11769
  5. ⬇️ feat(BA-6159): add /upload/status endpoint and progress headers #11770

Summary

Introduces the concurrency-safe TUS upload session engine. Per-session metadata
is stored in Valkey (keyed by session id) as the source of truth and is guarded
by a per-session distributed lock produced by an injected DistributedLockFactory.
Chunk payload bytes live as individual files under chunks/chunk_<offset>.dat
on the shared filesystem; they are content-addressed by (offset, sha256) and
idempotent, so the heavy write path needs no coordination. Only the small
metadata read-modify-write window is serialized, and because the lock is
distributed, that serialization holds across Storage Proxy replicas.

Note: This PR combines two originally stacked issues — the pure
session-state model (BA-6154) and the on-disk storage engine (BA-6155). They
live in the same package and the model has no consumer on its own, so splitting
them across PRs hurt reviewability. Both issues are resolved here.

Architecture

The handler (PR #11767) only adapts the request body; all chunk file I/O and
metadata management live in the engine (this PR).

flowchart TB
    classDef pr66 fill:#e6f7ff,stroke:#1890ff,color:#000
    classDef pr67 fill:#fff7e6,stroke:#fa8c16,color:#000
    classDef redis fill:#fff1f0,stroke:#cf1322,color:#000
    classDef disk fill:#f6ffed,stroke:#52c41a,color:#000

    subgraph L1["PATCH handler (PR #11767)"]
        REQ["request.content"]:::pr67
        ADP["TusChunkUploadStreamReader"]:::pr67
        REQ --> ADP
    end

    subgraph L2["Upload engine (PR #11766)"]
        WTC["write_temp_chunk"]:::pr66
        CC["commit_chunk"]:::pr66
        FIN["assemble + cleanup"]:::pr66
        WTC --> CC --> FIN
    end

    subgraph L3R["Valkey · source of truth"]
        STATE["tus.upload.session:id"]:::redis
        LOCK["tus.upload.lock:id"]:::redis
    end

    subgraph L3F["shared FS · chunk payloads"]
        TMP["chunk_off.tok.tmp"]:::disk
        DAT["chunk_off.dat"]:::disk
        OUT["assembled file"]:::disk
    end

    ADP --> WTC
    WTC -->|"lock-free write"| TMP
    CC -->|"acquire"| LOCK
    CC -->|"get / set"| STATE
    CC -->|"atomic rename"| DAT
    FIN -->|"concat in order"| OUT
    FIN -->|"mark COMPLETED"| STATE
    FIN -->|"unlink chunks"| DAT
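The file-side flow in the diagram (lock-free temp write, atomic rename, ordered concatenation) can be sketched roughly as follows. This is a minimal synchronous illustration, not the engine's code: the real methods live on `TusUploadSession` and are async, the signatures here are hypothetical, `promote` is an invented helper name, and the token format is an assumption. In the engine the rename happens inside the locked commit step, which this sketch omits.

```python
import hashlib
import secrets
from pathlib import Path
from typing import Iterable


def write_temp_chunk(chunks_dir: Path, offset: int, body: Iterable[bytes]) -> tuple[Path, int, str]:
    """Stream the body into a uniquely named temp file, hashing as it is written."""
    token = secrets.token_hex(8)  # assumption: any unique per-attempt token works
    tmp_path = chunks_dir / f"chunk_{offset}.{token}.tmp"
    digest = hashlib.sha256()
    length = 0
    with tmp_path.open("wb") as f:
        for piece in body:
            f.write(piece)
            digest.update(piece)
            length += len(piece)
    return tmp_path, length, digest.hexdigest()


def promote(tmp_path: Path, chunks_dir: Path, offset: int) -> Path:
    """Atomically rename the temp file to its canonical per-offset name."""
    final_path = chunks_dir / f"chunk_{offset}.dat"
    tmp_path.replace(final_path)  # readers see the old file or the new one, never a partial
    return final_path


def assemble(chunks_dir: Path, offsets: list[int], target: Path) -> None:
    """Concatenate the committed chunk files into the target in offset order."""
    with target.open("wb") as out:
        for offset in sorted(offsets):
            out.write((chunks_dir / f"chunk_{offset}.dat").read_bytes())
```

Because every attempt writes to its own temp name, two replicas receiving the same offset never write into the same file; only the final rename touches the canonical path.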

Data model

classDiagram
    class TusUploadSession {
        +ensure_initialized()
        +read_state() SessionState
        +open_temp_chunk(offset) Path
        +write_temp_chunk(offset, reader) WrittenChunk
        +commit_chunk(offset, path, length, sha256) ChunkCommitResult
        +assemble(target)
        +cleanup()
    }
    class SessionState {
        +session_id
        +total_size
        +committed_chunks : ChunkMetadata[]
        +status : UploadStatus
        +committed_offset
        +find_at_offset(offset)
        +append_chunk(chunk)
    }
    class ChunkMetadata {
        +offset
        +length
        +sha256
    }
    class WrittenChunk {
        +path
        +length
        +sha256
    }
    class ChunkCommitResult {
        +state : SessionState
        +committed : bool
        +is_final_commit : bool
    }
    class TusUploadSessionArgs {
        +session_dir
        +session_id
        +total_size
        +valkey_client : ValkeyTusClient
        +lock_factory : DistributedLockFactory
    }
    SessionState o-- ChunkMetadata
    TusUploadSession ..> SessionState
    TusUploadSession ..> ChunkCommitResult
    TusUploadSession ..> WrittenChunk
    TusUploadSession ..> TusUploadSessionArgs
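To make the `SessionState` part of the diagram concrete, here is a small sketch using plain dataclasses. It assumes `committed_offset` means the end of the largest contiguous prefix of committed chunks starting at byte 0; the actual class is a schema model in `tus_session.py`, and its field types and method bodies may differ from what is shown here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ChunkMetadata:
    offset: int
    length: int
    sha256: str


@dataclass
class SessionState:
    session_id: str
    total_size: int
    committed_chunks: list[ChunkMetadata] = field(default_factory=list)

    def find_at_offset(self, offset: int) -> Optional[ChunkMetadata]:
        """Return the committed chunk that starts exactly at this offset, if any."""
        return next((c for c in self.committed_chunks if c.offset == offset), None)

    def append_chunk(self, chunk: ChunkMetadata) -> None:
        self.committed_chunks.append(chunk)

    @property
    def committed_offset(self) -> int:
        """End of the contiguous run of committed bytes that starts at 0."""
        end = 0
        for chunk in sorted(self.committed_chunks, key=lambda c: c.offset):
            if chunk.offset > end:
                break  # gap: bytes in [end, chunk.offset) are not committed yet
            end = max(end, chunk.offset + chunk.length)
        return end
```

With this definition, a chunk that arrives out of order is recorded but does not advance `committed_offset` until the gap before it is filled.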

One PATCH lifecycle

The expensive body write is lock-free; only the short metadata read-modify-write
is serialized via the distributed lock, so concurrent replicas cannot corrupt
each other's session state.

sequenceDiagram
    autonumber
    participant H as tus_upload_part
    participant S as TusUploadSession
    participant V as Valkey
    participant D as shared FS

    H->>S: ensure_initialized()
    S->>V: acquire lock + SET state if missing
    H->>S: write_temp_chunk(offset, reader)
    loop async for chunk in reader.read()
        S->>D: write chunk_off.tok.tmp (lock-free) + sha256
    end
    S-->>H: WrittenChunk(path, length, sha256)
    H->>S: commit_chunk(offset, path, length, sha256)
    Note over S,V: distributed lock window (short)
    alt status == COMPLETED
        S-->>H: no-op (late duplicate from another replica)
    else duplicate (same offset, length, sha256)
        S-->>H: committed=False (idempotent)
    else conflict (same offset, different content)
        S-->>H: ChunkConflictError 409
    else new chunk
        S->>D: atomic rename tmp to chunk_off.dat
        S->>V: update SessionState with new record
        S-->>H: ChunkCommitResult(is_final_commit?)
    end
    opt is_final_commit
        H->>S: assemble(target)
        S->>D: concat chunks in offset order
        S->>V: mark status=COMPLETED
        H->>S: cleanup()
        S->>D: unlink chunk files (state kept for TTL)
    end
    H-->>H: 204 No Content
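The four branches inside the lock window reduce to a small decision function. The sketch below is illustrative only: `Outcome` and `classify_commit` are hypothetical names, and the committed chunks are modelled as a plain dict from offset to `(length, sha256)` rather than the engine's `SessionState`.

```python
from enum import Enum


class Outcome(Enum):
    NOOP_COMPLETED = "noop_completed"  # session already COMPLETED: late duplicate
    DUPLICATE = "duplicate"            # same offset, length, sha256: idempotent
    CONFLICT = "conflict"              # same offset, different content: 409
    NEW = "new"                        # rename temp file and record the chunk


def classify_commit(
    completed: bool,
    committed: dict[int, tuple[int, str]],
    offset: int,
    length: int,
    sha256: str,
) -> Outcome:
    """Decide what a commit at `offset` should do, given the current state."""
    if completed:
        return Outcome.NOOP_COMPLETED
    existing = committed.get(offset)
    if existing is None:
        return Outcome.NEW
    if existing == (length, sha256):
        return Outcome.DUPLICATE
    return Outcome.CONFLICT
```

Only the `NEW` branch touches the filesystem and Valkey; the other three return without side effects, which is what keeps retries and cross-replica duplicates harmless.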

Locking & HA semantics

Session state is NOT held in memory and NOT on the shared filesystem.
TusUploadSession is stateless in-process — every operation reads/writes the
SessionState in Valkey, and chunk payload files are content-addressed.

This makes it safe for the production topology of multiple storage-proxy
processes/replicas sharing an NFS mount: no in-process state to synchronize,
no dependency on filesystem lock semantics, no dependency on NFS attribute-cache
coherence.

How replicas coordinate:

  • Source of truth = Valkey, fetched per request (no in-process cache).
  • Per-session distributed lock (RedisLock via DistributedLockFactory)
    serializes the metadata read-modify-write window. The factory pattern mirrors
    the manager's DistributedLockFactory — the lock backend lives in
    RootContext.tus_lock_factory and is injected into the engine.
  • Atomic rename (Path.replace) promotes a temp chunk file
    (chunk_<offset>.<token>.tmp) to its canonical name (chunk_<offset>.dat).
    Readers see old-or-new, never a partial file.
  • Per-offset content-addressed chunk files — duplicates are idempotent
    no-ops, mismatched content surfaces as a 409 ChunkConflictError.
  • Stateless JWT upload tokens — any PATCH may land on any replica with no
    session affinity, and still resolves to the same Valkey key and same chunk
    file path.
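A per-session lock of this kind is commonly built from a set-if-absent write with a TTL plus a token-checked release. The sketch below emulates that protocol against an in-memory stand-in so it runs without a server. `FakeKV` and `SessionLock` are hypothetical names; the real lock comes from `DistributedLockFactory` and talks to Valkey, where the compare-and-delete step must be a single atomic operation.

```python
import secrets
import time
from typing import Optional


class FakeKV:
    """In-memory stand-in for the two key-value operations the lock needs."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}

    def set_nx(self, key: str, value: str, ttl: float) -> bool:
        """Set the key only if it is absent or expired."""
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def delete_if_equal(self, key: str, value: str) -> bool:
        """Delete the key only if it is live and still holds the caller's value."""
        current = self._data.get(key)
        if current is not None and current[0] == value and current[1] > time.monotonic():
            del self._data[key]
            return True
        return False


class SessionLock:
    def __init__(self, kv: FakeKV, session_id: str, ttl: float = 10.0) -> None:
        self._kv = kv
        self._key = f"tus.upload.lock:{session_id}"
        self._ttl = ttl
        self._token: Optional[str] = None

    def acquire(self) -> bool:
        token = secrets.token_hex(16)  # unique per holder
        if self._kv.set_nx(self._key, token, self._ttl):
            self._token = token
            return True
        return False

    def release(self) -> bool:
        if self._token is None:
            return False  # never acquired: must not delete another holder's lock
        released = self._kv.delete_if_equal(self._key, self._token)
        self._token = None
        return released
```

The token check is what prevents a holder whose lock already expired from deleting a lock that another replica has since acquired.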

Stale or abandoned sessions are reclaimed by the Valkey state's TTL (24h), so
there is no separate GC sweep.

Resolves BA-6154, BA-6155.

@github-actions github-actions Bot added size:XL 500~ LoC comp:storage-proxy Related to Storage proxy component labels May 22, 2026
@jopemachine jopemachine added the skip:changelog Make the action workflow to skip towncrier check label May 22, 2026
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 2a9d0eb to 193aece Compare May 26, 2026 08:29
@jopemachine jopemachine changed the title feat(BA-6155): add TusUploadSession storage class feat(BA-6155): add metadata-driven chunk-store upload session engine May 26, 2026
@jopemachine jopemachine changed the base branch from fix/BA-6154-upload-state-model to main May 26, 2026 08:29
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch 4 times, most recently from 764ffad to 89e52ef Compare May 26, 2026 09:13
Comment thread changes/11766.feature.md
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 89e52ef to 4499f92 Compare May 26, 2026 09:19
Comment thread src/ai/backend/storage/services/upload/tus_session.py Outdated
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch 2 times, most recently from c0c0db8 to 4d1cb89 Compare May 28, 2026 06:01
@github-actions github-actions Bot added the comp:common Related to Common component label May 28, 2026
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 4d1cb89 to 5c675ef Compare May 28, 2026 08:41
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 9112c26 to f1a899d Compare June 1, 2026 05:51
Comment thread src/ai/backend/storage/server.py Outdated
Comment thread src/ai/backend/storage/services/upload/tus_session.py Outdated
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from f1a899d to a4fb473 Compare June 1, 2026 06:15
Comment thread tests/unit/storage/services/upload/test_tus_session.py Outdated
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch 4 times, most recently from 49b4a42 to 940cceb Compare June 1, 2026 06:35
Comment thread src/ai/backend/storage/services/upload/tus_session.py Outdated
Comment thread tests/unit/storage/services/upload/test_tus_session.py Outdated
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch 7 times, most recently from 9a8919e to 9225be5 Compare June 1, 2026 08:00
Comment thread src/ai/backend/storage/services/upload/tus_session.py Outdated
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch 9 times, most recently from 88d3e00 to 7bbd8c4 Compare June 1, 2026 09:49
Introduce the concurrency-safe upload session engine that replaces the
single-file-append TUS model. Session metadata is the source of truth in
Valkey, guarded by a per-session lock (SET NX + token-compare Lua release);
only the chunk payload bytes live on the shared filesystem as per-offset
chunk_<offset>.dat files. Coordination across multiple Storage Proxy replicas
happens entirely through Valkey, so there is no dependency on filesystem lock
semantics (fcntl.flock) or NFS attribute-cache coherence; chunk payloads are
content-addressed by (offset, sha256) and idempotent, needing no coordination.

Contents:
- common/clients/valkey_client/valkey_tus: ValkeyTusClient — per-session
  state get/set (with TTL) + a per-session lock; reuses the Glide-based
  AbstractValkeyClient like the other valkey clients.
- common/defs: REDIS_TUS_DB; common/metrics: VALKEY_TUS layer.
- errors/upload.py: ChunkConflictError (409), UploadSessionCorruptedError (500)
- services/upload/tus_session.py:
  - ChunkRecord / SessionState (BackendAISchema; committed_offset as the
    largest contiguous prefix, missing_ranges, progress_percent) / ChunkAcceptance
  - UploadStatus StrEnum
  - TusUploadSession (ensure_initialized, read_state, write_temp_chunk,
    commit_chunk — idempotent dup / 409 conflict / no-op when completed,
    assemble, cleanup) taking a TusUploadSessionArgs that carries the
    ValkeyTusClient.
- Stale sessions auto-expire via the Valkey TTL (no separate GC sweep).
- Storage RootContext now provisions a ValkeyTusClient (server.py bootstrap).
- Integration tests drive a real Valkey (redis container) + tmp_path chunks.

Resolves BA-6154. Resolves BA-6155.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jopemachine jopemachine force-pushed the fix/BA-6155-tus-upload-session-class branch from 7bbd8c4 to f337ca3 Compare June 3, 2026 02:19