id: garbage-collection
title: "CAS Garbage Collection"
description: "Automatic and manual garbage collection for the content-addressed store — prune caches, compact history, mark-and-sweep unreachable objects"
category: internals
tags: [gc, garbage-collection, cas, storage, compaction, mark-sweep, epochs]
version: "1.0.0"CAS accumulates objects indefinitely from remote pushes and graph executions. GC frees storage by pruning derived caches, compacting snapshot history, and sweeping unreachable objects. The same GC engine powers the Rye tool (local + remote), the server auto-GC, and the server API endpoints.
| Component | File |
|---|---|
| Core engine | ryeos/rye/cas/gc.py |
| Type definitions | ryeos/rye/cas/gc_types.py |
| Epoch support | ryeos/rye/cas/gc_epochs.py |
| Lock support | ryeos/rye/cas/gc_lock.py |
| Incremental state | ryeos/rye/cas/gc_incremental.py |
| Rye tool | ryeos/bundles/core/ryeos_core/.ai/tools/rye/core/gc/gc.py |
| Server integration | ryeos-node/ryeos_node/server.py |
- Prunes materialized
cache/snapshots/directories (fully derived, safe to delete unconditionally) - Prunes user-space cache (
cache/user/) - Prunes excess execution records per
(project_path, graph_id): keeps running + last N success + last N failure - Configurable via
cache_max_age_hours(default 24) andmax_executions_per_graph(default 10)
Rewrites the ProjectSnapshot DAG chain to make old objects unreachable. Walks the first-parent chain from HEAD and classifies which snapshots to retain:
- HEAD (always)
- Last N manual pushes (source == "push", default 3)
- 1 daily checkpoint per calendar day for last N days (default 7)
- Weekly checkpoints (optional)
- Pinned snapshots (always, exact hash preserved)
Builds a new chain oldest→newest with rewritten parent links. Uses CAS compare-and-swap to advance HEAD atomically. Skips compaction if the chain is too deep (>50K) or history is incomplete.
- Collects root hashes from all live refs (project HEADs, user-space HEAD, pins, execution records, running markers, in-flight epochs)
- Two root collectors:
_collect_roots_server()for per-user server layout,_collect_roots_local()for project CAS layout (cas_root/refs/) - Iterative BFS traversal from roots — follows all hash references in objects to build the reachable set
- Sweep deletes objects and blobs not in the reachable set, respecting a grace window (default 3600s, aggressive: 300s)
- Grace window protects objects created after mark started
Push and execute endpoints register epochs before creating CAS objects via register_epoch(). Epochs complete after ref advance via complete_epoch().
- Stored in
user_root/inflight/as JSON files - Epoch root hashes are added to the reachable set during mark phase
- Stale epochs cleaned after configurable timeout (default 30 min)
- Prevents sweep from deleting objects being written by concurrent operations
pin_snapshot(user_root, cas_root, project_hash, snapshot_hash, pin_id)creates durable refs atrefs/pins/<project>/<pin-id>/headunpin_snapshot(user_root, project_hash, pin_id)removes pins- Pinned snapshots survive compaction — their original hash is preserved (never rewritten)
- Pin refs are treated as GC roots during mark phase
Per-user lock at user_root/gc.lock.json:
- Includes
gc_run_id,node_id,timestamp,expiry,generationcounter,current_phase - Lock expires after configurable timeout (prevents deadlocks from crashed GC runs)
- Phase tracking:
idle→prune→compact→mark→sweep
Triggered in _check_user_quota() when a user exceeds max_user_storage_bytes quota:
- Rate-limited by
gc_auto_cooldown_seconds(default 600s) - First attempts emergency cache prune, then full aggressive GC if still over quota
- Controlled by
gc_auto_enabled(default true)
Runs GC for the authenticated user.
- Parameters:
dry_run,aggressive,retention_days,max_manual_pushes - Returns:
GCResultwith per-phase breakdowns
Returns usage info and GC state:
usage_bytes, quota info,gc_state,gc_lockstatusrecent_events(last 10 fromgc.jsonl)inflight_epochscount
gc:
retention_days: 7 # daily checkpoints to keep
max_manual_pushes: 3 # manual push snapshots to retain
max_executions_per_graph: 10 # execution records per graph
cache_max_age_hours: 24 # cache snapshot max age
auto_gc_enabled: true # enable quota-triggered auto-GC
auto_gc_cooldown_seconds: 600 # minimum interval between auto-GC runs
grace_window_seconds: 3600 # sweep grace period for new objectsMaps to Settings fields: gc_retention_days, gc_max_manual_pushes, gc_max_executions, gc_cache_max_age_hours, gc_auto_enabled, gc_auto_cooldown, gc_grace_window.
- Every GC run writes a structured JSON event to
user_root/logs/gc.jsonl - Events include per-phase metrics: cache entries deleted, snapshots compacted/retained, objects/blobs swept, bytes freed, duration
- GC state persisted for incremental runs:
last_gc_at, reachable count, generation counter - State invalidated when compaction discards snapshots (forces full mark on next run)
- After a full mark, the reachable set is stored as a CAS blob
- Subsequent runs can load the previous reachable set and only re-mark from changed/new roots
- Invalidated when compaction discards snapshots or GC state is manually cleared
First production deployment on track-blox:
- Remote (server): 1.8 GB / ~198K objects / 237 snapshot chain → 0.7 GB / 5 retained snapshots / 198,231 objects swept, 1.1 GB freed in 53 seconds
- Local: 50 MB / 7,463 objects → 16.5 MB / 128 reachable
| Component | File |
|---|---|
| GC engine | ryeos/rye/cas/gc.py |
| Type definitions | ryeos/rye/cas/gc_types.py |
| Writer epochs | ryeos/rye/cas/gc_epochs.py |
| Distributed lock | ryeos/rye/cas/gc_lock.py |
| Incremental state | ryeos/rye/cas/gc_incremental.py |
| Rye tool | ryeos/bundles/core/ryeos_core/.ai/tools/rye/core/gc/gc.py |
| Server endpoints | ryeos-node/ryeos_node/server.py |
| Server config | ryeos-node/ryeos_node/config.py |
- Durable roots = all current refs: project HEADs, user-space HEAD, pins
- Ephemeral roots = all active writer epochs in
inflight/ - Protected set during sweep = durable reachable ∪ ephemeral reachable ∪ objects newer than grace window
- Compaction invalidates incremental state — next mark must be full
- Pinned snapshots preserve exact original hash — compaction never rewrites pinned objects
- Current HEAD policy governs compaction — no historical per-snapshot policies