feat(server): unified cache framework and PCH cache keys by 16bit-ykiko · Pull Request #453 · clice-io/clice

16bit-ykiko · 2026-06-11T16:21:57Z

Background

clice persists three kinds of derived artifacts, and before this PR each managed its own files with ad-hoc logic: PCH/PCM blobs under cache/{pch,pcm} tracked by a hand-rolled cache.json, index shards under a separate index_dir, and synthesized header-context preambles accumulating under header_context/ with no cleanup at all. None of them had a real reliability story:

No durability. Writes used tmp+rename in some places but never fsync, so a crash could leave a blob whose name promises content its bytes don't have. For content-addressed PCHs that means silently loading a truncated artifact under a valid key.
No size bound. The only reclamation was a startup sweep deleting files older than 7 days — which evicts a hot PCH that happens to be old and keeps a cold one that happens to be recent. Disk usage was effectively unbounded.
No crash residue handling, no multi-instance awareness. Leftover tmp files lived forever; two clice instances sharing a cache directory could clobber each other's files.
A correctness bug in the PCH key. The key hashed only the preamble text, so two compile commands differing in -D/-I/-std shared one PCH — wrong macros, wrong diagnostics. This was a long-standing FIXME in compiler.cpp.

This PR introduces a unified blob store (CacheStore) that owns the lifecycle of every on-disk cache artifact, migrates PCH/PCM/index onto it, and fixes the cache-key bug properly.

Design

Narrow ownership. The store manages blob lifecycle only: atomic publication, size accounting and eviction, crash recovery, and an access-time checkpoint. It never computes keys and never decides validity — staleness is domain knowledge (deps snapshots, two-layer mtime+hash checks) that stays with the callers, who hand the store opaque filename-safe keys. This keeps the store generic enough that future artifacts (render outputs, preamble files) plug in as namespaces rather than new subsystems.

One invariant instead of many states. A blob is complete iff it exists at its final path. Commit is fsync-then-atomic-rename, so a crash at any instant leaves exactly one kind of residue: tmp files, which live in per-pid directories and are swept at startup by pid-liveness. There are no sidecar metadata files to get out of sync — the filename is the key, size/mtime come from the filesystem, and the manifest is not a source of truth: it only carries last-accessed times across restarts. Registration always reconciles against a directory scan, so a corrupt or missing manifest merely degrades LRU ordering to mtime, and blobs committed after the last checkpoint are still adopted.
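
The fsync-then-rename commit described above can be sketched as follows. This is a minimal POSIX-only illustration, not the CacheStore implementation: `publish_blob` and its parameters are hypothetical names, and the sketch writes the payload itself, whereas in the PR a worker produces the tmp file.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper (POSIX only): write `data` to `tmp_path`, flush it to
// stable storage, then atomically publish it at `final_path`.
// A crash before the rename leaves only the tmp file behind.
bool publish_blob(const std::string& tmp_path,
                  const std::string& final_path,
                  const std::string& data) {
    int fd = ::open(tmp_path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if(fd < 0) {
        return false;
    }

    const char* cursor = data.data();
    std::size_t left = data.size();
    while(left > 0) {
        ssize_t written = ::write(fd, cursor, left);
        if(written < 0) {
            if(errno == EINTR) {
                continue;  // interrupted before writing anything, retry
            }
            ::close(fd);
            ::unlink(tmp_path.c_str());
            return false;
        }
        cursor += written;
        left -= static_cast<std::size_t>(written);
    }

    // Durability first: the bytes must be on disk before the name exists.
    if(::fsync(fd) != 0 || ::close(fd) != 0) {
        ::unlink(tmp_path.c_str());
        return false;
    }

    // Publication: rename replaces the destination atomically on POSIX.
    if(std::rename(tmp_path.c_str(), final_path.c_str()) != 0) {
        ::unlink(tmp_path.c_str());
        return false;
    }
    return true;
}
```

The sketch does not fsync the parent directory, which some filesystems need before the rename itself is durable; whether the store does so is not stated in the description.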

Versioned root, zero migration code. Everything lives under cache/v{N}; anything else under cache/ is discarded wholesale at open. Caller-side metadata (cache.json, now reduced to dependency snapshots) lives inside the versioned root, so a format bump retires blobs and metadata together. We will never write or debug migration code for cache data that can simply be rebuilt.

Policies are the variation point. Three lifecycle policies cover everything we store:

LRU — rebuildable large artifacts (PCH/PCM, 8 GiB budget each): evicted coldest-first on commit, with the freshly committed blob protected and failed deletions (Windows open files) retried on later rounds.
Persistent — data that is expensive to accumulate (the index): never auto-evicted; the store provides key enumeration and the indexer does mark-and-sweep of orphaned shards on load.
Scratch — per-instance working files in pid subdirectories, excluded from manifest and LRU; dead-pid directories are reclaimed on registration. This both bounds previously-unbounded preamble accumulation and stops concurrent instances from deleting each other's live files. (The namespace is registered now; the legacy preamble path migrates onto it with the upcoming preamble rework.)
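
As a rough illustration of the LRU policy, the victim selection on commit might look like the sketch below. It is purely in-memory and assumes hypothetical names (`BlobInfo`, `pick_victims`); retrying failed deletions and the real accounting are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory record for one blob in an LRU namespace.
struct BlobInfo {
    std::string key;
    std::uint64_t size;
    std::uint64_t last_access;  // larger value = accessed more recently
};

// Returns the keys to evict, coldest first, until the namespace fits in
// `budget`. The freshly committed blob (`fresh_key`) is never selected.
std::vector<std::string> pick_victims(std::vector<BlobInfo> blobs,
                                      std::uint64_t budget,
                                      const std::string& fresh_key) {
    std::uint64_t total = 0;
    for(const auto& blob: blobs) {
        total += blob.size;
    }

    // Coldest blobs first.
    std::sort(blobs.begin(), blobs.end(), [](const BlobInfo& a, const BlobInfo& b) {
        return a.last_access < b.last_access;
    });

    std::vector<std::string> victims;
    for(const auto& blob: blobs) {
        if(total <= budget) {
            break;
        }
        if(blob.key == fresh_key) {
            continue;  // protect the blob that triggered this eviction round
        }
        victims.push_back(blob.key);
        total -= blob.size;
    }
    return victims;
}
```

Protecting the fresh key means a single blob larger than the budget stays on disk until a later commit displaces it, which matches the "freshly committed blob protected" wording above.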

Process boundary follows the existing architecture. The master exclusively owns the store. Workers never touch store state: the master allocates a tmp path (begin_store), the worker compiles into it, and the master commits after the worker reports success. No cross-process coordination protocol — concurrent instances sharing a root get best-effort safety (per-pid isolation), not coordinated LRU accounting.

Passive and thread-safe by construction. The store owns no timers and never couples to the event loop. All methods are internally synchronized, so callers can offload the heavy operations (commit's fsync, manifest writes) to the kota thread pool while lookups (pure in-memory, zero disk IO) continue on the loop. Periodic checkpointing is scheduled by the server; losing the most recent access times in a crash only blurs LRU ordering, which is an accepted trade-off.

Cache keys

Keys are a caller convention: hex of xxh3_128bits. 128 bits because a collision here means silently reusing the wrong PCH — the 64-bit birthday bound (~2³²) is not acceptable for that failure mode, and there is no adversarial model that would justify paying for SHA-256.
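
To put numbers on the birthday bound, here is a small sketch using the standard approximation p ≈ 1 - exp(-n(n-1) / 2^(bits+1)). The function name is hypothetical and the figures assume an ideal hash.

```cpp
#include <cassert>
#include <cmath>

// Approximate probability that at least two of `n` uniformly random
// `bits`-bit hashes collide (birthday approximation).
double collision_probability(double n, int bits) {
    double pairs = n * (n - 1.0) / 2.0;
    // ldexp(pairs, -bits) computes pairs / 2^bits without overflow;
    // -expm1(-x) computes 1 - exp(-x) accurately for tiny x.
    return -std::expm1(-std::ldexp(pairs, -bits));
}
```

With n = 2^32 keys, a 64-bit hash gives a collision probability of roughly 0.39, while a 128-bit hash gives about 2.7e-20.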

The PCH key now covers, via the new canonicalize(args, profile):

The frontend-relevant subset of compile flags — built on the existing per-option classifiers rather than a new list. Codegen-only flags (-g, -fPIC, -flto, …) stay out of the key so they can't cause spurious rebuilds; note that -O* is kept deliberately, since it defines __OPTIMIZE__ and changes preprocessing. Filtering never reorders (relative -I/-D order is semantic), and unknown options are kept verbatim — over-keying costs sharing, under-keying costs correctness.
The compile directory and the source file's directory — but not the file itself. Quote includes and relative paths resolve against these directories, so identical preamble text in different directories can mean different content; files in the same directory with the same preamble still share one PCH.
The clang version — a clice upgrade bundling a newer clang must not "hit" on blobs that clang would reject, and nothing else in the validity metadata would ever trigger that rebuild.
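
The filtering rules can be illustrated with a deliberately simplified sketch. It is not the PR's `canonicalize(args, profile)`: the function names are hypothetical, the drop set (`-g`, `-flto`) is a tiny stand-in for the existing per-option classifiers, and profiles are omitted.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the per-option classifiers: flags assumed here
// to affect code generation only.
bool is_codegen_only(std::string_view arg) {
    return arg == "-g" || arg == "-flto";
}

// Renders the kept arguments into a NUL-delimited string. Arguments are
// never reordered, and anything not recognised as droppable is kept verbatim.
std::string canonicalize_sketch(const std::vector<std::string>& args) {
    std::string out;
    for(const auto& arg: args) {
        if(is_codegen_only(arg)) {
            continue;
        }
        out += arg;
        out.push_back('\0');  // delimiter that cannot appear inside an argument
    }
    return out;
}
```

Hashing this string together with the preamble text, the directories and the clang version would then yield the key; the NUL delimiter keeps `{"-Da", "b"}` and `{"-Dab"}` from rendering to the same bytes.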

One subtlety the implementation defends against: even these keys are not fully content-addressed (editing an included header changes PCH content without changing the key input). A rename collision on commit — possible on Windows when the destination blob is open — is therefore treated as benign only after verifying the existing blob is byte-identical; otherwise the stale blob is removed and the publish retried.
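
A byte-identical check of this kind could be sketched as below; `files_identical` is a hypothetical helper built on the standard library, not the store's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Returns true only if both files can be opened and hold exactly the same
// bytes. Any open failure is reported as "not identical".
bool files_identical(const std::string& a, const std::string& b) {
    std::ifstream fa(a, std::ios::binary);
    std::ifstream fb(b, std::ios::binary);
    if(!fa || !fb) {
        return false;
    }
    std::istreambuf_iterator<char> ia(fa), ib(fb), end;
    // The four-iterator overload of std::equal also requires equal lengths.
    return std::equal(ia, end, ib, end);
}
```

Treating an unreadable file as different is the conservative choice here: the caller then falls through to the remove-and-retry path instead of trusting the survivor.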

Behavior changes

The project.index_dir config field is gone — the index lives inside the store under cache_dir. There is no legacy-layout cleanup or migration anywhere: nothing has shipped yet, so there are no layouts to migrate from.
PCH/PCM disk usage is bounded (8 GiB each by default — currently constants, easy to make configurable).
A kill -9 at any point leaves no residue that the next start doesn't reclaim, and never a half-written blob under a valid name.
Deleting cache_dir remains the single user-facing "clear everything" action.

Testing

Unit: full two-phase state machine (begin/commit/abort, collision and failure paths), LRU eviction with replacement accounting and fresh-commit protection, every manifest recovery path (missing / corrupt / blob-newer-than-manifest), a discriminating test proving checkpointed access times — not mtimes — drive cross-restart eviction, automatic checkpoint triggering, version-bump and stray-file discard, dead-pid tmp/Scratch sweeps, and canonicalize key stability (irrelevant flags keep the key; -D/-I/-std/ordering/unknown flags change it; input file and -main-file-name stay out).
Integration: the regression that motivated the key change (same preamble text, different -D → two PCHs; flag change across restart invalidates), plus kill -9 mid-build → restart sweeps residue and serves normally.
All three local suites pass (unit / integration / smoke) and CI is green on every platform, including the Windows jobs that exercise the collision and pid-liveness paths.

Summary by CodeRabbit

New Features
- Versioned, content-addressed blob cache for PCH/PCM/index with reliable two-phase writes, eviction, and crash-safe recovery.
- Stable canonicalization of compiler command arguments to produce consistent cache keys.
Refactor
- Unified cache config: removed separate index dir; all caches use project.cache_dir.
- Background cache checkpointing and safer shutdown sequencing.
Bug Fixes
- Sweeps tmp/residue files and prevents stray partial blobs after crashes.

Content-addressed blob store with two-phase atomic writes (tmp + fsync + rename), per-namespace lifecycle policies (LRU eviction, Persistent, per-instance Scratch), crash recovery via tmp sweep and directory scan, and a manifest checkpoint carrying last-accessed times across restarts.

canonicalize(args, profile) renders the profile-selected subset of a compile command into a stable string for cache keys. Frontend drops codegen-only and clice-managed options; Preprocessing additionally drops diagnostics-presentation options. Filtering never reorders, unknown options are kept verbatim, and per-file identity (input path, -main-file-name) stays out so identical preambles share keys.

PCH cache keys now hash the preamble text together with the canonical frontend flags, the compile directories and the clang version (fixes the FIXME where -D/-I/-std variants wrongly shared one PCH); PCM keys likewise. Keys use xxh3_128bits. Workers write blobs to store-allocated tmp paths and the master commits them; evicted PCMs are re-validated before each compile. The index moved into the store as a Persistent namespace with mark-and-sweep cleanup; stale pre-framework layouts are discarded on startup. cache.json keeps only validity metadata and lives under the versioned store root.

coderabbitai · 2026-06-11T16:22:09Z

Walkthrough

This PR adds a versioned CacheStore, introduces argument canonicalization for deterministic cache keys, migrates PCH/PCM/index persistence to the store, coordinates workspace/server cache lifecycle and checkpointing, simplifies worker output handling, and updates tests and docs to the versioned cache layout.

Changes

Unified cache store and compilation persistence migration

Layer / File(s)	Summary
Command-line argument canonicalization `src/command/argument_parser.h`, `src/command/argument_parser.cpp`, `tests/unit/command/canonicalize_tests.cpp`	Adds `is_diagnostics_option`, `ArgsProfile` enum, and `canonicalize(...)` producing a stable NUL-delimited key used to derive deterministic cache keys; unit tests cover profiles and key stability/semantics.
CacheStore public API `src/support/cache_store.h`	Declares CachePolicy, CacheNamespace, and CacheStore with two-phase writes (`PendingEntry`), lookup/invalidate/for_each_key, checkpoint, shutdown, and move/dtor semantics.
CacheStore implementation `src/support/cache_store.cpp`, `tests/unit/support/cache_store_tests.cpp`	Implements versioned on-disk blob store with manifest atime persistence, tmp/PID sweeping, begin_store/commit/abort, byte-identical collision handling, LRU eviction, checkpointing, shutdown cleanup, and extensive unit tests covering lifecycle, eviction, persistence, and recovery.
PCH/PCM store-backed caching `src/server/compiler/compiler.cpp`, `src/server/service/session.h`, `src/server/worker/stateless_worker.cpp`	Compiler composes store keys from clang version + preamble/module + canonicalized frontend args; PCM/PCH artifacts are looked up and stored transactionally via CacheStore; Session PCH references now use string keys; worker build handlers write to pre-allocated tmp output paths and return them directly.
Indexer persistence migration `src/server/compiler/indexer.h`, `src/server/compiler/indexer.cpp`	Indexer.save/load become parameterless and persist project + dirty shard blobs under the `"index"` namespace using CacheStore two-phase writes; load enumerates keys and invalidates orphan shards when project blob missing.
Workspace and MasterServer wiring `src/server/workspace/workspace.h`, `src/server/workspace/workspace.cpp`, `src/server/service/master_server.h`, `src/server/service/master_server.cpp`	Workspace tracks PCH/PCM entries by store key and uses store->base_dir() for cache.json staging; MasterServer opens/registers CacheStore namespaces, spawns a periodic checkpoint task, and performs coordinated shutdown including store shutdown.
Integration tests and harness updates `tests/integration/utils/cache.py`, `tests/integration/compilation/test_persistent_cache.py`, `tests/integration/compilation/test_staleness.py`, `tests/integration/utils/client.py`, `tests/conftest.py`	Test utilities use versioned cache_root and pin_cache_to_workspace; integration tests validate key-based PCH/PCM persistence, flag-change invalidation, tmp residue recovery, and public client/server shutdown helpers.
Filesystem & test helpers `src/support/filesystem.h`, `tests/unit/test/temp_dir.h`	Adds `fs::remove_all` and updates TempDir cleanup to recursively remove test roots.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Compiler
  participant CacheStore
  participant StatelessWorker
  Client->>Compiler: invoke compile (args)
  Compiler->>Compiler: canonicalize(args) -> key
  Compiler->>CacheStore: lookup("pch"/"pcm", key)
  alt cache miss
    Compiler->>CacheStore: begin_store(ns, key) -> pending.tmp_path
    Compiler->>StatelessWorker: build(output_path=pending.tmp_path)
    StatelessWorker-->>Compiler: build result (tmp_path)
    Compiler->>CacheStore: commit(pending)
  else cache hit
    CacheStore-->>Compiler: blob path
  end
  Compiler->>Client: compilation result (uses blob path or built artifact)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

clice-io/clice#371: Modifies command-line argument parsing helpers (is_discarded_option) that this PR builds upon for canonicalization.
clice-io/clice#435: Related changes to PCH cache handling that this PR extends with store-keyed identities and transactional writes.
clice-io/clice#391: Overlaps on worker output-path handling and build output semantics.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 30.77% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: introducing a unified cache framework with updated PCH cache key handling, which aligns with the substantial architectural addition of CacheStore and PCH key derivation from canonical arguments.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb145d38e7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

src/support/cache_store.h (1)
73-81: ⚖️ Poor tradeoff

PendingEntry lacks automatic cleanup.

Dropping a PendingEntry without calling commit() or abort() leaks the tmp file until the next open() sweep. This creates exception-safety risk: if the caller's code throws between begin_store() and commit(), the tmp file is orphaned.

Consider adding an RAII guard (e.g., abort() in destructor if not committed) or documenting the leak risk more prominently at the call sites.
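
One possible shape for such a guard is sketched below. It is an illustration only: `PendingGuard` and `mark_finalized` are hypothetical names, and the real `PendingEntry` and its `commit()`/`abort()` interplay with the store are not shown in this thread.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>
#include <utility>

// Hypothetical RAII wrapper around a tmp path: unless the entry was
// finalized (committed or explicitly aborted), the destructor removes the
// tmp file so an exception between begin and commit cannot orphan it.
class PendingGuard {
public:
    explicit PendingGuard(std::string tmp_path) : tmp_path_(std::move(tmp_path)) {}

    PendingGuard(const PendingGuard&) = delete;
    PendingGuard& operator=(const PendingGuard&) = delete;
    PendingGuard& operator=(PendingGuard&&) = delete;

    // Moving transfers ownership; the moved-from guard must not clean up.
    PendingGuard(PendingGuard&& other) noexcept :
        tmp_path_(std::move(other.tmp_path_)), finalized_(other.finalized_) {
        other.finalized_ = true;
    }

    ~PendingGuard() {
        if(!finalized_) {
            std::remove(tmp_path_.c_str());
        }
    }

    const std::string& tmp_path() const { return tmp_path_; }

    // Called after a successful commit or an explicit abort.
    void mark_finalized() { finalized_ = true; }

private:
    std::string tmp_path_;
    bool finalized_ = false;
};
```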
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/support/cache_store.h` around lines 73 - 81, PendingEntry can leak tmp
files if dropped before commit()/abort(); modify the struct to follow RAII by
adding a destructor that calls abort() when the entry hasn't been finalized, and
add a boolean flag (e.g., finalized or committed) updated by commit() and
abort() to prevent double work; update begin_store()/commit()/abort() callers to
rely on this RAII behavior and ensure move semantics are handled (delete or
define move ctor/assign to transfer ownership of tmp_path and the finalized
flag) so exceptions between begin_store() and commit() automatically clean up
the tmp file without waiting for open() sweeps.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/server/compiler/indexer.cpp`:
- Around line 133-149: The code currently proceeds to write merged shard blobs
even if storing the "project" blob fails; modify the save sequence in the
function containing store_blob and workspace.project_index so that after
attempting store_blob(..., "project", ...) you check the boolean result and
return early (or abort further saves) when it is false, preventing any writes of
workspace.merged_indices shards; ensure you reference the existing store_blob
call that serializes workspace.project_index and guard the subsequent loop over
workspace.merged_indices (and any use of shard.index.serialize) so shards are
only persisted when the project blob save succeeded.

In `@src/server/service/master_server.cpp`:
- Around line 211-214: The current shutdown persists caches before quiescing
producers, which allows in-flight indexing/PCH work to complete after the
snapshot; change the shutdown order so you first await/quiesce producers and
compilers (co_await kota::when_all(indexer.stop(), compiler.stop()) or
equivalent) to drain in-flight work, then call indexer.save() and
workspace.save_cache() to persist a stable snapshot, and finally stop the pool
(co_await pool.stop()); update calls around the indexer/compile stop/save
sequence to ensure no new work can modify caches before serialization.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 742002e5-0580-49a1-8440-017a7f13aca7

📥 Commits

Reviewing files that changed from the base of the PR and between d20e897 and cb145d3.

📒 Files selected for processing (18)

src/command/argument_parser.cpp
src/command/argument_parser.h
src/server/compiler/compiler.cpp
src/server/compiler/indexer.cpp
src/server/compiler/indexer.h
src/server/service/master_server.cpp
src/server/service/master_server.h
src/server/service/session.h
src/server/worker/stateless_worker.cpp
src/server/workspace/workspace.cpp
src/server/workspace/workspace.h
src/support/cache_store.cpp
src/support/cache_store.h
tests/integration/compilation/test_persistent_cache.py
tests/integration/compilation/test_staleness.py
tests/integration/utils/cache.py
tests/unit/command/canonicalize_tests.cpp
tests/unit/support/cache_store_tests.cpp

windows.h was included before the LLVM headers without NOMINMAX, so its min/max function-like macros broke SmallVector.h/MathExtras.h macro expansion on the MSVC CI jobs. Also define WIN32_LEAN_AND_MEAN; this is the only direct windows.h include in clice, and the comment documents that any future include needs the same guards.

The benign rename-collision fallback in CacheStore::commit assumed content-addressed keys, which only holds for LRU namespaces. Persistent and Scratch keys are mutable (the index rewrites "project"/"{path_id}" with new content), so on Windows a rename onto an open blob was silently treated as success and the new data discarded — stale index served forever. Now only LRU keeps the fallback (and requires a regular file); Persistent/Scratch remove the stale destination and retry, surfacing an error if the blob still cannot be published, and dropping the in-memory entry when the old blob is gone too.

Indexer::save() ran store.commit() — an fsync per blob — synchronously on the event loop, unlike the PCH/PCM paths which already offload via kota::queue. save() is now a task: phase one serializes the ProjectIndex and dirty shards to tmp files with no suspension point, so the batch stays a consistent snapshot even if a merge runs before the commits land; phase two commits each blob on the kota thread pool.

The index lives inside the unified cache store under cache_dir, so the project.index_dir field no longer points at anything. Drop it from the config struct, its default/substitution logic and the docs; the TOML codec skips unknown keys, so configs that still set index_dir keep parsing (covered by a new test). The legacy index/ directory is still cleaned up at store open.

CliceClient now exposes server/kill_server()/stop_io(), giving the pygls internals (_server, _stop_event, _async_tasks) a single home. The kill -9 recovery test, conftest shutdown and the agentic shutdown tests use the new API instead of poking at private members.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/agentic/test_agentic.py`:
- Around line 447-448: Wrap the call to assert_server_exited_cleanly(c.server)
and await c.stop_io() in a try/finally so stop_io() always runs even if the
assertion fails; specifically, in the test (test_rpc_shutdown) ensure you call
await assert_server_exited_cleanly(c.server) inside a try block and put await
c.stop_io() in the finally block to guarantee IO cleanup and prevent task leaks.

In `@tests/integration/utils/client.py`:
- Around line 129-134: The stop_io method sets _stop_event and cancels tasks but
doesn't await them, leaving pending tasks that can leak; update stop_io to
cancel each Task in self._async_tasks and then await their completion (e.g.,
await asyncio.gather(*self._async_tasks, return_exceptions=True) or await tasks
individually while swallowing asyncio.CancelledError) to ensure all client-side
IO tasks finish before teardown, and clear or reset self._async_tasks afterward.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7f45079b-1634-44e9-adec-e769cb1dc493

📥 Commits

Reviewing files that changed from the base of the PR and between cb145d3 and ec80822.

📒 Files selected for processing (16)

docs/clice.toml
docs/en/guide/configuration.md
docs/zh/guide/configuration.md
src/server/compiler/indexer.cpp
src/server/compiler/indexer.h
src/server/service/master_server.cpp
src/server/workspace/config.cpp
src/server/workspace/config.h
src/support/cache_store.cpp
src/support/cache_store.h
tests/conftest.py
tests/integration/agentic/test_agentic.py
tests/integration/compilation/test_persistent_cache.py
tests/integration/utils/client.py
tests/unit/server/config_tests.cpp
tests/unit/support/cache_store_tests.cpp

💤 Files with no reviewable changes (2)

src/server/workspace/config.h
src/server/workspace/config.cpp

✅ Files skipped from review due to trivial changes (2)

docs/zh/guide/configuration.md
docs/en/guide/configuration.md

🚧 Files skipped from review as they are similar to previous changes (6)

src/server/compiler/indexer.cpp
src/support/cache_store.h
src/server/compiler/indexer.h
src/server/service/master_server.cpp
tests/unit/support/cache_store_tests.cpp
tests/integration/compilation/test_persistent_cache.py

16bit-ykiko · 2026-06-11T18:37:30Z

@codex

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ec80822213

The collision fallback assumed LRU blobs colliding by name share the same content, but even LRU keys are not fully content-addressed: a dependency edit changes PCH content without changing the key input, so on Windows a rename onto an open stale blob was kept while the caller recorded fresh deps metadata — stale PCH served with no rebuild trigger. Keep the survivor only when verified byte-identical; otherwise remove and retry like mutable keys. Also document that the unsigned wraparound in the size accounting is intentional and exact.

Publish shard blobs only together with the ProjectIndex they were built against: bail out when its serialization fails and drop the staged shards when its commit fails, so a restart never pairs new shards with an old project blob. Quiesce compiler/indexer before the shutdown save so the snapshot covers everything that completed.

stop_io now awaits the cancelled IO tasks so none outlive the test, and test_rpc_shutdown reaches it even when the exit assertion fails.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ade9017146

The building-event cleanup (wake waiters, clear the marker) ran inline on each exit path of ensure_pch, so a coroutine cancelled at any of its suspension points unwound without ever signalling the event, and later requests for the same file would wait on it forever. Today that only happens at shutdown, where the waiters are cancelled too, but the invariant is fragile: any future per-compile cancellation would trip it. An RAII guard now owns the cleanup, covering cancellation as well as the normal paths, and only clears the marker if it still points at its own registration (the entry can be erased by didClose and re-registered by a newer build mid-flight).
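
The "only clear the marker if it is still ours" rule can be sketched without coroutines. Everything here is hypothetical scaffolding (`BuildEvent`, `BuildingMap`, `BuildingGuard`); a plain flag stands in for waking waiters on the real event.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the per-file "build in progress" event.
struct BuildEvent {
    bool signalled = false;
};

using BuildingMap = std::unordered_map<std::string, std::shared_ptr<BuildEvent>>;

// Registers a marker on construction. On destruction (normal return,
// exception, or coroutine frame teardown) it always signals its own event,
// but erases the map entry only if that entry is still its registration.
class BuildingGuard {
public:
    BuildingGuard(BuildingMap& map, std::string file) :
        map_(map), file_(std::move(file)), event_(std::make_shared<BuildEvent>()) {
        map_[file_] = event_;
    }

    BuildingGuard(const BuildingGuard&) = delete;
    BuildingGuard& operator=(const BuildingGuard&) = delete;

    ~BuildingGuard() {
        event_->signalled = true;  // wake anyone waiting on this build
        auto it = map_.find(file_);
        if(it != map_.end() && it->second == event_) {
            map_.erase(it);  // still ours: clear the marker
        }
        // Otherwise a newer build re-registered the file; leave it alone.
    }

    std::shared_ptr<BuildEvent> event() const { return event_; }

private:
    BuildingMap& map_;
    std::string file_;
    std::shared_ptr<BuildEvent> event_;
};
```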

Record why the store is synchronous-plus-mutex rather than coroutines over kota's async fs: a started operation always runs to completion, so cancellation never observes a torn mid-operation state. Leave a TODO to re-evaluate the loop-confined coroutine design (no mutex, cancellation points inside operations) once real-world usage settles.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 52ed28fd20

llvm::sys::fs::remove_directories is implemented over shell COM on Windows (CoInitializeEx + IFileOperation) with no fallback, and with the default IgnoreErrors=true a COM failure (e.g. the thread already holds a different apartment model) is a silent no-op — unsuitable for the server event loop, which uses it for the version sweep, dead-pid reclamation and shutdown cleanup. Add fs::remove_all, the same plain recursive removal LLVM's Unix implementation uses (directory_iterator + remove, no COM), and use it everywhere. Also document that the commit rename collision is rarer than it looks: sys::fs::rename already moves aside destinations whose holders granted delete sharing, which LLVM-opened files always do.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eff8fcf930

Nothing has shipped yet, so there is no old on-disk layout to clean up after: remove the startup deletion of the pre-store index/ directory and the stale-layout wording around the version sweep (which keeps discarding anything under cache/ that isn't the current version dir).

A stale entry under cache/ that is a symlink to a directory outside the cache root would be recursed into by the version sweep, deleting files the store does not own. lstat the root first: anything that is not a real directory — including symlinks to directories — is unlinked without being followed (children were already handled no-follow).
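
The no-follow rule can be illustrated with `std::filesystem` as a stand-in for the project's own `fs::remove_all` and LLVM calls. `remove_stale_entry` is a hypothetical name, and the sketch assumes a C++17 toolchain.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Removes one stale entry under the cache root without ever following a
// symlink. symlink_status has lstat semantics: a symlink to a directory is
// reported as a symlink, not as a directory.
void remove_stale_entry(const fs::path& entry) {
    std::error_code ec;
    fs::file_status status = fs::symlink_status(entry, ec);
    if(ec || status.type() == fs::file_type::not_found) {
        return;
    }
    if(status.type() == fs::file_type::directory) {
        // A real directory: recurse. std::filesystem::remove_all removes
        // nested symlinks themselves rather than their targets.
        fs::remove_all(entry, ec);
    } else {
        // Regular file, symlink, or anything else: unlink the entry itself.
        fs::remove(entry, ec);
    }
}
```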

The eviction preflight ran only before compile_deps, but building dependencies can itself evict another clean module's PCM under LRU budget pressure, reopening the window just closed and handing clang a dangling path via fill_pcm_deps. Scan-dirty-compile now loops until a scan finds nothing evicted (bounded retries).

chatgpt-codex-connector

💡 Codex Review

clice/src/command/argument_parser.cpp

Lines 218 to 221 in 4f6d17d

    
    case OPT_fPIC:
    case OPT_fno_PIC:
    case OPT_fpic:
    case OPT_fno_pic:

Keep PIC flags in frontend cache keys

For compile commands that differ only by -fpic versus -fPIC, this makes canonicalize(..., ArgsProfile::Frontend) drop the distinguishing flag, so files with the same preamble can reuse the same PCH/PCM even though Clang exposes different predefined macros (clang++ -fpic/-fPIC -dM -E -x c++ /dev/null reports different __PIC__/__pic__ values). Any header that branches on those macros can therefore be compiled from a stale artifact; these flags need to remain in the frontend key.
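To illustrate the reviewer's point, here is a simplified sketch of a frontend key that keeps the PIC flags. It is not clice's canonicalize: frontend_key, affects_frontend_key and the list of dropped flags are hypothetical, and a real key would be hashed rather than concatenated.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical filter: does this argument feed the frontend cache key?
// Only flags assumed not to change what the frontend sees are dropped.
// Everything else, including -fpic/-fPIC, is kept because it can change
// predefined macros.
inline bool affects_frontend_key(const std::string& arg) {
    static const std::vector<std::string> dropped = {"-c", "-MD", "-MMD", "-MP"};
    for (const std::string& flag : dropped) {
        if (arg == flag) return false;
    }
    return true;
}

// Joins the surviving arguments into a key string.
inline std::string frontend_key(const std::vector<std::string>& args) {
    std::string key;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "-o") {
            ++i;  // skip the flag and its output path
            continue;
        }
        if (!affects_frontend_key(args[i])) continue;
        key += args[i];
        key += '\n';  // separator so adjacent flags cannot merge
    }
    return key;
}
```

With this filter, two commands that differ only in -fpic versus -fPIC produce different keys, while commands that differ only in their -o path share one.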

16bit-ykiko added 3 commits June 12, 2026 00:20

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread src/support/cache_store.cpp

Comment thread src/support/cache_store.cpp Outdated

coderabbitai Bot reviewed Jun 11, 2026

Comment thread src/server/compiler/indexer.cpp Outdated

Comment thread src/server/service/master_server.cpp Outdated

16bit-ykiko added 5 commits June 12, 2026 01:21

coderabbitai Bot reviewed Jun 11, 2026

Comment thread tests/integration/agentic/test_agentic.py Outdated

Comment thread tests/integration/utils/client.py

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread src/support/cache_store.cpp Outdated

16bit-ykiko added 3 commits June 12, 2026 02:48

fix(tests): harden client IO teardown

ade9017

stop_io now awaits the cancelled IO tasks so none outlive the test, and test_rpc_shutdown reaches it even when the exit assertion fails.

chatgpt-codex-connector Bot reviewed Jun 11, 2026

Comment thread src/support/cache_store.cpp

16bit-ykiko mentioned this pull request Jun 12, 2026

feat(server): error feedback and logging channels (D9) #456

Open

4 tasks

16bit-ykiko added 2 commits June 12, 2026 17:00

chatgpt-codex-connector Bot reviewed Jun 12, 2026

Comment thread src/server/compiler/compiler.cpp Outdated

Comment thread src/server/compiler/indexer.cpp

chatgpt-codex-connector Bot reviewed Jun 12, 2026

Comment thread src/server/compiler/compiler.cpp

Comment thread src/support/cache_store.cpp

Comment thread src/support/cache_store.cpp

16bit-ykiko added 3 commits June 12, 2026 19:56

chatgpt-codex-connector Bot reviewed Jun 12, 2026

Comment thread src/server/compiler/compiler.cpp
