diff --git a/changelog/727.feature.md b/changelog/727.feature.md
new file mode 100644
index 000000000..749c94f24
--- /dev/null
+++ b/changelog/727.feature.md
@@ -0,0 +1,5 @@
+Added the `ref test-cases ci-gate` command, which decides how CI should verify each
+test case's regression baseline against the base branch: skip it, replay the cached
+native baseline, re-run the diagnostic in full, or fail an unauthorised change.
+Baselines are now coupled to their input catalogs, so changing a test case's inputs
+without regenerating its baseline is reported as a failure rather than passing silently.
diff --git a/docs/background/regression-baselines.md b/docs/background/regression-baselines.md
new file mode 100644
index 000000000..c4c9b2c36
--- /dev/null
+++ b/docs/background/regression-baselines.md
@@ -0,0 +1,156 @@
+# Regression baselines and the CI coupling gate
+
+Climate REF pins each test case to a **regression baseline**:
+a recorded, known-good output that a pull request is checked against
+so that an unintended change in a diagnostic's results cannot land unnoticed.
+
+This page explains the two-layer baseline model,
+the lifecycle commands that produce and verify it,
+and the CI coupling gate that decides *how* each test case is verified in a pull request.
+
+## The two-layer baseline model
+
+A baseline is split into two layers with very different trust and portability properties.
+
+- The **committed bundle** is the small, text-only CMEC output
+  (`series.json`, `diagnostic.json`, `output.json`)
+  written into the test case's `regression/` directory and tracked in git.
+  Absolute paths are rewritten to portable placeholders
+  (`<OUTPUT_DIR>`, `<TEST_DATA_DIR>`) so the bytes are machine independent.
+  This bundle is **the gate signal**: it is what review actually sees in the diff.
+
+- The **native bundle** is the heavy binary output
+  (`.nc`, `.png`, ...) that the committed bundle references.
+  Native files are content-addressed by their sha256 digest in an object store,
+  fetched anonymously, and are **never required to be present**.
+  They are written only by the credentialed `mint` step.
+
+The two layers are bound by a **`manifest.json`** alongside the bundle, which records:
+
+- `test_case_version` — a monotonic, author-bumped integer that *authorises* a new baseline.
+- `committed` — sha256 digests of the committed JSON artefacts, over the exact placeholder-substituted bytes on disk.
+- `native` — sha256 + size of each curated native file (empty `{}` until minted).
+- `catalog_hash` — the hash of the test case's input `catalog.yaml`, coupling the baseline to the inputs that produced it.
+
+!!! note "An empty native set is a permanent, valid state"
+    Fork contributors cannot mint
+    (minting needs object-store write credentials that never run on untrusted pull-request code).
+    A baseline with `native: {}` is fully gated by its committed JSON bundle;
+    the native layer is **opt-in extra verification** that only runs when the blobs exist.
+
+## Lifecycle commands
+
+The `ref test-cases` verbs produce and verify baselines.
+Only `mint` needs write credentials; everything else is anonymous and safe on untrusted code.
+
+```mermaid
+flowchart LR
+    run["run<br/>(no creds)"] -->|"seed committed bundle<br/>+ manifest, native = {}"| repo[("regression/<br/>+ manifest.json")]
+    repo --> mint["mint<br/>(write creds, CI)"]
+    mint -->|"upload blobs to store<br/>populate manifest.native"| store[("native object<br/>store (sha256)")]
+    store --> sync["sync<br/>(public read)"]
+    store --> replay["replay<br/>(public read)"]
+    sync -->|"fetch blobs locally"| local[(local native)]
+    replay -->|"materialise native +<br/>re-run build_execution_result +<br/>tolerant compare"| verdict{matches<br/>committed?}
+    verdict -->|yes| ok([pass])
+    verdict -->|no| bad([fail])
+```
+
+| Verb | Credentials | What it does |
+| --- | --- | --- |
+| `run` | none | Execute the diagnostic, curate outputs, write the committed bundle and seed `manifest.json` (`native = {}`). |
+| `mint` | write | Upload the curated native files to the object store and populate `manifest.native`. Generally run by CI. |
+| `sync` | public read | Fetch the native blobs referenced by the manifest into the local tree. |
+| `replay` | public read | Materialise the native baseline, re-run only `build_execution_result`, and tolerantly compare to the committed bundle. |
+
+## Tolerant comparison
+
+`replay` and the execute path do not require byte-equality of the regenerated JSON.
+A byte-equal fast path short-circuits the common case;
+otherwise values are compared with a small relative/absolute tolerance
+(`math.isclose`), and volatile keys/values are sanitised to placeholders before comparison.
+This absorbs platform-level floating-point noise without masking a real change in results.
+
+## The CI coupling gate
+
+For every test case, CI must decide *how* to verify the baseline in a pull request,
+purely from what changed relative to the base branch.
+`decide_coupling` is a pure function over the head and base manifests
+plus three on-disk facts the caller supplies —
+whether the diagnostic's extraction code changed,
+whether the committed bundle still matches its manifest digests,
+and whether the input catalog still matches its manifest hash.
+
+The gate **fails closed**:
+any state it cannot positively verify is a failure, not a silent skip.
+The single exception is the native layer —
+because an empty native set is legitimate, a missing native baseline downgrades to `skip` (with a warning), never `fail`.
+
+### Actions
+
+- **`skip`** — nothing relevant to this case changed, or the case is not under regression management.
+- **`replay`** — cheap, anonymous re-check against the cached native baseline (only when native blobs exist).
+- **`execute`** — full end-to-end re-run with tolerant compare; required when `test_case_version` was bumped to authorise a new baseline.
+- **`fail`** — an unauthorised or unverifiable change (committed bundle edited without a version bump, version moved backwards, bundle drifted from its manifest, or catalog changed without regenerating the baseline).
+
+### Decision flow
+
+```mermaid
+flowchart TD
+    start([test case]) --> hasManifest{manifest.json<br/>present?}
+
+    hasManifest -->|no| baseHad{present on<br/>base branch?}
+    baseHad -->|yes| failDeleted[["FAIL<br/>managed baseline removed"]]
+    baseHad -->|no| skipUnmanaged[["SKIP<br/>not under management"]]
+
+    hasManifest -->|yes| committedOk{committed bundle<br/>matches manifest<br/>digests?}
+    committedOk -->|no| failDrift[["FAIL<br/>bundle drifted from manifest"]]
+
+    committedOk -->|yes| catalogOk{catalog.yaml<br/>matches manifest<br/>catalog_hash?}
+    catalogOk -->|no| failCatalog[["FAIL<br/>inputs changed without<br/>regenerating baseline"]]
+
+    catalogOk -->|yes| seeding{newly added<br/>this branch?}
+    seeding -->|yes| seedNative{native<br/>non-empty?}
+    seedNative -->|yes| replaySeed[["REPLAY<br/>seed against committed"]]
+    seedNative -->|no| skipSeed[["SKIP<br/>committed-only baseline"]]
+
+    seeding -->|no| versionCmp{version vs base}
+    versionCmp -->|decreased| failVersion[["FAIL<br/>version not monotonic"]]
+    versionCmp -->|bumped| execute[["EXECUTE<br/>authorised new baseline"]]
+    versionCmp -->|unchanged| committedChanged{committed<br/>changed?}
+
+    committedChanged -->|yes| failUnauthorised[["FAIL<br/>baseline changed without<br/>version bump"]]
+    committedChanged -->|no| nativeChanged{native<br/>changed?}
+
+    nativeChanged -->|yes| nativeEmpty{"head native<br/>empty?<br/>(de-mint)"}
+    nativeEmpty -->|no| replayNative[["REPLAY<br/>confirm new native<br/>reproduces committed"]]
+    nativeEmpty -->|yes| skipDemint[["SKIP + WARN<br/>native removed"]]
+
+    nativeChanged -->|no| extractionChanged{extraction<br/>code changed?}
+    extractionChanged -->|no| skipNothing[["SKIP<br/>nothing to verify"]]
+    extractionChanged -->|yes| extractNative{native<br/>non-empty?}
+    extractNative -->|yes| replayExtract[["REPLAY<br/>verify against cached native"]]
+    extractNative -->|no| skipExtract[["SKIP<br/>no native to replay"]]
+```
+
+### How changed files map to the signals
+
+The gate diffs `base...HEAD` (against the merge base, so unrelated base-branch churn is excluded)
+and maps the changed-file list onto each case:
+
+- **`extraction_changed`** is coarse and errs toward `replay` (cheap, credential-free):
+  any change under the diagnostic's provider package,
+  or under the core regression package, counts for every case in that provider.
+- **`committed_integrity_ok`** recomputes the committed digests from the working tree
+  and compares them to `manifest.committed`.
+- **`catalog_integrity_ok`** recomputes the input catalog hash
+  and compares it to `manifest.catalog_hash`,
+  catching an input change that was not accompanied by a regenerated baseline.
+
+The gate exits non-zero if any case is gated `fail`;
+the `--json` output drives CI's dispatch of the `replay` and `execute` jobs.
+
+!!! warning "Credentials never cross the trust boundary"
+    `replay`, `sync`, `run`, and the gate itself are anonymous and safe to run on untrusted pull-request code.
+    Only `mint` holds object-store write credentials,
+    and it runs exclusively on the trusted-tier runner — never on fork pull-request code.
diff --git a/mkdocs.yml b/mkdocs.yml
index 4049eaa1e..093fadfa3 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -41,6 +41,7 @@ nav:
       - Architecture: background/architecture.md
       - Explanation: background/explanation.md
       - Datasets: background/datasets.md
+      - Regression Baselines: background/regression-baselines.md
       - API Surface: api-surface.md
       - Versioning and Compatibility: versioning-and-compatibility.md
       - Quality Assurance: background/quality-assurance.md
diff --git a/packages/climate-ref-core/src/climate_ref_core/paths.py b/packages/climate-ref-core/src/climate_ref_core/paths.py
index d1cc3e699..2adb0c898 100644
--- a/packages/climate-ref-core/src/climate_ref_core/paths.py
+++ b/packages/climate-ref-core/src/climate_ref_core/paths.py
@@ -58,10 +58,10 @@ def safe_path(
     """
     text = str(relpath)
     pure = PurePosixPath(text)
-    if not text or "\x00" in text or pure.is_absolute() or ".." in pure.parts:
+    if not text or "\x00" in text or pure.is_absolute() or ".." in pure.parts or not pure.parts:
         raise ValueError(
             f"Unsafe {label} {text!r}: must be a contained relative path "
-            "(no absolute paths, no '..', no NUL bytes)."
+            "(no absolute paths, no '.' or '..', no NUL bytes)."
         )
     if single_segment and (len(pure.parts) != 1 or "\\" in text):
         raise ValueError(
diff --git a/packages/climate-ref-core/src/climate_ref_core/regression/__init__.py b/packages/climate-ref-core/src/climate_ref_core/regression/__init__.py
index e2519cb20..d92acc1ca 100644
--- a/packages/climate-ref-core/src/climate_ref_core/regression/__init__.py
+++ b/packages/climate-ref-core/src/climate_ref_core/regression/__init__.py
@@ -22,6 +22,12 @@
     assert_bundle_regression,
     compare_json_content,
 )
+from climate_ref_core.regression.gate import (
+    Action,
+    GateDecision,
+    decide_coupling,
+    paths_under,
+)
 from climate_ref_core.regression.manifest import (
     COMMITTED_BUNDLE_FILES,
     SCHEMA_VERSION,
@@ -43,6 +49,8 @@
 __all__ = [
     "COMMITTED_BUNDLE_FILES",
     "SCHEMA_VERSION",
+    "Action",
+    "GateDecision",
     "LocalFilesystemStore",
     "Manifest",
     "NativeEntry",
@@ -56,7 +64,9 @@
     "capture_execution",
     "compare_json_content",
     "compute_committed_digests",
+    "decide_coupling",
     "materialise_native",
+    "paths_under",
     "sha256_bytes",
     "sha256_file",
     "verify_committed_integrity",
diff --git a/packages/climate-ref-core/src/climate_ref_core/regression/compare.py b/packages/climate-ref-core/src/climate_ref_core/regression/compare.py
index 5bc01c89c..7a7d610f0 100644
--- a/packages/climate-ref-core/src/climate_ref_core/regression/compare.py
+++ b/packages/climate-ref-core/src/climate_ref_core/regression/compare.py
@@ -23,12 +23,12 @@ class Tolerance:
         Relative tolerance — the allowed proportional difference between expected
         and actual float values (default ``1e-6``).
     atol
-        Absolute tolerance — the minimum absolute difference that can ever be
-        flagged, regardless of magnitude (default ``0.0``).
+        Absolute tolerance — the floor difference always permitted,
+        regardless of magnitude (default ``1e-8``).
     """
 
     rtol: float = 1e-6
-    atol: float = 0.0
+    atol: float = 1e-8
 
 
 def _rewrite_keys_and_values(obj: Any, replacements: list[tuple[str, str]]) -> Any:
@@ -137,9 +137,10 @@ def _compare_recursive(
     is still identifiable.
     """
     label = path or "<root>"
-    # Type mismatch (treat int/float as numeric together)
+    # Type mismatch (treat int/float as numeric together).
     if type(expected) is not type(actual):
-        if isinstance(expected, (int, float)) and isinstance(actual, (int, float)):
+        either_is_bool = isinstance(expected, bool) or isinstance(actual, bool)
+        if not either_is_bool and isinstance(expected, (int, float)) and isinstance(actual, (int, float)):
             _compare_numeric(float(expected), float(actual), tol=tol, path=label, out=out)
             return
         out.append(
diff --git a/packages/climate-ref-core/src/climate_ref_core/regression/gate.py b/packages/climate-ref-core/src/climate_ref_core/regression/gate.py
new file mode 100644
index 000000000..17b75ac27
--- /dev/null
+++ b/packages/climate-ref-core/src/climate_ref_core/regression/gate.py
@@ -0,0 +1,259 @@
+"""
+CI coupling gate for test case regression bundles.
+
+For each test case, CI must decide *how* to verify the regression baseline in a pull request,
+based on what changed relative to the base branch:
+
+- **EXECUTE** — re-run the diagnostic end-to-end and compare the regenerated committed
+    bundle to the in-repo copy (tolerant compare).
+    Required when the author has bumped ``test_case_version`` to authorise a new baseline.
+- **REPLAY** — materialise the cached native baseline and re-run only
+    ``build_execution_result`` against it, comparing to the committed bundle.
+    Cheap and anonymous; used when extraction code changed but the committed bundle did
+    not, and to seed a newly added manifest — but only when native blobs exist to replay.
+- **FAIL** — the committed bundle (or its input catalog) changed without a
+    ``test_case_version`` bump (an unauthorised baseline change), or the version moved backwards.
+- **SKIP** — nothing relevant to this test case changed, or the case is not yet under
+    regression-baseline management.
+
+See [Regression Baselines](https://climate-ref.readthedocs.io/en/latest/background/regression-baselines/)
+for more information about the motivation and workflow for regression baselines.
+"""
+
+from __future__ import annotations
+
+import enum
+from collections.abc import Iterable
+
+from attrs import frozen
+
+from climate_ref_core.regression.manifest import Manifest
+
+
+class Action(enum.Enum):
+    """The verification action the CI gate selects for a single test case."""
+
+    SKIP = "skip"
+    """Nothing relevant changed, or the case is not under regression management."""
+
+    REPLAY = "replay"
+    """Replay the cached native baseline and compare to the committed bundle."""
+
+    EXECUTE = "execute"
+    """Re-run the diagnostic end-to-end and compare to the committed bundle."""
+
+    FAIL = "fail"
+    """The change is not permissible (unauthorised baseline change or bad version)."""
+
+
+@frozen
+class GateDecision:
+    """The gate's decision for one test case: an :class:`Action` and why."""
+
+    action: Action
+    """The selected verification action."""
+
+    reason: str
+    """A human-readable explanation, surfaced in CI logs."""
+
+
+def decide_coupling(  # noqa: PLR0911, PLR0912
+    manifest: Manifest | None,
+    base_manifest: Manifest | None,
+    *,
+    extraction_changed: bool = False,
+    committed_integrity_ok: bool = True,
+    catalog_integrity_ok: bool = True,
+) -> GateDecision:
+    """
+    Decide how CI should verify a single test case's regression baseline.
+
+    The decision is pure: it performs no I/O.
+    All on-disk reality is summarised by its arguments
+    — the two manifests, whether the diagnostic's extraction code
+    changed, and whether the committed bundle on disk still matches the current
+    manifest's digests.
+
+    The gate fails if any state cannot be positively verified.
+    A deleted managed manifest, a committed bundle that drifted from its manifest,
+    an input catalog that drifted from its manifest
+    is a failure rather than a silent skip.
+
+    Changes to the native baseline are not failures.
+    This is due to the workflow for minting requires credentials.
+    This means fork contributors cannot author or edit native blobs.
+    ``replay`` is therefore only selected when native blobs actually exist to replay;
+    an absent or removed native baseline downgrades to ``skip`` (with a warning in the reason),
+    never ``fail``.
+
+    See the module docstring for the meaning of each :class:`Action`.
+
+    Parameters
+    ----------
+    manifest
+        The current ``manifest.json`` for the test case,
+        or ``None`` if the case has no manifest on this branch (never managed, or deleted in this change).
+    base_manifest
+        The ``manifest.json`` as it exists on the base branch,
+        or ``None`` if the manifest is newly added in this change (seeding).
+    extraction_changed
+        Whether code that influences ``build_execution_result`` for this test case
+        changed in this pull request.
+        Only consulted when the committed bundle is unchanged and the version was not bumped.
+    committed_integrity_ok
+        Whether the committed bundle on disk matches the current manifest's ``committed`` digests exactly
+        (no edited, added, or removed committed file).
+        The caller computes this against the working tree.
+        ``False`` means the manifest no longer describes the bundle it is supposed to gate,
+        which is a hard failure.
+
+        Ignored when ``manifest`` is ``None`` (nothing to verify).
+    catalog_integrity_ok
+        Whether the test case's input ``catalog.yaml`` still matches the current
+        manifest's ``catalog_hash``.
+        The caller computes this against the working tree.
+        ``False`` means the inputs changed without the baseline being regenerated,
+        so the committed bundle no longer reflects its inputs — a hard failure.
+
+        Always ``True`` when the manifest carries no ``catalog_hash``
+
+    Returns
+    -------
+    :
+        The gate's decision, pairing an :class:`Action` with a reason.
+    """
+    if manifest is None:
+        # Distinguish a never-managed case from the deletion of a managed baseline.
+        # Deleting manifest.json must not be a silent way to disable the gate.
+        if base_manifest is not None:
+            return GateDecision(
+                Action.FAIL,
+                "manifest.json is absent but exists on the base branch; a managed "
+                "regression baseline cannot be removed without review",
+            )
+        return GateDecision(
+            Action.SKIP,
+            "no manifest.json; test case not under regression-baseline management",
+        )
+
+    # The manifest must faithfully describe the committed bundle it gates.
+    # If the bundle on disk drifted from the manifest digests
+    # (an edit/add/remove without regenerating the manifest), the metadata comparisons below are meaningless.
+    if not committed_integrity_ok:
+        return GateDecision(
+            Action.FAIL,
+            "committed bundle on disk does not match manifest.json digests; "
+            "regenerate the manifest with `ref test-cases run` after changing the bundle",
+        )
+
+    # The input catalog must still describe the baseline it produced.
+    if not catalog_integrity_ok:
+        return GateDecision(
+            Action.FAIL,
+            "input catalog.yaml does not match manifest.json catalog_hash; "
+            "regenerate the baseline with `ref test-cases run` after changing the inputs",
+        )
+
+    if base_manifest is None:
+        # Seeding a newly added manifest.
+        # Replay only verifies something when native blobs exist
+        if manifest.native:
+            return GateDecision(
+                Action.REPLAY,
+                "manifest newly added (seeding); replaying native baseline against committed bundle",
+            )
+        return GateDecision(
+            Action.SKIP,
+            "manifest newly added (seeding) with no native baseline; "
+            "the committed bundle is the only signal and is reviewed in the diff",
+        )
+
+    if manifest.test_case_version < base_manifest.test_case_version:
+        return GateDecision(
+            Action.FAIL,
+            f"test_case_version decreased ({base_manifest.test_case_version} -> "
+            f"{manifest.test_case_version}); version must be monotonic",
+        )
+
+    version_bumped = manifest.test_case_version > base_manifest.test_case_version
+    committed_changed = manifest.committed != base_manifest.committed
+    native_changed = manifest.native != base_manifest.native
+
+    if version_bumped:
+        return GateDecision(
+            Action.EXECUTE,
+            f"test_case_version bumped ({base_manifest.test_case_version} -> "
+            f"{manifest.test_case_version}); executing full run with tolerant compare",
+        )
+
+    if committed_changed:
+        return GateDecision(
+            Action.FAIL,
+            "committed bundle changed without a test_case_version bump; "
+            "bump test_case_version to authorise the new baseline",
+        )
+
+    if native_changed:
+        if manifest.native:
+            # Native blobs were re-authored (re-minted) without a version bump.
+            # The committed bundle is unchanged,
+            # so verify the new native snapshot still reproduces it rather than skipping unverified.
+            return GateDecision(
+                Action.REPLAY,
+                "native baseline changed with committed bundle unchanged; "
+                "replaying to confirm the new native snapshot reproduces the committed bundle",
+            )
+        # De-mint: the native baseline was removed while the committed bundle stayed.
+        return GateDecision(
+            Action.SKIP,
+            "WARNING: native baseline removed (de-mint) with committed bundle unchanged; "
+            "the committed bundle still gates this case but native replay is no longer possible",
+        )
+
+    if extraction_changed:
+        if manifest.native:
+            return GateDecision(
+                Action.REPLAY,
+                "extraction code changed with committed bundle unchanged; "
+                "replaying cached native baseline to verify",
+            )
+        return GateDecision(
+            Action.SKIP,
+            "extraction code changed but no native baseline exists to replay; "
+            "the committed bundle is unchanged and remains the only signal",
+        )
+
+    return GateDecision(
+        Action.SKIP,
+        "no committed-bundle, native, version, or extraction-code change; nothing to verify",
+    )
+
+
+def paths_under(changed_files: Iterable[str], roots: Iterable[str]) -> bool:
+    """
+    Return whether any changed file lies within one of the given directory roots.
+
+    A small helper for deriving the ``extraction_changed`` signal from a pull request's changed-file list.
+    Paths are compared textually as POSIX-style, repo-relative strings,
+    so callers must normalise both ``changed_files`` and ``roots`` to the same convention
+    (e.g. ``git diff --name-only`` output and a package source directory relative to the repo root).
+
+    Parameters
+    ----------
+    changed_files
+        Repo-relative paths changed in the pull request.
+    roots
+        Repo-relative directory prefixes to test against. A trailing slash is
+        optional; an empty root never matches.
+
+    Returns
+    -------
+    :
+        ``True`` if any changed file equals or sits beneath any root.
+    """
+    normalised_roots = [root.rstrip("/") for root in roots if root.rstrip("/")]
+    for changed in changed_files:
+        for root in normalised_roots:
+            if changed == root or changed.startswith(root + "/"):
+                return True
+    return False
diff --git a/packages/climate-ref-core/src/climate_ref_core/regression/manifest.py b/packages/climate-ref-core/src/climate_ref_core/regression/manifest.py
index 743ea1f95..47dba4af1 100644
--- a/packages/climate-ref-core/src/climate_ref_core/regression/manifest.py
+++ b/packages/climate-ref-core/src/climate_ref_core/regression/manifest.py
@@ -60,7 +60,11 @@ def _validate_digest(digest: str) -> str:
     return digest
 
 
-COMMITTED_BUNDLE_FILES: tuple[str, ...] = ("series.json", "diagnostic.json", "output.json")
+COMMITTED_BUNDLE_FILES: tuple[str, ...] = (
+    "series.json",
+    "diagnostic.json",
+    "output.json",
+)
 """The committed CMEC artefacts tracked in git.
 
 Their digests are tracked in :attr:`Manifest.committed`.
@@ -135,6 +139,12 @@ class Manifest:
     native: dict[str, NativeEntry]
     """Digests of curated native output files: ``{relpath: NativeEntry}``."""
 
+    catalog_hash: str | None = None
+    """Hash of the test case input ``catalog.yaml`` (its ``_metadata.hash``) that produced this baseline.
+    This couples the baseline to its inputs.
+    The CI gate fails a case whose live catalog hash no longer matches this value.
+    """
+
     @classmethod
     def load(cls, path: Path) -> Manifest:
         """
@@ -156,11 +166,39 @@ def load(cls, path: Path) -> Manifest:
             If the manifest is missing required keys or has malformed native entries
             (e.g. hand-edited or written by an incompatible version).
         """
-        data = json.loads(path.read_text(encoding="utf-8"))
+        return cls.loads(path.read_text(encoding="utf-8"), source=str(path))
+
+    @classmethod
+    def loads(cls, text: str, *, source: str = "<string>") -> Manifest:
+        """
+        Parse a manifest from its JSON text.
+
+        Used when the manifest does not live on disk at parse time,
+        e.g. when reading the base-branch copy via ``git show`` for the CI coupling gate.
+
+        Parameters
+        ----------
+        text
+            The manifest JSON.
+        source
+            A label for the text's origin, used in error messages.
+
+        Returns
+        -------
+        :
+            The parsed manifest.
+
+        Raises
+        ------
+        ValueError
+            If the manifest is missing required keys or has malformed native entries
+            (e.g. hand-edited or written by an incompatible version).
+        """
+        data = json.loads(text)
         missing = [key for key in ("schema", "test_case_version", "committed", "native") if key not in data]
         if missing:
             raise ValueError(
-                f"Invalid manifest {path}: missing required keys {missing}. "
+                f"Invalid manifest {source}: missing required keys {missing}. "
                 "The manifest may be corrupted or written by an incompatible version; "
                 "regenerate it with `ref test-cases run --force-regen`."
             )
@@ -171,7 +209,7 @@ def load(cls, path: Path) -> Manifest:
             }
         except (KeyError, TypeError, AttributeError) as exc:
             raise ValueError(
-                f"Invalid manifest {path}: malformed 'native' entry ({exc!r}). "
+                f"Invalid manifest {source}: malformed 'native' entry ({exc!r}). "
                 "Each entry must be a mapping with 'sha256' and 'size' keys."
             ) from exc
         # Reject hand-edited or hostile manifests that could escape the
@@ -180,13 +218,15 @@ def load(cls, path: Path) -> Manifest:
             try:
                 safe_path(relpath, label="native path")
             except ValueError as exc:
-                raise ValueError(f"Invalid manifest {path}: {exc}") from exc
+                raise ValueError(f"Invalid manifest {source}: {exc}") from exc
             _validate_digest(entry.sha256)
         return cls(
             schema=data["schema"],
             test_case_version=data["test_case_version"],
             committed=dict(data["committed"]),
             native=native,
+            # TODO: remove optonality when all manifests have this field.
+            catalog_hash=data.get("catalog_hash"),
         )
 
     def dump(self, path: Path) -> None:
@@ -204,6 +244,7 @@ def dump(self, path: Path) -> None:
         payload = {
             "schema": self.schema,
             "test_case_version": self.test_case_version,
+            "catalog_hash": self.catalog_hash,
             "committed": self.committed,
             "native": {relpath: asdict(entry) for relpath, entry in self.native.items()},
         }
@@ -211,7 +252,7 @@ def dump(self, path: Path) -> None:
         path.write_text(text, encoding="utf-8")
 
     @classmethod
-    def seed_v1(cls, committed_digests: dict[str, str]) -> Manifest:
+    def seed_v1(cls, committed_digests: dict[str, str], catalog_hash: str | None = None) -> Manifest:
         """
         Create an initial manifest at ``test_case_version == 1`` with no native outputs.
 
@@ -219,6 +260,8 @@ def seed_v1(cls, committed_digests: dict[str, str]) -> Manifest:
         ----------
         committed_digests
             Digests of the committed regression JSON artefacts.
+        catalog_hash
+            Hash of the input ``catalog.yaml`` that produced the baseline, if known.
 
         Returns
         -------
@@ -229,6 +272,7 @@ def seed_v1(cls, committed_digests: dict[str, str]) -> Manifest:
             schema=SCHEMA_VERSION,
             test_case_version=1,
             committed=dict(committed_digests),
+            catalog_hash=catalog_hash,
             native={},
         )
 
diff --git a/packages/climate-ref-core/src/climate_ref_core/regression/store.py b/packages/climate-ref-core/src/climate_ref_core/regression/store.py
index 8505a6e97..355416f96 100644
--- a/packages/climate-ref-core/src/climate_ref_core/regression/store.py
+++ b/packages/climate-ref-core/src/climate_ref_core/regression/store.py
@@ -21,6 +21,7 @@
 """
 
 import shutil
+from functools import cache
 from pathlib import Path
 from typing import Protocol, runtime_checkable
 from urllib.parse import unquote, urlsplit
@@ -34,6 +35,23 @@
 from .manifest import _validate_digest, sha256_file
 
 
+@cache
+def _pooch_manager(base_url: str, cache_dir: str) -> pooch.Pooch:
+    """
+    Build (and cache) a pooch manager for a ``(base_url, cache_dir)`` pair.
+
+    ``pooch.create`` rebuilds the whole manager and registry, so doing it per
+    ``fetch`` is wasteful when many blobs are pulled from the same store.
+    The manager is keyed by its immutable inputs and reused across fetches;
+    per-blob registry entries are added on the shared instance at fetch time.
+    """
+    return pooch.create(
+        path=Path(cache_dir),
+        base_url=base_url + "/",
+        retry_if_failed=10,
+    )
+
+
 @runtime_checkable
 class NativeStore(Protocol):
     """
@@ -265,17 +283,11 @@ def fetch(self, digest: str, dest: Path) -> None:
         ValueError
             If the downloaded blob's sha256 does not match ``digest``.
         """
-        # Build a single-file pooch registry: the URL path component is the digest
-        # and the expected hash IS the digest (content-addressed).
-        registry = pooch.create(
-            path=self.cache_dir,
-            base_url=self.base_url + "/",
-            retry_if_failed=10,
-        )
+        registry = _pooch_manager(self.base_url, str(self.cache_dir))
         registry.registry[digest] = digest  # content-addressed: hash == name
 
         cached = registry.fetch(digest)
-        # Verify the cached copy against the expected digest.
+        # pooch should already verify the hash against the registry entry
         _verify_hash_matches(cached, digest)
 
         dest.parent.mkdir(parents=True, exist_ok=True)
@@ -380,26 +392,29 @@ def build_native_store(config: _NativeStoreConfigProtocol, *, writable: bool) ->
     cache_dir: Path = config.cache_dir
 
     parts = urlsplit(url)
-    is_local = parts.scheme not in ("http", "https")
-
-    if is_local:
-        if parts.scheme == "file":
-            # Parse properly so malformed variants fail loudly instead of
-            # silently producing a wrong (e.g. relative) path.
-            if parts.netloc not in ("", "localhost"):
-                raise ValueError(
-                    f"Unsupported file URL {url!r}: a host component ({parts.netloc!r}) is not "
-                    "supported. Use the file:///absolute/path form (three slashes)."
-                )
-            # Percent-decode the path so escaped characters (e.g. %20) resolve to the real on-disk directory.
-            local_root = Path(unquote(parts.path))
-        else:
-            # A bare filesystem path.
-            local_root = Path(url)
-        return LocalFilesystemStore(root=local_root)
-    elif writable:
-        # Remote writable store — R2 backend is deferred.
-        # Construction raises NotImplementedError until the follow-up lands.
-        return R2WriteStore()
-    else:
+    scheme = parts.scheme
+
+    if scheme in ("http", "https"):
+        if writable:
+            # TODO: Construction raises NotImplementedError until the follow-up lands.
+            return R2WriteStore()
         return PoochReadStore(base_url=url.rstrip("/"), cache_dir=cache_dir)
+
+    if scheme == "file":
+        # Parse properly so malformed variants fail loudly instead of
+        # silently producing a wrong (e.g. relative) path.
+        if parts.netloc not in ("", "localhost"):
+            raise ValueError(
+                f"Unsupported file URL {url!r}: a host component ({parts.netloc!r}) is not "
+                "supported. Use the file:///absolute/path form (three slashes)."
+            )
+        return LocalFilesystemStore(root=Path(unquote(parts.path)))
+
+    if scheme == "":
+        return LocalFilesystemStore(root=Path(url))
+
+    raise ValueError(
+        f"Unsupported native store URL {url!r}: scheme {scheme!r} is not recognised. "
+        "Use http(s):// for a remote store, or file:///absolute/path or a bare filesystem "
+        "path for a local store."
+    )
diff --git a/packages/climate-ref-core/src/climate_ref_core/testing.py b/packages/climate-ref-core/src/climate_ref_core/testing.py
index 771057118..a2f60c9bd 100644
--- a/packages/climate-ref-core/src/climate_ref_core/testing.py
+++ b/packages/climate-ref-core/src/climate_ref_core/testing.py
@@ -30,9 +30,10 @@
 from climate_ref_core.diagnostics import ExecutionDefinition, ExecutionResult
 from climate_ref_core.esgf.base import ESGFRequest
 from climate_ref_core.metric_values.typing import SeriesMetricValue
-from climate_ref_core.output_files import from_placeholders
+from climate_ref_core.output_files import from_placeholders, ordered_replacements
 from climate_ref_core.pycmec.metric import CMECMetric
 from climate_ref_core.pycmec.output import CMECOutput
+from climate_ref_core.regression.manifest import Manifest
 
 if TYPE_CHECKING:
     from _pytest.mark.structures import ParameterSet
@@ -200,11 +201,6 @@ def manifest(self) -> Path:
         """Path to manifest.json (the regression bundle manifest, tracked in git)."""
         return self.root / "manifest.json"
 
-    @property
-    def regression_catalog_hash(self) -> Path:
-        """Path to catalog hash file in regression directory."""
-        return self.regression / ".catalog_hash"
-
     @property
     def test_data_dir(self) -> Path:
         """Path to the test-data directory (parent of diagnostic slug dir)."""
@@ -366,10 +362,14 @@ def catalog_changed_since_regression(paths: TestCasePaths) -> bool:
     """
     Check if the catalog has changed since regression data was generated.
 
+    The baseline's input hash is read from ``manifest.json`` (``catalog_hash``),
+    the single coupling record; there is no separate sidecar.
+
     Returns True if:
     - No regression data exists (new test case)
-    - No stored catalog hash exists (legacy regression data)
-    - The catalog hash differs from the stored one
+    - No manifest, or the manifest records no ``catalog_hash`` (legacy regression data)
+    - No catalog file exists
+    - The current catalog hash differs from the one recorded in the manifest
 
     Parameters
     ----------
@@ -383,15 +383,16 @@ def catalog_changed_since_regression(paths: TestCasePaths) -> bool:
     """
     if not paths.regression.exists():
         return True  # No regression data, needs to run
-    if not paths.regression_catalog_hash.exists():
-        return True  # No stored hash, needs to run
     if not paths.catalog.exists():
         return True  # No catalog file, needs to run
+    if not paths.manifest.exists():
+        return True  # No manifest, needs to run
 
-    stored_hash = paths.regression_catalog_hash.read_text().strip()
-    current_hash = get_catalog_hash(paths.catalog)
+    stored_hash = Manifest.load(paths.manifest).catalog_hash
+    if stored_hash is None:
+        return True  # Legacy manifest without a recorded catalog hash, needs to run
 
-    return stored_hash != current_hash
+    return stored_hash != get_catalog_hash(paths.catalog)
 
 
 def _sanitize_for_yaml(value: Any) -> Any:
@@ -537,8 +538,9 @@ def _load_series_sanitised(path: Path, replacements: dict[str, str]) -> list[Ser
     Load series from ``path``, replacing real paths with regression placeholders.
     """
     text = path.read_text(encoding="utf-8")
-    # Apply replacements longest-key-first in case of nested repleacements
-    for real, placeholder in sorted(replacements.items(), key=lambda kv: len(kv[0]), reverse=True):
+    # Apply replacements longest-key-first so a shorter key cannot shadow a longer one
+    # (the canonical ordering shared with all other sanitisation).
+    for real, placeholder in ordered_replacements(replacements):
         text = text.replace(real, placeholder)
     data = json.loads(text)
     if not isinstance(data, list):
diff --git a/packages/climate-ref-core/tests/unit/regression/test_compare.py b/packages/climate-ref-core/tests/unit/regression/test_compare.py
index 55ca671c5..e0693b4f3 100644
--- a/packages/climate-ref-core/tests/unit/regression/test_compare.py
+++ b/packages/climate-ref-core/tests/unit/regression/test_compare.py
@@ -26,7 +26,7 @@ class TestTolerance:
     def test_defaults(self) -> None:
         t = Tolerance()
         assert t.rtol == 1e-6
-        assert t.atol == 0.0
+        assert t.atol == 1e-8
 
     def test_custom_values(self) -> None:
         t = Tolerance(rtol=0.01, atol=1e-9)
@@ -97,6 +97,21 @@ def test_type_mismatch_fail(self) -> None:
     def test_int_float_numeric_comparison(self) -> None:
         """int vs float should be treated as numeric, not type-mismatch."""
         assert compare_json_content(1, 1.0, tol=TOL) == []
+
+    def test_bool_vs_int_is_type_mismatch(self) -> None:
+        result = compare_json_content(True, 1, tol=TOL)
+        assert len(result) == 1
+        assert "type mismatch" in result[0]
+
+    def test_int_vs_bool_is_type_mismatch(self) -> None:
+        result = compare_json_content(0, False, tol=TOL)
+        assert len(result) == 1
+        assert "type mismatch" in result[0]
+
+    def test_bool_vs_float_is_type_mismatch(self) -> None:
+        result = compare_json_content(True, 1.0, tol=TOL)
+        assert len(result) == 1
+        assert "type mismatch" in result[0]
         result = compare_json_content(1, 2.0, tol=TOL)
         assert len(result) == 1
         assert "float mismatch" in result[0]
diff --git a/packages/climate-ref-core/tests/unit/regression/test_gate.py b/packages/climate-ref-core/tests/unit/regression/test_gate.py
new file mode 100644
index 000000000..b47f825a3
--- /dev/null
+++ b/packages/climate-ref-core/tests/unit/regression/test_gate.py
@@ -0,0 +1,175 @@
+"""
+Tests for :mod:`climate_ref_core.regression.gate`.
+
+The full decision matrix of :func:`decide_coupling` is exercised offline; no I/O
+or git interaction is involved.
+"""
+
+from __future__ import annotations
+
+import pytest
+
+from climate_ref_core.regression.gate import (
+    Action,
+    decide_coupling,
+    paths_under,
+)
+from climate_ref_core.regression.manifest import SCHEMA_VERSION, Manifest, NativeEntry
+
+
+def make_manifest(
+    version: int,
+    committed: dict[str, str] | None = None,
+    native: dict[str, NativeEntry] | None = None,
+) -> Manifest:
+    """Build a manifest with a given version and committed/native digests."""
+    return Manifest(
+        schema=SCHEMA_VERSION,
+        test_case_version=version,
+        committed=committed if committed is not None else {"output.json": "a" * 64},
+        native=native if native is not None else {},
+    )
+
+
+class TestDecideCoupling:
+    def test_no_manifest_no_base_skips(self) -> None:
+        # Never managed: no manifest on this branch and none on the base.
+        decision = decide_coupling(None, None)
+        assert decision.action is Action.SKIP
+        assert "not under regression-baseline management" in decision.reason
+
+    def test_manifest_deletion_fails(self) -> None:
+        # The manifest existed on the base branch and was removed: fail closed.
+        decision = decide_coupling(None, make_manifest(1))
+        assert decision.action is Action.FAIL
+        assert "removed without review" in decision.reason
+
+    def test_committed_integrity_failure_fails(self) -> None:
+        # The committed bundle on disk drifted from the manifest digests.
+        base = make_manifest(1, {"output.json": "a" * 64})
+        current = make_manifest(1, {"output.json": "a" * 64})
+        decision = decide_coupling(current, base, committed_integrity_ok=False)
+        assert decision.action is Action.FAIL
+        assert "does not match manifest.json digests" in decision.reason
+
+    def test_integrity_failure_outranks_seeding(self) -> None:
+        # Even a newly added manifest must describe its on-disk bundle.
+        decision = decide_coupling(make_manifest(1), None, committed_integrity_ok=False)
+        assert decision.action is Action.FAIL
+
+    def test_native_change_replays(self) -> None:
+        # Native re-minted (different blob) with committed + version unchanged.
+        base = make_manifest(1, {"output.json": "a" * 64}, {"data.nc": NativeEntry("a" * 64, 10)})
+        current = make_manifest(1, {"output.json": "a" * 64}, {"data.nc": NativeEntry("b" * 64, 12)})
+        decision = decide_coupling(current, base)
+        assert decision.action is Action.REPLAY
+        assert "native baseline changed" in decision.reason
+
+    def test_newly_added_manifest_seeds_with_replay(self) -> None:
+        # Seeding with a native baseline present: replay verifies it reproduces the bundle.
+        manifest = make_manifest(1, native={"data.nc": NativeEntry("a" * 64, 10)})
+        decision = decide_coupling(manifest, None)
+        assert decision.action is Action.REPLAY
+        assert "seeding" in decision.reason
+
+    def test_newly_added_manifest_without_native_skips(self) -> None:
+        # Seeding with no native baseline: nothing to replay; the committed bundle is the signal.
+        decision = decide_coupling(make_manifest(1), None)
+        assert decision.action is Action.SKIP
+        assert "seeding" in decision.reason
+
+    def test_catalog_integrity_failure_fails(self) -> None:
+        # The input catalog drifted from the manifest's recorded hash.
+        base = make_manifest(1, {"output.json": "a" * 64})
+        current = make_manifest(1, {"output.json": "a" * 64})
+        decision = decide_coupling(current, base, catalog_integrity_ok=False)
+        assert decision.action is Action.FAIL
+        assert "catalog.yaml does not match" in decision.reason
+
+    def test_native_demint_skips_with_warning(self) -> None:
+        # Native baseline removed (de-mint) with committed bundle unchanged: warn, do not fail.
+        base = make_manifest(1, {"output.json": "a" * 64}, {"data.nc": NativeEntry("a" * 64, 10)})
+        current = make_manifest(1, {"output.json": "a" * 64}, {})
+        decision = decide_coupling(current, base)
+        assert decision.action is Action.SKIP
+        assert "WARNING" in decision.reason
+        assert "de-mint" in decision.reason
+
+    def test_version_bump_executes(self) -> None:
+        base = make_manifest(1)
+        current = make_manifest(2)
+        decision = decide_coupling(current, base)
+        assert decision.action is Action.EXECUTE
+        assert "1 -> 2" in decision.reason
+
+    def test_version_bump_executes_even_with_committed_change(self) -> None:
+        # A bump authorises a new baseline, so a committed change is expected.
+        base = make_manifest(1, {"output.json": "a" * 64})
+        current = make_manifest(2, {"output.json": "b" * 64})
+        decision = decide_coupling(current, base)
+        assert decision.action is Action.EXECUTE
+
+    def test_committed_change_without_bump_fails(self) -> None:
+        base = make_manifest(1, {"output.json": "a" * 64})
+        current = make_manifest(1, {"output.json": "b" * 64})
+        decision = decide_coupling(current, base)
+        assert decision.action is Action.FAIL
+        assert "without a test_case_version bump" in decision.reason
+
+    def test_version_decrease_fails(self) -> None:
+        base = make_manifest(3)
+        current = make_manifest(2)
+        decision = decide_coupling(current, base)
+        assert decision.action is Action.FAIL
+        assert "monotonic" in decision.reason
+
+    def test_extraction_change_replays_when_bundle_unchanged(self) -> None:
+        native = {"data.nc": NativeEntry("a" * 64, 10)}
+        base = make_manifest(1, {"output.json": "a" * 64}, native)
+        current = make_manifest(1, {"output.json": "a" * 64}, native)
+        decision = decide_coupling(current, base, extraction_changed=True)
+        assert decision.action is Action.REPLAY
+        assert "extraction code changed" in decision.reason
+
+    def test_extraction_change_without_native_skips(self) -> None:
+        # Extraction code changed but no native baseline exists to replay.
+        base = make_manifest(1, {"output.json": "a" * 64})
+        current = make_manifest(1, {"output.json": "a" * 64})
+        decision = decide_coupling(current, base, extraction_changed=True)
+        assert decision.action is Action.SKIP
+        assert "no native baseline exists to replay" in decision.reason
+
+    def test_no_change_skips(self) -> None:
+        base = make_manifest(1, {"output.json": "a" * 64})
+        current = make_manifest(1, {"output.json": "a" * 64})
+        decision = decide_coupling(current, base, extraction_changed=False)
+        assert decision.action is Action.SKIP
+        assert "nothing to verify" in decision.reason
+
+
+class TestPathsUnder:
+    def test_file_directly_under_root_matches(self) -> None:
+        assert paths_under(["pkg/src/mod.py"], ["pkg/src"])
+
+    def test_file_equal_to_root_matches(self) -> None:
+        assert paths_under(["pkg/src"], ["pkg/src"])
+
+    def test_root_trailing_slash_normalised(self) -> None:
+        assert paths_under(["pkg/src/mod.py"], ["pkg/src/"])
+
+    def test_no_match_returns_false(self) -> None:
+        assert not paths_under(["other/mod.py"], ["pkg/src"])
+
+    def test_prefix_collision_does_not_match(self) -> None:
+        # "pkg/src2" must not match root "pkg/src".
+        assert not paths_under(["pkg/src2/mod.py"], ["pkg/src"])
+
+    def test_empty_root_never_matches(self) -> None:
+        assert not paths_under(["anything.py"], [""])
+
+    def test_multiple_roots(self) -> None:
+        assert paths_under(["b/x.py"], ["a", "b"])
+
+    @pytest.mark.parametrize("changed", [[], ["unrelated.txt"]])
+    def test_empty_or_unrelated(self, changed: list[str]) -> None:
+        assert not paths_under(changed, ["pkg/src"])
diff --git a/packages/climate-ref-core/tests/unit/regression/test_store.py b/packages/climate-ref-core/tests/unit/regression/test_store.py
index 5cc232a0b..fa4625fba 100644
--- a/packages/climate-ref-core/tests/unit/regression/test_store.py
+++ b/packages/climate-ref-core/tests/unit/regression/test_store.py
@@ -302,3 +302,10 @@ def test_writable_true_remote_url_raises_not_implemented(self, tmp_path: Path) -
         cfg = _StubConfig(url="https://baselines.example.com", cache_dir=tmp_path / "cache")
         with pytest.raises(NotImplementedError, match="deferred to a follow-up PR"):
             build_native_store(cfg, writable=True)
+
+    def test_unsupported_scheme_raises_value_error(self, tmp_path: Path) -> None:
+        # An unrecognised remote scheme must fail loudly, not be coerced to a local path.
+        for url in ["s3://bucket/blobs", "gs://bucket/blobs", "ftp://host/blobs"]:
+            cfg = _StubConfig(url=url, cache_dir=tmp_path / "cache")
+            with pytest.raises(ValueError, match="not recognised"):
+                build_native_store(cfg, writable=False)
diff --git a/packages/climate-ref-core/tests/unit/test_paths.py b/packages/climate-ref-core/tests/unit/test_paths.py
index 38904e7a2..bac67f9be 100644
--- a/packages/climate-ref-core/tests/unit/test_paths.py
+++ b/packages/climate-ref-core/tests/unit/test_paths.py
@@ -33,12 +33,12 @@ class TestSingleSegment:
     def test_accepts_single_segments(self, relpath):
         assert safe_path(relpath, single_segment=True) == Path(relpath)
 
-    @pytest.mark.parametrize("relpath", ["sub/dir", "a/b/c.json", "foo\\bar", "."])
-    def test_rejects_separators_and_dot(self, relpath):
+    @pytest.mark.parametrize("relpath", ["sub/dir", "a/b/c.json", "foo\\bar"])
+    def test_rejects_separators(self, relpath):
         with pytest.raises(ValueError, match="single path segment"):
             safe_path(relpath, single_segment=True)
 
-    @pytest.mark.parametrize("relpath", ["", "/abs", "../escape"])
+    @pytest.mark.parametrize("relpath", ["", "/abs", "../escape", "."])
     def test_lexical_layer_still_applies(self, relpath):
         with pytest.raises(ValueError, match="contained relative path"):
             safe_path(relpath, single_segment=True)
diff --git a/packages/climate-ref-core/tests/unit/test_testing.py b/packages/climate-ref-core/tests/unit/test_testing.py
index 1d04873fd..63c60d9e8 100644
--- a/packages/climate-ref-core/tests/unit/test_testing.py
+++ b/packages/climate-ref-core/tests/unit/test_testing.py
@@ -9,6 +9,7 @@
 from climate_ref_core.datasets import DatasetCollection, ExecutionDatasetCollection, SourceDatasetType
 from climate_ref_core.diagnostics import ExecutionResult
 from climate_ref_core.metric_values.typing import SeriesMetricValue
+from climate_ref_core.regression.manifest import Manifest
 from climate_ref_core.testing import (
     RegressionValidator,
     TestCase,
@@ -156,13 +157,6 @@ def test_different_test_cases(self, tmp_path):
         assert default.root != custom.root
         assert default.regression != custom.regression
 
-    def test_regression_catalog_hash_property(self, tmp_path):
-        """Test regression_catalog_hash returns correct path."""
-        paths = TestCasePaths.from_test_data_dir(tmp_path, "my-diag", "default")
-
-        expected = tmp_path / "my-diag" / "default" / "regression" / ".catalog_hash"
-        assert paths.regression_catalog_hash == expected
-
 
 class TestYamlSerialization:
     """Tests for YAML serialization functions."""
@@ -544,6 +538,12 @@ def test_returns_none_for_empty_file(self, tmp_path):
 class TestCatalogChangedSinceRegression:
     """Tests for catalog_changed_since_regression function."""
 
+    @staticmethod
+    def _seed_manifest(paths, catalog_hash):
+        """Write a minimal manifest recording ``catalog_hash`` for the test case."""
+        paths.regression.mkdir(parents=True, exist_ok=True)
+        Manifest.seed_v1({}, catalog_hash=catalog_hash).dump(paths.manifest)
+
     def test_returns_true_when_no_regression_exists(self, tmp_path):
         """Test returns True when regression directory doesn't exist."""
         paths = TestCasePaths.from_test_data_dir(tmp_path, "my-diag", "default")
@@ -553,13 +553,23 @@ def test_returns_true_when_no_regression_exists(self, tmp_path):
         result = catalog_changed_since_regression(paths)
         assert result is True
 
-    def test_returns_true_when_no_catalog_hash_file(self, tmp_path):
-        """Test returns True when catalog hash file doesn't exist."""
+    def test_returns_true_when_no_manifest(self, tmp_path):
+        """Test returns True when manifest.json doesn't exist."""
         paths = TestCasePaths.from_test_data_dir(tmp_path, "my-diag", "default")
         paths.create()
         paths.catalog.write_text("_metadata:\n  hash: abc123\ncmip6:\n  datasets: []\n")
         paths.regression.mkdir(parents=True)
-        # No .catalog_hash file
+        # No manifest.json
+
+        result = catalog_changed_since_regression(paths)
+        assert result is True
+
+    def test_returns_true_when_manifest_has_no_catalog_hash(self, tmp_path):
+        """Test returns True for a legacy manifest with no recorded catalog hash."""
+        paths = TestCasePaths.from_test_data_dir(tmp_path, "my-diag", "default")
+        paths.create()
+        paths.catalog.write_text("_metadata:\n  hash: abc123\ncmip6:\n  datasets: []\n")
+        self._seed_manifest(paths, catalog_hash=None)
 
         result = catalog_changed_since_regression(paths)
         assert result is True
@@ -568,42 +578,28 @@ def test_returns_true_when_no_catalog_file(self, tmp_path):
         """Test returns True when catalog file doesn't exist."""
         paths = TestCasePaths.from_test_data_dir(tmp_path, "my-diag", "default")
         paths.create()
-        paths.regression.mkdir(parents=True)
-        paths.regression_catalog_hash.write_text("abc123")
+        self._seed_manifest(paths, catalog_hash="abc123")
         # No catalog file
 
         result = catalog_changed_since_regression(paths)
         assert result is True
 
     def test_returns_true_when_hash_differs(self, tmp_path):
-        """Test returns True when catalog hash differs from stored hash."""
+        """Test returns True when catalog hash differs from the manifest hash."""
         paths = TestCasePaths.from_test_data_dir(tmp_path, "my-diag", "default")
         paths.create()
         paths.catalog.write_text("_metadata:\n  hash: new_hash_456\ncmip6:\n  datasets: []\n")
-        paths.regression.mkdir(parents=True)
-        paths.regression_catalog_hash.write_text("old_hash_123")
+        self._seed_manifest(paths, catalog_hash="old_hash_123")
 
         result = catalog_changed_since_regression(paths)
         assert result is True
 
     def test_returns_false_when_hash_matches(self, tmp_path):
-        """Test returns False when catalog hash matches stored hash."""
+        """Test returns False when catalog hash matches the manifest hash."""
         paths = TestCasePaths.from_test_data_dir(tmp_path, "my-diag", "default")
         paths.create()
         paths.catalog.write_text("_metadata:\n  hash: same_hash_789\ncmip6:\n  datasets: []\n")
-        paths.regression.mkdir(parents=True)
-        paths.regression_catalog_hash.write_text("same_hash_789")
-
-        result = catalog_changed_since_regression(paths)
-        assert result is False
-
-    def test_handles_whitespace_in_stored_hash(self, tmp_path):
-        """Test handles whitespace in stored hash file."""
-        paths = TestCasePaths.from_test_data_dir(tmp_path, "my-diag", "default")
-        paths.create()
-        paths.catalog.write_text("_metadata:\n  hash: hash_value\ncmip6:\n  datasets: []\n")
-        paths.regression.mkdir(parents=True)
-        paths.regression_catalog_hash.write_text("  hash_value  \n")
+        self._seed_manifest(paths, catalog_hash="same_hash_789")
 
         result = catalog_changed_since_regression(paths)
         assert result is False
diff --git a/packages/climate-ref/src/climate_ref/cli/__init__.py b/packages/climate-ref/src/climate_ref/cli/__init__.py
index abbc34b55..e64fe2f37 100644
--- a/packages/climate-ref/src/climate_ref/cli/__init__.py
+++ b/packages/climate-ref/src/climate_ref/cli/__init__.py
@@ -42,6 +42,7 @@
     ("test-cases", "sync"),
     ("test-cases", "replay"),
     ("test-cases", "mint"),
+    ("test-cases", "ci-gate"),
 }
 
 
diff --git a/packages/climate-ref/src/climate_ref/cli/test_cases.py b/packages/climate-ref/src/climate_ref/cli/test_cases.py
index f7c91e2a5..61a43680f 100644
--- a/packages/climate-ref/src/climate_ref/cli/test_cases.py
+++ b/packages/climate-ref/src/climate_ref/cli/test_cases.py
@@ -35,7 +35,8 @@
     from climate_ref.provider_registry import ProviderRegistry
     from climate_ref_core.datasets import ExecutionDatasetCollection, SourceDatasetType
     from climate_ref_core.diagnostics import Diagnostic
-    from climate_ref_core.testing import TestCase
+    from climate_ref_core.regression.manifest import Manifest, NativeEntry
+    from climate_ref_core.testing import TestCase, TestCasePaths
 
 
 app = typer.Typer(help=__doc__)
@@ -466,6 +467,37 @@ def _print_regression_summary(  # pragma: no cover
     console.print(summary)
 
 
+def _write_test_case_manifest(
+    paths: TestCasePaths,
+    *,
+    test_case_version: int,
+    committed: dict[str, str],
+    native: dict[str, NativeEntry],
+    schema: int | None = None,
+) -> Manifest:
+    """
+    Construct and write a test case ``manifest.json``, recording the input catalog hash.
+
+    Shared by ``run`` (which preserves the existing version and native block) and
+    ``mint`` (which authors the native block and may bump the version); the two
+    callers differ only in the ``test_case_version`` and ``native`` they supply.
+    The ``catalog_hash`` is always (re)derived from the current ``catalog.yaml`` so
+    the manifest stays coupled to the inputs that produced the committed bundle.
+    """
+    from climate_ref_core.regression.manifest import SCHEMA_VERSION, Manifest
+    from climate_ref_core.testing import get_catalog_hash
+
+    manifest = Manifest(
+        schema=SCHEMA_VERSION if schema is None else schema,
+        test_case_version=test_case_version,
+        committed=dict(committed),
+        native=native,
+        catalog_hash=get_catalog_hash(paths.catalog),
+    )
+    manifest.dump(paths.manifest)
+    return manifest
+
+
 def _run_single_test_case(  # noqa: PLR0911, PLR0912, PLR0915
     config: Config,
     console: Console,
@@ -489,7 +521,6 @@ def _run_single_test_case(  # noqa: PLR0911, PLR0912, PLR0915
     from climate_ref_core.regression.manifest import Manifest
     from climate_ref_core.testing import (
         TestCasePaths,
-        get_catalog_hash,
         load_datasets_from_yaml,
     )
 
@@ -588,26 +619,26 @@ def _run_single_test_case(  # noqa: PLR0911, PLR0912, PLR0915
                 test_data_dir=paths.test_data_dir,
             )
 
-        # Refresh the manifest's committed block only; the native block is mint-owned.
+        # Refresh the manifest's committed block (and catalog hash) only; the native
+        # block is mint-owned, so preserve the previous one when the manifest exists.
         if paths.manifest.exists():
             previous = Manifest.load(paths.manifest)
-            manifest = Manifest(
-                schema=previous.schema,
+            _write_test_case_manifest(
+                paths,
                 test_case_version=previous.test_case_version,
                 committed=committed_digests,
                 native=previous.native,
+                schema=previous.schema,
             )
         else:
-            manifest = Manifest.seed_v1(committed_digests)
-        manifest.dump(paths.manifest)
+            _write_test_case_manifest(
+                paths,
+                test_case_version=1,
+                committed=committed_digests,
+                native={},
+            )
 
         logger.info(f"Captured {len(native)} native output file(s) (not minted)")
-
-        # Store catalog hash for change detection
-        catalog_hash = get_catalog_hash(paths.catalog)
-        if catalog_hash:
-            paths.regression_catalog_hash.write_text(catalog_hash)
-
         logger.info(f"Updated regression data: {paths.regression}")
         _print_regression_summary(console, paths.regression, size_threshold)
     elif paths.regression.exists():
@@ -953,7 +984,11 @@ def replay_test_case(  # noqa: PLR0912, PLR0915
 
     from climate_ref.provider_registry import ProviderRegistry
     from climate_ref_core.diagnostics import ExecutionDefinition
-    from climate_ref_core.output_files import from_placeholders
+    from climate_ref_core.output_files import (
+        PLACEHOLDER_OUTPUT_DIR,
+        PLACEHOLDER_TEST_DATA_DIR,
+        from_placeholders,
+    )
     from climate_ref_core.regression import (
         COMMITTED_BUNDLE_FILES,
         Manifest,
@@ -1054,8 +1089,8 @@ def replay_test_case(  # noqa: PLR0912, PLR0915
                 continue
 
             replacements = {
-                str(output_dir): "<OUTPUT_DIR>",
-                str(paths.test_data_dir): "<TEST_DATA_DIR>",
+                str(output_dir): PLACEHOLDER_OUTPUT_DIR,
+                str(paths.test_data_dir): PLACEHOLDER_TEST_DATA_DIR,
             }
             # The comparator silently skips a bundle file with no committed copy,
             compared = [f for f in COMMITTED_BUNDLE_FILES if (paths.regression / f).exists()]
@@ -1138,7 +1173,7 @@ def mint_native(  # noqa: PLR0912, PLR0915
     from climate_ref_core.regression.capture import capture_execution
     from climate_ref_core.regression.manifest import Manifest
     from climate_ref_core.regression.store import build_native_store
-    from climate_ref_core.testing import TestCasePaths, get_catalog_hash, load_datasets_from_yaml
+    from climate_ref_core.testing import TestCasePaths, load_datasets_from_yaml
 
     config: Config = ctx.obj.config
     db = ctx.obj.database
@@ -1229,17 +1264,12 @@ def mint_native(  # noqa: PLR0912, PLR0915
             version = previous.test_case_version + 1 if bump_version else previous.test_case_version
         else:
             version = 1
-        manifest = Manifest(
-            schema=1,
+        _write_test_case_manifest(
+            paths,
             test_case_version=version,
             committed=committed_digests,
             native=native,
         )
-        manifest.dump(paths.manifest)
-
-        catalog_hash = get_catalog_hash(paths.catalog)
-        if catalog_hash:
-            paths.regression_catalog_hash.write_text(catalog_hash)
 
         minted += 1
         logger.info(
@@ -1256,3 +1286,270 @@ def mint_native(  # noqa: PLR0912, PLR0915
             console.print(f"  - {case}")
         raise typer.Exit(code=1)
     console.print(f"[green]Minted {minted} native baseline(s)[/green]")
+
+
+def _provider_source_root(diag: Diagnostic, repo_root: Path) -> str | None:
+    """
+    Return the diagnostic's provider package source directory, relative to the repo root.
+
+    Used to decide whether a changed file touches the diagnostic's extraction code.
+    The returned path is POSIX-style so it can be compared against
+    ``git diff --name-only`` output.
+
+    Parameters
+    ----------
+    diag
+        The diagnostic whose provider source directory is wanted.
+    repo_root
+        The repository working-tree root.
+
+    Returns
+    -------
+    :
+        The provider package source directory relative to ``repo_root``, or ``None``
+        if it cannot be located or lies outside the repository.
+    """
+    import importlib.util
+
+    top_package = type(diag).__module__.split(".")[0]
+    try:
+        spec = importlib.util.find_spec(top_package)
+    except (ImportError, ValueError):
+        return None
+    if spec is None or not spec.submodule_search_locations:
+        return None
+    package_dir = Path(next(iter(spec.submodule_search_locations))).resolve()
+    try:
+        return package_dir.relative_to(repo_root.resolve()).as_posix()
+    except ValueError:
+        return None
+
+
+def _core_extraction_roots(repo_root: Path) -> list[str]:
+    """
+    Return the core paths whose change affects replay/execute for every test case.
+
+    ``build_execution_result`` (the function replay/execute re-run) depends on more
+    than the regression package: it builds and reads CMEC bundles via
+    :mod:`climate_ref_core.pycmec`, persists curated outputs via
+    :mod:`climate_ref_core.output_files`, and is dispatched through
+    :mod:`climate_ref_core.diagnostics`. A change under any of these can alter the
+    regenerated bundle, so all of them count as an extraction change.
+
+    Detection is deliberately coarse and errs toward REPLAY (cheap, credential-free):
+    a false positive only triggers an unnecessary replay, never a missed one.
+    Roots are derived from the installed package location (rather than hardcoded
+    paths) so they survive a package move; any root outside the repository
+    (e.g. an installed wheel) is dropped.
+
+    Parameters
+    ----------
+    repo_root
+        The repository working-tree root.
+
+    Returns
+    -------
+    :
+        Repo-relative POSIX paths for the core extraction surfaces that lie inside the repo.
+    """
+    import climate_ref_core
+
+    core_dir = Path(climate_ref_core.__file__).resolve().parent
+    repo_root_resolved = repo_root.resolve()
+    candidates = [
+        core_dir / "regression",
+        core_dir / "pycmec",
+        core_dir / "output_files.py",
+        core_dir / "diagnostics.py",
+    ]
+    roots: list[str] = []
+    for candidate in candidates:
+        try:
+            roots.append(candidate.relative_to(repo_root_resolved).as_posix())
+        except ValueError:
+            continue
+    return roots
+
+
+@app.command(name="ci-gate")
+def ci_gate(  # noqa: PLR0912, PLR0915
+    ctx: typer.Context,
+    base: Annotated[
+        str,
+        typer.Option(help="Git ref to compare against (the PR base branch)"),
+    ] = "origin/main",
+    provider: Annotated[
+        str | None,
+        typer.Option(help="Limit the gate to a single provider slug"),
+    ] = None,
+    diagnostic: Annotated[
+        str | None,
+        typer.Option(help="Limit the gate to a single diagnostic slug"),
+    ] = None,
+    test_case: Annotated[
+        str | None,
+        typer.Option(help="Limit the gate to a single test case name"),
+    ] = None,
+    output_json: Annotated[
+        bool,
+        typer.Option("--json", help="Emit the per-case decisions as JSON on stdout"),
+    ] = False,
+) -> None:
+    """
+    Decide how CI should verify each test case's regression baseline.
+
+    Compares each committed ``manifest.json`` to its counterpart on the base branch
+    and reports the action CI should take per case: ``replay`` (cheap, against the
+    cached native baseline), ``execute`` (full re-run, when ``test_case_version`` was
+    bumped), ``skip`` (nothing relevant changed), or ``fail`` (an unauthorised
+    baseline change). Exits non-zero if any case is gated ``fail``.
+
+    The ``--json`` output is intended for CI to dispatch ``replay``/``run`` jobs.
+
+    Examples
+    --------
+        ref test-cases ci-gate                       # Gate all cases against origin/main
+        ref test-cases ci-gate --base origin/develop
+        ref test-cases ci-gate --provider example --json
+    """
+    import json as _json
+
+    from git import GitCommandError
+
+    from climate_ref.provider_registry import ProviderRegistry
+    from climate_ref_core.regression.gate import Action, decide_coupling, paths_under
+    from climate_ref_core.regression.manifest import Manifest, compute_committed_digests
+    from climate_ref_core.testing import TestCasePaths, get_catalog_hash
+
+    config: Config = ctx.obj.config
+    db = ctx.obj.database
+    console: Console = ctx.obj.console
+
+    repo = get_repo_for_path(Path.cwd())
+    if repo is None or repo.working_dir is None:
+        logger.error("ci-gate must be run inside a git repository")
+        raise typer.Exit(code=1)
+    repo_root = Path(repo.working_dir)
+
+    # Resolve the set of files changed on this branch relative to the base ref.
+    # `base...HEAD` diffs against the merge-base, so unrelated base-branch churn
+    # is excluded.
+    try:
+        diff_output = repo.git.diff("--name-only", f"{base}...HEAD")
+    except GitCommandError as exc:
+        logger.error(f"Could not diff against base ref {base!r}: {exc}")
+        raise typer.Exit(code=1) from exc
+    changed_files = [line.strip() for line in diff_output.splitlines() if line.strip()]
+
+    # The core machinery behind build_execution_result affects every replay/execute,
+    # so a change there counts as an extraction change for all cases. Extraction-change
+    # detection is deliberately coarse: any change under a diagnostic's provider package
+    # (see `_provider_source_root`) or under the core extraction surfaces counts for
+    # every case in that provider. This errs toward REPLAY (cheap, credential-free),
+    # never away from it.
+    core_changed = paths_under(changed_files, _core_extraction_roots(repo_root))
+
+    # Hoisted once: repo_root.resolve() is filesystem-touching, and a provider's source
+    # root is identical for every case in that provider, so memoise it per provider slug
+    # rather than recomputing find_spec on each case.
+    repo_root_resolved = repo_root.resolve()
+    source_root_cache: dict[str, str | None] = {}
+
+    registry = ProviderRegistry.build_from_config(config, db)
+
+    decisions: list[dict[str, str]] = []
+    has_failure = False
+
+    def record(case: str, action: Action, reason: str) -> None:
+        nonlocal has_failure
+        if action is Action.FAIL:
+            has_failure = True
+        decisions.append({"case": case, "action": action.value, "reason": reason})
+
+    for diag, tc in _iter_test_cases(registry, provider=provider, diagnostic=diagnostic, test_case=test_case):
+        case_id = f"{diag.provider.slug}/{diag.slug}/{tc.name}"
+        paths = TestCasePaths.from_diagnostic(diag, tc.name)
+
+        # A corrupt manifest authored in this change is a hard failure for that case,
+        # not a crash for the whole gate.
+        manifest: Manifest | None = None
+        if paths is not None and paths.manifest.exists():
+            try:
+                manifest = Manifest.load(paths.manifest)
+            except ValueError as exc:
+                logger.error(f"{case_id}: invalid manifest.json: {exc}")
+                record(case_id, Action.FAIL, f"invalid manifest.json: {exc}")
+                continue
+
+        base_manifest: Manifest | None = None
+        if paths is not None:
+            try:
+                rel_manifest = paths.manifest.resolve().relative_to(repo_root_resolved).as_posix()
+            except ValueError:
+                rel_manifest = None
+            if rel_manifest is not None:
+                try:
+                    base_text = repo.git.show(f"{base}:{rel_manifest}")
+                except GitCommandError:
+                    base_manifest = None
+                else:
+                    # A corrupt manifest on the base branch can't be compared against;
+                    # fall back to seeding (REPLAY) rather than aborting the gate.
+                    try:
+                        base_manifest = Manifest.loads(base_text, source=f"{base}:{rel_manifest}")
+                    except ValueError as exc:
+                        logger.warning(
+                            f"{case_id}: base manifest at {base}:{rel_manifest} is invalid "
+                            f"({exc}); treating as newly added"
+                        )
+                        base_manifest = None
+
+        provider_slug = diag.provider.slug
+        if provider_slug not in source_root_cache:
+            source_root_cache[provider_slug] = _provider_source_root(diag, repo_root)
+        source_root = source_root_cache[provider_slug]
+        extraction_roots = [r for r in (source_root,) if r]
+        extraction_changed = core_changed or paths_under(changed_files, extraction_roots)
+
+        # Verify the committed bundle on disk still matches the manifest digests.
+        # A drift (edited/added/removed committed file without regenerating the
+        # manifest) must fail closed rather than slip through as SKIP.
+        committed_integrity_ok = True
+        # Verify the input catalog still matches the manifest's recorded hash. A catalog
+        # edit without regenerating the baseline leaves it silently stale; fail closed.
+        # Legacy manifests without a catalog_hash have nothing to compare, so stay OK.
+        catalog_integrity_ok = True
+        if manifest is not None and paths is not None:
+            committed_integrity_ok = compute_committed_digests(paths.regression) == manifest.committed
+            if manifest.catalog_hash is not None:
+                catalog_integrity_ok = get_catalog_hash(paths.catalog) == manifest.catalog_hash
+
+        decision = decide_coupling(
+            manifest,
+            base_manifest,
+            extraction_changed=extraction_changed,
+            committed_integrity_ok=committed_integrity_ok,
+            catalog_integrity_ok=catalog_integrity_ok,
+        )
+        record(case_id, decision.action, decision.reason)
+
+    if output_json:
+        console.print_json(_json.dumps(decisions))
+    else:
+        table = Table(title=f"CI coupling gate (base: {base})")
+        table.add_column("Test case", style="cyan", no_wrap=True)
+        table.add_column("Action")
+        table.add_column("Reason")
+        style_for = {
+            Action.FAIL.value: "red",
+            Action.EXECUTE.value: "yellow",
+            Action.REPLAY.value: "green",
+            Action.SKIP.value: "dim",
+        }
+        for entry in decisions:
+            style = style_for[entry["action"]]
+            table.add_row(entry["case"], f"[{style}]{entry['action']}[/{style}]", entry["reason"])
+        console.print(table)
+
+    if has_failure:
+        raise typer.Exit(code=1)
diff --git a/packages/climate-ref/tests/unit/cli/test_test_cases.py b/packages/climate-ref/tests/unit/cli/test_test_cases.py
index f9434ad85..6055f5e15 100644
--- a/packages/climate-ref/tests/unit/cli/test_test_cases.py
+++ b/packages/climate-ref/tests/unit/cli/test_test_cases.py
@@ -173,8 +173,14 @@ def test_cmip6_data_returned(self, mock_diagnostic, mock_test_case, mock_fetcher
         with (
             patch("climate_ref_core.esgf.ESGFFetcher", return_value=mock_fetcher),
             patch("climate_ref.datasets.CMIP6DatasetAdapter", return_value=mock_adapter),
-            patch("climate_ref.cli.test_cases._solve_test_case", return_value=mock_datasets),
-            patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=None),
+            patch(
+                "climate_ref.cli.test_cases._solve_test_case",
+                return_value=mock_datasets,
+            ),
+            patch(
+                "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+                return_value=None,
+            ),
         ):
             result, written = _fetch_and_build_catalog(mock_diagnostic, mock_test_case)
 
@@ -192,8 +198,14 @@ def test_saves_catalog_yaml(self, tmp_path, mock_diagnostic, mock_test_case, moc
         with (
             patch("climate_ref_core.esgf.ESGFFetcher", return_value=mock_fetcher),
             patch("climate_ref.datasets.CMIP6DatasetAdapter", return_value=mock_adapter),
-            patch("climate_ref.cli.test_cases._solve_test_case", return_value=mock_datasets),
-            patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths),
+            patch(
+                "climate_ref.cli.test_cases._solve_test_case",
+                return_value=mock_datasets,
+            ),
+            patch(
+                "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+                return_value=mock_paths,
+            ),
             patch("climate_ref_core.testing.save_datasets_to_yaml", return_value=True) as mock_save,
         ):
             _, written = _fetch_and_build_catalog(mock_diagnostic, mock_test_case)
@@ -227,8 +239,14 @@ def test_obs4mips_data_returned(self, mock_diagnostic, mock_test_case):
         with (
             patch("climate_ref_core.esgf.ESGFFetcher", return_value=mock_fetcher),
             patch("climate_ref.datasets.Obs4MIPsDatasetAdapter", return_value=mock_adapter),
-            patch("climate_ref.cli.test_cases._solve_test_case", return_value=mock_datasets),
-            patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=None),
+            patch(
+                "climate_ref.cli.test_cases._solve_test_case",
+                return_value=mock_datasets,
+            ),
+            patch(
+                "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+                return_value=None,
+            ),
         ):
             result, written = _fetch_and_build_catalog(mock_diagnostic, mock_test_case)
 
@@ -539,12 +557,23 @@ def test_run_with_fetch_flag(self, invoke_cli, mocker):
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref.cli.test_cases._fetch_and_build_catalog", return_value=(MagicMock(), True))
+        mocker.patch(
+            "climate_ref.cli.test_cases._fetch_and_build_catalog",
+            return_value=(MagicMock(), True),
+        )
         mocker.patch("climate_ref.testing.TestCaseRunner", return_value=mock_runner)
         mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=None)
 
         result = invoke_cli(
-            ["test-cases", "run", "--provider", "example", "--diagnostic", "test-diag", "--fetch"],
+            [
+                "test-cases",
+                "run",
+                "--provider",
+                "example",
+                "--diagnostic",
+                "test-diag",
+                "--fetch",
+            ],
         )
         assert result.exit_code == 0
 
@@ -571,7 +600,15 @@ def test_run_with_fetch_flag_raises_error(self, invoke_cli, mocker):
         )
 
         invoke_cli(
-            ["test-cases", "run", "--provider", "example", "--diagnostic", "test-diag", "--fetch"],
+            [
+                "test-cases",
+                "run",
+                "--provider",
+                "example",
+                "--diagnostic",
+                "test-diag",
+                "--fetch",
+            ],
             expected_exit_code=1,
         )
 
@@ -612,7 +649,10 @@ def test_run_handles_exceptions(self, invoke_cli, mocker, tmp_path, exception_cl
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths)
+        mocker.patch(
+            "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+            return_value=mock_paths,
+        )
         mocker.patch("climate_ref_core.testing.load_datasets_from_yaml", return_value=MagicMock())
         mocker.patch("climate_ref.testing.TestCaseRunner", return_value=mock_runner)
 
@@ -648,7 +688,9 @@ def test_run_successful_execution(self, invoke_cli, mocker, tmp_path):
         mock_paths.regression = regression_dir  # Use real dir that exists
 
         mock_result = MagicMock(
-            successful=True, metric_bundle_filename="metrics.json", output_bundle_filename="output.json"
+            successful=True,
+            metric_bundle_filename="metrics.json",
+            output_bundle_filename="output.json",
         )
         mock_result.to_output_path.side_effect = lambda x: tmp_path / x
         mock_runner = MagicMock()
@@ -658,7 +700,10 @@ def test_run_successful_execution(self, invoke_cli, mocker, tmp_path):
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths)
+        mocker.patch(
+            "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+            return_value=mock_paths,
+        )
         mocker.patch("climate_ref_core.testing.load_datasets_from_yaml", return_value=MagicMock())
         mocker.patch("climate_ref.testing.TestCaseRunner", return_value=mock_runner)
 
@@ -695,7 +740,10 @@ def test_run_unsuccessful_execution(self, invoke_cli, mocker, tmp_path):
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths)
+        mocker.patch(
+            "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+            return_value=mock_paths,
+        )
         mocker.patch("climate_ref_core.testing.load_datasets_from_yaml", return_value=MagicMock())
         mocker.patch("climate_ref.testing.TestCaseRunner", return_value=mock_runner)
 
@@ -760,13 +808,24 @@ def test_run_with_force_regen(self, invoke_cli, mocker, tmp_path):
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=real_paths)
+        mocker.patch(
+            "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+            return_value=real_paths,
+        )
         mocker.patch("climate_ref_core.testing.load_datasets_from_yaml", return_value=MagicMock())
         mocker.patch("climate_ref.testing.TestCaseRunner", return_value=mock_runner)
         mocker.patch("climate_ref_core.testing.get_catalog_hash", return_value="abc123")
 
         result = invoke_cli(
-            ["test-cases", "run", "--provider", "example", "--diagnostic", "test-diag", "--force-regen"],
+            [
+                "test-cases",
+                "run",
+                "--provider",
+                "example",
+                "--diagnostic",
+                "test-diag",
+                "--force-regen",
+            ],
         )
         assert result.exit_code == 0
 
@@ -782,7 +841,11 @@ def test_run_with_force_regen(self, invoke_cli, mocker, tmp_path):
         manifest = Manifest.load(real_paths.manifest)
         assert manifest.test_case_version == 1
         assert manifest.native == {}
-        assert set(manifest.committed) == {"diagnostic.json", "output.json", "series.json"}
+        assert set(manifest.committed) == {
+            "diagnostic.json",
+            "output.json",
+            "series.json",
+        }
 
     def test_run_with_existing_baseline(self, invoke_cli, mocker, tmp_path):
         """Test run command logs when baseline exists."""
@@ -814,7 +877,9 @@ def test_run_with_existing_baseline(self, invoke_cli, mocker, tmp_path):
         mock_paths.regression = regression_dir
 
         mock_result = MagicMock(
-            successful=True, metric_bundle_filename="metrics.json", output_bundle_filename=None
+            successful=True,
+            metric_bundle_filename="metrics.json",
+            output_bundle_filename=None,
         )
         mock_result.to_output_path.return_value = tmp_path / "metrics.json"
         mock_runner = MagicMock()
@@ -824,7 +889,10 @@ def test_run_with_existing_baseline(self, invoke_cli, mocker, tmp_path):
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths)
+        mocker.patch(
+            "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+            return_value=mock_paths,
+        )
         mocker.patch("climate_ref_core.testing.load_datasets_from_yaml", return_value=MagicMock())
         mocker.patch("climate_ref.testing.TestCaseRunner", return_value=mock_runner)
 
@@ -859,7 +927,10 @@ def test_run_with_only_missing_skips_existing(self, invoke_cli, mocker, tmp_path
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths)
+        mocker.patch(
+            "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+            return_value=mock_paths,
+        )
 
         # With --only-missing, test case should be skipped and exit 0
         result = invoke_cli(
@@ -887,9 +958,15 @@ def test_run_with_if_changed_skips_unchanged(self, invoke_cli, mocker, tmp_path)
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths)
+        mocker.patch(
+            "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+            return_value=mock_paths,
+        )
         # Catalog not changed since regression
-        mocker.patch("climate_ref_core.testing.catalog_changed_since_regression", return_value=False)
+        mocker.patch(
+            "climate_ref_core.testing.catalog_changed_since_regression",
+            return_value=False,
+        )
 
         result = invoke_cli(
             ["test-cases", "run", "--provider", "example", "--if-changed"],
@@ -926,8 +1003,14 @@ def test_cmip7_data_returned(self, mock_diagnostic, mock_test_case):
         with (
             patch("climate_ref_core.esgf.ESGFFetcher", return_value=mock_fetcher),
             patch("climate_ref.datasets.CMIP7DatasetAdapter", return_value=mock_adapter),
-            patch("climate_ref.cli.test_cases._solve_test_case", return_value=mock_datasets),
-            patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=None),
+            patch(
+                "climate_ref.cli.test_cases._solve_test_case",
+                return_value=mock_datasets,
+            ),
+            patch(
+                "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+                return_value=None,
+            ),
         ):
             result, written = _fetch_and_build_catalog(mock_diagnostic, mock_test_case)
 
@@ -958,9 +1041,18 @@ def test_pmp_climatology_data_returned(self, mock_diagnostic, mock_test_case):
 
         with (
             patch("climate_ref_core.esgf.ESGFFetcher", return_value=mock_fetcher),
-            patch("climate_ref.datasets.PMPClimatologyDatasetAdapter", return_value=mock_adapter),
-            patch("climate_ref.cli.test_cases._solve_test_case", return_value=mock_datasets),
-            patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=None),
+            patch(
+                "climate_ref.datasets.PMPClimatologyDatasetAdapter",
+                return_value=mock_adapter,
+            ),
+            patch(
+                "climate_ref.cli.test_cases._solve_test_case",
+                return_value=mock_datasets,
+            ),
+            patch(
+                "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+                return_value=None,
+            ),
         ):
             result, written = _fetch_and_build_catalog(mock_diagnostic, mock_test_case)
 
@@ -994,8 +1086,14 @@ def test_force_flag_passed_to_save(self, tmp_path, mock_diagnostic, mock_test_ca
         with (
             patch("climate_ref_core.esgf.ESGFFetcher", return_value=mock_fetcher),
             patch("climate_ref.datasets.CMIP6DatasetAdapter", return_value=mock_adapter),
-            patch("climate_ref.cli.test_cases._solve_test_case", return_value=mock_datasets),
-            patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths),
+            patch(
+                "climate_ref.cli.test_cases._solve_test_case",
+                return_value=mock_datasets,
+            ),
+            patch(
+                "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+                return_value=mock_paths,
+            ),
             patch("climate_ref_core.testing.save_datasets_to_yaml", return_value=True) as mock_save,
         ):
             _fetch_and_build_catalog(mock_diagnostic, mock_test_case, force=True)
@@ -1035,10 +1133,22 @@ def test_mixed_source_types(self, mock_diagnostic, mock_test_case):
 
         with (
             patch("climate_ref_core.esgf.ESGFFetcher", return_value=mock_fetcher),
-            patch("climate_ref.datasets.CMIP6DatasetAdapter", return_value=mock_cmip6_adapter),
-            patch("climate_ref.datasets.Obs4MIPsDatasetAdapter", return_value=mock_obs_adapter),
-            patch("climate_ref.cli.test_cases._solve_test_case", return_value=mock_datasets),
-            patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=None),
+            patch(
+                "climate_ref.datasets.CMIP6DatasetAdapter",
+                return_value=mock_cmip6_adapter,
+            ),
+            patch(
+                "climate_ref.datasets.Obs4MIPsDatasetAdapter",
+                return_value=mock_obs_adapter,
+            ),
+            patch(
+                "climate_ref.cli.test_cases._solve_test_case",
+                return_value=mock_datasets,
+            ),
+            patch(
+                "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+                return_value=None,
+            ),
         ):
             result, _written = _fetch_and_build_catalog(mock_diagnostic, mock_test_case)
 
@@ -1156,7 +1266,10 @@ def test_list_with_test_cases_and_paths(self, invoke_cli, mocker, tmp_path):
             "climate_ref.provider_registry.ProviderRegistry.build_from_config",
             return_value=mock_registry,
         )
-        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=mock_paths)
+        mocker.patch(
+            "climate_ref_core.testing.TestCasePaths.from_diagnostic",
+            return_value=mock_paths,
+        )
 
         result = invoke_cli(["test-cases", "list"])
         assert result.exit_code == 0
@@ -1315,7 +1428,14 @@ def test_replay_empty_native_is_hard_failure(self, invoke_cli, mocker, tmp_path)
         mocker.patch("climate_ref_core.regression.store.build_native_store", return_value=store)
 
         result = invoke_cli(
-            ["test-cases", "replay", "--provider", "example", "--diagnostic", "test-diag"],
+            [
+                "test-cases",
+                "replay",
+                "--provider",
+                "example",
+                "--diagnostic",
+                "test-diag",
+            ],
             expected_exit_code=1,
         )
         assert "not yet minted" in result.stderr.lower() or "mint" in result.stderr.lower()
@@ -1351,7 +1471,14 @@ def test_replay_integrity_mismatch_warns_and_continues(self, invoke_cli, mocker,
         mocker.patch("climate_ref_core.regression.store.build_native_store", return_value=store)
 
         result = invoke_cli(
-            ["test-cases", "replay", "--provider", "example", "--diagnostic", "test-diag"],
+            [
+                "test-cases",
+                "replay",
+                "--provider",
+                "example",
+                "--diagnostic",
+                "test-diag",
+            ],
             expected_exit_code=1,
         )
         # The mismatch warned rather than gated, and execution fell through to the native guard.
@@ -1402,7 +1529,16 @@ def test_replay_reconciles_integrity_mismatch_within_tolerance(self, invoke_cli,
         mocker.patch("climate_ref_core.diagnostics.ExecutionDefinition")
         mocker.patch("climate_ref_core.regression.assert_bundle_regression")
 
-        result = invoke_cli(["test-cases", "replay", "--provider", "example", "--diagnostic", "test-diag"])
+        result = invoke_cli(
+            [
+                "test-cases",
+                "replay",
+                "--provider",
+                "example",
+                "--diagnostic",
+                "test-diag",
+            ]
+        )
         assert "Replay reconciled committed bundle" in result.stderr
         assert "equivalent within tolerance" in result.stderr
         assert "3 bundle file(s)" in result.stderr
@@ -1490,3 +1626,276 @@ def test_mint_writes_blobs_and_manifest(self, invoke_cli, mocker, tmp_path):
         # Each native blob was PUT into the store.
         for entry in manifest.native.values():
             assert store.has(entry.sha256)
+
+
+class TestCIGateCommand:
+    """Tests for the ``ref test-cases ci-gate`` command."""
+
+    def _setup(
+        self,
+        mocker,
+        tmp_path,
+        *,
+        current_version=None,
+        committed_content=None,
+        native=None,
+    ):
+        """
+        Wire a single example test case for the gate.
+
+        Writes the *current* manifest (when ``current_version`` is set) alongside real
+        committed files in ``regression/``, so the gate's on-disk integrity check sees a
+        manifest that faithfully describes the bundle.
+        The base-branch manifest is left absent by default (``git show`` raises);
+        a test sets ``repo.git.show`` to supply one,
+        using the returned committed digests so they line up with the current bundle.
+
+        Parameters
+        ----------
+        current_version
+            ``test_case_version`` of the current manifest,
+            or ``None`` to leave the manifest absent (never managed / deleted on this branch).
+        committed_content
+            ``{filename: text}`` written to ``regression/`` for the current manifest.
+            Defaults to a single ``output.json``.
+        native
+            Native entries for the current manifest.
+
+        Returns
+        -------
+        :
+            ``(repo, paths, committed_digests)`` — the mock repo, the test case paths,
+            and the digests computed from the on-disk committed bundle.
+        """
+        from git import GitCommandError
+
+        from climate_ref_core.regression.manifest import (
+            SCHEMA_VERSION,
+            Manifest,
+            compute_committed_digests,
+        )
+        from climate_ref_core.testing import TestCasePaths
+
+        mock_tc = MagicMock()
+        mock_tc.name = "default"
+        mock_spec = MagicMock(test_cases=[mock_tc])
+        mock_diag = MagicMock(slug="test-diag", test_data_spec=mock_spec)
+        mock_diag.provider = MagicMock(slug="example")
+        mock_provider = MagicMock(slug="example")
+        mock_provider.diagnostics.return_value = [mock_diag]
+        mock_registry = MagicMock(providers=[mock_provider])
+
+        case_dir = tmp_path / "test-diag" / "default"
+        case_dir.mkdir(parents=True)
+        paths = TestCasePaths(root=case_dir)
+
+        committed_digests: dict[str, str] = {}
+        if current_version is not None:
+            content = committed_content if committed_content is not None else {"output.json": '{"x": 1}\n'}
+            paths.regression.mkdir(parents=True, exist_ok=True)
+            for name, text in content.items():
+                (paths.regression / name).write_text(text, encoding="utf-8")
+            committed_digests = compute_committed_digests(paths.regression)
+            Manifest(
+                schema=SCHEMA_VERSION,
+                test_case_version=current_version,
+                committed=committed_digests,
+                native=native or {},
+            ).dump(paths.manifest)
+
+        repo = MagicMock()
+        repo.working_dir = str(tmp_path)
+        repo.git.diff.return_value = ""
+        # Base manifest absent by default; tests that need one override repo.git.show.
+        repo.git.show.side_effect = GitCommandError("show", 128)
+
+        mocker.patch(
+            "climate_ref.provider_registry.ProviderRegistry.build_from_config",
+            return_value=mock_registry,
+        )
+        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=paths)
+        mocker.patch("climate_ref.cli.test_cases.get_repo_for_path", return_value=repo)
+        return repo, paths, committed_digests
+
+    @staticmethod
+    def _set_base(repo, version, committed, native=None):
+        """Configure ``repo.git.show`` to return a base manifest with these fields."""
+        import json as _json
+
+        from attrs import asdict
+
+        from climate_ref_core.regression.manifest import SCHEMA_VERSION
+
+        payload = {
+            "schema": SCHEMA_VERSION,
+            "test_case_version": version,
+            "committed": committed,
+            "native": {relpath: asdict(entry) for relpath, entry in (native or {}).items()},
+        }
+        repo.git.show.side_effect = None
+        repo.git.show.return_value = _json.dumps(payload, indent=2, sort_keys=True)
+
+    def test_seeding_replays(self, invoke_cli, mocker, tmp_path):
+        # Current manifest present with a native baseline, no base manifest -> seeding -> REPLAY.
+        from climate_ref_core.regression.manifest import NativeEntry
+
+        self._setup(mocker, tmp_path, current_version=1, native={"data.nc": NativeEntry("a" * 64, 10)})
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "replay" in result.output
+
+    def test_seeding_without_native_skips(self, invoke_cli, mocker, tmp_path):
+        # Current manifest present with an empty native set, no base manifest -> seeding -> SKIP.
+        self._setup(mocker, tmp_path, current_version=1)
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "skip" in result.output
+
+    def test_version_bump_executes(self, invoke_cli, mocker, tmp_path):
+        repo, _, digests = self._setup(mocker, tmp_path, current_version=2)
+        self._set_base(repo, 1, digests)
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "execute" in result.output
+
+    def test_committed_change_without_bump_fails(self, invoke_cli, mocker, tmp_path):
+        repo, _, _ = self._setup(mocker, tmp_path, current_version=1)
+        # Base committed differs from the on-disk bundle, with no version bump.
+        self._set_base(repo, 1, {"output.json": "f" * 64})
+
+        result = invoke_cli(["test-cases", "ci-gate"], expected_exit_code=1)
+        assert result.exit_code == 1
+        assert "fail" in result.output
+
+    def test_no_change_skips(self, invoke_cli, mocker, tmp_path):
+        repo, _, digests = self._setup(mocker, tmp_path, current_version=1)
+        self._set_base(repo, 1, digests)
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "skip" in result.output
+
+    def test_core_change_replays(self, invoke_cli, mocker, tmp_path):
+        repo, _, digests = self._setup(mocker, tmp_path, current_version=1)
+        self._set_base(repo, 1, digests)
+        repo.git.diff.return_value = "packages/climate-ref-core/src/climate_ref_core/regression/compare.py"
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "replay" in result.output
+
+    def test_native_change_replays(self, invoke_cli, mocker, tmp_path):
+        from climate_ref_core.regression.manifest import NativeEntry
+
+        repo, _, digests = self._setup(
+            mocker,
+            tmp_path,
+            current_version=1,
+            native={"data.nc": NativeEntry("a" * 64, 10)},
+        )
+        # Committed + version unchanged, but native blob re-minted -> REPLAY.
+        self._set_base(repo, 1, digests, native={"data.nc": NativeEntry("b" * 64, 12)})
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "replay" in result.output
+
+    def test_committed_integrity_drift_fails(self, invoke_cli, mocker, tmp_path):
+        repo, paths, digests = self._setup(mocker, tmp_path, current_version=1)
+        self._set_base(repo, 1, digests)
+        # Edit the on-disk committed bundle without regenerating the manifest.
+        (paths.regression / "output.json").write_text('{"x": 999}\n', encoding="utf-8")
+
+        result = invoke_cli(["test-cases", "ci-gate"], expected_exit_code=1)
+        assert result.exit_code == 1
+        assert "fail" in result.output
+
+    def test_json_output(self, invoke_cli, mocker, tmp_path):
+        repo, _, digests = self._setup(mocker, tmp_path, current_version=2)
+        self._set_base(repo, 1, digests)
+
+        result = invoke_cli(["test-cases", "ci-gate", "--json"])
+        assert result.exit_code == 0
+        # The test runner mixes loguru's stderr lines into the captured output; in real
+        # use stdout (the JSON) is captured separately. Slice from the JSON array start.
+        payload = json.loads(result.output[result.output.index("[") :])
+        assert payload[0]["action"] == "execute"
+        assert payload[0]["case"] == "example/test-diag/default"
+
+    def test_manifest_deletion_fails(self, invoke_cli, mocker, tmp_path):
+        # Manifest absent on this branch but present on the base -> deletion -> FAIL.
+        repo, _, _ = self._setup(mocker, tmp_path, current_version=None)
+        self._set_base(repo, 1, {"output.json": "a" * 64})
+
+        result = invoke_cli(["test-cases", "ci-gate"], expected_exit_code=1)
+        assert result.exit_code == 1
+        assert "fail" in result.output
+
+    def test_never_managed_skips(self, invoke_cli, mocker, tmp_path):
+        # No current manifest and none on the base -> never managed -> SKIP.
+        self._setup(mocker, tmp_path, current_version=None)
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "skip" in result.output
+
+    def test_not_in_repo_fails(self, invoke_cli, mocker, tmp_path):
+        self._setup(mocker, tmp_path, current_version=1)
+        mocker.patch("climate_ref.cli.test_cases.get_repo_for_path", return_value=None)
+
+        result = invoke_cli(["test-cases", "ci-gate"], expected_exit_code=1)
+        assert result.exit_code == 1
+
+    def test_paths_none_skips(self, invoke_cli, mocker, tmp_path):
+        # A non-dev checkout: from_diagnostic cannot locate the test case directory.
+        self._setup(mocker, tmp_path, current_version=1)
+        mocker.patch("climate_ref_core.testing.TestCasePaths.from_diagnostic", return_value=None)
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "skip" in result.output
+
+    def test_corrupt_current_manifest_fails(self, invoke_cli, mocker, tmp_path):
+        # A valid manifest is written, then overwritten with garbage.
+        _, paths, _ = self._setup(mocker, tmp_path, current_version=1)
+        paths.manifest.write_text("not json{", encoding="utf-8")
+
+        result = invoke_cli(["test-cases", "ci-gate"], expected_exit_code=1)
+        assert result.exit_code == 1
+        assert "fail" in result.output
+
+    def test_corrupt_base_manifest_treated_as_seeding(self, invoke_cli, mocker, tmp_path):
+        # git show returns non-JSON for the base manifest: must not crash; seed (replay).
+        from climate_ref_core.regression.manifest import NativeEntry
+
+        repo, _, _ = self._setup(
+            mocker, tmp_path, current_version=1, native={"data.nc": NativeEntry("a" * 64, 10)}
+        )
+        repo.git.show.side_effect = None
+        repo.git.show.return_value = "}}not json"
+
+        result = invoke_cli(["test-cases", "ci-gate"])
+        assert result.exit_code == 0
+        assert "replay" in result.output
+
+    def test_catalog_drift_fails(self, invoke_cli, mocker, tmp_path):
+        # The manifest records a catalog_hash that the on-disk catalog no longer matches.
+        from climate_ref_core.regression.manifest import Manifest
+
+        _, paths, _ = self._setup(mocker, tmp_path, current_version=1)
+        manifest = Manifest.load(paths.manifest)
+        Manifest(
+            schema=manifest.schema,
+            test_case_version=manifest.test_case_version,
+            committed=manifest.committed,
+            native=manifest.native,
+            catalog_hash="expected_hash",
+        ).dump(paths.manifest)
+        paths.catalog.write_text("_metadata:\n  hash: different_hash\ncmip6:\n  datasets: []\n")
+
+        result = invoke_cli(["test-cases", "ci-gate"], expected_exit_code=1)
+        assert result.exit_code == 1
+        assert "fail" in result.output