fix(registry): verify registry artifact content hashes#2763

Open
daryllimyt wants to merge 11 commits into main from build/registry-artifact-content-hash

@daryllimyt daryllimyt commented May 23, 2026

Contributor

Summary

  • add registry artifact content hashes to version records and prebuilt artifact metadata
  • include #sha256=... artifact references so executors verify downloaded SquashFS artifacts
  • make SquashFS output deterministic by sorting staged files, fixing filesystem timestamps, normalizing metadata, and using portable gzip -6
  • default SquashFS build processors to detected CPU count while preserving the env override
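
The `#sha256=...` reference scheme described above can be illustrated with a minimal sketch. The helper names (`split_artifact_ref`, `file_sha256`, `verify_artifact`) are hypothetical and are not taken from this PR; the sketch only assumes that the expected hash travels as a URI fragment and is compared against the SHA-256 of the downloaded file.

```python
import hashlib
from typing import Optional, Tuple

HASH_PREFIX = "sha256="  # fragment prefix named in the PR summary


def split_artifact_ref(ref: str) -> Tuple[str, Optional[str]]:
    """Split 'uri#sha256=<hex>' into (uri, expected_hex); hex is None if absent."""
    uri, _, fragment = ref.partition("#")
    if fragment.startswith(HASH_PREFIX):
        return uri, fragment[len(HASH_PREFIX):].lower()
    # Hypothetical choice: references without a sha256 fragment pass through unchanged.
    return ref, None


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: str, expected: Optional[str]) -> None:
    """Raise ValueError when an expected hash is given and the file does not match."""
    if expected is None:
        return  # nothing to verify for references that carry no hash
    actual = file_sha256(path)
    if actual != expected:
        raise ValueError(f"artifact hash mismatch: expected {expected}, got {actual}")
```

This kind of check only holds across rebuilds if the artifact bytes are reproducible, which is presumably why the deterministic SquashFS output is part of the same change.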

Stack

Stacked on #2753, which keeps the prebuilt manifest startup-sync performance changes and manifest-fingerprint fallback.

Testing

  • uv run pytest tests/unit/executor/test_registry_helpers.py tests/unit/test_registry_artifacts.py tests/unit/test_registry_sync_base_service.py tests/unit/test_registry_lock_service.py tests/unit/test_registry_platform_startup.py tests/unit/test_registry_sync_runner.py tests/unit/test_registry_sync_schemas.py
  • uv run pytest tests/unit/test_registry_sync_artifact.py
  • uv run ruff check .
  • uv run ruff format --check .
  • uv run pyright tracecat/config.py tracecat/registry/sync/artifact.py tests/unit/test_registry_sync_artifact.py
  • uv run alembic heads
  • pnpm -C frontend exec biome check src/client/schemas.gen.ts src/client/types.gen.ts

daryllimyt commented May 23, 2026

Contributor Author

@daryllimyt daryllimyt added the engine (Improvements or additions to the workflow engine), build (Build system and package dependency changes), and fix (Bug fix) labels May 23, 2026
zeropath-ai Bot commented May 23, 2026

No security or compliance issues detected. Reviewed everything up to 1fe4c3d.

Security Overview
Detected Code Changes
Change type: Enhancement
Relevant files:
► alembic/versions/b4f8c1d2e3a4_add_registry_version_artifact_hash.py
    Add artifact_hash column to registry_version and platform_registry_version tables
► frontend/src/client/schemas.gen.ts
    Add artifact_hash to RegistryVersionRead schema
► frontend/src/client/types.gen.ts
    Add artifact_hash to RegistryVersionRead types
► tests/unit/executor/test_registry_artifact_lookup.py
    Add tests for artifact hash locking and fallback behavior
► tests/unit/executor/test_registry_helpers.py
    Update tests to handle artifact hashes in URIs and lookups
► tests/unit/test_registry_artifacts.py
    Add tests for artifact hash verification during download and candidate selection
► tests/unit/test_registry_lock_service.py
    Test that artifact hash is preferred for origin fingerprint
► tests/unit/test_registry_platform_startup.py
    Update promotion logic to handle artifact hashes
► tests/unit/test_registry_sync_artifact.py
    Add support for artifact hashes in SquashFS image creation
► tests/unit/test_registry_sync_base_service.py
    Add artifact hash to build results and handle prebuilt metadata
► tests/unit/test_registry_sync_runner.py
    Add artifact hash to runner results and prebuilt metadata handling
► tests/unit/test_registry_version_schemas.py
    Add validation for artifact hash format
► tracecat/admin/registry/schemas.py
    Add artifact_hash to RegistryVersionRead schema
► tracecat/admin/registry/service.py
    Include artifact_hash when listing registry versions
► tracecat/config.py
    Add configuration for SquashFS processors
► tracecat/db/models.py
    Add artifact_hash column to BaseRegistryVersion model
► tracecat/executor/backends/registry_helpers.py
    Update artifact lookup to consider artifact hashes and bundled metadata
► tracecat/executor/registry_artifacts.py
    Add expected_hash to RegistryArtifact and support hash fragment in URIs
    Update _download_s3_artifact to verify expected SHA-256 hash
    Add functions to handle registry artifact references with hashes

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

candidates.append(
    TarballArtifact(
        uri=tarball_uri,
        cache_key=ctx.cache_key,
    )
)

P1: Enforce hash verification on SquashFS fallback tarball

When a lock carries an expected artifact hash, integrity failures on the primary .squashfs candidate are currently bypassable because the fallback TarballArtifact is created without any expected hash. In materialize(), candidate failures are caught and the next candidate is tried, so a hash mismatch (or other download integrity error) on the squashfs can still lead to executing an unverified tarball sibling if it exists. This weakens the new content-hash verification guarantee for artifact resolution.
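
One way to close this gap, sketched under stated assumptions: when a reference is hash-locked, candidates that carry no expected hash of their own are not eligible, and a hash mismatch aborts resolution instead of falling through to the next candidate. `Candidate`, `IntegrityError`, and `materialize_first` are hypothetical stand-ins for illustration; they are not the PR's `materialize()` or `TarballArtifact`.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


class IntegrityError(Exception):
    """Raised when downloaded bytes do not match the expected hash."""


@dataclass(frozen=True)
class Candidate:
    name: str
    fetch: Callable[[], bytes]  # stand-in for a real download
    expected_hash: Optional[str] = None


def materialize_first(candidates: Sequence[Candidate], require_hash: bool) -> bytes:
    """Return the first usable candidate's bytes without weakening a hash lock."""
    last_error: Optional[Exception] = None
    for candidate in candidates:
        if require_hash and candidate.expected_hash is None:
            # An unverifiable sibling must not satisfy a hash-locked reference.
            continue
        try:
            data = candidate.fetch()
        except OSError as exc:
            last_error = exc  # missing or unreachable: try the next candidate
            continue
        if candidate.expected_hash is not None:
            actual = hashlib.sha256(data).hexdigest()
            if actual != candidate.expected_hash:
                # Fatal on purpose: a mismatch is not a reason to try a fallback.
                raise IntegrityError(
                    f"{candidate.name}: expected {candidate.expected_hash}, got {actual}"
                )
        return data
    raise LookupError("no usable artifact candidate") from last_error
```

The design choice here is that only availability errors trigger the fallback. Whether the tarball sibling should get its own recorded hash or be excluded entirely for hash-locked references is left open by the review comment.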

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@cubic-dev-ai cubic-dev-ai Bot left a comment

Contributor

2 issues found across 29 files

Confidence score: 3/5

  • There is concrete integrity risk in tracecat/executor/registry_artifacts.py: in the TAR_GZ flow, expected_hash is not applied to TarballArtifact, and TarballArtifact.download does not enforce hash verification, so tampered artifacts could be accepted.
  • tracecat/registry/versions/schemas.py currently accepts any artifact_hash string despite SHA-256 documentation, which can let invalid data in and weaken downstream validation guarantees.
  • Given the medium-to-high severity and confidence of the artifact verification findings, this should be merged with caution rather than treated as low-risk.
  • Pay close attention to tracecat/executor/registry_artifacts.py, tracecat/registry/versions/schemas.py - missing hash enforcement and weak input validation are the main regression/integrity concerns.
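
The format check requested for `artifact_hash` could look roughly like the following. This is a minimal sketch, not the validation added in the PR: `normalize_artifact_hash` is a hypothetical helper, and lowercasing the input is an assumption about the desired canonical form.

```python
import re
from typing import Optional

# A SHA-256 digest rendered as hex is exactly 64 hexadecimal characters.
_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def normalize_artifact_hash(value: Optional[str]) -> Optional[str]:
    """Return a lowercase SHA-256 hex digest, or raise ValueError if malformed."""
    if value is None:
        return None  # the field is optional; a missing hash is not an error here
    candidate = value.strip().lower()
    if _SHA256_HEX.fullmatch(candidate) is None:
        raise ValueError("artifact_hash must be 64 hexadecimal characters")
    return candidate
```

A function of this shape could be called from a schema-level validator so that malformed values are rejected before they reach the database.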

Comment thread tracecat/executor/registry_artifacts.py
Comment thread tracecat/registry/versions/schemas.py

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc4c4c0d4b

Comment thread tracecat/executor/registry_artifacts.py Outdated
@daryllimyt daryllimyt changed the base branch from build/imrpove-sync to graphite-base/2763 May 24, 2026 18:58

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 181ebe0a7a

Comment thread tracecat/registry/sync/base_service.py

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

promote_version_id=None if is_fresh_install else result.version.id,

P2: Persist artifact hash after fresh-install deferred build

When startup sync runs on a fresh install, this call path sets promote_version_id=None, so _build_platform_registry_artifact() never invokes _promote_platform_registry_version_after_artifact_build() to write back the built artifact metadata. In the same flow, deferred sync can create the version with artifact_hash=None when no prebuilt metadata is present (tracecat/registry/sync/base_service.py lines 479-487). The result is a current builtin registry version that keeps a null hash even after the artifact is built, so lock resolution won’t emit hash-locked artifact refs and executors skip integrity verification indefinitely in that environment.
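
The concern can be restated as: writing back the built artifact's hash should not depend on whether a promotion happens. The sketch below models that with in-memory records. `VersionRecord` and `record_build_result` are hypothetical and do not correspond to the PR's database models or to `_build_platform_registry_artifact()`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VersionRecord:
    """Hypothetical in-memory stand-in for a registry version row."""

    id: str
    artifact_hash: Optional[str] = None
    is_current: bool = False


def record_build_result(
    versions: Dict[str, VersionRecord],
    built_version_id: str,
    artifact_hash: str,
    promote_version_id: Optional[str] = None,
) -> VersionRecord:
    """Persist the built hash unconditionally; promote only when requested."""
    record = versions[built_version_id]
    # Always write back the hash, including on fresh installs where
    # promote_version_id is None.
    record.artifact_hash = artifact_hash
    if promote_version_id is not None:
        for version in versions.values():
            version.is_current = version.id == promote_version_id
    return record
```

With this shape, a fresh install that skips promotion still ends up with a non-null hash on the current version, so hash-locked references can be emitted later.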

@daryllimyt daryllimyt force-pushed the build/registry-artifact-content-hash branch from 7f01567 to 56416c6 Compare May 24, 2026 19:36
@daryllimyt daryllimyt force-pushed the build/registry-artifact-content-hash branch from 56416c6 to 07618f2 Compare May 24, 2026 19:49
@daryllimyt daryllimyt force-pushed the graphite-base/2763 branch from 2f03d6d to 0023fcb Compare May 24, 2026 19:49
@daryllimyt daryllimyt changed the base branch from graphite-base/2763 to main May 24, 2026 19:49
@daryllimyt daryllimyt changed the base branch from main to fix/registry-current-startup-artifact-uri May 24, 2026 20:17
Base automatically changed from fix/registry-current-startup-artifact-uri to main May 24, 2026 20:40
@daryllimyt daryllimyt requested a review from jordan-umusu May 24, 2026 22:23

@jordan-umusu jordan-umusu left a comment

Collaborator

LGTM

Labels

build (Build system and package dependency changes), engine (Improvements or additions to the workflow engine), fix (Bug fix)

2 participants