
feat(sync): workspace-level git sync foundation with changeset staging #2852

Open
daryllimyt wants to merge 5 commits into main from feat/revamp-git-sync-1


daryllimyt (Contributor) commented Jun 12, 2026

What

Workspace-level Git sync foundation built on the projection/reconciliation model from the spec (Revamp Git Sync project):

  • Canonical workspace spec: tracecat.json manifest + workflows/<source-id>/definition.yml resource files, deterministically serialized (sorted keys, exclude_none, versioned v1: SHA-256 spec hashes). Legacy wf_... repo layouts dual-read on pull.
  • Identity mapping: workspace_sync_resource_mapping maps stable slug-derived source ids to workspace-local UUIDs, so one repo can hydrate multiple workspaces.
  • Three-way sync state: base/local/remote classification (clean / local_dirty / remote_ahead / diverged / conflicted) via GET /workflows/sync/status, with pending changes from diff(base, P(workspace)).
  • ChangeSets: reviewable export units with files frozen at creation time (rendered_files) — exports always ship the reviewed snapshot, never a re-projection. Materializations record branch/commit/PR outputs.
  • Staging UI: workspace settings panel to review pending changes, select resources into a ChangeSet, and export to a branch/PR.
  • Pull as reconciliation: parse → validate → resolve identities → atomic import, with dry-run diagnostics. No hard deletes.
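
The deterministic serialization and versioned hashing described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `drop_none` and `spec_hash` are hypothetical names, the spec is assumed to be a plain dict rather than the actual spec models, and JSON stands in for whatever canonical encoding the package really hashes.

```python
import hashlib
import json
from typing import Any


def drop_none(value: Any) -> Any:
    """Recursively drop None-valued dict keys (an exclude_none-style dump)."""
    if isinstance(value, dict):
        return {k: drop_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_none(v) for v in value]
    return value


def spec_hash(spec: dict[str, Any]) -> str:
    """Return a 'v1:'-prefixed SHA-256 hash of the canonical form of a spec."""
    canonical = json.dumps(
        drop_none(spec),
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace variation
        ensure_ascii=False,
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"v1:{digest}"
```

Two specs that differ only in key order or in explicit `None` fields hash identically under this scheme, which is the property a base/local/remote comparison relies on.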

Replaces the legacy per-workflow WorkflowSyncService (tracecat/workflow/store/sync.py); the publish path now routes through the workspace sync exporter.

Database

One additive migration (25f4e2a1c9d8): five workspace-scoped tables (workspace_sync_state, workspace_sync_resource_mapping, workspace_sync_changeset, workspace_sync_changeset_item, workspace_sync_materialization) with RLS enabled.

Testing

  • tests/unit/test_workspace_sync.py: projection determinism, no local-UUID leakage, legacy dual-read, mapping identity, status/changeset/export flow against a mocked GitHub transport.
  • Updated store service/publish API tests for the new exporter path.

Notes

  • Stacked PR: the M1 lifecycle fixes (feat/revamp-git-sync-2) land on top of this branch.
  • GitHub transport reads the full tree per status call; efficiency work is tracked in ENG-1476.

Summary by cubic

Introduces workspace-level Git sync with a canonical spec, three-way status, and reviewable ChangeSets with staging, replacing the per‑workflow sync flow. Publish/pull now route through the workspace sync service, aligned with the Revamp Git Sync projection/reconciliation spec.

  • New Features

    • Canonical repo layout: tracecat.json + workflows/<source-id>/definition.yml, deterministic serialization and versioned SHA‑256 spec hashes.
    • Identity mapping from stable slug‑based source IDs to workspace UUIDs (one repo can hydrate multiple workspaces).
    • Three‑way status (base/local/remote) and pending changes via GET /workflows/sync/status.
    • ChangeSets with frozen rendered_files: select resources, then export to a branch/PR; materialization records commit/PR info.
    • Pull as reconciliation: parse → validate → resolve identities → atomic import, with dry‑run diagnostics and no hard deletes; legacy layouts are dual‑read on pull.
    • Staging UI in workspace settings to review changes and export.
    • Backend refactor: remove legacy WorkflowSyncService; workflow publish/pull now uses WorkspaceGitSyncService.
    • Tests: webhook fixtures now include headers so projections carry through.
  • Migration

    • Additive DB migrations: new workspace‑scoped tables (workspace_sync_state, workspace_sync_resource_mapping, workspace_sync_changeset, workspace_sync_changeset_item, workspace_sync_materialization, workspace_sync_event) and rendered_files on ChangeSets; RLS enabled.
    • Tables are workspace‑scoped only (no organization_id), matching tenant RLS.
    • Action: run Alembic migrations; no breaking changes. Existing repos continue to work via legacy dual‑read on pull.
    • Fix: migration helper annotations updated to Python types.
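
The three-way status listed above can be illustrated with a small classifier over per-resource spec hashes. The PR text names the five states but not their exact rules, so everything here is an assumption: `classify` and `changed_ids` are hypothetical helpers, each side is a map from source id to spec hash, and "conflicted" is taken to mean that both sides changed the same resource to different content.

```python
from enum import Enum

Hashes = dict[str, str]  # source id -> spec hash (assumed shape)


class SyncStatus(str, Enum):
    CLEAN = "clean"
    LOCAL_DIRTY = "local_dirty"
    REMOTE_AHEAD = "remote_ahead"
    DIVERGED = "diverged"
    CONFLICTED = "conflicted"


def changed_ids(base: Hashes, other: Hashes) -> set[str]:
    """Source ids added, removed, or modified relative to base."""
    return {k for k in base.keys() | other.keys() if base.get(k) != other.get(k)}


def classify(base: Hashes, local: Hashes, remote: Hashes) -> SyncStatus:
    local_changes = changed_ids(base, local)
    remote_changes = changed_ids(base, remote)
    if not local_changes and not remote_changes:
        return SyncStatus.CLEAN
    if not remote_changes:
        return SyncStatus.LOCAL_DIRTY
    if not local_changes:
        return SyncStatus.REMOTE_AHEAD
    # Both sides moved. Assumed rule: it is a conflict only when the same
    # resource ended up with different content on each side.
    overlap = local_changes & remote_changes
    if any(local.get(k) != remote.get(k) for k in overlap):
        return SyncStatus.CONFLICTED
    return SyncStatus.DIVERGED
```

Because `changed_ids` walks the union of keys, a resource missing from one side counts as a change, so deletions show up in the classification rather than being skipped.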

Written for commit 911e754. Summary will update on new commits.


Reframe git sync from per-workflow UUID-keyed pushes to a workspace-level
canonical projection addressed by stable, human-readable source IDs.

- Add tracecat/workspace_sync/ package: canonical specs + deterministic
  serialization/hashing, workflow ORM/legacy/YAML adapter, GitHub App
  transport, and the projection/parse/pull/export service.
- Repo layout becomes tracecat.json manifest + workflows/<slug>/definition.yml
  (pull dual-reads the legacy RemoteWorkflowDefinition format).
- Add 6 workspace-scoped sync tables (state, resource_mapping, event,
  changeset, changeset_item, materialization) + migration + RLS entries.
- Rewire workflow store publish/pull onto WorkspaceGitSyncService; legacy
  no-branch publish now synthesizes a temp branch + PR; pull gains a
  base-vs-local drift conflict gate.
- Remove dead WorkflowSyncService and its tests; repoint store API tests.

Note: changeset/event lifecycle is scaffolding only (tables written on
export, no review endpoints yet); WorkspaceGitHubSyncService lacks dedicated
transport tests.
daryllimyt added the enhancement, engine, ui, and migration labels Jun 12, 2026

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6d92b9feb5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tracecat/workspace_sync/git.py
Comment thread tracecat/workspace_sync/service.py
zeropath-ai (bot) commented Jun 12, 2026

No security or compliance issues detected. Reviewed everything up to 911e754.

Security Overview
Detected Code Changes

The diff is too large to display a summary of code changes.

cubic-dev-ai (bot, Contributor) left a comment

16 issues found across 25 files

Confidence score: 2/5

  • The highest-risk issue is in tracecat/workspace_sync/service.py: pull reconciliation and pending-change staging both miss delete operations. Merges can therefore advance base hashes while silently failing to propagate workflow deletions, leaving local and remote state inconsistent. Add delete handling end-to-end in reconciliation and export, and cover it with regression tests before merging.
  • Export-flow validation is leaky across tracecat/workspace_sync/schemas.py, tracecat/workspace_sync/git.py, and frontend/src/components/organization/workspace-sync-staging.tsx: whitespace-only messages and non-exportable selected changes can still reach Git/PR calls and fail late. Enforce trimmed, non-empty messages and block non-exportable items during payload construction before merging.
  • tracecat/workspace_sync/git.py and tracecat/workspace_sync/service.py can fail hard on repository contents: file reads assume UTF-8 and parsing scans every file under workflows/, so a binary or helper file can break sync unexpectedly. Filter to valid workflow definition paths and make reads resilient to non-text blobs.
  • Input and path guardrails in tracecat/workspace_sync/workflow.py and tracecat/workflow/store/router.py are weak enough to accept malformed workflow paths or an empty-string branch, which can import unexpected files or silently target the wrong base branch. Tighten the canonical path checks and explicitly reject empty strings before merging.
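
The guardrails suggested in the bullets above might look something like the sketch below. None of these helpers exist in the PR as far as this page shows: the function names are hypothetical, and the allowed source-id character set is an assumption chosen only to illustrate strict matching of the workflows/<source-id>/definition.yml layout.

```python
from __future__ import annotations

import re

# Assumed canonical layout: exactly workflows/<source-id>/definition.yml.
_DEFINITION_PATH = re.compile(r"workflows/([A-Za-z0-9][A-Za-z0-9_-]*)/definition\.yml")


def source_id_from_path(path: str) -> str | None:
    """Return the source id for a canonical definition path, else None."""
    match = _DEFINITION_PATH.fullmatch(path)
    return match.group(1) if match else None


def decode_text_blob(data: bytes) -> str | None:
    """Decode a blob as UTF-8, returning None for binary content instead of raising."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def normalize_commit_message(message: str) -> str:
    """Trim the message and reject empty or whitespace-only input early."""
    trimmed = message.strip()
    if not trimmed:
        raise ValueError("commit message must not be empty or whitespace-only")
    return trimmed


def normalize_branch(branch: str | None) -> str | None:
    """None means 'use the default branch'; an empty string is an error."""
    if branch is None:
        return None
    trimmed = branch.strip()
    if not trimmed:
        raise ValueError("branch must not be an empty string")
    return trimmed
```

Filtering the tree with `source_id_from_path` before reading any blob keeps helper files and nested paths out of the parser, and treating `""` differently from `None` avoids silently falling back to the wrong base branch.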


Comment thread tracecat/workspace_sync/service.py
Comment thread tracecat/workspace_sync/service.py
Comment thread alembic/versions/25f4e2a1c9d8_add_workspace_git_sync_tables.py
Comment thread tracecat/workspace_sync/schemas.py
Comment thread tracecat/workspace_sync/schemas.py
Comment thread tracecat/workspace_sync/service.py
Comment thread tracecat/workspace_sync/workflow.py
Comment thread tracecat/workspace_sync/workflow.py
Comment thread tracecat/workspace_sync/service.py
Comment thread tracecat/workflow/store/service.py
- Drop organization_id from workspace sync tables per workspace-scoped
  table convention (org derives via workspace); satisfies the tenant RLS
  registry test introduced on main
- Reparent migration onto 9b52f7f18a31 so the branch is mergeable alone
- Add include_headers to webhook test fixtures for the projection carry

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 911e7542b8


Comment on lines +435 to +443
preferred_source_id = default_workflow_source_id(
    alias=workflow.alias,
    title=dsl.title,
)
mapping = await self._ensure_resource_mapping(
    resource_type=SyncResourceType.WORKFLOW.value,
    local_id=WorkflowUUID.new(workflow.id),
    preferred_source_id=preferred_source_id,
    source_path=workflow_source_path(preferred_source_id),


P1: Preserve legacy workflow publish paths

When an upgraded workspace publishes a workflow before any new WorkspaceSyncResourceMapping has been created, this fallback chooses an alias/title slug as the source id, so the exporter writes workflows/<slug>/definition.yml instead of the legacy stable workflows/<wf_id>/definition.yml path that existing Git sync repos already contain. Since write_files() only adds/updates the selected files and never removes the old path, merging that publish leaves two definitions for the same workflow in the repo, and later pulls can import duplicates or stale data. Please fall back to the existing workflow-id source path or backfill/discover mappings before changing the path.
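
One way to read the suggested fallback is as a fixed preference order when picking the source id. The sketch below is only an illustration of that ordering, not the fix that landed: `resolve_source_id`, its parameters, and the in-memory `mappings` and `repo_paths` inputs are all hypothetical stand-ins for the real mapping table and Git tree.

```python
from __future__ import annotations


def resolve_source_id(
    workflow_id: str,
    slug: str,
    mappings: dict[str, str],
    repo_paths: set[str],
) -> str:
    """Pick the source id to publish under, preferring paths the repo already has."""
    # 1. An existing mapping always wins, so a path never moves once assigned.
    if workflow_id in mappings:
        return mappings[workflow_id]
    # 2. Otherwise reuse the legacy workflow-id directory if the repo contains it.
    if f"workflows/{workflow_id}/definition.yml" in repo_paths:
        return workflow_id
    # 3. Only a workflow with no prior footprint gets the slug-derived id.
    return slug
```

With this ordering, an upgraded workspace keeps writing to the path its repo already contains, so a publish cannot leave two definitions for the same workflow side by side.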


Labels

engine (Improvements or additions to the workflow engine)
enhancement (New feature or request)
migration (Database migration)
ui (Improvements or additions to UI/UX)


1 participant