feat: support planning cleanup#7147
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
@claude review |
Xuanwo
left a comment
There was a problem hiding this comment.
I think this should follow SQL EXPLAIN semantics: the cleanup plan should be a dry-run/audit report, not a materialized deletion plan.
Execution should re-evaluate cleanup from the current dataset/ref state instead of trusting an old file list. For example, a tag or branch can be added after planning without advancing the manifest version, so the current read_version check can still pass while the old plan deletes files that are now protected.
Sounds reasonable. My original idea is also for |
Motivation
Adds a plan + execute flow for
cleanupso deletions become auditable. Per reviewer feedback on the original design, the plan now follows SQLEXPLAINsemantics: it is a read-only audit report, not a frozen deletion contract.Design
Plan is an EXPLAIN-style report.
CleanupPlanexposes:dataset_uri,base_path,policy,created_at— authoritative inputs reused by execution.files,stats,referenced_branches,tagged_old_versions— informational; describe the dataset state observed at planning time.Execute re-plans against the current dataset state.
cleanup_with_plan:dataset_uri/base_pathmatch the current dataset.plan.policy.Tags, branches, and commits added after planning are naturally honored — including the cases where they do not advance the manifest version.
Drops the
read_versionenforcement.latest_version_iddoes not move when new tags or branch refs are created, so the previous check did not protect against the TOCTOU case the reviewer raised. Re-planning is strictly stronger.API
Impact
plan.filesif the dataset changes between plan and execute. This upper-bound semantics matches SQLEXPLAINand is documented onCleanupPlanandcleanup_with_plan.read_versionfield, which has not been released yet.Tests
cleanup_with_plan_reevaluates_against_concurrent_commit— execute after a concurrent commit succeeds and removes the correct set.cleanup_with_plan_handles_encoded_paths— re-planning still discovers temp manifests with URL-unsafe names.read_versionwith behavior-level coverage of the new semantics.