feat: size-aware CRDB bulk export partitioning #3089

Draft
ostafen wants to merge 2 commits into main from fix/partitioned-export-partitions


@ostafen ostafen commented May 5, 2026

Balance partitions by total range bytes from SHOW RANGES ... WITH DETAILS instead of by range count, so workers process roughly equal amounts of data when range sizes are uneven. Uses a simple greedy algorithm: walk the ranges in order, accumulating size, and split at the next range boundary once the running total reaches a target of totalSize/K. Skewed inputs may yield fewer than K partitions, but each one stays close to its fair share.
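The greedy walk described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `rangeInfo` and `partition` types, the `partitionBySize` function, the cap at K partitions, and the clamping of the target to at least 1 byte are all assumptions made for the example.

```go
package main

import "fmt"

// rangeInfo is a hypothetical stand-in for one row of range metadata
// (start key, end key, size in bytes). Not the PR's actual type.
type rangeInfo struct {
	startKey  string
	endKey    string
	sizeBytes int64
}

// partition is a contiguous run of ranges handed to one worker (hypothetical).
type partition struct {
	startKey  string
	endKey    string
	sizeBytes int64
}

// partitionBySize walks ranges in order, accumulating size, and closes the
// current partition at a range boundary once the running total reaches
// totalSize/k. It never returns more than k partitions (an assumption of
// this sketch) and may return fewer on skewed inputs.
func partitionBySize(ranges []rangeInfo, k int) []partition {
	if len(ranges) == 0 || k <= 0 {
		return nil
	}
	var total int64
	for _, r := range ranges {
		total += r.sizeBytes
	}
	target := total / int64(k)
	if target < 1 {
		target = 1 // avoid splitting after every range when sizes are tiny
	}

	var parts []partition
	cur := partition{startKey: ranges[0].startKey}
	for i, r := range ranges {
		cur.sizeBytes += r.sizeBytes
		cur.endKey = r.endKey
		isLast := i == len(ranges)-1
		if cur.sizeBytes >= target && !isLast && len(parts) < k-1 {
			parts = append(parts, cur)
			cur = partition{startKey: r.endKey}
		}
	}
	return append(parts, cur)
}

func main() {
	// One large range followed by three small ones, split for 3 workers.
	skewed := []rangeInfo{
		{"a", "b", 100}, {"b", "c", 1}, {"c", "d", 1}, {"d", "e", 1},
	}
	for _, p := range partitionBySize(skewed, 3) {
		fmt.Printf("[%s, %s) %d bytes\n", p.startKey, p.endKey, p.sizeBytes)
	}
}
```

With the skewed input in `main`, the first range alone exceeds the target, so the walk produces two partitions rather than three, which is the "fewer than K" case mentioned above.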

Falls back to range-count balancing, and logs the fallback, when CRDB reports zero range_size for every range, e.g. on tables too fresh to have been sized.
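A sketch of how such a fallback might look, under the assumption that "range-count balancing" means giving each worker a near-equal number of contiguous ranges. The names `sizedRange`, `allSizesZero`, `splitByCount`, and `chooseStrategy` are hypothetical and do not come from this PR.

```go
package main

import (
	"fmt"
	"log"
)

// sizedRange is a hypothetical stand-in for one range and its reported size.
type sizedRange struct {
	startKey  string
	endKey    string
	sizeBytes int64
}

// allSizesZero reports whether every range has a zero size, which is the
// condition described above for falling back. An empty slice counts as
// all-zero in this sketch.
func allSizesZero(ranges []sizedRange) bool {
	for _, r := range ranges {
		if r.sizeBytes != 0 {
			return false
		}
	}
	return true
}

// splitByCount groups ranges into at most k contiguous groups whose lengths
// differ by at most one.
func splitByCount(ranges []sizedRange, k int) [][]sizedRange {
	if len(ranges) == 0 || k <= 0 {
		return nil
	}
	if k > len(ranges) {
		k = len(ranges)
	}
	base, extra := len(ranges)/k, len(ranges)%k
	groups := make([][]sizedRange, 0, k)
	start := 0
	for i := 0; i < k; i++ {
		n := base
		if i < extra {
			n++ // the first `extra` groups take one more range
		}
		groups = append(groups, ranges[start:start+n])
		start += n
	}
	return groups
}

// chooseStrategy logs and selects the fallback when no size data is available.
func chooseStrategy(ranges []sizedRange) string {
	if allSizesZero(ranges) {
		log.Printf("no range sizes reported; falling back to range-count balancing")
		return "range-count"
	}
	return "size-aware"
}

func main() {
	fresh := []sizedRange{
		{"a", "b", 0}, {"b", "c", 0}, {"c", "d", 0}, {"d", "e", 0}, {"e", "f", 0},
	}
	fmt.Println(chooseStrategy(fresh))
	for _, g := range splitByCount(fresh, 2) {
		fmt.Println(len(g), "ranges")
	}
}
```

Checking for all-zero sizes before partitioning keeps the size-aware path from collapsing every range into a single partition when the target would otherwise be zero.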

Description

Testing

References

github-actions bot added the area/datastore (Affects the storage system) and area/tooling (Affects the dev or user toolchain, e.g. tests, ci, build tools) labels on May 5, 2026
codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 92.72727% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.39%. Comparing base (893f622) to head (92fa9df).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/datastore/crdb/partitioner.go | 92.73% | 2 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3089      +/-   ##
==========================================
- Coverage   75.42%   75.39%   -0.03%     
==========================================
  Files         502      502              
  Lines       61264    61312      +48     
==========================================
+ Hits        46204    46221      +17     
- Misses      11711    11742      +31     
  Partials     3349     3349              


Labels

area/datastore (Affects the storage system), area/tooling (Affects the dev or user toolchain, e.g. tests, ci, build tools), Skip-Changelog
