Add Superset asset management system with automated sync #1870
Merged
Implements complete asset management workflow for Apache Superset using the sup CLI with OAuth authentication and automated database UUID mapping.

New Features:
- Automated export/import of all Superset assets (datasets, charts, dashboards)
- OAuth2 PKCE authentication with CSRF token support
- Automatic database UUID mapping between environments
- Pagination support to fetch ALL assets (not limited to 50)
- Environment-agnostic scripts (works with any source/target instance)
- Continue-on-error for resilient imports

Scripts Added:
- export_all.sh: Export all assets from any Superset instance with pagination
- sync_assets.sh: Sync assets between instances with UUID mapping
- map_database_uuids.py: Automatic database UUID translation
- validate_assets.sh: YAML validation (from previous workflow)
- promote_to_production.sh: Legacy promotion script (from previous workflow)
- export_from_qa.sh: Legacy QA export script (from previous workflow)

Assets Included:
- 76+ datasets from production Trino and Superset Metadata DB
- 107+ charts covering all visualization types
- 18+ published dashboards (enrollment, engagement, orders, etc.)
- 2 database connection configs (Trino + Superset Metadata DB)

Documentation:
- WORKFLOWS.md: Complete workflow guide with diagrams
- scripts/README.md: Script reference and troubleshooting
- Both include examples for common operations

Technical Implementation:
- Uses sup CLI (fork: mitodl/superset-sup with self-hosted support)
- CSRF token handling for POST requests (multipart/form-data)
- Database UUID mapping: Production Trino → QA Trino translation
- Pagination loops fetch 100 items per page until complete
- Regex fallback for malformed JSON from CLI

Typical Workflows:
1. Production → QA sync (backup/mirroring)
2. QA → Production promotion (after testing changes)
3. Weekly backups with git version control

Related: Requires sup CLI from ~/src/superset-sup (see PR preset-io/superset-sup#19)
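The database UUID mapping step described above can be illustrated with a minimal Python sketch. This is not the code in map_database_uuids.py; it assumes that each exported dataset YAML file references its database on a line of the form `database_uuid: <uuid>`, and the function name, the regex, and the UUIDs in `UUID_MAP` are hypothetical placeholders chosen for illustration.

```python
import re

# Hypothetical UUIDs, for illustration only. Real values would come from the
# database configs exported from the source and target instances.
UUID_MAP = {
    "11111111-1111-1111-1111-111111111111": "22222222-2222-2222-2222-222222222222",
}

# Assumption: a dataset export references its database on a line such as
# "database_uuid: <uuid>". Only spaces/tabs are matched around the value so
# that line breaks are never consumed.
DATABASE_UUID_LINE = re.compile(
    r"^(?P<prefix>[ \t]*database_uuid:[ \t]*)(?P<uuid>[0-9a-fA-F-]{36})[ \t]*$",
    re.MULTILINE,
)


def map_database_uuids(yaml_text: str, uuid_map: dict[str, str]) -> str:
    """Return yaml_text with every known source database UUID replaced."""

    def swap(match: re.Match) -> str:
        source = match.group("uuid")
        # UUIDs missing from the map are left untouched rather than guessed.
        target = uuid_map.get(source.lower(), source)
        return match.group("prefix") + target

    return DATABASE_UUID_LINE.sub(swap, yaml_text)
```

Working on the raw text keeps the rest of each YAML file byte-for-byte unchanged, which helps keep git diffs small. A YAML parser would be a reasonable alternative if the files need structural validation at the same time.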
The --limit 1000 flags were preventing pagination from triggering. Now the script relies on the pagination logic in pull commands to fetch ALL assets without any artificial limit. This works with the recent sup CLI fix that makes pagination trigger when no limit is specified (filters.limit = None).

Result:
- Fetches all datasets via pagination
- Fetches all charts via pagination (123+ instead of 100)
- Fetches all dashboards via pagination
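The behavior described above (page through results when no limit is given) can be sketched roughly as follows. This is an illustrative Python loop, not the sup CLI's actual implementation: `fetch_all`, `fetch_page`, and `fake_page` are hypothetical names, and only the page size of 100 comes from the PR description.

```python
from typing import Callable, Optional

PAGE_SIZE = 100  # page size mentioned in the PR description


def fetch_all(
    fetch_page: Callable[[int, int], list],
    limit: Optional[int] = None,
) -> list:
    """Collect items page by page.

    fetch_page(page, page_size) is a hypothetical callable returning the
    items on one zero-based page. With limit=None, paging continues until a
    short or empty page signals that everything has been fetched.
    """
    items: list = []
    page = 0
    while limit is None or len(items) < limit:
        batch = fetch_page(page, PAGE_SIZE)
        items.extend(batch)
        if len(batch) < PAGE_SIZE:
            break  # last page reached
        page += 1
    return items if limit is None else items[:limit]


# Stand-in for an API call: 123 fake charts served in pages.
_CHARTS = [f"chart-{i}" for i in range(123)]


def fake_page(page: int, page_size: int) -> list:
    start = page * page_size
    return _CHARTS[start:start + page_size]


all_charts = fetch_all(fake_page)            # no limit: every page is fetched
first_fifty = fetch_all(fake_page, limit=50)  # explicit limit caps the result
```

In this sketch an explicit limit only caps the result, whereas the commit message says the `--limit 1000` flags kept pagination from triggering at all, which is why they were dropped.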
6d2a604 to 00aec8a
rachellougee (Contributor) approved these changes on Jan 30, 2026 and left a comment:
Nice work! I wasn't able to run the export script locally but I can circle back next week.
Related Work
uv tool install git+https://github.com/mitodl/superset-sup@self_hosted_superset_oidc
Testing
✅ Tested and validated:
To run it yourself, run
uv tool install git+https://github.com/mitodl/superset-sup@self_hosted_superset_oidc
and then create a configuration file located at ~/.sup.config.yml with the following contents: