diff --git a/migration/mongosync_insights/CONFIGURATION.md b/migration/mongosync_insights/CONFIGURATION.md index 5b209652..ed7bf1e9 100644 --- a/migration/mongosync_insights/CONFIGURATION.md +++ b/migration/mongosync_insights/CONFIGURATION.md @@ -55,7 +55,7 @@ All configuration can be set using `export` commands before running the applicat | Variable | Default | Description | |----------|---------|-------------| -| `MI_ERROR_PATTERNS_FILE` | `lib/error_patterns.json` _(auto-detected)_ | Path to a custom error patterns JSON file used during log analysis to detect common errors (e.g., oplog rollover, timeouts, verifier mismatches) | +| `MI_ERROR_PATTERNS_FILE` | `lib/error_patterns.json` _(auto-detected)_ | Path to a custom error patterns JSON file used during log analysis to detect common errors (e.g., oplog rollover, timeouts, verifier mismatches). Each entry may include an optional `recommendation` string, shown in the Errors tab when a line matches that pattern. | ### UI Customization @@ -77,7 +77,7 @@ All configuration can be set using `export` commands before running the applicat | Variable | Default | Description | |----------|---------|-------------| -| `MI_SECURE_COOKIES` | `true` | Enable secure cookies (requires HTTPS) | +| `MI_SECURE_COOKIES` | mirrors `MI_SSL_ENABLED` (`false` when HTTPS is off) | Enable secure cookies (set `true` when using HTTPS) | | `MI_SESSION_TIMEOUT` | `3600` | Session timeout in seconds (1 hour default) | | `MI_SSL_ENABLED` | `false` | Enable HTTPS/SSL in Flask application | | `MI_SSL_CERT` | `/etc/letsencrypt/live/your-domain/fullchain.pem` | Path to SSL certificate file | @@ -90,7 +90,7 @@ All configuration can be set using `export` commands before running the applicat ### Connection String Validation -> **Note**: For connection string handling information, see [VALIDATION.md](VALIDATION.md) +> **Note**: For connection string handling information, see [CONNECTION_STRING.md](CONNECTION_STRING.md) --- @@ -295,7 +295,7 @@ export MI_LOG_FILE=/var/log/mongosync-insights/insights.log - **[README.md](README.md)** - Getting started and installation guide - **[HTTPS_SETUP.md](HTTPS_SETUP.md)** - Enable HTTPS/SSL for secure deployments -- **[VALIDATION.md](VALIDATION.md)** - Connection string validation, sanitization, and error handling +- **[CONNECTION_STRING.md](CONNECTION_STRING.md)** - Connection string formats, security, and troubleshooting ### License diff --git a/migration/mongosync_insights/VALIDATION.md b/migration/mongosync_insights/CONNECTION_STRING.md similarity index 100% rename from migration/mongosync_insights/VALIDATION.md rename to migration/mongosync_insights/CONNECTION_STRING.md diff --git a/migration/mongosync_insights/HTTPS_SETUP.md b/migration/mongosync_insights/HTTPS_SETUP.md index 336222a6..cc282b57 100644 --- a/migration/mongosync_insights/HTTPS_SETUP.md +++ b/migration/mongosync_insights/HTTPS_SETUP.md @@ -486,7 +486,7 @@ sudo systemctl restart apache2 | `MI_SSL_KEY` | `/etc/letsencrypt/live/your-domain/privkey.pem` | Path to SSL private key file | | `MI_PORT` | `3030` | Port to run the application on (use 443 for HTTPS) | | `MI_HOST` | `127.0.0.1` | Host to bind to (use 0.0.0.0 for all interfaces) | -| `MI_SECURE_COOKIES` | `true` | Enable secure cookies (requires HTTPS) | +| `MI_SECURE_COOKIES` | mirrors `MI_SSL_ENABLED` (`false` when HTTPS is off) | Enable secure cookies (set `true` when using HTTPS) | ### Example Configurations diff --git a/migration/mongosync_insights/LOG_VERBOSITY.md b/migration/mongosync_insights/LOG_VERBOSITY.md new file mode 100644 index 00000000..e6b482e4 --- /dev/null +++ b/migration/mongosync_insights/LOG_VERBOSITY.md @@ -0,0 +1,104 @@ +# Log Verbosity and Plot Coverage + +Mongosync Insights reads the structured JSON log lines that mongosync writes to its log files. The amount of data available to plot depends directly on the **verbosity level** used when mongosync was running. + +## Verbosity Hierarchy + +Mongosync uses a cumulative verbosity system. Each level includes all levels below it: + +| Level | Severity order | How to enable | +|-------|---------------|---------------| +| `error` | lowest verbosity | Always present | +| `warn` | ↑ | Always present | +| `info` | ↑ | Default — no flags needed | +| `debug` | ↑ | `--verbosity 1` | +| `trace` | highest verbosity | `--verbosity 2` | + +To set verbosity when starting mongosync: + +```bash +# default (info and above) +mongosync --cluster0 "..." --cluster1 "..." + +# enable debug logs +mongosync --cluster0 "..." --cluster1 "..." --verbosity 1 + +# enable trace logs +mongosync --cluster0 "..." --cluster1 "..." --verbosity 2 +``` + +## Effect on Plots and Features + +The table below lists every chart and panel in Mongosync Insights, the minimum verbosity required, and the exact log pattern that feeds it. + +### Global Migration Metrics + +| Chart / Feature | Min. verbosity | Source log pattern | +|----------------|---------------|--------------------| +| Mongosync Phases (scatter + table) | `info` (default) | `"Starting ... phase"` \| `"Commit handler called"` | +| Lag Time (seconds) | `info` (default) | `"Replication progress"` → `lagTimeSeconds` | +| Est. Source Oplog Time Remaining | `info` (default) | `"Replication progress"` → `estimatedOplogTimeRemaining` | +| Ping Latency — src & dst | `info` (default) | `"Operation duration stats"` → `sourcePingLatencyMs` / `destinationPingLatencyMs` | +| Average Source CRUD Event Rate | `info` (default) | `"Average Source CRUD events rate"` → `srcCRUDEventsPerSec` | + +> **Note:** `Average Source CRUD events rate` is only emitted by **standalone mongosync**. It is not logged during Live Import runs. + +### Collection Copy Metrics + +| Chart / Feature | Min. verbosity | Source log pattern | +|----------------|---------------|--------------------| +| Partition Init Progress (time series) | `info` (default) | `"Creating a single/initial partition for..."` | +| Partition Init Summary — doc count & sampler | `info` (default) | `"Pre-sampling information"` | +| Partition Init Summary — **partition count & duration** | **`debug` (`--verbosity 1`)** | `"Persisted a new partition after sampling"` | +| Data Copied Over Time | `info` (default) | `"sent response"` → `body.progress.collectionCopy.estimatedCopiedBytes` | +| Estimated Total & Copied (bar) | `info` (default) | `"sent response"` → `body.progress.collectionCopy` | +| Partitions Copied (time series + bar) | `info` (default) | `"Completed writing X / Y partitions to destination cluster"` | + +### CEA Metrics + +| Chart / Feature | Min. verbosity | Source log pattern | +|----------------|---------------|--------------------| +| Change Events Applied | `info` (default) | `"Replication progress"` → `totalEventsApplied` | +| Events Rate per Second | `info` (default) | `"Replication progress"` → `eventApplicationRatePerSecond` | +| CEA Source Read — avg / max / ops | `info` (default) | `"Operation duration stats"` → `CEASourceRead` | +| CEA Destination Write — avg / max / ops | `info` (default) | `"Operation duration stats"` → `CEADestinationWrite` | + +### Indexes Metrics + +| Chart / Feature | Min. verbosity | Source log pattern | +|----------------|---------------|--------------------| +| Index Built Over Time | `info` (default) | `"sent response"` → `body.progress.indexBuilding.indexesBuilt` | +| Total and Index Built (bar) | `info` (default) | `"sent response"` → `body.progress.indexBuilding.totalIndexesToBuild` | + +### Verifier Metrics + +| Chart / Feature | Min. verbosity | Source log pattern | +|----------------|---------------|--------------------| +| Source Verifier Lag Time | `warn` (automatic) | Field `verifierSrcLagTimeSeconds` present | +| Destination Verifier Lag Time | `warn` (automatic) | Field `verifierDstLagTimeSeconds` present | + +> **Note:** Verifier lag lines are emitted by mongosync at `warn` level independently of the `--verbosity` flag. No extra configuration is needed — they appear automatically whenever the live verifier is active. + +### Info Tabs (non-chart panels) + +| Panel | Min. verbosity | Source log pattern | +|-------|---------------|--------------------| +| Version Info | `info` (default) | `"Version info"` | +| Start Options | `info` (default) | `"Received request"` filtered by `uri=/api/v1/start` | +| Mongosync Options | `info` (default) | `"Mongosync Options"` | +| Hidden Flags | `info` (default) | `"Mongosync HiddenFlags"` | +| Natural Order Collections | `info` (default) | `reason` field → `"Selected for natural order collection reads"` | + +> **Note:** Version Info, Mongosync Options, and Hidden Flags are startup log lines. If you upload a **rotated or partial log file** that was captured mid-migration (i.e., after mongosync already rotated its initial log), these panels will be empty. The data is present in the earlier rotated log files, not the current one. + +## Summary + +| Verbosity | Charts and panels available | +|-----------|----------------------------| +| Default (`info`, no flags) | Everything **except** Partition Init partition count & duration columns | +| `--verbosity 1` (`debug`) | Full coverage, including Partition Init partition count & duration | +| `--verbosity 2` (`trace`) | No additional plots currently — `trace`-level lines are not yet extracted by Mongosync Insights | + +## Recommendation + +For a complete analysis, capture mongosync logs with at least `--verbosity 1`. This adds one additional log line per partition created (`"Persisted a new partition after sampling"`), which is low volume and has negligible performance impact. All other charts work at the default verbosity level. diff --git a/migration/mongosync_insights/PACKAGING.md b/migration/mongosync_insights/PACKAGING.md index 51bc5836..b6ae53d4 100644 --- a/migration/mongosync_insights/PACKAGING.md +++ b/migration/mongosync_insights/PACKAGING.md @@ -43,7 +43,7 @@ dist/mongosync-insights--1.x86_64.rpm Copy the RPM to the target machine (USB, SCP, etc.) and install: ```bash -sudo rpm -i mongosync-insights-0.8.0.18-1.x86_64.rpm +sudo rpm -i mongosync-insights-0.8.2.8-1.x86_64.rpm ``` No internet access is required. No additional packages need to be installed. diff --git a/migration/mongosync_insights/README.md b/migration/mongosync_insights/README.md index ae306726..df9f77bd 100644 --- a/migration/mongosync_insights/README.md +++ b/migration/mongosync_insights/README.md @@ -63,21 +63,23 @@ pip3 install -r requirements.txt python3 mongosync_insights.py ``` -The application will start and display: +The application will print the access URL to the console, for example: + ``` -Starting Mongosync Insights v0.8.1.14 -Server: 127.0.0.1:3030 + Mongosync Insights v0.8.2.8 + Open in browser: http://127.0.0.1:3030/ ``` +Startup details are also written to `insights.log` (or the path set by `MI_LOG_FILE`). + ### Access the Web Interface Open your web browser and navigate to: + ``` http://localhost:3030 ``` -![Mongosync Logs Analyzer](images/mongosync_insights_home.png) - ## Using Mongosync Insights ### Sidebar Navigation @@ -99,6 +101,7 @@ Results pages include a left sidebar with quick-access buttons: **Duplicate Upload Detection:** If you upload a file with the same name as an existing saved analysis, a dialog will appear offering three options: **Load Previous** (open the saved session without re-parsing), **Replace** (delete the saved session and parse the file again), or **Cancel**. **Supported File Formats:** + - Plain text: `.log`, `.json`, `.out` - Compressed: `.gz`, `.zip`, `.bz2`, `.tar.gz`, `.tgz`, `.tar.bz2` @@ -106,31 +109,38 @@ Compressed files are automatically decompressed during processing. Archives (ZIP **Note**: Log files must be in mongosync's native **NDJSON** (Newline Delimited JSON) format. Each line should be a valid JSON object. +**Log Verbosity and Plot Coverage:** + +Most plots populate with mongosync's **default log level** (`info`). The only chart that requires elevated verbosity is the **Partition Init Summary** table's partition count and duration columns, which need `--verbosity 1` (`debug` level). For full coverage, start mongosync with `--verbosity 1`. See **[LOG_VERBOSITY.md](LOG_VERBOSITY.md)** for the complete breakdown. + **Automatic File Classification:** The tool automatically classifies files based on their filename: -- **Mongosync logs** (`mongosync.log`, `mongosync-*`, `liveimport_*`) -- parsed for migration progress and events -- **Mongosync metrics** (`mongosync_metrics.log`, `mongosync_metrics-*`) -- parsed for mongosync performance metrics + +- **Mongosync logs** (`mongosync.log`, `mongosync-`*, `liveimport_*`) -- parsed for migration progress and events +- **Mongosync metrics** (`mongosync_metrics.log`, `mongosync_metrics-`*) -- parsed for mongosync performance metrics **Results Tabs:** After upload, the results are organized into tabs: -| Tab | Description | -|-----|-------------| -| **Logs** | Migration progress plots: Total/Copied bytes, CEA Reads/Writes, Collection Copy Reads/Writes, Events applied, Lag Time | -| **Metrics** | Mongosync metrics plots (when a `mongosync_metrics` file is uploaded): 40+ metrics across Collection Copy, Core Replication, CEA Reader, CEA CRUD Applier, Hot Documents, Indexes, Buffer Service, Bulk Inserter, and Verifier | -| **Options** | Mongosync configuration options extracted from the logs (with **Copy as Markdown** for easy sharing) | -| **Collections** | Collection-level progress details (with **Copy as Markdown** for easy sharing) | -| **Errors** | Detected error patterns such as oplog rollover, timeouts, verifier mismatches, and write conflicts during cutover | -| **Log Viewer** | Browse recent log lines with severity filtering, semantic focus, multiple view modes (Highlighted, Raw, Pretty JSON, Summary), and full-text search across the entire log file | - -![Mongosync Logs Tab](images/mongosync_logs_logs.png) -![Mongosync Metrics Tab](images/mongosync_logs_metrics.png) -![Mongosync Options Tab](images/mongosync_logs_options.png) -![Mongosync Collections and Partitions Tab](images/mongosync_logs_collections_partitions.png) -![Mongosync Errors and Warnings Tab](images/mongosync_logs_errors.png) -![Mongosync Log Viewer Tab](images/mongosync_logs_logviewer.png) + +| Tab | Description | +| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| **Logs** | Migration progress plots: Total/Copied bytes, CEA Reads/Writes, Collection Copy Reads/Writes, Events applied, Lag Time | +| **Metrics** | Mongosync metrics plots (when a `mongosync_metrics` file is uploaded): 40+ metrics across Collection Copy, Core Replication, CEA Reader, CEA CRUD Applier, Hot Documents, Indexes, Buffer Service, Bulk Inserter, and Verifier | +| **Options** | Mongosync configuration options extracted from the logs (with **Copy as Markdown** for easy sharing) | +| **Collections** | Collection-level progress details (with **Copy as Markdown** for easy sharing) | +| **Errors** | Detected error patterns such as oplog rollover, timeouts, verifier mismatches, and write conflicts during cutover | +| **Log Viewer** | Browse recent log lines with severity filtering, semantic focus, multiple view modes (Highlighted, Raw, Pretty JSON, Summary), and full-text search across the entire log file | + + +Mongosync Logs Tab +Mongosync Metrics Tab +Mongosync Options Tab +Mongosync Collections and Partitions Tab +Mongosync Errors and Warnings Tab +Mongosync Log Viewer Tab #### Analysis Snapshot Persistence @@ -145,55 +155,56 @@ After parsing a log file, the analysis is automatically saved as a **snapshot** ### Option 2: Live Monitoring (Metadata) 1. Enter the MongoDB **connection string** to your destination cluster - - Format: `mongodb+srv://user:password@cluster.mongodb.net/` - - This is where mongosync stores its internal metadata + - Format: `mongodb+srv://user:password@cluster.mongodb.net/` + - This is where mongosync stores its internal metadata 2. Click **"Live Monitor"** 3. The page will refresh automatically every 10 seconds (configurable) showing: - - State - - Phase - - Start and finish time - - Lag time - - Reversible - - Write Blocking Mode - - Build Indexes Method - - Detect Random ID - - Embedded Verifier method - - Namespace Filters - - Partitions Completed - - Data Copied - - Migration Phases - - Collection Progress - -![Mongosync metadata status](images/mongosync_metadata_status.png) -![Mongosync metadata progress](images/mongosync_metadata_progress.png) + - State + - Phase + - Start and finish time + - Lag time + - Reversible + - Write Blocking Mode + - Build Indexes Method + - Detect Random ID + - Embedded Verifier method + - Namespace Filters + - Partitions Completed + - Data Copied + - Migration Phases + - Collection Progress + +Mongosync metadata status +Mongosync metadata progress ### Option 3: Live Monitoring (Progress Endpoint) 1. Enter the MongoDB **Progress Endpoint URL** - - Format: `host:port/api/v1/progress` + - Format: `host:port/api/v1/progress` 2. Click **"Live Monitor"** 3. The page will refresh automatically every 10 seconds (configurable) showing: - - State - - Lag time - - Can Commit - - Can Write - - Phase - - Mongosync ID - - Coordinator ID - - Collection Copy progress - - Direction Mapping (source x destination) - - Source and Destination Ping Latency - - Events applied - - Verification table to compare the status between the source and the destination - - Verification progress based on Document Count - -![Mongosync Endpoint](images/mongosync_endpoint.png) + - State + - Lag time + - Can Commit + - Can Write + - Phase + - Mongosync ID + - Coordinator ID + - Collection Copy progress + - Direction Mapping (source x destination) + - Source and Destination Ping Latency + - Events applied + - Verification table to compare the status between the source and the destination + - Verification progress based on Document Count + +Mongosync Endpoint ### Option 4: Combined Monitoring (Metadata + Progress Endpoint) You can provide **both** the MongoDB connection string and the Progress Endpoint URL to get a comprehensive view that combines data from both sources. Simply fill in both fields and click **"Live Monitor"**. This combined approach provides: + - Full metadata insights from the destination cluster (partitions, collection progress, configuration) - Real-time progress data from the mongosync endpoint (state, lag time, verification status) @@ -215,13 +226,13 @@ The [Embedded Verifier](https://www.mongodb.com/docs/cluster-to-cluster-sync/cur 2. Optionally customize the **Verifier Database Name** (default: `migration_verification_metadata`) 3. Click **"Monitor Verifier"** 4. The page will refresh automatically every 10 seconds (configurable) showing: - - Generation history (Initial Verification, Recheck #1, #2, etc.) - - Per-generation summary with task status (completed, failed, pending, processing) - - Failed tasks details with mismatch information - - Namespace stats (per-namespace verification progress) - - Collection metadata mismatches + - Generation history (Initial Verification, Recheck #1, #2, etc.) + - Per-generation summary with task status (completed, failed, pending, processing) + - Failed tasks details with mismatch information + - Namespace stats (per-namespace verification progress) + - Collection metadata mismatches -![Migration Verifier Dashboard](images/migration_verifier_dashboard.png) +Migration Verifier Dashboard #### Important: Embedded Verifier @@ -235,18 +246,22 @@ The [migration-verifier](https://github.com/mongodb-labs/migration-verifier) is **Key terms:** -| Term | Description | -|------|-------------| -| **Generation** | A round of verification. Generation 0 is the initial full check; subsequent generations are rechecks of changed/failed documents. | -| **FINAL** | Label shown on the dashboard for the last generation — only its failures indicate real mismatches. | + +| Term | Description | +| ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | +| **Generation** | A round of verification. Generation 0 is the initial full check; subsequent generations are rechecks of changed/failed documents. | +| **FINAL** | Label shown on the dashboard for the last generation — only its failures indicate real mismatches. | | **Task statuses** | `added` (unstarted), `processing` (in-progress), `completed` (no issues), `failed` (document mismatch), `mismatch` (collection metadata mismatch). | + **Metadata collections:** -| Collection | Purpose | -|------------|---------| + +| Collection | Purpose | +| -------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | | `verification_tasks` | Tracks each verification task with a generation number, status, and type (`verify` for documents, `verifyCollection` for metadata). | -| `mismatches` | Records document-level mismatches found during verification. | +| `mismatches` | Records document-level mismatches found during verification. | + **Note**: The `MI_VERIFIER_CONNECTION_STRING` environment variable can be used to pre-configure the connection string. When omitted, it falls back to `MI_CONNECTION_STRING`. See **[CONFIGURATION.md](CONFIGURATION.md)** for details. @@ -265,6 +280,7 @@ Configure the application using environment variables. See **[CONFIGURATION.md]( - Security and session settings **Quick Example:** + ```bash export MI_PORT=8080 export MI_REFRESH_TIME=5 @@ -281,6 +297,7 @@ For production deployments, enable HTTPS encryption. See **[HTTPS_SETUP.md](HTTP - Reverse proxy setup with Nginx/Apache (recommended) **Quick Enable HTTPS:** + ```bash export MI_SSL_ENABLED=true export MI_SSL_CERT=/path/to/certificate.pem @@ -296,7 +313,8 @@ For detailed guides, see: - **[PACKAGING.md](PACKAGING.md)** - Build a self-contained RPM for offline/air-gapped deployment - **[CONFIGURATION.md](CONFIGURATION.md)** - Complete environment variables reference, configuration options, and MongoDB connection pooling - **[HTTPS_SETUP.md](HTTPS_SETUP.md)** - Enable HTTPS/SSL for secure deployments -- **[VALIDATION.md](VALIDATION.md)** - Connection string validation, sanitization, and error handling +- **[CONNECTION_STRING.md](CONNECTION_STRING.md)** - Connection string formats, security, and troubleshooting +- **[LOG_VERBOSITY.md](LOG_VERBOSITY.md)** - How mongosync log verbosity levels affect plot and panel coverage ## Security Best Practices @@ -304,16 +322,18 @@ For detailed guides, see: - ✅ Keep SSL certificates up to date with auto-renewal - ✅ Use environment variables for sensitive configuration (never hardcode connection strings) - ✅ The application includes security headers for XSS, CSRF, and clickjacking protection -- ✅ Secure cookies are enabled by default when using HTTPS +- ✅ Set `MI_SECURE_COOKIES=true` (or enable `MI_SSL_ENABLED`) for secure session cookies over HTTPS ## Troubleshooting ### Plots not visible after upload + - Refresh the page - Check the console for error messages - Verify the log file format is correct ### Connection failures (Live Monitoring) + - Verify the connection string format and credentials - Ensure network connectivity to the MongoDB cluster - Check that the mongosync internal database exists @@ -322,8 +342,8 @@ For detailed guides, see: [Apache 2.0](http://www.apache.org/licenses/LICENSE-2.0) -DISCLAIMER ----------- +## DISCLAIMER + Please note: all tools/ scripts in this repo are released for use "AS IS" **without any warranties of any kind**, including, but not limited to their installation, use, or performance. We disclaim any and all warranties, either express or implied, including but not limited to any warranty of noninfringement, merchantability, and/ or fitness @@ -338,4 +358,4 @@ You are responsible for reviewing and testing any scripts you run *thoroughly* b environment. Thanks, -The MongoDB Support Team +The MongoDB Support Team \ No newline at end of file diff --git a/migration/mongosync_insights/blueprints/live.py b/migration/mongosync_insights/blueprints/live.py new file mode 100644 index 00000000..678fd71f --- /dev/null +++ b/migration/mongosync_insights/blueprints/live.py @@ -0,0 +1,316 @@ +import logging + +from flask import Blueprint, jsonify, make_response, render_template, request + +from lib.connection_validator import sanitize_for_display +from lib.live_migration_metrics import ( + gatherEndpointMetrics, + gatherMetrics, + gatherPartitionsMetrics, + plotMetrics, +) +from lib.migration_verifier import gatherVerifierMetrics, plotVerifierMetrics +from lib.session_support import SESSION_COOKIE_NAME, store_session_data +from lib.app_config import ( + CONNECTION_STRING, + PROGRESS_ENDPOINT_URL, + SECURE_COOKIES, + SESSION_TIMEOUT, + VERIFIER_CONNECTION_STRING, + clear_connection_cache, + session_store, + validate_connection, + validate_progress_endpoint_url, +) +from pymongo.errors import InvalidURI, PyMongoError + +bp = Blueprint("live", __name__, url_prefix="/live") + +logger = logging.getLogger(__name__) + + +@bp.route("/") +def live_home(): + if not CONNECTION_STRING: + connection_string_form = """ +

""" + else: + sanitized_connection = sanitize_for_display(CONNECTION_STRING) + connection_string_form = f"

Connecting to Destination Cluster at: {sanitized_connection}

" + + if not PROGRESS_ENDPOINT_URL: + progress_endpoint_form = """ +

""" + else: + progress_endpoint_form = f"

Mongosync Progress Endpoint: {PROGRESS_ENDPOINT_URL}

" + + if not VERIFIER_CONNECTION_STRING: + verifier_connection_string_form = """ +

""" + else: + sanitized_connection = sanitize_for_display(VERIFIER_CONNECTION_STRING) + verifier_connection_string_form = f"

Connecting to Verifier DB at: {sanitized_connection}

" + + return render_template( + "live/home.html", + connection_string_form=connection_string_form, + progress_endpoint_form=progress_endpoint_form, + verifier_connection_string_form=verifier_connection_string_form, + ) + + +@bp.route("/liveMonitoring", methods=["POST"]) +def live_monitoring(): + if CONNECTION_STRING: + target_mongo_uri = CONNECTION_STRING + else: + target_mongo_uri = request.form.get("connectionString") + if target_mongo_uri: + target_mongo_uri = target_mongo_uri.strip() if target_mongo_uri.strip() else None + + if PROGRESS_ENDPOINT_URL: + progress_url = PROGRESS_ENDPOINT_URL + else: + progress_url = request.form.get("progressEndpointUrl") + if progress_url: + progress_url = progress_url.strip() if progress_url.strip() else None + + if not target_mongo_uri and not progress_url: + logger.error("No connection string or progress endpoint URL provided") + return render_template( + "error.html", + error_title="No Input Provided", + error_message="Please provide at least one of the following: MongoDB Connection String or Mongosync Progress Endpoint URL (or both).", + ) + + if progress_url and not validate_progress_endpoint_url(progress_url): + logger.error("Invalid progress endpoint URL format: %s", progress_url) + return render_template( + "error.html", + error_title="Invalid Progress Endpoint URL", + error_message="The Progress Endpoint URL format is invalid. Expected format: host:port/api/v1/progress (e.g., localhost:27182/api/v1/progress)", + ) + + if target_mongo_uri: + try: + validate_connection(target_mongo_uri) + except InvalidURI as e: + clear_connection_cache() + logger.error("Invalid connection string format: %s", e) + return render_template( + "error.html", + error_title="Invalid Connection String", + error_message="The connection string format is invalid. Please check your MongoDB connection string and try again.", + ) + except PyMongoError as e: + clear_connection_cache() + logger.error("Failed to connect: %s", e) + return render_template( + "error.html", + error_title="Connection Failed", + error_message="Could not connect to MongoDB. Please verify your credentials, network connectivity, and that the cluster is accessible.", + ) + except Exception as e: + clear_connection_cache() + logger.error("Unexpected error during connection validation: %s", e) + return render_template( + "error.html", + error_title="Connection Error", + error_message="An unexpected error occurred. Please try again.", + ) + + session_data = { + "connection_string": target_mongo_uri, + "endpoint_url": progress_url, + } + session_id = store_session_data(session_data) + + has_connection_string = bool(target_mongo_uri) + has_endpoint_url = bool(progress_url) + + response = make_response( + plotMetrics( + has_connection_string=has_connection_string, + has_endpoint_url=has_endpoint_url, + ) + ) + + response.set_cookie( + SESSION_COOKIE_NAME, + session_id, + httponly=True, + secure=SECURE_COOKIES, + samesite="Strict", + max_age=SESSION_TIMEOUT, + ) + return response + + +@bp.route("/getLiveMonitoring", methods=["POST"]) +def get_live_monitoring(): + if CONNECTION_STRING: + connection_string = CONNECTION_STRING + else: + session_id = request.cookies.get(SESSION_COOKIE_NAME) + session_data = session_store.get_session(session_id) + connection_string = session_data.get("connection_string") if session_data else None + + if not connection_string: + logger.error("No connection string available for metrics refresh") + return jsonify( + { + "error": "No connection string available. Please refresh the page and re-enter your credentials." + } + ), 400 + + try: + return jsonify(gatherMetrics(connection_string)) + except PyMongoError as e: + clear_connection_cache() + logger.error("Live monitoring metrics failed: %s", e) + return jsonify( + { + "error": "Could not connect to MongoDB. Please verify your credentials, network connectivity, and that the cluster is accessible." + } + ), 503 + + +@bp.route("/getPartitionsData", methods=["POST"]) +def get_partitions_data(): + if CONNECTION_STRING: + connection_string = CONNECTION_STRING + else: + session_id = request.cookies.get(SESSION_COOKIE_NAME) + session_data = session_store.get_session(session_id) + connection_string = session_data.get("connection_string") if session_data else None + + if not connection_string: + logger.error("No connection string available for partitions data refresh") + return jsonify( + { + "error": "No connection string available. Please refresh the page and re-enter your credentials." + } + ), 400 + + try: + return jsonify(gatherPartitionsMetrics(connection_string)) + except PyMongoError as e: + clear_connection_cache() + logger.error("Partitions metrics failed: %s", e) + return jsonify( + { + "error": "Could not connect to MongoDB. Please verify your credentials, network connectivity, and that the cluster is accessible." + } + ), 503 + + +@bp.route("/getEndpointData", methods=["POST"]) +def get_endpoint_data(): + if PROGRESS_ENDPOINT_URL: + endpoint_url = PROGRESS_ENDPOINT_URL + else: + session_id = request.cookies.get(SESSION_COOKIE_NAME) + session_data = session_store.get_session(session_id) + endpoint_url = session_data.get("endpoint_url") if session_data else None + + if not endpoint_url: + logger.error("No progress endpoint URL available for endpoint data refresh") + return jsonify( + { + "error": "No progress endpoint URL available. Please refresh the page and re-enter your credentials." + } + ), 400 + + return jsonify(gatherEndpointMetrics(endpoint_url)) + + +@bp.route("/Verifier", methods=["POST"]) +def verifier(): + if VERIFIER_CONNECTION_STRING: + target_mongo_uri = VERIFIER_CONNECTION_STRING + else: + target_mongo_uri = request.form.get("verifierConnectionString") + if target_mongo_uri: + target_mongo_uri = target_mongo_uri.strip() if target_mongo_uri.strip() else None + + db_name = request.form.get("verifierDbName", "migration_verification_metadata") + if db_name: + db_name = db_name.strip() if db_name.strip() else "migration_verification_metadata" + + if not target_mongo_uri: + logger.error("No connection string provided for migration verifier") + return render_template( + "error.html", + error_title="No Connection String", + error_message="Please provide a MongoDB Connection String for the migration verifier database.", + ) + + try: + validate_connection(target_mongo_uri) + except InvalidURI as e: + logger.error("Invalid connection string format: %s", e) + clear_connection_cache() + return render_template( + "error.html", + error_title="Invalid Connection String", + error_message="The connection string format is invalid. Please check your MongoDB connection string and try again.", + ) + except PyMongoError as e: + logger.error("Failed to connect: %s", e) + clear_connection_cache() + return render_template( + "error.html", + error_title="Connection Failed", + error_message="Could not connect to MongoDB. Please verify your credentials, network connectivity, and that the cluster is accessible.", + ) + except Exception as e: + logger.error("Unexpected error during connection validation: %s", e) + clear_connection_cache() + return render_template( + "error.html", + error_title="Connection Error", + error_message="An unexpected error occurred. Please try again.", + ) + + session_data = { + "verifier_connection_string": target_mongo_uri, + "verifier_db_name": db_name, + } + session_id = store_session_data(session_data) + + response = make_response(plotVerifierMetrics(db_name=db_name)) + + response.set_cookie( + SESSION_COOKIE_NAME, + session_id, + httponly=True, + secure=SECURE_COOKIES, + samesite="Strict", + max_age=SESSION_TIMEOUT, + ) + return response + + +@bp.route("/getVerifierData", methods=["POST"]) +def get_verifier_data(): + session_id = request.cookies.get(SESSION_COOKIE_NAME) + session_data = session_store.get_session(session_id) + + if VERIFIER_CONNECTION_STRING: + connection_string = VERIFIER_CONNECTION_STRING + else: + connection_string = (session_data or {}).get("verifier_connection_string") + + if not connection_string: + logger.error("No connection string available for verifier metrics refresh") + return jsonify( + { + "error": "No connection string available. Please refresh the page and re-enter your credentials." + } + ), 400 + + db_name = (session_data or {}).get("verifier_db_name", "migration_verification_metadata") + return jsonify(gatherVerifierMetrics(connection_string, db_name)) diff --git a/migration/mongosync_insights/blueprints/logs.py b/migration/mongosync_insights/blueprints/logs.py new file mode 100644 index 00000000..7847ef72 --- /dev/null +++ b/migration/mongosync_insights/blueprints/logs.py @@ -0,0 +1,113 @@ +import logging +import os + +from flask import Blueprint, jsonify, render_template, request + +from lib.logs_metrics import upload_file +from lib.log_store_registry import log_store_registry +from lib.snapshot_store import ( + load_snapshot, + list_snapshots as get_snapshot_list, + delete_snapshot as remove_snapshot, +) + +bp = Blueprint("logs", __name__, url_prefix="/logs") + +logger = logging.getLogger(__name__) + + +@bp.route("/") +def logs_home(): + from lib.app_config import ( + MAX_FILE_SIZE, + ) + + max_file_size_gb = MAX_FILE_SIZE / (1024**3) + return render_template("logs/home.html", max_file_size_gb=max_file_size_gb) + + +@bp.route("/uploadLogs", methods=["POST"]) +def upload_logs(): + return upload_file() + + +@bp.route("/search_logs") +def search_logs(): + store_id = request.args.get("store_id", "").strip() + if not store_id: + return jsonify({"error": "Missing store_id parameter"}), 400 + + q = request.args.get("q", "").strip() + level = request.args.get("level", "").strip() + try: + page = max(1, int(request.args.get("page", 1))) + except (ValueError, TypeError): + page = 1 + try: + per_page = min(max(1, int(request.args.get("per_page", 50))), 200) + except (ValueError, TypeError): + per_page = 50 + + store = log_store_registry.open_store(store_id) + if store is None: + return jsonify({"error": "Log store not found or expired"}), 404 + + try: + query = {} + if level: + query["level"] = level + if q: + query["$text"] = q + + result = store.find(query, skip=(page - 1) * per_page, limit=per_page) + result["page"] = page + result["per_page"] = per_page + return jsonify(result) + except Exception as e: + logger.error("Log search error: %s", e) + return jsonify({"error": "Search failed", "detail": str(e)}), 500 + + +@bp.route("/list_snapshots") +def list_snapshots(): + try: + snapshots = get_snapshot_list() + return jsonify(snapshots) + except Exception as e: + logger.error("Error listing snapshots: %s", e) + return jsonify([]) + + +@bp.route("/load_snapshot/") +def load_snapshot_view(snapshot_id): + data = load_snapshot(snapshot_id) + if data is None: + return render_template( + "error.html", + error_title="Snapshot Not Found", + error_message=( + "The requested analysis snapshot was not found or has expired. " + "Please upload and parse the log file again." + ), + ) + + store_id = data.get("log_store_id", "") + if store_id: + from lib.snapshot_store import logstore_path + + db_path = logstore_path(store_id) + if os.path.exists(db_path): + log_store_registry.register(store_id, db_path) + + template_data = data.get("template_data", {}) + return render_template("upload_results.html", **template_data) + + +@bp.route("/delete_snapshot/", methods=["DELETE"]) +def delete_snapshot_view(snapshot_id): + deleted, store_id = remove_snapshot(snapshot_id) + if store_id: + log_store_registry.remove(store_id) + if deleted: + return jsonify({"status": "ok"}) + return jsonify({"error": "Snapshot not found"}), 404 diff --git a/migration/mongosync_insights/build_rpm.sh b/migration/mongosync_insights/build_rpm.sh index 8e136521..36a3a5f9 100755 --- a/migration/mongosync_insights/build_rpm.sh +++ b/migration/mongosync_insights/build_rpm.sh @@ -30,7 +30,7 @@ cd "$SCRIPT_DIR" # --------------------------------------------------------------------------- APP_VERSION=$(python3 -c " import re, pathlib -m = re.search(r'APP_VERSION\s*=\s*\"([^\"]+)\"', pathlib.Path('app_config.py').read_text()) +m = re.search(r'APP_VERSION\s*=\s*\"([^\"]+)\"', pathlib.Path('lib/app_config.py').read_text()) print(m.group(1)) ") echo "==> Building Mongosync Insights v${APP_VERSION}" diff --git a/migration/mongosync_insights/lib/app_config.py b/migration/mongosync_insights/lib/app_config.py index 71e545b6..baae6429 100644 --- a/migration/mongosync_insights/lib/app_config.py +++ b/migration/mongosync_insights/lib/app_config.py @@ -23,7 +23,7 @@ # Application constants APP_NAME = "Mongosync Insights" -APP_VERSION = "0.8.1.14" +APP_VERSION = "0.8.2.8" DEVELOPER_CREDITS = { "copyright": "\u00a9 MongoDB Inc.", @@ -150,7 +150,8 @@ def load_error_patterns(): Load error patterns from external JSON file. Returns: - list: List of dictionaries with 'pattern' and 'friendly_name' keys + list: List of dictionaries with 'pattern' and 'friendly_name' keys, and + optionally 'recommendation' (string shown in the Errors tab for matches). """ import json logger = logging.getLogger(__name__) diff --git a/migration/mongosync_insights/lib/error_patterns.json b/migration/mongosync_insights/lib/error_patterns.json index 5dbf9aa4..10b9882b 100644 --- a/migration/mongosync_insights/lib/error_patterns.json +++ b/migration/mongosync_insights/lib/error_patterns.json @@ -1,159 +1,204 @@ [ - { - "pattern": "resume point may no longer be in the oplog", - "friendly_name": "Oplog rollover" - }, - { - "pattern": "the resume token was not found", - "friendly_name": "Oplog rollover" - }, - { - "pattern": "failed to run inserter service for partition", - "friendly_name": "timeout on dest" - }, - { - "pattern": "timed out after attempting to execute a command on the MongoDB server", - "friendly_name": "timeout on dest" - }, - { - "pattern": "verifier indicated a mismatch", - "friendly_name": "Verifier mismatch" - }, - { - "pattern": "the verifier found one or more mismatches", - "friendly_name": "Verifier mismatch" - }, - { - "pattern": "source cluster was modified after Mongosync final verification began", - "friendly_name": "Write on source during cutover" - }, - { - "pattern": "destination indexes should not be `unique` yet", - "friendly_name": "Verifier: metadata failed (unique index issue)" - }, - { - "pattern": "not authorized on shoreplanner to execute command", - "friendly_name": "Verifier: metadata failed (permissions issue)" - }, - { - "pattern": "failed to update document on destination", - "friendly_name": "addFields update error" - }, - { - "pattern": "namespace conflict detected", - "friendly_name": "Conflicting namespaces" - }, - { - "pattern": "no documents in result", - "friendly_name": "Driver Error" - }, - { - "pattern": "failed to enable index constraints", - "friendly_name": "Failed unique index conversion" - }, - { - "pattern": "unique index was set on the source that is incompatible", - "friendly_name": "Incompatible unique index" - }, - { - "pattern": "Index Checker Service exceeded the maximum number of index creation attempts", - "friendly_name": "Index creation attempts" - }, - { - "pattern": "IndexNotFound", - "friendly_name": "Index not found" - }, - { - "pattern": "turn off profiling", - "friendly_name": "profiling enabled" - }, - { - "pattern": "failed to open change stream using resume token", - "friendly_name": "Server Error" - }, - { - "pattern": "cluster is not sharded", - "friendly_name": "Sharded misconfiguration" - }, - { - "pattern": "is a timeseries collection", - "friendly_name": "timeseries" - }, - { - "pattern": "TooManyFilesOpen", - "friendly_name": "Too many open files" - }, - { - "pattern": "addShardToZone", - "friendly_name": "Unauthorized to run addShardToZone" - }, - { - "pattern": "IndexOptionsConflict", - "friendly_name": "unique and non-unique indexes" - }, - { - "pattern": "Cannot convert the index to unique", - "friendly_name": "Unique index conversion" - }, - { - "pattern": "UserWritesBlocked", - "friendly_name": "User writes blocked" - }, - { - "pattern": "User writes blocked", - "friendly_name": "User writes blocked" - }, - { - "pattern": "the destination cluster was modified after Mongosync final verification began", - "friendly_name": "Write on destination during cutover" - }, - { - "pattern": "bulk write exception: write errors", - "friendly_name": "Bulk Write error (duplicate key)" - }, - { - "pattern": "DDL events are unsupported on pre-6.0 source", - "friendly_name": "DDL event on source" - }, - { - "pattern": "No indexed plans available, and running with 'notablescan'", - "friendly_name": "Query internal partitions error (NoQueryExecutionPlans)" - }, - { - "pattern": "Values in the index key pattern cannot be empty strings", - "friendly_name": "Index creation error (empty string index)" - }, - { - "pattern": "Must have an index compatible with the proposed shard key", - "friendly_name": "Dest collection sharding failure (no compatible index)" - }, - { - "pattern": "Most source writes appear to be from $out aggregations", - "friendly_name": "Source writes from $out aggregations. Consider pausing workloads that run $out aggregations.", - "full_error_message": "Most source writes appear to be from $out aggregations. This might mean that relatively few documents are being constantly updated. Mongosync may not be able to keep pace with the source's workload." - }, - { - "pattern": "The level of CEA parallelization for this collection is low", - "friendly_name": "Hot Documents", - "full_error_message": "The level of CEA parallelization for this collection is low. This might mean that relatively few documents are being constantly updated. Mongosync may not be able to keep pace with the source's workload." - }, - { - "pattern": "refetched db-spec for recreating dst coll for natural scan", - "friendly_name": "Restart Collection Copy in natural order", - "full_error_message": "refetched db-spec for recreating dst coll for natural scan, collSpec: {COLLECTION DETAILS UUID {}}, isSrcDropped: XXX" - }, - { - "pattern": "Change Event Application failed", - "friendly_name": "CEA failed undefined reason" - }, - { - "pattern": "Failed to apply batch #1 for CRUD event application on the destination. Giving up on batch CRUD event application.", - "friendly_name": "Timeout on destination", - "full_error_message": "Failed to apply batch #1 for CRUD event application on the destination. Giving up on batch CRUD event application." - }, - { - "pattern": "Got a fatal error running Mongosync", - "friendly_name": "Fatal error running Mongosync", - "full_error_message": "Got a fatal error running Mongosync" - } - ] + { + "pattern": "resume point may no longer be in the oplog", + "friendly_name": "Oplog rollover", + "recommendation": "resume point may no longer be in the oplog means mongosync’s change stream fell behind and the resume token it needs has already been rolled out of the oplog, so the migration can’t continue from that point (an oplog rollover). Fix by increasing the source oplog window (and destination, if the error is from the embedded verifier), reducing write volume so mongosync can keep up, and restarting the migration from scratch with the larger oplog in place." + }, + { + "pattern": "the resume token was not found", + "friendly_name": "Oplog rollover", + "recommendation": "the resume token was not found means mongosync’s change stream fell behind and the resume token it needs has already been rolled out of the oplog, so the migration can’t continue from that point (an oplog rollover). Fix by increasing the source oplog window (and destination, if the error is from the embedded verifier), reducing write volume so mongosync can keep up, and restarting the migration from scratch with the larger oplog in place." + }, + { + "pattern": "failed to run inserter service for partition", + "friendly_name": "timeout on dest", + "recommendation": "failed to run inserter service for partition is a wrapper error meaning mongosync failed to insert documents for a specific partition into the destination collection; the real cause is the inner error from BufferService.RunForCollection (e.g. duplicate key, auth, network, etc.). Check the next/inner log lines for the specific insert/write error on that namespace, fix the underlying issue (e.g. resolve duplicate keys, permissions, connectivity, timeouts), then rerun the collection copy or the whole migration once the cause is addressed." + }, + { + "pattern": "timed out after attempting to execute a command on the MongoDB server", + "friendly_name": "timeout on dest", + "recommendation": "It means mongosync issued a command to MongoDB but timed out waiting for the server’s response, usually due to server resource contention or a transient connectivity/server-selection problem on the source or destination. Recommendation: check cluster and network health (CPU/disk, scaling events, ReplicaSetNoPrimary, DNS/VPN/PE issues), then reduce load or scale up and, if needed, increase timeoutMS on the affected connection string before restarting/resuming the migration." + }, + { + "pattern": "verifier indicated a mismatch", + "friendly_name": "Verifier mismatch", + "recommendation": "verifier indicated a mismatch means mongosync’s embedded verifier compared source and destination and found at least one data or index mismatch, so verification failed for that namespace. Check the following log lines to see which collection/keys differ, ensure there were no writes to source or destination during verification/cutover, then fix or recopy the affected collections (often by rerunning the migration/verification from scratch once the discrepancy is resolved)." + }, + { + "pattern": "the verifier found one or more mismatches", + "friendly_name": "Verifier mismatch", + "recommendation": "the verifier found one or more mismatches means mongosync’s embedded verifier detected differences between source and destination, so commit failed and verification did not pass. Recommendation: review the verifier warn-level logs to identify the mismatched collections/keys, ensure no writes occurred on source or destination during verification, then recopy those collections or rerun the migration/verification from scratch after fixing the discrepancy." + }, + { + "pattern": "source cluster was modified after Mongosync final verification began", + "friendly_name": "Write on source during cutover", + "recommendation": "It means that during final verification, mongosync’s change stream saw writes/DDL on the source cluster after verification started, so it aborted to avoid cutting over from a moving target. Recommendation: fully stop/block all writes and schema changes on the source before starting cutover/final verification, then rerun verification (or the migration) from scratch once the source is quiescent." + }, + { + "pattern": "destination indexes should not be `unique` yet", + "friendly_name": "Verifier: metadata failed (unique index issue)", + "recommendation": "destination indexes should not be `unique` yet means the verifier found a destination index already marked unique instead of prepareUnique, usually because the destination cluster received writes or manual index changes during migration. Fix by stopping all writes and index changes on the destination, dropping/recreating the affected destination collection (or cluster, if needed), and rerunning the migration so mongosync can handle the unique-index conversion itself." + }, + { + "pattern": "failed to update document on destination", + "friendly_name": "addFields update error", + "recommendation": "failed to update document on destination means mongosync tried to apply an update during change-event application, but the destination update command failed (often due to document shape issues like duplicate field names or other server write errors). Recommendation: check the full log for the underlying write error (e.g. “Document already has a field named …”), fix or clean those problematic documents on the source (remove/rename duplicates or invalid structures, or mark them as hotDocIDs/exclude), then rerun the migration or recopy the affected collection." + }, + { + "pattern": "namespace conflict detected", + "friendly_name": "Conflicting namespaces", + "recommendation": "namespace conflict detected means mongosync tried to create/rename a collection on the destination, but a collection with that same namespace (db.collection) already exists there. Fix by dropping or renaming the conflicting collections on the destination (and ensuring no writes go to dest during migration), then restart the migration from scratch." + }, + { + "pattern": "no documents in result", + "friendly_name": "Driver Error", + "recommendation": "In mongosync, mongo: no documents in result means a driver find/findOne/findOneAndUpdate call expected a single metadata/resume/ checkpoint document, but the query matched nothing (e.g. missing global state or resume-data doc, or a predicate that no longer matches). To fix it, read the full log line to see which document and ID are missing, then either (1) correct your config so that mongosync is pointing at the right metadata DB and all expected instances/IDs are actually started, or (2) reinitialize/restart the migration so mongosync can recreate the required metadata documents cleanly." + }, + { + "pattern": "failed to enable index constraints", + "friendly_name": "Failed unique index conversion", + "recommendation": "failed to enable index constraints means mongosync couldn’t complete the final step of making certain destination indexes fully unique / constraint-enforced (the collMod to enable uniqueness/constraints failed), so unique index conversion did not succeed. Recommendation: inspect the next log lines for the underlying collMod error (e.g. CannotConvertIndexToUnique, legacy unique keys, or writes to the destination during commit), fix the offending index/data or stop all writes on the destination as indicated, then start a fresh migration/commit so mongosync can re-run index constraint enablement cleanly." + }, + { + "pattern": "unique index was set on the source that is incompatible", + "friendly_name": "Incompatible unique index", + "recommendation": "It means mongosync found a unique index on the source whose key pattern is incompatible with the shard key you specified in the migration’s sharding parameters, so it aborts index modification. Recommendation: choose a shard key compatible with the existing unique index (same prefix), or drop/alter the conflicting unique index on the source and retry; if logs also mention legacy unique index keys, resync the source replica set members before rerunning the migration." + }, + { + "pattern": "Index Checker Service exceeded the maximum number of index creation attempts", + "friendly_name": "Index creation attempts", + "recommendation": "Index Checker Service exceeded the maximum number of index creation attempts means mongosync tried to create the same index on the destination several times during commit and hit its built-in retry limit because creation kept failing for that index. Short recommendation: check the surrounding logs for the root index creation error (e.g. IndexOptionsConflict, duplicate/legacy index), fix that on the destination (drop or adjust the conflicting index / options, stop concurrent index changes), then rerun the migration/commit so index creation can succeed in a fresh run." + }, + { + "pattern": "IndexNotFound", + "friendly_name": "Index not found", + "recommendation": "IndexNotFound in mongosync means the destination cluster returned (IndexNotFound) when mongosync tried to run collMod to modify an index (usually while enabling unique index constraints), but that index name doesn’t exist on the destination collection. Fix by reading the full log line to get the namespace and index name, then create or recreate that index on the destination with the expected name/key pattern, and rerun the migration/commit (often from scratch if it’s in the commit/index-conversion phase)." + }, + { + "pattern": "turn off profiling", + "friendly_name": "profiling enabled", + "recommendation": "If you see errors or performance issues with profiling enabled, confirm whether profiling is intentionally on; if not required, reduce the profiling level or disable it after collecting needed diagnostics: https://www.mongodb.com/docs/manual/reference/command/profile/" + }, + { + "pattern": "failed to open change stream using resume token", + "friendly_name": "Oplog rollover", + "recommendation": "failed to open change stream using resume token means mongosync’s change stream fell behind and the resume token it needs has already been rolled out of the oplog, so the migration can’t continue from that point (an oplog rollover). Fix by increasing the source oplog window (and destination, if the error is from the embedded verifier), reducing write volume so mongosync can keep up, and restarting the migration from scratch with the larger oplog in place." + }, + { + "pattern": "cluster is not sharded", + "friendly_name": "Sharded misconfiguration", + "recommendation": "cluster is not sharded means mongosync expected to talk to a sharded cluster, but the node/URI you gave resolves to a non-sharded topology (e.g., replica set or direct shard host), which is an unsupported setup for that migration flow. Fix by using a supported topology and connection string: for sharded sources/destinations connect via mongos URIs, and avoid sharded→replica-set migrations entirely (only RS→RS, RS→SC, or SC→SC are supported)." + }, + { + "pattern": "is a timeseries collection", + "friendly_name": "timeseries", + "recommendation": "If the error points to a timeseries collection, confirm it’s excluded or handled outside mongosync, since timeseries is not supported: https://www.mongodb.com/docs/mongosync/current/reference/limitations/" + }, + { + "pattern": "TooManyFilesOpen", + "friendly_name": "Too many open files", + "recommendation": "TooManyFilesOpen means the MongoDB server (or host) hit the OS limit for open file descriptors, so mongosync’s operation/connection failed at the server side. Recommendation: increase the OS/file-descriptor limit (ulimit / systemd LimitNOFILE / equivalent) to MongoDB-recommended values for the mongod/mongos/mongosync processes and restart them; also reduce unnecessary concurrent connections/files if limits still get hit." + }, + { + "pattern": "addShardToZone", + "friendly_name": "Unauthorized to run addShardToZone", + "recommendation": "addShardToZone in mongosync logs means it tried to run the sharding admin command addShardToZone on the destination cluster but the mongosync user is not authorized (e.g. “not authorized on admin to execute command { addShardToZone: ... }”). Fix by granting the destination user the documented mongosync minimum privileges, including the ability to update config.shards (cluster-level sharding admin / appropriate Atlas role), then rerun the migration." + }, + { + "pattern": "IndexOptionsConflict", + "friendly_name": "unique and non-unique indexes", + "recommendation": "IndexOptionsConflict means MongoDB reported that an index with the same key pattern already exists but with different options (for mongosync specifically, this usually happens when there is both a unique and non-unique index on the same fields). Recommendation: drop either the unique or the non-unique index on the source before running mongosync, run the migration, then recreate the dropped index on both source and destination after commit if the application still needs it." + }, + { + "pattern": "Cannot convert the index to unique", + "friendly_name": "Unique index conversion", + "recommendation": "It means mongosync tried to convert a non-unique index to unique (via collMod), but MongoDB found conflicting documents with duplicate values for that index key, so it returned CannotConvertIndexToUnique. Recommendation: use the logged violations to identify the conflicting documents, update or delete them so the indexed field is truly unique, then rerun the migration/commit so mongosync can retry the unique index conversion successfully." + }, + { + "pattern": "UserWritesBlocked", + "friendly_name": "User writes blocked", + "recommendation": "For “user writes blocked” errors, determine whether this is intentional (cutover, maintenance) or due to disk/threshold conditions, then adjust write‑blocking or free space as appropriate." + }, + { + "pattern": "User writes blocked", + "friendly_name": "User writes blocked", + "recommendation": "For “user writes blocked” errors, determine whether this is intentional (cutover, maintenance) or due to disk/threshold conditions, then adjust write‑blocking or free space as appropriate." + }, + { + "pattern": "the destination cluster was modified after Mongosync final verification began", + "friendly_name": "Write on destination during cutover", + "recommendation": "the destination cluster was modified after Mongosync final verification began means mongosync’s final verification/change stream saw writes or DDL on the destination cluster after verification started, so it aborted to avoid cutting over to a moving target. Recommendation: fully stop/block all writes and schema/index changes on the destination before starting cutover/final verification, then rerun verification (or the migration) from scratch once the destination is quiescent." + }, + { + "pattern": "bulk write exception: write errors", + "friendly_name": "Bulk Write error (duplicate key)", + "recommendation": "bulk write exception: write errors means a bulk insert/update batch to the destination failed, and the server returned one or more write errors—most commonly a duplicate key violating a unique index. Recommendation: check the following log lines for the specific write error code/message (e.g. duplicate key 11000), fix the underlying data/index issue on source/destination, then rerun the affected collection copy or migration." + }, + { + "pattern": "DDL events are unsupported on pre-6.0 source", + "friendly_name": "DDL event on source", + "recommendation": "DDL events are unsupported on pre-6.0 source means a DDL change (create/drop/rename/index) happened on a source cluster running < 6.0, which mongosync cannot handle, so it treats this as a fatal, non-transient error. Short recommendation: delete the migrated data on the destination and restart the migration from scratch, and ensure no DDL runs on the pre-6.0 source during migration (or upgrade the source to 6.0+ if you need DDL support)." + }, + { + "pattern": "No indexed plans available, and running with 'notablescan'", + "friendly_name": "Query internal partitions error (NoQueryExecutionPlans)", + "recommendation": "If mongosync hits “NoQueryExecutionPlans” or notablescan issues, check whether “Require Indexes for All Queries” (notablescan) is enabled and temporarily disable or adjust it for the migration if appropriate." + }, + { + "pattern": "Values in the index key pattern cannot be empty strings", + "friendly_name": "Index creation error (empty string index)", + "recommendation": "It means MongoDB rejected an index create/modify operation because the index key pattern contains an empty-string value, which is not allowed (e.g. { a: \"\" }). Recommendation: identify the failing index in the surrounding logs, fix its definition so all key pattern values are valid (e.g. 1, -1, \"hashed\", \"2dsphere\", etc., but not \"\"), then rerun the index creation and the mongosync step/migration." + }, + { + "pattern": "Must have an index compatible with the proposed shard key", + "friendly_name": "Dest collection sharding failure (no compatible index)", + "recommendation": "It means the destination collection doesn’t have any index whose key pattern is compatible with the shard key mongosync is trying to use when sharding the collection, so the server rejects the operation with Must have an index compatible with the proposed shard key. Recommendation: Create on the destination a proper index that starts with the shard key fields and meets shard-key rules (non-partial, non-sparse, simple collation, correct directions / hashed where needed), then rerun the migration/commit so mongosync can successfully shard the collection." + }, + { + "pattern": "Most source writes appear to be from $out aggregations", + "friendly_name": "Source writes from $out aggregations. Consider pausing workloads that run $out aggregations.", + "full_error_message": "Most source writes appear to be from $out aggregations. This might mean that relatively few documents are being constantly updated. Mongosync may not be able to keep pace with the source's workload.", + "recommendation": "For hot‑document or $out workload warnings, assess skewed write patterns, consider throttling or rescheduling heavy aggregations, and verify CEA parallelism isn’t saturated: https://knowledge.corp.mongodb.com/article/000022720" + }, + { + "pattern": "The level of CEA parallelization for this collection is low", + "friendly_name": "Hot Documents", + "full_error_message": "The level of CEA parallelization for this collection is low. This might mean that relatively few documents are being constantly updated. Mongosync may not be able to keep pace with the source's workload.", + "recommendation": "For hot‑document or $out workload warnings, assess skewed write patterns, consider throttling or rescheduling heavy aggregations, and verify CEA parallelism isn’t saturated: https://knowledge.corp.mongodb.com/article/000022720" + }, + { + "pattern": "refetched db-spec for recreating dst coll for natural scan", + "friendly_name": "Dropped collection to restart Collection Copy in natural order", + "full_error_message": "refetched db-spec for recreating dst coll for natural scan, collSpec: {COLLECTION DETAILS UUID {}}, isSrcDropped: XXX", + "recommendation": "refetched db-spec for recreating dst coll for natural scan is a debug message meaning mongosync is restarting collection copy for that namespace in natural (collection) order: it drops the existing destination collection and recreates it from the source spec before re-copying all documents. This isn’t a fatal error by itself, but if you see it repeatedly you should avoid interruptions/timeouts during collection copy (increase timeouts, reduce load, stabilize network)" + }, + { + "pattern": "Using extraLongTimeoutDstClient in MaybeDeletePartitionOnDestination with timeout of", + "friendly_name": "Deleting documents to restart Collection Copy in natural order", + "full_error_message": "Using extraLongTimeoutDstClient in MaybeDeletePartitionOnDestination with timeout of", + "recommendation": "That log means mongosync is retrying a natural-order partition and is running a potentially large deleteMany on the destination with an extra-long timeout to clear partially copied data. If you see this often on large collections, reduce interruptions/timeouts during collection copy (stabilize network, increase server timeouts, avoid restarts)" + }, + { + "pattern": "Change Event Application failed", + "friendly_name": "CEA failed undefined reason", + "recommendation": "Change Event Application failed means mongosync’s change event application phase (CEA) hit an error while applying change-stream events (inserts/updates/deletes) to the destination; this is a wrapper around the real underlying error. Short recommendation: inspect the following log lines for the specific inner error (e.g. resume token/oplog issue, write/duplicate-key error, auth or timeout), fix that root cause on the relevant cluster, then rerun change event application (typically by restarting the migration/commit)." + }, + { + "pattern": "Failed to apply batch #1 for CRUD event application on the destination. Giving up on batch CRUD event application.", + "friendly_name": "Timeout on destination", + "full_error_message": "Failed to apply batch #1 for CRUD event application on the destination. Giving up on batch CRUD event application.", + "recommendation": "Failed to apply batch #1 for CRUD event application on the destination. Giving up on batch CRUD event application. means a CEA transaction applying a batch of CRUD events to the destination failed with a non-retriable error, so mongosync stopped batch application for that batch (typically a timeout/lock or other dest write error). Recommendation: check the surrounding log lines for the inner destination error (e.g. timeout, lock timeout, TooManyFilesOpen, auth), fix that cluster/timeout/permission issue, then rerun CEA / restart the migration once the destination is healthy." + }, + { + "pattern": "Got a fatal error running Mongosync", + "friendly_name": "Fatal error running Mongosync", + "full_error_message": "Got a fatal error running Mongosync", + "recommendation": "Got a fatal error running Mongosync is a generic top-level message emitted when mongosync hits a non-recoverable error and is about to shut down; the real cause is the labeled error and analysis data attached to that same log entry. Recommendation: find that exact line in the logs and read the preceding/embedded error details (e.g. verifier mismatch, index error, auth, oplog, etc.), fix that specific root cause using its own guidance, then rerun mongosync from scratch or from the appropriate phase." + }, + { + "pattern": "mongosync cannot migrate into namespaces that already have shard tags configured.", + "friendly_name": "Shard tags already exist in destination", + "full_error_message": "destination namespace has preconfigured shard zone ranges (shard tags); mongosync cannot migrate into namespaces that already have shard tags configured. Please remove all shard tag ranges for any namespaces mongosync will migrate into on the destination cluster. Then restart the migration from scratch and add the desired zone ranges after the migration is COMMITTED.", + "recommendation": "Please remove all shard tag ranges for any namespaces mongosync will migrate into on the destination cluster. Then restart the migration from scratch and add the desired zone ranges after the migration is COMMITTED." + } +] \ No newline at end of file diff --git a/migration/mongosync_insights/lib/file_decompressor.py b/migration/mongosync_insights/lib/file_decompressor.py index a2882833..245cf58f 100644 --- a/migration/mongosync_insights/lib/file_decompressor.py +++ b/migration/mongosync_insights/lib/file_decompressor.py @@ -329,20 +329,20 @@ def decompress_zip_classified(file_obj: BinaryIO) -> Iterator[Tuple[bytes, Optio with zipfile.ZipFile(file_obj, 'r') as zf: file_list = zf.namelist() logger.info(f"ZIP archive contains {len(file_list)} file(s): {file_list}") - + for filename in file_list: # Skip directories if filename.endswith('/'): continue - + file_type = classify_file_type(filename) logger.info(f"Processing file from ZIP: {filename} (classified as: {file_type})") - + # Skip files that don't match known patterns if file_type is None: logger.warning(f"Skipping unrecognized file: {filename}") continue - + with zf.open(filename) as inner_file: if filename.lower().endswith('.gz'): # Nested gzip file diff --git a/migration/mongosync_insights/lib/live_migration_metrics.py b/migration/mongosync_insights/lib/live_migration_metrics.py index 7bd5b4cc..8a7f7243 100644 --- a/migration/mongosync_insights/lib/live_migration_metrics.py +++ b/migration/mongosync_insights/lib/live_migration_metrics.py @@ -42,7 +42,7 @@ def gatherMetrics(connection_string): logger.info("Connected to target MongoDB cluster using connection pooling.") except PyMongoError as e: logger.error(f"Failed to connect to target MongoDB: {e}") - exit(1) + raise # Create a subplot for status information (4 rows) fig = make_subplots(rows=4, cols=5, @@ -73,7 +73,7 @@ def gatherMetrics(connection_string): vResumeData = internalDbDst.resumeData.find_one({"_id": "coordinator"}) #Plot mongosync State - vState = vResumeData["state"] + vState = vResumeData.get("state", "NO DATA") if vResumeData else "NO DATA" if vState == 'RUNNING': vColor = 'blue' elif vState == "IDLE": @@ -92,7 +92,8 @@ def gatherMetrics(connection_string): yaxis1=dict(showgrid=False, zeroline=False, showticklabels=False)) #Plot Mongosync Phase - vPhase = vResumeData["syncPhase"].capitalize() + sync_phase = vResumeData.get("syncPhase") if vResumeData else None + vPhase = sync_phase.capitalize() if sync_phase else "No Data" wrapped_phase = "
".join(textwrap.wrap(str(vPhase), width=15)) fig.add_trace(go.Scatter(x=[0], y=[0], text=[wrapped_phase], mode='text', name='Mongosync Phase',textfont=dict(size=17, color="black")), row=1, col=2) fig.update_layout(xaxis2=dict(showgrid=False, zeroline=False, showticklabels=False), @@ -364,7 +365,7 @@ def gatherPartitionsMetrics(connection_string): logger.info("Connected to target MongoDB for progress metrics.") except PyMongoError as e: logger.error(f"Failed to connect to target MongoDB: {e}") - exit(1) + raise # Create subplots for progress view (2x2 grid) fig = make_subplots( diff --git a/migration/mongosync_insights/lib/log_store.py b/migration/mongosync_insights/lib/log_store.py index 6e8f7f49..3dfea7f1 100644 --- a/migration/mongosync_insights/lib/log_store.py +++ b/migration/mongosync_insights/lib/log_store.py @@ -232,6 +232,38 @@ def count(self, query: Optional[dict] = None) -> int: result = self.find(query, skip=0, limit=1) return result['total'] + def fetch_latest_raw_lines(self, limit: int) -> list[str]: + """ + Return up to `limit` raw JSON lines with the newest `timestamp` values, + ordered ascending by time (oldest of the chunk first). + + Used for the Log Viewer tail when archives concatenate members in + non-chronological order so a streaming deque would drop newer events. + """ + if limit <= 0: + return [] + self.flush() + cur = self._conn.execute( + """ + SELECT doc FROM log_lines + WHERE timestamp != '' + ORDER BY timestamp DESC, rowid DESC + LIMIT ? + """, + (limit,), + ) + rows = [r[0] for r in cur.fetchall()] + rows.reverse() + if rows: + return rows + cur = self._conn.execute( + "SELECT doc FROM log_lines ORDER BY rowid DESC LIMIT ?", + (limit,), + ) + rows = [r[0] for r in cur.fetchall()] + rows.reverse() + return rows + @property def total_documents(self) -> int: """Total number of documents in the store.""" diff --git a/migration/mongosync_insights/lib/logs_metrics.py b/migration/mongosync_insights/lib/logs_metrics.py index a3113bb4..8351966c 100644 --- a/migration/mongosync_insights/lib/logs_metrics.py +++ b/migration/mongosync_insights/lib/logs_metrics.py @@ -143,7 +143,8 @@ def upload_file(): error_patterns = [ { 'pattern': re.compile(ep['pattern'], re.IGNORECASE), - 'friendly_name': ep['friendly_name'] + 'friendly_name': ep['friendly_name'], + 'recommendation': ep.get('recommendation', ''), } for ep in error_patterns_config ] @@ -182,7 +183,7 @@ def upload_file(): logs_line_count = 0 metrics_line_count = 0 invalid_json_count = 0 - + # Reset file pointer to beginning file.seek(0) @@ -312,6 +313,7 @@ def upload_file(): if ep['pattern'].search(message): matched_errors.append({ 'friendly_name': ep['friendly_name'], + 'recommendation': ep['recommendation'], 'message': message, 'time': json_obj.get('time', ''), 'level': json_obj.get('level', ''), @@ -329,17 +331,25 @@ def upload_file(): return render_template('error.html', error_title="Invalid File Format", error_message=f"The uploaded file does not contain valid JSON format. Error on line {line_count}: {str(e)}. Please ensure you're uploading a valid mongosync log file in NDJSON format.") - + + log_viewer_lines_out = list(raw_log_tail) + # Finalize log store: flush remaining buffered rows and build FTS index log_store.flush() if log_store.total_documents > 0: log_store.build_fts_index() log_store_registry.register(store_id, db_path) logger.info(f"Log store ready: {log_store.total_documents} documents, store_id={store_id[:8]}...") + try: + _chron = log_store.fetch_latest_raw_lines(LOG_VIEWER_MAX_LINES) + if _chron: + log_viewer_lines_out = _chron + except Exception as _e: + logger.warning(f"Chronological log viewer tail fetch failed, using stream order: {_e}") else: log_store.delete() store_id = '' - + logger.info(f"Processed {line_count} total lines ({logs_line_count} logs, {metrics_line_count} metrics), found {invalid_json_count} invalid JSON lines") logger.info(f"Found: {len(data)} replication progress, {len(version_info_list)} version info, " f"{len(mongosync_ops_stats)} operation stats, {len(mongosync_sent_response)} sent responses, " @@ -607,16 +617,110 @@ def _safe_float(val): index_built_times = [] indexes_built = [] indexes_total = [] + idx_coll_fin_times = [] + idx_coll_fin_vals = [] + idx_coll_tot_times = [] + idx_coll_tot_vals = [] + + def _safe_int_idx(val): + try: + if val is None: + return None + return int(val) + except (ValueError, TypeError): + return None + for response in mongosync_sent_response: try: + t_raw = response.get('time') + if not t_raw: + continue + t = datetime.strptime(t_raw[:26], "%Y-%m-%dT%H:%M:%S.%f") parsed_body = json.loads(response.get('body', '{}')) idx_building = (parsed_body.get('progress') or {}).get('indexBuilding') or {} built = idx_building.get('indexesBuilt') - total = idx_building.get('totalIndexesToBuild') - if built is not None and total is not None and 'time' in response: - index_built_times.append(datetime.strptime(response['time'][:26], "%Y-%m-%dT%H:%M:%S.%f")) + total_idx = idx_building.get('totalIndexesToBuild') + if built is not None and total_idx is not None: + index_built_times.append(t) indexes_built.append(built) - indexes_total.append(total) + indexes_total.append(total_idx) + cf = _safe_int_idx(idx_building.get('collectionsFinished')) + if cf is not None: + idx_coll_fin_times.append(t) + idx_coll_fin_vals.append(cf) + ct = _safe_int_idx(idx_building.get('collectionsTotal')) + if ct is not None: + idx_coll_tot_times.append(t) + idx_coll_tot_vals.append(ct) + except (json.JSONDecodeError, TypeError, ValueError, AttributeError): + continue + + # Estimated seconds to CEA catchup (from sent response progress) + cea_catchup_times = [] + cea_catchup_seconds = [] + + def _safe_int_catchup(val): + try: + if val is None: + return None + return int(val) + except (ValueError, TypeError): + return None + + for response in mongosync_sent_response: + try: + t_raw = response.get('time') + if not t_raw: + continue + t = datetime.strptime(t_raw[:26], "%Y-%m-%dT%H:%M:%S.%f") + parsed_body = json.loads(response.get('body', '{}')) + progress = parsed_body.get('progress') or {} + catchup = _safe_int_catchup(progress.get('estimatedSecondsToCEACatchup')) + if catchup is not None: + cea_catchup_times.append(t) + cea_catchup_seconds.append(float(catchup)) + except (json.JSONDecodeError, TypeError, ValueError, AttributeError): + continue + + # progress.verification (from sent response) for embedded verifier charts + verif_src_scan_times = [] + verif_src_scanned = [] + verif_src_total_coll = [] + verif_dst_scan_times = [] + verif_dst_scanned = [] + verif_dst_total_coll = [] + verif_src_hash_times = [] + verif_src_hashed = [] + verif_src_estimated = [] + verif_dst_hash_times = [] + verif_dst_hashed = [] + verif_dst_estimated = [] + + for response in mongosync_sent_response: + try: + t_raw = response.get('time') + if not t_raw: + continue + t = datetime.strptime(t_raw[:26], "%Y-%m-%dT%H:%M:%S.%f") + parsed_body = json.loads(response.get('body', '{}')) + progress = parsed_body.get('progress') or {} + ver = progress.get('verification') + if not isinstance(ver, dict) or not ver: + continue + src = ver.get('source') or {} + dst = ver.get('destination') or {} + verif_src_scan_times.append(t) + verif_src_scanned.append(_safe_int_catchup(src.get('scannedCollectionCount'))) + verif_src_total_coll.append(_safe_int_catchup(src.get('totalCollectionCount'))) + verif_dst_scan_times.append(t) + verif_dst_scanned.append(_safe_int_catchup(dst.get('scannedCollectionCount'))) + verif_dst_total_coll.append(_safe_int_catchup(dst.get('totalCollectionCount'))) + verif_src_hash_times.append(t) + verif_src_hashed.append(_safe_int_catchup(src.get('hashedDocumentCount'))) + verif_src_estimated.append(_safe_int_catchup(src.get('estimatedDocumentCount'))) + verif_dst_hash_times.append(t) + verif_dst_hashed.append(_safe_int_catchup(dst.get('hashedDocumentCount'))) + verif_dst_estimated.append(_safe_int_catchup(dst.get('estimatedDocumentCount'))) except (json.JSONDecodeError, TypeError, ValueError, AttributeError): continue @@ -672,11 +776,25 @@ def _parse_oplog_time_remaining_minutes(value): all_times.extend(estimatedCopiedBytes_times) if index_built_times: all_times.extend(index_built_times) + if idx_coll_fin_times: + all_times.extend(idx_coll_fin_times) + if idx_coll_tot_times: + all_times.extend(idx_coll_tot_times) if dst_lag_times: all_times.extend(dst_lag_times) if src_lag_times: all_times.extend(src_lag_times) - + if cea_catchup_times: + all_times.extend(cea_catchup_times) + if verif_src_scan_times: + all_times.extend(verif_src_scan_times) + if verif_dst_scan_times: + all_times.extend(verif_dst_scan_times) + if verif_src_hash_times: + all_times.extend(verif_src_hash_times) + if verif_dst_hash_times: + all_times.extend(verif_dst_hash_times) + if all_times: global_min_date = min(all_times) global_max_date = max(all_times) @@ -737,9 +855,10 @@ def _parse_oplog_time_remaining_minutes(value): logger.info(f"Plotting") # Create a subplot for the scatter plots (tables are now in a separate tab) - fig = make_subplots(rows=13, cols=2, subplot_titles=("Mongosync Phases", "Mongosync Phases Table", + fig = make_subplots(rows=17, cols=2, subplot_titles=("Mongosync Phases", "Mongosync Phases Table", "Lag Time (seconds)", "Estimated Source Oplog Time Remaining (minutes)", "Ping Latency (ms)", "Average Source CRUD Event Rate (Events/sec)", + "Est. seconds to CEA catchup", "", "Partition Init Progress", "Partition Init Summary", "Data Copied (" + estimated_total_bytes_unit + ")", "Estimated Total and Copied " + estimated_total_bytes_unit, "Partitions Copied", "Total and Copied Partitions", @@ -748,21 +867,28 @@ def _parse_oplog_time_remaining_minutes(value): "Change Events Applied", "Events Rate per Second", "CEA Source - Avg and Max Read time (ms)", "CEA Source Reads", "CEA Destination - Avg and Max Write time (ms)", "CEA Destination Writes", + "Collections finished", "Collections total / finished", "Index Built", "Total and Index Built", - "Source Verifier Lag Time (seconds)", "Destination Verifier Lag Time (seconds)"), + "Source Verifier Lag Time (seconds)", "Destination Verifier Lag Time (seconds)", + "Verification collections (source)", "Verification collections (destination)", + "Verification document hash (source)", "Verification document hash (destination)"), specs=[ [{}, {"type": "table"}], #Row 1: Mongosync Phases and Phases Table [{}, {}], #Row 2: Lag Time and Estimated Source Oplog Time Remaining [{}, {}], #Row 3: Ping Latency and CRUD Event Rate - [{}, {"type": "table"}], #Row 4: Partition Init Progress and Summary - [{}, {}], #Row 5: Data Copied Over Time + Estimated Total and Copied - [{}, {}], #Row 6: Partitions Copied and Completion % - [{}, {}], #Row 7: Collection Copy Source - [{}, {}], #Row 8: Collection Copy Destination - [{}, {}], #Row 9: Change Events Applied and Events Rate per Second - [{}, {}], #Row 10: CEA Source - [{}, {}], #Row 11: CEA Destination - [{}, {}], #Row 12: Index Built and Total and Index Built - [{}, {}] ]) #Row 13: Verifier Lag + [{}, {}], #Row 4: CEA catchup (col 1); col 2 intentionally empty + [{}, {"type": "table"}], #Row 5: Partition Init Progress and Summary + [{}, {}], #Row 6: Data Copied Over Time + Estimated Total and Copied + [{}, {}], #Row 7: Partitions Copied and Completion % + [{}, {}], #Row 8: Collection Copy Source + [{}, {}], #Row 9: Collection Copy Destination + [{}, {}], #Row 10: Change Events Applied and Events Rate per Second + [{}, {}], #Row 11: CEA Source + [{}, {}], #Row 12: CEA Destination + [{}, {}], #Row 13: Collections (time + summary bars) + [{}, {}], #Row 14: Index Built and Total and Index Built + [{}, {}], #Row 15: Verifier Lag + [{}, {}], #Row 16: Verification collections + [{}, {}] ]) #Row 17: Verification document hash # Add traces @@ -823,32 +949,63 @@ def _parse_oplog_time_remaining_minutes(value): fig.update_yaxes(range=[-1, 1], row=3, col=2) fig.update_xaxes(range=[-1, 1], row=3, col=2) - # Row 4: Partition Init Progress - collections initializing vs completed over time + # Row 4: Estimated seconds to CEA catchup (col 1 only; col 2 left empty per layout) + if cea_catchup_times: + fig.add_trace( + go.Scattergl( + x=cea_catchup_times, + y=cea_catchup_seconds, + mode='lines', + name='Est. seconds to CEA catchup', + legendgroup="groupCEACatchup", + ), + row=4, + col=1, + ) + else: + fig.add_trace( + go.Scatter( + x=[0], + y=[0], + text="NO DATA", + mode='text', + name='CEA catchup estimate', + textfont=dict(size=30, color="black"), + ), + row=4, + col=1, + ) + fig.update_yaxes(range=[-1, 1], row=4, col=1) + fig.update_xaxes(range=[-1, 1], row=4, col=1) + fig.update_xaxes(visible=False, row=4, col=2) + fig.update_yaxes(visible=False, row=4, col=2) + + # Row 5: Partition Init Progress - collections initializing vs completed over time if partition_init_progress_times: total_collections = len(partition_init_data) if partition_init_data else 0 fig.add_trace(go.Scattergl( x=partition_init_progress_times, y=partition_init_progress_in_progress, mode='lines', name='In Progress', line=dict(color='#2196F3'), legendgroup="groupPartitionInitProgress" - ), row=4, col=1) + ), row=5, col=1) fig.add_trace(go.Scattergl( x=partition_init_progress_times, y=partition_init_progress_completed, mode='lines', name='Completed', line=dict(color='#4CAF50'), legendgroup="groupPartitionInitProgress" - ), row=4, col=1) + ), row=5, col=1) if total_collections > 0: fig.add_trace(go.Scattergl( x=[partition_init_progress_times[0], partition_init_progress_times[-1]], y=[total_collections, total_collections], mode='lines', name='Total Collections', line=dict(color='gray', dash='dash'), legendgroup="groupPartitionInitProgress" - ), row=4, col=1) + ), row=5, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Partition Init Progress', textfont=dict(size=30, color="black")), row=4, col=1) - fig.update_yaxes(range=[-1, 1], row=4, col=1) - fig.update_xaxes(range=[-1, 1], row=4, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Partition Init Progress', textfont=dict(size=30, color="black")), row=5, col=1) + fig.update_yaxes(range=[-1, 1], row=5, col=1) + fig.update_xaxes(range=[-1, 1], row=5, col=1) - # Row 4: Partition Init Summary Table + # Row 5: Partition Init Summary Table if partition_init_data: fig.add_trace(go.Table( header=dict(values=["Collection", "Type", "Partitions", "Doc Count", "Duration (s)"]), @@ -859,172 +1016,275 @@ def _parse_oplog_time_remaining_minutes(value): [f"{d['doc_count']:,}" if d['doc_count'] else 'N/A' for d in partition_init_data], [d['duration_sec'] if d['duration_sec'] is not None else 'N/A' for d in partition_init_data], ]) - ), row=4, col=2) + ), row=5, col=2) else: fig.add_trace(go.Table( header=dict(values=["Collection", "Type", "Partitions", "Doc Count", "Duration (s)"]), cells=dict(values=[[], [], [], [], []]) - ), row=4, col=2) + ), row=5, col=2) - # Row 5: Data Copied Over Time + # Row 6: Data Copied Over Time if estimatedCopiedBytes_converted: - fig.add_trace(go.Scattergl(x=estimatedCopiedBytes_times, y=estimatedCopiedBytes_converted, mode='lines', name='Copied ' + estimated_total_bytes_unit, legendgroup="groupTotalCopied"), row=5, col=1) + fig.add_trace(go.Scattergl(x=estimatedCopiedBytes_times, y=estimatedCopiedBytes_converted, mode='lines', name='Copied ' + estimated_total_bytes_unit, legendgroup="groupTotalCopied"), row=6, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Data Copied Over Time',textfont=dict(size=30, color="black")), row=5, col=1) - fig.update_yaxes(range=[-1, 1], row=5, col=1) - fig.update_xaxes(range=[-1, 1], row=5, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Data Copied Over Time',textfont=dict(size=30, color="black")), row=6, col=1) + fig.update_yaxes(range=[-1, 1], row=6, col=1) + fig.update_xaxes(range=[-1, 1], row=6, col=1) - # Row 5: Estimated Total and Copied + # Row 6: Estimated Total and Copied if estimated_total_bytes > 0 or estimated_copied_bytes > 0: - fig.add_trace( go.Bar( name='Estimated ' + estimated_total_bytes_unit + ' to be Copied', x=[estimated_total_bytes_unit], y=[estimated_total_bytes], legendgroup="groupTotalCopied" ), row=5, col=2) - fig.add_trace( go.Bar( name='Estimated Copied ' + estimated_total_bytes_unit, x=[estimated_total_bytes_unit], y=[estimated_copied_bytes], legendgroup="groupTotalCopied"), row=5, col=2) + fig.add_trace( go.Bar( name='Estimated ' + estimated_total_bytes_unit + ' to be Copied', x=[estimated_total_bytes_unit], y=[estimated_total_bytes], legendgroup="groupTotalCopied" ), row=6, col=2) + fig.add_trace( go.Bar( name='Estimated Copied ' + estimated_total_bytes_unit, x=[estimated_total_bytes_unit], y=[estimated_copied_bytes], legendgroup="groupTotalCopied"), row=6, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Estimated Total and Copied',textfont=dict(size=30, color="black")), row=5, col=2) - fig.update_yaxes(range=[-1, 1], row=5, col=2) - fig.update_xaxes(range=[-1, 1], row=5, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Estimated Total and Copied',textfont=dict(size=30, color="black")), row=6, col=2) + fig.update_yaxes(range=[-1, 1], row=6, col=2) + fig.update_xaxes(range=[-1, 1], row=6, col=2) - # Row 6: Partitions Copied Over Time + # Row 7: Partitions Copied Over Time if partition_times: - fig.add_trace(go.Scattergl(x=partition_times, y=partitions_copied, mode='lines', name='Partitions Copied', legendgroup="groupPartitions"), row=6, col=1) - fig.add_trace(go.Scattergl(x=partition_times, y=partitions_total, mode='lines', name='Total Partitions', legendgroup="groupPartitions"), row=6, col=1) + fig.add_trace(go.Scattergl(x=partition_times, y=partitions_copied, mode='lines', name='Partitions Copied', legendgroup="groupPartitions"), row=7, col=1) + fig.add_trace(go.Scattergl(x=partition_times, y=partitions_total, mode='lines', name='Total Partitions', legendgroup="groupPartitions"), row=7, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Partitions Copied', textfont=dict(size=30, color="black")), row=6, col=1) - fig.update_yaxes(range=[-1, 1], row=6, col=1) - fig.update_xaxes(range=[-1, 1], row=6, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Partitions Copied', textfont=dict(size=30, color="black")), row=7, col=1) + fig.update_yaxes(range=[-1, 1], row=7, col=1) + fig.update_xaxes(range=[-1, 1], row=7, col=1) - # Row 6: Total and Copied Partitions + # Row 7: Total and Copied Partitions if partition_times: last_copied = partitions_copied[-1] last_total = partitions_total[-1] - fig.add_trace(go.Bar(name='Total Partitions', x=['Partitions'], y=[last_total], legendgroup="groupPartitions"), row=6, col=2) - fig.add_trace(go.Bar(name='Copied Partitions', x=['Partitions'], y=[last_copied], legendgroup="groupPartitions"), row=6, col=2) + fig.add_trace(go.Bar(name='Total Partitions', x=['Partitions'], y=[last_total], legendgroup="groupPartitions"), row=7, col=2) + fig.add_trace(go.Bar(name='Copied Partitions', x=['Partitions'], y=[last_copied], legendgroup="groupPartitions"), row=7, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Total and Copied Partitions', textfont=dict(size=30, color="black")), row=6, col=2) - fig.update_yaxes(range=[-1, 1], row=6, col=2) - fig.update_xaxes(range=[-1, 1], row=6, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Total and Copied Partitions', textfont=dict(size=30, color="black")), row=7, col=2) + fig.update_yaxes(range=[-1, 1], row=7, col=2) + fig.update_xaxes(range=[-1, 1], row=7, col=2) # Row 7: Collection Copy Source Read if CollectionCopySourceRead or CollectionCopySourceRead_maximum: - fig.add_trace(go.Scattergl(x=times, y=CollectionCopySourceRead, mode='lines', name='Average time (ms)', legendgroup="groupCCSourceRead"), row=7, col=1) - fig.add_trace(go.Scattergl(x=times, y=CollectionCopySourceRead_maximum, mode='lines', name='Maximum time (ms)', legendgroup="groupCCSourceRead"), row=7, col=1) + fig.add_trace(go.Scattergl(x=times, y=CollectionCopySourceRead, mode='lines', name='Average time (ms)', legendgroup="groupCCSourceRead"), row=8, col=1) + fig.add_trace(go.Scattergl(x=times, y=CollectionCopySourceRead_maximum, mode='lines', name='Maximum time (ms)', legendgroup="groupCCSourceRead"), row=8, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Collection Copy Source Read',textfont=dict(size=30, color="black")), row=7, col=1) - fig.update_yaxes(range=[-1, 1], row=7, col=1) - fig.update_xaxes(range=[-1, 1], row=7, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Collection Copy Source Read',textfont=dict(size=30, color="black")), row=8, col=1) + fig.update_yaxes(range=[-1, 1], row=8, col=1) + fig.update_xaxes(range=[-1, 1], row=8, col=1) # Row 7: Collection Copy Source Reads (numOperations) if CollectionCopySourceRead_numOperations: - fig.add_trace(go.Scattergl(x=times, y=CollectionCopySourceRead_numOperations, mode='lines', name='Reads', legendgroup="groupCCSourceRead"), row=7, col=2) + fig.add_trace(go.Scattergl(x=times, y=CollectionCopySourceRead_numOperations, mode='lines', name='Reads', legendgroup="groupCCSourceRead"), row=8, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Collection Copy Source Reads',textfont=dict(size=30, color="black")), row=7, col=2) - fig.update_yaxes(range=[-1, 1], row=7, col=2) - fig.update_xaxes(range=[-1, 1], row=7, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Collection Copy Source Reads',textfont=dict(size=30, color="black")), row=8, col=2) + fig.update_yaxes(range=[-1, 1], row=8, col=2) + fig.update_xaxes(range=[-1, 1], row=8, col=2) # Row 8: Collection Copy Destination Write if CollectionCopyDestinationWrite or CollectionCopyDestinationWrite_maximum: - fig.add_trace(go.Scattergl(x=times, y=CollectionCopyDestinationWrite, mode='lines', name='Average time (ms)', legendgroup="groupCCDestinationWrite"), row=8, col=1) - fig.add_trace(go.Scattergl(x=times, y=CollectionCopyDestinationWrite_maximum, mode='lines', name='Maximum time (ms)', legendgroup="groupCCDestinationWrite"), row=8, col=1) + fig.add_trace(go.Scattergl(x=times, y=CollectionCopyDestinationWrite, mode='lines', name='Average time (ms)', legendgroup="groupCCDestinationWrite"), row=9, col=1) + fig.add_trace(go.Scattergl(x=times, y=CollectionCopyDestinationWrite_maximum, mode='lines', name='Maximum time (ms)', legendgroup="groupCCDestinationWrite"), row=9, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Collection Copy Destination Write',textfont=dict(size=30, color="black")), row=8, col=1) - fig.update_yaxes(range=[-1, 1], row=8, col=1) - fig.update_xaxes(range=[-1, 1], row=8, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Collection Copy Destination Write',textfont=dict(size=30, color="black")), row=9, col=1) + fig.update_yaxes(range=[-1, 1], row=9, col=1) + fig.update_xaxes(range=[-1, 1], row=9, col=1) # Row 8: Collection Copy Destination Writes (numOperations) if CollectionCopyDestinationWrite_numOperations: - fig.add_trace(go.Scattergl(x=times, y=CollectionCopyDestinationWrite_numOperations, mode='lines', name='Writes', legendgroup="groupCCDestinationWrite"), row=8, col=2) + fig.add_trace(go.Scattergl(x=times, y=CollectionCopyDestinationWrite_numOperations, mode='lines', name='Writes', legendgroup="groupCCDestinationWrite"), row=9, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Collection Copy Destination Writes',textfont=dict(size=30, color="black")), row=8, col=2) - fig.update_yaxes(range=[-1, 1], row=8, col=2) - fig.update_xaxes(range=[-1, 1], row=8, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Collection Copy Destination Writes',textfont=dict(size=30, color="black")), row=9, col=2) + fig.update_yaxes(range=[-1, 1], row=9, col=2) + fig.update_xaxes(range=[-1, 1], row=9, col=2) # Row 9: Total Events Applied if totalEventsApplied: - fig.add_trace(go.Scattergl(x=times, y=totalEventsApplied, mode='lines', name='Events', legendgroup="groupEventsAndLags"), row=9, col=1) + fig.add_trace(go.Scattergl(x=times, y=totalEventsApplied, mode='lines', name='Events', legendgroup="groupEventsAndLags"), row=10, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Change Events Applied',textfont=dict(size=30, color="black")), row=9, col=1) - fig.update_yaxes(range=[-1, 1], row=9, col=1) - fig.update_xaxes(range=[-1, 1], row=9, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Change Events Applied',textfont=dict(size=30, color="black")), row=10, col=1) + fig.update_yaxes(range=[-1, 1], row=10, col=1) + fig.update_xaxes(range=[-1, 1], row=10, col=1) # Row 9: Events Rate per Second if eventRatePerSecond: - fig.add_trace(go.Scattergl(x=eventRatePerSecond_times, y=eventRatePerSecond, mode='lines', name='Events/sec', legendgroup="groupEventsAndLags"), row=9, col=2) + fig.add_trace(go.Scattergl(x=eventRatePerSecond_times, y=eventRatePerSecond, mode='lines', name='Events/sec', legendgroup="groupEventsAndLags"), row=10, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Events Rate per Second',textfont=dict(size=30, color="black")), row=9, col=2) - fig.update_yaxes(range=[-1, 1], row=9, col=2) - fig.update_xaxes(range=[-1, 1], row=9, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Events Rate per Second',textfont=dict(size=30, color="black")), row=10, col=2) + fig.update_yaxes(range=[-1, 1], row=10, col=2) + fig.update_xaxes(range=[-1, 1], row=10, col=2) # Row 10: CEA Source Read if CEASourceRead or CEASourceRead_maximum: - fig.add_trace(go.Scattergl(x=times, y=CEASourceRead, mode='lines', name='Average time (ms)', legendgroup="groupCEASourceRead"), row=10, col=1) - fig.add_trace(go.Scattergl(x=times, y=CEASourceRead_maximum, mode='lines', name='Maximum time (ms)', legendgroup="groupCEASourceRead"), row=10, col=1) + fig.add_trace(go.Scattergl(x=times, y=CEASourceRead, mode='lines', name='Average time (ms)', legendgroup="groupCEASourceRead"), row=11, col=1) + fig.add_trace(go.Scattergl(x=times, y=CEASourceRead_maximum, mode='lines', name='Maximum time (ms)', legendgroup="groupCEASourceRead"), row=11, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='CEA Source Read',textfont=dict(size=30, color="black")), row=10, col=1) - fig.update_yaxes(range=[-1, 1], row=10, col=1) - fig.update_xaxes(range=[-1, 1], row=10, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='CEA Source Read',textfont=dict(size=30, color="black")), row=11, col=1) + fig.update_yaxes(range=[-1, 1], row=11, col=1) + fig.update_xaxes(range=[-1, 1], row=11, col=1) # Row 10: CEA Source Reads (numOperations) if CEASourceRead_numOperations: - fig.add_trace(go.Scattergl(x=times, y=CEASourceRead_numOperations, mode='lines', name='Reads', legendgroup="groupCEASourceRead"), row=10, col=2) + fig.add_trace(go.Scattergl(x=times, y=CEASourceRead_numOperations, mode='lines', name='Reads', legendgroup="groupCEASourceRead"), row=11, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='CEA Source Reads',textfont=dict(size=30, color="black")), row=10, col=2) - fig.update_yaxes(range=[-1, 1], row=10, col=2) - fig.update_xaxes(range=[-1, 1], row=10, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='CEA Source Reads',textfont=dict(size=30, color="black")), row=11, col=2) + fig.update_yaxes(range=[-1, 1], row=11, col=2) + fig.update_xaxes(range=[-1, 1], row=11, col=2) # Row 11: CEA Destination Write if CEADestinationWrite or CEADestinationWrite_maximum: - fig.add_trace(go.Scattergl(x=times, y=CEADestinationWrite, mode='lines', name='Average time (ms)', legendgroup="groupCEADestinationWrite"), row=11, col=1) - fig.add_trace(go.Scattergl(x=times, y=CEADestinationWrite_maximum, mode='lines', name='Maximum time (ms)', legendgroup="groupCEADestinationWrite"), row=11, col=1) + fig.add_trace(go.Scattergl(x=times, y=CEADestinationWrite, mode='lines', name='Average time (ms)', legendgroup="groupCEADestinationWrite"), row=12, col=1) + fig.add_trace(go.Scattergl(x=times, y=CEADestinationWrite_maximum, mode='lines', name='Maximum time (ms)', legendgroup="groupCEADestinationWrite"), row=12, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='CEA Destination Write',textfont=dict(size=30, color="black")), row=11, col=1) - fig.update_yaxes(range=[-1, 1], row=11, col=1) - fig.update_xaxes(range=[-1, 1], row=11, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='CEA Destination Write',textfont=dict(size=30, color="black")), row=12, col=1) + fig.update_yaxes(range=[-1, 1], row=12, col=1) + fig.update_xaxes(range=[-1, 1], row=12, col=1) # Row 11: CEA Destination Writes (numOperations) if CEADestinationWrite_numOperations: - fig.add_trace(go.Scattergl(x=times, y=CEADestinationWrite_numOperations, mode='lines', name='Writes during CEA', legendgroup="groupCEADestinationWrite"), row=11, col=2) + fig.add_trace(go.Scattergl(x=times, y=CEADestinationWrite_numOperations, mode='lines', name='Writes during CEA', legendgroup="groupCEADestinationWrite"), row=12, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='CEA Destination Writes',textfont=dict(size=30, color="black")), row=11, col=2) - fig.update_yaxes(range=[-1, 1], row=11, col=2) - fig.update_xaxes(range=[-1, 1], row=11, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='CEA Destination Writes',textfont=dict(size=30, color="black")), row=12, col=2) + fig.update_yaxes(range=[-1, 1], row=12, col=2) + fig.update_xaxes(range=[-1, 1], row=12, col=2) + + # Row 13: Collections finished (time) — Indexes Metrics + if idx_coll_fin_times: + fig.add_trace( + go.Scattergl( + x=idx_coll_fin_times, + y=idx_coll_fin_vals, + mode='lines', + name='Collections finished', + legendgroup="groupIndexCollections", + ), + row=13, + col=1, + ) + else: + fig.add_trace( + go.Scatter( + x=[0], + y=[0], + text="NO DATA", + mode='text', + name='Collections finished', + textfont=dict(size=30, color="black"), + ), + row=13, + col=1, + ) + fig.update_yaxes(range=[-1, 1], row=13, col=1) + fig.update_xaxes(range=[-1, 1], row=13, col=1) + + # Row 13: Collections total vs finished (bars) — same pattern as Total / Indexes Built + if idx_coll_tot_vals or idx_coll_fin_vals: + last_coll_total = idx_coll_tot_vals[-1] if idx_coll_tot_vals else None + last_coll_finished = idx_coll_fin_vals[-1] if idx_coll_fin_vals else None + if last_coll_total is not None: + fig.add_trace( + go.Bar( + name='Total collections', + x=['Collections'], + y=[last_coll_total], + legendgroup="groupIndexCollections", + ), + row=13, + col=2, + ) + if last_coll_finished is not None: + fig.add_trace( + go.Bar( + name='Collections finished (summary)', + x=['Collections'], + y=[last_coll_finished], + legendgroup="groupIndexCollections", + ), + row=13, + col=2, + ) + else: + fig.add_trace( + go.Scatter( + x=[0], + y=[0], + text="NO DATA", + mode='text', + name='Collections summary', + textfont=dict(size=30, color="black"), + ), + row=13, + col=2, + ) + fig.update_yaxes(range=[-1, 1], row=13, col=2) + fig.update_xaxes(range=[-1, 1], row=13, col=2) - # Row 12: Index Built Over Time + # Row 14: Index Built Over Time if index_built_times: - fig.add_trace(go.Scattergl(x=index_built_times, y=indexes_built, mode='lines', name='Indexes Built', legendgroup="groupIndexBuilt"), row=12, col=1) + fig.add_trace(go.Scattergl(x=index_built_times, y=indexes_built, mode='lines', name='Indexes Built', legendgroup="groupIndexBuilt"), row=14, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Index Built', textfont=dict(size=30, color="black")), row=12, col=1) - fig.update_yaxes(range=[-1, 1], row=12, col=1) - fig.update_xaxes(range=[-1, 1], row=12, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Index Built', textfont=dict(size=30, color="black")), row=14, col=1) + fig.update_yaxes(range=[-1, 1], row=14, col=1) + fig.update_xaxes(range=[-1, 1], row=14, col=1) - # Row 12: Total and Index Built + # Row 14: Total and Index Built if index_built_times: last_built = indexes_built[-1] last_total = indexes_total[-1] - fig.add_trace(go.Bar(name='Total Indexes', x=['Indexes'], y=[last_total], legendgroup="groupIndexBuilt"), row=12, col=2) - fig.add_trace(go.Bar(name='Indexes Built', x=['Indexes'], y=[last_built], legendgroup="groupIndexBuilt"), row=12, col=2) + fig.add_trace(go.Bar(name='Total Indexes', x=['Indexes'], y=[last_total], legendgroup="groupIndexBuilt"), row=14, col=2) + fig.add_trace(go.Bar(name='Indexes Built', x=['Indexes'], y=[last_built], legendgroup="groupIndexBuilt"), row=14, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Total and Index Built', textfont=dict(size=30, color="black")), row=12, col=2) - fig.update_yaxes(range=[-1, 1], row=12, col=2) - fig.update_xaxes(range=[-1, 1], row=12, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Total and Index Built', textfont=dict(size=30, color="black")), row=14, col=2) + fig.update_yaxes(range=[-1, 1], row=14, col=2) + fig.update_xaxes(range=[-1, 1], row=14, col=2) - # Row 13: Source Verifier Lag Time + # Row 15: Source Verifier Lag Time if verifierSrcLagTimeSeconds: - fig.add_trace(go.Scattergl(x=src_lag_times, y=verifierSrcLagTimeSeconds, mode='lines', name='Source Verifier Lag Time (seconds)', legendgroup="groupVerifierLag"), row=13, col=1) + fig.add_trace(go.Scattergl(x=src_lag_times, y=verifierSrcLagTimeSeconds, mode='lines', name='Source Verifier Lag Time (seconds)', legendgroup="groupVerifierLag"), row=15, col=1) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Source Verifier Lag Time', textfont=dict(size=30, color="black")), row=13, col=1) - fig.update_yaxes(range=[-1, 1], row=13, col=1) - fig.update_xaxes(range=[-1, 1], row=13, col=1) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Source Verifier Lag Time', textfont=dict(size=30, color="black")), row=15, col=1) + fig.update_yaxes(range=[-1, 1], row=15, col=1) + fig.update_xaxes(range=[-1, 1], row=15, col=1) - # Row 13: Destination Verifier Lag Time + # Row 15: Destination Verifier Lag Time if verifierDstLagTimeSeconds: - fig.add_trace(go.Scattergl(x=dst_lag_times, y=verifierDstLagTimeSeconds, mode='lines', name='Destination Verifier Lag Time (seconds)', legendgroup="groupVerifierLag"), row=13, col=2) + fig.add_trace(go.Scattergl(x=dst_lag_times, y=verifierDstLagTimeSeconds, mode='lines', name='Destination Verifier Lag Time (seconds)', legendgroup="groupVerifierLag"), row=15, col=2) else: - fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Destination Verifier Lag Time', textfont=dict(size=30, color="black")), row=13, col=2) - fig.update_yaxes(range=[-1, 1], row=13, col=2) - fig.update_xaxes(range=[-1, 1], row=13, col=2) + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Destination Verifier Lag Time', textfont=dict(size=30, color="black")), row=15, col=2) + fig.update_yaxes(range=[-1, 1], row=15, col=2) + fig.update_xaxes(range=[-1, 1], row=15, col=2) + + # Row 16: Verification collection scan (source / destination) + if any(v is not None for v in verif_src_scanned) or any(v is not None for v in verif_src_total_coll): + fig.add_trace(go.Scattergl(x=verif_src_scan_times, y=verif_src_scanned, mode='lines', name='Source scanned collections', legendgroup="groupVerifierScan"), row=16, col=1) + fig.add_trace(go.Scattergl(x=verif_src_scan_times, y=verif_src_total_coll, mode='lines', name='Source total collections', legendgroup="groupVerifierScan"), row=16, col=1) + else: + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Verification collections (source)', textfont=dict(size=30, color="black")), row=16, col=1) + fig.update_yaxes(range=[-1, 1], row=16, col=1) + fig.update_xaxes(range=[-1, 1], row=16, col=1) + if any(v is not None for v in verif_dst_scanned) or any(v is not None for v in verif_dst_total_coll): + fig.add_trace(go.Scattergl(x=verif_dst_scan_times, y=verif_dst_scanned, mode='lines', name='Destination scanned collections', legendgroup="groupVerifierScan"), row=16, col=2) + fig.add_trace(go.Scattergl(x=verif_dst_scan_times, y=verif_dst_total_coll, mode='lines', name='Destination total collections', legendgroup="groupVerifierScan"), row=16, col=2) + else: + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Verification collections (destination)', textfont=dict(size=30, color="black")), row=16, col=2) + fig.update_yaxes(range=[-1, 1], row=16, col=2) + fig.update_xaxes(range=[-1, 1], row=16, col=2) + + # Row 17: Verification document hash (source / destination) + if any(v is not None for v in verif_src_hashed) or any(v is not None for v in verif_src_estimated): + fig.add_trace(go.Scattergl(x=verif_src_hash_times, y=verif_src_hashed, mode='lines', name='Source hashed documents', legendgroup="groupVerifierHash"), row=17, col=1) + fig.add_trace(go.Scattergl(x=verif_src_hash_times, y=verif_src_estimated, mode='lines', name='Source estimated documents', legendgroup="groupVerifierHash"), row=17, col=1) + else: + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Verification document hash (source)', textfont=dict(size=30, color="black")), row=17, col=1) + fig.update_yaxes(range=[-1, 1], row=17, col=1) + fig.update_xaxes(range=[-1, 1], row=17, col=1) + if any(v is not None for v in verif_dst_hashed) or any(v is not None for v in verif_dst_estimated): + fig.add_trace(go.Scattergl(x=verif_dst_hash_times, y=verif_dst_hashed, mode='lines', name='Destination hashed documents', legendgroup="groupVerifierHash"), row=17, col=2) + fig.add_trace(go.Scattergl(x=verif_dst_hash_times, y=verif_dst_estimated, mode='lines', name='Destination estimated documents', legendgroup="groupVerifierHash"), row=17, col=2) + else: + fig.add_trace(go.Scatter(x=[0], y=[0], text="NO DATA", mode='text', name='Verification document hash (destination)', textfont=dict(size=30, color="black")), row=17, col=2) + fig.update_yaxes(range=[-1, 1], row=17, col=2) + fig.update_xaxes(range=[-1, 1], row=17, col=2) # Update layout - # 225 per plot (13 rows = 2925) - fig.update_layout(height=2925, width=1450, title_text="Mongosync Replication Progress - " + version_text + " - Timezone info: " + timeZoneInfo, legend_tracegroupgap=190, showlegend=False) + # 225 per plot (17 rows) + fig.update_layout(height=17 * 225, width=1450, title_text="Mongosync Replication Progress - " + version_text + " - Timezone info: " + timeZoneInfo, legend_tracegroupgap=190, showlegend=False) # Force all y-axes to start at 0 for better visual comparison fig.update_yaxes(rangemode='tozero') @@ -1032,10 +1292,10 @@ def _parse_oplog_time_remaining_minutes(value): # Add section label annotations above each section group section_labels = [ ("Global Migration Metrics", 'yaxis'), # row 1 - ("Collection Copy Metrics", 'yaxis6'), # row 4 - ("CEA Metrics", 'yaxis15'), # row 9 - ("Indexes Metrics", 'yaxis21'), # row 12 - ("Verifier Metrics", 'yaxis23'), # row 13 + ("Collection Copy Metrics", 'yaxis8'), # row 5 (partition) + ("CEA Metrics", 'yaxis17'), # row 10 + ("Indexes Metrics", 'yaxis23'), # row 13 (collections) + ("Verifier Metrics", 'yaxis27'), # row 15 ] for section_name, yaxis_key in section_labels: domain = fig.layout[yaxis_key].domain @@ -1053,14 +1313,15 @@ def _parse_oplog_time_remaining_minutes(value): ) # Synchronize X-axis date range across all date-based plots - # Tables at row 1 col 2 and row 4 col 2 are excluded (no date axis) + # Tables at row 1 col 2 and row 5 col 2 are excluded; row 4 col 2 is intentionally empty if global_min_date and global_max_date: fig.update_xaxes(range=[global_min_date, global_max_date], row=1, col=1) for row in range(2, 4): # rows 2-3 (both cols are charts) for col in range(1, 3): fig.update_xaxes(range=[global_min_date, global_max_date], row=row, col=col) fig.update_xaxes(range=[global_min_date, global_max_date], row=4, col=1) - for row in range(5, 14): # rows 5-13 (both cols are charts) + fig.update_xaxes(range=[global_min_date, global_max_date], row=5, col=1) + for row in range(6, 18): # rows 6-17 (both cols are charts) for col in range(1, 3): fig.update_xaxes(range=[global_min_date, global_max_date], row=row, col=col) @@ -1131,7 +1392,8 @@ def _parse_oplog_time_remaining_minutes(value): 'partition_init_data': partition_init_data, 'has_logs_data': has_logs_data, 'has_metrics_data': has_metrics_data, - 'log_viewer_lines': list(raw_log_tail), + 'log_viewer_lines': log_viewer_lines_out, + 'log_viewer_max_lines': LOG_VIEWER_MAX_LINES, 'log_store_id': store_id, } diff --git a/migration/mongosync_insights/lib/session_support.py b/migration/mongosync_insights/lib/session_support.py new file mode 100644 index 00000000..fb9533a8 --- /dev/null +++ b/migration/mongosync_insights/lib/session_support.py @@ -0,0 +1,14 @@ +"""Shared session cookie + merge for live / verifier routes.""" +from flask import request + +from .app_config import session_store + +SESSION_COOKIE_NAME = "mi_session_id" + + +def store_session_data(new_data): + """Merge new_data into the existing session or create a fresh one.""" + session_id = request.cookies.get(SESSION_COOKIE_NAME) + if session_id and session_store.update_session(session_id, new_data): + return session_id + return session_store.create_session(new_data) diff --git a/migration/mongosync_insights/mongosync_insights.py b/migration/mongosync_insights/mongosync_insights.py index 5abf82dd..285f27c9 100644 --- a/migration/mongosync_insights/mongosync_insights.py +++ b/migration/mongosync_insights/mongosync_insights.py @@ -1,501 +1,184 @@ import logging -import sys import os -from flask import Flask, render_template, request, make_response, jsonify, send_from_directory -from lib.logs_metrics import upload_file -from lib.live_migration_metrics import plotMetrics, gatherMetrics, gatherPartitionsMetrics, gatherEndpointMetrics -from lib.migration_verifier import plotVerifierMetrics, gatherVerifierMetrics -from pymongo.errors import InvalidURI, PyMongoError +import sys + +from flask import ( + Flask, + make_response, + render_template, + request, + send_from_directory, +) + +from blueprints.live import bp as live_bp +from blueprints.logs import bp as logs_bp from lib.app_config import ( - setup_logging, validate_config, get_app_info, HOST, PORT, MAX_FILE_SIZE, - REFRESH_TIME, APP_VERSION, DEVELOPER_CREDITS, validate_connection, clear_connection_cache, - SECURE_COOKIES, CONNECTION_STRING, VERIFIER_CONNECTION_STRING, - PROGRESS_ENDPOINT_URL, validate_progress_endpoint_url, session_store, SESSION_TIMEOUT, - LOG_STORE_DIR, LOG_STORE_MAX_AGE_HOURS, + DEVELOPER_CREDITS, + APP_VERSION, + HOST, + LOG_STORE_DIR, + LOG_STORE_MAX_AGE_HOURS, + MAX_FILE_SIZE, + PORT, + get_app_info, + session_store, + setup_logging, + validate_config, ) -from lib.connection_validator import sanitize_for_display from lib.log_store import LogStore from lib.log_store_registry import log_store_registry -from lib.snapshot_store import ( - load_snapshot, list_snapshots as get_snapshot_list, - delete_snapshot as remove_snapshot, cleanup_old_snapshots, -) - -# Cookie name for session ID -SESSION_COOKIE_NAME = 'mi_session_id' - -def _store_session_data(new_data): - """Merge new_data into the existing session or create a fresh one.""" - session_id = request.cookies.get(SESSION_COOKIE_NAME) - if session_id and session_store.update_session(session_id, new_data): - return session_id - return session_store.create_session(new_data) +from lib.session_support import SESSION_COOKIE_NAME +from lib.snapshot_store import cleanup_old_snapshots -# Validate configuration on startup try: validate_config() except (PermissionError, ValueError) as e: print(f"Configuration error: {e}") exit(1) -# Setup logging logger = setup_logging() -# Resolve base path for templates & static assets (supports PyInstaller bundles) -if getattr(sys, 'frozen', False): +if getattr(sys, "frozen", False): _base_path = sys._MEIPASS else: _base_path = os.path.dirname(os.path.abspath(__file__)) -# Create a Flask app -app = Flask(__name__, - template_folder=os.path.join(_base_path, 'templates'), - static_folder=os.path.join(_base_path, 'images'), - static_url_path='/images') -# Configure Flask for file uploads -app.config['MAX_CONTENT_LENGTH'] = MAX_FILE_SIZE - - -@app.route('/static/js/') -def mi_static_js(filename): - """Serve shared JS (static_folder is reserved for /images).""" - return send_from_directory(os.path.join(_base_path, 'static', 'js'), filename) - - -# Add security headers to all responses -@app.after_request -def add_security_headers(response): - """Add security headers to all HTTP responses.""" - # Enforce HTTPS and prevent downgrade attacks - response.headers['Strict-Transport-Security'] = 'max-age=31536000; includeSubDomains' - - # Prevent MIME type sniffing - response.headers['X-Content-Type-Options'] = 'nosniff' - - # Prevent clickjacking attacks - response.headers['X-Frame-Options'] = 'DENY' - - # Control referrer information - response.headers['Referrer-Policy'] = 'no-referrer' - - # Content Security Policy - configured to work with Plotly charts - # Note: Plotly requires 'unsafe-inline' and 'unsafe-eval' for rendering - # Note: blob: is required for Plotly snapshot/download functionality - response.headers['Content-Security-Policy'] = ( - "default-src 'self'; " - "script-src 'self' 'unsafe-inline' 'unsafe-eval' https://cdn.plot.ly; " - "style-src 'self' 'unsafe-inline'; " - "img-src 'self' data: blob: https:; " - "font-src 'self' data:; " - "connect-src 'self' blob:;" +def create_app(): + app = Flask( + __name__, + template_folder=os.path.join(_base_path, "templates"), + static_folder=os.path.join(_base_path, "images"), + static_url_path="/images", ) - - # Additional security headers - response.headers['X-XSS-Protection'] = '1; mode=block' - response.headers['Permissions-Policy'] = 'geolocation=(), microphone=(), camera=()' - - return response - -# Make app version available to all templates -@app.context_processor -def inject_app_version(): - return dict(app_version=APP_VERSION, developer_credits=DEVELOPER_CREDITS) - -# Handle file too large error -@app.errorhandler(413) -def too_large(e): - max_size_mb = MAX_FILE_SIZE / (1024 * 1024) - return render_template('error.html', - error_title="File Too Large", - error_message=f"File size exceeds maximum allowed size ({max_size_mb:.1f} MB)."), 413 - -@app.route('/') -def home_page(): - # Calculate max file size in GB for display - max_file_size_gb = MAX_FILE_SIZE / (1024 * 1024 * 1024) - - if not CONNECTION_STRING: - connection_string_form = ''' -

''' - else: - # Use safe sanitization for display - sanitized_connection = sanitize_for_display(CONNECTION_STRING) - connection_string_form = f"

Connecting to Destination Cluster at: {sanitized_connection}

" - - if not PROGRESS_ENDPOINT_URL: - progress_endpoint_form = ''' -

''' - else: - progress_endpoint_form = f"

Mongosync Progress Endpoint: {PROGRESS_ENDPOINT_URL}

" - - # Migration verifier connection string form - if not VERIFIER_CONNECTION_STRING: - verifier_connection_string_form = ''' -

''' - else: - sanitized_connection = sanitize_for_display(VERIFIER_CONNECTION_STRING) - verifier_connection_string_form = f"

Connecting to Verifier DB at: {sanitized_connection}

" - - return render_template('home.html', - connection_string_form=connection_string_form, - progress_endpoint_form=progress_endpoint_form, - verifier_connection_string_form=verifier_connection_string_form, - max_file_size_gb=max_file_size_gb) - - -@app.route('/logout', methods=['POST']) -def logout(): - session_id = request.cookies.get(SESSION_COOKIE_NAME) - if session_id: - session_store.delete_session(session_id) - log_store_registry.cleanup_expired() - cleanup_old_snapshots(LOG_STORE_DIR, LOG_STORE_MAX_AGE_HOURS) - response = make_response('', 200) - response.delete_cookie(SESSION_COOKIE_NAME) - return response - - -@app.route('/uploadLogs', methods=['POST']) -def uploadLogs(): - return upload_file() - -@app.route('/search_logs') -def search_logs(): - """Full-text search across the uploaded log file via SQLite FTS5.""" - store_id = request.args.get('store_id', '').strip() - if not store_id: - return jsonify({'error': 'Missing store_id parameter'}), 400 - - q = request.args.get('q', '').strip() - level = request.args.get('level', '').strip() - try: - page = max(1, int(request.args.get('page', 1))) - except (ValueError, TypeError): - page = 1 - try: - per_page = min(max(1, int(request.args.get('per_page', 50))), 200) - except (ValueError, TypeError): - per_page = 50 - - store = log_store_registry.open_store(store_id) - if store is None: - return jsonify({'error': 'Log store not found or expired'}), 404 - - try: - query = {} - if level: - query['level'] = level - if q: - query['$text'] = q - - result = store.find(query, skip=(page - 1) * per_page, limit=per_page) - result['page'] = page - result['per_page'] = per_page - return jsonify(result) - except Exception as e: - logger.error(f"Log search error: {e}") - return jsonify({'error': 'Search failed', 'detail': str(e)}), 500 - -@app.route('/list_snapshots') -def list_snapshots_route(): - """Return JSON list of saved analysis snapshots.""" - try: - snapshots = get_snapshot_list() - return jsonify(snapshots) - except Exception as e: - logger.error(f"Error listing snapshots: {e}") - return jsonify([]) - -@app.route('/load_snapshot/') -def load_snapshot_route(snapshot_id): - """Load a saved analysis snapshot and render the results page.""" - data = load_snapshot(snapshot_id) - if data is None: - return render_template('error.html', - error_title="Snapshot Not Found", - error_message="The requested analysis snapshot was not found or has expired. " - "Please upload and parse the log file again.") - - store_id = data.get('log_store_id', '') - if store_id: - from lib.snapshot_store import logstore_path - db_path = logstore_path(store_id) - if os.path.exists(db_path): - log_store_registry.register(store_id, db_path) - - template_data = data.get('template_data', {}) - return render_template('upload_results.html', **template_data) - -@app.route('/delete_snapshot/', methods=['DELETE']) -def delete_snapshot_route(snapshot_id): - """Delete a saved analysis snapshot.""" - deleted, store_id = remove_snapshot(snapshot_id) - if store_id: - log_store_registry.remove(store_id) - if deleted: - return jsonify({'status': 'ok'}) - return jsonify({'error': 'Snapshot not found'}), 404 - - - -@app.route('/liveMonitoring', methods=['POST']) -def liveMonitoring(): - # Get connection string from env var or form (no caching) - if CONNECTION_STRING: - TARGET_MONGO_URI = CONNECTION_STRING - else: - TARGET_MONGO_URI = request.form.get('connectionString') - if TARGET_MONGO_URI: - TARGET_MONGO_URI = TARGET_MONGO_URI.strip() if TARGET_MONGO_URI.strip() else None - - # Get progress endpoint URL from env var or form (no caching) - if PROGRESS_ENDPOINT_URL: - progress_url = PROGRESS_ENDPOINT_URL - else: - progress_url = request.form.get('progressEndpointUrl') - if progress_url: - progress_url = progress_url.strip() if progress_url.strip() else None - - # Validate that at least one field is provided - if not TARGET_MONGO_URI and not progress_url: - logger.error("No connection string or progress endpoint URL provided") - return render_template('error.html', - error_title="No Input Provided", - error_message="Please provide at least one of the following: MongoDB Connection String or Mongosync Progress Endpoint URL (or both).") - - # Validate progress endpoint URL format if provided - if progress_url: - if not validate_progress_endpoint_url(progress_url): - logger.error(f"Invalid progress endpoint URL format: {progress_url}") - return render_template('error.html', - error_title="Invalid Progress Endpoint URL", - error_message="The Progress Endpoint URL format is invalid. Expected format: host:port/api/v1/progress (e.g., localhost:27182/api/v1/progress)") - - # Test MongoDB connection if connection string is provided - if TARGET_MONGO_URI: - try: - # Connection test (network, authentication) - validate_connection(TARGET_MONGO_URI) - - except InvalidURI as e: - clear_connection_cache() - logger.error(f"Invalid connection string format: {e}") - return render_template('error.html', - error_title="Invalid Connection String", - error_message="The connection string format is invalid. Please check your MongoDB connection string and try again.") - except PyMongoError as e: - clear_connection_cache() - logger.error(f"Failed to connect: {e}") - return render_template('error.html', - error_title="Connection Failed", - error_message="Could not connect to MongoDB. Please verify your credentials, network connectivity, and that the cluster is accessible.") - except Exception as e: - clear_connection_cache() - logger.error(f"Unexpected error during connection validation: {e}") - return render_template('error.html', - error_title="Connection Error", - error_message="An unexpected error occurred. Please try again.") + app.config["MAX_CONTENT_LENGTH"] = MAX_FILE_SIZE + + @app.route("/static/js/") + def mi_static_js(filename): + return send_from_directory( + os.path.join(_base_path, "static", "js"), filename + ) + + @app.after_request + def add_security_headers(response): + response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains" + response.headers["X-Content-Type-Options"] = "nosniff" + response.headers["X-Frame-Options"] = "DENY" + response.headers["Referrer-Policy"] = "no-referrer" + response.headers["Content-Security-Policy"] = ( + "default-src 'self'; " + "script-src 'self' 'unsafe-inline' 'unsafe-eval' https://cdn.plot.ly; " + "style-src 'self' 'unsafe-inline'; " + "img-src 'self' data: blob: https:; " + "font-src 'self' data:; " + "connect-src 'self' blob:;" + ) + response.headers["X-XSS-Protection"] = "1; mode=block" + response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()" + return response + + @app.context_processor + def inject_app_version(): + return dict(app_version=APP_VERSION, developer_credits=DEVELOPER_CREDITS) + + @app.errorhandler(413) + def too_large(e): + max_size_mb = MAX_FILE_SIZE / (1024 * 1024) + return ( + render_template( + "error.html", + error_title="File Too Large", + error_message=( + f"File size exceeds maximum allowed size ({max_size_mb:.1f} MB)." + ), + ), + 413, + ) + + @app.route("/") + def hub(): + return render_template("hub.html") + + @app.route("/logout", methods=["POST"]) + def logout(): + session_id = request.cookies.get(SESSION_COOKIE_NAME) + if session_id: + session_store.delete_session(session_id) + log_store_registry.cleanup_expired() + cleanup_old_snapshots(LOG_STORE_DIR, LOG_STORE_MAX_AGE_HOURS) + response = make_response("", 200) + response.delete_cookie(SESSION_COOKIE_NAME) + return response - # Store credentials in server-side in-memory session store (merge into existing session) - session_data = { - 'connection_string': TARGET_MONGO_URI, - 'endpoint_url': progress_url - } - session_id = _store_session_data(session_data) + app.register_blueprint(logs_bp) + app.register_blueprint(live_bp) - # Determine which tabs to show (pass only boolean flags to template, not credentials) - has_connection_string = bool(TARGET_MONGO_URI) - has_endpoint_url = bool(progress_url) - - # Render the metrics page - response = make_response(plotMetrics( - has_connection_string=has_connection_string, - has_endpoint_url=has_endpoint_url - )) - - # Set session ID in a secure cookie - response.set_cookie( - SESSION_COOKIE_NAME, - session_id, - httponly=True, # Prevent JavaScript access - secure=SECURE_COOKIES, # Only send over HTTPS when enabled - samesite='Strict', # CSRF protection - max_age=SESSION_TIMEOUT - ) - - return response + return app -@app.route('/getLiveMonitoring', methods=['POST']) -def getLiveMonitoring(): - # Get connection string from env var or in-memory session store - if CONNECTION_STRING: - connection_string = CONNECTION_STRING - else: - session_id = request.cookies.get(SESSION_COOKIE_NAME) - session_data = session_store.get_session(session_id) - connection_string = session_data.get('connection_string') - - if not connection_string: - logger.error("No connection string available for metrics refresh") - return jsonify({"error": "No connection string available. Please refresh the page and re-enter your credentials."}), 400 - - return jsonify(gatherMetrics(connection_string)) -@app.route('/getPartitionsData', methods=['POST']) -def getPartitionsData(): - # Get connection string from env var or in-memory session store - if CONNECTION_STRING: - connection_string = CONNECTION_STRING - else: - session_id = request.cookies.get(SESSION_COOKIE_NAME) - session_data = session_store.get_session(session_id) - connection_string = session_data.get('connection_string') - - if not connection_string: - logger.error("No connection string available for partitions data refresh") - return jsonify({"error": "No connection string available. Please refresh the page and re-enter your credentials."}), 400 - - return jsonify(gatherPartitionsMetrics(connection_string)) +app = create_app() -@app.route('/getEndpointData', methods=['POST']) -def getEndpointData(): - # Get endpoint URL from env var or in-memory session store - if PROGRESS_ENDPOINT_URL: - endpoint_url = PROGRESS_ENDPOINT_URL - else: - session_id = request.cookies.get(SESSION_COOKIE_NAME) - session_data = session_store.get_session(session_id) - endpoint_url = session_data.get('endpoint_url') - - if not endpoint_url: - logger.error("No progress endpoint URL available for endpoint data refresh") - return jsonify({"error": "No progress endpoint URL available. Please refresh the page and re-enter your credentials."}), 400 - - return jsonify(gatherEndpointMetrics(endpoint_url)) -@app.route('/Verifier', methods=['POST']) -def Verifier(): - """Render the migration verifier monitoring page.""" - # Get connection string from env var or form - if VERIFIER_CONNECTION_STRING: - TARGET_MONGO_URI = VERIFIER_CONNECTION_STRING +def _format_access_url(host: str, port: int, *, use_ssl: bool) -> str: + """Build a browser-friendly URL for the bind host/port.""" + scheme = "https" if use_ssl else "http" + if host in ("0.0.0.0", ""): + display_host = "127.0.0.1" + elif host == "::": + display_host = "[::1]" else: - TARGET_MONGO_URI = request.form.get('verifierConnectionString') - if TARGET_MONGO_URI: - TARGET_MONGO_URI = TARGET_MONGO_URI.strip() if TARGET_MONGO_URI.strip() else None + display_host = host + if ":" in display_host and not display_host.startswith("["): + try: + import ipaddress - # Get database name from form (default: migration_verification_metadata) - db_name = request.form.get('verifierDbName', 'migration_verification_metadata') - if db_name: - db_name = db_name.strip() if db_name.strip() else 'migration_verification_metadata' + if ipaddress.ip_address(display_host).version == 6: + display_host = f"[{display_host}]" + except ValueError: + pass + return f"{scheme}://{display_host}:{port}/" - if not TARGET_MONGO_URI: - logger.error("No connection string provided for migration verifier") - return render_template('error.html', - error_title="No Connection String", - error_message="Please provide a MongoDB Connection String for the migration verifier database.") - # Test MongoDB connection - try: - validate_connection(TARGET_MONGO_URI) - except InvalidURI as e: - logger.error(f"Invalid connection string format: {e}") - clear_connection_cache() - return render_template('error.html', - error_title="Invalid Connection String", - error_message="The connection string format is invalid. Please check your MongoDB connection string and try again.") - except PyMongoError as e: - logger.error(f"Failed to connect: {e}") - clear_connection_cache() - return render_template('error.html', - error_title="Connection Failed", - error_message="Could not connect to MongoDB. Please verify your credentials, network connectivity, and that the cluster is accessible.") - except Exception as e: - logger.error(f"Unexpected error during connection validation: {e}") - clear_connection_cache() - return render_template('error.html', - error_title="Connection Error", - error_message="An unexpected error occurred. Please try again.") +if __name__ == "__main__": + import flask.cli - # Store credentials in server-side in-memory session store (merge into existing session) - session_data = { - 'verifier_connection_string': TARGET_MONGO_URI, - 'verifier_db_name': db_name - } - session_id = _store_session_data(session_data) + flask.cli.show_server_banner = lambda *args, **kwargs: None - # Render the verifier metrics page - response = make_response(plotVerifierMetrics(db_name=db_name)) - - # Set session ID in a secure cookie - response.set_cookie( - SESSION_COOKIE_NAME, - session_id, - httponly=True, - secure=SECURE_COOKIES, - samesite='Strict', - max_age=SESSION_TIMEOUT - ) - - return response + app_info = get_app_info() + logger.info("Starting %s v%s", app_info["name"], app_info["version"]) + logger.info("Log file: %s", app_info["log_file"]) + logger.info("Server: %s:%s", app_info["host"], app_info["port"]) -@app.route('/getVerifierData', methods=['POST']) -def getVerifierData(): - """Get migration verifier metrics data for AJAX refresh.""" - session_id = request.cookies.get(SESSION_COOKIE_NAME) - session_data = session_store.get_session(session_id) + from lib.app_config import SSL_CERT_PATH, SSL_ENABLED, SSL_KEY_PATH - if VERIFIER_CONNECTION_STRING: - connection_string = VERIFIER_CONNECTION_STRING - else: - connection_string = session_data.get('verifier_connection_string') - - if not connection_string: - logger.error("No connection string available for verifier metrics refresh") - return jsonify({"error": "No connection string available. Please refresh the page and re-enter your credentials."}), 400 - - db_name = session_data.get('verifier_db_name', 'migration_verification_metadata') - - return jsonify(gatherVerifierMetrics(connection_string, db_name)) + access_url = _format_access_url(HOST, PORT, use_ssl=SSL_ENABLED) + logger.info("Access URL: %s", access_url) + print(f"\n {app_info['name']} v{app_info['version']}", flush=True) + print(f" Open in browser: {access_url}", flush=True) + if HOST in ("0.0.0.0", "::"): + print(f" (listening on {HOST}:{PORT} — use your machine IP for remote access)", flush=True) + print(flush=True) -if __name__ == '__main__': - # Log startup information - app_info = get_app_info() - logger.info(f"Starting {app_info['name']} v{app_info['version']}") - logger.info(f"Log file: {app_info['log_file']}") - logger.info(f"Server: {app_info['host']}:{app_info['port']}") - - # Clean up expired log store DB files and snapshots from previous runs LogStore.cleanup_old_stores(LOG_STORE_DIR, LOG_STORE_MAX_AGE_HOURS) cleanup_old_snapshots(LOG_STORE_DIR, LOG_STORE_MAX_AGE_HOURS) - - # Import SSL config - from lib.app_config import SSL_ENABLED, SSL_CERT_PATH, SSL_KEY_PATH - - # Run the Flask app with or without SSL + if SSL_ENABLED: import ssl - # Verify certificate files exist if not os.path.exists(SSL_CERT_PATH): - logger.error(f"SSL certificate not found: {SSL_CERT_PATH}") + logger.error("SSL certificate not found: %s", SSL_CERT_PATH) logger.error("Please provide a valid SSL certificate or set MI_SSL_ENABLED=false") exit(1) if not os.path.exists(SSL_KEY_PATH): - logger.error(f"SSL key not found: {SSL_KEY_PATH}") + logger.error("SSL key not found: %s", SSL_KEY_PATH) logger.error("Please provide a valid SSL private key or set MI_SSL_ENABLED=false") exit(1) - - # Create SSL context + context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER) context.load_cert_chain(SSL_CERT_PATH, SSL_KEY_PATH) - + logger.info("HTTPS enabled - Starting with SSL/TLS encryption") - logger.info(f"SSL Certificate: {SSL_CERT_PATH}") + logger.info("SSL Certificate: %s", SSL_CERT_PATH) app.run(host=HOST, port=PORT, ssl_context=context) else: logger.warning("HTTPS disabled - Starting with HTTP (insecure)") diff --git a/migration/mongosync_insights/mongosync_insights.spec b/migration/mongosync_insights/mongosync_insights.spec index 16336380..6803d54f 100644 --- a/migration/mongosync_insights/mongosync_insights.spec +++ b/migration/mongosync_insights/mongosync_insights.spec @@ -33,6 +33,10 @@ a = Analysis( (certifi_path, 'certifi'), ], hiddenimports=[ + 'blueprints', + 'blueprints.logs', + 'blueprints.live', + 'lib.session_support', 'lib.logs_metrics', 'lib.live_migration_metrics', 'lib.migration_verifier', diff --git a/migration/mongosync_insights/static/js/mi-snapshots.js b/migration/mongosync_insights/static/js/mi-snapshots.js index 74bc1899..1ddeabed 100644 --- a/migration/mongosync_insights/static/js/mi-snapshots.js +++ b/migration/mongosync_insights/static/js/mi-snapshots.js @@ -4,6 +4,24 @@ (function () { 'use strict'; + function _miListUrl() { + return (window.MI_LOGS && window.MI_LOGS.list) || '/logs/list_snapshots'; + } + + function _miLoadUrl(id) { + if (window.MI_LOGS && window.MI_LOGS.loadT) { + return window.MI_LOGS.loadT.replace('__ID__', encodeURIComponent(id)); + } + return '/logs/load_snapshot/' + encodeURIComponent(id); + } + + function _miDeleteUrl(id) { + if (window.MI_LOGS && window.MI_LOGS.delT) { + return window.MI_LOGS.delT.replace('__ID__', encodeURIComponent(id)); + } + return '/logs/delete_snapshot/' + encodeURIComponent(id); + } + function formatFileSize(bytes) { if (!bytes || bytes === 0) return ''; if (bytes >= 1073741824) return (bytes / 1073741824).toFixed(1) + ' GB'; @@ -30,7 +48,7 @@ } function fetchSnapshotsJson() { - return fetch('/list_snapshots').then(function (r) { + return fetch(_miListUrl()).then(function (r) { return r.json(); }); } @@ -73,7 +91,7 @@ var loadLink = document.createElement('a'); loadLink.className = 'prev-analysis-load'; - loadLink.setAttribute('href', '/load_snapshot/' + encodeURIComponent(s.snapshot_id)); + loadLink.setAttribute('href', _miLoadUrl(s.snapshot_id)); loadLink.textContent = 'Load'; var deleteButton = document.createElement('button'); @@ -134,7 +152,7 @@ var loadLink = document.createElement('a'); loadLink.className = 'upload-dialog-load-btn'; - loadLink.setAttribute('href', '/load_snapshot/' + encodeURIComponent(snapshotId)); + loadLink.setAttribute('href', _miLoadUrl(snapshotId)); loadLink.textContent = 'Load'; var deleteButton = document.createElement('button'); @@ -183,7 +201,7 @@ window.deleteSnapshot = function (id) { if (!confirm('Delete this saved analysis?')) return; - fetch('/delete_snapshot/' + encodeURIComponent(id), { method: 'DELETE' }) + fetch(_miDeleteUrl(id), { method: 'DELETE' }) .then(function () { loadPreviousAnalyses(); }) @@ -212,7 +230,7 @@ window.udDeleteSnapshot = function (id) { if (!confirm('Delete this saved analysis?')) return; - fetch('/delete_snapshot/' + encodeURIComponent(id), { method: 'DELETE' }) + fetch(_miDeleteUrl(id), { method: 'DELETE' }) .then(function () { openUploadDialog(); }) @@ -285,7 +303,7 @@ if (_dupState.matches.length > 0) { var loading = document.getElementById('uploadLoadingOverlay'); if (loading) loading.classList.add('active'); - window.location.href = '/load_snapshot/' + encodeURIComponent(_dupState.matches[0].snapshot_id); + window.location.href = _miLoadUrl(_dupState.matches[0].snapshot_id); } }; @@ -295,7 +313,7 @@ var loading = document.getElementById('uploadLoadingOverlay'); if (loading) loading.classList.add('active'); var delPromises = _dupState.matches.map(function (s) { - return fetch('/delete_snapshot/' + encodeURIComponent(s.snapshot_id), { method: 'DELETE' }).catch(function () {}); + return fetch(_miDeleteUrl(s.snapshot_id), { method: 'DELETE' }).catch(function () {}); }); Promise.all(delPromises).then(function () { if (_dupState.form) _dupState.form.submit(); diff --git a/migration/mongosync_insights/templates/_theme_vars.html b/migration/mongosync_insights/templates/_theme_vars.html index 734b3d55..4bb03de2 100644 --- a/migration/mongosync_insights/templates/_theme_vars.html +++ b/migration/mongosync_insights/templates/_theme_vars.html @@ -56,6 +56,7 @@ --mi-accent-warning: #FFC010; --mi-accent-info: #016BF8; --mi-accent-danger: #DB3030; + --mi-accent: var(--mi-accent-success); } /* Light theme overrides */ diff --git a/migration/mongosync_insights/templates/base.html b/migration/mongosync_insights/templates/base.html index 4f40258e..b39bbaa9 100644 --- a/migration/mongosync_insights/templates/base.html +++ b/migration/mongosync_insights/templates/base.html @@ -6,6 +6,13 @@ {% block title %}Mongosync Insights{% endblock %} {% include '_theme_vars.html' %} +