-
Notifications
You must be signed in to change notification settings - Fork 1.3k
[layer1] CDB Commands with Advertised MaxDuration #2343
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
pnakka28
wants to merge
2
commits into
sonic-net:master
Choose a base branch
from
pnakka28:max_duration_timeout
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+210
−0
Open
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,210 @@ | ||
| # CDB Commands with Advertised MaxDuration | ||
|
|
||
| ## Table of Contents | ||
| - [List of Tables](#list-of-tables) | ||
| - [1. Revision](#1-revision) | ||
| - [2. Scope](#2-scope) | ||
| - [3. Definitions/Abbreviations](#3-definitionsabbreviations) | ||
| - [4. Overview](#4-overview) | ||
| - [5. Requirements](#5-requirements) | ||
| - [6. CDB Timeouts](#6-cdb-timeouts) | ||
| - [7. High-Level Design](#7-high-level-design) | ||
| - [7.1 Timeout Resolution Flow](#71-timeout-resolution-flow) | ||
| - [7.2 Commands Using Module Advertised MaxDuration (CMIS Compliant)](#72-commands-using-module-advertised-maxduration-cmis-compliant) | ||
| - [7.3 CDB Commands in the Firmware Upgrade Workflow](#73-cdb-commands-in-the-firmware-upgrade-workflow) | ||
| - [7.4 Repository to Change](#74-repository-to-change) | ||
| - [8. SAI API](#8-sai-api) | ||
| - [9. Configuration and Management](#9-configuration-and-management) | ||
| - [10. Warmboot and Fastboot Design Impact](#10-warmboot-and-fastboot-design-impact) | ||
| - [11. Restrictions/Limitations](#11-restrictionslimitations) | ||
| - [12. Testing Requirements](#12-testing-requirements) | ||
| - [References](#references) | ||
|
|
||
| ## List of Tables | ||
| * [Table 1: Definitions](#table-1-definitions) | ||
| * [Table 2: Commands Using Advertised MaxDuration](#table-2-commands-using-advertised-maxduration) | ||
| * [Table 3: Upgrade Workflow CDB Commands](#table-3-upgrade-workflow-cdb-commands) | ||
|
|
||
| ## 1. Revision | ||
| | Rev | Date | Author | Change Description | | ||
| |-----|------------|--------|--------------------| | ||
| | 0.1 | 2026-05-19 | Pavan Kalyan Nakka | Initial version | | ||
| | 0.2 | 2026-05-20 | Pavan Kalyan Nakka | Addressed review comments | | ||
|
|
||
| ## 2. Scope | ||
|
|
||
| This document describes changes to use module advertised MaxDuration timeouts for CDB firmware commands, as defined in [CMIS 5.3](https://www.oiforum.com/wp-content/uploads/OIF-CMIS-05.3.pdf). It proposes which commands should use the module advertised MaxDuration values and which will retain fixed defaults for command execution time. | ||
|
|
||
| ## 3. Definitions/Abbreviations | ||
|
|
||
| #### Table 1: Definitions | ||
| | **Term** | **Definition/Abbreviations** | | ||
| |-----------------|-----------------------------------------------------------------------| | ||
| | CDB | Command Data Block | | ||
| | CMIS | Common Management Interface Specification | | ||
| | LPL | Local Payload | | ||
| | EPL | Extended Payload | | ||
| | FW | Firmware | | ||
| | xcvrd | Transceiver Daemon | | ||
|
|
||
| ## 4. Overview | ||
|
|
||
| The [CMIS 5.3](https://www.oiforum.com/wp-content/uploads/OIF-CMIS-05.3.pdf) defines per-command **MaxDuration** fields on Page 9Fh (bytes 144–153) that advertise the maximum time a module requires to execute specific CDB commands. These are U16 values in milliseconds, scaled by a multiplier **M** defined by bit 137.3: | ||
|
|
||
| - `0b` -> M = 1 (timeout in 1 ms units) | ||
| - `1b` -> M = 10 (timeout in 10 ms units) | ||
|
|
||
| The advertised MaxDuration values (bytes 144–153) are multiplied by M to get the effective timeout in milliseconds. The module populates these fields in the CDB 0041h (Firmware Management Features) reply. | ||
|
|
||
| **For example**: If MaxDurationStart is 100 and M = 10, the timeout for the Start (0101h) command is 100 × 10 = 1000 ms. | ||
|
|
||
| The current implementation applies a fixed timeout to most CDB commands. This can cause: | ||
|
|
||
| - **Premature timeouts** — Some commands may require more time to complete depending on the module. | ||
| - **Non-compliance** — the host must respect the module advertised MaxDuration when present. | ||
|
|
||
| This change uses the module advertised MaxDuration as the per-command timeout where defined. Commands without a MaxDuration field retain fixed defaults. | ||
|
|
||
| ## 5. Requirements | ||
|
|
||
| ### Functional Requirements | ||
|
|
||
| 1. For CDB commands where the CMIS defines a MaxDuration field, use the module advertised value (plus a safety margin) as the command timeout. | ||
| 2. For CDB commands where the CMIS does not define a MaxDuration field, use the fixed default timeout of 9960 ms (defined in [sonic-platform-common](https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/sonic_xcvr/cdb/cdb.py#L65-L66)). | ||
| 3. Fall back to the default timeout when CDB 0041h is not supported by the module or returns invalid data. | ||
| 4. Continue to use CDB status polling for early completion detection regardless of timeout value. | ||
| 5. Add a safety margin on top of the advertised MaxDuration to account for latency. | ||
|
|
||
| ### Non-Functional Requirements | ||
|
|
||
| 1. No increase in firmware upgrade time for modules that already complete within the current timeout window. | ||
| 2. Backward compatibility with modules that do not support CDB 0041h. | ||
|
|
||
| ## 6. CDB Timeouts | ||
|
|
||
| The current default firmware CDB commands timeout in SONiC is **9960 ms**, defined in [sonic-platform-common](https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/sonic_xcvr/cdb/cdb.py#L65-L66). This fixed value is used for all firmware related CDB commands today. | ||
|
|
||
| This change modifies the timeout in the CDB command execution path (`CdbFwHandler` in `sonic-platform-common/sonic_platform_base/sonic_xcvr/cdb/cdb_fw.py`). The firmware upgrade flow queries MaxDuration values via CDB 0041h at the start, then passes the appropriate timeout (with a safety margin) to each subsequent CDB command: | ||
|
|
||
| ``` | ||
| ┌──────────────────────────────────────────────────────────────────────┐ | ||
| │ Firmware Upgrade │ | ||
| ├──────────────────────────────────────────────────────────────────────┤ | ||
| │ CdbFwHandler │ | ||
| │ │ │ | ||
| │ ├── Query module for MaxDuration values (CDB 0041h) │ | ||
| │ │ │ │ | ||
| │ │ └── Parse per-command MaxDuration from response │ | ||
| │ │ │ | ||
| │ ├── Abort FW Download (0102h) ── use advertised timeout │ | ||
| │ ├── Start FW Download (0101h) ── use advertised timeout │ | ||
| │ ├── Write FW Block (0103h/0104h) ── use advertised timeout │ | ||
| │ ├── Complete FW Download (0107h) ── use advertised timeout │ | ||
| │ ├── Run FW Image (0109h) ── use default timeout │ | ||
| │ └── Commit FW Image (010Ah) ── use default timeout │ | ||
| └──────────────────────────────────────────────────────────────────────┘ | ||
| ``` | ||
| **Note:** The effective timeout for advertised commands is `MaxDuration × M + SAFETY_MARGIN`. This ensures the module is not timed out prematurely due to communication overhead outside of the module's control. | ||
|
|
||
| ## 7. High-Level Design | ||
|
|
||
| ### 7.1 Timeout Resolution Flow | ||
|
|
||
| When a CDB command is issued during firmware upgrade, the timeout is determined as follows: | ||
|
|
||
| ``` | ||
| CDB command issued | ||
| │ | ||
| ▼ | ||
| Does the CMIS spec define a MaxDuration for this command (via CDB 0041h)? | ||
| │ | ||
| ├── Yes ──► Use the module advertised MaxDuration + SAFETY_MARGIN as the timeout ─┐ | ||
| │ │ | ||
| └── No ──► Use the fixed default timeout ─────────────────────────────────────────┤ | ||
| │ | ||
| ▼ | ||
| Poll CDB status register for early completion regardless of timeout value | ||
| ``` | ||
|
|
||
| ### 7.2 Commands Using Module Advertised MaxDuration (CMIS Compliant) | ||
|
|
||
| The following commands have their MaxDuration defined on Page 9Fh (bytes 144–153) and returned in the CDB 0041h response: | ||
|
|
||
| #### Table 2: Commands Using Advertised MaxDuration | ||
| | Command | CMD ID | MaxDuration Field | Byte Offset (Page 9Fh) | | ||
| |---------|--------|-------------------|------------------------| | ||
| | Start FW Download | 0101h | MaxDurationStart | 144–145 | | ||
| | Abort FW Download | 0102h | MaxDurationAbort | 146–147 | | ||
| | Write FW Block (LPL) | 0103h | MaxDurationWrite | 148–149 | | ||
| | Write FW Block (EPL) | 0104h | MaxDurationWrite | 148–149 | | ||
| | Complete FW Download | 0107h | MaxDurationComplete | 150–151 | | ||
|
|
||
| If CDB 0041h is not supported, these commands fall back to the default timeout. | ||
|
|
||
| ### 7.3 CDB Commands in the Firmware Upgrade Workflow | ||
|
|
||
| The following table lists all CDB commands involved in the firmware upgrade process along with proposed timeout behavior. Commands marked "Default" use the fixed default timeout of **9960 ms** (as defined in [sonic-platform-common](https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/sonic_xcvr/cdb/cdb.py#L65-L66)) except the Run FW Image command that has its own fixed timeout value (as defined in [sonic-platform-common](https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/sonic_xcvr/fields/cdb_consts.py#L61)): | ||
|
|
||
| #### Table 3: Upgrade Workflow CDB Commands | ||
| | # | Command | CMD ID | MaxDuration | | ||
| |---|---------|--------|-------------| | ||
| | 1 | Firmware Management Features | 0041h | Default | | ||
| | 2 | Get FW Info | 0100h | Default | | ||
| | 3 | Start FW Download | 0101h | Advertised | | ||
| | 4 | Abort FW Download | 0102h | Advertised | | ||
| | 5 | Write FW Block (LPL) | 0103h | Advertised | | ||
| | 6 | Write FW Block (EPL) | 0104h | Advertised | | ||
| | 7 | Complete FW Download | 0107h | Advertised | | ||
| | 8 | Run FW Image | 0109h | Default | | ||
| | 9 | Commit FW Image | 010Ah | Default | | ||
| | 10 | Enter Password | 0001h | Default | | ||
|
|
||
| ### 7.4 Repository to Change | ||
|
|
||
| | Repository | Files to Modify | | ||
| |-----------------------|-----------------| | ||
| | sonic-platform-common | `sonic_platform_base/sonic_xcvr/cdb/cdb_fw.py`, `sonic_platform_base/sonic_xcvr/cdb/cdb.py`, `sonic_platform_base/sonic_xcvr/mem_maps/public/cdb.py`, `sonic_platform_base/sonic_xcvr/fields/cdb_consts.py` | | ||
|
|
||
| **Changes:** | ||
|
|
||
| 1. **Parse MaxDuration in `initFwHandler()`** — Extend the existing CDB 0041h response parsing to extract MaxDuration fields (bytes 144–153), read the M multiplier from bit 137.3, multiply the U16 values by M, and store the resulting ms values as instance attributes. | ||
|
|
||
| 2. **Pass timeouts to `send_cmd()`** — Update the following methods to pass their respective MaxDuration value + SAFETY_MARGIN: | ||
| - `start_fw_download()` -> `timeout_start + SAFETY_MARGIN` | ||
| - `abort_fw_download()` -> `timeout_abort + SAFETY_MARGIN` | ||
| - `write_lpl_block()` / `write_epl_block()` -> `timeout_write + SAFETY_MARGIN` | ||
| - `complete_fw_download()` -> `timeout_complete + SAFETY_MARGIN` | ||
|
|
||
| 3. **Fallback** — If the 0041h reply does not contain valid MaxDuration values (e.g., zero or module doesn't support it), retain `None` so `wait_for_cdb_status()` falls back to its existing default. | ||
|
|
||
| No changes to API signatures, xcvrd, or any higher-level callers. | ||
|
|
||
| ## 8. SAI API | ||
|
|
||
| N/A | ||
|
|
||
| ## 9. Configuration and Management | ||
|
|
||
| N/A | ||
|
|
||
| ## 10. Warmboot and Fastboot Design Impact | ||
|
|
||
| N/A | ||
|
|
||
| ## 11. Restrictions/Limitations | ||
|
|
||
| - **Modules not supporting CDB 0041h**: Fall back to the default timeout for all commands. | ||
| - **Zero-value advertisements**: A module advertising 0 ms is treated as "use default timeout" to avoid immediate timeout failures. | ||
| - **Non-advertised commands**: Commands that do not advertise MaxDuration will rely solely on default or fixed timeouts. | ||
|
|
||
| ## 12. Testing Requirements | ||
|
|
||
| 1. Verify advertised MaxDuration is used as timeout for supported CDB commands. | ||
| 2. Verify fallback to default timeout when CDB 0041h is not supported. | ||
| 3. Verify that a 0 ms advertised value falls back to the default timeout. | ||
| 4. Perform firmware upgrade end-to-end and verify no timeout failures on supported modules. | ||
|
|
||
| ## References | ||
|
|
||
| - [OIF CMIS 5.3](https://www.oiforum.com/wp-content/uploads/OIF-CMIS-05.3.pdf) — Section 9.4.2, Page 9Fh MaxDuration fields (bytes 144–153) | ||
| - [sonic-platform-common](https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/sonic_xcvr/cdb/cdb_fw.py) — CDB firmware commands implementation | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pnakka28 please highlight what is the default timeout value used in sonic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prgeor addressed this to highlight default timeout