HLD: Reusable service-level component statistics. #2312
Open
yutongzhang-microsoft wants to merge 5 commits into
Adds a new HLD describing swss::ComponentStats: a reusable library in sonic-swss-common that produces service-level (control-plane) counters, mirrors them to COUNTERS_DB, and exports them via OTLP to a local OpenTelemetry Collector. The existing SwssStats class in sonic-swss is refactored into a thin facade over this library.

Related PRs:
- sonic-swss-common#1180
- sonic-swss#4516
- sonic-buildimage#26924

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
Collaborator
/azp run
No pipelines are associated with this pull request.
… label

Address review feedback:
- Replace 'Initial draft' with 'Initial revision' in the revision table.
- Treat the SwssStats facade as freshly introduced by this work; remove all references to sonic-swss#4434 in Scope, Overview, Requirements, the facade section, Warmboot, Memory, and Testing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
…mplify section 9

- Reword non-swss vocabulary out-of-scope item as future work.
- Remove the sonic-buildimage submodule row from the repositories table; not needed.
- Section 9: collapse Manifest / CLI / CONFIG_DB subsections into a single 'Not applicable' note.
- Update Phase 1 wording and system-test bullet to reference two companion PRs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
Split the previous single component-stats-hld.md into two documents so that responsibilities map cleanly to the teams involved:

* component-stats-framework-hld.md (SONiC team): the swss::ComponentStats library, the SwssStats facade pattern, hot path, threading, memory ordering, warmboot, memory and testing for the producer. The DB sink is the only sink documented; OTLP is moved to future work.
* component-stats-reporting-hld.md (SONiC team, contract with NDM): the COUNTERS_DB schema (key layout, hash fields, idle suppression) and SWSS-specific vocabulary, plus conventions for future components. The reporting transport (telegraf -> mdm -> Geneva) is owned by the NDM HLD and referenced here, not duplicated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Yutong Zhang <yutongzhang@microsoft.com>
What I'm doing
Adding a new High-Level Design document at doc/component-stats/component-stats-hld.md that specifies a reusable mechanism for exposing service-level (control-plane software) counters from SONiC containers.

The HLD introduces:
- swss::ComponentStats in sonic-swss-common.
- SwssStats in sonic-swss built on top of that library, as the first consumer.

Counters are published to two sinks driven from a single in-process atomic snapshot:
- COUNTERS_DB: for parity with the existing Flex-Counter pipeline and for on-box diagnostic tooling (redis-cli, show ... stats).

Why
SONiC already has dataplane counters (Flex-Counter / SAI), but no uniform mechanism for service-level counters such as orchagent task throughput, gNMI request rate, or BMP error counts. A naive per-container implementation would duplicate atomic counter management, dirty tracking, the writer thread, the Redis schema, and an OTLP exporter in every container, so concurrency review, bug fixes, and on-the-wire schemas would all drift apart. This HLD specifies one reusable producer that any container can adopt with a ~100-line facade.
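To make the duplicated pieces concrete, here is a minimal sketch of atomic counters with per-counter dirty tracking and a snapshot step that a writer thread could call. It is an illustration only: the class name CounterSet, the fixed capacity kMax, and the methods increment and snapshotDirty are hypothetical and are not taken from swss::ComponentStats or the companion PRs. The writer thread and the Redis sink are left out.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical illustration; NOT the swss::ComponentStats API.
class CounterSet {
public:
    static constexpr std::size_t kMax = 8;  // assumed fixed capacity

    // Hot path: one relaxed add, then mark the slot dirty.
    void increment(std::size_t idx, std::uint64_t n = 1) {
        assert(idx < kMax);
        values_[idx].fetch_add(n, std::memory_order_relaxed);
        dirty_[idx].store(true, std::memory_order_release);
    }

    // Writer side: return only the counters touched since the last call.
    // Clearing the flag before reading the value means a concurrent
    // increment is either included now or re-flagged for the next call.
    std::map<std::size_t, std::uint64_t> snapshotDirty() {
        std::map<std::size_t, std::uint64_t> out;
        for (std::size_t i = 0; i < kMax; ++i) {
            if (dirty_[i].exchange(false, std::memory_order_acquire)) {
                out[i] = values_[i].load(std::memory_order_relaxed);
            }
        }
        return out;
    }

private:
    std::array<std::atomic<std::uint64_t>, kMax> values_{};
    std::array<std::atomic<bool>, kMax> dirty_{};
};
```

In this sketch a periodic writer would call snapshotDirty() and push the returned entries to its sink; an empty result means nothing changed and the write can be skipped, which is one plausible way to get idle suppression. Getting the flag and value ordering right is exactly the kind of detail that is easier to review once in a shared library than in every container.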
Companion PRs
- sonic-net/sonic-swss-common#1180: swss::ComponentStats library + unit tests.
- sonic-net/sonic-swss#4516: SwssStats thin facade over ComponentStats in orchagent/.