From 675f16067cfbb14841a080762ae0be9ef815ee47 Mon Sep 17 00:00:00 2001 From: Yutong Zhang Date: Tue, 28 Apr 2026 10:36:17 +0800 Subject: [PATCH 1/5] [component-stats] Add HLD for SONiC component statistics Adds a new HLD describing swss::ComponentStats: a reusable library in sonic-swss-common that produces service-level (control-plane) counters, mirrors them to COUNTERS_DB, and exports them via OTLP to a local OpenTelemetry Collector. The existing SwssStats class in sonic-swss is refactored into a thin facade over this library. Related PRs: - sonic-swss-common#1180 - sonic-swss#4516 - sonic-buildimage#26924 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yutong Zhang --- doc/component-stats/component-stats-hld.md | 441 +++++++++++++++++++++ 1 file changed, 441 insertions(+) create mode 100644 doc/component-stats/component-stats-hld.md diff --git a/doc/component-stats/component-stats-hld.md b/doc/component-stats/component-stats-hld.md new file mode 100644 index 00000000000..0bd4f03a935 --- /dev/null +++ b/doc/component-stats/component-stats-hld.md @@ -0,0 +1,441 @@ +# SONiC Component Statistics HLD + +## Table of Content + +- [Revision](#1-revision) +- [Scope](#2-scope) +- [Definitions/Abbreviations](#3-definitionsabbreviations) +- [Overview](#4-overview) +- [Requirements](#5-requirements) +- [Architecture Design](#6-architecture-design) +- [High-Level Design](#7-high-level-design) +- [SAI API](#8-sai-api) +- [Configuration and management](#9-configuration-and-management) +- [Warmboot and Fastboot Design Impact](#10-warmboot-and-fastboot-design-impact) +- [Memory Consumption](#11-memory-consumption) +- [Restrictions/Limitations](#12-restrictionslimitations) +- [Testing Requirements/Design](#13-testing-requirementsdesign) +- [Open/Action items](#14-openaction-items) + +### 1. 
Revision + +| Rev | Date | Author | Change Description | +|-----|------------|---------------|--------------------------| +| 0.1 | 2026-04-28 | Yutong Zhang | Initial draft | + +### 2. Scope + +This HLD specifies a reusable mechanism for exposing **service-level (control-plane software) counters** from SONiC containers. It introduces a new shared library `swss::ComponentStats` in `sonic-swss-common` and refactors the existing `SwssStats` class in `sonic-swss` (introduced by [sonic-swss#4434](https://github.com/sonic-net/sonic-swss/pull/4434)) into a thin façade over the new library. The library publishes counters to: + +1. `COUNTERS_DB`, for parity with the existing Flex-Counter pipeline and for on-box diagnostic tooling (`redis-cli`, `show ... stats`). +2. A local OpenTelemetry (OTLP) Collector sidecar, so the same counters can be forwarded to off-box telemetry systems (e.g. Geneva mdm) that consume OTLP. + +Configuration of the OTel Collector itself, off-box telemetry endpoints, dashboards, and alerts are explicitly **out of scope** for this HLD. + +### 3. Definitions/Abbreviations + +| Term | Definition | +|-----------------|---------------------------------------------------------------------------------------------| +| Component | A SONiC container that produces service-level counters (e.g. `swss`, `gnmi`, `bmp`). | +| Entity | A logical grouping of metrics inside a component (e.g. an orchagent table, a gNMI path). | +| Metric | A named uint64 counter or gauge inside an entity (e.g. `SET`, `DEL`, `COMPLETE`, `ERROR`). | +| ComponentStats | The new shared library in `sonic-swss-common` providing the producer mechanism. | +| SwssStats | A SWSS-specific façade over `ComponentStats` (lives in `sonic-swss`). | +| DB sink | The output path that mirrors counters into `COUNTERS_DB`. | +| OTLP sink | The output path that exports counters via OpenTelemetry Protocol to a local OTel Collector. 
| +| OTel Collector | A locally-running OpenTelemetry Collector sidecar; not delivered by this HLD. | + +### 4. Overview + +SONiC already publishes **dataplane** counters via the Flex-Counter framework (`CONFIG_DB / FLEX_COUNTER_TABLE` → `syncd` → `COUNTERS_DB`). What is missing is **service-level** counters — software-side events such as orchagent task throughput, gNMI request rate, BMP message error counts. Without these we cannot answer questions like *"is orchagent draining tasks?"*, *"is gNMI seeing subscribe failures?"*, *"is one container dropping more events than its peers?"*. + +A first attempt ([sonic-swss#4434](https://github.com/sonic-net/sonic-swss/pull/4434)) added a class `SwssStats` directly inside `orchagent`. The same plumbing — atomic counters, dirty tracking, a 1-second writer thread, a Redis-side schema — will be needed by every other SONiC container, and we additionally want to expose these counters via OTLP for off-box collection. Copy-pasting the implementation into each container is unacceptable: every container needs its own concurrency review, bug-fixes drift, and the on-the-wire schemas diverge. + +This HLD specifies a single, reusable producer that: + +1. accumulates counters in process-local atomic state with negligible hot-path cost, +2. mirrors them to `COUNTERS_DB` so `redis-cli`, `show ... stats` CLIs, and any other on-box tooling continue to work, +3. emits them as OTLP metrics to a local OTel Collector for forwarding to off-box telemetry systems, +4. exposes a stable public API so each container only needs to write a thin (~100 LoC) façade. + +### 5. Requirements + +**Functional** + +- R1. A reusable C++ library shall accumulate per-component, per-entity, per-metric `uint64` counters. +- R2. The library shall publish counters to `COUNTERS_DB` under a uniform key layout `<COMPONENT>_STATS:<entity>` (Redis hash, fields = metric names, values = decimal `uint64`). +- R3. The library shall publish the same counters as OpenTelemetry OTLP records to a configurable endpoint (default `localhost:4317`). +- R4. The library shall be usable by any SONiC container by writing a thin façade that owns only the container-specific metric vocabulary. +- R5. The existing `SwssStats` public surface (`gSwssStatsRecord`, `SwssStats::getInstance()`, `recordTask/Complete/Error`) shall remain byte-identical to that introduced in #4434. +- R6. The `COUNTERS_DB` schema introduced by #4434 (`SWSS_STATS:<table>` hash with SET/DEL/COMPLETE/ERROR fields) shall remain unchanged. + +**Non-functional** + +- R7. The hot path (`increment` / `setValue`) shall be lock-free and constant-time after the first use of a given (entity, metric) pair. +- R8. Construction of a `ComponentStats` instance shall not crash the host process if Redis or the OTel Collector is not yet reachable; both sinks shall connect lazily and retry independently. +- R9. A failure in one sink (Redis down, OTel Collector restarting) shall not affect the other sink and shall not affect the hot path. After recovery, no monotonic data point shall be lost beyond intermediate samples (the next successful flush carries the latest cumulative value). +- R10. Idle systems shall produce zero outbound traffic on either sink (driven by per-entity dirty tracking). + +**Out of scope** + +- The OTel Collector itself, including its image, configuration, exporter pipeline to off-box telemetry systems, authentication, and operator onboarding. +- Replacing existing FlexCounter / SAI counter pipelines (those measure dataplane state via SAI; this design measures control-plane software events). +- Defining the metric vocabulary for non-swss containers — that is the job of each container's own façade. + +### 6. Architecture Design + +The architecture is unchanged at the SONiC system level. 
A single new library is introduced in `sonic-swss-common`; an existing class in `sonic-swss` is refactored to delegate to it; future containers may add their own façades using the same library. + +``` +┌────────────────────────────── SONiC switch ──────────────────────────────┐ +│ │ +│ orchagent (sonic-swss) gnmi / bmp / telemetry / … │ +│ ┌──────────────────────┐ ┌──────────────────────┐ │ +│ │ orch.cpp + SwssStats │ … │ gnmistats / bmpstats │ │ +│ └──────────┬───────────┘ └──────────┬───────────┘ │ +│ │ instrument │ │ +│ ▼ ▼ │ +│ ┌────────────────────────────────────────────────────────────┐ │ +│ │ swss::ComponentStats (in libswsscommon) │ │ +│ │ ┌─────────────────────────────────────────────────────┐ │ │ +│ │ │ atomic counters + dirty tracking + writer thread │ │ │ +│ │ └──────────────┬──────────────────────────┬───────────┘ │ │ +│ │ │ │ │ │ +│ │ DB sink OTLP sink │ │ +│ │ (Redis HSET via swss::Table) (OTLP/gRPC, localhost) │ │ +│ └──────────┬──────────────────────────────────┬──────────────┘ │ +│ │ │ │ +│ ▼ ▼ │ +│ ┌──────────────────────────┐ ┌────────────────────────────┐ │ +│ │ COUNTERS_DB │ │ Local OTel Collector │ │ +│ │ SWSS_STATS:PORT_TABLE │ │ (sidecar container) │ │ +│ │ GNMI_STATS:/iface/… │ │ │ │ +│ │ BMP_STATS:… │ │ batches, retries, adds │ │ +│ │ │ │ resource attrs, exports │ │ +│ │ used by: redis-cli, │ │ to off-box telemetry │ │ +│ │ show stats CLI, local │ └─────────────┬──────────────┘ │ +│ │ diagnostic tools │ │ │ +│ └──────────────────────────┘ │ │ +│ │ OTLP │ +└──────────────────────────────────────────────────┼───────────────────────┘ + │ + ▼ + ┌────────────────────┐ + │ Off-box telemetry │ + │ (e.g. Geneva mdm) │ + └────────────────────┘ +``` + +**Layering rule.** `swss-common` knows nothing of orchagent or any specific container; each container knows only its own façade plus `swss::ComponentStats`. New containers get both sinks for free by writing a ~100-line wrapper. 
+ +**Dual-sink design properties.** + +- *One source of truth.* Both sinks consume the same atomic-counter snapshot inside `ComponentStats`. They cannot diverge: if the OTel pipeline is briefly down, `COUNTERS_DB` still reflects current state, and vice versa. +- *No new transport for local debugging.* The `COUNTERS_DB` layout is unchanged, so `redis-cli`, `show ... stats` CLIs, and any existing in-band tooling keep working. +- *No off-box-system-specific code in containers.* Containers know only `ComponentStats`; the OTLP sink talks to a local OTel Collector at `localhost:4317`, and the Collector handles everything beyond that hop. +- *Independent failure domains.* Failures in one sink (DB unreachable, OTel agent restarting) do not affect the other or the hot path. + +### 7. High-Level Design + +#### 7.1 Repositories changed + +| Repository | What changes | +|--------------------------------|-----------------------------------------------------------------------------| +| `sonic-net/sonic-swss-common` | New library `swss::ComponentStats` + unit tests ([PR #1180](https://github.com/sonic-net/sonic-swss-common/pull/1180)). | +| `sonic-net/sonic-swss` | `SwssStats` is reduced to a thin façade over `ComponentStats` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | +| `sonic-net/sonic-buildimage` | Submodule pointer bumps for the two repos above ([PR #26924](https://github.com/sonic-net/sonic-buildimage/pull/26924)). | + +No platform-specific code is added. No SAI changes. No syncd changes. + +#### 7.2 `swss::ComponentStats` — public API + +```cpp +namespace swss { + +class ComponentStats { +public: + using CounterSnapshot = std::map<std::string, uint64_t>; + + // Sink configuration. Both sinks default to "on". 
+ struct SinkConfig { + bool enableDb = true; // mirror to COUNTERS_DB + bool enableOtlp = true; // export to local OTel Collector + std::string otlpEndpoint = "localhost:4317"; // OTLP/gRPC endpoint + std::string serviceName; // OTel resource attr (default: componentName) + std::string serviceInstanceId; // OTel resource attr (default: hostname) + }; + + static std::shared_ptr<ComponentStats> create( + const std::string& componentName, + const std::string& dbName = "COUNTERS_DB", + uint32_t intervalSec = 1, + const SinkConfig& sinks = SinkConfig{}); + + void increment(const std::string& entity, const std::string& metric, uint64_t n = 1); + void setValue (const std::string& entity, const std::string& metric, uint64_t value); + + uint64_t get (const std::string& entity, const std::string& metric); + CounterSnapshot getAll(const std::string& entity); + + void setEnabled(bool on); + bool isEnabled() const; + void stop(); +}; + +} // namespace swss +``` + +`create()` consults a process-wide registry keyed by `componentName`. A second call with the same name returns the existing instance, ensuring containers cannot accidentally start multiple writer threads against the same Redis prefix. + +#### 7.3 Internal state + +Per instance: +- `m_entities : std::map<std::string, EntityStats>` — `std::map` (not `unordered_map`) so references returned by `getOrCreateEntity` remain valid after later inserts. +- `EntityStats` holds `map<string, unique_ptr<Counter>>` (heap-allocated because `std::atomic` is not movable) plus a per-entity `atomic<uint64_t> version`. +- `m_mutex` guards only the **structure** of the maps (insert/find). Hot-path reads/writes of counter values use `std::atomic` and skip the mutex after the first use. +- `m_running`, `m_enabled` — atomic flags. +- `m_cv` — wakes the writer thread immediately on `stop()` instead of waiting up to `intervalSec`. +- `m_thread` — owns the writer. + +Process-wide: +- `registry : std::map<std::string, std::weak_ptr<ComponentStats>>` (`weak_ptr` so a fully released instance can be destroyed). 
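The state described in §7.3 can be sketched in a few lines. The following is a simplified illustration, not the library's implementation: `StatsSketch` and `EntityStatsSketch` are hypothetical names, there is no writer thread or sink, and for brevity the sketch takes the structural mutex on every call instead of only on first use.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-in for one entity: named counters plus a dirty version.
struct EntityStatsSketch {
    // unique_ptr gives every atomic a fixed heap address.
    std::map<std::string, std::unique_ptr<std::atomic<uint64_t>>> counters;
    std::atomic<uint64_t> version{0};  // bumped on every mutation
};

class StatsSketch {
public:
    // Returns the live instance registered under `name`, creating it if needed.
    static std::shared_ptr<StatsSketch> create(const std::string& name) {
        static std::mutex registryMutex;
        static std::map<std::string, std::weak_ptr<StatsSketch>> registry;
        std::lock_guard<std::mutex> lock(registryMutex);
        if (auto existing = registry[name].lock()) return existing;
        std::shared_ptr<StatsSketch> fresh(new StatsSketch());
        registry[name] = fresh;  // weak_ptr: does not keep the instance alive
        return fresh;
    }

    void increment(const std::string& entity, const std::string& metric, uint64_t n = 1) {
        EntityStatsSketch* e = nullptr;
        std::atomic<uint64_t>* c = nullptr;
        {
            // Simplification: lock on every call, not just on first use.
            std::lock_guard<std::mutex> lock(m_mutex);
            e = &m_entities[entity];  // std::map nodes never move
            auto& slot = e->counters[metric];
            if (!slot) slot.reset(new std::atomic<uint64_t>(0));
            c = slot.get();
        }
        c->fetch_add(n, std::memory_order_relaxed);          // counter update
        e->version.fetch_add(1, std::memory_order_release);  // dirty bump
    }

    uint64_t get(const std::string& entity, const std::string& metric) {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto e = m_entities.find(entity);
        if (e == m_entities.end()) return 0;
        auto c = e->second.counters.find(metric);
        if (c == e->second.counters.end()) return 0;
        return c->second->load(std::memory_order_relaxed);
    }

private:
    StatsSketch() = default;
    std::mutex m_mutex;  // guards map structure only
    std::map<std::string, EntityStatsSketch> m_entities;
};
```

Two calls to `StatsSketch::create("X")` hand back the same object while any caller still holds it, which is the property the registry exists to provide; once every `shared_ptr` is released, the `weak_ptr` entry expires and a later `create("X")` builds a fresh instance.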
+ +#### 7.4 Hot path + +```cpp +void ComponentStats::increment(const string& entity, const string& metric, uint64_t n) { + if (!isEnabled() || n == 0) return; + + auto& e = getOrCreateEntity(entity); // mutex on first use only + auto& c = getOrCreateCounter(e, metric); // mutex on first use only + + c.value .fetch_add(n, memory_order_relaxed); // ① counter + e.version.fetch_add(1, memory_order_release); // ② dirty-bump (release) +} +``` + +Cost after warm-up: two atomic RMWs. No mutex acquisition, no allocation, no syscall. + +#### 7.5 Writer thread + +Runs at `intervalSec` (default 1 s) and fans the snapshot out to both sinks: + +``` +┌───────────────────────────────────────────────────────────────┐ +│ Phase A — connect each enabled sink (run once, with retry) │ +│ loop until m_running == false: │ +│ if enableDb and !dbConnected: try connect Redis │ +│ if enableOtlp and !otlpConnected: try open OTLP exporter │ +│ if all enabled sinks connected: break │ +│ else cv.wait_for(intervalSec, predicate=!m_running) │ +└───────────────────────────────────────────────────────────────┘ +┌───────────────────────────────────────────────────────────────┐ +│ Phase B — flush loop │ +│ loop: │ +│ cv.wait_for(intervalSec, predicate=!m_running) │ +│ if !m_running: break │ +│ │ +│ # SNAPSHOT (under lock) — single snapshot, two sinks │ +│ for each entity e in m_entities: │ +│ v = e.version.load(acquire) ← pairs ② │ +│ if lastVersion[e.name] == v: continue (skip clean)│ +│ lastVersion[e.name] = v │ +│ row = [(metric, c.value.load(relaxed)) for c in e] │ +│ enqueue(name, row) │ +│ │ +│ # FAN-OUT (lock released, sinks fail independently) │ +│ if enableDb: │ +│ for (name, row) in queue: │ +│ try: m_table->set(name, stringify(row)) │ +│ catch: log warn, continue │ +│ │ +│ if enableOtlp: │ +│ build OTLP ResourceMetrics{ … } from queue │ +│ try: m_otlp->Export(batch) │ +│ catch: log warn, continue │ +└───────────────────────────────────────────────────────────────┘ +``` + +Three properties: + 
+1. *Lock released before any I/O.* Round-trips under the structural lock would stall any concurrent `increment()` that needs the lock, i.e. the first use of an entity or metric. +2. *Idle systems generate zero outbound traffic on either sink.* When no entity has changed, the queue is empty and neither sink is touched. +3. *Sink isolation.* A failure in one sink is logged and skipped; the other sink still publishes the same cycle's snapshot. + +#### 7.6 Memory ordering correctness + +The release/acquire pair (`②` in 7.4 ↔ acquire-load in 7.5) guarantees: + +> If the writer reads `version == N`, then every counter mutation that contributed to bumping the version up to `N` happens-before that read and is visible to the writer. + +Without it, on weakly ordered architectures (ARM, POWER) the writer could see the new version but read an old counter value, recording a stale snapshot. + +#### 7.7 OTLP sink details + +- **Wire format.** OTLP/gRPC over plaintext `localhost:4317`. No TLS or authentication on the local hop — the loopback link is inside the switch, and any off-box credentials live in the OTel Collector. OTLP/HTTP is supported as a build option but not the default. +- **Metric model.** Counters set via `increment()` are exported as OTLP `Sum` with `aggregation_temporality = CUMULATIVE` and `is_monotonic = true`. Counters set via `setValue()` (gauges) are exported as OTLP `Gauge`. +- **Resource attributes** attached to every batch: `service.name=<serviceName>`, `service.instance.id=<serviceInstanceId>`, `sonic.component=<componentName>`. +- **Metric attributes** attached to every data point: `entity` — the table name / gNMI path / etc. The entity is a *label*, not part of the metric name, so dashboards can pivot freely. +- **Metric name** convention: `sonic.<component>.<metric>` (e.g. `sonic.swss.SET`, `sonic.gnmi.SUBSCRIBE`). +- **Batching / retry.** The producer does not batch beyond one `intervalSec` snapshot and does not retry. Batching, queuing, retrying, and back-pressure are the local OTel Collector's responsibility. 
+- **Container restart.** `start_time_unix_nano` is captured once in the constructor and advances on every container restart. This is the OTel-defined signal for counter reset; consumers handle it natively. + +#### 7.8 `COUNTERS_DB` sink details + +For component name `C` and entity `E`: + +``` +COUNTERS_DB key: "<C>_STATS:<E>" +hash fields: each metric name → uint64_t string +``` + +Example: `redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE"` → + +``` +1) "SET" +2) "1283" +3) "DEL" +4) "17" +5) "COMPLETE" +6) "1300" +7) "ERROR" +8) "0" +``` + +The shape mirrors the existing `COUNTERS:*` keys produced by the Flex-Counter pipeline. + +#### 7.9 `SwssStats` thin façade + +`SwssStats` (in `sonic-swss/orchagent/`) is reduced to a translation layer that owns only the SWSS-specific vocabulary and the global enable flag consumed by `orch.cpp`: + +```cpp +SwssStats::SwssStats() : m_impl(swss::ComponentStats::create("SWSS")) {} + +void SwssStats::recordTask(const std::string& t, const std::string& op) { + if (op == "SET") m_impl->increment(t, "SET"); + else if (op == "DEL") m_impl->increment(t, "DEL"); +} +void SwssStats::recordComplete(const std::string& t, uint64_t n) { m_impl->increment(t, "COMPLETE", n); } +void SwssStats::recordError (const std::string& t, uint64_t n) { m_impl->increment(t, "ERROR", n); } +``` + +The whole file is ~130 lines of straightforward translation. **The public surface (`gSwssStatsRecord`, `SwssStats::getInstance()`, `recordTask`/`recordComplete`/`recordError`) and the on-the-wire `SWSS_STATS:<table>` Redis layout are byte-identical to those introduced in #4434.** Existing consumers keep working without changes. + +#### 7.10 Adopting the library in a new container + +To add equivalent metrics to e.g. `gnmi`, write a façade analogous to §7.9: + +```cpp +class GnmiStats { +public: + static GnmiStats* getInstance(); + void recordSubscribe(const std::string& path) { m_impl->increment(path, "SUBSCRIBE"); } + void recordError (const std::string& path) { m_impl->increment(path, "ERROR"); } +private: + GnmiStats() : m_impl(swss::ComponentStats::create("GNMI")) {} + std::shared_ptr<swss::ComponentStats> m_impl; +}; +``` + +Result: counters land in `COUNTERS_DB` under keys `GNMI_STATS:<path>` **and** are exported as OTLP metrics `sonic.gnmi.SUBSCRIBE` / `sonic.gnmi.ERROR` (with attribute `entity=<path>`). No new threads, no new Redis or gRPC client management, no new test harness needed. + +### 8. SAI API + +No SAI API changes are required for this feature. This design measures control-plane software events inside SONiC containers; it does not query or modify any SAI state. + +### 9. Configuration and management + +#### 9.1 Manifest + +Not applicable. This is a built-in SONiC library, not an Application Extension. + +#### 9.2 CLI/YANG model Enhancements + +No new CLI commands or YANG models are introduced by this HLD. Existing CLIs that already read `COUNTERS_DB` (e.g. `redis-cli -n 2 HGETALL`, `show ... stats` style commands) continue to work and gain visibility into the new `<COMPONENT>_STATS:<entity>` keys for free. + +#### 9.3 Config DB Enhancements + +A future enhancement may add a `COMPONENT_STATS` table in `CONFIG_DB`, keyed by component name, to allow operators to flip individual sinks on/off and to override the OTLP endpoint without rebuilding: + +``` +CONFIG_DB key: COMPONENT_STATS|<component> +fields: enable_db : "true" | "false" + enable_otlp : "true" | "false" + otlp_endpoint : <host:port> + interval_sec : <seconds> +``` + +The library reads the table once at construction time. Runtime re-configuration is not in scope for the first cut. 
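As a rough illustration of how the proposed `COMPONENT_STATS` row could be turned into sink settings at construction time, the sketch below parses an already-fetched field map. Everything in it is an assumption made for illustration: `StatsSettingsSketch`, `FieldMap`, `readBool`, and `parseStatsSettings` are hypothetical names, the Redis read itself is omitted, and the fallback-to-default behaviour for missing or malformed fields is one possible policy rather than anything the HLD specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical holder mirroring the SinkConfig fields plus the flush interval.
struct StatsSettingsSketch {
    bool enableDb = true;
    bool enableOtlp = true;
    std::string otlpEndpoint = "localhost:4317";
    uint32_t intervalSec = 1;
};

using FieldMap = std::map<std::string, std::string>;

// Overwrites `out` only when the field is present and is exactly "true" or "false".
inline void readBool(const FieldMap& row, const std::string& key, bool& out) {
    auto it = row.find(key);
    if (it == row.end()) return;
    if (it->second == "true") out = true;
    else if (it->second == "false") out = false;
}

// Applies the fields of one COMPONENT_STATS row on top of the defaults.
inline StatsSettingsSketch parseStatsSettings(const FieldMap& row) {
    StatsSettingsSketch s;
    readBool(row, "enable_db", s.enableDb);
    readBool(row, "enable_otlp", s.enableOtlp);

    auto ep = row.find("otlp_endpoint");
    if (ep != row.end() && !ep->second.empty()) s.otlpEndpoint = ep->second;

    auto iv = row.find("interval_sec");
    if (iv != row.end()) {
        try {
            unsigned long long v = std::stoull(iv->second);
            if (v >= 1 && v <= UINT32_MAX) s.intervalSec = static_cast<uint32_t>(v);
        } catch (const std::exception&) {
            // Non-numeric or out-of-range text: keep the default interval.
        }
    }
    return s;
}
```

Keeping the parser independent of the Redis client makes it easy to unit-test with plain maps, and defaulting on bad input matches the spirit of R8 (construction must not crash the host process).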
+ +### 10. Warmboot and Fastboot Design Impact + +Counters are kept in process memory and are reset on container restart, including warmboot and fastboot. This matches the existing behaviour of the `SwssStats` introduced in #4434, and is acceptable because consumers (dashboards, alerts) compute rate-of-change rather than absolute values. The OTLP `start_time_unix_nano` attribute advances on every restart, which is the OTel-standard signal for counter reset and is handled natively by OTel-aware consumers. + +#### Warmboot and Fastboot Performance Impact + +- The library does **not** add any stalls, sleeps, or I/O operations to the boot critical chain. Construction is non-blocking; the writer thread connects to Redis and to the OTel Collector lazily and retries in the background, so a not-yet-ready dependency cannot delay container start. +- No CPU-heavy processing (Jinja templates, etc.) is added in the boot path. +- No third-party dependency is updated by this HLD beyond linking against the OpenTelemetry C++ SDK gRPC exporter, which is loaded only when the OTLP sink is enabled. +- The library does not delay any service or Docker container. + +No measurable boot-time degradation is expected. + +### 11. Memory Consumption + +- Per-instance footprint: O(entities × metrics) `uint64` slots plus their `std::map` keys. Bounded by the number of orchagent tables (≈ tens) for the SWSS façade. +- The OTLP exporter adds a small fixed overhead (one gRPC channel, one per-cycle batch buffer). +- When the feature is disabled at runtime via `setEnabled(false)`, the hot path becomes inert and the writer thread's queue stays empty; memory remains bounded. +- When the feature is disabled at compile time (the OTLP sink can be compiled out via build option), there is no residual memory cost beyond the symbols of `swss::ComponentStats` itself (the DB sink remains unconditional, matching #4434 behaviour). + +### 12. 
Restrictions/Limitations + +- Counters reset to zero on container restart by design. Consumers must compute rate-of-change rather than rely on absolute values across restarts. +- The library does not retain history; it relies on downstream consumers (`COUNTERS_DB` readers, OTel Collector) for retention. +- The OTLP sink depends on a local OTel Collector reachable at the configured endpoint. If absent, the OTLP sink retries silently in the background; the DB sink and the hot path are unaffected. +- The structural mutex (`m_mutex`) is acquired only on the *first* use of a given (entity, metric) pair. Workloads that constantly mint new entity names will see one mutex acquisition per new name; this is not the expected pattern for SONiC containers. + +### 13. Testing Requirements/Design + +#### 13.1 Unit Test cases + +Library unit tests live in `sonic-swss-common/tests/componentstats_ut.cpp`: + +| # | Test | What it proves | +|---|----------------------------|---------------------------------------------------------------------------------------------| +| 1 | BasicIncrement | `increment` + `get` round-trip | +| 2 | MultipleMetrics | metric isolation within an entity | +| 3 | MultipleEntities | entity isolation within a component | +| 4 | SetValueOverwrites | gauge semantics | +| 5 | DisabledIsNoOp | `setEnabled(false)` makes hot path inert | +| 6 | GetAllReturnsSnapshot | bulk read returns the right shape | +| 7 | ConcurrentIncrements | 8 threads × 10 000 increments → exactly 80 000 (no torn writes, no lost updates) | +| 8 | SingletonSameName | `create("X")` returns the same instance | +| 9 | SingletonDifferentNames | `create("X") ≠ create("Y")` | + +The existing `swssstats_ut.cpp` (9 cases) in `sonic-swss` is kept verbatim and continues to pass against the thin façade, proving the public API has not regressed. 
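The property behind test 7 (ConcurrentIncrements) can be shown in isolation. The sketch below is not the test from `componentstats_ut.cpp`; it uses a bare `std::atomic` instead of `ComponentStats`, and `runConcurrentIncrements` is a hypothetical helper name. It demonstrates that relaxed `fetch_add` from several threads loses no updates once all threads have been joined.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Starts `threads` workers that each add 1 to a shared counter `perThread`
// times, joins them all, and returns the final counter value.
inline uint64_t runConcurrentIncrements(unsigned threads, unsigned perThread) {
    std::atomic<uint64_t> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(threads);
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, perThread] {
            for (unsigned i = 0; i < perThread; ++i) {
                // Atomic read-modify-write: no update can be lost.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with each worker's completion, so the load below
    // observes every increment.
    for (auto& w : workers) w.join();
    return counter.load(std::memory_order_relaxed);
}
```

With 8 threads and 10 000 increments each, the helper returns exactly 80 000; a plain non-atomic `uint64_t` in the same loop would be a data race and could lose updates.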
+ +Run: + +``` +cd sonic-swss-common && ./autogen.sh && ./configure && make check +./tests/tests --gtest_filter='ComponentStats*' +``` + +#### 13.2 System Test cases + +- Boot a `sonic-vs` image built with the three companion PRs. +- Exercise orchagent (e.g. `config vlan add`, `config interface ip add`). +- Verify on-box DB sink: + ``` + redis-cli -n 2 KEYS "SWSS_STATS:*" + redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE" + ``` + Counters increment in proportion to operations; idle dwell shows zero further writes (dirty tracking working). +- Verify OTLP sink (Phase 2): point a local OTel Collector at `localhost:4317` with a debug exporter and confirm `sonic.swss.*` metrics arrive with correct resource and metric attributes. +- Confirm warmboot and fastboot are unaffected (no boot-time regression, no service startup ordering change). + +### 14. Open/Action items + +- Phase 1 (this HLD's three PRs) lands the `ComponentStats` library and `SwssStats` refactor with the DB sink fully active and the OTLP sink stubbed (`enableOtlp=false` by default). +- Phase 2 implements the OTLP sink against the OpenTelemetry C++ SDK and is gated on the local OTel Collector sidecar being available in `sonic-buildimage`. Coordination with whichever team owns the local OTel Collector image is required before Phase 2 can be enabled by default. +- Phase 3 onboards additional SONiC containers (`gnmi`, `bmp`, `telemetry`, …) by adding their own façades. Each is a self-contained PR in the relevant repository. From 9e67d7ce92d25329e35561b81d4a960bfc380420 Mon Sep 17 00:00:00 2001 From: Yutong Zhang Date: Tue, 28 Apr 2026 10:43:05 +0800 Subject: [PATCH 2/5] Component Stats HLD: drop sonic-swss#4434 references; reword revision label Address review feedback: - Replace 'Initial draft' with 'Initial revision' in the revision table. 
- Treat the SwssStats facade as freshly introduced by this work; remove all references to sonic-swss#4434 in Scope, Overview, Requirements, the facade section, Warmboot, Memory, and Testing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yutong Zhang --- doc/component-stats/component-stats-hld.md | 27 +++++++++++++--------- 1 file changed, 16 insertions(+), 11 deletions(-) diff --git a/doc/component-stats/component-stats-hld.md b/doc/component-stats/component-stats-hld.md index 0bd4f03a935..5f47f65b3cc 100644 --- a/doc/component-stats/component-stats-hld.md +++ b/doc/component-stats/component-stats-hld.md @@ -21,11 +21,16 @@ | Rev | Date | Author | Change Description | |-----|------------|---------------|--------------------------| -| 0.1 | 2026-04-28 | Yutong Zhang | Initial draft | +| 0.1 | 2026-04-28 | Yutong Zhang | Initial revision | ### 2. Scope -This HLD specifies a reusable mechanism for exposing **service-level (control-plane software) counters** from SONiC containers. It introduces a new shared library `swss::ComponentStats` in `sonic-swss-common` and refactors the existing `SwssStats` class in `sonic-swss` (introduced by [sonic-swss#4434](https://github.com/sonic-net/sonic-swss/pull/4434)) into a thin façade over the new library. The library publishes counters to: +This HLD specifies a reusable mechanism for exposing **service-level (control-plane software) counters** from SONiC containers. It introduces: + +1. A new shared library `swss::ComponentStats` in `sonic-swss-common`. +2. A SWSS-specific façade `SwssStats` in `sonic-swss` built on top of that library, which is the first consumer. + +The library publishes counters to: 1. `COUNTERS_DB`, for parity with the existing Flex-Counter pipeline and for on-box diagnostic tooling (`redis-cli`, `show ... stats`). 2. A local OpenTelemetry (OTLP) Collector sidecar, so the same counters can be forwarded to off-box telemetry systems (e.g. Geneva mdm) that consume OTLP. 
@@ -49,7 +54,7 @@ Configuration of the OTel Collector itself, off-box telemetry endpoints, dashboa SONiC already publishes **dataplane** counters via the Flex-Counter framework (`CONFIG_DB / FLEX_COUNTER_TABLE` → `syncd` → `COUNTERS_DB`). What is missing is **service-level** counters — software-side events such as orchagent task throughput, gNMI request rate, BMP message error counts. Without these we cannot answer questions like *"is orchagent draining tasks?"*, *"is gNMI seeing subscribe failures?"*, *"is one container dropping more events than its peers?"*. -A first attempt ([sonic-swss#4434](https://github.com/sonic-net/sonic-swss/pull/4434)) added a class `SwssStats` directly inside `orchagent`. The same plumbing — atomic counters, dirty tracking, a 1-second writer thread, a Redis-side schema — will be needed by every other SONiC container, and we additionally want to expose these counters via OTLP for off-box collection. Copy-pasting the implementation into each container is unacceptable: every container needs its own concurrency review, bug-fixes drift, and the on-the-wire schemas diverge. +A naïve implementation would put this plumbing — atomic counters, dirty tracking, a 1-second writer thread, a Redis-side schema, an OTLP exporter — directly inside each container. That is unacceptable: every container would need its own concurrency review, bug fixes would drift, and the on-the-wire schemas would diverge. This HLD specifies a single, reusable producer that: @@ -66,8 +71,8 @@ This HLD specifies a single, reusable producer that: - R2. The library shall publish counters to `COUNTERS_DB` under a uniform key layout `<COMPONENT>_STATS:<entity>` (Redis hash, fields = metric names, values = decimal `uint64`). - R3. The library shall publish the same counters as OpenTelemetry OTLP records to a configurable endpoint (default `localhost:4317`). - R4. The library shall be usable by any SONiC container by writing a thin façade that owns only the container-specific metric vocabulary. 
-- R5. The existing `SwssStats` public surface (`gSwssStatsRecord`, `SwssStats::getInstance()`, `recordTask/Complete/Error`) shall remain byte-identical to that introduced in #4434. -- R6. The `COUNTERS_DB` schema introduced by #4434 (`SWSS_STATS:<table>` hash with SET/DEL/COMPLETE/ERROR fields) shall remain unchanged. +- R5. The first consumer of the library is the SWSS-specific façade `SwssStats` (in `sonic-swss/orchagent/`), which exposes a small SWSS-specific public surface: a global `gSwssStatsRecord` enable flag, `SwssStats::getInstance()`, and `recordTask` / `recordComplete` / `recordError` methods. +- R6. The `SwssStats` façade shall write into `COUNTERS_DB` under keys `SWSS_STATS:<table>` with hash fields `SET` / `DEL` / `COMPLETE` / `ERROR`, following the uniform schema in R2. **Non-functional** @@ -84,7 +89,7 @@ This HLD specifies a single, reusable producer that: ### 6. Architecture Design -The architecture is unchanged at the SONiC system level. A single new library is introduced in `sonic-swss-common`; an existing class in `sonic-swss` is refactored to delegate to it; future containers may add their own façades using the same library. +The architecture is unchanged at the SONiC system level. A new library is introduced in `sonic-swss-common`, and a new SWSS-specific façade (its first consumer) is added in `sonic-swss`; future containers may add their own façades using the same library. ``` ┌────────────────────────────── SONiC switch ──────────────────────────────┐ @@ -142,7 +147,7 @@ The architecture is unchanged at the SONiC system level. A single new library is | Repository | What changes | |--------------------------------|-----------------------------------------------------------------------------| | `sonic-net/sonic-swss-common` | New library `swss::ComponentStats` + unit tests ([PR #1180](https://github.com/sonic-net/sonic-swss-common/pull/1180)). | -| `sonic-net/sonic-swss` | `SwssStats` is reduced to a thin façade over `ComponentStats` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | +| `sonic-net/sonic-swss` | New `SwssStats` thin façade over `ComponentStats` in `orchagent/` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | | `sonic-net/sonic-buildimage` | Submodule pointer bumps for the two repos above ([PR #26924](https://github.com/sonic-net/sonic-buildimage/pull/26924)). | No platform-specific code is added. No SAI changes. No syncd changes. 
@@ -319,7 +324,7 @@ void SwssStats::recordComplete(const std::string& t, uint64_t n) { m_impl->incre void SwssStats::recordError (const std::string& t, uint64_t n) { m_impl->increment(t, "ERROR", n); } ``` -The whole file is ~130 lines of straightforward translation. **The public surface (`gSwssStatsRecord`, `SwssStats::getInstance()`, `recordTask`/`recordComplete`/`recordError`) and the on-the-wire `SWSS_STATS:
` Redis layout are byte-identical to those introduced in #4434.** Existing consumers keep working without changes. +The whole file is ~130 lines of straightforward delegation. **The public surface (`gSwssStatsRecord`, `SwssStats::getInstance()`, `recordTask`/`recordComplete`/`recordError`) and the on-the-wire `SWSS_STATS:
` Redis layout are deliberately kept narrow and stable so that the SWSS-specific vocabulary remains independent of future evolution of the underlying `ComponentStats` library.** #### 7.10 Adopting the library in a new container @@ -369,7 +374,7 @@ The library reads the table once at construction time. Runtime re-configuration ### 10. Warmboot and Fastboot Design Impact -Counters are kept in process memory and are reset on container restart, including warmboot and fastboot. This matches the existing behaviour of the `SwssStats` introduced in #4434, and is acceptable because consumers (dashboards, alerts) compute rate-of-change rather than absolute values. The OTLP `start_time_unix_nano` attribute advances on every restart, which is the OTel-standard signal for counter reset and is handled natively by OTel-aware consumers. +Counters are kept in process memory and are reset on container restart, including warmboot and fastboot. This is acceptable because consumers (dashboards, alerts) compute rate-of-change rather than absolute values. The OTLP `start_time_unix_nano` attribute advances on every restart, which is the OTel-standard signal for counter reset and is handled natively by OTel-aware consumers. #### Warmboot and Fastboot Performance Impact @@ -385,7 +390,7 @@ No measurable boot-time degradation is expected. - Per-instance footprint: O(entities × metrics) `uint64` slots plus their `std::map` keys. Bounded by the number of orchagent tables (≈ tens) for the SWSS façade. - The OTLP exporter adds a small fixed overhead (one gRPC channel, one per-cycle batch buffer). - When the feature is disabled at runtime via `setEnabled(false)`, the hot path becomes inert and the writer thread's queue stays empty; memory remains bounded. 
-- When the feature is disabled at compile time (the OTLP sink can be compiled out via build option), there is no residual memory cost beyond the symbols of `swss::ComponentStats` itself (the DB sink remains unconditional, matching #4434 behaviour). +- When the feature is disabled at compile time (the OTLP sink can be compiled out via build option), there is no residual memory cost beyond the symbols of `swss::ComponentStats` itself; the DB sink remains unconditional. ### 12. Restrictions/Limitations @@ -412,7 +417,7 @@ Library unit tests live in `sonic-swss-common/tests/componentstats_ut.cpp`: | 8 | SingletonSameName | `create("X")` returns the same instance | | 9 | SingletonDifferentNames | `create("X") ≠ create("Y")` | -The existing `swssstats_ut.cpp` (9 cases) in `sonic-swss` is kept verbatim and continues to pass against the thin façade, proving the public API has not regressed. +A façade-level test suite `swssstats_ut.cpp` (9 cases) is added in `sonic-swss` and exercises the SwssStats vocabulary (`recordTask`/`recordComplete`/`recordError`, `gSwssStatsRecord` enable flag, singleton behaviour) end-to-end against the new backend. Run: From fc2dff1694f6ea940689fcc8da78681ddc76d40b Mon Sep 17 00:00:00 2001 From: Yutong Zhang Date: Tue, 28 Apr 2026 10:47:34 +0800 Subject: [PATCH 3/5] Component Stats HLD: use plain ASCII for facade/naive Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yutong Zhang --- doc/component-stats/component-stats-hld.md | 32 +++++++++++----------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/doc/component-stats/component-stats-hld.md b/doc/component-stats/component-stats-hld.md index 5f47f65b3cc..826eae312a3 100644 --- a/doc/component-stats/component-stats-hld.md +++ b/doc/component-stats/component-stats-hld.md @@ -28,7 +28,7 @@ This HLD specifies a reusable mechanism for exposing **service-level (control-plane software) counters** from SONiC containers. It introduces: 1. 
A new shared library `swss::ComponentStats` in `sonic-swss-common`. -2. A SWSS-specific façade `SwssStats` in `sonic-swss` built on top of that library, which is the first consumer. +2. A SWSS-specific facade `SwssStats` in `sonic-swss` built on top of that library, which is the first consumer. The library publishes counters to: @@ -45,7 +45,7 @@ Configuration of the OTel Collector itself, off-box telemetry endpoints, dashboa | Entity | A logical grouping of metrics inside a component (e.g. an orchagent table, a gNMI path). | | Metric | A named uint64 counter or gauge inside an entity (e.g. `SET`, `DEL`, `COMPLETE`, `ERROR`). | | ComponentStats | The new shared library in `sonic-swss-common` providing the producer mechanism. | -| SwssStats | A SWSS-specific façade over `ComponentStats` (lives in `sonic-swss`). | +| SwssStats | A SWSS-specific facade over `ComponentStats` (lives in `sonic-swss`). | | DB sink | The output path that mirrors counters into `COUNTERS_DB`. | | OTLP sink | The output path that exports counters via OpenTelemetry Protocol to a local OTel Collector. | | OTel Collector | A locally-running OpenTelemetry Collector sidecar; not delivered by this HLD. | @@ -54,14 +54,14 @@ Configuration of the OTel Collector itself, off-box telemetry endpoints, dashboa SONiC already publishes **dataplane** counters via the Flex-Counter framework (`CONFIG_DB / FLEX_COUNTER_TABLE` → `syncd` → `COUNTERS_DB`). What is missing is **service-level** counters — software-side events such as orchagent task throughput, gNMI request rate, BMP message error counts. Without these we cannot answer questions like *"is orchagent draining tasks?"*, *"is gNMI seeing subscribe failures?"*, *"is one container dropping more events than its peers?"*. -A naïve implementation would put this plumbing — atomic counters, dirty tracking, a 1-second writer thread, a Redis-side schema, an OTLP exporter — directly inside each container. 
That is unacceptable: every container would need its own concurrency review, bug fixes would drift, and the on-the-wire schemas would diverge. +A naive implementation would put this plumbing — atomic counters, dirty tracking, a 1-second writer thread, a Redis-side schema, an OTLP exporter — directly inside each container. That is unacceptable: every container would need its own concurrency review, bug fixes would drift, and the on-the-wire schemas would diverge. This HLD specifies a single, reusable producer that: 1. accumulates counters in process-local atomic state with negligible hot-path cost, 2. mirrors them to `COUNTERS_DB` so `redis-cli`, `show ... stats` CLIs, and any other on-box tooling continue to work, 3. emits them as OTLP metrics to a local OTel Collector for forwarding to off-box telemetry systems, -4. exposes a stable public API so each container only needs to write a thin (~100 LoC) façade. +4. exposes a stable public API so each container only needs to write a thin (~100 LoC) facade. ### 5. Requirements @@ -70,9 +70,9 @@ This HLD specifies a single, reusable producer that: - R1. A reusable C++ library shall accumulate per-component, per-entity, per-metric `uint64` counters. - R2. The library shall publish counters to `COUNTERS_DB` under a uniform key layout `_STATS:` (Redis hash, fields = metric names, values = decimal `uint64`). - R3. The library shall publish the same counters as OpenTelemetry OTLP records to a configurable endpoint (default `localhost:4317`). -- R4. The library shall be usable by any SONiC container by writing a thin façade that owns only the container-specific metric vocabulary. -- R5. The first consumer of the library is the SWSS-specific façade `SwssStats` (in `sonic-swss/orchagent/`), which exposes a small SWSS-specific public surface: a global `gSwssStatsRecord` enable flag, `SwssStats::getInstance()`, and `recordTask` / `recordComplete` / `recordError` methods. -- R6. 
The `SwssStats` façade shall write into `COUNTERS_DB` under keys `SWSS_STATS:
` with hash fields `SET` / `DEL` / `COMPLETE` / `ERROR`, following the uniform schema in R2. +- R4. The library shall be usable by any SONiC container by writing a thin facade that owns only the container-specific metric vocabulary. +- R5. The first consumer of the library is the SWSS-specific facade `SwssStats` (in `sonic-swss/orchagent/`), which exposes a small SWSS-specific public surface: a global `gSwssStatsRecord` enable flag, `SwssStats::getInstance()`, and `recordTask` / `recordComplete` / `recordError` methods. +- R6. The `SwssStats` facade shall write into `COUNTERS_DB` under keys `SWSS_STATS:
` with hash fields `SET` / `DEL` / `COMPLETE` / `ERROR`, following the uniform schema in R2. **Non-functional** @@ -85,11 +85,11 @@ This HLD specifies a single, reusable producer that: - The OTel Collector itself, including its image, configuration, exporter pipeline to off-box telemetry systems, authentication, and operator onboarding. - Replacing existing FlexCounter / SAI counter pipelines (those measure dataplane state via SAI; this design measures control-plane software events). -- Defining the metric vocabulary for non-swss containers — that is the job of each container's own façade. +- Defining the metric vocabulary for non-swss containers — that is the job of each container's own facade. ### 6. Architecture Design -The architecture is unchanged at the SONiC system level. A new library is introduced in `sonic-swss-common`, and a new SWSS-specific façade (its first consumer) is added in `sonic-swss`; future containers may add their own façades using the same library. +The architecture is unchanged at the SONiC system level. A new library is introduced in `sonic-swss-common`, and a new SWSS-specific facade (its first consumer) is added in `sonic-swss`; future containers may add their own facades using the same library. ``` ┌────────────────────────────── SONiC switch ──────────────────────────────┐ @@ -131,7 +131,7 @@ The architecture is unchanged at the SONiC system level. A new library is introd └────────────────────┘ ``` -**Layering rule.** `swss-common` knows nothing of orchagent or any specific container; each container knows only its own façade plus `swss::ComponentStats`. New containers get both sinks for free by writing a ~100-line wrapper. +**Layering rule.** `swss-common` knows nothing of orchagent or any specific container; each container knows only its own facade plus `swss::ComponentStats`. New containers get both sinks for free by writing a ~100-line wrapper. 
**Dual-sink design properties.** @@ -147,7 +147,7 @@ The architecture is unchanged at the SONiC system level. A new library is introd | Repository | What changes | |--------------------------------|-----------------------------------------------------------------------------| | `sonic-net/sonic-swss-common` | New library `swss::ComponentStats` + unit tests ([PR #1180](https://github.com/sonic-net/sonic-swss-common/pull/1180)). | -| `sonic-net/sonic-swss` | New `SwssStats` thin façade over `ComponentStats` in `orchagent/` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | +| `sonic-net/sonic-swss` | New `SwssStats` thin facade over `ComponentStats` in `orchagent/` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | | `sonic-net/sonic-buildimage` | Submodule pointer bumps for the two repos above ([PR #26924](https://github.com/sonic-net/sonic-buildimage/pull/26924)). | No platform-specific code is added. No SAI changes. No syncd changes. @@ -309,7 +309,7 @@ Example: `redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE"` → The shape mirrors the existing `COUNTERS:*` keys produced by the Flex-Counter pipeline. -#### 7.9 `SwssStats` thin façade +#### 7.9 `SwssStats` thin facade `SwssStats` (in `sonic-swss/orchagent/`) is reduced to a translation layer that owns only the SWSS-specific vocabulary and the global enable flag consumed by `orch.cpp`: @@ -328,7 +328,7 @@ The whole file is ~130 lines of straightforward delegation. **The public surface #### 7.10 Adopting the library in a new container -To add equivalent metrics to e.g. `gnmi`, write a façade analogous to §7.9: +To add equivalent metrics to e.g. `gnmi`, write a facade analogous to §7.9: ```cpp class GnmiStats { @@ -387,7 +387,7 @@ No measurable boot-time degradation is expected. ### 11. Memory Consumption -- Per-instance footprint: O(entities × metrics) `uint64` slots plus their `std::map` keys. Bounded by the number of orchagent tables (≈ tens) for the SWSS façade. 
+- Per-instance footprint: O(entities × metrics) `uint64` slots plus their `std::map` keys. Bounded by the number of orchagent tables (≈ tens) for the SWSS facade. - The OTLP exporter adds a small fixed overhead (one gRPC channel, one per-cycle batch buffer). - When the feature is disabled at runtime via `setEnabled(false)`, the hot path becomes inert and the writer thread's queue stays empty; memory remains bounded. - When the feature is disabled at compile time (the OTLP sink can be compiled out via build option), there is no residual memory cost beyond the symbols of `swss::ComponentStats` itself; the DB sink remains unconditional. @@ -417,7 +417,7 @@ Library unit tests live in `sonic-swss-common/tests/componentstats_ut.cpp`: | 8 | SingletonSameName | `create("X")` returns the same instance | | 9 | SingletonDifferentNames | `create("X") ≠ create("Y")` | -A façade-level test suite `swssstats_ut.cpp` (9 cases) is added in `sonic-swss` and exercises the SwssStats vocabulary (`recordTask`/`recordComplete`/`recordError`, `gSwssStatsRecord` enable flag, singleton behaviour) end-to-end against the new backend. +A facade-level test suite `swssstats_ut.cpp` (9 cases) is added in `sonic-swss` and exercises the SwssStats vocabulary (`recordTask`/`recordComplete`/`recordError`, `gSwssStatsRecord` enable flag, singleton behaviour) end-to-end against the new backend. Run: @@ -443,4 +443,4 @@ cd sonic-swss-common && ./autogen.sh && ./configure && make check - Phase 1 (this HLD's three PRs) lands the `ComponentStats` library and `SwssStats` refactor with the DB sink fully active and the OTLP sink stubbed (`enableOtlp=false` by default). - Phase 2 implements the OTLP sink against the OpenTelemetry C++ SDK and is gated on the local OTel Collector sidecar being available in `sonic-buildimage`. Coordination with whichever team owns the local OTel Collector image is required before Phase 2 can be enabled by default. 
-- Phase 3 onboards additional SONiC containers (`gnmi`, `bmp`, `telemetry`, …) by adding their own façades. Each is a self-contained PR in the relevant repository. +- Phase 3 onboards additional SONiC containers (`gnmi`, `bmp`, `telemetry`, …) by adding their own facades. Each is a self-contained PR in the relevant repository. From 6f9f85d4e51d00754ac2e5c724613a5760ac83f5 Mon Sep 17 00:00:00 2001 From: Yutong Zhang Date: Tue, 28 Apr 2026 11:03:21 +0800 Subject: [PATCH 4/5] Component Stats HLD: trim out-of-scope items, drop buildimage row, simplify section 9 - Reword non-swss vocabulary out-of-scope item as future work. - Remove the sonic-buildimage submodule row from the repositories table; not needed. - Section 9: collapse Manifest / CLI / CONFIG_DB subsections into a single 'Not applicable' note. - Update Phase 1 wording and system-test bullet to reference two companion PRs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yutong Zhang --- doc/component-stats/component-stats-hld.md | 31 ++++------------------ 1 file changed, 5 insertions(+), 26 deletions(-) diff --git a/doc/component-stats/component-stats-hld.md b/doc/component-stats/component-stats-hld.md index 826eae312a3..4d5559a6e96 100644 --- a/doc/component-stats/component-stats-hld.md +++ b/doc/component-stats/component-stats-hld.md @@ -85,7 +85,7 @@ This HLD specifies a single, reusable producer that: - The OTel Collector itself, including its image, configuration, exporter pipeline to off-box telemetry systems, authentication, and operator onboarding. - Replacing existing FlexCounter / SAI counter pipelines (those measure dataplane state via SAI; this design measures control-plane software events). -- Defining the metric vocabulary for non-swss containers — that is the job of each container's own facade. +- Defining the metric vocabulary for non-swss containers (`gnmi`, `bmp`, `telemetry`, …); this is left as future work. ### 6. 
Architecture Design @@ -148,7 +148,6 @@ The architecture is unchanged at the SONiC system level. A new library is introd |--------------------------------|-----------------------------------------------------------------------------| | `sonic-net/sonic-swss-common` | New library `swss::ComponentStats` + unit tests ([PR #1180](https://github.com/sonic-net/sonic-swss-common/pull/1180)). | | `sonic-net/sonic-swss` | New `SwssStats` thin facade over `ComponentStats` in `orchagent/` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | -| `sonic-net/sonic-buildimage` | Submodule pointer bumps for the two repos above ([PR #26924](https://github.com/sonic-net/sonic-buildimage/pull/26924)). | No platform-specific code is added. No SAI changes. No syncd changes. @@ -350,27 +349,7 @@ No SAI API changes are required for this feature. This design measures control-p ### 9. Configuration and management -#### 9.1 Manifest - -Not applicable. This is a built-in SONiC library, not an Application Extension. - -#### 9.2 CLI/YANG model Enhancements - -No new CLI commands or YANG models are introduced by this HLD. Existing CLIs that already read `COUNTERS_DB` (e.g. `redis-cli -n 2 HGETALL`, `show ... stats` style commands) continue to work and gain visibility into the new `_STATS:` keys for free. - -#### 9.3 Config DB Enhancements - -A future enhancement may add a `COMPONENT_STATS` table in `CONFIG_DB`, keyed by component name, to allow operators to flip individual sinks on/off and to override the OTLP endpoint without rebuilding: - -``` -CONFIG_DB key: COMPONENT_STATS| -fields: enable_db : "true" | "false" - enable_otlp : "true" | "false" - otlp_endpoint : - interval_sec : -``` - -The library reads the table once at construction time. Runtime re-configuration is not in scope for the first cut. +Not applicable. This HLD introduces no new CLI commands, YANG models, manifests, or `CONFIG_DB` schema. Existing CLIs that already read `COUNTERS_DB` (e.g. 
`redis-cli -n 2 HGETALL`, `show ... stats` style commands) continue to work and gain visibility into the new `_STATS:` keys for free. ### 10. Warmboot and Fastboot Design Impact @@ -428,7 +407,7 @@ cd sonic-swss-common && ./autogen.sh && ./configure && make check #### 13.2 System Test cases -- Boot a `sonic-vs` image built with the three companion PRs. +- Boot a `sonic-vs` image built with the two companion PRs. - Exercise orchagent (e.g. `config vlan add`, `config interface ip add`). - Verify on-box DB sink: ``` @@ -441,6 +420,6 @@ cd sonic-swss-common && ./autogen.sh && ./configure && make check ### 14. Open/Action items -- Phase 1 (this HLD's three PRs) lands the `ComponentStats` library and `SwssStats` refactor with the DB sink fully active and the OTLP sink stubbed (`enableOtlp=false` by default). -- Phase 2 implements the OTLP sink against the OpenTelemetry C++ SDK and is gated on the local OTel Collector sidecar being available in `sonic-buildimage`. Coordination with whichever team owns the local OTel Collector image is required before Phase 2 can be enabled by default. +- Phase 1 (this HLD's two PRs) lands the `ComponentStats` library and the `SwssStats` facade with the DB sink fully active and the OTLP sink stubbed (`enableOtlp=false` by default). +- Phase 2 implements the OTLP sink against the OpenTelemetry C++ SDK and is gated on the local OTel Collector sidecar being available on the switch. Coordination with whichever team owns the local OTel Collector image is required before Phase 2 can be enabled by default. - Phase 3 onboards additional SONiC containers (`gnmi`, `bmp`, `telemetry`, …) by adding their own facades. Each is a self-contained PR in the relevant repository. 
From 882b33741d552a288cd7188e08baaf83962be80f Mon Sep 17 00:00:00 2001 From: Yutong Zhang Date: Tue, 12 May 2026 11:06:46 +0800 Subject: [PATCH 5/5] Component Stats HLD: split into Framework + Reporting HLDs Split the previous single component-stats-hld.md into two documents so that responsibilities map cleanly to the teams involved: * component-stats-framework-hld.md (SONiC team): the swss::ComponentStats library, the SwssStats facade pattern, hot path, threading, memory ordering, warmboot, memory and testing for the producer. The DB sink is the only sink documented; OTLP is moved to future work. * component-stats-reporting-hld.md (SONiC team, contract with NDM): the COUNTERS_DB schema (key layout, hash fields, idle suppression) and SWSS-specific vocabulary, plus conventions for future components. The reporting transport (telegraf -> mdm -> Geneva) is owned by the NDM HLD and referenced here, not duplicated. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Yutong Zhang --- .../component-stats-framework-hld.md | 376 ++++++++++++++++ doc/component-stats/component-stats-hld.md | 425 ------------------ .../component-stats-reporting-hld.md | 258 +++++++++++ 3 files changed, 634 insertions(+), 425 deletions(-) create mode 100644 doc/component-stats/component-stats-framework-hld.md delete mode 100644 doc/component-stats/component-stats-hld.md create mode 100644 doc/component-stats/component-stats-reporting-hld.md diff --git a/doc/component-stats/component-stats-framework-hld.md b/doc/component-stats/component-stats-framework-hld.md new file mode 100644 index 00000000000..3ab58e55cf4 --- /dev/null +++ b/doc/component-stats/component-stats-framework-hld.md @@ -0,0 +1,376 @@ +# SONiC Component Statistics — Framework HLD + +## Table of Content + +- [Revision](#1-revision) +- [Scope](#2-scope) +- [Definitions/Abbreviations](#3-definitionsabbreviations) +- [Overview](#4-overview) +- [Requirements](#5-requirements) +- [Architecture 
Design](#6-architecture-design) +- [High-Level Design](#7-high-level-design) +- [SAI API](#8-sai-api) +- [Configuration and management](#9-configuration-and-management) +- [Warmboot and Fastboot Design Impact](#10-warmboot-and-fastboot-design-impact) +- [Memory Consumption](#11-memory-consumption) +- [Restrictions/Limitations](#12-restrictionslimitations) +- [Testing Requirements/Design](#13-testing-requirementsdesign) +- [Open/Action items](#14-openaction-items) + +### 1. Revision + +| Rev | Date | Author | Change Description | +|-----|------------|---------------|------------------------------------------------------| +| 0.1 | 2026-04-28 | Yutong Zhang | Initial revision | +| 0.2 | 2026-05-12 | Yutong Zhang | Split out the reporting pipeline into a separate HLD | + +### 2. Scope + +This HLD specifies a reusable producer-side mechanism for **service-level (control-plane software) counters** in SONiC containers. It introduces: + +1. A new shared library `swss::ComponentStats` in `sonic-swss-common`. +2. A SWSS-specific facade `SwssStats` in `sonic-swss` built on top of that library, which is the first consumer. + +The library publishes counters into `COUNTERS_DB` so that: + +- on-box diagnostic tooling (`redis-cli`, `show ... stats`) keeps working with no new transport, and +- off-box telemetry consumers can pick the counters up via the reporting pipeline described in the companion HLD. + +**This HLD owns the producer side only**: the library, the facade pattern, the hot-path / threading / memory-ordering design, and warmboot / memory / testing concerns for the library itself. The reporting pipeline (how counters travel from `COUNTERS_DB` to Geneva or other off-box telemetry systems) is specified in the companion HLD: + +- [Component Statistics — Reporting HLD](./component-stats-reporting-hld.md) + +### 3. 
Definitions/Abbreviations + +| Term | Definition | +|-----------------|---------------------------------------------------------------------------------------------| +| Component | A SONiC container that produces service-level counters (e.g. `swss`, `gnmi`, `bmp`). | +| Entity | A logical grouping of metrics inside a component (e.g. an orchagent table, a gNMI path). | +| Metric | A named uint64 counter or gauge inside an entity (e.g. `SET`, `DEL`, `COMPLETE`, `ERROR`). | +| ComponentStats | The new shared library in `sonic-swss-common` providing the producer mechanism. | +| SwssStats | A SWSS-specific facade over `ComponentStats` (lives in `sonic-swss`). | +| DB sink | The output path that mirrors counters into `COUNTERS_DB`. | + +### 4. Overview + +SONiC already publishes **dataplane** counters via the Flex-Counter framework (`CONFIG_DB / FLEX_COUNTER_TABLE` -> `syncd` -> `COUNTERS_DB`). What is missing is **service-level** counters — software-side events such as orchagent task throughput, gNMI request rate, BMP message error counts. Without these we cannot answer questions like *"is orchagent draining tasks?"*, *"is gNMI seeing subscribe failures?"*, *"is one container dropping more events than its peers?"*. + +A naive implementation would put this plumbing — atomic counters, dirty tracking, a 1-second writer thread, and a Redis-side schema — directly inside each container. That is unacceptable: every container would need its own concurrency review, bug fixes would drift, and the on-the-wire schemas would diverge. + +This HLD specifies a single, reusable producer that: + +1. accumulates counters in process-local atomic state with negligible hot-path cost, +2. mirrors them to `COUNTERS_DB` so `redis-cli`, `show ... stats` CLIs, and any other on-box tooling continue to work, +3. exposes a stable public API so each container only needs to write a thin (~100 LoC) facade. 
+ +How the `COUNTERS_DB` rows then reach Geneva or any other off-box system is the responsibility of the [Reporting HLD](./component-stats-reporting-hld.md). + +### 5. Requirements + +**Functional** + +- R1. A reusable C++ library shall accumulate per-component, per-entity, per-metric `uint64` counters. +- R2. The library shall publish counters to `COUNTERS_DB` under a uniform key layout `_STATS:` (Redis hash, fields = metric names, values = decimal `uint64`). The exact key/field contract is normatively defined in the Reporting HLD. +- R3. The library shall be usable by any SONiC container by writing a thin facade that owns only the container-specific metric vocabulary. +- R4. The first consumer of the library is the SWSS-specific facade `SwssStats` (in `sonic-swss/orchagent/`), which exposes a small SWSS-specific public surface: a global `gSwssStatsRecord` enable flag, `SwssStats::getInstance()`, and `recordTask` / `recordComplete` / `recordError` methods. +- R5. The `SwssStats` facade shall write into `COUNTERS_DB` under keys `SWSS_STATS:
` with hash fields `SET` / `DEL` / `COMPLETE` / `ERROR`, following the uniform schema in R2. + +**Non-functional** + +- R6. The hot path (`increment` / `setValue`) shall be lock-free and constant-time after the first use of a given (entity, metric) pair. +- R7. Construction of a `ComponentStats` instance shall not crash the host process if Redis is not yet reachable; the sink shall connect lazily and retry independently. +- R8. A failure in the sink (Redis down) shall not affect the hot path. After recovery, no monotonic data point shall be lost beyond intermediate samples (the next successful flush carries the latest cumulative value). +- R9. Idle systems shall produce zero outbound traffic on the sink (driven by per-entity dirty tracking). + +**Out of scope** + +- The reporting pipeline that consumes the `COUNTERS_DB` rows (telegraf, mdm, Geneva, etc.) — see the [Reporting HLD](./component-stats-reporting-hld.md). +- Replacing existing FlexCounter / SAI counter pipelines (those measure dataplane state via SAI; this design measures control-plane software events). +- Defining the metric vocabulary for non-swss containers (`gnmi`, `bmp`, `telemetry`, …); this is left as future work. + +### 6. Architecture Design + +The architecture is unchanged at the SONiC system level. A new library is introduced in `sonic-swss-common`, and a new SWSS-specific facade (its first consumer) is added in `sonic-swss`; future containers may add their own facades using the same library. + +``` ++---------------------------- SONiC switch ------------------------------+ +| | +| orchagent (sonic-swss) gnmi / bmp / telemetry / ... | +| +----------------------+ +----------------------+ | +| | orch.cpp + SwssStats | ... 
| gnmistats / bmpstats | | +| +----------+-----------+ +----------+-----------+ | +| | instrument | | +| v v | +| +----------------------------------------------------------+ | +| | swss::ComponentStats (in libswsscommon) | | +| | +---------------------------------------------------+ | | +| | | atomic counters + dirty tracking + writer thread | | | +| | +-------------------------+-------------------------+ | | +| | | | | +| | DB sink | | +| | (Redis HSET via swss::Table) | | +| +-----------------------------+----------------------------+ | +| | | +| v | +| +-------------------------+ | +| | COUNTERS_DB | | +| | SWSS_STATS:PORT_TABLE | | +| | GNMI_STATS:/iface/... | | +| | BMP_STATS:... | | +| | | | +| | used by: | | +| | - redis-cli | | +| | - show stats CLI | | +| | - reporting pipeline | --> see Reporting HLD | +| +-------------------------+ | ++------------------------------------------------------------------------+ +``` + +**Layering rule.** `swss-common` knows nothing of orchagent or any specific container; each container knows only its own facade plus `swss::ComponentStats`. New containers get the sink for free by writing a ~100-line wrapper. + +**Sink design properties.** + +- *One source of truth.* The sink consumes the atomic-counter snapshot inside `ComponentStats`. +- *No new transport for local debugging.* The `COUNTERS_DB` layout follows the existing convention so `redis-cli`, `show ... stats` CLIs, and any in-band tooling keep working. +- *Sink isolation from hot path.* Failures in the sink (Redis unreachable) do not affect the hot path; they are logged and retried. + +### 7. High-Level Design + +#### 7.1 Repositories changed + +| Repository | What changes | +|--------------------------------|-----------------------------------------------------------------------------| +| `sonic-net/sonic-swss-common` | New library `swss::ComponentStats` + unit tests ([PR #1180](https://github.com/sonic-net/sonic-swss-common/pull/1180)). 
| +| `sonic-net/sonic-swss` | New `SwssStats` thin facade over `ComponentStats` in `orchagent/` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | + +No platform-specific code is added. No SAI changes. No syncd changes. + +#### 7.2 `swss::ComponentStats` - public API + +```cpp +namespace swss { + +class ComponentStats { +public: + // Metric name -> current value for one entity. + using CounterSnapshot = std::map<std::string, uint64_t>; + + // Sink configuration. The DB sink is on by default; additional + // sinks (e.g. OTLP) may be added by future revisions and are kept + // off by default. + struct SinkConfig { + bool enableDb = true; // mirror to COUNTERS_DB + }; + + static std::shared_ptr<ComponentStats> create( + const std::string& componentName, + const std::string& dbName = "COUNTERS_DB", + uint32_t intervalSec = 1, + const SinkConfig& sinks = SinkConfig{}); + + void increment(const std::string& entity, const std::string& metric, uint64_t n = 1); + void setValue (const std::string& entity, const std::string& metric, uint64_t value); + + uint64_t get (const std::string& entity, const std::string& metric); + CounterSnapshot getAll(const std::string& entity); + + void setEnabled(bool on); + bool isEnabled() const; + void stop(); +}; + +} // namespace swss +``` + +`create()` consults a process-wide registry keyed by `componentName`. A second call with the same name returns the existing instance, ensuring containers cannot accidentally start multiple writer threads against the same Redis prefix. + +#### 7.3 Internal state + +Per instance: +- `m_entities : std::map<std::string, EntityStats>` - keyed by entity name; `std::map` (not `unordered_map`) so references returned by `getOrCreateEntity` remain valid after later inserts. +- `EntityStats` holds a map from metric name to a heap-allocated counter (heap-allocated because `std::atomic` is not movable) plus a per-entity atomic `version` counter. +- `m_mutex` guards only the **structure** of the maps (insert/find). Hot-path reads/writes of counter values use `std::atomic` and skip the mutex after the first use. +- `m_running`, `m_enabled` - atomic flags. 
+- `m_cv` — wakes the writer thread immediately on `stop()` instead of waiting up to `intervalSec`. +- `m_thread` — owns the writer. + +Process-wide: +- `registry : std::map>` (`weak_ptr` so a fully released instance can be destroyed). + +#### 7.4 Hot path + +```cpp +void ComponentStats::increment(const string& entity, const string& metric, uint64_t n) { + if (!isEnabled() || n == 0) return; + + auto& e = getOrCreateEntity(entity); // mutex on first use only + auto& c = getOrCreateCounter(e, metric); // mutex on first use only + + c.value .fetch_add(n, memory_order_relaxed); // (1) counter + e.version.fetch_add(1, memory_order_release); // (2) dirty-bump (release) +} +``` + +Cost after warm-up: two atomic RMWs. No mutex acquisition, no allocation, no syscall. + +#### 7.5 Writer thread + +Runs at `intervalSec` (default 1 s) and flushes the snapshot to the DB sink: + +``` ++---------------------------------------------------------------+ +| Phase A - connect the DB sink (run once, with retry) | +| loop until m_running == false: | +| if !dbConnected: try connect Redis | +| if connected: break | +| else cv.wait_for(intervalSec, predicate=!m_running) | ++---------------------------------------------------------------+ ++---------------------------------------------------------------+ +| Phase B - flush loop | +| loop: | +| cv.wait_for(intervalSec, predicate=!m_running) | +| if !m_running: break | +| | +| # SNAPSHOT (under lock) | +| for each entity e in m_entities: | +| v = e.version.load(acquire) <- pairs (2) | +| if lastVersion[e.name] == v: continue (skip clean) | +| lastVersion[e.name] = v | +| row = [(metric, c.value.load(relaxed)) for c in e] | +| enqueue(name, row) | +| | +| # FAN-OUT (lock released) | +| for (name, row) in queue: | +| try: m_table->set(name, stringify(row)) | +| catch: log warn, continue | ++---------------------------------------------------------------+ +``` + +Three properties: + +1. 
*Lock released before any I/O.* Round-trips under the structural lock would briefly stall every concurrent `increment()`. +2. *Idle systems generate zero outbound traffic.* When no entity has changed, the queue is empty and the sink is not touched. +3. *Hot-path isolation.* A sink failure is logged and skipped; the hot path is never blocked. + +#### 7.6 Memory ordering correctness + +The release/acquire pair ((2) in 7.4 ↔ acquire-load in 7.5) guarantees: + +> If the writer reads `version == N`, then every counter mutation that contributed to bumping the version up to `N` has already happened-before the reader and is visible. + +Without it, on weakly ordered architectures (ARM, POWER) the writer could see the new version but read an old counter value, recording a stale snapshot. + +#### 7.7 `SwssStats` thin facade + +`SwssStats` (in `sonic-swss/orchagent/`) is reduced to a translation layer that owns only the SWSS-specific vocabulary and the global enable flag consumed by `orch.cpp`: + +```cpp +SwssStats::SwssStats() : m_impl(swss::ComponentStats::create("SWSS")) {} + +void SwssStats::recordTask(const std::string& t, const std::string& op) { + if (op == "SET") m_impl->increment(t, "SET"); + else if (op == "DEL") m_impl->increment(t, "DEL"); +} +void SwssStats::recordComplete(const std::string& t, uint64_t n) { m_impl->increment(t, "COMPLETE", n); } +void SwssStats::recordError (const std::string& t, uint64_t n) { m_impl->increment(t, "ERROR", n); } +``` + +The whole file is ~130 lines of straightforward delegation. **The public surface (`gSwssStatsRecord`, `SwssStats::getInstance()`, `recordTask`/`recordComplete`/`recordError`) and the on-the-wire `SWSS_STATS:
` Redis layout are deliberately kept narrow and stable so that the SWSS-specific vocabulary remains independent of future evolution of the underlying `ComponentStats` library.** + +The exact `SWSS_STATS:
` schema (key layout, field names, types) is documented in the [Reporting HLD](./component-stats-reporting-hld.md), which owns the contract with downstream consumers. + +#### 7.8 Adopting the library in a new container + +To add equivalent metrics to e.g. `gnmi`, write a facade analogous to §7.7: + +```cpp +class GnmiStats { +public: + static GnmiStats* getInstance(); + void recordSubscribe(const std::string& path) { m_impl->increment(path, "SUBSCRIBE"); } + void recordError (const std::string& path) { m_impl->increment(path, "ERROR"); } +private: + GnmiStats() : m_impl(swss::ComponentStats::create("GNMI")) {} + std::shared_ptr<swss::ComponentStats> m_impl; +}; +``` + +Result: counters land in `COUNTERS_DB` under keys `GNMI_STATS:<path>`. No new threads, no new Redis client management, no new test harness needed. Reporting then picks them up automatically via the pipeline described in the Reporting HLD. + +### 8. SAI API + +No SAI API changes are required for this feature. This design measures control-plane software events inside SONiC containers; it does not query or modify any SAI state. + +### 9. Configuration and management + +Not applicable. This HLD introduces no new CLI commands, YANG models, manifests, or `CONFIG_DB` schema. Existing CLIs that already read `COUNTERS_DB` (e.g. `redis-cli -n 2 HGETALL`, `show ... stats` style commands) continue to work and gain visibility into the new `<COMPONENT>_STATS:<entity>` keys for free. + +### 10. Warmboot and Fastboot Design Impact + +Counters are kept in process memory and are reset on container restart, including warmboot and fastboot. This is acceptable because consumers (dashboards, alerts) compute rate-of-change rather than absolute values. + +#### Warmboot and Fastboot Performance Impact + +- The library does **not** add any stalls, sleeps, or I/O operations to the boot critical chain. Construction is non-blocking; the writer thread connects to Redis lazily and retries in the background, so a not-yet-ready dependency cannot delay container start. 
+- No CPU-heavy processing (Jinja templates, etc.) is added in the boot path. +- No third-party dependency is updated by this HLD. +- The library does not delay any service or Docker container. + +No measurable boot-time degradation is expected. + +### 11. Memory Consumption + +- Per-instance footprint: O(entities × metrics) `uint64` slots plus their `std::map` keys. Bounded by the number of orchagent tables (≈ tens) for the SWSS facade. +- When the feature is disabled at runtime via `setEnabled(false)`, the hot path becomes inert and the writer thread's queue stays empty; memory remains bounded. + +### 12. Restrictions/Limitations + +- Counters reset to zero on container restart by design. Consumers must compute rate-of-change rather than rely on absolute values across restarts. +- The library does not retain history; it relies on downstream consumers (`COUNTERS_DB` readers, the reporting pipeline) for retention. +- The structural mutex (`m_mutex`) is acquired only on the *first* use of a given (entity, metric) pair. Workloads that constantly mint new entity names will see one mutex acquisition per new name; this is not the expected pattern for SONiC containers. + +### 13. 
Testing Requirements/Design + +#### 13.1 Unit Test cases + +Library unit tests live in `sonic-swss-common/tests/componentstats_ut.cpp`: + +| # | Test | What it proves | +|---|----------------------------|---------------------------------------------------------------------------------------------| +| 1 | BasicIncrement | `increment` + `get` round-trip | +| 2 | MultipleMetrics | metric isolation within an entity | +| 3 | MultipleEntities | entity isolation within a component | +| 4 | SetValueOverwrites | gauge semantics | +| 5 | DisabledIsNoOp | `setEnabled(false)` makes hot path inert | +| 6 | GetAllReturnsSnapshot | bulk read returns the right shape | +| 7 | ConcurrentIncrements | 8 threads × 10 000 increments → exactly 80 000 (no torn writes, no lost updates) | +| 8 | SingletonSameName | `create("X")` returns the same instance | +| 9 | SingletonDifferentNames | `create("X") ≠ create("Y")` | + +A facade-level test suite `swssstats_ut.cpp` (9 cases) is added in `sonic-swss` and exercises the SwssStats vocabulary (`recordTask`/`recordComplete`/`recordError`, `gSwssStatsRecord` enable flag, singleton behaviour) end-to-end against the new backend. + +Run: + +``` +cd sonic-swss-common && ./autogen.sh && ./configure && make check +./tests/tests --gtest_filter='ComponentStats*' +``` + +#### 13.2 System Test cases + +- Boot a `sonic-vs` image built with the two companion PRs. +- Exercise orchagent (e.g. `config vlan add`, `config interface ip add`). +- Verify on-box DB sink: + ``` + redis-cli -n 2 KEYS "SWSS_STATS:*" + redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE" + ``` + Counters increment in proportion to operations; idle dwell shows zero further writes (dirty tracking working). +- Confirm warmboot and fastboot are unaffected (no boot-time regression, no service startup ordering change). + +End-to-end validation of the reporting path (telegraf → mdm → Geneva) is covered in the [Reporting HLD](./component-stats-reporting-hld.md). + +### 14. 
Open/Action items + +- Phase 1 (this HLD's two PRs) lands the `ComponentStats` library and the `SwssStats` facade with the DB sink fully active. +- Phase 2 onboards additional SONiC containers (`gnmi`, `bmp`, `telemetry`, …) by adding their own facades. Each is a self-contained PR in the relevant repository. +- Phase 3 (future) may add direct OTLP export from the library to a local agent for components that need lower reporting latency than the DB → telegraf path provides. Out of scope for this HLD. diff --git a/doc/component-stats/component-stats-hld.md b/doc/component-stats/component-stats-hld.md deleted file mode 100644 index 4d5559a6e96..00000000000 --- a/doc/component-stats/component-stats-hld.md +++ /dev/null @@ -1,425 +0,0 @@ -# SONiC Component Statistics HLD - -## Table of Content - -- [Revision](#1-revision) -- [Scope](#2-scope) -- [Definitions/Abbreviations](#3-definitionsabbreviations) -- [Overview](#4-overview) -- [Requirements](#5-requirements) -- [Architecture Design](#6-architecture-design) -- [High-Level Design](#7-high-level-design) -- [SAI API](#8-sai-api) -- [Configuration and management](#9-configuration-and-management) -- [Warmboot and Fastboot Design Impact](#10-warmboot-and-fastboot-design-impact) -- [Memory Consumption](#11-memory-consumption) -- [Restrictions/Limitations](#12-restrictionslimitations) -- [Testing Requirements/Design](#13-testing-requirementsdesign) -- [Open/Action items](#14-openaction-items) - -### 1. Revision - -| Rev | Date | Author | Change Description | -|-----|------------|---------------|--------------------------| -| 0.1 | 2026-04-28 | Yutong Zhang | Initial revision | - -### 2. Scope - -This HLD specifies a reusable mechanism for exposing **service-level (control-plane software) counters** from SONiC containers. It introduces: - -1. A new shared library `swss::ComponentStats` in `sonic-swss-common`. -2. A SWSS-specific facade `SwssStats` in `sonic-swss` built on top of that library, which is the first consumer. 
- -The library publishes counters to: - -1. `COUNTERS_DB`, for parity with the existing Flex-Counter pipeline and for on-box diagnostic tooling (`redis-cli`, `show ... stats`). -2. A local OpenTelemetry (OTLP) Collector sidecar, so the same counters can be forwarded to off-box telemetry systems (e.g. Geneva mdm) that consume OTLP. - -Configuration of the OTel Collector itself, off-box telemetry endpoints, dashboards, and alerts are explicitly **out of scope** for this HLD. - -### 3. Definitions/Abbreviations - -| Term | Definition | -|-----------------|---------------------------------------------------------------------------------------------| -| Component | A SONiC container that produces service-level counters (e.g. `swss`, `gnmi`, `bmp`). | -| Entity | A logical grouping of metrics inside a component (e.g. an orchagent table, a gNMI path). | -| Metric | A named uint64 counter or gauge inside an entity (e.g. `SET`, `DEL`, `COMPLETE`, `ERROR`). | -| ComponentStats | The new shared library in `sonic-swss-common` providing the producer mechanism. | -| SwssStats | A SWSS-specific facade over `ComponentStats` (lives in `sonic-swss`). | -| DB sink | The output path that mirrors counters into `COUNTERS_DB`. | -| OTLP sink | The output path that exports counters via OpenTelemetry Protocol to a local OTel Collector. | -| OTel Collector | A locally-running OpenTelemetry Collector sidecar; not delivered by this HLD. | - -### 4. Overview - -SONiC already publishes **dataplane** counters via the Flex-Counter framework (`CONFIG_DB / FLEX_COUNTER_TABLE` → `syncd` → `COUNTERS_DB`). What is missing is **service-level** counters — software-side events such as orchagent task throughput, gNMI request rate, BMP message error counts. Without these we cannot answer questions like *"is orchagent draining tasks?"*, *"is gNMI seeing subscribe failures?"*, *"is one container dropping more events than its peers?"*. 
- -A naive implementation would put this plumbing — atomic counters, dirty tracking, a 1-second writer thread, a Redis-side schema, an OTLP exporter — directly inside each container. That is unacceptable: every container would need its own concurrency review, bug fixes would drift, and the on-the-wire schemas would diverge. - -This HLD specifies a single, reusable producer that: - -1. accumulates counters in process-local atomic state with negligible hot-path cost, -2. mirrors them to `COUNTERS_DB` so `redis-cli`, `show ... stats` CLIs, and any other on-box tooling continue to work, -3. emits them as OTLP metrics to a local OTel Collector for forwarding to off-box telemetry systems, -4. exposes a stable public API so each container only needs to write a thin (~100 LoC) facade. - -### 5. Requirements - -**Functional** - -- R1. A reusable C++ library shall accumulate per-component, per-entity, per-metric `uint64` counters. -- R2. The library shall publish counters to `COUNTERS_DB` under a uniform key layout `_STATS:` (Redis hash, fields = metric names, values = decimal `uint64`). -- R3. The library shall publish the same counters as OpenTelemetry OTLP records to a configurable endpoint (default `localhost:4317`). -- R4. The library shall be usable by any SONiC container by writing a thin facade that owns only the container-specific metric vocabulary. -- R5. The first consumer of the library is the SWSS-specific facade `SwssStats` (in `sonic-swss/orchagent/`), which exposes a small SWSS-specific public surface: a global `gSwssStatsRecord` enable flag, `SwssStats::getInstance()`, and `recordTask` / `recordComplete` / `recordError` methods. -- R6. The `SwssStats` facade shall write into `COUNTERS_DB` under keys `SWSS_STATS:
` with hash fields `SET` / `DEL` / `COMPLETE` / `ERROR`, following the uniform schema in R2. - -**Non-functional** - -- R7. The hot path (`increment` / `setValue`) shall be lock-free and constant-time after the first use of a given (entity, metric) pair. -- R8. Construction of a `ComponentStats` instance shall not crash the host process if Redis or the OTel Collector is not yet reachable; both sinks shall connect lazily and retry independently. -- R9. A failure in one sink (Redis down, OTel Collector restarting) shall not affect the other sink and shall not affect the hot path. After recovery, no monotonic data point shall be lost beyond intermediate samples (the next successful flush carries the latest cumulative value). -- R10. Idle systems shall produce zero outbound traffic on either sink (driven by per-entity dirty tracking). - -**Out of scope** - -- The OTel Collector itself, including its image, configuration, exporter pipeline to off-box telemetry systems, authentication, and operator onboarding. -- Replacing existing FlexCounter / SAI counter pipelines (those measure dataplane state via SAI; this design measures control-plane software events). -- Defining the metric vocabulary for non-swss containers (`gnmi`, `bmp`, `telemetry`, …); this is left as future work. - -### 6. Architecture Design - -The architecture is unchanged at the SONiC system level. A new library is introduced in `sonic-swss-common`, and a new SWSS-specific facade (its first consumer) is added in `sonic-swss`; future containers may add their own facades using the same library. 
- -``` -┌────────────────────────────── SONiC switch ──────────────────────────────┐ -│ │ -│ orchagent (sonic-swss) gnmi / bmp / telemetry / … │ -│ ┌──────────────────────┐ ┌──────────────────────┐ │ -│ │ orch.cpp + SwssStats │ … │ gnmistats / bmpstats │ │ -│ └──────────┬───────────┘ └──────────┬───────────┘ │ -│ │ instrument │ │ -│ ▼ ▼ │ -│ ┌────────────────────────────────────────────────────────────┐ │ -│ │ swss::ComponentStats (in libswsscommon) │ │ -│ │ ┌─────────────────────────────────────────────────────┐ │ │ -│ │ │ atomic counters + dirty tracking + writer thread │ │ │ -│ │ └──────────────┬──────────────────────────┬───────────┘ │ │ -│ │ │ │ │ │ -│ │ DB sink OTLP sink │ │ -│ │ (Redis HSET via swss::Table) (OTLP/gRPC, localhost) │ │ -│ └──────────┬──────────────────────────────────┬──────────────┘ │ -│ │ │ │ -│ ▼ ▼ │ -│ ┌──────────────────────────┐ ┌────────────────────────────┐ │ -│ │ COUNTERS_DB │ │ Local OTel Collector │ │ -│ │ SWSS_STATS:PORT_TABLE │ │ (sidecar container) │ │ -│ │ GNMI_STATS:/iface/… │ │ │ │ -│ │ BMP_STATS:… │ │ batches, retries, adds │ │ -│ │ │ │ resource attrs, exports │ │ -│ │ used by: redis-cli, │ │ to off-box telemetry │ │ -│ │ show stats CLI, local │ └─────────────┬──────────────┘ │ -│ │ diagnostic tools │ │ │ -│ └──────────────────────────┘ │ │ -│ │ OTLP │ -└──────────────────────────────────────────────────┼───────────────────────┘ - │ - ▼ - ┌────────────────────┐ - │ Off-box telemetry │ - │ (e.g. Geneva mdm) │ - └────────────────────┘ -``` - -**Layering rule.** `swss-common` knows nothing of orchagent or any specific container; each container knows only its own facade plus `swss::ComponentStats`. New containers get both sinks for free by writing a ~100-line wrapper. - -**Dual-sink design properties.** - -- *One source of truth.* Both sinks consume the same atomic-counter snapshot inside `ComponentStats`. They cannot diverge: if the OTel pipeline is briefly down, `COUNTERS_DB` still reflects current state, and vice versa. 
-- *No new transport for local debugging.* The `COUNTERS_DB` layout is unchanged, so `redis-cli`, `show ... stats` CLIs, and any existing in-band tooling keep working. -- *No off-box-system-specific code in containers.* Containers know only `ComponentStats`; the OTLP sink talks to a local OTel Collector at `localhost:4317`, and the Collector handles everything beyond that hop. -- *Independent failure domains.* Failures in one sink (DB unreachable, OTel agent restarting) do not affect the other or the hot path. - -### 7. High-Level Design - -#### 7.1 Repositories changed - -| Repository | What changes | -|--------------------------------|-----------------------------------------------------------------------------| -| `sonic-net/sonic-swss-common` | New library `swss::ComponentStats` + unit tests ([PR #1180](https://github.com/sonic-net/sonic-swss-common/pull/1180)). | -| `sonic-net/sonic-swss` | New `SwssStats` thin facade over `ComponentStats` in `orchagent/` ([PR #4516](https://github.com/sonic-net/sonic-swss/pull/4516)). | - -No platform-specific code is added. No SAI changes. No syncd changes. - -#### 7.2 `swss::ComponentStats` — public API - -```cpp -namespace swss { - -class ComponentStats { -public: - using CounterSnapshot = std::map<std::string, uint64_t>; - - // Sink configuration. Both sinks default to "on". 
- struct SinkConfig { - bool enableDb = true; // mirror to COUNTERS_DB - bool enableOtlp = true; // export to local OTel Collector - std::string otlpEndpoint = "localhost:4317"; // OTLP/gRPC endpoint - std::string serviceName; // OTel resource attr (default: componentName) - std::string serviceInstanceId; // OTel resource attr (default: hostname) - }; - - static std::shared_ptr<ComponentStats> create( - const std::string& componentName, - const std::string& dbName = "COUNTERS_DB", - uint32_t intervalSec = 1, - const SinkConfig& sinks = SinkConfig{}); - - void increment(const std::string& entity, const std::string& metric, uint64_t n = 1); - void setValue (const std::string& entity, const std::string& metric, uint64_t value); - - uint64_t get (const std::string& entity, const std::string& metric); - CounterSnapshot getAll(const std::string& entity); - - void setEnabled(bool on); - bool isEnabled() const; - void stop(); -}; - -} // namespace swss -``` - -`create()` consults a process-wide registry keyed by `componentName`. A second call with the same name returns the existing instance, ensuring containers cannot accidentally start multiple writer threads against the same Redis prefix. - -#### 7.3 Internal state - -Per instance: -- `m_entities : std::map<std::string, EntityStats>` — `std::map` (not `unordered_map`) so references returned by `getOrCreateEntity` remain valid after later inserts. -- `EntityStats` holds a map from metric name to a heap-allocated atomic counter (heap-allocated because `std::atomic` is not movable) plus a per-entity atomic `version` counter. -- `m_mutex` guards only the **structure** of the maps (insert/find). Hot-path reads/writes of counter values use `std::atomic` and skip the mutex after the first use. -- `m_running`, `m_enabled` — atomic flags. -- `m_cv` — wakes the writer thread immediately on `stop()` instead of waiting up to `intervalSec`. -- `m_thread` — owns the writer. - -Process-wide: -- `registry : std::map<std::string, std::weak_ptr<ComponentStats>>` (`weak_ptr` so a fully released instance can be destroyed). 
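The per-instance state listed above could be laid out roughly as follows. This is a minimal sketch under stated assumptions, not the library's code: `Counter`, `StatsStore`, and the member layout are hypothetical names chosen for illustration. For simplicity the sketch takes the mutex on every lookup, whereas the design described here skips it after the first use of an (entity, metric) pair.

```cpp
#include <atomic>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical counter slot; the real library may use a different type.
struct Counter {
    std::atomic<uint64_t> value{0};
};

// Hypothetical per-entity record: metric slots plus a dirty-tracking version.
struct EntityStats {
    // unique_ptr because std::atomic can be neither copied nor moved.
    std::map<std::string, std::unique_ptr<Counter>> counters;
    std::atomic<uint64_t> version{0};  // bumped on every mutation
};

// Hypothetical stand-in for the state held by one ComponentStats instance.
class StatsStore {
public:
    void increment(const std::string& entity, const std::string& metric,
                   uint64_t n = 1) {
        EntityStats& e = getOrCreateEntity(entity);
        Counter& c = getOrCreateCounter(e, metric);
        c.value.fetch_add(n, std::memory_order_relaxed);    // counter update
        e.version.fetch_add(1, std::memory_order_release);  // mark entity dirty
    }

    // Returns 0 for an unknown entity or metric.
    uint64_t get(const std::string& entity, const std::string& metric) {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto eit = m_entities.find(entity);
        if (eit == m_entities.end()) return 0;
        auto cit = eit->second.counters.find(metric);
        if (cit == eit->second.counters.end()) return 0;
        return cit->second->value.load(std::memory_order_relaxed);
    }

private:
    EntityStats& getOrCreateEntity(const std::string& name) {
        std::lock_guard<std::mutex> lock(m_mutex);
        // std::map is node-based, so this reference survives later inserts.
        return m_entities[name];
    }

    Counter& getOrCreateCounter(EntityStats& e, const std::string& metric) {
        std::lock_guard<std::mutex> lock(m_mutex);
        std::unique_ptr<Counter>& slot = e.counters[metric];
        if (!slot) slot = std::make_unique<Counter>();
        return *slot;
    }

    std::mutex m_mutex;  // guards map structure only, never counter values
    std::map<std::string, EntityStats> m_entities;
};
```

Because entities and counters are never erased in this sketch, the references handed out by the two `getOrCreate*` helpers stay valid for the lifetime of the store, which is what lets the counter updates happen outside the lock.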
- -#### 7.4 Hot path - -```cpp -void ComponentStats::increment(const string& entity, const string& metric, uint64_t n) { - if (!isEnabled() || n == 0) return; - - auto& e = getOrCreateEntity(entity); // mutex on first use only - auto& c = getOrCreateCounter(e, metric); // mutex on first use only - - c.value .fetch_add(n, memory_order_relaxed); // ① counter - e.version.fetch_add(1, memory_order_release); // ② dirty-bump (release) -} -``` - -Cost after warm-up: two atomic RMWs. No mutex acquisition, no allocation, no syscall. - -#### 7.5 Writer thread - -Runs at `intervalSec` (default 1 s) and fans the snapshot out to both sinks: - -``` -┌───────────────────────────────────────────────────────────────┐ -│ Phase A — connect each enabled sink (run once, with retry) │ -│ loop until m_running == false: │ -│ if enableDb and !dbConnected: try connect Redis │ -│ if enableOtlp and !otlpConnected: try open OTLP exporter │ -│ if all enabled sinks connected: break │ -│ else cv.wait_for(intervalSec, predicate=!m_running) │ -└───────────────────────────────────────────────────────────────┘ -┌───────────────────────────────────────────────────────────────┐ -│ Phase B — flush loop │ -│ loop: │ -│ cv.wait_for(intervalSec, predicate=!m_running) │ -│ if !m_running: break │ -│ │ -│ # SNAPSHOT (under lock) — single snapshot, two sinks │ -│ for each entity e in m_entities: │ -│ v = e.version.load(acquire) ← pairs ② │ -│ if lastVersion[e.name] == v: continue (skip clean)│ -│ lastVersion[e.name] = v │ -│ row = [(metric, c.value.load(relaxed)) for c in e] │ -│ enqueue(name, row) │ -│ │ -│ # FAN-OUT (lock released, sinks fail independently) │ -│ if enableDb: │ -│ for (name, row) in queue: │ -│ try: m_table->set(name, stringify(row)) │ -│ catch: log warn, continue │ -│ │ -│ if enableOtlp: │ -│ build OTLP ResourceMetrics{ … } from queue │ -│ try: m_otlp->Export(batch) │ -│ catch: log warn, continue │ -└───────────────────────────────────────────────────────────────┘ -``` - -Three properties: - 
-1. *Lock released before any I/O.* Round-trips under the structural lock would briefly stall every concurrent `increment()`. -2. *Idle systems generate zero outbound traffic on either sink.* When no entity has changed, the queue is empty and neither sink is touched. -3. *Sink isolation.* A failure in one sink is logged and skipped; the other sink still publishes the same cycle's snapshot. - -#### 7.6 Memory ordering correctness - -The release/acquire pair (`②` in 7.4 ↔ acquire-load in 7.5) guarantees: - -> If the writer reads `version == N`, then every counter mutation that contributed to bumping the version up to `N` has already happened-before the reader and is visible. - -Without it, on weakly ordered architectures (ARM, POWER) the writer could see the new version but read an old counter value, recording a stale snapshot. - -#### 7.7 OTLP sink details - -- **Wire format.** OTLP/gRPC over plaintext `localhost:4317`. No TLS or authentication on the local hop — the loopback link is inside the switch, and any off-box credentials live in the OTel Collector. OTLP/HTTP is supported as a build option but not the default. -- **Metric model.** Counters set via `increment()` are exported as OTLP `Sum` with `aggregation_temporality = CUMULATIVE` and `is_monotonic = true`. Counters set via `setValue()` (gauges) are exported as OTLP `Gauge`. -- **Resource attributes** attached to every batch: `service.name=`, `service.instance.id=`, `sonic.component=`. -- **Metric attributes** attached to every data point: `entity` — the table name / gNMI path / etc. The entity is a *label*, not part of the metric name, so dashboards can pivot freely. -- **Metric name** convention: `sonic..` (e.g. `sonic.swss.SET`, `sonic.gnmi.SUBSCRIBE`). -- **Batching / retry.** The producer does not batch beyond one `intervalSec` snapshot and does not retry. Batching, queuing, retrying, and back-pressure are the local OTel Collector's responsibility. 
-- **Container restart.** `start_time_unix_nano` is captured once in the constructor and advances on every container restart. This is the OTel-defined signal for counter reset; consumers handle it natively. - -#### 7.8 `COUNTERS_DB` sink details - -For component name `C` and entity `E`: - -``` -COUNTERS_DB key: "_STATS:" -hash fields: each metric name → uint64_t string -``` - -Example: `redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE"` → - -``` -1) "SET" -2) "1283" -3) "DEL" -4) "17" -5) "COMPLETE" -6) "1300" -7) "ERROR" -8) "0" -``` - -The shape mirrors the existing `COUNTERS:*` keys produced by the Flex-Counter pipeline. - -#### 7.9 `SwssStats` thin facade - -`SwssStats` (in `sonic-swss/orchagent/`) is reduced to a translation layer that owns only the SWSS-specific vocabulary and the global enable flag consumed by `orch.cpp`: - -```cpp -SwssStats::SwssStats() : m_impl(swss::ComponentStats::create("SWSS")) {} - -void SwssStats::recordTask(const std::string& t, const std::string& op) { - if (op == "SET") m_impl->increment(t, "SET"); - else if (op == "DEL") m_impl->increment(t, "DEL"); -} -void SwssStats::recordComplete(const std::string& t, uint64_t n) { m_impl->increment(t, "COMPLETE", n); } -void SwssStats::recordError (const std::string& t, uint64_t n) { m_impl->increment(t, "ERROR", n); } -``` - -The whole file is ~130 lines of straightforward delegation. **The public surface (`gSwssStatsRecord`, `SwssStats::getInstance()`, `recordTask`/`recordComplete`/`recordError`) and the on-the-wire `SWSS_STATS:
` Redis layout are deliberately kept narrow and stable so that the SWSS-specific vocabulary remains independent of future evolution of the underlying `ComponentStats` library.** - -#### 7.10 Adopting the library in a new container - -To add equivalent metrics to e.g. `gnmi`, write a facade analogous to §7.9: - -```cpp -class GnmiStats { -public: - static GnmiStats* getInstance(); - void recordSubscribe(const std::string& path) { m_impl->increment(path, "SUBSCRIBE"); } - void recordError (const std::string& path) { m_impl->increment(path, "ERROR"); } -private: - GnmiStats() : m_impl(swss::ComponentStats::create("GNMI")) {} - std::shared_ptr m_impl; -}; -``` - -Result: counters land in `COUNTERS_DB` under keys `GNMI_STATS:` **and** are exported as OTLP metrics `sonic.gnmi.SUBSCRIBE` / `sonic.gnmi.ERROR` (with attribute `entity=`). No new threads, no new Redis or gRPC client management, no new test harness needed. - -### 8. SAI API - -No SAI API changes are required for this feature. This design measures control-plane software events inside SONiC containers; it does not query or modify any SAI state. - -### 9. Configuration and management - -Not applicable. This HLD introduces no new CLI commands, YANG models, manifests, or `CONFIG_DB` schema. Existing CLIs that already read `COUNTERS_DB` (e.g. `redis-cli -n 2 HGETALL`, `show ... stats` style commands) continue to work and gain visibility into the new `_STATS:` keys for free. - -### 10. Warmboot and Fastboot Design Impact - -Counters are kept in process memory and are reset on container restart, including warmboot and fastboot. This is acceptable because consumers (dashboards, alerts) compute rate-of-change rather than absolute values. The OTLP `start_time_unix_nano` attribute advances on every restart, which is the OTel-standard signal for counter reset and is handled natively by OTel-aware consumers. 
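A consumer that computes rate-of-change has to tolerate the reset to zero described above. The following is a minimal sketch of one common approach, assuming a hypothetical consumer-side helper; neither function is part of `ComponentStats`, and the names are illustrative only. A sample lower than its predecessor is treated as a restart.

```cpp
#include <cstdint>

// Hypothetical consumer-side helper, not part of ComponentStats.
// Returns the increase between two samples of a cumulative counter.
// If the current sample is lower than the previous one, the producer is
// assumed to have restarted from zero, so the current value is the increase.
inline uint64_t counterDelta(uint64_t previous, uint64_t current) {
    return current >= previous ? current - previous : current;
}

// Events per second over the sampling window; 0 for a non-positive window.
inline double ratePerSecond(uint64_t previous, uint64_t current,
                            double elapsedSec) {
    if (elapsedSec <= 0.0) return 0.0;
    return static_cast<double>(counterDelta(previous, current)) / elapsedSec;
}
```

Increments that happened after the last sample before the restart cannot be recovered this way, so the computed rate slightly undercounts the window that spans a restart.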
- -#### Warmboot and Fastboot Performance Impact - -- The library does **not** add any stalls, sleeps, or I/O operations to the boot critical chain. Construction is non-blocking; the writer thread connects to Redis and to the OTel Collector lazily and retries in the background, so a not-yet-ready dependency cannot delay container start. -- No CPU-heavy processing (Jinja templates, etc.) is added in the boot path. -- No third-party dependency is updated by this HLD beyond linking against the OpenTelemetry C++ SDK gRPC exporter, which is loaded only when the OTLP sink is enabled. -- The library does not delay any service or Docker container. - -No measurable boot-time degradation is expected. - -### 11. Memory Consumption - -- Per-instance footprint: O(entities × metrics) `uint64` slots plus their `std::map` keys. Bounded by the number of orchagent tables (≈ tens) for the SWSS facade. -- The OTLP exporter adds a small fixed overhead (one gRPC channel, one per-cycle batch buffer). -- When the feature is disabled at runtime via `setEnabled(false)`, the hot path becomes inert and the writer thread's queue stays empty; memory remains bounded. -- When the feature is disabled at compile time (the OTLP sink can be compiled out via build option), there is no residual memory cost beyond the symbols of `swss::ComponentStats` itself; the DB sink remains unconditional. - -### 12. Restrictions/Limitations - -- Counters reset to zero on container restart by design. Consumers must compute rate-of-change rather than rely on absolute values across restarts. -- The library does not retain history; it relies on downstream consumers (`COUNTERS_DB` readers, OTel Collector) for retention. -- The OTLP sink depends on a local OTel Collector reachable at the configured endpoint. If absent, the OTLP sink retries silently in the background; the DB sink and the hot path are unaffected. -- The structural mutex (`m_mutex`) is acquired only on the *first* use of a given (entity, metric) pair. 
Workloads that constantly mint new entity names will see one mutex acquisition per new name; this is not the expected pattern for SONiC containers. - -### 13. Testing Requirements/Design - -#### 13.1 Unit Test cases - -Library unit tests live in `sonic-swss-common/tests/componentstats_ut.cpp`: - -| # | Test | What it proves | -|---|----------------------------|---------------------------------------------------------------------------------------------| -| 1 | BasicIncrement | `increment` + `get` round-trip | -| 2 | MultipleMetrics | metric isolation within an entity | -| 3 | MultipleEntities | entity isolation within a component | -| 4 | SetValueOverwrites | gauge semantics | -| 5 | DisabledIsNoOp | `setEnabled(false)` makes hot path inert | -| 6 | GetAllReturnsSnapshot | bulk read returns the right shape | -| 7 | ConcurrentIncrements | 8 threads × 10 000 increments → exactly 80 000 (no torn writes, no lost updates) | -| 8 | SingletonSameName | `create("X")` returns the same instance | -| 9 | SingletonDifferentNames | `create("X") ≠ create("Y")` | - -A facade-level test suite `swssstats_ut.cpp` (9 cases) is added in `sonic-swss` and exercises the SwssStats vocabulary (`recordTask`/`recordComplete`/`recordError`, `gSwssStatsRecord` enable flag, singleton behaviour) end-to-end against the new backend. - -Run: - -``` -cd sonic-swss-common && ./autogen.sh && ./configure && make check -./tests/tests --gtest_filter='ComponentStats*' -``` - -#### 13.2 System Test cases - -- Boot a `sonic-vs` image built with the two companion PRs. -- Exercise orchagent (e.g. `config vlan add`, `config interface ip add`). -- Verify on-box DB sink: - ``` - redis-cli -n 2 KEYS "SWSS_STATS:*" - redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE" - ``` - Counters increment in proportion to operations; idle dwell shows zero further writes (dirty tracking working). 
-- Verify OTLP sink (Phase 2): point a local OTel Collector at `localhost:4317` with a debug exporter and confirm `sonic.swss.*` metrics arrive with correct resource and metric attributes. -- Confirm warmboot and fastboot are unaffected (no boot-time regression, no service startup ordering change). - -### 14. Open/Action items - -- Phase 1 (this HLD's two PRs) lands the `ComponentStats` library and the `SwssStats` facade with the DB sink fully active and the OTLP sink stubbed (`enableOtlp=false` by default). -- Phase 2 implements the OTLP sink against the OpenTelemetry C++ SDK and is gated on the local OTel Collector sidecar being available on the switch. Coordination with whichever team owns the local OTel Collector image is required before Phase 2 can be enabled by default. -- Phase 3 onboards additional SONiC containers (`gnmi`, `bmp`, `telemetry`, …) by adding their own facades. Each is a self-contained PR in the relevant repository. diff --git a/doc/component-stats/component-stats-reporting-hld.md b/doc/component-stats/component-stats-reporting-hld.md new file mode 100644 index 00000000000..7e63a281eab --- /dev/null +++ b/doc/component-stats/component-stats-reporting-hld.md @@ -0,0 +1,258 @@ +# SONiC Component Statistics — Reporting HLD + +## Table of Content + +- [Revision](#1-revision) +- [Scope](#2-scope) +- [Definitions/Abbreviations](#3-definitionsabbreviations) +- [Overview](#4-overview) +- [Requirements](#5-requirements) +- [Architecture Design](#6-architecture-design) +- [High-Level Design](#7-high-level-design) +- [SAI API](#8-sai-api) +- [Configuration and management](#9-configuration-and-management) +- [Warmboot and Fastboot Design Impact](#10-warmboot-and-fastboot-design-impact) +- [Memory Consumption](#11-memory-consumption) +- [Restrictions/Limitations](#12-restrictionslimitations) +- [Testing Requirements/Design](#13-testing-requirementsdesign) +- [Open/Action items](#14-openaction-items) + +### 1. 
Revision + +| Rev | Date | Author | Change Description | +|-----|------------|---------------|----------------------------------------------------------| +| 0.1 | 2026-05-12 | Yutong Zhang | Initial revision (split from component-stats Framework HLD) | + +### 2. Scope + +This HLD specifies how the service-level component counters produced by `swss::ComponentStats` (see the [Framework HLD](./component-stats-framework-hld.md)) are **reported** from a SONiC switch to off-box telemetry systems. + +For the initial revision the reporting path is exactly one: + +``` +component (swss/gnmi/...) + -> ComponentStats library + -> COUNTERS_DB (Redis) + -> telegraf (Geneva mdm pipeline) + -> Geneva +``` + +This HLD owns the **schema contract** between the producer (`ComponentStats`) and the consumer (telegraf). The deployment, configuration, and operation of the telegraf and mdm containers themselves are owned by the NDM "Geneva integration with SONiC" HLD; this document references them but does not duplicate them. + +Direct application-side OTLP export (e.g. the `OpenTelemetry SDK -> mdm` path described in the NDM HLD §4) is **not** part of this revision; it is listed as future work in §14. + +### 3. Definitions/Abbreviations + +| Term | Definition | +|-----------------|---------------------------------------------------------------------------------------------| +| Component | A SONiC container that produces service-level counters (e.g. `swss`, `gnmi`, `bmp`). | +| Entity | A logical grouping of metrics inside a component (e.g. an orchagent table, a gNMI path). | +| Metric | A named `uint64` counter or gauge inside an entity. | +| ComponentStats | The reusable producer library specified in the Framework HLD. | +| `COUNTERS_DB` | The existing SONiC Redis database (logical DB 2) holding counter rows. | +| telegraf | The off-box-friendly metric agent running on the switch; configured and operated by NDM. 
| +| mdm | Geneva metric agent that consumes telegraf output and forwards it to Geneva. | +| NDM HLD | "Geneva integration with SONiC" HLD, owned by the NDM team. | + +### 4. Overview + +The Framework HLD specifies a producer that writes each component's service-level counters into `COUNTERS_DB` under a uniform key layout. To make those counters useful off-box, we need a stable contract between that producer and whatever agent harvests Redis on the switch and forwards data to Geneva. + +NDM has already designed and is rolling out a telegraf-based pipeline for harvesting `COUNTERS_DB` and forwarding to Geneva (see NDM HLD §5 "Existing stats collecting from Database via mdm"). This HLD therefore does **not** introduce a new transport. Instead it: + +1. **Defines the Redis schema** that the producer writes and that telegraf consumes (key layout, hash fields, types, dirty-tracking semantics). +2. **Specifies the SWSS-specific vocabulary** (`SWSS_STATS:<table>` with `SET` / `DEL` / `COMPLETE` / `ERROR`). +3. **States the conventions** that future components must follow so that telegraf can pick them up by pattern match without a per-component configuration change. + +The result is a thin, declarative contract between two teams: SONiC owns what is written; NDM owns how it is harvested and forwarded. + +### 5. Requirements + +**Functional** + +- R1. Every SONiC container that integrates `ComponentStats` shall expose its counters in `COUNTERS_DB` under the uniform key layout defined in §7.1. +- R2. The schema shall be discoverable by pattern match (`<C>_STATS:*`) so that a single telegraf input definition can pick up all current and future components without code or configuration changes. +- R3. The SWSS facade (`SwssStats`) shall publish counters under `SWSS_STATS:<table>` with hash fields `SET`, `DEL`, `COMPLETE`, `ERROR` (decimal `uint64`). +- R4. The schema shall include a per-entity *update marker* (the version-bump in the producer; observable to telegraf as the row's hash value changing) so that idle rows are not re-emitted to Geneva every cycle. + +**Non-functional** + +- R5. The reporting path shall not require changes to the SONiC dataplane, syncd, SAI, or the existing Flex-Counter pipeline. +- R6. The reporting path shall not impose any on-the-wire dependency between SONiC and a specific off-box telemetry system. SONiC writes Redis; whatever consumes Redis is replaceable. +- R7. A failure of telegraf, mdm, or Geneva shall not affect the producer or any other SONiC service. + +**Out of scope** + +- Telegraf container packaging, lifecycle, and configuration. See NDM HLD §5.2 ("telegraf design"). +- mdm container deployment, KubeSonic rollout. See NDM HLD §3 and §6. +- Geneva endpoint, authentication, dashboards, alerting. +- Direct OTLP export from the application (see future work, §14). + +### 6. Architecture Design + +``` ++-------------------------- SONiC switch ---------------------------+ +| | +| +-- container (e.g. swss) -----------------------------------+ | +| | application -> ComponentStats library | | +| +------------------------+-----------------------------------+ | +| | HSET | +| v | +| +-------------------------+ | +| | COUNTERS_DB (Redis DB 2)| | +| | SWSS_STATS:PORT_TABLE | | +| | GNMI_STATS:/iface/... | | +| | BMP_STATS:... | | +| +-----------+-------------+ | +| | HSCAN / HGETALL | +| v | +| +-------------------------+ | +| | telegraf | (owned by NDM HLD §5.2) | +| +-----------+-------------+ | +| | | +| v | +| +-------------------------+ | +| | mdm | (owned by NDM HLD §4) | +| +-----------+-------------+ | +| | | ++--------------------------|----------------------------------------+ + v + +--------+ + | Geneva | + +--------+ +``` + +The boundary owned by this HLD is the box labelled `COUNTERS_DB`.
Everything above it (the producer) is specified in the Framework HLD; everything below it (telegraf, mdm, Geneva) is specified in the NDM HLD. This HLD owns the **interface between the two**. + +### 7. High-Level Design + +#### 7.1 `COUNTERS_DB` key layout (the contract) + +For a component named `C` (case-insensitive at the API; rendered uppercase on the wire) and an entity `E`: + +``` +db: COUNTERS_DB (logical DB 2) +key: "<C>_STATS:<E>" +type: Redis hash +fields: each metric name -> decimal uint64 string +``` + +Properties guaranteed by the producer: + +- **Stable suffix `_STATS`.** Every component writes under `<C>_STATS:*` and only there, so telegraf can match `*_STATS:*` (or a per-component pattern such as `SWSS_STATS:*`) to discover all rows for that component without an allow-list. +- **Hash, never string.** Field names are metric names; values are decimal `uint64`. Telegraf can call `HGETALL` and produce one measurement per (key, field) pair. +- **Idle suppression.** A row is `HSET` only when at least one of its metrics changed during the producer's 1 s cycle. Rows that did not change are not rewritten. Therefore an idle SONiC produces zero extra Redis traffic and telegraf, when configured to detect "no change since last poll", produces no upstream traffic either. +- **No TTL.** Keys are not expired; their lifetime is the producer process. On container restart they are recreated by the next 1 s flush. +- **No deletion in v1.** Entities that disappear at the application layer leave their last `HSET` in Redis until the container restarts. Garbage collection is left to the application; the framework does not delete keys (this keeps the contract simple).
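
The key layout and idle-suppression properties above can be modelled in a few lines. The following is a minimal sketch in Python, not the `ComponentStats` implementation (which is a C++ library): `stats_key`, `ModelProducer`, and its methods are hypothetical names introduced only for illustration, and the returned dictionary stands in for the `HSET` calls a real producer would issue.

```python
from typing import Dict, Set


def stats_key(component: str, entity: str) -> str:
    """Build the key: component rendered uppercase, "_STATS:", entity unchanged."""
    return f"{component.upper()}_STATS:{entity}"


class ModelProducer:
    """Toy in-memory model of the producer-side contract (illustrative only)."""

    def __init__(self, component: str) -> None:
        self.component = component
        self.counters: Dict[str, Dict[str, int]] = {}
        self.dirty: Set[str] = set()  # entities touched since the last flush

    def increment(self, entity: str, metric: str, n: int = 1) -> None:
        row = self.counters.setdefault(entity, {})
        row[metric] = row.get(metric, 0) + n
        self.dirty.add(entity)

    def flush(self) -> Dict[str, Dict[str, str]]:
        """Return one cycle's writes: a full hash per dirty entity, nothing else."""
        writes = {
            stats_key(self.component, entity): {
                metric: str(value)  # values travel as decimal strings
                for metric, value in self.counters[entity].items()
            }
            for entity in sorted(self.dirty)
        }
        self.dirty.clear()
        return writes


producer = ModelProducer("swss")
producer.increment("PORT_TABLE", "SET")
producer.increment("PORT_TABLE", "COMPLETE")
first_cycle = producer.flush()   # one row for SWSS_STATS:PORT_TABLE
second_cycle = producer.flush()  # empty: nothing changed, so nothing is written
```

In this model a changed row is always written whole, so a reader never sees a mix of old and new fields, and an idle cycle produces an empty write set.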
+ +Example for `componentName="SWSS"`, entity `PORT_TABLE`: + +``` +redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE" +1) "SET" +2) "1283" +3) "DEL" +4) "17" +5) "COMPLETE" +6) "1300" +7) "ERROR" +8) "0" +``` + +The shape mirrors the existing `COUNTERS:*` keys produced by the Flex-Counter pipeline so that on-box tooling (`redis-cli`, `show ... stats`) needs no changes. + +#### 7.2 SWSS-specific vocabulary + +The SWSS facade (`SwssStats`) writes to: + +| Key | Field | Meaning | +|--------------------------------------|------------|-------------------------------------------------| +| `SWSS_STATS:<table>` | `SET` | Number of `SET` operations seen on the table. | +| `SWSS_STATS:<table>` | `DEL` | Number of `DEL` operations seen on the table. | +| `SWSS_STATS:<table>` | `COMPLETE` | Number of operations that finished successfully.| +| `SWSS_STATS:<table>` | `ERROR` | Number of operations that finished with error. | + +`<table>` is the same identifier used by orchagent (e.g. `PORT_TABLE`, `VLAN_TABLE`, `ROUTE_TABLE`); no transformation is applied. + +#### 7.3 Conventions for future components + +When onboarding a new component (`gnmi`, `bmp`, `telemetry`, …) using the framework: + +1. Pick a stable, uppercase component name `C`. Counters land under `C_STATS:*` automatically. +2. Define a short, finite set of metric names (verbs/states) that describe the events the component cares about. Avoid putting cardinality-heavy values (interface name, neighbour IP) inside the metric name; put them in the entity (`E`) instead. Telegraf reads the entity from the Redis key and the metric from the hash field, so dashboards can pivot freely. +3. Document the vocabulary in the component's own HLD (one row per field, the same shape as §7.2). + +No telegraf configuration change is required to onboard a new component, provided telegraf is configured to scan `*_STATS:*` patterns (NDM HLD §5.2.1).
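
To illustrate why these conventions let a consumer discover new components by pattern alone, here is a minimal sketch of the consumer-side split of key and hash into measurements. It is written in Python for brevity and is not telegraf configuration or code; `split_key` and `iter_measurements` are hypothetical helpers, and the sketch assumes the entity is everything after the first `_STATS:` in the key.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterator, Tuple

KEY_PATTERN = "*_STATS:*"  # the discovery pattern named in the conventions
MARKER = "_STATS:"


def split_key(key: str) -> Tuple[str, str]:
    """Split "<C>_STATS:<E>" into (component, entity); reject other keys."""
    if not fnmatchcase(key, KEY_PATTERN):
        raise ValueError(f"not a component-stats key: {key!r}")
    # Split at the first marker so entities may themselves contain ':'.
    component, _, entity = key.partition(MARKER)
    if not component or not entity:
        raise ValueError(f"empty component or entity in key: {key!r}")
    return component, entity


def iter_measurements(
    key: str, fields: Dict[str, str]
) -> Iterator[Tuple[str, str, str, int]]:
    """Yield one (component, entity, metric, value) tuple per hash field."""
    component, entity = split_key(key)
    for metric, raw in fields.items():
        yield component, entity, metric, int(raw)  # values are decimal strings


rows = list(iter_measurements("SWSS_STATS:PORT_TABLE", {"SET": "1283", "ERROR": "0"}))
```

Because the component and entity come from the key and the metric from the field name, a component that follows the conventions needs no entry in the consumer beyond the shared pattern.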
+ +#### 7.4 Interaction with the producer + +The producer (specified in the Framework HLD) maintains a per-entity *version* counter that is bumped on every `increment()` / `setValue()`. The 1 s writer thread snapshots only entities whose version changed since the last cycle and issues one `HSET` per dirty entity. As a result: + +- A row that has not changed since the previous cycle is **not** rewritten; telegraf and Redis monitoring both see this as no activity. +- A row that has changed even once is rewritten with the latest cumulative values, so the next `HGETALL` always returns the latest snapshot. +- There is no risk of telegraf reading a half-written row: each `HSET` is atomic on the Redis side, and a single `HSET` writes all fields of the entity together. + +#### 7.5 Telegraf interface (consumed, not specified here) + +Telegraf is expected to: + +- Run on the switch alongside the SONiC containers (NDM HLD §5.2.2 "telegraf container"). +- Scan `COUNTERS_DB` for keys matching `*_STATS:*`. +- Convert each `(key, field)` pair into a metric named `sonic.<component>.<field>` with attributes `entity=<entity>`, `host=<hostname>`. +- Forward to mdm. + +The exact telegraf configuration (input plugin, polling interval, output to mdm) is owned by the NDM HLD §5.2.1. This HLD only commits to the schema described in §7.1 / §7.2 / §7.3. + +### 8. SAI API + +No SAI API changes are required. This HLD covers a Redis schema and an interface to a consumer agent; SAI is not involved. + +### 9. Configuration and management + +Not applicable. This HLD introduces no new CLI commands, YANG models, manifests, or `CONFIG_DB` schema. Operator-facing configuration of telegraf / mdm is documented in the NDM HLD. + +### 10. Warmboot and Fastboot Design Impact + +The Redis schema is process-local: keys live in `COUNTERS_DB` for the duration of the producer container.
On warmboot / fastboot the producer container restarts, the keys are recreated at the next 1 s flush, and counters start again from zero (see Framework HLD §10). Telegraf treats the appearance of fresh keys as new measurements; consumers compute rate-of-change and tolerate the reset. + +No boot-critical-chain dependency is added. + +### 11. Memory Consumption + +The reporting path adds no new in-container state beyond what the Framework HLD already describes for the DB sink (one Redis client per producer instance). Redis-side memory is bounded by the number of `(component, entity)` rows × the number of fields × the size of a `uint64` ASCII string; for the SWSS facade this is on the order of tens of rows × four fields. + +Telegraf and mdm memory are owned by the NDM HLD. + +### 12. Restrictions/Limitations + +- The schema is hash-only. Field values are decimal `uint64` strings; non-numeric fields are not supported. Components that need richer types must use a different reporting path (out of scope). +- The schema does not encode metric units. Units are implicit in the metric name (events) for v1; if a future component needs to report bytes / seconds / etc. it should put the unit in the metric name (e.g. `BYTES_RX`) until a more elaborate schema is introduced. +- Entity names are opaque strings. They must be safe for use as a Redis key suffix and for use as an attribute value downstream; in practice all SONiC table names already satisfy this. +- No deletion in v1 (see §7.1). Stale rows accumulate until container restart. + +### 13. Testing Requirements/Design + +#### 13.1 Unit / library tests + +The library-level invariants (`HSET` on dirty entities, idle suppression, field naming) are covered by the Framework HLD unit-test suite (`componentstats_ut.cpp`). No additional unit tests are introduced by this HLD. + +#### 13.2 System tests + +- Boot a `sonic-vs` image that includes the Framework HLD's two companion PRs. 
+- Exercise orchagent so that the SWSS facade increments counters (e.g. `config vlan add`, `config interface ip add`). +- Verify the schema directly in Redis: + ``` + redis-cli -n 2 KEYS "SWSS_STATS:*" + redis-cli -n 2 HGETALL "SWSS_STATS:PORT_TABLE" + ``` + Confirm that: + - The key shape matches §7.1. + - All four SWSS fields (`SET`, `DEL`, `COMPLETE`, `ERROR`) are present and are decimal integers. + - After a quiescent dwell, no `HSET` traffic is observed (idle suppression). +- End-to-end with telegraf (on a testbed configured per the NDM HLD): exercise orchagent and confirm metrics named `sonic.swss.SET` (etc.) arrive in Geneva with attribute `entity=<table>`. + +### 14. Open/Action items + +- The single reporting path in this revision is `COUNTERS_DB -> telegraf -> mdm -> Geneva`. Direct OTLP export from the application (the `OpenTelemetry SDK -> mdm` path described in NDM HLD §4) is a possible future addition; it would be specified in a future revision of this document if and when SONiC components need lower reporting latency than 1 s polling can provide. +- Garbage collection of stale `*_STATS:` keys on long-lived containers is left for a future revision. The current behaviour (cleared on container restart) is sufficient for the planned consumers. +- When additional components (`gnmi`, `bmp`, `telemetry`, …) adopt the framework, each one should add its vocabulary table to §7.3 by a small follow-up PR on this HLD.