Move startup helpers into library crates for external embedding #2462

@lalitb

Pre-filing checklist

  • I searched existing issues and didn't find a duplicate

Component(s)

Rust OTAP dataflow (rust/otap-dataflow/)

Is your feature request related to a problem?

Summary

The engine crates are already usable as libraries - Controller, OtelDataflowSpec, PipelineFactory, and node registration via linkme all work from external binaries. That's great.

The gap is that several practical startup functions live only in src/main.rs and can't be reused without copying them:

  • validate_engine_components() - checks that every node URN in the config maps to a registered component and runs per-component config validation. This is the kind of thing every embedding binary needs, and it only uses public APIs (PipelineFactory getters, NodeKind, factory entry validate_config).

  • apply_cli_overrides() - applies core allocation and admin bind overrides to an OtelDataflowSpec. Uses only public types (CoreAllocation, HttpAdminSettings, ResourcesPolicy).

  • system_info() - prints registered components and system info. Useful for diagnostics in any distribution.

These are small, stable functions with no private dependencies. Moving them into a library module (e.g. otap_df_controller::startup or a thin otap-df-cli crate) would let custom binaries share them without copying code that may drift across releases.
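To make the shape of such a shared helper concrete, here is a minimal sketch of the validation step as a library function. It is not the code in src/main.rs: `FactoryEntry`, `NodeSpec`, `require_non_empty`, the `HashMap` registry, and the example URNs are hypothetical stand-ins, assumed only for illustration in place of the real PipelineFactory and config types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a registered factory entry (not the real engine type).
pub struct FactoryEntry {
    /// Per-component config check, modelled on the `validate_config` hook named in this issue.
    pub validate_config: fn(&str) -> Result<(), String>,
}

/// Hypothetical node declaration as it might appear in a pipeline config.
pub struct NodeSpec {
    pub urn: String,
    pub config: String,
}

/// Example validator (assumption): rejects an empty config string.
pub fn require_non_empty(config: &str) -> Result<(), String> {
    if config.is_empty() {
        Err("config must not be empty".to_string())
    } else {
        Ok(())
    }
}

/// Checks that every node URN is registered and that its config passes the
/// component's own validation. Collects all problems instead of stopping at the first.
pub fn validate_engine_components(
    registry: &HashMap<String, FactoryEntry>,
    nodes: &[NodeSpec],
) -> Result<(), Vec<String>> {
    let mut errors = Vec::new();
    for node in nodes {
        match registry.get(node.urn.as_str()) {
            None => errors.push(format!("unknown component URN: {}", node.urn)),
            Some(entry) => {
                if let Err(e) = (entry.validate_config)(&node.config) {
                    errors.push(format!("invalid config for {}: {}", node.urn, e));
                }
            }
        }
    }
    if errors.is_empty() {
        Ok(())
    } else {
        Err(errors)
    }
}
```

An embedding binary would call this once after loading its config and before starting the controller, so a typo in a URN fails fast with a readable message.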

This would make the engine meaningfully easier to embed in custom distributions - the same pattern used by projects like bindplane-otel-collector in the Go ecosystem.

Proposed Solution

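  • Covered in the Summary above: move validate_engine_components(), apply_cli_overrides(), and system_info() out of src/main.rs into a library module.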

Alternatives Considered

  1. Copy the functions into each downstream binary. It works, but the functions are non-trivial (~80 lines combined), touch config internals, and would silently diverge from upstream if validation logic changes. Maintainable for one binary, less so for several.

  2. Make src/main.rs itself a thin wrapper over library calls. This is effectively the proposed change - main.rs would still exist and work the same way, it would just call into library functions instead of defining them inline. No behavioral change for existing users.

  3. Publish a separate otap-df-cli utility crate. Cleaner separation, but probably overkill for three functions. A module inside otap-df-controller or otap-df-config is simpler and avoids a new crate.
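As a rough illustration of alternative 2, the sketch below shows an override helper living in a library and a thin entry point that only calls it. `EngineSpec`, `CliOverrides`, and `run` are hypothetical, simplified stand-ins; the real OtelDataflowSpec, CoreAllocation, and HttpAdminSettings types are richer and are not reproduced here.

```rust
use std::net::SocketAddr;

/// Hypothetical, simplified stand-in for the engine spec (not the real OtelDataflowSpec).
#[derive(Debug, Clone, PartialEq)]
pub struct EngineSpec {
    pub num_cores: usize,
    pub admin_bind: SocketAddr,
}

/// Hypothetical CLI overrides; `None` means "keep the value from the config file".
#[derive(Debug, Default)]
pub struct CliOverrides {
    pub num_cores: Option<usize>,
    pub admin_bind: Option<SocketAddr>,
}

/// Library-side helper: applies only the overrides that were actually supplied.
pub fn apply_cli_overrides(spec: &mut EngineSpec, overrides: &CliOverrides) {
    if let Some(n) = overrides.num_cores {
        spec.num_cores = n;
    }
    if let Some(addr) = overrides.admin_bind {
        spec.admin_bind = addr;
    }
}

/// What a thin binary entry point could reduce to: delegate to library helpers.
/// Component validation and controller startup would follow where the comment is.
pub fn run(mut spec: EngineSpec, overrides: CliOverrides) -> EngineSpec {
    apply_cli_overrides(&mut spec, &overrides);
    // ... validate components, then hand the spec to the controller ...
    spec
}
```

Because the helper takes the spec by mutable reference and leaves unset fields alone, the upstream binary and any custom distribution get identical override semantics from the same function.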

Additional Context

No response
