Skip to content

SnapCap TCP Connection Reliability Improvements#2381

Open
miceno wants to merge 12 commits intoindilib:masterfrom
miceno:feat/retry-snapcap
Open

SnapCap TCP Connection Reliability Improvements#2381
miceno wants to merge 12 commits intoindilib:masterfrom
miceno:feat/retry-snapcap

Conversation

@miceno
Copy link
Copy Markdown
Contributor

@miceno miceno commented Apr 29, 2026

SnapCap TCP Connection Reliability Improvements

Problem Statement

SnapCap communication can degrade when:

  • The controller reboots unexpectedly.
  • A TCP route is temporarily interrupted.
  • The socket remains open but the endpoint is no longer responsive.

Without recovery logic, manual disconnect/reconnect is required from the client.

Proposed Design

The driver now uses a reconnect abstraction:

  • SnapCap::ReconnectInterface in drivers/auxiliary/snapcap.h
  • SnapCapReconnect implementation in drivers/auxiliary/snapcap.cpp

Reconnect is non-blocking and driven from TimerHit(), so command handlers do not sleep-loop while recovering.

Configurable policy from INDI UI (RECONNECT_POLICY)

  • Exposed on CONNECTION_TAB as property RECONNECT_POLICY.
  • UI labels are now explicit:
    • FAILURES_BEFORE_RECONNECT shows as Max Failures
    • DELAY_MS shows as Base delay (ms)
  • Fields:
    • FAILURES_BEFORE_RECONNECT (MIN_FAILURES_BEFORE_RECONNECT..MAX_FAILURES_BEFORE_RECONNECT, default 2)
      • Meaning: consecutive command-failure threshold before reconnect is scheduled.
      • It does not limit the total number of reconnect attempts.
    • DELAY_MS (MIN_RECONNECT_DELAY_MS..MAX_RECONNECT_DELAY_SETTING_MS, default 500)
      • Meaning: base delay used by reconnect backoff after a failed reconnect attempt.
      • The first reconnect attempt is scheduled immediately; delay applies to subsequent retries.
  • Values are clamped in ISNewNumber() and persisted with saveConfig(ReconnectNP).
  • When changed, the driver logs a user-facing confirmation message with the interpreted meaning (failure threshold + base backoff delay).

Exponential backoff with cap

  • Delay is computed in computeReconnectDelayMs(attemptCount).
  • Uses base delay from DELAY_MS, doubles by attempt, capped by MAX_RECONNECT_DELAY_MS (4000 ms).
  • Effective sequence after reconnect failures is approximately: 2x, 4x, 8x, ... of DELAY_MS (with cap).

Future work

Use this same strategy for other components that communicate over TCP, like the rolloffino.

@knro
Copy link
Copy Markdown
Contributor

knro commented Apr 30, 2026

Thanks. This would affect any INDI driver, not just SnapCap. What would lead to this exactly? WiFi/Network Controller drops the connection or what?

@miceno
Copy link
Copy Markdown
Contributor Author

miceno commented Apr 30, 2026

The driver is for a home-made SnapCap dust cover. The cover is implemented using a ESP8266 and the connection is by WiFi. From time to time the ESP8266 disconnects from the network or reboots. When it reboots, or disconnects, the connection is lost forever, the driver does not try it again.

If you have automated a session, the session will not succeed, since the cap will never open nor close when required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants