feat(pool): enable reconnect by default, add disconnect/reconnect callbacks#534

Open
kai-familiar wants to merge 4 commits into nbd-wtf:master from kai-familiar:pool-reconnect-subscriptions

@kai-familiar

Contributor

Problem

Fixes #513 — Long-running SimplePool subscriptions degrade silently as relays disconnect one by one. The pool waits for ALL relay subscriptions to close before firing onclose, so clients have no visibility into individual relay health.

Root Cause

AbstractRelay already has robust reconnection logic (enableReconnect, exponential backoff via resubscribeBackoff, automatic re-subscription with since filter adjustment). However, SimplePool defaults enableReconnect to false, so this existing machinery is unused unless explicitly opted in.
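For readers unfamiliar with the mechanism, here is a minimal sketch of the two ideas named above: exponential backoff between reconnect attempts, and moving `since` forward before re-sending the REQ. This is an illustration under assumptions, not the code in AbstractRelay; the names `backoffDelayMs`, `withUpdatedSince`, the simplified `Filter` type, and the 1s base / 60s cap are all hypothetical.

```typescript
// Hypothetical sketch; this is NOT the nostr-tools implementation.
// Simplified filter shape, reduced to the fields used here.
interface Filter {
  kinds?: number[]
  authors?: string[]
  since?: number
  until?: number
}

// Delay before reconnect attempt number `attempt` (0-based).
// The delay doubles on each attempt and is capped at `maxMs`.
// The base and cap values are assumptions chosen for illustration.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

// Return copies of the filters with `since` moved forward to the newest
// event timestamp seen so far, so a re-sent REQ does not replay old events.
// A filter whose `since` is already later than `lastSeen` keeps its value.
function withUpdatedSince(filters: Filter[], lastSeen: number | undefined): Filter[] {
  if (lastSeen === undefined) return filters.map(f => ({ ...f }))
  return filters.map(f => ({ ...f, since: Math.max(f.since ?? 0, lastSeen) }))
}
```

Since `since` is an inclusive bound in NIP-01, events stamped exactly at `lastSeen` can arrive again after a resubscribe, so deduplicating by event id on the client side is still advisable with this approach.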

Changes

  1. Default enableReconnect to true in SimplePool (was false)

    • Relays that disconnect during active subscriptions now automatically reconnect
    • Existing re-subscription logic handles re-sending REQ with updated since filters to avoid duplicate events
    • Opt out with enableReconnect: false for backward compatibility
  2. Add onreconnect callback to AbstractRelay

    • Fires when a relay successfully reconnects after a disconnection
    • Pairs with existing onclose for full lifecycle visibility
  3. Add onRelayDisconnect / onRelayReconnect callbacks to pool options

    • Enables monitoring connection health in long-running clients
    • Complements existing onRelayConnectionFailure / onRelayConnectionSuccess
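As one way a client might consume the two new pool callbacks, the sketch below keeps a per-relay health record. It assumes only that both callbacks receive the relay URL as a string, as in the usage example in this PR; the `RelayHealthTracker` class and its method names are hypothetical and not part of nostr-tools.

```typescript
// Hypothetical helper; not part of nostr-tools.
// Tracks which relays are currently down and how often each one has dropped.
class RelayHealthTracker {
  private down: Record<string, boolean> = {}
  private drops: Record<string, number> = {}

  // Intended to be passed as the pool's onRelayDisconnect option.
  onRelayDisconnect = (url: string): void => {
    this.down[url] = true
    this.drops[url] = (this.drops[url] ?? 0) + 1
  }

  // Intended to be passed as the pool's onRelayReconnect option.
  onRelayReconnect = (url: string): void => {
    this.down[url] = false
  }

  // URLs of relays that disconnected and have not yet reconnected.
  reconnecting(): string[] {
    return Object.keys(this.down).filter(url => this.down[url])
  }

  // Number of disconnects observed for a relay since the tracker was created.
  dropCount(url: string): number {
    return this.drops[url] ?? 0
  }
}
```

The handlers are arrow-function properties so they can be handed to the pool options directly (`onRelayDisconnect: tracker.onRelayDisconnect`) without losing `this`.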

Usage

import { SimplePool } from 'nostr-tools/pool'

const pool = new SimplePool({
  // reconnect is now enabled by default
  onRelayDisconnect: (url) => console.log(`⚠️ ${url} disconnected, reconnecting...`),
  onRelayReconnect: (url) => console.log(`✅ ${url} reconnected`),
})

// `filter` and `handleEvent` are supplied by the application.
// Long-running subscription stays healthy as relays drop and reconnect
pool.subscribe(["wss://relay1", "wss://relay2"], filter, {
  onevent: (ev) => handleEvent(ev),
})

Breaking Change

enableReconnect now defaults to true instead of false. Existing code that relied on the pool NOT reconnecting should explicitly pass enableReconnect: false. In practice, most long-running clients want reconnection, so this should be a net improvement.

Kai and others added 4 commits April 23, 2026 12:08
…lbacks

- Change enableReconnect default from false to true in SimplePool
  - Relays that disconnect during long-running subscriptions now
    automatically reconnect with exponential backoff
  - Existing resubscription logic in AbstractRelay already handles
    re-sending REQ with updated since filters
  - Opt out with enableReconnect: false

- Add onreconnect callback to AbstractRelay
  - Fires when a relay successfully reconnects after disconnection

- Add onRelayDisconnect/onRelayReconnect callbacks to pool options
  - Enables monitoring connection health in long-running clients

Fixes nbd-wtf#513

onclose is not called when enableReconnect is true; the relay goes
straight to reconnection. The test was waiting for an event that
never fires, causing a 20s timeout.

This was a pre-existing failure on master.

Root causes found and fixed:
1. Pool tests set pingTimeout/pingFrequency AFTER ensureRelay() calls
   connect(), but the ping interval was already created with the default
   29s frequency. Fix: restart the interval after changing settings.

2. Reconnect tests waited for onclose, but onclose is not called when
   enableReconnect is true — the relay goes straight to reconnection.
   Fix: wait for relay.connected === false instead.

3. Pool 'ping-pong timeout' test didn't set enableReconnect: false,
   which now defaults to true (changed in this PR). Fix: explicitly
   disable reconnect for the timeout-only test.

All 4 previously-failing tests now pass in <500ms each.
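
Root cause 1 above comes down to a general timer pitfall: an interval keeps the period it was created with, so changing the frequency setting afterwards has no effect until the interval is recreated. The sketch below illustrates the "restart the interval after changing settings" fix in isolation. It is not the relay's actual ping code; `PingScheduler`, `Timers`, and `setFrequency` are hypothetical names, and the timer functions are injected so the behavior can be checked without waiting on real time. The 29s default matches the frequency mentioned above.

```typescript
// Hypothetical sketch; not the nostr-tools ping implementation.
type TimerHandle = unknown

// Minimal timer interface so real or fake timers can be plugged in.
interface Timers {
  setInterval(fn: () => void, ms: number): TimerHandle
  clearInterval(handle: TimerHandle): void
}

class PingScheduler {
  private handle: TimerHandle | undefined
  private running = false

  constructor(
    private readonly timers: Timers,
    private readonly ping: () => void,
    private frequencyMs: number = 29000,
  ) {}

  // Start (or restart) the interval using the current frequency.
  start(): void {
    this.stop()
    this.handle = this.timers.setInterval(this.ping, this.frequencyMs)
    this.running = true
  }

  // Clear the interval if one is active.
  stop(): void {
    if (this.running) {
      this.timers.clearInterval(this.handle)
      this.running = false
    }
  }

  // Changing the frequency on a running scheduler recreates the interval,
  // because an existing interval would keep firing at the old period.
  setFrequency(ms: number): void {
    this.frequencyMs = ms
    if (this.running) this.start()
  }
}
```

Without the restart in `setFrequency`, a test that lowers the frequency after connecting would still wait for the original 29s tick, which matches the timeouts described above.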
@kai-familiar

Contributor Author

CI update: format check now passes ✅. The only remaining test failure is NegentropySync > syncs events from wss://relay.damus.io — a flaky external relay connection test that also fails on master (all 3 recent master CI runs show the same failure).

As a bonus, this PR also fixes 4 pre-existing test failures on master:

  • reconnect on disconnect (relay.test.ts) — was timing out at 20s
  • ping-pong timeout in pool (pool.test.ts) — was timing out at 20s
  • reconnect on disconnect in pool (pool.test.ts) — was timing out at 20s
  • reconnect with filter update in pool (pool.test.ts) — was timing out at 20s

Root causes: the ping interval was created with the 29s default before the tests could override the settings, and the reconnect tests waited for onclose, which does not fire when enableReconnect is true.


Development

Successfully merging this pull request may close these issues.

Reconnect to relays in SimplePool subscribe
