perf(session): optimise ProducerBuffer — swap field roles#1674

Open
Tehsmash wants to merge 3 commits into main from perf/session-optimise-producer-buffer
Conversation

@Tehsmash

Member

Description

ProducerBuffer is the fixed-capacity ring buffer used in the reliable-mode hot path to store messages for retransmission. The previous layout kept full Message objects in a Vec<Option<Message>> ring and message IDs in a HashMap<usize, usize> index.

The bottleneck in push() was the eviction step: to evict the oldest slot we had to read buffer[self.next] — a large protobuf struct unlikely to be cache-warm — just to extract its .get_id().

This PR swaps the field roles:

// Before
buffer: Vec<Option<Message>>   — ring slots (large structs)
map:    HashMap<usize, usize>  — message_id → slot index

// After
messages: HashMap<usize, Message>  — message_id → Message
ring:     Vec<usize>               — compact ring of IDs (usize::MAX sentinel = empty)

The ring Vec is 512 × 8 B = 4 KB — fits entirely in L1. Eviction now reads a single integer instead of dereferencing a large struct.
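The new layout can be sketched roughly as follows. This is a minimal illustration, not the code in producer_buffer.rs: the `Message` struct is a hypothetical stand-in for the real protobuf type, `EMPTY` and `with_capacity` are names chosen for the example, message IDs are assumed to be unique, and resetting the ring inside `clear()` is an assumption about how the sentinel is restored.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real (much larger) protobuf message type.
#[derive(Clone, Debug, PartialEq)]
struct Message {
    id: usize,
    payload: Vec<u8>,
}

/// Sentinel marking an empty ring slot.
const EMPTY: usize = usize::MAX;

struct ProducerBuffer {
    capacity: usize,
    next: usize,                       // slot the next push will overwrite
    ring: Vec<usize>,                  // compact ring of message IDs
    messages: HashMap<usize, Message>, // message_id -> Message
}

impl ProducerBuffer {
    fn with_capacity(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            next: 0,
            ring: vec![EMPTY; capacity],
            messages: HashMap::with_capacity(capacity),
        }
    }

    fn push(&mut self, msg: Message) {
        let id = msg.id;
        // Eviction reads a single integer from the ring; the ID is never
        // extracted from the stored Message itself.
        let evicted = self.ring[self.next];
        if evicted != EMPTY {
            self.messages.remove(&evicted);
        }
        self.messages.insert(id, msg);
        self.ring[self.next] = id;
        self.next = (self.next + 1) % self.capacity;
    }

    fn get(&self, id: usize) -> Option<Message> {
        // Map lookup followed by a clone, with no intermediate Vec index.
        self.messages.get(&id).cloned()
    }

    fn clear(&mut self) {
        // Neither call allocates: the map keeps its capacity and the ring
        // is overwritten in place (assumed reset strategy).
        self.messages.clear();
        self.ring.fill(EMPTY);
        self.next = 0;
    }
}
```

With a capacity of 2, pushing IDs 1, 2 and 3 in that order leaves 2 and 3 retrievable, while 1 has been evicted by reading its ID from the ring.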

Per-operation impact:

| Operation | Before | After |
| --- | --- | --- |
| push eviction | Access Vec<Option<Message>>[next].get_id() | Read Vec<usize>[next] (integer, L1-hot) |
| get | HashMap → Vec index → clone | HashMap → clone (one fewer indirection) |
| clear | Reallocates vec![None; capacity] | HashMap::clear() only, no allocation |
| iter | Vec scan with Option unwrap | Ring-ordered walk (cold flush path only) |

iter() is only called on the cold flush-on-connect path so its slightly more complex traversal has no hot-path impact.
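One way to picture the ring-ordered walk is the sketch below. It is an illustration under stated assumptions rather than the actual iter() implementation: `ring_ordered_ids` is a hypothetical helper, and it returns only the IDs, which the real iterator would then resolve through the messages map.

```rust
/// Sentinel marking an empty ring slot.
const EMPTY: usize = usize::MAX;

/// Returns the stored IDs from oldest to newest.
///
/// `next` is the slot the next push would overwrite, so once the ring has
/// wrapped it holds the oldest entry. Walking `capacity` slots from there
/// and skipping empty ones yields insertion order.
fn ring_ordered_ids(ring: &[usize], next: usize) -> Vec<usize> {
    let capacity = ring.len();
    (0..capacity)
        .map(|offset| ring[(next + offset) % capacity])
        .filter(|&id| id != EMPTY)
        .collect()
}
```

The walk costs one pass over the ring plus one map lookup per live ID, which is acceptable on a path that only runs when flushing on connect.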

All call sites in session_sender.rs use the same public API (push, get, iter, clear) — no changes needed outside producer_buffer.rs.

Type of Change

  • Bugfix
  • New Feature
  • Breaking Change
  • Refactor
  • Other (please describe)

Performance optimisation — no behaviour change.

Checklist

  • I have read the contributing guidelines
  • Existing issues have been referenced (where applicable)
  • I have verified this change is not present in other open pull requests
  • Functionality is documented
  • All code style checks pass
  • New code contribution is covered by automated tests
  • All new and existing tests pass

@Tehsmash Tehsmash requested a review from a team as a code owner May 26, 2026 17:17
@github-actions

github-actions Bot commented May 26, 2026

Contributor

The latest Buf updates on your PR. Results from workflow ci-buf / buf (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Jun 9, 2026, 9:59 AM |

@Tehsmash Tehsmash force-pushed the perf/session-optimise-producer-buffer branch from 57de4ca to c08522f Compare May 27, 2026 16:03
@codecov

codecov Bot commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@Tehsmash Tehsmash force-pushed the perf/session-optimise-producer-buffer branch from c08522f to 2e71ff9 Compare May 28, 2026 19:33

@micpapal micpapal left a comment

Member

LGTM, with minor changes

self.ring[self.next] = id;
self.next = (self.next + 1) % self.capacity;
true
}

Member

Here we can drop the return value as we always return true

@Tehsmash Tehsmash force-pushed the perf/session-optimise-producer-buffer branch 4 times, most recently from 90c7ad0 to 389cb3b Compare June 11, 2026 13:01
Tehsmash added 3 commits June 16, 2026 13:08
Replace the Vec<Option<Message>> ring + HashMap<usize,usize> index layout
with HashMap<usize,Message> (messages) + Vec<usize> ring (ring).

The key hot-path gain is in push(): evicting the oldest entry no longer
requires touching the large Message struct — we read a single usize from
the compact ring Vec (512 × 8 B = 4 KB, L1-resident) instead of
dereferencing a Vec<Option<Message>> slot to call .get_id().

Additional wins:
- get(): one fewer indirection (map lookup → clone, not map → vec → clone)
- clear(): HashMap::clear() only; no vec![None; capacity] reallocation
- iter(): ring-ordered walk preserving insertion order; cold path only

Signed-off-by: Sam Betts <1769706+Tehsmash@users.noreply.github.com>

`push()` always returned `true`; no caller used the value.
Change the return type to `()` as suggested in PR review.

Signed-off-by: Sam Betts <1769706+Tehsmash@users.noreply.github.com>

Replace assert!(messages.push(...)) with plain push() calls
now that push() returns () instead of bool.

Signed-off-by: Sam Betts <1769706+Tehsmash@users.noreply.github.com>
@Tehsmash Tehsmash force-pushed the perf/session-optimise-producer-buffer branch from 389cb3b to 6f8a6ce Compare June 16, 2026 12:08

2 participants