Skip to content

Port the synchronous event serialization core to Rust#19837

Open
erikjohnston wants to merge 26 commits into
developfrom
erikj/rust_event_serialization
Open

Port the synchronous event serialization core to Rust#19837
erikjohnston wants to merge 26 commits into
developfrom
erikj/rust_event_serialization

Conversation

@erikjohnston

@erikjohnston erikjohnston commented Jun 9, 2026

Copy link
Copy Markdown
Member

Summary

This moves the synchronous core of client event serialization out of synapse/events/utils.py and into Rust (rust/src/events/serialize.rs).

Event serialization is on the hot path for /sync, /messages, and most client-facing endpoints. Previously it was a recursive pure-Python routine (_serialize_event / _inject_bundled_aggregations / only_fields) that interleaved
CPU-bound formatting with async DB/IO. This PR separates those two concerns and moves the CPU-bound half to Rust:

  • Async prep stays in Python. EventClientSerializer._prepare_serialization does all DB/IO up front: batch-fetching redaction events and running the registered module unsigned-callback hooks, for the top-level events and every bundled sub-event (edits and thread latest events, which are themselves serialized). The admin/MSC4354 config is resolved once via _update_config, rather than re-checked on every recursive call as the old code did.
  • The synchronous core moves to Rust. Given an event plus the pre-fetched redaction_map, unsigned_additions, and bundle_aggregations, the Rust code produces the client JSON entirely in Rust — including the v1/v2 format transforms, only_event_fields filtering, redaction handling, and recursive bundled aggregations.

Details

  • The Rust entry point is a single batch function, serialize_events, taking a list of (event, membership) pairs. The three lookup maps are shared across the whole batch, so they're read out of Python and converted to Rust
    structures once per batch rather than once per event. EventClientSerializer.serialize_event (singular) is a thin wrapper that calls it with a one-element list.
  • SerializeEventConfig is now a Rust pyclass, and the old event_format callable is replaced by the EventFormat enum (Raw / ClientV1 / ClientV2 / ClientV2WithoutRoomId). Call sites in rest/admin/events.py and
    rest/client/{notifications,room,sync}.py are updated to pass the enum. make_config_for_admin and MSC4354 enablement now go through SerializeEventConfig.for_admin() / with_msc4354().
  • New accessors on EventInternalMetadata (redacted_by, txn_id, device_id, token_id, delay_id, soft_failed, policy_server_spammy) expose to Rust the fields the serializer reads.
  • The _split_field unit tests move from tests/events/test_utils.py to a Rust test in serialize.rs, since the implementation moved.

Behaviour

This is intended to be a behaviour-preserving refactor — the Rust core mirrors the previous Python output (field ordering, v1 key promotion, redaction placement per room version, null-redacts handling, transaction-ID gating).

Existing serialization, relations, and sync tests pass unchanged.

This comment was marked as outdated.

@erikjohnston erikjohnston force-pushed the erikj/rust_event_serialization branch 4 times, most recently from 7ea20c9 to 96a80d1 Compare June 16, 2026 09:27
erikjohnston and others added 2 commits June 16, 2026 10:32
Move `BundledAggregations` and `_ThreadAggregation` from the Python
`RelationsHandler` into frozen Rust pyclasses exposed from
`synapse.synapse_rust.events`. They are re-exported from
`synapse.handlers.relations` so existing call sites are unaffected.

The referenced events are stored by value (the `Event` pyclass now
derives a cheap, Arc-sharing `Clone`). The `references` field uses the
`JsonObject` type, which gains a `FromPyObject` impl so it can be built
from any Python mapping in place of a plain dict.

As the pyclass is immutable, `get_bundled_aggregations` is reworked to
collect each kind of aggregation and build one instance per event.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This is in prep for doing the actual serialization in Rust.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Comment thread rust/src/events/serialize.rs
Comment thread synapse/events/utils.py
erikjohnston and others added 2 commits June 16, 2026 11:20
Move the synchronous core of client event serialization out of
`synapse/events/utils.py` and into `rust/src/events/serialize.rs`. The
Python `EventClientSerializer` now performs all DB/IO (fetching
redactions, running module callbacks, resolving admin/MSC4354 config) up
front in `_prepare_serialization`, then hands the pre-fetched data to the
Rust `serialize_event`, which recurses entirely in Rust. Bundled
aggregations are read directly from the Rust `BundledAggregations`
pyclass.

`SerializeEventConfig` and the `event_format` callable become a Rust
pyclass and the `EventFormat` enum respectively; call sites are updated
to pass the enum.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@erikjohnston erikjohnston force-pushed the erikj/rust_event_serialization branch from 96a80d1 to 7d5dc63 Compare June 16, 2026 10:20
@erikjohnston erikjohnston marked this pull request as ready for review June 16, 2026 10:23
@erikjohnston erikjohnston requested a review from a team as a code owner June 16, 2026 10:23
Comment on lines 97 to +104
/// The `signatures` and `unsigned` fields are kept separate from the other
/// fields as they are mutable (and must be deep-copied if the event is cloned).
/// `common_fields` and `specific_fields` are both `#[serde(flatten)]`ed so that
/// the serialised JSON is a single flat object matching the Matrix spec.
/// fields as they are mutable. Note the derived [`Clone`] is *shallow*: it
/// shares the mutable `signatures`/`unsigned`/internal state behind their
/// `Arc`s (cheap, and fine for read-only uses such as bundled aggregations).
/// Use [`FormattedEvent::deep_copy`] when an independently-mutable copy is
/// required. `common_fields` and `specific_fields` are both
/// `#[serde(flatten)]`ed so that the serialised JSON is a single flat object
/// matching the Matrix spec.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Clone seems like a new footgun

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes. Not really sure I see much alternative though.

If this was a pure Rust class we could get rid of the interior mutability and have a top level Arc, however to make Python work we need to be able to get a mutable version of EventInternalMetadata and that requires interior mutability.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we do what was previously suggested: "and must be deep-copied if the event is cloned"?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, suppose we could. Really, we want a reference to the same event here, in that we could also store it as a Py<Event> (I somewhat want to avoid storing python references as then you have to partake in the GC). Event::deep_copy is really only intended for if you want a completely new copy of an event that you can edit (which is very rare, mostly we just share a reference to the same event).

I suppose we could also implement a shallow_clone() method instead, so as to force users to choose?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a shallow_copy in 1ace531, though in practice it has highlighted a couple of things:

  1. This means that other things (like ThreadAggregation) has to manually derive clone.
  2. Actually, cloning isn't as cheap as I thought as it will copy the immutable bits in FormattedEvent that aren't in an Arc.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed tack and made the aggregations structs store a Py<Event> for now, and dropped both Clone and shallow_copy.

It's not entirely clear to me if we have to implement __traverse__, since we know there won't be any reference cycles between the relation classes and the Event. Nonetheless I've implemented it to make sure we don't accidentally leak memory.

c.f. https://pyo3.rs/v0.28.3/class/protocols#garbage-collector-integration and https://docs.python.org/3/c-api/gcsupport.html

Comment thread rust/src/events/internal_metadata.rs
Comment thread rust/src/events/json_object.rs Outdated
Comment thread rust/src/events/serialize.rs Outdated
Comment thread rust/src/events/serialize.rs
// min() ensures the origin server can't claim a time in the future
// to exceed the stickiness duration limit.
let expires_at = std::cmp::min(event.origin_server_ts(), time_now_ms)
+ sticky_duration.as_millis() as i64;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as cast 😬 but I guess we can assume this is safe given canonical JSON numbers ([-(2**53)+1, (2**53)-1]) being within the i64 range

@erikjohnston erikjohnston Jun 22, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that does need a comment. We actually cap the sticky duration (to 1 hour), so the cast is safe.

1cb9fab

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear if that's actually true. You could receive an event from federation with a bogus sticky duration and we just cap it to 1 hour or the event is just not sticky at all.

But the event does have to valid canonical JSON which would protect us.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? Event::sticky_duration caps it at one hour before returning, so its impossible for sticky_duration to not be a valid i64 in this case.

We don't validate canonical json integer ranges for older room versions, so we can't necessarily rely on that.


result.push(unescape(&field[prev_start..]));
result
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New logic for split_field(...) compared to previous

def _split_field(field: str) -> list[str]:
"""
Splits strings on unescaped dots and removes escaping.
Args:
field: A string representing a path to a field.
Returns:
A list of nested fields to traverse.
"""
# Convert the field and remove escaping:
#
# 1. "content.body.thing\.with\.dots"
# 2. ["content", "body", "thing\.with\.dots"]
# 3. ["content", "body", "thing.with.dots"]
# Find all dots (and their preceding backslashes). If the dot is unescaped
# then emit a new field part.
result = []
prev_start = 0
for match in SPLIT_FIELD_REGEX.finditer(field):
# If the match is an *even* number of characters than the dot was escaped.
if len(match.group()) % 2 == 0:
continue
# Add a new part (up to the dot, exclusive) after escaping.
result.append(
ESCAPE_SEQUENCE_PATTERN.sub(
_escape_slash, field[prev_start : match.end() - 1]
)
)
prev_start = match.end()
# Add any part of the field after the last unescaped dot. (Note that if the
# character is a dot this correctly adds a blank string.)
result.append(re.sub(r"\\(.)", _escape_slash, field[prev_start:]))
return result

Haven't vetted this. I do see that we added some tests.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, it does it a different way to avoid regex. Sorry, should have flagged. It's more or less the same idea, find every . character and see if its been escaped or not (taking into account backslash escaping to, so \. is escaped, but \\. is not as it's a literal backslash followed by a dot).

The test cases are exactly the same though.

Comment thread synapse/events/utils.py Outdated
Comment thread synapse/events/utils.py
Comment thread synapse/events/utils.py
erikjohnston and others added 9 commits June 18, 2026 13:50
The previous comment said the method returned "a clone … in a
Value::Object", but it actually returns a plain reference to the
underlying BTreeMap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous name was ambiguous — it sounds like a list of field names
to include *or* exclude. "allowlist" makes the direction explicit.
Updates the Rust struct field, the Python-facing getter, and all call
sites.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old name was an implementation-detail label. serialize_event better
describes what the function does: serialize a single event, recursing
into its redactions and bundled aggregations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rather than implicitly updating the config in `serialize_event`, let's
add a helper for creating the config that does the necessary
checks/updates itself.
@erikjohnston erikjohnston force-pushed the erikj/rust_event_serialization branch from 19ba756 to a6c349e Compare June 22, 2026 13:04
Comment thread synapse/events/utils.py
Comment on lines 97 to +104
/// The `signatures` and `unsigned` fields are kept separate from the other
/// fields as they are mutable (and must be deep-copied if the event is cloned).
/// `common_fields` and `specific_fields` are both `#[serde(flatten)]`ed so that
/// the serialised JSON is a single flat object matching the Matrix spec.
/// fields as they are mutable. Note the derived [`Clone`] is *shallow*: it
/// shares the mutable `signatures`/`unsigned`/internal state behind their
/// `Arc`s (cheap, and fine for read-only uses such as bundled aggregations).
/// Use [`FormattedEvent::deep_copy`] when an independently-mutable copy is
/// required. `common_fields` and `specific_fields` are both
/// `#[serde(flatten)]`ed so that the serialised JSON is a single flat object
/// matching the Matrix spec.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we do what was previously suggested: "and must be deep-copied if the event is cloned"?

Comment on lines +215 to +218
/// Returns a reference to the underlying map of this object's entries.
pub fn as_map(&self) -> &BTreeMap<Box<str>, Value> {
&self.object
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused

Helpful?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah it's become unused now. I think it's likely going to be useful in future, so am minded to leave it in now that we wrote it.

Comment thread rust/src/events/serialize.rs Outdated
Comment on lines +102 to +107
/// Whether to apply the client event format transform (v1/v2/raw). When
/// `false`, the federation-format event is returned as-is.
as_client_event: bool,
/// Which client event format variant to apply (only used when
/// `as_client_event` is `true`).
event_format: EventFormat,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't look like these need to be separate and we could just rely on event_format.

I'm guessing prior art but we should call this out with a FIXME for a follow-up PR

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, does look like the case. Have added a FIXME 18d488c

Comment thread rust/src/events/serialize.rs Outdated
Comment thread rust/src/events/serialize.rs Outdated
Comment thread rust/src/events/serialize.rs Outdated
Comment on lines +450 to +451
// us it *should* be an integer, but it is possible for it to be out of i64
// range. In that case, just omit `age` rather than erroring.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is that possible?

Why omit? Not important?

Seems like something to assert/panic on

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they transmit an age that is close to, but not over, the i64 limit, then eventually our internal calculations would make age over the limit. It's not clear how you'd properly protect yourself in that case, beyond discarding dodgy looking age fields, which is somewhat equivalent to what we're doing here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// min() ensures the origin server can't claim a time in the future
// to exceed the stickiness duration limit.
let expires_at = std::cmp::min(event.origin_server_ts(), time_now_ms)
+ sticky_duration.as_millis() as i64;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear if that's actually true. You could receive an event from federation with a bogus sticky duration and we just cap it to 1 hour or the event is just not sticky at all.

But the event does have to valid canonical JSON which would protect us.

Comment thread synapse/events/utils.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants