Port the synchronous event serialization core to Rust#19837
Port the synchronous event serialization core to Rust#19837erikjohnston wants to merge 26 commits into
Conversation
19e4ce9 to
88eeb19
Compare
7ea20c9 to
96a80d1
Compare
Move `BundledAggregations` and `_ThreadAggregation` from the Python `RelationsHandler` into frozen Rust pyclasses exposed from `synapse.synapse_rust.events`. They are re-exported from `synapse.handlers.relations` so existing call sites are unaffected. The referenced events are stored by value (the `Event` pyclass now derives a cheap, Arc-sharing `Clone`). The `references` field uses the `JsonObject` type, which gains a `FromPyObject` impl so it can be built from any Python mapping in place of a plain dict. As the pyclass is immutable, `get_bundled_aggregations` is reworked to collect each kind of aggregation and build one instance per event. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This is in prep for doing the actual serialization in Rust.
Move the synchronous core of client event serialization out of `synapse/events/utils.py` and into `rust/src/events/serialize.rs`. The Python `EventClientSerializer` now performs all DB/IO (fetching redactions, running module callbacks, resolving admin/MSC4354 config) up front in `_prepare_serialization`, then hands the pre-fetched data to the Rust `serialize_event`, which recurses entirely in Rust. Bundled aggregations are read directly from the Rust `BundledAggregations` pyclass. `SerializeEventConfig` and the `event_format` callable become a Rust pyclass and the `EventFormat` enum respectively; call sites are updated to pass the enum. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
96a80d1 to
7d5dc63
Compare
| /// The `signatures` and `unsigned` fields are kept separate from the other | ||
| /// fields as they are mutable (and must be deep-copied if the event is cloned). | ||
| /// `common_fields` and `specific_fields` are both `#[serde(flatten)]`ed so that | ||
| /// the serialised JSON is a single flat object matching the Matrix spec. | ||
| /// fields as they are mutable. Note the derived [`Clone`] is *shallow*: it | ||
| /// shares the mutable `signatures`/`unsigned`/internal state behind their | ||
| /// `Arc`s (cheap, and fine for read-only uses such as bundled aggregations). | ||
| /// Use [`FormattedEvent::deep_copy`] when an independently-mutable copy is | ||
| /// required. `common_fields` and `specific_fields` are both | ||
| /// `#[serde(flatten)]`ed so that the serialised JSON is a single flat object | ||
| /// matching the Matrix spec. |
There was a problem hiding this comment.
The Clone seems like a new footgun
There was a problem hiding this comment.
Hmm, yes. Not really sure I see much alternative though.
If this was a pure Rust class we could get rid of the interior mutability and have a top level Arc, however to make Python work we need to be able to get a mutable version of EventInternalMetadata and that requires interior mutability.
There was a problem hiding this comment.
Why don't we do what was previously suggested: "and must be deep-copied if the event is cloned"?
There was a problem hiding this comment.
Mmm, suppose we could. Really, we want a reference to the same event here, in that we could also store it as a Py<Event> (I somewhat want to avoid storing python references as then you have to partake in the GC). Event::deep_copy is really only intended for if you want a completely new copy of an event that you can edit (which is very rare, mostly we just share a reference to the same event).
I suppose we could also implement a shallow_clone() method instead, so as to force users to choose?
There was a problem hiding this comment.
I've added a shallow_copy in 1ace531, though in practice it has highlighted a couple of things:
- This means that other things (like
ThreadAggregation) has to manually derive clone. - Actually, cloning isn't as cheap as I thought as it will copy the immutable bits in
FormattedEventthat aren't in anArc.
There was a problem hiding this comment.
I've changed tack and made the aggregations structs store a Py<Event> for now, and dropped both Clone and shallow_copy.
It's not entirely clear to me if we have to implement __traverse__, since we know there won't be any reference cycles between the relation classes and the Event. Nonetheless I've implemented it to make sure we don't accidentally leak memory.
c.f. https://pyo3.rs/v0.28.3/class/protocols#garbage-collector-integration and https://docs.python.org/3/c-api/gcsupport.html
| // min() ensures the origin server can't claim a time in the future | ||
| // to exceed the stickiness duration limit. | ||
| let expires_at = std::cmp::min(event.origin_server_ts(), time_now_ms) | ||
| + sticky_duration.as_millis() as i64; |
There was a problem hiding this comment.
as cast 😬 but I guess we can assume this is safe given canonical JSON numbers ([-(2**53)+1, (2**53)-1]) being within the i64 range
There was a problem hiding this comment.
Yup, that does need a comment. We actually cap the sticky duration (to 1 hour), so the cast is safe.
There was a problem hiding this comment.
Unclear if that's actually true. You could receive an event from federation with a bogus sticky duration and we just cap it to 1 hour or the event is just not sticky at all.
But the event does have to valid canonical JSON which would protect us.
There was a problem hiding this comment.
What do you mean? Event::sticky_duration caps it at one hour before returning, so its impossible for sticky_duration to not be a valid i64 in this case.
We don't validate canonical json integer ranges for older room versions, so we can't necessarily rely on that.
|
|
||
| result.push(unescape(&field[prev_start..])); | ||
| result | ||
| } |
There was a problem hiding this comment.
New logic for split_field(...) compared to previous
synapse/synapse/events/utils.py
Lines 146 to 184 in 8cd695a
Haven't vetted this. I do see that we added some tests.
There was a problem hiding this comment.
Oh yeah, it does it a different way to avoid regex. Sorry, should have flagged. It's more or less the same idea, find every . character and see if its been escaped or not (taking into account backslash escaping to, so \. is escaped, but \\. is not as it's a literal backslash followed by a dot).
The test cases are exactly the same though.
The previous comment said the method returned "a clone … in a Value::Object", but it actually returns a plain reference to the underlying BTreeMap. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous name was ambiguous — it sounds like a list of field names to include *or* exclude. "allowlist" makes the direction explicit. Updates the Rust struct field, the Python-facing getter, and all call sites. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old name was an implementation-detail label. serialize_event better describes what the function does: serialize a single event, recursing into its redactions and bundled aggregations. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rather than implicitly updating the config in `serialize_event`, let's add a helper for creating the config that does the necessary checks/updates itself.
19ba756 to
a6c349e
Compare
| /// The `signatures` and `unsigned` fields are kept separate from the other | ||
| /// fields as they are mutable (and must be deep-copied if the event is cloned). | ||
| /// `common_fields` and `specific_fields` are both `#[serde(flatten)]`ed so that | ||
| /// the serialised JSON is a single flat object matching the Matrix spec. | ||
| /// fields as they are mutable. Note the derived [`Clone`] is *shallow*: it | ||
| /// shares the mutable `signatures`/`unsigned`/internal state behind their | ||
| /// `Arc`s (cheap, and fine for read-only uses such as bundled aggregations). | ||
| /// Use [`FormattedEvent::deep_copy`] when an independently-mutable copy is | ||
| /// required. `common_fields` and `specific_fields` are both | ||
| /// `#[serde(flatten)]`ed so that the serialised JSON is a single flat object | ||
| /// matching the Matrix spec. |
There was a problem hiding this comment.
Why don't we do what was previously suggested: "and must be deep-copied if the event is cloned"?
| /// Returns a reference to the underlying map of this object's entries. | ||
| pub fn as_map(&self) -> &BTreeMap<Box<str>, Value> { | ||
| &self.object | ||
| } |
There was a problem hiding this comment.
Hmm, yeah it's become unused now. I think it's likely going to be useful in future, so am minded to leave it in now that we wrote it.
| /// Whether to apply the client event format transform (v1/v2/raw). When | ||
| /// `false`, the federation-format event is returned as-is. | ||
| as_client_event: bool, | ||
| /// Which client event format variant to apply (only used when | ||
| /// `as_client_event` is `true`). | ||
| event_format: EventFormat, |
There was a problem hiding this comment.
Doesn't look like these need to be separate and we could just rely on event_format.
I'm guessing prior art but we should call this out with a FIXME for a follow-up PR
There was a problem hiding this comment.
Mmm, does look like the case. Have added a FIXME 18d488c
| // us it *should* be an integer, but it is possible for it to be out of i64 | ||
| // range. In that case, just omit `age` rather than erroring. |
There was a problem hiding this comment.
How is that possible?
Why omit? Not important?
Seems like something to assert/panic on
There was a problem hiding this comment.
If they transmit an age that is close to, but not over, the i64 limit, then eventually our internal calculations would make age over the limit. It's not clear how you'd properly protect yourself in that case, beyond discarding dodgy looking age fields, which is somewhat equivalent to what we're doing here.
| // min() ensures the origin server can't claim a time in the future | ||
| // to exceed the stickiness duration limit. | ||
| let expires_at = std::cmp::min(event.origin_server_ts(), time_now_ms) | ||
| + sticky_duration.as_millis() as i64; |
There was a problem hiding this comment.
Unclear if that's actually true. You could receive an event from federation with a bogus sticky duration and we just cap it to 1 hour or the event is just not sticky at all.
But the event does have to valid canonical JSON which would protect us.
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
This reverts commit 1ace531.
Summary
This moves the synchronous core of client event serialization out of
synapse/events/utils.pyand into Rust (rust/src/events/serialize.rs).Event serialization is on the hot path for
/sync,/messages, and most client-facing endpoints. Previously it was a recursive pure-Python routine (_serialize_event/_inject_bundled_aggregations/only_fields) that interleavedCPU-bound formatting with
asyncDB/IO. This PR separates those two concerns and moves the CPU-bound half to Rust:EventClientSerializer._prepare_serializationdoes all DB/IO up front: batch-fetching redaction events and running the registered moduleunsigned-callback hooks, for the top-level events and every bundled sub-event (edits and thread latest events, which are themselves serialized). The admin/MSC4354 config is resolved once via_update_config, rather than re-checked on every recursive call as the old code did.redaction_map,unsigned_additions, andbundle_aggregations, the Rust code produces the client JSON entirely in Rust — including the v1/v2 format transforms,only_event_fieldsfiltering, redaction handling, and recursive bundled aggregations.Details
serialize_events, taking a list of(event, membership)pairs. The three lookup maps are shared across the whole batch, so they're read out of Python and converted to Ruststructures once per batch rather than once per event.
EventClientSerializer.serialize_event(singular) is a thin wrapper that calls it with a one-element list.SerializeEventConfigis now a Rustpyclass, and the oldevent_formatcallable is replaced by theEventFormatenum (Raw/ClientV1/ClientV2/ClientV2WithoutRoomId). Call sites inrest/admin/events.pyandrest/client/{notifications,room,sync}.pyare updated to pass the enum.make_config_for_adminand MSC4354 enablement now go throughSerializeEventConfig.for_admin()/with_msc4354().EventInternalMetadata(redacted_by,txn_id,device_id,token_id,delay_id,soft_failed,policy_server_spammy) expose to Rust the fields the serializer reads._split_fieldunit tests move fromtests/events/test_utils.pyto a Rust test inserialize.rs, since the implementation moved.Behaviour
This is intended to be a behaviour-preserving refactor — the Rust core mirrors the previous Python output (field ordering, v1 key promotion, redaction placement per room version, null-
redactshandling, transaction-ID gating).Existing serialization, relations, and sync tests pass unchanged.