fix(mnesia): load ram_copies table from 'better' copy when start #11020

Open
qzhuyan wants to merge 1 commit into erlang:maint-28 from qzhuyan:fix/mnesia-ram-copies-safeload-during-start

Conversation

@qzhuyan (Contributor) commented Apr 15, 2026

fix #11021

Without this fix, a ram_copies table is 'safe loaded' locally while the remote nodes are not (yet) connected. This makes the table accessible to the application because 'where_to_read' is set to the local node(), so reads are served from the local copy:

mnesia:dirty_read(emqx_route_filters,1).
[]

Also, when a remote node connects afterwards, the ram_copies table does not load the 'better' copy, which leaves the data inconsistent within the cluster.
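To tell which of the two situations a node is in, the 'where_to_read' value can be inspected with `mnesia:table_info/2`. The following is only a minimal diagnostic sketch; the module name `where_to_read_sketch` and its functions are hypothetical and not part of this commit.

```erlang
-module(where_to_read_sketch).
-export([classify/2, read_location/1]).

%% Hypothetical helper, not part of the commit.
%% Classify a where_to_read value relative to the local node.
classify(nowhere, _Local) -> not_readable;        % table is not loaded anywhere we can read
classify(Node, Node)      -> served_locally;      % reads hit the local copy
classify(Node, _Local)    -> {served_by, Node}.   % reads are forwarded to Node

%% Ask mnesia where reads for Tab are currently served from.
read_location(Tab) ->
    classify(mnesia:table_info(Tab, where_to_read), node()).
```

On an isolated node that has just restarted, `read_location(emqx_route_filters)` returning `served_locally` would correspond to the behaviour described above.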

This commit changes mnesia start so that a ram_copies table does NOT do a local safe load when there is a better copy on a remote node. 'where_to_read' stays 'nowhere' and table access is not served:

mnesia:dirty_read(emqx_route_filters,1).
** exception exit: {aborted,{no_exists,[emqx_route_filters,1]}}

After the remote node is connected, the local node does a net_load_table from the remote.
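Since reads exit with `no_exists` until the table has been loaded, an application may want to wait for the table before touching it. A minimal sketch using `mnesia:wait_for_tables/2`; the module `table_guard_sketch` and the function `read_when_loaded/3` are hypothetical names used for illustration only.

```erlang
-module(table_guard_sketch).
-export([read_when_loaded/3]).

%% Hypothetical helper, not part of the commit.
%% Wait up to Timeout milliseconds for Tab to be loaded, then read Key.
read_when_loaded(Tab, Key, Timeout) ->
    case mnesia:wait_for_tables([Tab], Timeout) of
        ok ->
            %% Table is loaded (locally or from a remote copy).
            {ok, mnesia:dirty_read(Tab, Key)};
        {timeout, NotLoaded} ->
            %% Still waiting, e.g. for a remote node holding the better copy.
            {error, {not_loaded, NotLoaded}};
        {error, Reason} ->
            {error, Reason}
    end.
```

With this guard, a caller gets `{error, {not_loaded, [Tab]}}` instead of an exit while the node is still waiting for the remote copy.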

Reuse the adopt_orphans function to resolve the conflict when deciding the better copy ends in a deadlock; this is the same behaviour as for disc_copies tables.

note1: BetterCopies0 = mnesia_lib:remote_copy_holders(Cs) -- Downs

note2: disc_copies tables have no such issue.

note3: if there is no better copy (the other nodes went down before the current one), it is correct to load from the local copy.
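The three notes can be read as the following decision rule. This is a simplified model and not the code in mnesia_controller: the module `ram_load_sketch`, the function `decide/3`, and the returned atoms are hypothetical, and it assumes `Downs` is the list of remote nodes known to have gone down before the local node (the reading suggested by note3).

```erlang
-module(ram_load_sketch).
-export([decide/3]).

%% Hypothetical model of the start-up decision, not the actual implementation.
%% RemoteHolders: remote nodes that hold a copy of the table.
%% Downs:         remote nodes assumed to have gone down before the local node.
%% Connected:     remote nodes that are currently connected.
decide(RemoteHolders, Downs, Connected) ->
    %% note1: holders that did not go down before us may have a better copy.
    BetterCopies = RemoteHolders -- Downs,
    case BetterCopies of
        [] ->
            %% note3: nobody can have newer data, loading locally is correct.
            local_load;
        _ ->
            case [N || N <- BetterCopies, lists:member(N, Connected)] of
                [Node | _] -> {net_load, Node};   % load from a connected better copy
                []         -> wait_for_remote     % where_to_read stays 'nowhere'
            end
    end.
```

For example, `decide([b], [], [])` models the isolated restart from the description: node b may hold a better copy but is not connected yet, so the table stays unloaded instead of being safe loaded locally.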

@github-actions Bot (Contributor) commented Apr 15, 2026

CT Test Results

  2 files   61 suites   19m 8s ⏱️
694 tests 543 ✅ 151 💤 0 ❌
749 runs  583 ✅ 166 💤 0 ❌

Results for commit ddc6d9d.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

@qzhuyan force-pushed the fix/mnesia-ram-copies-safeload-during-start branch from dc297e9 to ddc6d9d on April 20, 2026 08:18
Labels

team:PS Assigned to OTP team PS

Development

Successfully merging this pull request may close these issues.

mnesia: Data inconsistencies observed in Mnesia ram_copies table after an isolated node restart and reconnection.

2 participants