Describe the bug
net_kernel:start/2 followed by net_kernel:stop/0 in a tight loop intermittently fails to restart with:
```
Protocol 'inet_tcp': the name @ seems to be in use by another Erlang node
```
The failure is a race between epmd's userspace processing of the prior TCP FIN and the new node's ALIVE2_REQ. When gen_tcp:close/1 returns to erl_epmd, the kernel has queued the FIN, but epmd's read loop has not necessarily processed the EOF and unregistered the node by the time the next ALIVE2_REQ arrives. epmd therefore returns ALIVE2_RESP status=1 ("already registered"), which erl_epmd:wait_for_reg_reply/2 translates to {error, duplicate_name}, and net_kernel:init/1 propagates that as {stop, duplicate_name}.
TCP close is the only unregister signal in the protocol, so a client that wants to re-register cannot avoid this race except by waiting or retrying.
To Reproduce
```erlang
-module(repro_otp_epmd_race).
-export([main/0]).

%% Run with: erlc repro_otp_epmd_race.erl && erl -noshell -s repro_otp_epmd_race main
main() ->
    Errors = loop(1000, 0, []),
    io:format("done. failures=~p~n", [length(Errors)]),
    erlang:halt(case Errors of [] -> 0; _ -> 1 end).

loop(0, _, Errors) ->
    Errors;
loop(N, Iter, Errors) ->
    NextIter = Iter + 1,
    case net_kernel:start(otp_repro, #{name_domain => shortnames}) of
        {ok, _} ->
            ok = net_kernel:stop(),
            loop(N - 1, NextIter, Errors);
        {error, _} = E ->
            catch net_kernel:stop(),
            loop(N - 1, NextIter, [{NextIter, E} | Errors])
    end.
```
On an otherwise idle Linux VM (multipass, on an Intel Mac), I see 4-5 failures per 1000 iterations. Under CPU load (e.g. taskset -c 1,2,3 stress-ng --cpu 3 --cpu-load 90) the failure rate rises slightly.
Expected behavior
net_kernel:start/2 either succeeds or fails for a real reason (the name is actually in use by another live node). It should not fail because epmd has not yet finished its bookkeeping for this same node's prior connection.
Affected versions
This was tested with OTP 28.1.
Additional context
I packet-captured 1000 iterations of an equivalent reproducer with tcpdump -w epmd.pcap port 4369 and parsed the ALIVE2_RESP status byte. All status=1 responses (16/1000) occurred when the new connection's SYN arrived 1 to 3ms after the prior connection's FIN; every connection with more than 3ms gap succeeded.
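For reference, a minimal sketch of the status-byte check described above. The wire format follows the distribution protocol documentation: ALIVE2_RESP is tag 121 ('y'), then a 1-byte Result (0 = ok, nonzero = error) and a 16-bit Creation. The module and function names are mine, not anything in OTP; note also that OTP 23+ epmd may answer with ALIVE2_X_RESP (tag 118, 32-bit creation), which this sketch does not handle.

```erlang
-module(alive2_resp).
-export([decode/1]).

%% Decode an ALIVE2_RESP message: <<121, Result, Creation:16>>.
decode(<<121, 0, Creation:16>>) ->
    {ok, Creation};
decode(<<121, Result, _Creation:16>>) ->
    {error, Result};   %% e.g. Result = 1 for "already registered"
decode(_Other) ->
    {error, unexpected_reply}.
```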
This was found investigating a downstream test_net_kernel flake in AtomVM. The retry approach reduced the failure rate from ~2.5% to 0/1000 in our erl_epmd (4 retries, 5/10/20/40ms backoff). Happy to send a PR for OTP's erl_epmd if the approach is acceptable.
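The retry shape we used can be sketched as below. This is a hypothetical standalone helper, not OTP's actual erl_epmd code: the module/function names and the generic fun-based interface are assumptions; the 5/10/20/40 ms schedule mirrors the one described above. In practice RegisterFun would be a closure around the registration call (e.g. erl_epmd:register_node/3), whose duplicate-name failure surfaces as {error, duplicate_name} per erl_epmd:wait_for_reg_reply/2.

```erlang
-module(epmd_retry_sketch).
-export([with_retry/2]).

%% Call RegisterFun(); if it fails with {error, duplicate_name}, assume
%% epmd has not yet processed the prior connection's FIN, sleep for the
%% next backoff interval (ms), and retry. Any other result is returned
%% as-is; an exhausted backoff list returns the last error.
with_retry(RegisterFun, Backoffs) ->
    case RegisterFun() of
        {ok, _} = Ok ->
            Ok;
        {error, duplicate_name} when Backoffs =/= [] ->
            [Delay | Rest] = Backoffs,
            timer:sleep(Delay),
            with_retry(RegisterFun, Rest);
        Error ->
            Error
    end.
```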