net_kernel:start/stop cycles race against epmd's FIN processing #11083

@pguyot

Describe the bug

net_kernel:start/2 followed by net_kernel:stop/0 in a tight loop intermittently fails to restart with:

Protocol 'inet_tcp': the name @ seems to be in use by another Erlang node

The failure is a race between epmd's userspace processing of the prior connection's TCP FIN and the new node's ALIVE2_REQ. When gen_tcp:close/1 returns to erl_epmd, the kernel has queued the FIN, but epmd's read loop has not necessarily processed the EOF and unregistered the node by the time the next ALIVE2_REQ arrives. In that case epmd returns ALIVE2_RESP status=1 ("already registered"), which erl_epmd:wait_for_reg_reply/2 translates to {error, duplicate_name} and net_kernel:init/1 propagates as {stop, duplicate_name}.

TCP close is the only unregister signal in the protocol, so a client that wants to re-register cannot avoid this race except by waiting or retrying.
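For readers less familiar with the wire format, here is a minimal sketch of the registration frame as described in the Erlang distribution protocol documentation. The module name `epmd_wire_sketch` and the function `alive2_req/2` are made up for illustration; this is not the code in erl_epmd.

```erlang
-module(epmd_wire_sketch).
-export([alive2_req/2]).

%% Build an ALIVE2_REQ frame for a node called Name (a binary, without
%% the @host part) that accepts distribution connections on Port.
%% Hypothetical helper for illustration only.
alive2_req(Name, Port) when is_binary(Name), is_integer(Port) ->
    Extra = <<>>,
    Body = <<120,                          % ALIVE2_REQ tag
             Port:16,                      % distribution listen port
             77,                           % node type: normal node
             0,                            % protocol: TCP/IPv4
             6:16,                         % highest distribution version
             5:16,                         % lowest distribution version
             (byte_size(Name)):16, Name/binary,
             (byte_size(Extra)):16, Extra/binary>>,
    %% Every request to epmd is prefixed with a 2-byte length.
    <<(byte_size(Body)):16, Body/binary>>.
```

The registration lasts for as long as the socket that carried this frame stays open. There is no matching "unregister" message, which is why the close of that socket is the only thing epmd can act on.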

To Reproduce

-module(repro_otp_epmd_race).
-export([main/0]).

%% Start and stop distribution 1000 times and report how many starts failed.
main() ->
    Errors = loop(1000, 0, []),
    io:format("done. failures=~p~n", [length(Errors)]),
    erlang:halt(case Errors of [] -> 0; _ -> 1 end).

%% loop(Remaining, Iter, Errors): collects {Iteration, Error} for each failed start.
loop(0, _, Errors) -> Errors;
loop(N, Iter, Errors) ->
    NextIter = Iter + 1,
    case net_kernel:start(otp_repro, #{name_domain => shortnames}) of
        {ok, _} ->
            ok = net_kernel:stop(),
            loop(N - 1, NextIter, Errors);
        {error, _} = E ->
            %% Best-effort cleanup so the next iteration starts from a stopped state.
            catch net_kernel:stop(),
            loop(N - 1, NextIter, [{NextIter, E} | Errors])
    end.

On an idle Linux multipass VM on this Intel Mac, I see 4-5 failures per 1000 iterations. Under CPU load (e.g. taskset -c 1,2,3 stress-ng --cpu 3 --cpu-load 90) the rate rises slightly.

Expected behavior

net_kernel:start/2 either succeeds or fails for a real reason (name actually in use by another live node). It should not fail because epmd hasn't yet finished bookkeeping for this same node's prior connection.

Affected versions

This was tested with OTP 28.1.

Additional context

I packet-captured 1000 iterations of an equivalent reproducer with tcpdump -w epmd.pcap port 4369 and parsed the ALIVE2_RESP status byte. All status=1 responses (16/1000) occurred when the new connection's SYN arrived 1 to 3 ms after the prior connection's FIN; every connection with a gap of more than 3 ms succeeded.
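As a rough illustration of what "parsing the status byte" means, the sketch below decodes an epmd registration reply payload. It assumes the documented layouts for ALIVE2_RESP (tag 121, 16-bit creation) and ALIVE2_X_RESP (tag 118, 32-bit creation). The module name, function name and the `nonzero_status` error term are hypothetical, and this is not the script that was used on the capture.

```erlang
-module(epmd_resp_sketch).
-export([decode_alive2_resp/1]).

%% Decode the payload of a registration reply from epmd.
%% Status 0 means the name was registered; any other value is a refusal.
decode_alive2_resp(<<121, 0, Creation:16>>) ->
    {ok, Creation};
decode_alive2_resp(<<118, 0, Creation:32>>) ->
    {ok, Creation};
decode_alive2_resp(<<Tag, Status, _/binary>>)
  when (Tag =:= 121 orelse Tag =:= 118), Status > 0 ->
    %% status=1 is the "already registered" case seen in the capture.
    {error, {nonzero_status, Status}};
decode_alive2_resp(_) ->
    {error, malformed}.
```

For example, `epmd_resp_sketch:decode_alive2_resp(<<121, 1, 0, 0>>)` yields `{error, {nonzero_status, 1}}`, which corresponds to the failing iterations.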

This was found while investigating a downstream test_net_kernel flake in AtomVM. Retrying registration in our erl_epmd (4 retries, 5/10/20/40 ms backoff) reduced the failure rate from ~2.5% to 0/1000. Happy to send a PR for OTP's erl_epmd if the approach is acceptable.
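To make the backoff idea concrete, here is a minimal caller-side sketch. It is not the AtomVM change and not a proposed erl_epmd patch: `retry_sketch:retry/2` is a hypothetical helper, and because this report does not pin down the exact error term that net_kernel:start/2 returns, it retries on any `{error, _}` rather than on duplicate_name specifically.

```erlang
-module(retry_sketch).
-export([retry/2]).

%% Call Fun(). If it returns {error, _} and there are delays left,
%% sleep for the next delay (in milliseconds) and try again.
%% Any other result, or the last error once Delays is exhausted,
%% is returned unchanged.
retry(Fun, Delays) when is_function(Fun, 0), is_list(Delays) ->
    case Fun() of
        {error, _} when Delays =/= [] ->
            [Delay | Rest] = Delays,
            timer:sleep(Delay),
            retry(Fun, Rest);
        Result ->
            Result
    end.
```

With the delays mentioned above it could be used as `retry_sketch:retry(fun() -> net_kernel:start(otp_repro, #{name_domain => shortnames}) end, [5, 10, 20, 40])`. A fix inside erl_epmd would be narrower, since it can match the status=1 reply directly instead of retrying on every error.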

Labels: bug, team:VM
