🐛 BUG: Denial of Service in SSH Management Interface via Synchronous Handshake (Blocking Accept Loop) #1606

@mrtufan

What version of nebula are you using? (nebula -version)

1.10.3

What operating system are you using?

Linux

Describe the Bug

Summary

The internal SSH server in Nebula (sshd package) performs the initial protocol handshake synchronously within the main TCP accept loop. A single client that completes the TCP handshake but never sends its protocol version string blocks the listener, preventing legitimate administrators from accessing the management interface.

Technical Analysis

I analyzed the sshd/server.go file and identified a concurrency issue in the run method. The server accepts a new TCP connection and immediately calls ssh.NewServerConn(c, s.config) in the same goroutine as the accept loop.

Vulnerable Code: sshd/server.go

func (s *SSHServer) run() {
    for {
        // 1. Accept new TCP connection
        c, err := s.listener.Accept()
        if err != nil { ... }

        // VULNERABILITY: Blocking call.
        // The loop halts here waiting for the SSH handshake to complete.
        // If the client does not send the version string, this blocks forever.
        conn, chans, reqs, err := ssh.NewServerConn(c, s.config) 

        // The goroutine is started ONLY after the handshake returns
        go func() { ... }()
    }
}

The ssh.NewServerConn function waits for the client to send its SSH version string (e.g., SSH-2.0-Client...). If a client completes the TCP handshake but never sends this string, NewServerConn blocks indefinitely. Because the call sits inside the main for loop, s.listener.Accept() is not called again: later connections are still established at the TCP level by the kernel, but the server never services them, which locks out all other users.
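The effect can be reproduced without Nebula. The following is a minimal sketch using only the Go standard library. It does not use Nebula's code or golang.org/x/crypto/ssh: `fakeHandshake` is a hypothetical stand-in for ssh.NewServerConn that sends a banner and then waits for one line from the peer, and `serveSerial`, `bannerArrives`, and `secondClientServed` are illustrative names.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"time"
)

// fakeHandshake is a stand-in for ssh.NewServerConn (an assumption of this
// sketch): it sends a banner, then blocks until the peer sends one line.
func fakeHandshake(c net.Conn) error {
	if _, err := fmt.Fprint(c, "SSH-2.0-Demo\r\n"); err != nil {
		return err
	}
	_, err := bufio.NewReader(c).ReadString('\n')
	return err
}

// serveSerial mirrors the pattern described above: the handshake runs inside
// the accept loop, so Accept is not called again until the handshake returns.
func serveSerial(l net.Listener) {
	for {
		c, err := l.Accept()
		if err != nil {
			return // listener closed
		}
		if err := fakeHandshake(c); err != nil {
			c.Close()
			continue
		}
		c.Close() // session handling would go here
	}
}

// bannerArrives reports whether addr sends a full line within timeout.
func bannerArrives(addr string, timeout time.Duration) bool {
	c, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	defer c.Close()
	c.SetReadDeadline(time.Now().Add(timeout))
	_, err = bufio.NewReader(c).ReadString('\n')
	return err == nil
}

// secondClientServed starts a serial server, parks one silent client on it,
// and reports whether a second client still receives a banner.
func secondClientServed() bool {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer l.Close()
	go serveSerial(l)

	silent, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		panic(err)
	}
	defer silent.Close()
	time.Sleep(100 * time.Millisecond) // let the server enter the handshake

	return bannerArrives(l.Addr().String(), 500*time.Millisecond)
}

func main() {
	fmt.Println("second client served:", secondClientServed())
}
```

With the silent client parked in the handshake, the second client's TCP connection succeeds but no banner arrives before the timeout, so the program should report `false`.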

Steps to Reproduce

  1. Setup Environment
    Generate the necessary CA, certificates, and a dummy SSH host key to run Nebula.

# Generate CA and Host Certificate
./nebula-cert ca -name "TestOrg"
./nebula-cert sign -name "host" -ip "192.168.100.1/24"

# Generate a dummy host key for SSHD
ssh-keygen -t ed25519 -f ssh_host_key -N ""

  2. Run Nebula
    Start the nebula instance:

./nebula -config config.yml

  3. Attack (The Exploit)
    Open a terminal and connect using netcat, but do not send any data. This simulates a hung client:

nc 127.0.0.1 2222

(Leave this command running. The server sends its banner, but we send nothing back.)

  4. Verification
    Open a second terminal and attempt to connect as a legitimate administrator:

ssh -v -p 2222 admin@127.0.0.1

Result:
The legitimate SSH client successfully establishes a TCP connection (debug1: Connection established) but hangs indefinitely at debug1: Local version string.... It never receives the remote protocol version because the server's accept loop is blocked by the netcat connection.

Impact

Primary Impact: Denial of Service (DoS) of Management Interface

Loss of Administrative Access: An unauthenticated attacker can render the SSH management interface completely unavailable to legitimate administrators. This prevents the operations team from logging in to monitor, debug, reconfigure, or restart the node remotely.

Operational Blindness: During an active attack, administrators effectively lose control over the node's configuration and status monitoring via SSH.

Ease of Exploitation: The attack requires minimal resources (a single standard TCP connection using netcat) and no authentication. It can be sustained indefinitely with negligible cost to the attacker.

Scope: This vulnerability specifically impacts the Management Plane. While the overlay VPN traffic (Data Plane) may continue to function, the inability to manage the node represents a significant availability risk in production environments.

Recommended Fix

The handshake logic should be handled asynchronously to ensure the main accept loop is never blocked. Ideally, a deadline should also be set.

Suggested Patch Pattern:

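func (s *SSHServer) run() {
    for {
        c, err := s.listener.Accept()
        if err != nil {
            // Listener closed or failed: stop rather than spin on the error
            return
        }

        // Fix: Hand off to goroutine immediately so Accept is called again
        go s.handleHandshake(c)
    }
}

func (s *SSHServer) handleHandshake(c net.Conn) {
    // Set deadline so a silent client cannot hold the connection open forever
    c.SetDeadline(time.Now().Add(10 * time.Second))

    conn, chans, reqs, err := ssh.NewServerConn(c, s.config)
    if err != nil {
        // Handshake failed or timed out: release the socket
        c.Close()
        return
    }

    // Clear the deadline, otherwise the established session is cut off after 10 seconds
    c.SetDeadline(time.Time{})

    // ... handle connection ...
}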
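The hand-off pattern can be exercised without Nebula. The sketch below uses only the Go standard library. `fakeHandshake` is a hypothetical stand-in for ssh.NewServerConn, the 300 ms `handshakeTimeout` is an arbitrary value chosen to keep the demo short, and `serveConcurrent` and `runDemo` are illustrative names. It also clears the deadline once the handshake succeeds, since a deadline left in place would later cut off an established session.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"time"
)

// handshakeTimeout is an assumed value for this sketch only.
const handshakeTimeout = 300 * time.Millisecond

// fakeHandshake stands in for ssh.NewServerConn: banner out, one line in.
func fakeHandshake(c net.Conn) error {
	if _, err := fmt.Fprint(c, "SSH-2.0-Demo\r\n"); err != nil {
		return err
	}
	_, err := bufio.NewReader(c).ReadString('\n')
	return err
}

// serveConcurrent only accepts; each handshake runs in its own goroutine.
func serveConcurrent(l net.Listener) {
	for {
		c, err := l.Accept()
		if err != nil {
			return // listener closed
		}
		go handleHandshake(c)
	}
}

func handleHandshake(c net.Conn) {
	defer c.Close()
	// Bound the handshake so a silent client is eventually dropped.
	c.SetDeadline(time.Now().Add(handshakeTimeout))
	if err := fakeHandshake(c); err != nil {
		return
	}
	// Clear the deadline before the long-lived session starts.
	c.SetDeadline(time.Time{})
	// Session handling would go here.
}

// runDemo parks a silent client, then reports whether a second client still
// gets a banner and whether the server drops the silent client.
func runDemo() (served, dropped bool) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer l.Close()
	go serveConcurrent(l)
	addr := l.Addr().String()

	silent, err := net.Dial("tcp", addr)
	if err != nil {
		panic(err)
	}
	defer silent.Close()

	second, err := net.Dial("tcp", addr)
	if err != nil {
		panic(err)
	}
	defer second.Close()
	second.SetReadDeadline(time.Now().Add(2 * time.Second))
	_, err = bufio.NewReader(second).ReadString('\n')
	served = err == nil

	// The silent client receives the banner, then EOF once the deadline fires.
	silent.SetReadDeadline(time.Now().Add(2 * time.Second))
	r := bufio.NewReader(silent)
	if _, err = r.ReadString('\n'); err != nil {
		return served, false
	}
	_, err = r.ReadString('\n')
	dropped = err == io.EOF
	return served, dropped
}

func main() {
	served, dropped := runDemo()
	fmt.Println("second client served:", served)
	fmt.Println("silent client dropped:", dropped)
}
```

Both values are expected to be true: the goroutine hand-off keeps Accept responsive, and the deadline frees the stalled connection. A production fix might additionally cap the number of concurrent unauthenticated handshakes, which this sketch does not attempt.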

Logs from affected hosts

1️⃣ Nebula Startup Logs

$ ./nebula -config config.yml
INFO[0000] Firewall rule added
INFO[0000] Firewall rule added
INFO[0000] Firewall started
INFO[0000] no ssh users to authorize
INFO[0000] listening on 0.0.0.0:4242
INFO[0000] Main HostMap created
INFO[0000] punchy disabled
WARN[0000] No lighthouse.hosts configured, this host will only be able to initiate tunnels with static_host_map entries
INFO[0000] Loaded send_recv_error config
INFO[0000] Loaded accept_recv_error config
INFO[0000] Nebula interface is active
INFO[0000] SSH server is listening                       sshListener="127.0.0.1:2222" subsystem=sshd

2️⃣ Attacker Connection (Stalled Handshake)

$ nc 127.0.0.1 2222
SSH-2.0-Nebula???

3️⃣ Legitimate Administrator Attempt (Blocked)

$ ssh -v -p 2222 admin@127.0.0.1
debug1: OpenSSH_10.2p1, OpenSSL 3.6.1 27 Jan 2026
debug1: Reading configuration data
debug1: Connecting to 127.0.0.1 [127.0.0.1] port 2222.
debug1: Connection established.
debug1: Local version string SSH-2.0-OpenSSH_10.2

Config files from affected hosts

Create config.yml
Save the following configuration in the same directory:

pki:
  ca: ca.crt
  cert: host.crt
  key: host.key

sshd:
  enabled: true
  listen: 127.0.0.1:2222
  host_key: ./ssh_host_key

tun:
  # Disabled to simplify reproduction (no root required)
  disabled: true 

listen:
  host: 0.0.0.0
  port: 4242

firewall:
  outbound: [{port: any, proto: any, host: any}]
  inbound: [{port: any, proto: any, host: any}]
