Skip to content

netstack: let ICMP echo handlers consume requests#13189

Open
Amaindex wants to merge 1 commit into
google:masterfrom
Amaindex:netstack-icmp-echo-default-handler
Open

netstack: let ICMP echo handlers consume requests#13189
Amaindex wants to merge 1 commit into
google:masterfrom
Amaindex:netstack-icmp-echo-default-handler

Conversation

@Amaindex
Copy link
Copy Markdown
Contributor

Overview

This change makes ICMP Echo Request handling follow the usual SetTransportProtocolHandler default-handler contract: if an ICMP default handler returns true, the stack treats the request as consumed and does not also synthesize a built-in Echo Reply.

The default behavior is preserved when there is no ICMP default handler, or when the handler returns false.

The intended scope is small:

Related discussion:

Context

TCP and UDP default handlers already have a clear ownership signal:

defaultHandler returns true
        |
        v
packet is handled by the embedder
protocol fallback does not continue

ICMP Echo is different today because Echo Requests are intercepted by IPv4/IPv6 network endpoints before the built-in reply is generated.

That makes it possible for an embedder to receive an Echo Request and still get a second reply from netstack.

The earlier discussion started in #8657, where @fredwangwang reported that ICMP Echo could still be answered by the stack even when an ICMP handler was installed with SetTransportProtocolHandler. @deepcode2019 noted the same need, and @kevinGC suggested that a tested PR would be welcome.

#11609 took a narrow first step for IPv4 promiscuous-mode temporary addresses. After it merged, @ericpauley pointed out that restoring the old behavior could require reimplementing ICMP in a custom handler, and @dyhkwong noted the remaining IPv4/IPv6 inconsistency.

This change addresses that follow-up without widening the behavior change beyond Echo Request default-handler ownership.

The compatibility-sensitive case is runsc. It registers ICMP as a normal transport protocol, but does not install an ICMP default handler for this path:

runsc/boot/loader.go

transProtos := []stack.TransportProtocolFactory{
    tcp.NewProtocol,
    udp.NewProtocol,
    icmp.NewProtocol4,
    icmp.NewProtocol6,
}

So the no-default-handler path remains the ordinary built-in Echo Reply path.

Why This Matters

The main beneficiary is not ICMP itself, but embedders that use netstack inside user-space tunneling or virtual-networking software.

Several widely used downstream or adjacent projects use netstack, or downstream forks of it, in this role, including Tailscale's userspace networking, wireguard-go's netstack TUN backend, sing-box/mihomo-style TUN stacks, and tun2socks.

For those programs, an Echo Reply is often a policy decision:

Echo Request
  -> should this address be reachable through the tunnel?
  -> should the request be forwarded to a remote peer?
  -> should the reply reflect tunnel reachability, latency, filtering, or policy?

If netstack synthesizes a local Echo Reply after the embedder has consumed the request, unreachable or policy-blocked addresses can appear reachable.

After this change, ICMP default handlers can return true and take responsibility for Echo behavior. That lets tunneling software return replies that reflect its own forwarding policy and the current state of the tunnel.

Flow

Current behavior:

IPv4 Echo Request
  -> network/ipv4 ICMP handler
  -> DeliverTransportPacket(ICMPv4, pkt)
  -> if LocalAddressTemporary: stop
  -> built-in Echo Reply

IPv6 Echo Request
  -> network/ipv6 ICMP handler
  -> built-in Echo Reply

New behavior:

ICMP Echo Request
  -> network ICMP handler
  -> deliver to ICMP endpoint/defaultHandler
       |-- defaultHandler returned true  -> stop
       `-- otherwise                     -> continue built-in handling

built-in handling
  -> (IPv4 only: if LocalAddressTemporary -> stop)
  -> synthesize Echo Reply

The asymmetry above is intentional: the new default-handler ownership rule is shared by IPv4 and IPv6, while LocalAddressTemporary is a pre-existing IPv4 compatibility guard from #11609.

Implementation

Relevant code points:

TransportPacketHandled is too broad for this decision:

TransportPacketHandled
  |-- registered endpoint matched
  |-- defaultHandler returned true
  |-- unknown-destination handler consumed it
  `-- malformed packet was swallowed

Only one branch should suppress the built-in Echo Reply:

defaultHandler returned true

This change therefore adds a narrow dispatcher extension:

type TransportDispatcherWithDefaultHandlerResult interface {
    TransportDispatcher

    DeliverTransportPacketWithDefaultHandlerResult(
        tcpip.TransportProtocolNumber,
        *PacketBuffer,
    ) (TransportPacketDisposition, bool)
}

The second result is true only when the per-stack defaultHandler returned true:

if n.stack.demux.deliverPacket(protocol, pkt, id) {
    return TransportPacketHandled, false
}

if state.defaultHandler != nil {
    if state.defaultHandler(id, pkt) {
        return TransportPacketHandled, true
    }
}

This keeps endpoint visibility separate from default-handler ownership.

For IPv6, the built-in reply path now saves the data needed for the reply before transport delivery, because DeliverTransportPacket may modify the PacketBuffer:

replyPayload := pkt.Data().ToBuffer()
replyHeader := make([]byte, header.ICMPv6EchoMinimumSize)
copy(replyHeader, h[:header.ICMPv6EchoMinimumSize])

This also makes IPv6 Echo Requests observable through the same registered-endpoint/default-handler delivery path used by IPv4. A registered ICMP endpoint does not suppress the built-in reply; only a per-stack default handler returning true does.

Compatibility

Default runsc/netstack behavior should remain unchanged.

This change separates two decisions:

defaultHandler returned true
  -> new shared IPv4/IPv6 ownership rule

IPv4 LocalAddressTemporary
  -> existing #11609 guard, preserved as-is

Behavior matrix:

no ICMP default handler        -> built-in Echo Reply
default handler returns false  -> built-in Echo Reply
default handler returns true   -> consumed by handler; no built-in Echo Reply
IPv4 LocalAddressTemporary     -> preserve #11609; no built-in Echo Reply

ICMPv6 scope:

changed:    Echo Request
unchanged:  NDP, Packet Too Big, Destination Unreachable, Time Exceeded, other control paths

I am not trying to make IPv6 mirror the IPv4 LocalAddressTemporary path in this change. That would be a separate behavior change. Here, IPv6 gains the Echo Request endpoint/default-handler delivery path and the default-handler ownership rule.

LocalAddressTemporary remains useful as an IPv4 signal for embedders that already need to distinguish promiscuous-mode temporary local delivery from ordinary local delivery.

Follow-ups

I plan to keep the remaining work split into smaller follow-ups.

If this direction looks acceptable, I can follow up promptly with the mechanical helper extraction first. The ICMP Forwarder API can then be discussed on top of that smaller refactor, so the public API shape does not have to be decided in this change.

First, factor the built-in IPv4/IPv6 Echo Reply construction into an internal helper while preserving route lookup, rate limiting, stats, checksums, IPv4 options, IPv6 traffic class handling, and output hooks.

Then add an ICMP Forwarder API, analogous to the TCP/UDP forwarders:

f := icmp.NewForwarder(s, func(r *icmp.ForwarderRequest) bool {
    if shouldHandleByTunnel(r) {
        return handleByTunnel(r)
    }
    return r.Reply()
})

s.SetTransportProtocolHandler(icmp.ProtocolNumber4, f.HandlePacket)
s.SetTransportProtocolHandler(icmp.ProtocolNumber6, f.HandlePacket)

The goal of ForwarderRequest.Reply() is to let embedders explicitly delegate Echo Reply generation back to the stack, without copying netstack's reply construction logic downstream. That should also provide a cleaner way to restore the old "let netstack reply" behavior when a custom handler decides not to own a particular Echo Request.

Tests

The new tests cover the behavior matrix directly.

IPv4:

TestICMPEchoDefaultHandlerControlsReply
  no default handler             -> built-in Echo Reply is preserved
  default handler returns false  -> built-in Echo Reply is preserved
  default handler returns true   -> no built-in Echo Reply

TestICMPEchoRegisteredEndpointDoesNotSuppressReply
  registered endpoint is matched -> default handler is not called
                                  -> built-in Echo Reply is preserved

TestICMPEchoTemporaryAddressSuppressesReply
  promiscuous temporary address  -> built-in Echo Reply is suppressed

IPv6:

TestICMPEchoDefaultHandlerControlsReply
  no default handler             -> built-in Echo Reply is preserved
  default handler returns false  -> built-in Echo Reply is preserved
  default handler returns true   -> no built-in Echo Reply

TestICMPEchoRegisteredEndpointDoesNotSuppressReply
  registered endpoint is matched -> default handler is not called
                                  -> built-in Echo Reply is preserved

Commands:

make test TARGETS=//pkg/tcpip/network/ipv4:ipv4_test OPTIONS=--test_filter=TestICMPEcho
make test TARGETS=//pkg/tcpip/network/ipv6:ipv6_test OPTIONS=--test_filter=TestICMPEcho
make test TARGETS=//pkg/tcpip/network/ipv4:ipv4_test OPTIONS=--test_filter=TestIcmpRateLimit
make test TARGETS=//pkg/tcpip/network/ipv6:ipv6_test OPTIONS=--test_filter=TestNeighborSolicitationResponse
make test TARGETS="//pkg/tcpip/network/ipv4:ipv4_test //pkg/tcpip/network/ipv6:ipv6_test"
make test TARGETS=//pkg/tcpip/stack:stack_test

All of the commands above pass in my fork branch, netstack-icmp-echo-default-handler.

Review Context

While preparing the change, I discussed the direction with several downstream/heavy netstack users to validate the embedding use case and compatibility expectations.

I also used AI-assisted review as an additional pass to look for missing edge cases. The technical argument here is still based on the source links, behavior matrix, and tests above.

Fixes #8657.

IPv4 echo requests are delivered to the transport dispatcher before netstack
builds the in-stack echo reply, but a per-stack default handler cannot tell
the ICMP endpoint that it has taken ownership of the request. This leaves
embedders that install a custom ICMP handler with no direct way to suppress
the built-in reply.

Add a transport dispatcher extension that reports whether the per-stack
default transport handler consumed the packet, and use that signal in the
IPv4 and IPv6 ICMP echo paths. For IPv6, deliver echo requests through the
same endpoint/default-handler path before deciding whether to synthesize the
built-in reply.

Keep ordinary endpoint delivery distinct from default-handler ownership so
registered ICMP endpoints and raw sockets can observe echo requests without
suppressing the normal reply path.

This also preserves the IPv4 temporary-address behavior introduced by the
earlier echo-handling fix.

Signed-off-by: Zi Li <zi.li@linux.dev>
Signed-off-by: Amaindex <amaindex@outlook.com>
@Amaindex
Copy link
Copy Markdown
Contributor Author

Amaindex commented May 15, 2026

cc for gVisor context and review: @nybidari @ayushr2 @avagin
cc for downstream and embedder context: @ericpauley @dyhkwong @Zwlin98 @Lanius-collaris

Hi all, sorry for taking so long to come back to this. After #11609 I was focused on other work for a while, but I recently revisited the remaining ICMP Echo handler semantics from #8657 and prepared a smaller follow-up change.

The scope is intentionally narrower than the full icmp.Forwarder direction we discussed earlier. It only establishes the default-handler contract for ICMP Echo Requests:

default handler returns true:
  the stack treats the Echo Request as consumed
  and does not also synthesize a built-in Echo Reply

no handler / handler returns false:
  preserve the existing built-in Echo Reply behavior

It also keeps the temporary-address behavior from #11609, so the existing promiscuous-mode fix is not rolled back.

While preparing the change, I discussed the direction with several downstream/heavy netstack users to validate the embedding use case and compatibility expectations. I also used AI-assisted review as an additional pass to look for missing edge cases. The technical argument is still based on the source-level reasoning and tests in the description.

The unit tests exercise the behavior directly without setting up a TUN device:

make test TARGETS=//pkg/tcpip/network/ipv4:ipv4_test OPTIONS=--test_filter=TestICMPEcho
make test TARGETS=//pkg/tcpip/network/ipv6:ipv6_test OPTIONS=--test_filter=TestICMPEcho

The expected matrix is:

no ICMP default handler        -> built-in Echo Reply is preserved
default handler returns false  -> built-in Echo Reply is preserved
default handler returns true   -> no built-in Echo Reply
registered ICMP endpoint       -> default handler is not called; built-in Echo Reply is preserved
IPv4 LocalAddressTemporary     -> built-in Echo Reply is still suppressed, preserving #11609

I also ran TUN-based manual tests. They are not required for review, but they make the behavior visible with ping. The IPv4 program is included below because it also demonstrates the preserved LocalAddressTemporary behavior.

Optional IPv4 manual TUN test

Use the synthetic Go branch prepared from this change. It mirrors netstack-icmp-echo-default-handler, but is generated with gVisor's tools/go_branch.sh so standard Go tooling can resolve the tree directly:

git clone -b go-netstack-icmp-echo-default-handler https://github.com/Amaindex/gvisor.git
cd gvisor

cat >./icmp_echo_tun_testprog.go <<'EOF'
package main

import (
	"flag"
	"fmt"
	"log"
	"net"

	"gvisor.dev/gvisor/pkg/rawfile"
	"gvisor.dev/gvisor/pkg/tcpip"
	"gvisor.dev/gvisor/pkg/tcpip/header"
	"gvisor.dev/gvisor/pkg/tcpip/link/fdbased"
	"gvisor.dev/gvisor/pkg/tcpip/link/tun"
	"gvisor.dev/gvisor/pkg/tcpip/network/ipv4"
	"gvisor.dev/gvisor/pkg/tcpip/stack"
	"gvisor.dev/gvisor/pkg/tcpip/transport/icmp"
)

func main() {
	tunName := flag.String("tun", "gvicmp0", "TUN device name")
	handlerMode := flag.String("handler", "false", "none, false, or true")
	flag.Parse()

	s := stack.New(stack.Options{
		NetworkProtocols:   []stack.NetworkProtocolFactory{ipv4.NewProtocol},
		TransportProtocols: []stack.TransportProtocolFactory{icmp.NewProtocol4},
	})

	mtu, err := rawfile.GetMTU(*tunName)
	if err != nil {
		log.Fatal(err)
	}
	fd, err := tun.Open(*tunName)
	if err != nil {
		log.Fatal(err)
	}
	linkEP, err := fdbased.New(&fdbased.Options{
		FDs:            []int{fd},
		MTU:            mtu,
		EthernetHeader: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	const nicID = 1
	if err := s.CreateNIC(nicID, linkEP); err != nil {
		log.Fatal(err)
	}
	if err := s.SetPromiscuousMode(nicID, true); err != nil {
		log.Fatal(err)
	}
	if err := s.SetSpoofing(nicID, true); err != nil {
		log.Fatal(err)
	}

	stackAddr := tcpip.AddrFromSlice(net.IPv4(11, 0, 0, 2).To4())
	if err := s.AddProtocolAddress(nicID, tcpip.ProtocolAddress{
		Protocol: ipv4.ProtocolNumber,
		AddressWithPrefix: tcpip.AddressWithPrefix{
			Address:   stackAddr,
			PrefixLen: 24,
		},
	}, stack.AddressProperties{}); err != nil {
		log.Fatal(err)
	}
	s.SetRouteTable([]tcpip.Route{{
		Destination: header.IPv4EmptySubnet,
		NIC:         nicID,
	}})

	switch *handlerMode {
	case "none":
	case "false", "true":
		handled := *handlerMode == "true"
		s.SetTransportProtocolHandler(icmp.ProtocolNumber4, func(id stack.TransportEndpointID, pkt *stack.PacketBuffer) bool {
			h := header.ICMPv4(pkt.TransportHeader().Slice())
			fmt.Printf("handler: %s -> %s type=%d temporary=%t returning=%t\n",
				id.RemoteAddress, id.LocalAddress, h.Type(), pkt.NetworkPacketInfo.LocalAddressTemporary, handled)
			return handled
		})
	default:
		log.Fatalf("unknown -handler=%q; want none, false, or true", *handlerMode)
	}

	fmt.Printf("stack address: 11.0.0.2/24, handler=%s\n", *handlerMode)
	select {}
}
EOF

sudo ip tuntap del dev gvicmp0 mode tun 2>/dev/null || true
sudo ip tuntap add dev gvicmp0 mode tun
sudo ip addr add 11.0.0.1/24 dev gvicmp0
sudo ip link set gvicmp0 up

Terminal 1:

sudo go run ./icmp_echo_tun_testprog.go -tun=gvicmp0 -handler=false

Terminal 2:

ping -c 2 11.0.0.2

Expected result: the handler logs the Echo Requests and ping receives replies, because returning false preserves the built-in Echo Reply path.

Then restart Terminal 1 with:

sudo go run ./icmp_echo_tun_testprog.go -tun=gvicmp0 -handler=true

Run the same ping -c 2 11.0.0.2.

Expected result: the handler logs the Echo Requests and ping gets no replies, because returning true consumes the packet and suppresses the built-in Echo Reply.

The existing #11609 temporary-address behavior can be verified with:

sudo go run ./icmp_echo_tun_testprog.go -tun=gvicmp0 -handler=false
ping -c 2 11.0.0.99

Expected result: the handler logs temporary=true, and ping gets no replies because the IPv4 LocalAddressTemporary guard is still preserved.

Cleanup:

sudo ip tuntap del dev gvicmp0 mode tun
rm -f ./icmp_echo_tun_testprog.go

I also exercised the same TUN-style flow for IPv6 with fd00:11::1/64 on the host side and fd00:11::2/64 inside netstack: handler=none and handler=false both replied to ping -6, while handler=true logged ICMPv6 Echo Requests with type 128 and produced no replies.

Related regression tests also pass:

make test TARGETS=//pkg/tcpip/network/ipv4:ipv4_test OPTIONS=--test_filter=TestIcmpRateLimit
make test TARGETS=//pkg/tcpip/network/ipv6:ipv6_test OPTIONS=--test_filter=TestNeighborSolicitationResponse
make test TARGETS="//pkg/tcpip/network/ipv4:ipv4_test //pkg/tcpip/network/ipv6:ipv6_test"
make test TARGETS=//pkg/tcpip/stack:stack_test

I would especially appreciate feedback on two design points:

  1. Whether exposing default-handler-specific delivery metadata through TransportDispatcherWithDefaultHandlerResult is the right internal shape, or whether this should be a narrower ICMP-only hook.
  2. Whether keeping LocalAddressTemporary as a reply-suppression condition in this first PR is the right conservative choice before a later icmp.Forwarder / ForwarderRequest.Reply() API.

Thanks again for the earlier discussion and for your patience.

@ericpauley
Copy link
Copy Markdown
Contributor

@Amaindex thanks for the CC. For our use case the LocalAddressTemporary suppression has been a blocker. We use gVisor for many addresses. Because of the LAT suppression we need to add all of these to the gVisor stack, but certain code paths in gVisor (e.g., https://github.com/google/gvisor/blob/master/pkg/tcpip/stack/addressable_endpoint_state.go#L567) are O(n) with respect to number of registered addresses.

The ideal solution would be to support ICMP replies on temporary addresses, or to allow handlers to explicitly invoke the default stack response.

@EtiennePerot EtiennePerot requested a review from nybidari May 15, 2026 18:25
@Amaindex
Copy link
Copy Markdown
Contributor Author

Amaindex commented May 16, 2026

@Amaindex thanks for the CC. For our use case the LocalAddressTemporary suppression has been a blocker. We use gVisor for many addresses. Because of the LAT suppression we need to add all of these to the gVisor stack, but certain code paths in gVisor (e.g., https://github.com/google/gvisor/blob/master/pkg/tcpip/stack/addressable_endpoint_state.go#L567) are O(n) with respect to number of registered addresses.

The ideal solution would be to support ICMP replies on temporary addresses, or to allow handlers to explicitly invoke the default stack response.

@ericpauley thanks, this is helpful context.

I agree that LocalAddressTemporary is still an important follow-up, especially for embedders using netstack with a large address space. I also think we need to balance that with the tunnel/proxy case, where an ICMP handler may intentionally want to own Echo Requests and avoid an automatic local reply.

My current thinking is to keep this PR focused on the default-handler ownership rule and preserve the existing LocalAddressTemporary behavior from #11609. If that direction looks OK, the next step would be to factor the built-in Echo Reply construction into an internal helper, then build an ICMP Forwarder API on top.

That follow-up API should let a handler explicitly delegate an Echo Request back to netstack, e.g. with something like ForwarderRequest.Reply(). That should cover your case without requiring every address to be registered, while still preserving explicit ownership for proxy/tunnel users.

Once that explicit delegation API exists, I think we can also discuss making LocalAddressTemporary handler-visible metadata rather than the thing that unconditionally decides reply behavior. I would prefer to wait for gVisor maintainer feedback before expanding this PR, but I agree this is one of the main cases the follow-up design should account for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

allow custom icmp handler to respond icmp packets

2 participants