Memory usage of TCP applications in Mirage is a well-known problem [1][2]. There are workarounds (such as the restart-on-failure mode mentioned in the linked article), but they are not guaranteed to always work.
Defending against this with hardcoded limits is difficult, because packets can be of different sizes, and it is difficult to predict how much memory they'd use once decoded and handled by the Mirage stack (and user application).
Instead I propose to (approximately) track the memory used by a connection, and take actions based on that. The tracking doesn't have to be entirely precise, as long as the real memory used is within some constant factor of the desired target.
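To make the tracking idea concrete, here is a minimal sketch of per-connection accounting. Everything in it is an assumption for illustration: the `Accounting` module, its function names, the integer connection ids and the fixed 64-byte overhead are hypothetical and do not exist in mirage-tcpip. It deliberately overestimates, which is fine as long as the error stays within a constant factor.

```ocaml
(* Hypothetical accounting module; none of these names exist in mirage-tcpip. *)
module Accounting = struct
  type t = {
    target : int;                     (* user-configured memory target, in bytes *)
    mutable total : int;              (* sum of all per-connection estimates *)
    per_conn : (int, int) Hashtbl.t;  (* connection id -> estimated bytes *)
  }

  let create ~target = { target; total = 0; per_conn = Hashtbl.create 16 }

  (* Assumed fixed bookkeeping cost per buffer, so that we overestimate. *)
  let overhead = 64

  (* Current estimate for connection [id]; 0 if the connection is unknown. *)
  let usage t id = Option.value (Hashtbl.find_opt t.per_conn id) ~default:0

  (* Charge a buffer holding [len] payload bytes to connection [id]. *)
  let charge t id len =
    let cost = len + overhead in
    Hashtbl.replace t.per_conn id (usage t id + cost);
    t.total <- t.total + cost

  (* Release a buffer that was charged with the same [len];
     never releases more than the connection currently holds. *)
  let release t id len =
    let cost = min (len + overhead) (usage t id) in
    Hashtbl.replace t.per_conn id (usage t id - cost);
    t.total <- t.total - cost

  (* Forget a closed connection and give back its whole estimate. *)
  let close t id =
    t.total <- t.total - usage t id;
    Hashtbl.remove t.per_conn id

  let over_target t = t.total > t.target
end
```

A caller would `charge` when a segment is queued, `release` when the application consumes it, and `close` when the connection is torn down. `over_target` is then the signal on which the stack could act.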
mirage-tcpip could implement soft limits and window shrinking:
- allow the user to specify the desired memory target (especially important for unikernels, which may have hard limits on how much memory can be assigned to them)
- have the various parts of the code track (approximately) how much memory they are using (e.g. size of buffers, count of entries in various tables). This doesn't have to be entirely accurate (e.g. it could be an overestimate), but it would at least ensure that memory usage is bounded by a constant factor of the configured memory target.
- slow down the rate at which new connections are accepted as we get close to that memory limit, and start dropping packets once we actually hit it
- reduce the size of the TCP window for connections that have too large a backlog of pending packets (this puts the burden on the other side to retry, see below about window shrinking and zero window) [3]
- ensure keepalives are used in all corner cases. This is needed because otherwise certain TCP states would require us to keep state forever, see [4]
- dropping packets should only be done as a last resort (hard limit), see [5] for when dropping packets is not ideal (or not even allowed by the RFC)
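The policy in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `action` type, `decide`, `advertised_window` and the 80% soft threshold are hypothetical and not part of mirage-tcpip. The sketch only computes a target window size; how to reduce an already-advertised window without violating the TCP specification is left out.

```ocaml
(* Hypothetical policy; names and thresholds are illustrative assumptions. *)
type action =
  | Accept    (* well below the target: accept new connections normally *)
  | Throttle  (* close to the target: slow down accepting new connections *)
  | Drop      (* at or above the target: last resort, drop packets *)

(* Decide what to do from the estimated [used] bytes and the configured
   [target]. The 80% soft threshold is an arbitrary choice for this sketch. *)
let decide ~used ~target =
  if used >= target then Drop
  else if used * 10 >= target * 8 then Throttle
  else Accept

(* Window size to aim for on one connection: whatever is left of its
   per-connection [budget] after the [backlog] of pending bytes, capped at
   [max_window] and never negative. A result of 0 means a zero window. *)
let advertised_window ~max_window ~budget ~backlog =
  max 0 (min max_window (budget - backlog))
```

The soft actions (`Throttle`, a smaller window) push work back to the peer before memory runs out, so `Drop` is only reached when they were not enough.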
Hard limits
I have patches that implement some missing hard limits, and they improve the availability of Mirage applications.
But for best results they should be combined with some soft limits as described above (I don't yet have patches for that).
Nevertheless, hard limits are useful as a last line of defense against bugs or inaccuracies in the soft limit implementation.
Testing
I have some code that triggers various issues in the Mirage stack, which can be used to test the effectiveness of various solutions.
For obvious reasons I won't be publishing that code.
Similar approaches
I've successfully implemented a similar approach (tracking resource usage at runtime) to defend oxenstored against out-of-memory issues:
https://xenbits.xen.org/xsa/advisory-326.html
Next steps
I propose to open some PRs which implement the fallback hard limits as a starting point (once we find a solution for #533).
I'd be happy to hear your thoughts on the suggested window shrinking defense (or if you have any other defenses in mind).
cc @Firobe
Background reading:
Footnotes
[1] https://tarides.com/blog/2024-01-24-mirageos-designing-a-more-resilient-networking-stack-with-tcp/
[2] https://hannes.robur.coop/Posts/TCP-ns
[3] https://blog.cloudflare.com/unbounded-memory-usage-by-tcp-for-receive-buffers-and-how-we-fixed-it/#shrinking-the-window
[4] https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/#idle-estab-is-forever
[5] https://blog.cloudflare.com/unbounded-memory-usage-by-tcp-for-receive-buffers-and-how-we-fixed-it/#drop-incoming-packets