Lately, we've been seeing exit code 137 from this sidecar in an AWS ECS task (Fargate). After a brief investigation, I found the following:
- Exit code 137 corresponds to SIGKILL (128 + 9).
- According to the ECS task lifecycle, the container is first sent SIGTERM, then given a 30-second grace period (the default `stopTimeout`), and finally forcibly stopped with SIGKILL.
- The "Graceful shutdowns with ECS" blog post explains that the shell receives SIGTERM but doesn't propagate it to its child processes, so the grace period expires and the container is killed with SIGKILL.
In the Dockerfile, the command is:
How can we gracefully shut down the sidecar container? This issue mainly occurs when we run one-off tasks (e.g., database migrations) that complete very quickly (when there's nothing to migrate). However, when the task takes longer (e.g., actual migrations are required), it usually works as expected and exits with exit code 0.
Have you encountered anything similar? Do my findings seem correct, or am I missing something?