Random 137 exit code #2

@zrzka

Lately, we've been seeing exit code 137 from this sidecar when it runs in an AWS ECS task (Fargate). After a brief investigation, I found the following:

  • 137 corresponds to SIGKILL (128 + 9, where 9 is the signal number of SIGKILL).
  • According to the ECS task lifecycle, when a task is stopped the container is first sent SIGTERM, then given a 30-second timeout (the default), and then forcibly stopped with SIGKILL if it is still running.
  • The "Graceful shutdowns with ECS" blog post explains that while the shell receives SIGTERM, it doesn't propagate it to its child processes, so the timeout expires and the container is killed with SIGKILL.
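To double-check the first point, here is a small sketch that reproduces the exit status outside of ECS. It assumes a POSIX shell and uses a throwaway `sleep` process as a stand-in for the sidecar; the variable names are just for illustration.

```shell
#!/bin/sh
# Start a throwaway background process (stand-in for the sidecar).
sleep 30 &
pid=$!

# Kill it with SIGKILL, the same signal ECS sends after the stop timeout.
kill -KILL "$pid"

# wait returns the child's exit status; for a signal death the shell
# reports 128 + signal number, and SIGKILL is signal 9.
wait "$pid"
status=$?

echo "exit status: $status (128 + 9 = $((128 + 9)))"
```

This only confirms how the number is derived; it says nothing about who sent the signal in our case.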

In the Dockerfile, the command is:

sh /entry.sh

How can we gracefully shut down the sidecar container? This issue mainly occurs when we run one-off tasks (e.g., database migrations) that complete very quickly (when there's nothing to migrate). However, when the task takes longer (e.g., actual migrations are required), it usually works as expected and exits with exit code 0.
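One direction I'm considering, as a sketch only: if `/entry.sh` ends by launching a single long-running process (an assumption, I haven't confirmed what the script does), starting that process with `exec` would make it replace the shell, so it would receive SIGTERM directly. The snippet below just illustrates the PID behaviour with placeholder commands; `with_exec` and `without_exec` are hypothetical names.

```shell
#!/bin/sh
# With exec: the inner command replaces the outer shell and keeps its PID,
# so a signal sent to the outer PID reaches the inner command directly.
with_exec=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')

# Without exec: the inner command runs as a child with its own PID,
# and the outer shell sits between it and any incoming signal.
without_exec=$(sh -c 'echo "$$"; sh -c "echo \$\$"; :')

echo "with exec (outer PID, then inner PID):"
echo "$with_exec"
echo "without exec (outer PID, then inner PID):"
echo "$without_exec"
```

In the first case both lines show the same PID; in the second they differ. If the script needs to do cleanup after the process exits, `exec` would not fit and a `trap` that forwards SIGTERM to the child might be the alternative.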

Have you encountered anything similar? Do my findings seem correct, or am I missing something?
