mirror of https://github.com/docker/cli.git
Add support for user-defined healthchecks
This PR adds support for user-defined health-check probes for Docker containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus some corresponding "docker run" options. It can be used with a restart policy to automatically restart a container if the check fails.

The `HEALTHCHECK` instruction has two forms:

* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)

The `HEALTHCHECK` instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running.

When a container has a healthcheck specified, it has a _health status_ in addition to its normal status. This status is initially `starting`. Whenever a health check passes, it becomes `healthy` (whatever state it was previously in). After a certain number of consecutive failures, it becomes `unhealthy`.

The options that can appear before `CMD` are:

* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)

The health check will first run **interval** seconds after the container is started, and then again **interval** seconds after each previous check completes.

If a single run of the check takes longer than **timeout** seconds then the check is considered to have failed.

It takes **retries** consecutive failures of the health check for the container to be considered `unhealthy`.

There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list more than one then only the last `HEALTHCHECK` will take effect.

The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands; see e.g. `ENTRYPOINT` for details).

The command's exit status indicates the health status of the container. The possible values are:

- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly

If the probe returns 2 ("starting") when the container has already moved out of the "starting" state then it is treated as "unhealthy" instead.

For example, to check every five minutes or so that a web-server is able to serve the site's main page within three seconds:

    HEALTHCHECK --interval=5m --timeout=3s \
      CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output text (UTF-8 encoded) that the command writes on stdout or stderr will be stored in the health status and can be queried with `docker inspect`. Such output should be kept short (only the first 4096 bytes are stored currently).

When the health status of a container changes, a `health_status` event is generated with the new status. The health status is also displayed in the `docker ps` output.

Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
parent cceb74311b
commit 51ddea93a2
@@ -1470,6 +1470,73 @@ The `STOPSIGNAL` instruction sets the system call signal that will be sent to th
This signal can be a valid unsigned number that matches a position in the kernel's syscall table, for instance 9,
or a signal name in the format SIGNAME, for instance SIGKILL.

## HEALTHCHECK

The `HEALTHCHECK` instruction has two forms:

* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)

The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.

When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.

The options that can appear before `CMD` are:

* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)

The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.

If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.

It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.

There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.

The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).

The command's exit status indicates the health status of the container.
The possible values are:

- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly

If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.
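
As a minimal sketch of a probe that follows this exit-status convention, the shell function below wraps an arbitrary check command and reports 0 or 1. The name `run_probe` and the message text are hypothetical and not part of Docker; the function only illustrates how a script might map a check's result onto the statuses listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): run a check command and map its
# result onto the probe exit statuses (0 = healthy, 1 = unhealthy).
run_probe() {
    # Capture stdout and stderr together so the diagnostic can be printed.
    if output=$("$@" 2>&1); then
        echo "ok"
        return 0
    fi
    # Keep the diagnostic short; only the start of the output is stored.
    echo "check failed: $output"
    return 1
}

# A real probe script would end with something like:
#   run_probe curl -f http://localhost/
#   exit $?
```

Installed in the image as, say, `/bin/check-running`, such a script could then be referenced with `HEALTHCHECK CMD /bin/check-running`.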

For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:

    HEALTHCHECK --interval=5m --timeout=3s \
      CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).

When the health status of a container changes, a `health_status` event is
generated with the new status.

The `HEALTHCHECK` feature was added in Docker 1.12.



## Dockerfile examples

Below you can see some examples of Dockerfile syntax. If you're interested in
@@ -1250,6 +1250,7 @@ Dockerfile instruction and how the operator can override that setting.
#entrypoint-default-command-to-execute-at-runtime)
- [EXPOSE (Incoming Ports)](#expose-incoming-ports)
- [ENV (Environment Variables)](#env-environment-variables)
- [HEALTHCHECK](#healthcheck)
- [VOLUME (Shared Filesystems)](#volume-shared-filesystems)
- [USER](#user)
- [WORKDIR](#workdir)
@@ -1398,6 +1399,65 @@ above, or already defined by the developer with a Dockerfile `ENV`:

Similarly the operator can set the **hostname** with `-h`.

### HEALTHCHECK

```
  --health-cmd            Command to run to check health
  --health-interval       Time between running the check
  --health-retries        Consecutive failures needed to report unhealthy
  --health-timeout        Maximum time to allow one check to run
  --no-healthcheck        Disable any container-specified HEALTHCHECK
```

Example:

    $ docker run --name=test -d \
        --health-cmd='stat /etc/passwd || exit 1' \
        --health-interval=2s \
        busybox sleep 1d
    $ sleep 2; docker inspect --format='{{.State.Health.Status}}' test
    healthy
    $ docker exec test rm /etc/passwd
    $ sleep 2; docker inspect --format='{{json .State.Health}}' test
    {
      "Status": "unhealthy",
      "FailingStreak": 3,
      "Log": [
        {
          "Start": "2016-05-25T17:22:04.635478668Z",
          "End": "2016-05-25T17:22:04.7272552Z",
          "ExitCode": 0,
          "Output": " File: /etc/passwd\n Size: 334 \tBlocks: 8 IO Block: 4096 regular file\nDevice: 32h/50d\tInode: 12 Links: 1\nAccess: (0664/-rw-rw-r--) Uid: ( 0/ root) Gid: ( 0/ root)\nAccess: 2015-12-05 22:05:32.000000000\nModify: 2015..."
        },
        {
          "Start": "2016-05-25T17:22:06.732900633Z",
          "End": "2016-05-25T17:22:06.822168935Z",
          "ExitCode": 0,
          "Output": " File: /etc/passwd\n Size: 334 \tBlocks: 8 IO Block: 4096 regular file\nDevice: 32h/50d\tInode: 12 Links: 1\nAccess: (0664/-rw-rw-r--) Uid: ( 0/ root) Gid: ( 0/ root)\nAccess: 2015-12-05 22:05:32.000000000\nModify: 2015..."
        },
        {
          "Start": "2016-05-25T17:22:08.823956535Z",
          "End": "2016-05-25T17:22:08.897359124Z",
          "ExitCode": 1,
          "Output": "stat: can't stat '/etc/passwd': No such file or directory\n"
        },
        {
          "Start": "2016-05-25T17:22:10.898802931Z",
          "End": "2016-05-25T17:22:10.969631866Z",
          "ExitCode": 1,
          "Output": "stat: can't stat '/etc/passwd': No such file or directory\n"
        },
        {
          "Start": "2016-05-25T17:22:12.971033523Z",
          "End": "2016-05-25T17:22:13.082015516Z",
          "ExitCode": 1,
          "Output": "stat: can't stat '/etc/passwd': No such file or directory\n"
        }
      ]
    }

The health status is also displayed in the `docker ps` output.
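
Instead of a fixed `sleep`, a script can poll the same `docker inspect` field until the status settles. The function below is a rough sketch under that assumption; `wait_healthy`, its arguments, and its return codes are hypothetical names chosen for illustration and are not provided by Docker.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): poll a container's health status
# until it reports "healthy" or "unhealthy", giving up after some attempts.
wait_healthy() {
    name="$1"
    attempts="${2:-10}"   # maximum number of polls
    delay="${3:-2}"       # seconds to wait between polls
    i=0
    while [ "$i" -lt "$attempts" ]; do
        status=$(docker inspect --format='{{.State.Health.Status}}' "$name" 2>/dev/null)
        case "$status" in
            healthy)   return 0 ;;
            unhealthy) return 1 ;;
        esac
        i=$((i + 1))
        sleep "$delay"
    done
    # Still "starting" (or status unavailable) after all attempts.
    return 2
}
```

For the container started above, `wait_healthy test 5 2` would poll up to five times, two seconds apart. The sketch stops at the first `unhealthy` result, although a container can later return to `healthy` when a check passes.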

### TMPFS (mount tmpfs filesystems)

```bash