Align default seccomp profile with selected capabilities

Currently the default seccomp profile is fixed. This change makes it vary with the Linux capabilities selected through the --cap-add and --cap-drop options. Without this, a user who adds privileges, e.g. to allow ptrace with --cap-add sys_ptrace, still cannot actually use ptrace because seccomp continues to block it, so they will probably disable seccomp or use --privileged. With this change, the syscalls needed for each selected capability are also allowed by the seccomp profile.

While this patch makes it easier to do things with, for example, cap_sys_admin enabled, as it will now allow creating new namespaces and use of mount, it still allows less than --cap-add cap_sys_admin --security-opt seccomp:unconfined would have previously. It is not recommended that users run containers with cap_sys_admin, as this does give full access to the host machine.

It also cleans up some architecture-specific system calls so that they are only selected when needed.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
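
As a rough illustration of the behaviour described in the commit message, the sketch below derives a syscall allow-list from the effective capability set. It is not Docker's implementation: the function names, the tiny base allow-list, and the capability-to-syscall table are all assumptions made for the example, and the table contains only the syscalls the message itself mentions (ptrace, mount, and namespace creation via unshare and setns). Letting a drop override an add of the same capability is also a simplifying assumption.

```python
# Illustrative sketch only: every name and mapping here is hypothetical,
# not taken from Docker's source or its default seccomp profile.

# Syscalls allowed regardless of capabilities (tiny made-up subset).
BASE_SYSCALLS = {"read", "write", "openat", "close"}

# Hypothetical capability -> extra syscalls table, limited to the
# examples named in the commit message.
CAP_SYSCALLS = {
    "CAP_SYS_PTRACE": {"ptrace"},
    "CAP_SYS_ADMIN": {"mount", "unshare", "setns"},
}


def normalize(cap):
    """Accept 'sys_ptrace', 'SYS_PTRACE' or 'cap_sys_ptrace' spellings."""
    cap = cap.upper()
    return cap if cap.startswith("CAP_") else "CAP_" + cap


def effective_caps(default_caps, cap_add=(), cap_drop=()):
    """Apply additions, then removals, to the default capability set."""
    caps = {normalize(c) for c in default_caps}
    caps |= {normalize(c) for c in cap_add}
    caps -= {normalize(c) for c in cap_drop}  # assumption: drop wins over add
    return caps


def allowed_syscalls(caps):
    """Base allow-list plus the syscalls tied to each effective capability."""
    allowed = set(BASE_SYSCALLS)
    for cap in caps:
        allowed |= CAP_SYSCALLS.get(cap, set())
    return allowed


# Example: adding sys_ptrace makes ptrace appear in the allow-list.
caps = effective_caps(["CAP_CHOWN"], cap_add=["sys_ptrace"])
print(sorted(allowed_syscalls(caps)))
```

A lookup table keyed by capability keeps the profile declarative: supporting another capability in this sketch means adding one table entry rather than new branching logic.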
2016-05-06 15:17:41 +01:00 · ba8f5cfbb8
parent 09be3c1129
commit ba8f5cfbb8
1 changed file with 5 additions and 8 deletions
--- a/docs/reference/run.md
+++ b/docs/reference/run.md
@@ -1089,14 +1089,6 @@ one can use this flag:
    --privileged=false: Give extended privileges to this container
    --device=[]: Allows you to run devices inside the container without the --privileged flag.

-> **Note:**
-> With Docker 1.10 and greater, the default seccomp profile will also block
-> syscalls, regardless of `--cap-add` passed to the container. We recommend in
-> these cases to create your own custom seccomp profile based off our
-> [default](https://github.com/docker/docker/blob/master/profiles/seccomp/default.json).
-> Or if you don't want to run with the default seccomp profile, you can pass
-> `--security-opt=seccomp=unconfined` on run.
-
 By default, Docker containers are "unprivileged" and cannot, for
 example, run a Docker daemon inside a Docker container. This is because
 by default a container is not allowed to access any devices, but a
@@ -1214,6 +1206,11 @@ To mount a FUSE based filesystem, you need to combine both `--cap-add` and
    -rw-rw-r-- 1 1000 1000    461 Dec  4 06:08 .gitignore
    ....

+The default seccomp profile will adjust to the selected capabilities, in order to allow
+use of facilities allowed by the capabilities, so you should not have to adjust this,
+since Docker 1.12. In Docker 1.10 and 1.11 this did not happen and it may be necessary
+to use a custom seccomp profile or use `--security-opt seccomp=unconfined` when adding
+capabilities.

 ## Logging drivers (--log-driver)