From 72a357858c4929d62e1e8dd407311b7fef871837 Mon Sep 17 00:00:00 2001
From: Sebastiaan van Stijn
Date: Wed, 16 Sep 2020 16:02:20 +0200
Subject: [PATCH 1/3] docs: resize capabilities table

Signed-off-by: Sebastiaan van Stijn
---
 docs/reference/run.md | 84 +++++++++++++++++++++----------------------
 1 file changed, 42 insertions(+), 42 deletions(-)

diff --git a/docs/reference/run.md b/docs/reference/run.md
index 6294f531fe..873363ea9f 100644
--- a/docs/reference/run.md
+++ b/docs/reference/run.md
@@ -1285,51 +1285,51 @@ capabilities using `--cap-add` and `--cap-drop`. By default, Docker has a defaul
 list of capabilities that are kept. The following table lists the Linux capability
 options which are allowed by default and can be dropped.
 
-| Capability Key | Capability Description |
-|:-----------------|:------------------------------------------------------------------------------------------------------------------------------|
-| AUDIT_WRITE | Write records to kernel auditing log. |
-| CHOWN | Make arbitrary changes to file UIDs and GIDs (see chown(2)). |
-| DAC_OVERRIDE | Bypass file read, write, and execute permission checks. |
-| FOWNER | Bypass permission checks on operations that normally require the file system UID of the process to match the UID of the file. |
-| FSETID | Don't clear set-user-ID and set-group-ID permission bits when a file is modified. |
-| KILL | Bypass permission checks for sending signals. |
-| MKNOD | Create special files using mknod(2). |
-| NET_BIND_SERVICE | Bind a socket to internet domain privileged ports (port numbers less than 1024). |
-| NET_RAW | Use RAW and PACKET sockets. |
-| SETFCAP | Set file capabilities. |
-| SETGID | Make arbitrary manipulations of process GIDs and supplementary GID list. |
-| SETPCAP | Modify process capabilities. |
-| SETUID | Make arbitrary manipulations of process UIDs. |
-| SYS_CHROOT | Use chroot(2), change root directory. |
+| Capability Key | Capability Description |
+|:----------------------|:-------------------------------------------------------------------------------------------------------------------------------|
+| AUDIT_WRITE | Write records to kernel auditing log. |
+| CHOWN | Make arbitrary changes to file UIDs and GIDs (see chown(2)). |
+| DAC_OVERRIDE | Bypass file read, write, and execute permission checks. |
+| FOWNER | Bypass permission checks on operations that normally require the file system UID of the process to match the UID of the file. |
+| FSETID | Don't clear set-user-ID and set-group-ID permission bits when a file is modified. |
+| KILL | Bypass permission checks for sending signals. |
+| MKNOD | Create special files using mknod(2). |
+| NET_BIND_SERVICE | Bind a socket to internet domain privileged ports (port numbers less than 1024). |
+| NET_RAW | Use RAW and PACKET sockets. |
+| SETFCAP | Set file capabilities. |
+| SETGID | Make arbitrary manipulations of process GIDs and supplementary GID list. |
+| SETPCAP | Modify process capabilities. |
+| SETUID | Make arbitrary manipulations of process UIDs. |
+| SYS_CHROOT | Use chroot(2), change root directory. |
 
 The next table shows the capabilities which are not granted by default and may be added.
 
-| Capability Key | Capability Description |
-|:----------------|:----------------------------------------------------------------------------------------------------------------|
-| AUDIT_CONTROL | Enable and disable kernel auditing; change auditing filter rules; retrieve auditing status and filtering rules. |
-| AUDIT_READ | Allow reading audit messages from the kernel. |
-| BLOCK_SUSPEND | Employ features that can block system suspend. |
-| DAC_READ_SEARCH | Bypass file read permission checks and directory read and execute permission checks. |
-| IPC_LOCK | Lock memory (mlock(2), mlockall(2), mmap(2), shmctl(2)). |
-| IPC_OWNER | Bypass permission checks for operations on System V IPC objects. |
-| LEASE | Establish leases on arbitrary files (see fcntl(2)). |
-| LINUX_IMMUTABLE | Set the FS_APPEND_FL and FS_IMMUTABLE_FL i-node flags. |
-| MAC_ADMIN | Allow MAC configuration or state changes. Implemented for the Smack LSM. |
-| MAC_OVERRIDE | Override Mandatory Access Control (MAC). Implemented for the Smack Linux Security Module (LSM). |
-| NET_ADMIN | Perform various network-related operations. |
-| NET_BROADCAST | Make socket broadcasts, and listen to multicasts. |
-| SYS_ADMIN | Perform a range of system administration operations. |
-| SYS_BOOT | Use reboot(2) and kexec_load(2), reboot and load a new kernel for later execution. |
-| SYS_MODULE | Load and unload kernel modules. |
-| SYS_NICE | Raise process nice value (nice(2), setpriority(2)) and change the nice value for arbitrary processes. |
-| SYS_PACCT | Use acct(2), switch process accounting on or off. |
-| SYS_PTRACE | Trace arbitrary processes using ptrace(2). |
-| SYS_RAWIO | Perform I/O port operations (iopl(2) and ioperm(2)). |
-| SYS_RESOURCE | Override resource Limits. |
-| SYS_TIME | Set system clock (settimeofday(2), stime(2), adjtimex(2)); set real-time (hardware) clock. |
-| SYS_TTY_CONFIG | Use vhangup(2); employ various privileged ioctl(2) operations on virtual terminals. |
-| SYSLOG | Perform privileged syslog(2) operations. |
-| WAKE_ALARM | Trigger something that will wake up the system. |
+| Capability Key | Capability Description |
+|:----------------------|:-------------------------------------------------------------------------------------------------------------------------------|
+| AUDIT_CONTROL | Enable and disable kernel auditing; change auditing filter rules; retrieve auditing status and filtering rules. |
+| AUDIT_READ | Allow reading audit messages from the kernel. |
+| BLOCK_SUSPEND | Employ features that can block system suspend. |
+| DAC_READ_SEARCH | Bypass file read permission checks and directory read and execute permission checks. |
+| IPC_LOCK | Lock memory (mlock(2), mlockall(2), mmap(2), shmctl(2)). |
+| IPC_OWNER | Bypass permission checks for operations on System V IPC objects. |
+| LEASE | Establish leases on arbitrary files (see fcntl(2)). |
+| LINUX_IMMUTABLE | Set the FS_APPEND_FL and FS_IMMUTABLE_FL i-node flags. |
+| MAC_ADMIN | Allow MAC configuration or state changes. Implemented for the Smack LSM. |
+| MAC_OVERRIDE | Override Mandatory Access Control (MAC). Implemented for the Smack Linux Security Module (LSM). |
+| NET_ADMIN | Perform various network-related operations. |
+| NET_BROADCAST | Make socket broadcasts, and listen to multicasts. |
+| SYS_ADMIN | Perform a range of system administration operations. |
+| SYS_BOOT | Use reboot(2) and kexec_load(2), reboot and load a new kernel for later execution. |
+| SYS_MODULE | Load and unload kernel modules. |
+| SYS_NICE | Raise process nice value (nice(2), setpriority(2)) and change the nice value for arbitrary processes. |
+| SYS_PACCT | Use acct(2), switch process accounting on or off. |
+| SYS_PTRACE | Trace arbitrary processes using ptrace(2). |
+| SYS_RAWIO | Perform I/O port operations (iopl(2) and ioperm(2)). |
+| SYS_RESOURCE | Override resource Limits. |
+| SYS_TIME | Set system clock (settimeofday(2), stime(2), adjtimex(2)); set real-time (hardware) clock. |
+| SYS_TTY_CONFIG | Use vhangup(2); employ various privileged ioctl(2) operations on virtual terminals. |
+| SYSLOG | Perform privileged syslog(2) operations. |
+| WAKE_ALARM | Trigger something that will wake up the system. |
 
 Further reference information is available on the [capabilities(7) - Linux man page](http://man7.org/linux/man-pages/man7/capabilities.7.html)
 

From f19e31afe24e677814248728219f600a59b4687d Mon Sep 17 00:00:00 2001
From: Sebastiaan van Stijn
Date: Wed, 16 Sep 2020 16:15:10 +0200
Subject: [PATCH 2/3] docs: add link to linux kernel source code for capabilities

Signed-off-by: Sebastiaan van Stijn
---
 docs/reference/run.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/reference/run.md b/docs/reference/run.md
index 873363ea9f..e0f0f3c4e6 100644
--- a/docs/reference/run.md
+++ b/docs/reference/run.md
@@ -1331,7 +1331,8 @@ The next table shows the capabilities which are not granted by default and may b
 | SYSLOG | Perform privileged syslog(2) operations. |
 | WAKE_ALARM | Trigger something that will wake up the system. |
 
-Further reference information is available on the [capabilities(7) - Linux man page](http://man7.org/linux/man-pages/man7/capabilities.7.html)
+Further reference information is available on the [capabilities(7) - Linux man page](http://man7.org/linux/man-pages/man7/capabilities.7.html),
+and in the [Linux kernel source code](https://github.com/torvalds/linux/blob/124ea650d3072b005457faed69909221c2905a1f/include/uapi/linux/capability.h).
 
 Both flags support the value `ALL`, so to allow a container to use all capabilities
 except for `MKNOD`:

From 6065dccc9812c64d19fd6a0c6baefada368b6d2d Mon Sep 17 00:00:00 2001
From: Sebastiaan van Stijn
Date: Wed, 16 Sep 2020 16:20:40 +0200
Subject: [PATCH 3/3] Add docs and bash-completion for new Linux capabilities

Signed-off-by: Sebastiaan van Stijn
---
 contrib/completion/bash/docker | 3 +++
 docs/reference/run.md | 7 +++++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/contrib/completion/bash/docker b/contrib/completion/bash/docker
index ccf2f0f576..314ed244f7 100644
--- a/contrib/completion/bash/docker
+++ b/contrib/completion/bash/docker
@@ -837,6 +837,8 @@ __docker_complete_capabilities_addable() {
 CAP_AUDIT_CONTROL
 CAP_AUDIT_READ
 CAP_BLOCK_SUSPEND
+CAP_BPF
+CAP_CHECKPOINT_RESTORE
 CAP_DAC_READ_SEARCH
 CAP_IPC_LOCK
 CAP_IPC_OWNER
@@ -846,6 +848,7 @@ __docker_complete_capabilities_addable() {
 CAP_MAC_OVERRIDE
 CAP_NET_ADMIN
 CAP_NET_BROADCAST
+CAP_PERFMON
 CAP_SYS_ADMIN
 CAP_SYS_BOOT
 CAP_SYSLOG
diff --git a/docs/reference/run.md b/docs/reference/run.md
index e0f0f3c4e6..516d3e739f 100644
--- a/docs/reference/run.md
+++ b/docs/reference/run.md
@@ -1307,8 +1307,10 @@ The next table shows the capabilities which are not granted by default and may b
 | Capability Key | Capability Description |
 |:----------------------|:-------------------------------------------------------------------------------------------------------------------------------|
 | AUDIT_CONTROL | Enable and disable kernel auditing; change auditing filter rules; retrieve auditing status and filtering rules. |
-| AUDIT_READ | Allow reading audit messages from the kernel. |
-| BLOCK_SUSPEND | Employ features that can block system suspend. |
+| AUDIT_READ | Allow reading the audit log via multicast netlink socket. |
+| BLOCK_SUSPEND | Allow preventing system suspends. |
+| BPF | Allow creating BPF maps, loading BPF Type Format (BTF) data, retrieve JITed code of BPF programs, and more. |
+| CHECKPOINT_RESTORE | Allow checkpoint/restore related operations. Introduced in kernel 5.9. |
 | DAC_READ_SEARCH | Bypass file read permission checks and directory read and execute permission checks. |
 | IPC_LOCK | Lock memory (mlock(2), mlockall(2), mmap(2), shmctl(2)). |
 | IPC_OWNER | Bypass permission checks for operations on System V IPC objects. |
@@ -1318,6 +1320,7 @@ The next table shows the capabilities which are not granted by default and may b
 | MAC_OVERRIDE | Override Mandatory Access Control (MAC). Implemented for the Smack Linux Security Module (LSM). |
 | NET_ADMIN | Perform various network-related operations. |
 | NET_BROADCAST | Make socket broadcasts, and listen to multicasts. |
+| PERFMON | Allow system performance and observability privileged operations using perf_events, i915_perf and other kernel subsystems |
 | SYS_ADMIN | Perform a range of system administration operations. |
 | SYS_BOOT | Use reboot(2) and kexec_load(2), reboot and load a new kernel for later execution. |
 | SYS_MODULE | Load and unload kernel modules. |
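
Note for reviewers: the series documents `BPF`, `CHECKPOINT_RESTORE`, and `PERFMON` as addable capabilities, while the bash completion lists them with a `CAP_` prefix. The sketch below shows one way to turn either spelling into `--cap-add` flags using the key form from the docs table. The helper names `normalize_cap` and `cap_add_flags` are hypothetical and not part of the Docker CLI or this patch; whether the capabilities are actually usable is assumed to depend on the host kernel and Docker version.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of Docker or of this patch series).

# normalize_cap upper-cases a capability name and strips an optional
# CAP_ prefix, so "cap_bpf", "CAP_BPF" and "BPF" all become "BPF".
normalize_cap() {
    local cap
    cap=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf '%s\n' "${cap#CAP_}"
}

# cap_add_flags prints one --cap-add=<KEY> flag per argument,
# separated by single spaces.
cap_add_flags() {
    local cap flags=()
    for cap in "$@"; do
        flags+=("--cap-add=$(normalize_cap "$cap")")
    done
    printf '%s\n' "${flags[*]}"
}

# Print the resulting command instead of running it, because running it
# needs a Docker daemon. The image name "alpine" is only a placeholder.
echo docker run --rm $(cap_add_flags CAP_BPF perfmon CHECKPOINT_RESTORE) alpine true
```

The same flags combine with `--cap-drop`; for example, `--cap-add=ALL --cap-drop=MKNOD` matches the "all capabilities except for `MKNOD`" case mentioned in the docs context above.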