
Docker kernel capability mechanism


May 22, 2021 Docker From entry to practice



Capabilities are a Linux kernel feature that provides fine-grained access control. The kernel has supported the capability mechanism since version 2.2. It splits the privileges traditionally reserved for root into smaller, distinct units that can be applied to both processes and files.

For example, a web server process only needs permission to bind to a port below 1024; it does not need full root privileges. Granting it the net_bind_service capability (CAP_NET_BIND_SERVICE) is enough. Many similar capabilities exist, so a process can be given one specific privilege instead of full root.
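To make the idea concrete, here is a minimal Python sketch that checks whether a given capability is present in a capability bitmask, such as the CapEff value the kernel reports in /proc/self/status. The bit numbers are taken from linux/capability.h. The function names and SAMPLE_MASK are illustrative; the sample value is assumed to be what a container started with Docker's default settings typically reports, and it may differ between Docker versions.

```python
# Bit positions as defined in linux/capability.h.
CAP_NET_BIND_SERVICE = 10   # bind to ports below 1024
CAP_SYS_ADMIN = 21          # broad administrative operations, including mount


def has_capability(cap_mask_hex: str, cap_bit: int) -> bool:
    """Return True if bit `cap_bit` is set in a hex capability mask."""
    return bool((int(cap_mask_hex, 16) >> cap_bit) & 1)


def read_effective_mask(status_path: str = "/proc/self/status") -> str:
    """Return the CapEff hex string of the current process (Linux only)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff not found in " + status_path)


# Assumption: a mask of the kind seen inside a default Docker container.
SAMPLE_MASK = "00000000a80425fb"
```

With this sample mask, has_capability(SAMPLE_MASK, CAP_NET_BIND_SERVICE) is true while has_capability(SAMPLE_MASK, CAP_SYS_ADMIN) is false: the process may bind low ports but does not hold the broad administrative capability. On a Linux machine, read_effective_mask() can supply the real mask of the running process.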

By default, containers started by Docker are strictly limited to a subset of the kernel's capabilities.

The capability mechanism does a lot to strengthen the security of Docker containers. A typical server runs a number of processes that need privileges, including ssh, cron, syslogd, hardware management tools (for example, ones that load kernel modules), network configuration tools, and so on. Containers are different, because almost all of these privileged tasks are handled by the supporting infrastructure outside the container:

  • ssh access is handled by the ssh service running on the host;
  • cron, when needed, should run as a user process dedicated to the application that uses its scheduling, rather than as a system-wide service;
  • Logging can be handled by Docker or by a third-party service;
  • Hardware management is irrelevant inside a container, so there is no need to run udevd or similar services there;
  • Network management is also done on the host, so containers do not need to configure networking themselves unless they have special requirements.

As the examples above show, in most cases containers do not need "real" root privileges; a few capabilities are enough. To improve security, a container can therefore give up the privileges it does not need, for example:

  • Deny all mount operations;
  • Deny direct access to raw sockets;
  • Deny certain filesystem operations, such as creating new device nodes or changing file attributes;
  • Deny kernel module loading.

This way, even if an attacker gains root inside the container, they cannot easily obtain higher privileges on the host and can do only limited damage.
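The restrictions listed above can be related to individual capabilities. The sketch below pairs each restriction with the capability that, to my understanding, gates it in the kernel, and turns a selection into `--cap-drop` flags for `docker run`. The pairing is my reading of the list rather than something the article states, and the dictionary keys and the function name are hypothetical.

```python
# Assumed pairing of each restriction with the capability that gates it.
RESTRICTION_TO_CAPABILITY = {
    "mount operations": "SYS_ADMIN",
    "raw socket access": "NET_RAW",
    "creating device nodes": "MKNOD",
    "changing immutable file attributes": "LINUX_IMMUTABLE",
    "loading kernel modules": "SYS_MODULE",
}


def cap_drop_flags(restrictions):
    """Return sorted, de-duplicated --cap-drop flags for the given restrictions."""
    caps = sorted({RESTRICTION_TO_CAPABILITY[name] for name in restrictions})
    return [f"--cap-drop={cap}" for cap in caps]
```

For instance, cap_drop_flags(["mount operations", "loading kernel modules"]) yields the two flags for SYS_ADMIN and SYS_MODULE. Dropping a capability the container never had is harmless, so such flags can be passed defensively.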

By default, Docker uses a whitelist approach: it drops every capability except the ones it considers necessary. Users can also grant additional capabilities to a container when they need to.
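As a rough illustration of adjusting the whitelist, the following Python sketch assembles a `docker run` command line that drops all capabilities and then adds back only the ones named. The `--cap-drop` and `--cap-add` options are real `docker run` flags. The helper name and the image name `example/web:latest` are hypothetical, and whether a particular image actually starts with so few capabilities depends on what its processes do.

```python
import shlex


def docker_run_command(image, command=(), cap_drop=("ALL",), cap_add=()):
    """Build the argv for `docker run` with explicit capability flags."""
    argv = ["docker", "run", "--rm"]
    argv += [f"--cap-drop={cap}" for cap in cap_drop]
    argv += [f"--cap-add={cap}" for cap in cap_add]
    argv.append(image)
    argv.extend(command)
    return argv


# Hypothetical image: a web service that only needs to bind a low port.
argv = docker_run_command("example/web:latest", cap_add=["NET_BIND_SERVICE"])
print(shlex.join(argv))
# docker run --rm --cap-drop=ALL --cap-add=NET_BIND_SERVICE example/web:latest
```

Building the command as a list rather than a single string keeps each argument separate, so it can be passed to subprocess.run without shell quoting issues.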