1 Rootless containers, security and root
Docker traditionally ran as the root user. Users who wanted to run docker containers needed to be given sudo access and use sudo docker, or be added to the docker group, so they could run docker without typing sudo first. In both cases, they were running docker with root privileges.
This is considered a bad security practice because it effectively grants root privileges on the host to all docker users. However, namespaces and control groups were not as mature when docker started as they are now, and no better alternative was available. We have an alternative now: Docker offers the possibility to run in rootless mode, and podman runs rootless by design.
Running a container rootless does not mean that the container lacks root-like capabilities; it means that the container engine does not run as root.
For most rocker-related projects, running rootless is a security advantage.
1.1 Who are we?
At the host:
whoami
# sergio
In the container:
podman run --rm docker.io/rocker/rstudio whoami
# root
1.2 Using apt-get inside a rootless container
It is perfectly possible to run apt-get commands in a rootless container, because they only modify files inside the container.
At the host:
apt-get update
# Reading package lists... Done
# E: Could not open lock file /var/lib/apt/lists/lock - open (13: Permission denied)
In the container:
podman run --rm docker.io/rocker/rstudio apt-get update
# Get:1 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
# ...
# Fetched 26.8 MB in 6s (4,750 kB/s)
# Reading package lists...
1.3 Modifying files
You can bind mount the /etc/ directory (e.g. using -v /etc:/hostetc) but you won’t be able to modify most of its files, since you are not allowed to do that outside the container either.
At the host:
touch /etc/try-creating-a-file
# touch: cannot touch '/etc/try-creating-a-file': Permission denied
In the container (rootless means no additional host permissions):
podman run --rm -v /etc/:/hostetc docker.io/rocker/rstudio \
    touch /hostetc/try-creating-a-file
# touch: cannot touch '/hostetc/try-creating-a-file': Permission denied
However, you can modify the files within the container:
podman run --rm docker.io/rocker/rstudio touch /etc/try-creating-a-file
1.4 Port binding
You can’t bind your container to host ports lower than 1024, since those are reserved for root (or, to be precise, for processes with the CAP_NET_BIND_SERVICE capability).
podman run --rm -p 80:8787 docker.io/rocker/rstudio
# Error: rootlessport cannot expose privileged port 80, you can add
# 'net.ipv4.ip_unprivileged_port_start=80' to /etc/sysctl.conf (currently 1024),
# or choose a larger port number (>= 1024):
# listen tcp 0.0.0.0:80: bind: permission denied
However, larger port numbers work perfectly fine:
podman run --rm -p 8787:8787 docker.io/rocker/rstudio
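The port rule above can be checked before starting a container. The following is a minimal sketch, not part of podman: the function names unprivileged_port_start and can_publish_rootless are hypothetical, and the fallback value of 1024 is an assumption based on the default reported in the error message above. The sysctl named in that message is read from its /proc file.

```shell
#!/bin/sh
# Read the lowest port an unprivileged process may bind to.
# Falls back to 1024 (assumed default) when the sysctl file is not readable.
unprivileged_port_start() {
    cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024
}

# can_publish_rootless PORT [THRESHOLD]   (hypothetical helper)
# Succeeds when PORT is at or above THRESHOLD (default: the value read above).
can_publish_rootless() {
    port=$1
    threshold=${2:-$(unprivileged_port_start)}
    [ "$port" -ge "$threshold" ]
}

# Examples with the threshold pinned to 1024 so the result is host-independent.
for p in 80 8787; do
    if can_publish_rootless "$p" 1024; then
        echo "port $p: ok for a rootless container"
    else
        echo "port $p: privileged, rootless publish would fail"
    fi
done
# port 80: privileged, rootless publish would fail
# port 8787: ok for a rootless container
```

Calling can_publish_rootless with a single argument uses the threshold configured on the current host instead of the pinned value.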
2 Rootless containers and file permissions
If you have a bit of experience with containers, you have probably suffered from “permission issues”.
The typical issue with permissions is that you mount a directory into the container, and the processes in the container write files in that directory with a user id different from yours (usually root). Once you are out of the container you can’t access or modify those files.
2.1 How users work in rootless containers
With rootless containers, even if you are only one user, your container has to behave (read and write files…) as if there were many users. There is no way to magically do this, so the host operating system actually gives you many “subordinate user ids” and “subordinate group ids” for you to use as you wish. How many? Usually around 65k user ids and 65k group ids. When you use a rootless container you may be impersonating up to 65k users! Since it would be a very bad idea to let you impersonate other users on your computer (impersonating root would be the most dangerous), the system administrator gives you unassigned user ids that do not overlap with anyone else's. The list of subordinate user and group ids assigned to each user is stored in the /etc/subuid and /etc/subgid files.
cat /etc/subuid
# sergio:100000:65536
# ana:165536:65536
This file is read as follows:
- The user sergio is assigned 65536 additional subordinate user IDs starting at 100000. This spans the range 100000-165535.
- The user ana is assigned 65536 additional subordinate user IDs starting at 165536. This spans the range 165536-231071.
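The range arithmetic (last ID = start + count - 1) can be reproduced with a short script. This is a sketch only: subid_ranges is a hypothetical helper name, and the sample data is the /etc/subuid content shown above.

```shell
#!/bin/sh
# Print the first and last subordinate ID of every entry read from stdin.
# Input format (as in /etc/subuid): name:start:count
subid_ranges() {
    awk -F: 'NF == 3 { printf "%s %d-%d\n", $1, $2, $2 + $3 - 1 }'
}

# Sample data copied from the example above.
printf '%s\n' 'sergio:100000:65536' 'ana:165536:65536' | subid_ranges
# sergio 100000-165535
# ana 165536-231071
```

On a real host the same function can be fed the actual file, for example subid_ranges < /etc/subuid.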
When you start a container, the user and group ids used by the image must be mapped to ids on the host. The default user mapping in podman maps container uid 0 (the container root user) to your real user id on the host, and maps your subordinate user IDs to user ids 1 to n in the container, where n is the number of subordinate IDs assigned to you. The same applies to group id mapping.
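The default mapping described above can be written down as a small function. This is an illustrative sketch, not podman code: container_to_host_uid is a hypothetical name, the host UID of 1000 used in the examples is an assumption (the text does not state it), and the subordinate range is the one listed for sergio in /etc/subuid.

```shell
#!/bin/sh
# container_to_host_uid CONTAINER_UID HOST_UID SUBUID_START SUBUID_COUNT
# Sketch of the default rootless mapping:
#   container 0      -> your own host UID
#   container 1..n   -> SUBUID_START .. SUBUID_START + n - 1
container_to_host_uid() {
    cuid=$1
    host_uid=$2
    start=$3
    count=$4
    if [ "$cuid" -eq 0 ]; then
        echo "$host_uid"
    elif [ "$cuid" -le "$count" ]; then
        echo $((start + cuid - 1))
    else
        echo "unmapped"
        return 1
    fi
}

# Assumed host UID 1000; subordinate range 100000:65536.
container_to_host_uid 0 1000 100000 65536      # 1000
container_to_host_uid 1 1000 100000 65536      # 100000
container_to_host_uid 1000 1000 100000 65536   # 100999

# The mapping actually in effect can be inspected with:
#   podman unshare cat /proc/self/uid_map
```

The function only models the default mapping; options that change the user namespace setup would give different results.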
2.2 Working alone
In the container, you can use user ids without issues (e.g. you can be root).
If you bind mount a directory that you own:
- If you create a file as the root user in the container, outside of it the file will be owned by you.
- If you create a file as the container UID 1000, outside of the container it will appear to be owned by one of your subordinate IDs (e.g. 100999).
What about mounting directories that you DO NOT own?
- The files and directories that you do not own belong to host UIDs that are not mapped into the container, so when the container asks for their UID the operating system returns the “overflow user id”, which is 65534 by default. These files are usually listed as owned by nobody or nogroup.
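The overflow IDs can be read from the kernel directly. The sketch below uses hypothetical helper names (overflow_uid, overflow_gid) and assumes 65534 as a fallback when the /proc files are not readable; the podman command in the final comment is a suggestion for observing the effect and is not executed by the script.

```shell
#!/bin/sh
# Print the kernel's overflow UID and GID, falling back to 65534
# (the default mentioned above) when the /proc files are not readable.
overflow_uid() {
    cat /proc/sys/kernel/overflowuid 2>/dev/null || echo 65534
}
overflow_gid() {
    cat /proc/sys/kernel/overflowgid 2>/dev/null || echo 65534
}

echo "overflow uid: $(overflow_uid)"
echo "overflow gid: $(overflow_gid)"

# To see the effect from inside a container (requires podman; not run here):
#   podman run --rm -v /etc:/hostetc docker.io/rocker/rstudio ls -ln /hostetc
# Files owned by unmapped host IDs are listed with the overflow IDs.
```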