buildkit/docs/rootless.md

186 lines
6.0 KiB
Markdown
Raw Normal View History

# Rootless mode (Experimental)
Requirements:
- runc `a00bf0190895aa465a5fbed0268888e2c8ddfe85` (Oct 15, 2018) or later
- Some distros such as Debian (excluding Ubuntu) and Arch Linux require `sudo sh -c "echo 1 > /proc/sys/kernel/unprivileged_userns_clone"`.
- RHEL/CentOS 7 requires `sudo sh -c "echo 28633 > /proc/sys/user/max_user_namespaces"`. You may also need `sudo grubby --args="namespace.unpriv_enable=1 user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"`.
- `newuidmap` and `newgidmap` need to be installed on the host. These commands are provided by the `uidmap` package. For RHEL/CentOS 7, RPM is not officially provided but available at https://copr.fedorainfracloud.org/coprs/vbatts/shadow-utils-newxidmap/ .
- `/etc/subuid` and `/etc/subgid` should contain >= 65536 sub-IDs. e.g. `penguin:231072:65536`.
## Set up
Setting up rootless mode also requires some bothersome steps as follows, but you can also use [`rootlesskit`](https://github.com/rootless-containers/rootlesskit) for automating these steps.
### Terminal 1:
```
$ unshare -U -m
unshared$ echo $$ > /tmp/pid
```
Unsharing mountns (and userns) is required for mounting filesystems without real root privileges.
### Terminal 2:
```
$ id -u
1001
$ grep $(whoami) /etc/subuid
penguin:231072:65536
$ grep $(whoami) /etc/subgid
penguin:231072:65536
$ newuidmap $(cat /tmp/pid) 0 1001 1 1 231072 65536
$ newgidmap $(cat /tmp/pid) 0 1001 1 1 231072 65536
```
### Terminal 1:
```
unshared# buildkitd
```
* The data dir will be set to `/home/penguin/.local/share/buildkit`
* The address will be set to `unix:///run/user/1001/buildkit/buildkitd.sock`
* `overlayfs` snapshotter is not supported except Ubuntu-flavored kernel: http://kernel.ubuntu.com/git/ubuntu/ubuntu-artful.git/commit/fs/overlayfs?h=Ubuntu-4.13.0-25.29&id=0a414bdc3d01f3b61ed86cfe3ce8b63a9240eba7
* containerd worker is not supported ( pending PR: https://github.com/containerd/containerd/pull/2006 )
* Network namespace is not used at the moment.
* Cgroups is disabled.
### Terminal 2:
```
$ go get ./examples/build-using-dockerfile
$ build-using-dockerfile --buildkit-addr unix:///run/user/1001/buildkit/buildkitd.sock -t foo /path/to/somewhere
```
## Set up (using a container)
Docker image is available as [`moby/buildkit:rootless`](https://hub.docker.com/r/moby/buildkit/tags/).
```
$ docker run --name buildkitd -d --privileged -p 1234:1234 moby/buildkit:rootless --addr tcp://0.0.0.0:1234
```
```
$ go get ./examples/build-using-dockerfile
$ build-using-dockerfile --buildkit-addr tcp://127.0.0.1:1234 -t foo /path/to/somewhere
```
### Security consideration
Although `moby/buildkit:rootless` executes the BuildKit daemon as a normal user, `docker run` still requires `--privileged`.
This is to allow build executor containers to mount `/proc`, by providing "unmasked" `/proc` to the BuildKit daemon container.
See [`docker/cli#1347`](https://github.com/docker/cli/pull/1347) for the ongoing work to remove this requirement.
See also [Disabling process sandbox](#disabling-process-sandbox).
#### UID/GID
The `moby/buildkit:rootless` image has the following UID/GID configuration:
Actual ID (shown in the host and the BuildKit daemon container)| Mapped ID (shown in build executor containers)
----------|----------
1000 | 0
100000 | 1
... | ...
165535 | 65536
```
$ docker exec buildkitd id
uid=1000(user) gid=1000(user)
$ docker exec buildkitd ps aux
PID USER TIME COMMAND
1 user 0:00 rootlesskit buildkitd --addr tcp://0.0.0.0:1234
13 user 0:00 /proc/self/exe buildkitd --addr tcp://0.0.0.0:1234
21 user 0:00 buildkitd --addr tcp://0.0.0.0:1234
29 user 0:00 ps aux
$ docker exec cat /etc/subuid
user:100000:65536
```
To change the UID/GID configuration, you need to modify and build the BuildKit image manually.
```
$ vi hack/dockerfiles/test.Dockerfile
$ docker build -t buildkit-rootless-custom --target rootless -f hack/dockerfiles/test.Dockerfile .
```
#### Disabling process sandbox
By passing `--oci-worker-no-process-sandbox` to the `buildkitd` arguments, BuildKit can be executed in a container without `--privileged`.
However, you still need to pass `--security-opt seccomp=unconfined --security-opt apparmor=unconfined` to `docker run`.
```
$ docker run --name buildkitd -d --security-opt seccomp=unconfined --security-opt apparmor=unconfined -p 1234:1234 moby/buildkit:rootless --addr tcp://0.0.0.0:1234 --oci-worker-no-process-sandbox
```
Note that `--oci-worker-no-process-sandbox` allows build executor containers to `kill` (and potentially `ptrace` depending on the seccomp configuration) an arbitrary process in the BuildKit daemon container.
## Set up (using Kubernetes)
### With `securityContext`
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: buildkitd
name: buildkitd
spec:
selector:
matchLabels:
app: buildkitd
template:
metadata:
labels:
app: buildkitd
spec:
containers:
- image: moby/buildkit:rootless
args:
- --addr
- tcp://0.0.0.0:1234
name: buildkitd
ports:
- containerPort: 1234
securityContext:
privileged: true
```
This configuration requires privileged containers to be enabled.
If you are using Kubernetes v1.12+ with either Docker v18.06+, containerd v1.2+, or CRI-O v1.12+ as the CRI runtime, you can replace `privileged: true` with `procMount: Unmasked`.
### Without `securityContext` but with `--oci-worker-no-process-sandbox`
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: buildkitd
name: buildkitd
spec:
selector:
matchLabels:
app: buildkitd
template:
metadata:
labels:
app: buildkitd
spec:
containers:
- image: moby/buildkit:rootless
args:
- --addr
- tcp://0.0.0.0:1234
- --oci-worker-no-process-sandbox
name: buildkitd
ports:
- containerPort: 1234
```
See [Disabling process sandbox](#disabling-process-sandbox) for security notice.