How to Secure Rootless Podman Against CVE-2026-31431 Copy Fail

Understanding CVE-2026-31431: Copy Fail

CVE-2026-31431, nicknamed "Copy Fail," is a Linux kernel privilege escalation vulnerability that exploits page cache corruption through scatterlist mechanics. It allows an unprivileged attacker to overwrite the cached contents of arbitrary files (such as /usr/bin/su) with a malicious ELF payload, which then executes with elevated privileges.

The exploit works by:

  1. Corrupting kernel page cache structures
  2. Injecting a malicious ELF binary into memory
  3. Overwriting legitimate system binaries when they're executed
  4. Triggering privilege escalation via setuid syscalls

However, rootless containers present a fundamental barrier to this attack. This guide walks through understanding why and testing it in your own environment.

Why Rootless Containers Stop Copy Fail

Rootless Podman isolates container processes through user namespace remapping (uid_map). When a container runs as a non-root user on the host, the kernel enforces strict constraints:

  • The container's root user (UID 0 inside the container) maps to an unprivileged host UID (the invoking user's own UID by default, or a subordinate UID such as 100000 in other --userns modes)
  • Page cache corruption still occurs, but the kernel rejects privilege escalation attempts
  • setuid() calls are checked against the caller's user namespace: the kernel requires CAP_SETUID in that namespace and never grants a UID outside the mapped range
  • The exploit's setuid(0) call cannot escalate beyond the namespace-remapped UID

This is a critical difference from running containers with --userns=host or as root.
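
To make the remapping concrete, here is a minimal sketch of how a container UID translates to a host UID. It assumes the three-column format of /proc/<pid>/uid_map (inside start, outside start, length); the helper names and the sample mapping are illustrative, not part of Podman.

```python
from typing import List, Optional, Tuple

UidRow = Tuple[int, int, int]  # (inside_start, outside_start, length)


def parse_uid_map(text: str) -> List[UidRow]:
    """Parse the text of /proc/<pid>/uid_map into integer rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            rows.append((int(fields[0]), int(fields[1]), int(fields[2])))
    return rows


def to_host_uid(rows: List[UidRow], container_uid: int) -> Optional[int]:
    """Translate a container UID to the host UID, or None if it is unmapped."""
    for inside, outside, length in rows:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Illustrative mapping (an assumption): container 0..65535 -> host 100000..165535
SAMPLE_UID_MAP = "         0     100000      65536\n"

rows = parse_uid_map(SAMPLE_UID_MAP)
print(to_host_uid(rows, 0))      # 100000
print(to_host_uid(rows, 1000))   # 101000
print(to_host_uid(rows, 70000))  # None (outside the mapped range)
```

To try it on a live container, feed it the output of cat /proc/self/uid_map run inside that container.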

Setting Up Your Lab Environment

Prerequisites

You'll need:

  • A Linux system with Podman installed (version 4.0 or later)
  • strace and eBPF tools for syscall tracing
  • A test container image (a minimal Fedora or Ubuntu image works)
  • The Copy Fail exploit (available on security research repositories)

Installing Rootless Podman

If you haven't set up rootless Podman yet:

# Install Podman (Fedora example)
sudo dnf install podman

# Apply any /etc/subuid and /etc/subgid changes to existing rootless containers
podman system migrate

# Verify rootless setup
podman info | grep -A 5 rootless

Check that rootless: true appears in the output. This confirms containers run with user namespace isolation.
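
The same check can be scripted. The sketch below assumes that podman info --format json exposes the flag at host.security.rootless; confirm the key path against your Podman version. The function names are illustrative.

```python
import json
import subprocess


def is_rootless(info: dict) -> bool:
    """Return True if the parsed `podman info` document reports rootless mode.

    The key path host -> security -> rootless is an assumption about the
    JSON layout; adjust it if your Podman version differs.
    """
    return bool(info.get("host", {}).get("security", {}).get("rootless", False))


def podman_info() -> dict:
    """Run `podman info --format json` and return the parsed document."""
    result = subprocess.run(
        ["podman", "info", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Offline demonstration with a hand-written sample document
sample = {"host": {"security": {"rootless": True}}}
print(is_rootless(sample))  # True
print(is_rootless({}))      # False
```

On a machine with Podman installed, is_rootless(podman_info()) performs the live check.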

Creating a Test Container

Start with a minimal image:

podman run -d \
  --name copy-fail-test \
  --security-opt=no-new-privileges:false \
  fedora:39 \
  sleep 3600

The --security-opt=no-new-privileges:false flag leaves the kernel's no_new_privs bit unset, so setuid binaries can still change credentials (necessary to reproduce the vulnerability conditions).
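
To confirm the bit's state from inside the container, you can read the NoNewPrivs field of /proc/self/status. This is a minimal sketch; the function name and sample text are illustrative.

```python
from typing import Optional


def no_new_privs(status_text: str) -> Optional[bool]:
    """Return the NoNewPrivs flag from /proc/<pid>/status text.

    Returns None when the field is absent (for example on older kernels).
    """
    for line in status_text.splitlines():
        if line.startswith("NoNewPrivs:"):
            return line.split(":", 1)[1].strip() == "1"
    return None


# Hand-written excerpt in the format of /proc/self/status
sample_status = "Name:\tsleep\nUid:\t0\t0\t0\t0\nNoNewPrivs:\t0\nSeccomp:\t2\n"
print(no_new_privs(sample_status))  # False
```

Inside the container, pass it the contents of /proc/self/status. A result of False means setuid binaries are still honored.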

Extracting and Analyzing the Exploit Shellcode

Before running the exploit, you should audit what it actually does. The public exploit embeds compressed shellcode:

#!/usr/bin/env python3
import zlib

hex_str = "78daab77f57163626464800126063b0610af82c101cc7760c0040e0c160c301d209a154d16999e07e5c1680601086578c0f0ff864c7e568f5e5b7e10f75b9675c44c7e56c3ff593611fcacfa499979fac5190c0c0c0032c310d3"

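# Decode the hex string into the zlib-compressed byte stream
compressed_bytes = bytes.fromhex(hex_str)
# Inflate it to recover the raw payload; nothing is executed here
raw_payload = zlib.decompress(compressed_bytes)

# Write the payload to disk for offline disassembly
with open("shellcode.bin", "wb") as f:
    f.write(raw_payload)

print(f"Payload extracted: {len(raw_payload)} bytes")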

Save the script as extract_shellcode.py, run it to extract the binary, then disassemble the result:

python3 extract_shellcode.py
objdump -D -b binary -m i386:x86-64 shellcode.bin | head -50

The shellcode performs these syscalls (starting at offset 0x78):

78: 31 c0                xor    %eax,%eax
7a: 31 ff                xor    %edi,%edi
7c: b0 69                mov    $0x69,%al
7e: 0f 05                syscall

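This is setuid(0), the privilege escalation attempt: eax is cleared, edi (the first argument) is set to 0, and al is loaded with 0x69 (105), the x86-64 syscall number for setuid. The key insight: this syscall will fail in a rootless container because the effective UID cannot escalate beyond the namespace boundary.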
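
When auditing a payload, it helps to locate syscall sites mechanically before reading the full disassembly. The sketch below scans raw bytes for the syscall opcode (0f 05) and reports the immediate from a directly preceding mov $imm8,%al (b0 xx). It is a heuristic byte-pattern scan, not a disassembler, so it can flag data bytes by mistake; the function name is illustrative.

```python
from typing import List, Optional, Tuple


def find_syscall_sites(payload: bytes, base: int = 0) -> List[Tuple[int, Optional[int]]]:
    """Return (offset, syscall_number) pairs for each `0f 05` in the payload.

    syscall_number is the imm8 of a `mov $imm8,%al` immediately before the
    syscall instruction, or None if that pattern is not present.
    """
    sites = []
    pos = payload.find(b"\x0f\x05")
    while pos != -1:
        number = None
        if pos >= 2 and payload[pos - 2] == 0xB0:
            number = payload[pos - 1]
        sites.append((base + pos, number))
        pos = payload.find(b"\x0f\x05", pos + 1)
    return sites


# The eight bytes from the listing above, which starts at offset 0x78
listing = bytes.fromhex("31c031ffb0690f05")
for offset, number in find_syscall_sites(listing, base=0x78):
    print(hex(offset), number)  # 0x7e 105
```

To scan the extracted file, read shellcode.bin in binary mode and pass its contents with base=0.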

Running the Exploit in Rootless Podman

Inside your test container:

podman exec -it copy-fail-test bash

# Inside the container, download or copy the exploit
wget https://github.com/[repository]/copy-fail-exploit.py

# Run the exploit
python3 copy-fail-exploit.py

Expected output in a vulnerable (non-rootless) environment:

[+] Exploit successful: spawned root shell
root@host:~#

Expected output in a protected (rootless) environment:

[!] setuid(0) failed: Operation not permitted
[!] Privilege escalation blocked by user namespace

Tracing the Kernel's Rejection with strace

To watch the kernel reject the escalation in real time:

podman exec -it copy-fail-test bash

# Inside the container, run the exploit under strace
strace -f -e trace=setuid python3 copy-fail-exploit.py 2>&1 | grep -A 5 setuid

You'll see output like:

setuid(0)                               = -1 EPERM (Operation not permitted)

The EPERM (Operation not permitted) error is the kernel's namespace-aware security check rejecting the privilege escalation.
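
If you capture the strace output to a file, a short script can summarize every setuid attempt. This sketch assumes strace's default line format as shown above; the regular expression and function name are illustrative.

```python
import re
from typing import List, Optional, Tuple

# Matches e.g. "setuid(0)   = -1 EPERM (Operation not permitted)"
SETUID_RE = re.compile(r"setuid\((\d+)\)[ \t]*=[ \t]*(-?\d+)(?:[ \t]+([A-Z]+))?")


def parse_setuid_calls(output: str) -> List[Tuple[int, int, Optional[str]]]:
    """Return (requested_uid, return_value, errno_name) for each setuid call."""
    calls = []
    for match in SETUID_RE.finditer(output):
        calls.append((int(match.group(1)), int(match.group(2)), match.group(3)))
    return calls


sample = "setuid(0)                               = -1 EPERM (Operation not permitted)\n"
print(parse_setuid_calls(sample))  # [(0, -1, 'EPERM')]
```

A return value of -1 with errno EPERM is the blocked case. A return value of 0 would mean the call succeeded.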

Advanced: Verifying uid_map with eBPF

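Before reaching for eBPF tracing, confirm the mapping the kernel enforces by inspecting the container's uid_map:

# Check the container's uid_map (-i avoids depending on the key's exact casing)
podman inspect copy-fail-test | grep -i -A 10 uidmap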

Output example:

"IDMappings": {
  "UIDMap": [
    {"ContainerID": 0, "HostID": 100000, "Size": 65536}
  ]
}

This confirms container UID 0 maps to host UID 100000. When the exploit tries to call setuid(0), the kernel compares:

  • Requested UID: 0 (inside namespace)
  • Effective UID: 100000 (host perspective)
  • Namespace boundary: prevents escalation
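
The comparison above can be turned into a small audit check. This sketch assumes mapping entries shaped like the example output (ContainerID, HostID, Size) and flags any mapping whose range includes host UID 0, as an identity mapping under --userns=host would. The function name is illustrative.

```python
from typing import Dict, List


def maps_host_root(mappings: List[Dict[str, int]]) -> bool:
    """Return True if any mapping range includes host UID 0."""
    return any(m["HostID"] <= 0 < m["HostID"] + m["Size"] for m in mappings)


# Mapping taken from the example output above
remapped = [{"ContainerID": 0, "HostID": 100000, "Size": 65536}]
# Hypothetical identity mapping, as a host user namespace would show
identity = [{"ContainerID": 0, "HostID": 0, "Size": 4294967295}]

print(maps_host_root(remapped))  # False
print(maps_host_root(identity))  # True
```

A True result means container UIDs can reach real host root, so the namespace barrier described here does not apply to that container.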

Deployment Best Practices

| Practice | Impact | Effort |
|----------|--------|--------|
| Run containers rootless by default | Blocks Copy Fail + similar exploits | Low |
| Use --userns=private explicitly | Per-container isolation guarantee | Low |
| Combine with SELinux/AppArmor | Defense in depth | Medium |
| Regular kernel updates | Patch upstream fix | Ongoing |
| Audit with podman inspect | Verify namespace config | Low |

Conclusion

Rootless Podman's user namespace isolation provides strong protection against CVE-2026-31431 Copy Fail. The privilege escalation fails because the kernel's namespace enforcement prevents UID 0 from escaping the container's remapped boundaries.

For CI/CD systems like GNOME's runner infrastructure, this architecture eliminates the need for per-job VM isolation specifically to contain this class of kernel exploit, though defense-in-depth (combining with SELinux, immutable rootfs, etc.) remains a best practice.

Always audit exploit code before running it in your lab. The techniques shown here (shellcode extraction, strace tracing, namespace inspection) apply to any Linux privilege escalation research.

Recommended Tools