What Is the Copy Fail (CVE-2026-31431) Local Privilege Escalation (LPE) Vulnerability?
Copy Fail (CVE-2026-31431) is a high-severity local privilege escalation (LPE) vulnerability in the Linux kernel’s cryptographic subsystem. Disclosed publicly on April 29, 2026 by security research firm Theori, it allows any unprivileged local user to escalate to root using a 732-byte Python script — with no race conditions, no retries, and no crashes.
In a nutshell: a logic flaw buried in the Linux kernel’s cryptographic subsystem lets any unprivileged local user corrupt the kernel’s in-memory view of a setuid binary and walk away with a root shell. The exploit is 732 bytes of Python. It requires no race condition wins, no kernel-specific offsets, no compiled payloads. The same script, unmodified, gets root on Ubuntu, Amazon Linux, RHEL, and SUSE. CISA added it to its Known Exploited Vulnerabilities catalog on May 1, 2026, and federal agencies have been given until May 15 to patch.
Why Copy Fail Is Different From Previous Linux LPEs
To understand why Copy Fail caused immediate alarm across the security community, it helps to compare it to prior high-profile Linux kernel exploits:
| Vulnerability | Year | Reliability | Portability | Complexity |
|---|---|---|---|---|
| Dirty Cow (CVE-2016-5195) | 2016 | Race condition — needed multiple retries | Broad | High |
| Dirty Pipe (CVE-2022-0847) | 2022 | Reliable | Version-specific | Medium |
| Copy Fail (CVE-2026-31431) | 2026 | Deterministic — no race | Every distro since 2017 | Trivially low |
Copy Fail is a straight-line logic flaw. It triggers without races, retries, or crash-prone timing windows. The exact same script works across Ubuntu, Amazon Linux, RHEL, and SUSE — no per-distro offsets, no recompilation, no version checks. The entire exploit requires only Python 3.10+ and standard library modules (os, socket, zlib).
Technical Root Cause
The Affected Component: algif_aead and AF_ALG
The Linux kernel’s AF_ALG interface exposes the kernel’s cryptographic API, including hardware-accelerated implementations, to userspace applications. The vulnerable module within this interface is algif_aead, which handles Authenticated Encryption with Associated Data (AEAD) operations.
How This Bug Was Found
The story behind the discovery is as interesting as the vulnerability itself. A researcher at Theori named Taeyang Lee had been studying the intersection of the Linux crypto subsystem and page-cache-backed data — territory he’d mapped in earlier kernelCTF work. His hypothesis was that AF_ALG combined with splice() creates a path where unprivileged userspace can feed file-backed page cache pages directly into the kernel crypto subsystem, and that this interaction might be an underexplored source of vulnerabilities.
From there, the team used Xint Code, an AI-assisted security research tool, to scale the analysis across the entire crypto/ subsystem. The operator prompt given to the scan was deliberately minimal — essentially just pointing the tool at the crypto subsystem and noting the key observation about splice() delivering page-cache references to crypto TX scatterlists. About an hour later, Copy Fail emerged as the highest-severity finding. The researchers also noted that the scan surfaced additional high-severity vulnerabilities that are still in coordinated disclosure.
Understanding the Pieces That Came Together
The vulnerability doesn’t stem from a single bad commit. It is the collision of three independent kernel changes made in 2011, 2015, and 2017 — none of which was problematic in isolation. Understanding why Copy Fail works requires understanding each piece.
AF_ALG: The Crypto Subsystem’s Userspace Door
AF_ALG is a socket type that exposes the Linux kernel’s cryptographic API to unprivileged userspace. Since Linux 2.6.38 (released in 2011), any user can open an AF_ALG socket, bind it to a supported algorithm (symmetric ciphers, hash functions, AEAD constructs) and use it to encrypt or decrypt data. There is no capability requirement. This is intentional: hardware-accelerated cryptography in the kernel can benefit any application, so the interface is wide open.
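To make the "no capability requirement" point concrete, here is a minimal sketch of ordinary AF_ALG usage from Python: it asks the kernel to compute a SHA-256 digest. It deliberately uses the plain hash interface, not the AEAD path discussed in this article. The function name kernel_sha256 is made up for illustration, and the sketch assumes a Linux host where AF_ALG is neither compiled out nor blocked by a sandbox; where it is unavailable, the function returns None.

```python
import hashlib
import socket


def kernel_sha256(data: bytes):
    """Hash `data` via the kernel crypto API; return None if AF_ALG is unusable."""
    if not hasattr(socket, "AF_ALG"):
        return None  # not Linux, or Python built without AF_ALG support
    try:
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            alg.bind(("hash", "sha256"))  # (type, name) selects the algorithm
            op, _ = alg.accept()          # operation socket for one request
            with op:
                op.sendall(data)
                return op.recv(32)        # a SHA-256 digest is 32 bytes
    except OSError:
        return None  # e.g. blocked by seccomp or the module is not available


if __name__ == "__main__":
    digest = kernel_sha256(b"hello")
    if digest is None:
        print("AF_ALG is not usable in this environment")
    else:
        # Compare against the userspace implementation as a sanity check.
        print(digest.hex(), digest == hashlib.sha256(b"hello").digest())
```

Run as an unprivileged user, a result other than None confirms that the AF_ALG socket layer is reachable from that account or container.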
splice(): Zero-Copy File I/O
splice() is a Linux system call that moves data between file descriptors and pipes without making userspace copies of the data. Instead of copying bytes, it passes references to the underlying kernel memory pages, specifically the physical pages that sit in the page cache. When a file is read or memory-mapped, the kernel keeps cached copies of its contents in what’s called the page cache. splice() works by handing off references to those cached pages directly. This is great for performance. It becomes a problem when those page references end up somewhere they aren’t supposed to.
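As a small illustration of the zero-copy hand-off, the sketch below moves a few bytes from a regular file into a pipe with os.splice() and reads them back. The helper name splice_file_to_pipe is hypothetical. It assumes Linux and Python 3.10 or newer, where os.splice() exists, and returns None when the call is unavailable or the filesystem rejects it.

```python
import os
import tempfile


def splice_file_to_pipe(path: str, count: int):
    """Move up to `count` bytes from a file into a pipe via splice, then read them."""
    if not hasattr(os, "splice"):
        return None  # os.splice needs Linux and Python 3.10+
    fd = os.open(path, os.O_RDONLY)
    r, w = os.pipe()
    try:
        # The kernel attaches references to the file's cached pages to the pipe;
        # no userspace buffer is involved in this step.
        moved = os.splice(fd, w, count, offset_src=0)
        return os.read(r, moved) if moved else b""
    except OSError:
        return None  # some filesystems or sandboxes do not support splice
    finally:
        for d in (fd, r, w):
            os.close(d)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"page cache demo")
        f.flush()
        print(splice_file_to_pipe(f.name, 4))
```

Only the final os.read() copies data into the process; until then the pipe holds page references, not bytes.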
authencesn: The IPsec ESN Wrapper
authencesn is an AEAD (Authenticated Encryption with Associated Data) template added to the Linux kernel in 2011 to support IPsec’s Extended Sequence Number (ESN) mode, defined in RFC 4303. IPsec uses 64-bit sequence numbers to prevent replay attacks, but they’re split in a somewhat awkward way on the wire: the low 32 bits are transmitted explicitly, and the high 32 bits are inferred. When computing the authentication tag, the kernel needs both halves concatenated in a specific order for the HMAC. authencesn handles this rearrangement by using the destination scatterlist as scratch space — temporarily shuffling bytes around before hashing.
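The split described above can be sketched in a few lines. This is a simplified illustration of the RFC 4303 idea, not the kernel's authencesn code: the low half of the sequence number travels in the packet header, and the high half is appended only for the integrity computation. All function names are hypothetical, and the sketch ignores the IV, padding, and ICV truncation.

```python
import hashlib
import hmac
import struct


def split_esn(seq64: int):
    """Split a 64-bit extended sequence number into (high, low) 32-bit halves."""
    return divmod(seq64 & 0xFFFFFFFFFFFFFFFF, 1 << 32)


def wire_header(spi: int, seq64: int) -> bytes:
    """Header as transmitted: SPI followed by the low 32 bits only."""
    _, seq_lo = split_esn(seq64)
    return struct.pack("!II", spi, seq_lo)


def icv_input(spi: int, seq64: int, payload: bytes) -> bytes:
    """Bytes fed to the MAC: header, payload, then the untransmitted high half."""
    seq_hi, _ = split_esn(seq64)
    return wire_header(spi, seq64) + payload + struct.pack("!I", seq_hi)


def esp_icv(key: bytes, spi: int, seq64: int, payload: bytes) -> bytes:
    """HMAC-SHA256 over the rearranged input (simplified, untruncated)."""
    return hmac.new(key, icv_input(spi, seq64, payload), hashlib.sha256).digest()


if __name__ == "__main__":
    print(icv_input(0x11223344, (1 << 32) | 2, b"AB").hex())
```

The rearrangement is the point: the bytes being hashed are not laid out the way they arrive, so an implementation needs somewhere to build the reordered sequence, which is the scratch space the article refers to.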
This worked fine in 2011 because the only caller was the kernel’s internal xfrm layer, and the destination scatterlist was always a normal kernel buffer that authencesn legitimately controlled.
The Vulnerability: Three Changes, One Catastrophe
In 2015, AF_ALG gained AEAD support through algif_aead.c. This opened authencesn — and every other AEAD template — to direct invocation from unprivileged userspace. At the same time, authencesn was updated to the new AEAD interface. But at this stage, AF_ALG still operated out-of-place: the source scatterlist (containing input data) and the destination scatterlist (where output goes) were separate. Page cache pages could end up in the source scatterlist via splice(), but only in a read role. The authencesn scratch writes went to the destination, which was the user’s buffer. Not yet exploitable.
Then in 2017, an optimization was committed to algif_aead.c (commit 72548b093ee3) to make AEAD decryption operate in-place. The idea was to avoid an extra copy: instead of keeping separate source and destination scatterlists, the code would copy the AAD and ciphertext into the RX buffer, and then chain the authentication tag pages from the TX scatterlist directly onto the end of the RX scatterlist using sg_chain(). It then set req->src = req->dst, making both the source and destination point to the same combined chain.
This is the moment the three pieces collide. When a user splices a file into the TX path of an AF_ALG socket, the page cache pages of that file land in the TX scatterlist. After the 2017 in-place optimization, those tag pages get chained into the RX (destination) scatterlist. And authencesn, which uses the destination as scratch space, now has page cache pages in its writable destination.
The result is a controlled, deterministic write into the kernel’s cached copy of any file the attacker can read. The AEAD operation fails (the fabricated ciphertext doesn’t authenticate), so the recvmsg() call returns an error. The caller sees an error. The kernel thinks nothing unusual happened. And four bytes in the page cache of the target file have been silently overwritten.
The Mechanics of the Write Primitive
When authencesn processes a decryption request, it reads the Extended Sequence Number bytes from the AAD field and writes them back to the destination at offset assoclen + cryptlen. This write is supposed to land in the caller’s buffer. With the 2017 in-place path, it can instead land in page cache pages belonging to a file on disk.
The attacker controls all three critical parameters:
The target file is determined by which file descriptor gets passed to splice(). Any file readable by the current user is in scope. This includes setuid binaries like /usr/bin/su, which is world-readable and setuid-root on essentially every mainstream Linux distribution.
The target offset within that file is determined by the splice() call parameters — the file offset, splice length, and assoclen value. By doing the math, the attacker can position the write at any 4-byte boundary within the file’s page cache.
The value being written comes from bytes 4–7 of the AAD that the attacker provides in sendmsg(). These bytes represent seqno_lo in the ESN rearrangement logic, and they’re written verbatim to the destination. The attacker writes whatever they want.
The write is reliable. There is no timing dependency, no race window to win. Every invocation writes exactly four bytes to exactly the specified offset. By iterating over a shellcode payload in four-byte chunks, the attacker can overwrite arbitrarily large regions of the target file’s page cache.
The Exploit: 732 Bytes to Root
The published proof-of-concept targets /usr/bin/su by default, though it accepts any setuid binary as an argument. Here’s the flow at a high level:
First, the exploit opens an AF_ALG socket and binds it to authencesn(hmac(sha256),cbc(aes)). This is a standard AEAD template that exercises the vulnerable authencesn code path. No privileges are required — AF_ALG is available to all users in every mainstream distro’s default configuration.
Next, for each four-byte chunk of the shellcode payload, the exploit constructs a sendmsg() + splice() pair. The sendmsg carries the AAD with the desired four bytes in positions 4–7. The splice delivers pages from /usr/bin/su’s page cache as the ciphertext. The AEAD parameters are chosen so that the scratch write lands at the right offset within su’s .text section.
Each recv() call triggers the decryption attempt. Inside authencesn, the kernel reads the ESN bytes and writes seqno_lo across what it believes is its output buffer — but is actually a page cache page belonging to /usr/bin/su. The HMAC verification then fails, recvmsg() returns EBADMSG, and nothing looks alarming. Repeat for each four-byte chunk until the shellcode is fully written.
Finally, the exploit calls execve("/usr/bin/su"). The kernel loads the binary from the page cache. The page cache version now contains the injected shellcode. Because su is setuid-root, the shellcode runs as UID 0. The attacker has root.
Affected Systems
Copy Fail affects Linux kernels built since the 2017 in-place optimization commit. Practically, this means:
| Distribution | Kernel Version |
|---|---|
| Ubuntu 24.04 LTS | 6.17.0-1007-aws |
| Amazon Linux 2023 | 6.18.8-9.213.amzn2023 |
| RHEL 10.1 | 6.12.0-124.45.1.el10_1 |
| SUSE 16 | 6.12.0-160000.9-default |
Why This Is Harder to Detect Than It Looks
The stealth characteristics of Copy Fail deserve their own discussion, because they cut directly against common defensive assumptions.
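The most obvious one: the on-disk file is untouched. The kernel’s writeback machinery marks pages dirty when they’ve been modified through the normal VFS write path. The authencesn scratch write goes around that path entirely: it reaches the page cache via the crypto scatterlist and never sets the dirty bit, so the change is never written back. File integrity monitoring tools that checksum the on-disk contents (from a snapshot, an offline image, or with O_DIRECT reads) will see nothing wrong, and the evidence disappears once the modified pages are evicted or the host reboots. An ordinary buffered read such as sha256sum /usr/bin/su is served from the page cache, so it reflects the modification only while the altered pages remain resident; a scheduled check that runs before the write or after eviction returns the expected hash. Catching the modification means comparing the in-memory page cache contents against the on-disk data while the altered pages are still cached.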
Additionally, the exploit uses only legitimate system calls. From the kernel’s perspective, the attacker is just doing a failed AEAD decryption — an operation that any application might perform for legitimate reasons. There is no kernel panic, no unusual memory allocation pattern, no process spawning suspicious children. The recvmsg() returns an error, which looks like normal application behavior. Security monitoring systems that rely on syscall analysis have a difficult time distinguishing this from normal crypto library activity.
The only window where detection is more practical is at execution time: if you have an integrity monitoring system that hashes executables from /proc/<pid>/exe or reads the actual page cache at runtime, the modified su binary will produce a different hash. But most deployments don’t have this.
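One way to approximate such a runtime check is to hash the same file twice: once through ordinary buffered reads, which are served from the page cache, and once through O_DIRECT reads, which bypass it. The sketch below is an assumption-laden illustration, not a vetted detection tool. The function names are hypothetical, it assumes the filesystem supports O_DIRECT (many overlay and tmpfs mounts do not, in which case it returns None), and a mismatch can also have benign causes such as a file being updated mid-scan.

```python
import hashlib
import mmap
import os

BLOCK = 1 << 20  # 1 MiB; a multiple of common logical block sizes


def buffered_sha256(path: str) -> str:
    """Hash via normal reads, which are satisfied from the page cache."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(BLOCK), b""):
            h.update(chunk)
    return h.hexdigest()


def direct_sha256(path: str):
    """Hash via O_DIRECT reads; return None if direct I/O is unsupported."""
    flag = getattr(os, "O_DIRECT", None)
    if flag is None:
        return None
    try:
        fd = os.open(path, os.O_RDONLY | flag)
    except OSError:
        return None
    buf = mmap.mmap(-1, BLOCK)  # anonymous mapping gives a page-aligned buffer
    h = hashlib.sha256()
    try:
        remaining = os.fstat(fd).st_size
        while remaining > 0:
            n = os.readv(fd, [buf])
            if n == 0:
                break
            h.update(buf[:n])
            remaining -= n
        return h.hexdigest()
    except OSError:
        return None
    finally:
        buf.close()
        os.close(fd)


def cache_matches_disk(path: str):
    """True/False when both hashes are available, None when direct I/O is not."""
    direct = direct_sha256(path)
    return None if direct is None else direct == buffered_sha256(path)
```

A False result for a setuid binary would be a strong signal worth investigating, while None only means the comparison could not be made on that filesystem.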
Container Escape Implications
The implications extend well beyond the local privilege escalation story. The page cache is global — it is shared across all processes on a Linux host, including across container boundaries. When a container runtime like Docker or LXC launches a container, it does not give each container its own copy of the host’s page cache. The /usr/bin/su binary that lives in the container’s view of the filesystem and the one visible to the host might share physical pages.
Kaspersky’s analysis noted that Docker, LXC, and Kubernetes grant container processes access to the AF_ALG subsystem if the algif_aead module is loaded in the host kernel — which it typically is by default. This means the full Copy Fail primitive is available from inside a container. A compromised container process can corrupt a setuid binary’s page cache on the host and escalate to root on the underlying node, not just within the container.
For cloud environments running multi-tenant Kubernetes clusters, CI/CD infrastructure running untrusted pull request code on self-hosted runners, or notebook and sandbox environments that execute user-supplied code on shared kernels, the risk profile is substantially higher than a typical local privilege escalation.
The Fix
The patch (commit a664bf3d603d in the mainline kernel) is straightforward in its approach: it reverts the 2017 in-place optimization entirely. Instead of setting req->src = req->dst and chaining page cache pages into the writable destination scatterlist, the fixed code restores out-of-place operation — req->src points to the TX scatterlist, req->dst points to the RX buffer, and the two never share pages.
// Before (vulnerable): src and dst point to the same combined scatterlist
aead_request_set_crypt(&areq->cra_u.aead_req, rsgl_src, rsgl_src, used, ctx->iv);
// After (fixed): src is TX SGL, dst is RX SGL — fully separated
aead_request_set_crypt(&areq->cra_u.aead_req, tsgl_src,
areq->first_rsgl.sgl.sgt.sgl, used, ctx->iv);
The commit message is blunt about the conclusion: there is no performance benefit to in-place AEAD operation in algif_aead because the source and destination data already come from different memory mappings. The 2017 optimization was solving a problem that didn’t exist and created one that does.
Fixes are available in Linux kernel versions 6.18.22, 6.19.12, and 7.0. Major distributions began shipping patched kernel packages shortly after public disclosure.
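For a quick triage of many hosts, the version numbers above can be turned into a rough check. The sketch below is only a heuristic under stated assumptions: it knows about the three upstream releases named in this article and nothing else, and the function name upstream_fix_status is made up. Distribution kernels frequently backport fixes without changing the base version, so False means "the upstream base predates the fix", not "definitely vulnerable", and None means the series is not covered here and the vendor advisory must be consulted.

```python
import platform
import re

# Upstream releases named in the article: 6.18.22, 6.19.12, and 7.0.
FIXED_STABLE = {(6, 18): 22, (6, 19): 12}
FIXED_MAINLINE = (7, 0)


def upstream_fix_status(release: str):
    """Return True, False, or None (unknown) for a kernel release string."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        return None
    major, minor = int(m.group(1)), int(m.group(2))
    patch = int(m.group(3) or 0)
    if (major, minor) >= FIXED_MAINLINE:
        return True
    if (major, minor) in FIXED_STABLE:
        return patch >= FIXED_STABLE[(major, minor)]
    return None  # other series: only the vendor advisory can say


if __name__ == "__main__":
    rel = platform.release()
    print(rel, upstream_fix_status(rel))
```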
Immediate Mitigation
If patching is not immediately possible, the algif_aead kernel module can be blocked from loading:
echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif.conf
rmmod algif_aead 2>/dev/null
This has minimal practical impact for most systems. The vast majority of Linux software does not use AF_ALG, including dm-crypt/LUKS, kTLS, in-kernel IPsec/XFRM, OpenSSL in its default configuration, GnuTLS, NSS, and SSH. The in-kernel components call the kernel’s crypto API through internal interfaces, and the userspace libraries do their cryptography in userspace; none of them go through the AF_ALG socket layer. Disabling algif_aead does not affect disk encryption, TLS, or SSH.
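What it may affect is userspace software explicitly configured to use OpenSSL’s afalg engine, or embedded crypto offload paths that bind aead sockets directly. You can check for active AF_ALG usage on a running system with lsof | grep AF_ALG, and lsmod | grep algif_aead shows whether the module is loaded and its use count. Note that ss -xa lists Unix domain sockets and will not show AF_ALG sockets.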
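To confirm that the module is actually gone after applying the mitigation, /proc/modules can be read directly. The following is a minimal sketch with hypothetical function names. It assumes algif_aead is built as a loadable module; a kernel that compiles it in will never list it in /proc/modules, so "not loaded" does not prove the code path is absent.

```python
def loaded_modules(proc_modules_text: str) -> dict:
    """Map module name to use count from /proc/modules-style text."""
    mods = {}
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # Format: name size use_count dependencies state address
        if len(fields) >= 3 and fields[2].isdigit():
            mods[fields[0]] = int(fields[2])
    return mods


def algif_aead_state(path: str = "/proc/modules") -> str:
    """Describe whether algif_aead appears as a loaded module."""
    try:
        with open(path) as f:
            mods = loaded_modules(f.read())
    except OSError:
        return "unknown (cannot read module list)"
    if "algif_aead" not in mods:
        return "not loaded as a module"
    return f"loaded, use count {mods['algif_aead']}"


if __name__ == "__main__":
    print(algif_aead_state())
```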
For containerized and multi-tenant environments, consider adding a seccomp profile that blocks AF_ALG socket creation regardless of patch state. This provides defense-in-depth even on patched systems and limits the blast radius of any future AF_ALG-adjacent vulnerabilities.
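As a rough sketch of what such a rule could look like, the snippet below generates a Docker/OCI-style seccomp profile fragment that makes socket() fail when its first argument is AF_ALG. Several things here are assumptions: the JSON field names follow the commonly used Docker profile format, the value 38 is the Linux AF_ALG address-family number, and the standalone profile with an allow-by-default action exists only to keep the example short. In practice the rule would need to be merged into the runtime's existing default profile and tested there, and 32-bit x86 guests that route socket creation through socketcall would need separate handling.

```python
import json

AF_ALG = 38  # Linux address-family number for AF_ALG (assumed; socket.AF_ALG on Linux)


def deny_af_alg_rule() -> dict:
    """Seccomp rule: make socket(AF_ALG, ...) return an error."""
    return {
        "names": ["socket"],
        "action": "SCMP_ACT_ERRNO",
        "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
    }


def minimal_profile() -> dict:
    """Illustrative standalone profile; NOT a replacement for a hardened default."""
    return {"defaultAction": "SCMP_ACT_ALLOW", "syscalls": [deny_af_alg_rule()]}


if __name__ == "__main__":
    print(json.dumps(minimal_profile(), indent=2))
```

Blocking at socket creation is a coarse control: it removes the whole AF_ALG surface from the container, which is the intent here, at the cost of breaking any workload that legitimately uses kernel crypto offload through that interface.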
What CISA’s KEV Listing Means
CISA added CVE-2026-31431 to its Known Exploited Vulnerabilities catalog on May 1, 2026. Federal Civilian Executive Branch agencies have a hard deadline of May 15, 2026 to apply vendor-supplied patches. The KEV listing is based on evidence of active exploitation in the wild, though CISA has not yet published specific details about the observed attack campaigns.
The Microsoft Defender Security Research Team has noted preliminary testing activity that is likely to evolve into broader threat actor exploitation within days. Their characterization is worth quoting for its precision: while the attack vector is local and not remotely exploitable in isolation, it becomes highly impactful when chained with an initial access vector — SSH access obtained through credential theft, a malicious CI job in a developer pipeline, or a container foothold from an application-layer vulnerability. Any path that delivers a shell on a vulnerable Linux host, even a low-privilege one, becomes a path to full root via Copy Fail.
Go and Rust reimplementations of the original Python PoC have already appeared in open-source repositories, according to Kaspersky’s analysis. The barrier to entry is falling.
The Bigger Picture: How Does Something Like This Hide for Nine Years?
It is easy to read the disclosure and ask how three changes spanning 2011 to 2017 could sit undetected for nearly a decade. The answer is partly architectural and partly cultural.
The architectural answer is that each individual change was defensible. Adding authencesn in 2011 was a legitimate addition for IPsec support. Adding AEAD to AF_ALG in 2015 was a natural evolution of the userspace crypto interface. The 2017 in-place optimization was a reasonable performance improvement given the information available at the time. None of these changes had an obvious security flaw when reviewed in isolation. The vulnerability lived in the composition — in the set of invariants that each component assumed the others would uphold, but none of them enforced.
The cultural answer is that the AF_ALG interface’s interaction with page-cache-backed data was simply not a well-studied attack surface. Taeyang Lee’s initial insight — that splice() delivering page cache pages into the crypto subsystem might be a source of vulnerabilities — was genuinely novel. The AF_ALG interface is correct in isolation. The splice() semantics are correct in isolation. authencesn’s behavior was, at most, an underdocumented design choice. The intersection of all three was a blind spot no one had thought to look into.
This is a class of vulnerability that gets harder to find as codebases grow and as subsystem interactions multiply. It is also, arguably, exactly the class of vulnerability that AI-assisted code analysis is well-positioned to surface — tools that can reason about invariants across subsystem boundaries without the cognitive overhead of tracking all the context manually.
Summary and Action Checklist
| Priority | Action |
|---|---|
| 🔴 Immediate | Patch the kernel on all affected systems |
| 🔴 Immediate | On Kubernetes nodes and CI runners: treat as critical, patch first |
| 🟠 If unpatched | Disable algif_aead (Debian/Ubuntu/SUSE only — does NOT work on RHEL family) |
| 🟠 If unpatched | Enable SELinux enforcing mode to reduce attack surface |
| 🟡 Ongoing | Enable automated kernel security updates |
| 🟡 Ongoing | Add AF_ALG socket monitoring to your detection stack |
| 🟡 Ongoing | Review container isolation strategy — consider VM-level boundaries |