You gave me a u32. I gave you root. (io_uring ZCRX freelist LPE)

Linux 6.15–6.19 carries a missing bounds check in io_uring’s ZCRX zero-copy receive path that lets a CAP_NET_ADMIN holder escalate to uid=0 on real ZCRX-capable NICs.

What Matters

  • Two teardown paths (ptr_ring drain + scrub loop) both call io_zcrx_return_niov_freelist; a race lets free_count exceed num_niovs, writing one u32 past the freelist array end.
  • The OOB value is a niov index (0–31 for 128KB area); choosing area size at registration selects the slab cache and therefore the adjacent victim object.
  • Chain targets a kmalloc-128 msg_msg; the u32 corrupts m_list.next's low 32 bits, preserving the high physmap prefix, enabling a controlled heap over-read via msgrcv MSG_COPY.
  • KASLR is broken via /proc/kallsyms (kptr_restrict=0), dmesg, or the msgrcv over-read scanning for pointers in range 0xffffffff80000000–0xfffffffffe000000.
  • Fix is commit 770594e; not yet backported to any stable branch at time of writing.
  • [HN: @PlasmaPower] CAP_NET_ADMIN and CAP_SYS_ADMIN are both available inside unprivileged user namespaces (unshare -Ur), which nearly all distros permit for browser sandboxing — making the capability bar lower than it appears.
  • [HN: @staticassertion] io_uring is already disabled by default in most container runtimes; disabling it outright is a reasonable mitigation given its recurring LPE history.
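
The arithmetic behind the sizing and overwrite bullets can be sketched as a small user-space model. This is not kernel code: the helper names are hypothetical, 4 KiB pages with one niov per page are assumed, the kmalloc size classes are those of a typical x86-64 build, and slab adjacency is modeled as one contiguous little-endian buffer. The scan range is taken from the summary and treated as half-open, which is also an assumption.

```python
import struct

PAGE_SIZE = 4096   # assumption: 4 KiB pages, one niov per page
U32 = 4            # each freelist slot holds one u32 niov index

# Assumption: kmalloc size classes of a typical x86-64 configuration.
KMALLOC_SIZES = [8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192]


def num_niovs(area_bytes):
    """Number of niovs (and freelist slots) for a registered area."""
    return area_bytes // PAGE_SIZE


def freelist_cache(area_bytes):
    """Smallest kmalloc size class that fits the freelist array, or None."""
    need = num_niovs(area_bytes) * U32
    return next((s for s in KMALLOC_SIZES if s >= need), None)


def unchecked_return(heap, free_count, niov_idx):
    """Model of a freelist push with no bounds check (hypothetical helper).

    Writes niov_idx into slot free_count and returns the new count.
    """
    struct.pack_into("<I", heap, free_count * U32, niov_idx)
    return free_count + 1


def in_scan_range(ptr):
    """Range from the summary used when scanning leaked data for pointers."""
    return 0xFFFFFFFF80000000 <= ptr < 0xFFFFFFFFFE000000


# A 128 KiB area gives 32 niovs, so the freelist is 32 * 4 = 128 bytes.
n = num_niovs(128 * 1024)
cache = freelist_cache(128 * 1024)

# Freelist followed directly by the victim's first 8-byte pointer field.
heap = bytearray(n * U32) + struct.pack("<Q", 0xFFFF888012345678)

# The freelist is already full; one extra return from the racing path
# writes a single u32 past the end of the array.
free_count = unchecked_return(heap, n, 31)
victim = struct.unpack_from("<Q", heap, n * U32)[0]

print(n, cache, free_count, hex(victim))
```

In this model the victim pointer keeps its high 32 bits and takes the niov index as its low 32 bits, which is the partial overwrite the msg_msg bullet describes. A check that rejects the push when free_count already equals the number of niovs would prevent the out-of-bounds write in the model.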
