CVE-2026-46223

Analyzed - Analysis Complete

Deadlock in Linux Kernel cgroup Subsystem (rmdir)

Publication date: 2026-05-28

Last updated on: 2026-06-11

Assigner: kernel.org

Description

In the Linux kernel, the following vulnerability has been resolved:

cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated

A chain of commits going back to v7.0 reworked rmdir to satisfy the controller invariant that a subsystem's ->css_offline() must not run while tasks are still doing kernel-side work in the cgroup.

[1] d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
[2] a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup")
[3] 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
[4] 4c56a8ac6869 ("cgroup: Fix cgroup_drain_dying() testing the wrong condition")
[5] 13e786b64bd3 ("cgroup: Increment nr_dying_subsys_* from rmdir context")

[1] moved task cset unlink from do_exit() to finish_task_switch() so a task's cset link drops only after the task has fully stopped scheduling. That made tasks past exit_signals() linger on cset->tasks until their final context switch, which led to a series of problems as what userspace expected to see after rmdir diverged from what the kernel needs to wait for.

[2]-[5] tried to bridge that divergence: [2] filtered the exiting tasks from cgroup.procs; [3] had rmdir(2) sleep in TASK_UNINTERRUPTIBLE for them; [4] fixed the wait's condition; [5] made nr_dying_subsys_* visible synchronously.

The cgroup_drain_dying() wait in [3] turned out to be a dead end. When the rmdir caller is also the reaper of a zombie that pins a pidns teardown (e.g. host PID 1 systemd reaping orphan pids that were re-parented to it during the same teardown), rmdir blocks in TASK_UNINTERRUPTIBLE waiting for those pids to free, the pids can't free because PID 1 is the reaper and it's stuck in rmdir, and the system A-A deadlocks. No internal lock ordering breaks this; the wait itself is the bug.

The css killing side that drove the original reorder, however, can be made cleanly asynchronous: ->css_offline() is already async, run from css_killed_work_fn() driven by percpu_ref_kill_and_confirm(). The fix is to make that chain start only after all tasks have left the cgroup. rmdir's user-visible side then returns as soon as cgroup.procs and friends are empty, while ->css_offline() still runs only after the cgroup is fully drained.

Verified by the original reproducer (pidns teardown + zombie reaper, runs under vng) which hangs vanilla and succeeds here, and by per-commit deterministic repros for [2], [3], [4], [5] with a boot parameter that widens the post-exit_signals() window so each state is reliably reachable. Some stress tests on top of that.

cgroup_apply_control_disable() has the same shape of pre-existing race: when a controller is disabled via subtree_control, kill_css() ran synchronously while tasks past exit_signals() could still be linked to the cgroup's csets, and ->css_offline() could fire before they drained. This patch preserves the existing synchronous behavior at that call site (kill_css_sync() + kill_css_finish() back-to-back) and a follow-up patch will defer kill_css_finish() there using a per-css trigger.

This seems like the right approach and I don't see problems with it. The changes are somewhat invasive but not excessively so, so backporting to -stable should be okay. If something does turn out to be wrong, the fallback is to revert the entire chain ([1]-[5]) and rework in the development branch instead.

v2: Pin cgrp across the deferred destroy work with explicit cgroup_get()/cgroup_put() around queue_work() and the work_fn. v1 wasn't actually broken (ordered cgroup_offline_wq + queue_work order in cgroup_task_dead() saved it) but the explicit ref removes the dependency on those non-obvious invariants. Also note the pre-existing cgroup_apply_control_disable() race in the description; a follow-up will defer kill_css_finish() there.
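
The ordering described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `ToyCgroup` and its methods are hypothetical names invented for illustration and do not correspond to kernel structures or functions. It models two sets of tasks (those visible to userspace and those still linked to the cgroup's csets) and starts the css kill only once rmdir has returned and the linked set is empty.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCgroup:
    """Hypothetical stand-in for a cgroup; not a kernel structure."""
    visible_tasks: set = field(default_factory=set)  # what cgroup.procs would list
    linked_tasks: set = field(default_factory=set)   # still linked to the csets
    rmdir_done: bool = False
    css_offlined: bool = False
    events: list = field(default_factory=list)

    def attach(self, pid):
        self.visible_tasks.add(pid)
        self.linked_tasks.add(pid)

    def exit_signals(self, pid):
        # Past exit_signals(): hidden from userspace, but still linked.
        self.visible_tasks.discard(pid)
        self.events.append(f"exit_signals({pid})")

    def final_switch_out(self, pid):
        # Final context switch: the cset link is dropped.
        self.linked_tasks.discard(pid)
        self.events.append(f"unlinked({pid})")
        self._maybe_start_kill()

    def rmdir(self):
        # rmdir only checks the user-visible side and never sleeps
        # waiting for dying tasks in this model.
        if self.visible_tasks:
            raise OSError("EBUSY: cgroup still populated")
        self.rmdir_done = True
        self.events.append("rmdir returned")
        self._maybe_start_kill()

    def _maybe_start_kill(self):
        # Deferred kill: start only after rmdir AND full depopulation.
        if self.rmdir_done and not self.linked_tasks and not self.css_offlined:
            self.css_offlined = True
            self.events.append("css_offline")


def demo():
    cg = ToyCgroup()
    cg.attach(100)
    cg.exit_signals(100)      # task is dying but still linked
    cg.rmdir()                # returns immediately
    cg.final_switch_out(100)  # last task leaves, css_offline fires now
    return cg.events
```

In this model `rmdir()` never blocks on dying tasks, so the caller stays free to reap zombies, while `css_offline` is still recorded only after the last task has been unlinked.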

Meta Information

Published: 2026-05-28
Last Modified: 2026-06-11
Generated: 2026-07-28
AI Q&A: 2026-05-28
EPSS Evaluated: 2026-07-26
NVD: CVE-2026-46223
EUVD: EUVD-2026-32850

Affected Vendors & Products

Vendor	Product	Version / Range
linux	linux_kernel	7.0
linux	linux_kernel	7.1
linux	linux_kernel	From 6.19.12 (inc) to 7.0 (exc)
linux	linux_kernel	From 7.0.1 (inc) to 7.0.9 (exc)
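
As a rough aid, the ranges in this table can be turned into a version check. The sketch below is illustrative only: the function names are hypothetical, the bare "7.0" and "7.1" rows are assumed to mean the x.y.0 releases, and the result says nothing about distribution kernels, which often backport fixes without changing the upstream version number.

```python
import re

# Half-open ranges copied from the table: lower bound inclusive, upper exclusive.
AFFECTED_RANGES = [
    ((6, 19, 12), (7, 0, 0)),
    ((7, 0, 1), (7, 0, 9)),
]
# Assumption: the bare "7.0" and "7.1" rows refer to the x.y.0 releases.
AFFECTED_EXACT = {(7, 0, 0), (7, 1, 0)}


def parse_version(text):
    """Turn '7.0.3' or '7.0.3-12-generic' into (7, 0, 3); missing parts become 0."""
    upstream = text.split("-")[0]
    nums = [int(n) for n in re.findall(r"\d+", upstream)[:3]]
    return tuple(nums + [0] * (3 - len(nums)))


def is_listed_as_affected(version_text):
    """True if the version falls inside one of the table's rows."""
    v = parse_version(version_text)
    if v in AFFECTED_EXACT:
        return True
    return any(lo <= v < hi for lo, hi in AFFECTED_RANGES)
```

For example, `is_listed_as_affected("7.0.3")` is true and `is_listed_as_affected("7.0.9")` is false under these assumptions. On a running system, `platform.release()` from the Python standard library can supply the version string.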

CWE

CWE ID	Description
CWE-667	The product does not properly acquire or release a lock on a resource, leading to unexpected resource state changes and behaviors.

Executive Summary

This vulnerability affects the Linux kernel's cgroup subsystem, specifically the removal of cgroups via rmdir. A chain of commits starting in v7.0 reworked rmdir so that a controller's ->css_offline() callback would not run while exiting tasks were still doing kernel-side work in the cgroup. One of those commits made rmdir(2) sleep in TASK_UNINTERRUPTIBLE until the dying tasks had left.

That wait can deadlock. If the process calling rmdir is also the reaper of the zombie processes it is waiting for (for example, host PID 1 systemd reaping orphans re-parented to it during a PID namespace teardown), rmdir blocks until those pids are freed, but they cannot be freed until the blocked process reaps them. The commit description states that no lock reordering resolves this; the wait itself is the bug.

The fix defers the css percpu_ref kill, and therefore the asynchronous ->css_offline() cleanup it triggers, until all tasks have left the cgroup. rmdir returns as soon as the cgroup appears empty to userspace (cgroup.procs and related files), and the cleanup runs afterward. This removes the deadlock and aligns kernel behavior with userspace expectations.
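
The circular dependency behind the deadlock can be made concrete with a toy wait-for graph. This is a simplified illustration, not kernel code: the node labels and the `find_wait_cycle` helper are hypothetical, and each node is assumed to wait on at most one other node.

```python
def find_wait_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []
    seen = {}
    node = start
    while node is not None:
        if node in seen:
            # We came back to a node already on the path: that is the cycle.
            return path[seen[node]:] + [node]
        seen[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    return None


# Before the fix: rmdir waits for the pids, which wait for PID 1 to reap
# them, but PID 1 is the task stuck in rmdir.
waits_for_before = {
    "rmdir returns": "pids freed",
    "pids freed": "PID 1 reaps zombies",
    "PID 1 reaps zombies": "rmdir returns",
}

# After the fix: rmdir no longer waits on the dying tasks.
waits_for_after = {
    "pids freed": "PID 1 reaps zombies",
    "PID 1 reaps zombies": "rmdir returns",
}
```

Calling `find_wait_cycle(waits_for_before, "rmdir returns")` returns the three-node loop, while the same call on `waits_for_after` returns `None` because removing the rmdir edge breaks the cycle.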

Impact Analysis

This vulnerability can cause a system deadlock in Linux environments using cgroups. Specifically, if a process responsible for cleaning up cgroups is also the reaper of zombie processes, the system can hang indefinitely during cgroup removal.

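Because the wait is in TASK_UNINTERRUPTIBLE, the blocked rmdir caller cannot be interrupted by signals. When that caller is host PID 1, the hang leads to system unresponsiveness, impacting the availability and reliability of services running on the affected Linux system.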
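
One way to look for the symptom is to list tasks in uninterruptible sleep (state `D` in `/proc/<pid>/stat`). The sketch below is a generic helper with hypothetical function names, not a detector for this CVE: many unrelated conditions, such as slow I/O, also put tasks in state `D`, so a hit is only a starting point for investigation.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST ')' to find where the remaining fields start.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_text), comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return [(pid, comm)] for tasks currently in state 'D'."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no procfs at this path
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited or the line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

If PID 1 shows up persistently in this list while a cgroup directory is being removed, that is consistent with the scenario described above, though not proof of it.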

Compliance Impact

The provided CVE description does not include any information regarding the impact of this vulnerability on compliance with common standards and regulations such as GDPR or HIPAA.

Mitigation Strategies

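The vulnerability has been resolved by deferring the css percpu_ref kill on rmdir until the cgroup is fully depopulated, so rmdir no longer blocks waiting for dying tasks that the caller itself may need to reap.

The immediate mitigation is to update to a kernel that includes this fix, the commit titled "cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated". Commits [1]-[5] in the description are the earlier rework that introduced the blocking wait, not the fix. Based on the affected ranges listed above, 7.0.9 is the first 7.0.y release not listed as affected.

The commit description judges the change somewhat invasive but suitable for backporting to -stable, and names a fallback if problems appear: revert the entire chain [1]-[5] and rework it in the development branch. It also notes that a similar pre-existing race in cgroup_apply_control_disable() is left for a follow-up patch.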
