CVE-2026-43404
Status: Received - Intake
Memory Corruption in Linux Kernel mm Subsystem

Publication date: 2026-05-08

Last updated on: 2026-05-08

Assigner: kernel.org

Description
In the Linux kernel, the following vulnerability has been resolved:

mm: Fix a hmm_range_fault() livelock / starvation problem

If hmm_range_fault() fails a folio_trylock() in do_swap_page() while trying to acquire the lock of a device-private folio for migration to RAM, the function will spin until it succeeds in grabbing the lock. However, if the process holding the lock depends on a work item to complete, and that work item is scheduled on the same CPU as the spinning hmm_range_fault(), the work item might be starved and we end up in a livelock / starvation situation that is never resolved. This can happen, for example, if the process holding the device-private folio lock is stuck in migrate_device_unmap()->lru_add_drain_all(), since lru_add_drain_all() requires a short work item to run on all online CPUs to complete.

The prerequisites for this to happen are:
a) Both zone device and system memory folios are considered in migrate_device_unmap(), so that there is a reason to call lru_add_drain_all() for a system memory folio while a folio lock is held on a zone device folio.
b) The zone device folio has an initial mapcount > 1, which causes at least one migration PTE entry insertion to be deferred to try_to_migrate(), which can happen after the call to lru_add_drain_all().
c) No preemption, or voluntary preemption only.

This all seems pretty unlikely to happen, but it is indeed hit by the "xe_exec_system_allocator" igt test.

Resolve this by waiting for the folio to be unlocked if the folio_trylock() fails in do_swap_page(). Rename migration_entry_wait_on_locked() to softleaf_entry_wait_unlock() and update its documentation to indicate the new use case. Future code improvements might consider moving the lru_add_drain_all() call in migrate_device_unmap() to be called *after* all pages have migration entries inserted; that would also eliminate b) above.
v2: Instead of a cond_resched() in hmm_range_fault(), eliminate the problem by waiting for the folio to be unlocked in do_swap_page() (Alistair Popple, Andrew Morton)
v3: Add a stub migration_entry_wait_on_locked() for the !CONFIG_MIGRATION case (Kernel Test Robot)
v4: Rename migrate_entry_wait_on_locked() to softleaf_entry_wait_on_locked() and update docs (Alistair Popple)
v5: Add a WARN_ON_ONCE() for the !CONFIG_MIGRATION version of softleaf_entry_wait_on_locked(); modify wording around function names in the commit message (Andrew Morton)
(cherry picked from commit a69d1ab971a624c6f112cea61536569d579c3215)
Meta Information
Published
2026-05-08
Last Modified
2026-05-08
Generated
2026-05-09
AI Q&A
2026-05-08
EPSS Evaluated
N/A
Affected Vendors & Products
Showing 1 associated CPE
Vendor Product Version / Range
linux linux_kernel *
Helpful Resources
Exploitability
CWE
KEV
CWE ID: CWE-UNKNOWN
AI Powered Q&A
What immediate steps should I take to mitigate this vulnerability?

The vulnerability is resolved by updating the Linux kernel to a version that includes the fix for the hmm_range_fault() livelock/starvation problem.

Specifically, the fix involves waiting for the folio to be unlocked if folio_trylock() fails in do_swap_page(), preventing the livelock condition.

Therefore, the immediate mitigation step is to apply the kernel patch or upgrade to a kernel version that contains this fix.


Can you explain this vulnerability to me?

This vulnerability exists in the Linux kernel's memory management subsystem, specifically in the hmm_range_fault() function. When hmm_range_fault() tries to acquire a lock on a device-private folio for migration to RAM and fails, it spins waiting for the lock to be released. If the process holding the lock depends on a work item scheduled on the same CPU as the spinning hmm_range_fault(), that work item can be starved, causing a livelock or starvation situation that never resolves.

This situation can occur when the process holding the device-private folio lock is stuck in migrate_device_unmap()->lru_add_drain_all(), where lru_add_drain_all() requires a short work-item to run on all online CPUs to complete. The spinning hmm_range_fault() prevents this work-item from running, causing the livelock.

The vulnerability requires specific conditions: both zone device and system memory folios must be considered in migrate_device_unmap(), the zone device folio must have an initial mapcount greater than 1, and the kernel must be running with no preemption or voluntary preemption only.

The issue was resolved by changing the code to wait for the folio to be unlocked if folio_trylock() fails in do_swap_page(), preventing the spinning and thus the livelock.


How can this vulnerability impact me?

This vulnerability can cause a livelock or starvation situation in the Linux kernel's memory management, where a process spins indefinitely waiting for a lock that cannot be released because the work needed to release it is starved.

The impact is that system resources could be tied up, potentially leading to degraded system performance or unresponsiveness in scenarios involving device memory migration.

However, the conditions required for this vulnerability to manifest are quite specific and unlikely, so the practical impact may be limited to certain workloads or test scenarios.

