CVE-2025-38242
Unknown Unknown - Not Provided
BaseFortify

Publication date: 2025-07-09

Last updated on: 2025-11-19

Assigner: kernel.org

Description
In the Linux kernel, the following vulnerability has been resolved: mm: userfaultfd: fix race of userfaultfd_move and swap cache This commit fixes two kinds of races, they may have different results: Barry reported a BUG_ON in commit c50f8e6053b0, we may see the same BUG_ON if the filemap lookup returned NULL and folio is added to swap cache after that. If another kind of race is triggered (folio changed after lookup) we may see RSS counter is corrupted: [ 406.893936] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0 type:MM_ANONPAGES val:-1 [ 406.894071] BUG: Bad rss-counter state mm:ffff0000c5a9ddc0 type:MM_SHMEMPAGES val:1 Because the folio is being accounted to the wrong VMA. I'm not sure if there will be any data corruption though, seems no. The issues above are critical already. On seeing a swap entry PTE, userfaultfd_move does a lockless swap cache lookup, and tries to move the found folio to the faulting vma. Currently, it relies on checking the PTE value to ensure that the moved folio still belongs to the src swap entry and that no new folio has been added to the swap cache, which turns out to be unreliable. While working and reviewing the swap table series with Barry, following existing races are observed and reproduced [1]: In the example below, move_pages_pte is moving src_pte to dst_pte, where src_pte is a swap entry PTE holding swap entry S1, and S1 is not in the swap cache: CPU1 CPU2 userfaultfd_move move_pages_pte() entry = pte_to_swp_entry(orig_src_pte); // Here it got entry = S1 ... < interrupted> ... <swapin src_pte, alloc and use folio A> // folio A is a new allocated folio // and get installed into src_pte <frees swap entry S1> // src_pte now points to folio A, S1 // has swap count == 0, it can be freed // by folio_swap_swap or swap // allocator's reclaim. <try to swap out another folio B> // folio B is a folio in another VMA. <put folio B to swap cache using S1 > // S1 is freed, folio B can use it // for swap out with no problem. ... folio = filemap_get_folio(S1) // Got folio B here !!! ... < interrupted again> ... <swapin folio B and free S1> // Now S1 is free to be used again. <swapout src_pte & folio A using S1> // Now src_pte is a swap entry PTE // holding S1 again. folio_trylock(folio) move_swap_pte double_pt_lock is_pte_pages_stable // Check passed because src_pte == S1 folio_move_anon_rmap(...) // Moved invalid folio B here !!! The race window is very short and requires multiple collisions of multiple rare events, so it's very unlikely to happen, but with a deliberately constructed reproducer and increased time window, it can be reproduced easily. This can be fixed by checking if the folio returned by filemap is the valid swap cache folio after acquiring the folio lock. Another similar race is possible: filemap_get_folio may return NULL, but folio (A) could be swapped in and then swapped out again using the same swap entry after the lookup. In such a case, folio (A) may remain in the swap cache, so it must be moved too: CPU1 CPU2 userfaultfd_move move_pages_pte() entry = pte_to_swp_entry(orig_src_pte); // Here it got entry = S1, and S1 is not in swap cache folio = filemap_get ---truncated---
CVSS Scores
EPSS Scores
Probability:
Percentile:
Meta Information
Published
2025-07-09
Last Modified
2025-11-19
Generated
2026-05-07
AI Q&A
2025-07-09
EPSS Evaluated
2026-05-05
NVD
Affected Vendors & Products
Showing 5 associated CPEs
Vendor Product Version / Range
linux linux_kernel 6.16
linux linux_kernel 6.16
linux linux_kernel 6.16
linux linux_kernel From 5.15.160 (inc) to 5.16 (inc)
linux linux_kernel From 5.15.160 (inc) to 5.16 (inc)
Helpful Resources
Exploitability
CWE
CWE Icon
KEV
KEV Icon
CWE ID Description
CWE-362 The product contains a concurrent code sequence that requires temporary, exclusive access to a shared resource, but a timing window exists in which the shared resource can be modified by another code sequence operating concurrently.
Attack-Flow Graph
AI Powered Q&A
Can you explain this vulnerability to me?

This vulnerability in the Linux kernel involves a race condition in the userfaultfd subsystem related to moving swap cache entries. Specifically, there are timing issues when the system tries to move memory pages (folios) between virtual memory areas (VMAs) during swap operations. Due to these races, the system may incorrectly associate memory pages with the wrong VMAs, leading to corrupted RSS counters and potential kernel bugs. The race occurs because the code relies on checking page table entries (PTEs) without proper locking, which can result in using stale or invalid swap cache entries. Although the race window is very short and requires rare conditions, it can be reproduced with deliberate effort. The fix involves adding proper validation and locking when accessing swap cache folios to ensure correctness.


How can this vulnerability impact me? :

This vulnerability can cause kernel bugs such as corrupted RSS counters and BUG_ON errors, which may lead to system instability or crashes. While it is unclear if data corruption occurs, the incorrect accounting of memory pages to VMAs can affect system memory management reliability. Exploiting this race condition could potentially disrupt normal system operations, especially in environments with heavy memory swapping and userfaultfd usage.


How can this vulnerability be detected on my network or system? Can you suggest some commands?

This vulnerability may manifest as kernel BUG_ON messages related to bad rss-counter state, such as logs containing 'BUG: Bad rss-counter state mm:ffff...' indicating corrupted RSS counters. Detection involves monitoring kernel logs (e.g., using 'dmesg' or 'journalctl -k') for such BUG_ON messages. Specific commands to check kernel logs include: 'dmesg | grep -i bug', 'journalctl -k | grep -i bug', or monitoring for anomalies in memory management related logs. There are no specific network detection commands since this is a kernel memory management race condition.


What immediate steps should I take to mitigate this vulnerability?

Immediate mitigation involves updating the Linux kernel to a version where this race condition in userfaultfd_move and swap cache has been fixed. Since the issue is a race condition in kernel memory management, applying the patch or upgrading to the fixed kernel version is necessary. There are no configuration workarounds or temporary fixes mentioned.


Ask Our AI Assistant
Need more information? Ask your question to get an AI reply (Powered by our expertise)
0/70
EPSS Chart