CVE-2025-68174
BaseFortify
Publication date: 2025-12-16
Last updated on: 2025-12-18
Assigner: kernel.org
Description
Meta Information
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| amd | amdgpu | * |
Exploitability
| CWE ID | Description |
|---|---|
| CWE-UNKNOWN | |
AI Powered Q&A
Can you explain this vulnerability to me?
This vulnerability is a race condition in the amdkfd component of the Linux kernel's AMD GPU driver (amdgpu). The partition-switch logic decides whether a switch is safe by checking only whether kfd_processes_table is empty. However, a process's table entry is deleted in one function while the process itself is torn down in another, so the table can be empty while a teardown is still in progress. A partition switch that starts in that window tears down shared resources that the pending process teardown is still accessing, which can cause kernel errors or crashes. The fix adds an atomic counter that tracks kfd processes more accurately than the table check alone and so prevents the race.
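The counting idea behind the fix can be sketched in user space. The Python model below is only an illustration, not the kernel patch: the class and method names (`ProcessTracker`, `create_process`, `remove_table_entry`, `finish_teardown`, `can_switch_partition`) are hypothetical, and a `threading.Lock` stands in for the kernel's atomic counter. It shows why an empty table is not a sufficient check when entry deletion and teardown are separate steps.

```python
import threading


class ProcessTracker:
    """Toy model of table-plus-counter bookkeeping (all names hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for an atomic counter
        self._table = set()            # models the process table
        self._live_count = 0           # models the added process counter

    def create_process(self, pid):
        # A new process is entered in the table and counted.
        with self._lock:
            self._table.add(pid)
            self._live_count += 1

    def remove_table_entry(self, pid):
        # First stage: the entry leaves the table, teardown is still pending.
        with self._lock:
            self._table.discard(pid)

    def finish_teardown(self):
        # Second stage: resources are released; only now does the count drop.
        with self._lock:
            self._live_count -= 1

    def table_is_empty(self):
        # The original, insufficient check.
        with self._lock:
            return not self._table

    def can_switch_partition(self):
        # Stricter check: no process is alive or still being torn down.
        with self._lock:
            return self._live_count == 0
```

Between `remove_table_entry` and `finish_teardown`, `table_is_empty()` already returns true while `can_switch_partition()` still returns false. That interval is the window in which the race described above can occur.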
How can this vulnerability impact me?
This race condition can crash the kernel or leave it unstable on systems that use the AMD GPU driver in the Linux kernel. The practical consequences are denial of service and, if the system crashes, possible data loss.
How can this vulnerability be detected on my network or system? Can you suggest some commands?
Check the kernel log (dmesg) for error messages related to the amdgpu kernel module, such as 'divide error', references to 'kfd_process_wq_release', or stack traces involving amdgpu functions. The command 'dmesg | grep -i amdgpu' lists candidate messages. A match suggests that the race may have been triggered; the absence of matches does not show that a kernel is patched.
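The same log check can be scripted. The sketch below is a minimal example, assuming the indicator strings quoted above are the ones worth matching; the function names `scan_kernel_log` and `read_dmesg` are made up for this illustration. Like the grep command, it also returns harmless amdgpu messages, so every hit still needs manual review.

```python
import subprocess

# Indicator strings taken from the answer above; matching is case-insensitive.
INDICATORS = ("divide error", "kfd_process_wq_release")


def scan_kernel_log(text):
    """Return log lines that mention amdgpu or one of the indicator strings."""
    hits = []
    for line in text.splitlines():
        lowered = line.lower()
        if "amdgpu" in lowered or any(s in lowered for s in INDICATORS):
            hits.append(line)
    return hits


def read_dmesg():
    """Capture the kernel ring buffer; this may require root on some systems."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

Calling `scan_kernel_log(read_dmesg())` on the affected host returns the matching lines. Keeping the scan separate from the capture also allows it to run on saved log files.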
What immediate steps should I take to mitigate this vulnerability?
Update the Linux kernel to a version that includes the patch for this race condition in the amdkfd driver. The fix adds an atomic kfd process counter and adjusts process teardown, so it requires applying the kernel update or patch from your vendor. Until the update is applied, monitoring for crashes and avoiding workloads that trigger the race condition may reduce the impact.
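Before planning the update, it can help to confirm that a host actually runs the amdgpu module and to record its kernel release. The following sketch assumes a Linux host with `/proc/modules` available; the function names `amdgpu_loaded` and `report` are hypothetical. This entry does not state which kernel releases contain the fix, so the printed release has to be compared against your vendor's advisory.

```python
import platform


def amdgpu_loaded(proc_modules_text):
    """Return True if a module named 'amdgpu' appears in /proc/modules content."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0] == "amdgpu":
            return True
    return False


def report():
    """Print the running kernel release and whether amdgpu is loaded."""
    with open("/proc/modules", encoding="utf-8") as f:
        loaded = amdgpu_loaded(f.read())
    print(f"kernel release: {platform.release()}")
    print(f"amdgpu loaded: {loaded}")
```

A driver that is built into the kernel rather than loaded as a module does not appear in `/proc/modules`, so a negative result from `report()` is not conclusive on its own.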