CVE-2025-40310
Unknown Unknown - Not Provided
BaseFortify

Publication date: 2025-12-08

Last updated on: 2025-12-08

Assigner: kernel.org

Description
In the Linux kernel, the following vulnerability has been resolved: amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw There is race in amdgpu_amdkfd_device_fini_sw and interrupt. if amdgpu_amdkfd_device_fini_sw run in b/w kfd_cleanup_nodes and kfree(kfd), and KGD interrupt generated. kernel panic log: BUG: kernel NULL pointer dereference, address: 0000000000000098 amdgpu 0000:c8:00.0: amdgpu: Requesting 4 partitions through PSP PGD d78c68067 P4D d78c68067 kfd kfd: amdgpu: Allocated 3969056 bytes on gart PUD 1465b8067 PMD @ Oops: @002 [#1] SMP NOPTI kfd kfd: amdgpu: Total number of KFD nodes to be created: 4 CPU: 115 PID: @ Comm: swapper/115 Kdump: loaded Tainted: G S W OE K RIP: 0010:_raw_spin_lock_irqsave+0x12/0x40 Code: 89 e@ 41 5c c3 cc cc cc cc 66 66 2e Of 1f 84 00 00 00 00 00 OF 1f 40 00 Of 1f 44% 00 00 41 54 9c 41 5c fa 31 cO ba 01 00 00 00 <fO> OF b1 17 75 Ba 4c 89 e@ 41 Sc 89 c6 e8 07 38 5d RSP: 0018: ffffc90@1a6b0e28 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018 0000000000000001 RSI: ffff8883bb623e00 RDI: 0000000000000098 ffff8883bb000000 RO8: ffff888100055020 ROO: ffff888100055020 0000000000000000 R11: 0000000000000000 R12: 0900000000000002 ffff888F2b97da0@ R14: @000000000000098 R15: ffff8883babdfo00 CS: 010 DS: 0000 ES: 0000 CRO: 0000000080050033 CR2: 0000000000000098 CR3: 0000000e7cae2006 CR4: 0000000002770ce0 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 0000000000000000 DR6: 00000000fffeO7FO DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> kgd2kfd_interrupt+@x6b/0x1f@ [amdgpu] ? amdgpu_fence_process+0xa4/0x150 [amdgpu] kfd kfd: amdgpu: Node: 0, interrupt_bitmap: 3 YcpxFl Rant tErace amdgpu_irq_dispatch+0x165/0x210 [amdgpu] amdgpu_ih_process+0x80/0x100 [amdgpu] amdgpu: Virtual CRAT table created for GPU amdgpu_irq_handler+0x1f/@x60 [amdgpu] __handle_irq_event_percpu+0x3d/0x170 amdgpu: Topology: Add dGPU node [0x74a2:0x1002] handle_irq_event+0x5a/@xcO handle_edge_irq+0x93/0x240 kfd kfd: amdgpu: KFD node 1 partition @ size 49148M asm_call_irq_on_stack+0xf/@x20 </IRQ> common_interrupt+0xb3/0x130 asm_common_interrupt+0x1le/0x40 5.10.134-010.a1i5000.a18.x86_64 #1
CVSS Scores
EPSS Scores
Probability:
Percentile:
Meta Information
Published
2025-12-08
Last Modified
2025-12-08
Generated
2026-05-07
AI Q&A
2025-12-08
EPSS Evaluated
2026-05-05
NVD
Affected Vendors & Products
Showing 1 associated CPE
Vendor Product Version / Range
amd amdgpu *
Helpful Resources
Exploitability
CWE
CWE Icon
KEV
KEV Icon
CWE ID Description
CWE-UNKNOWN
Attack-Flow Graph
AI Powered Q&A
Can you explain this vulnerability to me?

This vulnerability is a race condition in the Linux kernel's AMD GPU driver, specifically in the function amdgpu_amdkfd_device_fini_sw. The race occurs between this function and an interrupt when certain cleanup operations are happening, which can lead to a kernel panic due to a NULL pointer dereference.


How can this vulnerability impact me? :

This vulnerability can cause the Linux kernel to panic, leading to system crashes and potential denial of service. It affects systems using the AMD GPU driver, potentially causing instability or downtime.


How can this vulnerability be detected on my network or system? Can you suggest some commands?

This vulnerability manifests as a kernel panic related to a NULL pointer dereference in the amdgpu driver, specifically involving amdgpu_amdkfd_device_fini_sw and interrupts. Detection can be done by monitoring kernel logs for panic messages or oops reports containing references to amdgpu, kfd, or the specific error messages shown in the kernel panic log (e.g., 'BUG: kernel NULL pointer dereference', 'amdgpu_amdkfd_device_fini_sw', 'kfd: amdgpu'). Commands to check kernel logs include: 'dmesg | grep -i amdgpu', 'journalctl -k | grep -i amdgpu', or 'grep -i amdgpu /var/log/kern.log'. Additionally, monitoring for system crashes or reboots related to GPU activity may indicate the issue.


What immediate steps should I take to mitigate this vulnerability?

Immediate mitigation steps include updating the Linux kernel to a version where this race condition in amdgpu_amdkfd_device_fini_sw has been resolved. Until an update is applied, consider limiting or disabling workloads that heavily use the amdgpu driver or KFD nodes to reduce the chance of triggering the race condition. Monitoring system stability and avoiding operations that cause frequent device finalization or interrupts related to amdgpu may help reduce risk.


Ask Our AI Assistant
Need more information? Ask your question to get an AI reply (Powered by our expertise)
0/70
EPSS Chart