CVE-2026-7141
Uninitialized Resource Vulnerability in vllm KV Block Handler
Publication date: 2026-04-27
Last updated on: 2026-05-01
Assigner: VulDB
Description
CVSS Scores
EPSS Scores
Meta Information
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| vllm | vllm | up to and including 0.19.0 |
Helpful Resources
Exploitability
| CWE ID | Description |
|---|---|
| CWE-908 | The product uses or accesses a resource that has not been initialized. |
Attack-Flow Graph
AI Powered Q&A
Can you explain this vulnerability to me?
CVE-2026-7141 is a vulnerability in the vLLM project affecting the KV (key-value) block handler, specifically in the function has_mamba_layers of the file vllm/v1/kv_cache_interface.py. The issue arises from improper management of KV cache blocks in the base scheduler, where recycled KV blocks are reused without clearing stale GPU memory. This leads to corrupted outputs and non-deterministic generation results when prefix caching is disabled.
The bug manifests as identical prompts producing different output sequences across runs because stale KV data is reused. It occurs under concurrent load, when multiple simultaneous requests cause KV blocks to be returned to the free pool and reallocated without zeroing, and it can allow NaN or Inf values to propagate through model operations, compromising correctness and stability.
The vulnerability is remotely exploitable, but attack complexity is high and exploitation is considered difficult. A patch has been released that zeroes recycled KV cache blocks before reuse, notably for FullAttention models and models with Mamba layers.
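To make the failure mode concrete, here is a minimal, self-contained Python sketch of a free-list of "KV blocks". The class and function names (`KVBlockPool`, `run_request`) are hypothetical illustrations, not vLLM's actual API; the sketch only shows why recycling blocks without zeroing leaks a previous request's data, and how zero-on-recycle (the patched behavior) prevents it.

```python
# Illustrative sketch only: a toy free-list of "KV blocks" showing why
# recycling without zeroing leaks stale data. All names are hypothetical
# and do not correspond to vLLM's real internals.

class KVBlockPool:
    def __init__(self, num_blocks: int, block_size: int, zero_on_recycle: bool):
        self.block_size = block_size
        self.zero_on_recycle = zero_on_recycle
        self.free = [[0.0] * block_size for _ in range(num_blocks)]

    def allocate(self):
        # Buggy behavior: when zero_on_recycle is False, the block may
        # still hold the previous request's key/value data.
        return self.free.pop()

    def release(self, block):
        if self.zero_on_recycle:
            # Patched behavior: clear stale KV data before the block
            # re-enters the free pool.
            block[:] = [0.0] * self.block_size
        self.free.append(block)


def run_request(pool, values):
    block = pool.allocate()
    # A fresh request should start from a clean block; any nonzero
    # entries here model the stale-data corruption.
    stale = [v for v in block if v != 0.0]
    block[: len(values)] = values
    pool.release(block)
    return stale


buggy = KVBlockPool(num_blocks=1, block_size=4, zero_on_recycle=False)
run_request(buggy, [1.0, 2.0])    # first request writes KV data
leak = run_request(buggy, [3.0])  # second request observes stale data
print(leak)                       # → [1.0, 2.0]

fixed = KVBlockPool(num_blocks=1, block_size=4, zero_on_recycle=True)
run_request(fixed, [1.0, 2.0])
print(run_request(fixed, [3.0])) # → []
```

The design point is simply that zeroing on release moves the cost to a single, well-defined place, so no allocation path can ever hand out a dirty block.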
How can this vulnerability impact me?
This vulnerability can impact users by causing non-deterministic and corrupted outputs from the vLLM model scheduler, which can undermine the reliability and correctness of AI model results.
In practical terms, if you rely on vLLM for AI model inference, especially in concurrent environments without prefix caching enabled, you may experience inconsistent or incorrect outputs due to stale KV cache data being reused.
Additionally, the presence of stale data can lead to propagation of invalid values (NaN or Inf) during model computations, potentially causing crashes or instability in applications using the affected models.
Because the exploit is public, attackers could potentially trigger these corrupted outputs remotely, although the attack complexity is high and exploitability is difficult.
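The NaN-propagation point can be seen with plain floating-point arithmetic: once a NaN from uninitialized memory enters a reduction, ordinary operations cannot remove it. This toy example (not vLLM code) shows a softmax-style computation poisoned by a single stale value.

```python
import math

# Minimal illustration (not vLLM code): a single NaN, standing in for an
# uninitialized KV entry, poisons the whole downstream computation.
stale = float("nan")
scores = [0.1, 0.7, stale, 0.2]

# A softmax-style reduction over the scores is corrupted by the NaN:
# exp(nan) is nan, the sum becomes nan, and every weight becomes nan.
total = sum(math.exp(s) for s in scores)
weights = [math.exp(s) / total for s in scores]

print(math.isnan(total))                     # → True
print(all(math.isnan(w) for w in weights))   # → True
```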
How can this vulnerability be detected on my network or system? Can you suggest some commands?
This vulnerability manifests as non-deterministic output from the vLLM base scheduler when prefix caching is disabled. Identical prompts produce different output sequences across runs, especially at temperature=0.
Detection involves reproducing the issue by running vLLM with a specified model and GPU memory utilization, then executing a repro script with JSON trace files that demonstrate output divergence patterns.
Specifically, fuzz testing was used to discover the bug, and it is reproducible 10/10 times across multiple independent traces without speculative decoding enabled.
To check for the issue in practice: start vLLM with prefix caching disabled (i.e., without passing --enable-prefix-caching), send requests with temperature=0, and replay the provided JSON traces with the repro.py script to observe the output divergence.
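If you capture the text each run produces (for example via an OpenAI-compatible client against your vLLM server), a small helper can flag the divergence described above. This is a hedged sketch: `divergence_report` and the sample outputs are hypothetical, and only the comparison logic is shown, not the vLLM calls themselves.

```python
from collections import Counter

def divergence_report(runs: list[str]) -> dict:
    """Summarize whether repeated greedy (temperature=0) runs of the same
    prompt produced identical text. More than one variant indicates the
    non-deterministic behavior described in this CVE."""
    counts = Counter(runs)
    return {
        "deterministic": len(counts) == 1,
        "variants": dict(counts),
    }

# Hypothetical captured outputs from three identical temperature=0 requests:
runs = [
    "The capital of France is Paris.",
    "The capital of France is Paris.",
    "The capital of France is Lyonjuice",  # corrupted run
]
report = divergence_report(runs)
print(report["deterministic"])  # → False
```

At temperature=0 a healthy server should be fully deterministic for a fixed prompt, so any `deterministic: False` result across enough repetitions is worth investigating.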
What immediate steps should I take to mitigate this vulnerability?
The recommended mitigation is to deploy the patch identified by commit 1ad67864c0c20f167929e64c875f5c28e1aad9fd.
This patch modifies the handling of recycled KV cache blocks to ensure they are zeroed before reuse, preventing stale key/value data leakage.
Specifically, the patch updates the property controlling KV cache zeroing to include FullAttention models and their subclasses, ensuring proper clearing of KV cache blocks.
Until the patch is applied, avoid running vLLM with prefix caching disabled and multiple concurrent requests that increase memory pressure, as these conditions exacerbate the issue.
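As a quick triage step, you could compare the installed version (for example via `python -c "import vllm; print(vllm.__version__)"`) against the affected range. The following is a minimal pure-Python sketch assuming plain "X.Y.Z" version strings; production code should prefer `packaging.version.Version`, which also handles pre-release suffixes.

```python
def parse_version(v: str) -> tuple[int, ...]:
    # Minimal parser for plain "X.Y.Z" strings only; it does not handle
    # pre-release or dev suffixes.
    return tuple(int(part) for part in v.split("."))

def is_affected(installed: str, last_vulnerable: str = "0.19.0") -> bool:
    # The advisory lists versions up to and including 0.19.0 as affected;
    # whether a given later release contains the fix commit should be
    # verified against the project's release notes.
    return parse_version(installed) <= parse_version(last_vulnerable)

print(is_affected("0.19.0"))  # → True
print(is_affected("0.20.0"))  # → False
```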
How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?
The provided information does not explicitly describe how the vulnerability CVE-2026-7141 affects compliance with common standards and regulations such as GDPR or HIPAA.