CVE-2026-44223
Modified Modified - Updated After Analysis
BaseFortify

Publication date: 2026-05-12

Last updated on: 2026-05-15

Assigner: GitHub, Inc.

Description
vLLM is an inference and serving engine for large language models (LLMs). From to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., "repetition_penalty": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0.
CVSS Scores
EPSS Scores
Probability:
Percentile:
Meta Information
Published
2026-05-12
Last Modified
2026-05-15
Generated
2026-06-10
EPSS Evaluated
2026-06-08
NVD
Affected Vendors & Products
Showing 1 associated CPE
Vendor Product Version / Range
vllm vllm From 0.18.0 (inc) to 0.20.0 (exc)
Helpful Resources
Exploitability
CWE
CWE Icon
KEV
KEV Icon
CWE ID Description
CWE-704 The product does not correctly convert an object, resource, or structure from one type to a different type.
CWE-131 The product does not correctly calculate the size to be used when allocating a buffer, which could lead to a buffer overflow.
Attack-Flow Graph
AI Quick Actions
Instant insights powered by AI
AI Quick Actions have not been generated yet.
Chat Assistant
Ask questions about this CVE
Hi! I’m here to help you understand CVE-2026-44223. Ask me anything about the vulnerability, its impact, or mitigation strategies.
0/70
EPSS Chart