CVE-2026-27940
Integer Overflow in llama.cpp gguf.cpp Causes Heap Buffer Overflow
Publication date: 2026-03-12
Last updated on: 2026-04-28
Assigner: GitHub, Inc.
Description
Meta Information
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| ggml | llama.cpp | all versions before b8146 (fixed in b8146) |
Exploitability
| CWE ID | Description |
|---|---|
| CWE-122 | A heap overflow condition is a buffer overflow, where the buffer that can be overwritten is allocated in the heap portion of memory, generally meaning that the buffer was allocated using a routine such as malloc(). |
| CWE-190 | The product performs a calculation that can produce an integer overflow or wraparound when the logic assumes that the resulting value will always be larger than the original value. This occurs when an integer value is incremented to a value that is too large to store in the associated representation. When this occurs, the value may become a very small or negative number. |
AI Powered Q&A
Can you explain this vulnerability to me?
CVE-2026-27940 is a high-severity heap buffer overflow vulnerability in the function gguf_init_from_file_impl() within the llama.cpp project. It occurs due to an integer overflow in the calculation of the memory size (mem_size) needed for heap allocation when parsing GGUF model files.

Specifically, the vulnerability arises because the final addition in the mem_size calculation lacks an overflow check. When ctx->size is near the maximum size_t value, adding the tensor overhead causes the value to wrap around, resulting in a much smaller allocation than required.

This leads to an undersized heap allocation, and the subsequent fread() call writes attacker-controlled data beyond the allocated buffer boundary, causing a heap buffer overflow.

An attacker can craft a malicious GGUF file with large tensors to trigger this overflow, corrupt heap metadata, bypass heap integrity checks in glibc's tcache allocator, and ultimately achieve arbitrary code execution, such as spawning a root shell.

This vulnerability also bypasses a previous fix (CVE-2025-53630), which only checked the intermediate additions and missed the final addition in the mem_size calculation.
How can this vulnerability impact me?
This vulnerability can have severe impacts including heap corruption, application crashes, and arbitrary code execution.
An attacker with local access can exploit this vulnerability by providing a specially crafted GGUF model file to the vulnerable llama.cpp functions.
Successful exploitation allows the attacker to bypass heap integrity checks, corrupt heap metadata, and hijack control flow to execute arbitrary code with the privileges of the affected process.
This can lead to full system compromise, including spawning a root shell, resulting in high confidentiality, integrity, and availability impacts.
Attack complexity is low and no privileges are required, but user interaction is needed to trigger the vulnerability.
How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?
I don't know
How can this vulnerability be detected on my network or system? Can you suggest some commands?
This vulnerability arises from processing crafted GGUF model files with the vulnerable `gguf_init_from_file_impl()` function in llama.cpp. Detection involves monitoring for crashes or abnormal behavior when loading GGUF files, especially SIGSEGV or SIGABRT signals caused by heap corruption.
Since the vulnerability is triggered locally by loading malicious GGUF files, network detection is limited. However, you can detect exploitation attempts by monitoring for crashes or suspicious process behavior related to llama.cpp tools such as `llama-quantize`, `llama-imatrix`, and `llama-gguf`.
To detect potential exploitation or to test for the vulnerability, run the affected tools against suspect GGUF files and watch for crashes or abnormal termination signals:
- Use `strace` or `ltrace` to monitor system calls and library calls during model loading to detect abnormal behavior.
- Run `llama-quantize` or `llama-gguf` with suspicious or untrusted GGUF files and check for segmentation faults or abort signals.
- Use `dmesg` or system logs to identify heap corruption or memory errors related to these processes.
- Monitor process crashes with commands like `journalctl -xe` or `tail -f /var/log/syslog` on Linux systems.
What immediate steps should I take to mitigate this vulnerability?
The primary mitigation is to update llama.cpp to version b8146 or later, where the integer overflow in `gguf_init_from_file_impl()` is fixed.
Until the update is applied, avoid loading untrusted or crafted GGUF model files with vulnerable tools such as `llama-quantize`, `llama-imatrix`, and `llama-gguf`.
Restrict access to systems running vulnerable versions to trusted users only, as exploitation requires local user interaction.
Monitor for unusual crashes or behavior in llama.cpp related processes and investigate immediately.