CVE-2025-49847
BaseFortify
Publication date: 2025-06-17
Last updated on: 2025-08-27
Assigner: GitHub, Inc.
Description
CVSS Scores
EPSS Scores
Meta Information
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| ggml | llama.cpp | all versions before b5662 |
Helpful Resources
Exploitability
| CWE ID | Description |
|---|---|
| CWE-119 | The product performs operations on a memory buffer, but it reads from or writes to a memory location outside the buffer's intended boundary. This may result in read or write operations on unexpected memory locations that could be linked to other variables, data structures, or internal program data. |
| CWE-195 | The product uses a signed primitive and performs a cast to an unsigned primitive, which can produce an unexpected value if the value of the signed primitive can not be represented using an unsigned primitive. |
Attack-Flow Graph
AI Powered Q&A
Can you explain this vulnerability to me?
CVE-2025-49847 is a critical buffer overflow vulnerability in the llama.cpp project. It occurs in the vocabulary-loading code when processing attacker-supplied GGUF model vocabularies. Specifically, the function handling token lengths casts a large unsigned size_t token length into a signed int32_t; the truncated, negative value slips past the length check, so the subsequent memcpy copies far more data than the destination buffer can hold, leading to memory corruption and potentially arbitrary code execution. [2]
How can this vulnerability impact me?
This vulnerability can lead to arbitrary memory corruption, application crashes (denial of service), and potentially remote code execution by overwriting heap metadata or control flow pointers. Any application using llama.cpp to load GGUF models from untrusted sources, such as inference servers or chatbots, is at risk. An attacker can exploit this by supplying a maliciously crafted model vocabulary, requiring only user interaction and no privileges. [2]
How can this vulnerability be detected on my network or system? Can you suggest some commands?
Detection of this vulnerability involves two things: identifying vulnerable llama.cpp builds (anything before b5662) and spotting attempts to load malicious GGUF model vocabularies containing oversized token lengths. Since the vulnerability triggers during GGUF model loading, crashes or abnormal behavior at load time in applications built on llama.cpp are a useful signal. Specific commands are not provided in the resources, but general approaches include:

1. Check application logs for crashes or memory-corruption errors during model loading.
2. Use network monitoring tools to detect suspicious GGUF model file transfers.
3. Verify the deployed llama.cpp build and confirm it is b5662 or later (recent llama.cpp binaries print their build with a `--version` flag; otherwise inspect the deployed software version directly).
4. In a controlled environment, use fuzz testing or custom scripts that load GGUF models with oversized token lengths to check whether the vulnerability triggers. [2]
What immediate steps should I take to mitigate this vulnerability?
Immediate mitigation steps include updating llama.cpp to version b5662 or later, which corrects the length check so that oversized token lengths are rejected before any copy takes place. If updating is not immediately possible, avoid loading GGUF models from untrusted or unverified sources. Additionally, monitor applications for crashes or abnormal behavior related to model loading, and watch the llama.cpp project repository for any further patches or workarounds. [1, 2]