CVE-2025-49847
BaseFortify

Publication date: 2025-06-17

Last updated on: 2025-08-27

Assigner: GitHub, Inc.

Description
llama.cpp provides inference of several LLM models in C/C++. Prior to version b5662, an attacker-supplied GGUF model vocabulary can trigger a buffer overflow in llama.cpp's vocabulary-loading code. Specifically, the helper _try_copy in llama.cpp/src/vocab.cpp: llama_vocab::impl::token_to_piece() casts a very large size_t token length to an int32_t, causing the length check (if (length < (int32_t)size)) to be bypassed. As a result, memcpy is still called with the oversized size, letting a malicious model overwrite memory beyond the intended buffer. This can lead to arbitrary memory corruption and potential code execution. This issue has been patched in version b5662.
Meta Information
Published: 2025-06-17
Last Modified: 2025-08-27
Generated: 2026-05-07
AI Q&A: 2025-06-17
EPSS Evaluated: 2026-05-05
Source: NVD
Affected Vendors & Products
Showing 1 associated CPE.
Vendor: ggml
Product: llama.cpp
Version / Range: up to b5662 (exclusive)
CWE
CWE-119: The product performs operations on a memory buffer, but it reads from or writes to a memory location outside the buffer's intended boundary. This may result in read or write operations on unexpected memory locations that could be linked to other variables, data structures, or internal program data.
CWE-195: The product uses a signed primitive and performs a cast to an unsigned primitive, which can produce an unexpected value if the value of the signed primitive cannot be represented using an unsigned primitive.
AI Powered Q&A
Can you explain this vulnerability to me?

CVE-2025-49847 is a critical buffer overflow vulnerability in the llama.cpp project. It occurs in the vocabulary-loading code when processing attacker-supplied GGUF model vocabularies. Specifically, the function handling token lengths casts a very large unsigned size_t token length to a signed int32_t; the truncated value defeats the length check, so memcpy is still invoked with the original oversized size and copies more data than the buffer can hold, leading to memory corruption and potentially arbitrary code execution. [2]


How can this vulnerability impact me?

This vulnerability can lead to arbitrary memory corruption, application crashes (denial of service), and potentially remote code execution by overwriting heap metadata or control flow pointers. Any application using llama.cpp to load GGUF models from untrusted sources, such as inference servers or chatbots, is at risk. An attacker can exploit this by supplying a maliciously crafted model vocabulary, requiring only user interaction and no privileges. [2]


How can this vulnerability be detected on my network or system? Can you suggest some commands?

Detection involves watching for the loading of malicious GGUF model vocabularies that declare oversized token lengths. Since the overflow occurs while llama.cpp loads a GGUF model, exploitation attempts often surface as crashes or abnormal behavior in applications that embed llama.cpp during model loading. Scanning for the llama.cpp build in use can also identify vulnerable instances (builds prior to b5662). Specific commands are not provided in the resources, but general approaches include: 1) Checking application logs for crashes or memory corruption during model loading. 2) Using network monitoring tools to detect suspicious GGUF model file transfers. 3) Verifying the llama.cpp build with a command such as `llama-cli --version`, or otherwise inspecting the deployed software version. 4) Loading GGUF models with oversized token lengths in a controlled environment (for example via fuzz testing or custom scripts) to confirm whether an instance is vulnerable. [2]


What immediate steps should I take to mitigate this vulnerability?

Immediate mitigation steps include updating llama.cpp to version b5662 or later, where the vulnerability has been patched by correcting the length check to prevent integer overflow and buffer overflow. If updating is not immediately possible, avoid loading GGUF models from untrusted or unverified sources to prevent exploitation. Additionally, monitor applications for crashes or abnormal behavior related to model loading and consider applying any available patches or workarounds from the llama.cpp project repository. [1, 2]

