CVE-2026-21869
Out-of-Bounds Write in llama.cpp Server Enables RCE

Publication date: 2026-01-08

Last updated on: 2026-02-02

Assigner: GitHub, Inc.

Description
llama.cpp provides inference for several LLM models in C/C++. In commits 55d4206c8 and prior, the llama.cpp server's completion endpoints parse the n_discard parameter directly from JSON input without validating that it is non-negative. When a negative value is supplied and the context fills up, llama_memory_seq_rm/add receive a reversed range and a negative offset, causing out-of-bounds memory writes in the token evaluation loop. This deterministic memory corruption can crash the process or enable remote code execution (RCE). There is no fix at the time of publication.
Meta Information
Published: 2026-01-08
Last Modified: 2026-02-02
Generated: 2026-05-07
AI Q&A: 2026-01-08
EPSS Evaluated: 2026-05-05
NVD
Affected Vendors & Products
Showing 2 associated CPEs

Vendor | Product | Version / Range
ggml-org | llama.cpp | to 55d4206c8 (exc)
ggml | llama.cpp | *
CWE

CWE ID | Description
CWE-787 | The product writes data past the end, or before the beginning, of the intended buffer.
AI Powered Q&A
Can you explain this vulnerability to me?

CVE-2026-21869 is a critical vulnerability in the llama.cpp project's server component where a parameter called 'n_discard' is taken from client JSON requests without checking if it is negative. If a negative value is used and the server's context is full, this causes functions to operate on reversed memory ranges, leading to out-of-bounds memory writes. This memory corruption can crash the server process or allow remote attackers to execute arbitrary code. The vulnerability affects both CPU and GPU builds and requires the server to be started with the '--context-shift' option enabled. [1]
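
The reversed range is easiest to see with concrete numbers. The Python sketch below is a simplified model, not llama.cpp's actual code: the names `shift_ranges`, `is_valid_n_discard`, `n_past`, and `n_keep` are hypothetical, and it assumes a context shift removes `n_discard` positions after the first `n_keep` and then slides the remaining positions back.

```python
def shift_ranges(n_past, n_keep, n_discard):
    """Model the two position ranges a context shift would operate on.

    Assumption (illustrative only): positions [n_keep, n_keep + n_discard)
    are removed, and positions [n_keep + n_discard, n_past) are moved back.
    """
    remove_range = (n_keep, n_keep + n_discard)
    move_range = (n_keep + n_discard, n_past)
    return remove_range, move_range


def is_valid_n_discard(n_past, n_keep, n_discard):
    """Hypothetical bounds check: non-negative and within the shiftable span."""
    return 0 <= n_discard <= n_past - n_keep


# A well-formed request: the removal range runs forward.
print(shift_ranges(512, 4, 32))

# A negative n_discard: the end of the removal range precedes its start.
print(shift_ranges(512, 4, -32))
print(is_valid_n_discard(512, 4, -32))
```

With `n_past=512`, `n_keep=4`, and `n_discard=-32`, the modelled removal range is `(4, -28)`: its end precedes its start and is itself negative. A check along the lines of `is_valid_n_discard` is the kind of validation the advisory describes as missing.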


How can this vulnerability impact me?

This vulnerability can allow remote, unauthenticated attackers with network access to crash the llama.cpp server or potentially execute arbitrary code remotely. This can lead to denial of service or full compromise of the server running the vulnerable software, impacting confidentiality, integrity, and availability of the system. [1]


How can this vulnerability be detected on my network or system? Can you suggest some commands?

Signs of exploitation can appear in server logs as context-shift entries with a negative n_discard value, logged after an HTTP POST request to a completion endpoint such as /completions that carries a negative n_discard parameter (e.g., -32). If the server is built with AddressSanitizer (ASan) or a similar memory-checking tool, an attack may also surface as an abort or an out-of-bounds access error, such as an invalid position error (e.g., pos_min == -1). On the network side, capture HTTP POST requests to the llama.cpp server's completion endpoints and inspect them for negative n_discard values. To check how a server you control behaves, send a crafted request with curl or a similar tool, for example: curl -X POST http://<server>/completions -d '{"n_discard": -32, ...}', then watch the server logs for anomalies or crashes. [1]
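
Inspecting captured request bodies can be automated. The following is a minimal sketch, assuming you already have each POST body as a string (for example from a proxy log or packet capture); the function name `has_negative_n_discard` and the `prompt` field in the sample are illustrative, not part of any llama.cpp tooling.

```python
import json


def has_negative_n_discard(body: str) -> bool:
    """Return True if body is a JSON object with a negative numeric n_discard."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False  # not JSON, so nothing to flag here
    if not isinstance(payload, dict):
        return False
    value = payload.get("n_discard")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return value < 0


# Sample bodies standing in for captured traffic.
captured = [
    '{"prompt": "hello", "n_discard": -32}',
    '{"prompt": "hello", "n_discard": 16}',
    '{"prompt": "hello"}',
]
for body in captured:
    print(has_negative_n_discard(body), body)
```

Only the first sample body is flagged. A real deployment would feed this function from whatever capture source is available and alert on any match.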


What immediate steps should I take to mitigate this vulnerability?

The vulnerability requires the --context-shift option, so the most direct mitigation is to start the llama.cpp server without it. No patched version was available at the time of reporting, and leaving context shifting disabled keeps the vulnerable code path from being exercised. To reduce exposure further, restrict network access to trusted clients, use firewall rules to block unauthorized HTTP POST requests to the affected endpoints (/completions, /chat/completions, /slots/(resume)), and monitor logs for suspicious activity. Running test builds with memory-checking tools such as AddressSanitizer can help detect exploitation attempts. Full remediation requires applying a patch or update once one is available. [1]
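
Where requests already pass through a proxy or gateway that you control, the blocking rule can be expressed as a small decision function. This is a sketch under stated assumptions, not a vetted filter: `should_block` and `AFFECTED_SUFFIXES` are hypothetical names, paths are matched by suffix, and the /slots/(resume) endpoint is left out because its exact path form is not given here.

```python
import json

# Path suffixes taken from the endpoints named above; adjust for your deployment.
AFFECTED_SUFFIXES = ("/completions", "/chat/completions")


def should_block(method: str, path: str, body: str) -> bool:
    """Decide whether a front-end filter should reject this request."""
    if method.upper() != "POST":
        return False
    # Ignore any query string and trailing slash before matching the path.
    clean_path = path.split("?", 1)[0].rstrip("/")
    if not clean_path.endswith(AFFECTED_SUFFIXES):
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return True  # fail closed on bodies the filter cannot parse
    if not isinstance(payload, dict) or "n_discard" not in payload:
        return False
    value = payload["n_discard"]
    # Allow only plain non-negative integers; bool is excluded on purpose.
    is_plain_int = isinstance(value, int) and not isinstance(value, bool)
    return not (is_plain_int and value >= 0)


print(should_block("POST", "/completions", '{"n_discard": -32}'))
print(should_block("POST", "/chat/completions", '{"n_discard": 16}'))
```

The function uses an allowlist (only non-negative integers pass) rather than looking solely for negative numbers, so string or fractional values are rejected as well. A filter like this reduces exposure but does not replace disabling --context-shift or applying a fix.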

