CVE-2026-21869
Out-of-Bounds Write in llama.cpp Server Enables RCE
Publication date: 2026-01-08
Last updated on: 2026-02-02
Assigner: GitHub, Inc.
Description
CVSS Scores
EPSS Scores
Meta Information
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| ggml-org | llama.cpp | before 55d4206c8 (exclusive) |
| ggml | llama.cpp | * |
Helpful Resources
Exploitability
| CWE ID | Description |
|---|---|
| CWE-787 | The product writes data past the end, or before the beginning, of the intended buffer. |
Attack-Flow Graph
AI Powered Q&A
Can you explain this vulnerability to me?
CVE-2026-21869 is a critical vulnerability in the server component of llama.cpp. The server reads the 'n_discard' parameter from client JSON requests without checking that it is non-negative. When a negative value is supplied and the server's context is full, the functions that consume it operate on reversed memory ranges and write out of bounds. The resulting memory corruption can crash the server process or allow a remote attacker to execute arbitrary code. Both CPU and GPU builds are affected, and the vulnerable path is reachable only when the server is started with the '--context-shift' option. [1]
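The arithmetic behind the reversed range can be modeled in a few lines. The sketch below is a simplified illustration, not llama.cpp's code: the names `discard_range`, `n_keep`, and `checked_n_discard` are hypothetical, and it assumes the shift drops a half-open token range that starts after a kept prefix and spans `n_discard` tokens.

```python
def discard_range(n_keep: int, n_discard: int) -> tuple[int, int]:
    """Return the half-open range [start, end) that a shift would drop.

    Hypothetical model: `n_keep` tokens are preserved at the front and
    the next `n_discard` tokens are removed.
    """
    start = n_keep
    end = n_keep + n_discard
    return start, end


def is_reversed(start: int, end: int) -> bool:
    """A range whose end lies before its start is reversed."""
    return end < start


def checked_n_discard(value: int) -> int:
    """Reject negative values before they reach any range arithmetic."""
    if value < 0:
        raise ValueError("n_discard must be non-negative")
    return value


# With a negative n_discard the computed end precedes the start.
start, end = discard_range(n_keep=4, n_discard=-32)
print(start, end, is_reversed(start, end))
```

A check of this kind at the point where the request is parsed would reject the value before any memory range is computed from it.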
How can this vulnerability impact me?
This vulnerability can allow remote, unauthenticated attackers with network access to crash the llama.cpp server or potentially execute arbitrary code remotely. This can lead to denial of service or full compromise of the server running the vulnerable software, impacting confidentiality, integrity, and availability of the system. [1]
How can this vulnerability be detected on my network or system? Can you suggest some commands?
Server logs are the first place to look: entries that record a context shift with a negative n_discard value appear after the server receives an HTTP POST request to an endpoint such as /completions carrying a negative n_discard parameter (e.g., -32). If the server is built with AddressSanitizer (ASan) or a similar memory-checking tool, the out-of-bounds access also surfaces as an abort or an invalid-position error (e.g., pos_min == -1). On the network side, capture HTTP POST requests to the llama.cpp server's completion endpoints and inspect them for negative n_discard values. To test a server, send a crafted request with curl or a similar tool and watch the logs for anomalies or crashes, for example: curl -X POST http://<server>/completions -d '{"n_discard": -32, ...}' [1]
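Inspecting captured request bodies for a negative n_discard can be automated. The following is a minimal sketch that assumes the bodies have already been extracted as text by some capture tool; the function name `has_negative_n_discard` is hypothetical, and only the top-level `n_discard` key is examined.

```python
import json


def has_negative_n_discard(body: str) -> bool:
    """Return True if a JSON request body sets a negative top-level n_discard."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not valid JSON, so this check cannot classify it.
        return False
    if not isinstance(payload, dict):
        return False
    value = payload.get("n_discard")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return value < 0


# Illustrative bodies, not taken from real traffic.
samples = [
    '{"prompt": "hello", "n_discard": -32}',
    '{"prompt": "hello", "n_discard": 16}',
    '{"prompt": "hello"}',
]
for body in samples:
    print(has_negative_n_discard(body), body)
```

Malformed or non-object bodies are reported as not matching rather than raising, so the function can be run over a mixed capture without interruption.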
What immediate steps should I take to mitigate this vulnerability?
The immediate mitigation is to start the llama.cpp server without the --context-shift option, because the vulnerable code path is exercised only when context shifting is enabled. No patched version was available at the time of reporting. To reduce exposure further, restrict network access to trusted clients, add firewall rules that block unauthorized HTTP POST requests to the affected endpoints (/completions, /chat/completions, /slots/(resume)), and monitor logs for suspicious activity. Running test builds with a memory-checking tool such as AddressSanitizer can help surface exploitation attempts. Full remediation requires applying the patch or update once it is available. [1]
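Checking whether a deployment passes the risky option can be scripted. This is a rough sketch under stated assumptions: the launch command is available as a single string (for example from a service file), the function name `enables_context_shift` is hypothetical, and the sample command lines are illustrative only.

```python
import shlex


def enables_context_shift(command_line: str) -> bool:
    """Return True if the launch command passes the exact --context-shift flag."""
    # shlex.split tokenizes like a POSIX shell, so the flag is matched as a
    # whole argument rather than as a substring of another argument.
    return "--context-shift" in shlex.split(command_line)


# Illustrative command lines, not taken from a real deployment.
for cmd in (
    "llama-server -m model.gguf --context-shift",
    "llama-server -m model.gguf",
):
    print(enables_context_shift(cmd), cmd)
```

Matching whole tokens keeps an argument that merely contains the text, such as a file path, from producing a false positive.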