CVE-2026-40682
XML External Entity Injection in Apache OpenNLP
Publication date: 2026-05-04
Last updated on: 2026-05-06
Assigner: Apache Software Foundation
Description
CVSS Scores
EPSS Scores
Meta Information
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| apache | opennlp | before 2.5.9 (exclusive) |
| apache | opennlp | 3.0.0 |
Helpful Resources
Exploitability
| CWE ID | Description |
|---|---|
| CWE-611 | The product processes an XML document that can contain XML entities with URIs that resolve to documents outside of the intended sphere of control, causing the product to embed incorrect documents into its output. |
AI Powered Q&A
How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?
The vulnerability allows an attacker to supply a crafted dictionary file containing a malicious DOCTYPE declaration, which can lead to local file disclosure or server-side request forgery during XML parsing. If the disclosed files or the internal services reached this way hold personal or protected data, that data may be exposed to the attacker.
Such unauthorized data disclosure or access could impact compliance with data protection regulations like GDPR or HIPAA, which require safeguarding personal and sensitive information against unauthorized access or breaches.
Mitigations include upgrading to fixed versions or validating input to reject XML containing DOCTYPE declarations, which helps maintain compliance by preventing exploitation.
Can you explain this vulnerability to me?
This vulnerability is an XML External Entity (XXE) injection flaw in the Apache OpenNLP DictionaryEntryPersistor class. The class initializes an XML parser without enabling secure processing features or disabling DTD processing. As a result, when the class parses a dictionary file containing a malicious DOCTYPE declaration, external entities referenced in that declaration are resolved, which lets an attacker read local files or force the server to issue requests (server-side request forgery).
Specifically, the vulnerability arises because the XMLReader used only has namespace support enabled, but external entity resolution and DOCTYPE declarations remain enabled, allowing crafted dictionary files to trigger these attacks before any dictionary entries are processed.
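For contrast, a parser configured to refuse DOCTYPE declarations and external entities might look like the sketch below. This is an illustration, not the actual OpenNLP fix: the class HardenedXmlReader and its methods are hypothetical, and the feature URIs are the ones recognized by the Xerces-based parser bundled with the JDK, so other SAX implementations may not accept them.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Apache OpenNLP.
final class HardenedXmlReader {

    /** Builds a namespace-aware XMLReader that rejects DOCTYPE declarations. */
    static XMLReader create() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Xerces feature: any DOCTYPE declaration becomes a fatal parse error.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth: never resolve external general or parameter entities.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return factory.newSAXParser().getXMLReader();
    }

    /** Returns true if the hardened reader parses the document, false if it rejects it. */
    static boolean accepts(String xml) throws Exception {
        XMLReader reader = create();
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler); // DefaultHandler rethrows fatal errors
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

With this configuration the parser fails at the DOCTYPE declaration itself, before any entity is resolved, so the referenced file or URL is never fetched.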
How can this vulnerability impact me?
This vulnerability can lead to serious security impacts including local file disclosure, where sensitive files on the server can be read by an attacker, and server-side request forgery (SSRF), where the server can be tricked into making unauthorized HTTP requests to internal or external systems.
These impacts can compromise confidentiality and potentially allow attackers to gather sensitive information or perform unauthorized actions within the affected system.
What immediate steps should I take to mitigate this vulnerability?
To mitigate this vulnerability, users should upgrade Apache OpenNLP to version 2.5.9 if using the 2.x branch, or to 3.0.0-M3 if using the 3.x branch.
If immediate upgrade is not possible, ensure that all dictionary files are sourced from trusted origins.
Additionally, consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser.
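One way to build such a wrapper is sketched below. It assumes the dictionary is small enough to buffer in memory, and the class name DoctypeGuard and method requireNoDoctype are hypothetical. Instead of searching the raw bytes for the text of a DOCTYPE declaration, which an unusual character encoding could evade, the sketch pre-parses the input with a parser that treats any DOCTYPE as a fatal error. The feature URI is the one recognized by the JDK's bundled Xerces-based parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical guard for illustration; not part of Apache OpenNLP.
final class DoctypeGuard {

    /**
     * Buffers the stream, checks it with a parser that forbids DOCTYPE
     * declarations, and returns a fresh stream over the same bytes.
     * Throws IOException if the XML contains a DOCTYPE or is malformed.
     */
    static InputStream requireNoDoctype(InputStream in) throws IOException {
        byte[] bytes = in.readAllBytes(); // requires Java 9 or later
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Xerces feature: any DOCTYPE declaration becomes a fatal parse error.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.newSAXParser().parse(new ByteArrayInputStream(bytes), new DefaultHandler());
        } catch (SAXException | ParserConfigurationException e) {
            throw new IOException("Dictionary XML rejected: " + e.getMessage(), e);
        }
        return new ByteArrayInputStream(bytes);
    }
}
```

The returned stream could then be passed to the constructor in place of the original, for example as new Dictionary(DoctypeGuard.requireNoDoctype(in)). Note that this guard also rejects malformed XML, which the vulnerable parser would have failed on anyway.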
How can this vulnerability be detected on my network or system? Can you suggest some commands?
Exploitation depends on a crafted XML dictionary file, so the most direct check is to scan the dictionary files that Apache OpenNLP loads for DOCTYPE declarations or external entity references before they are parsed. Either one is an indicator of a possible XML External Entity (XXE) attempt.
- Use grep or similar tools to search for DOCTYPE declarations in dictionary files, for example: grep -i '<!DOCTYPE' /path/to/dictionaries/*
- Check for external entity declarations and references such as file://, http://, or https:// in XML files: grep -E '<!ENTITY|file://|https?://' /path/to/dictionaries/*
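Where grep is unavailable, or the check needs to run inside a Java build or deployment step, the same scan could be sketched as follows. The class DictionaryScanner and its method are hypothetical. Like the grep commands, this is a plain text match: it assumes an ASCII-compatible file encoding and would miss a payload stored as, for example, UTF-16.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical scanner for illustration; not part of Apache OpenNLP.
final class DictionaryScanner {

    // XML keywords are case-sensitive, so an exact match is sufficient.
    private static final Pattern SUSPICIOUS = Pattern.compile("<!DOCTYPE|<!ENTITY");

    /** Returns every regular file under dir that contains a DOCTYPE or ENTITY declaration. */
    static List<Path> findSuspicious(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .filter(DictionaryScanner::looksSuspicious)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    private static boolean looksSuspicious(Path file) {
        try {
            // ISO-8859-1 maps every byte to a character, so decoding never fails.
            String text = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
            return SUSPICIOUS.matcher(text).find();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A flagged file is not proof of an attack, only a reason to inspect it, since a DOCTYPE declaration can be benign.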
Additionally, monitoring network traffic for unexpected outbound HTTP requests originating from the application during dictionary parsing may help detect server-side request forgery attempts triggered by this vulnerability.
No specific commands for runtime detection or network scanning are provided in the available resources.