CVE-2026-26831
Received - Intake
OS Command Injection in textract Extractors via Unsanitized File Paths

Publication date: 2026-03-25

Last updated on: 2026-03-30

Assigner: MITRE

Description
textract through 2.5.0 is vulnerable to OS Command Injection via the file path parameter in multiple extractors. When a file with a malicious filename is processed, filePath is passed directly to child_process.exec() in lib/extractors/doc.js, rtf.js, dxf.js, images.js, and lib/util.js without adequate sanitization.
Meta Information
Published: 2026-03-25
Last Modified: 2026-03-30
Generated: 2026-05-07
AI Q&A: 2026-03-25
EPSS Evaluated: 2026-05-05
Affected Vendors & Products
Vendor | Product | Version / Range
dbashford | textract | up to and including 2.5.0
CWE
CWE ID | Description
CWE-94 | The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.
CWE-78 | The product constructs all or part of an OS command using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the intended OS command when it is sent to a downstream component.
AI Powered Q&A
Can you explain this vulnerability to me?

This vulnerability exists in textract versions up to 2.5.0 and is an OS Command Injection issue. It occurs because the file path parameter used in multiple extractors is passed directly to the child_process.exec() function without proper sanitization. This happens in several files including lib/extractors/doc.js, rtf.js, dxf.js, images.js, and lib/util.js. If a file with a malicious filename is processed, it can lead to execution of arbitrary operating system commands.
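
As a minimal sketch of the failure mode described above (not textract's actual source), the snippet below builds a command string the way a naive extractor might. The `buildCommand` helper and the exact `antiword` invocation are assumptions made for illustration. The code only constructs and prints strings; it never executes them.

```javascript
// Hypothetical helper (an assumption for illustration, not code from textract):
// concatenates a file path into a double-quoted shell command string.
function buildCommand(filePath) {
  return 'antiword "' + filePath + '"';
}

const benign = buildCommand("report.doc");
const hostile = buildCommand('test";whoami;".doc');

console.log(benign);  // antiword "report.doc"
console.log(hostile); // antiword "test";whoami;".doc"
```

If the second string were handed to child_process.exec(), the shell would parse it as three commands separated by semicolons: `antiword "test"`, `whoami`, and `".doc"`. The embedded double quote closes the quoting that was meant to contain the path.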


How can this vulnerability impact me?

The vulnerability can allow an attacker to execute arbitrary OS commands on the system where textract is running. This can lead to unauthorized access, data theft, system compromise, or disruption of services depending on the privileges of the process running textract.


How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?

CVE-2026-26831 is an OS command injection vulnerability in the textract library that allows attackers who control file paths during text extraction to execute arbitrary OS commands. This can lead to unauthorized access to, or modification or destruction of, sensitive data.

Such unauthorized access and potential data breaches can directly impact compliance with common standards and regulations like GDPR and HIPAA, which mandate strict controls over data confidentiality, integrity, and availability.

Exploitation of this vulnerability could result in exposure or alteration of personal or protected health information, thereby violating regulatory requirements and potentially leading to legal and financial consequences.

Mitigations such as avoiding passing attacker-controlled file paths, removing unsafe shell command concatenations, or migrating to safer extraction solutions are critical to maintain compliance.


How can this vulnerability be detected on my network or system? Can you suggest some commands?

This vulnerability arises from the textract library passing unsanitized file path parameters directly into shell commands via child_process.exec(), allowing OS command injection if an attacker controls the file name or path.

To detect exploitation attempts or the presence of this vulnerability on your system or network, monitor command executions spawned on behalf of textract, especially commands whose file names contain shell metacharacters such as semicolons (;), backticks (`), or command substitutions ($()).

Suggested detection methods include:

  • Monitor process execution logs or audit logs for textract-related commands invoking antiword, unrtf, or other extractors with suspicious file paths.
  • Use system command-line tools to look for suspicious running processes or file names, for example:
      • On Linux, check running processes with: `ps aux | grep textract` or `ps aux | grep antiword`
      • Search shell history for manually run extractor commands: `grep -E 'antiword|unrtf' ~/.bash_history`. Commands that textract spawns through child_process.exec() are not written to shell history, so this check complements process auditing and does not replace it.
      • Search the input directory for file names containing shell metacharacters, e.g., ``find /path/to/textract/input -name '*;*' -o -name '*`*' -o -name '*$(*'``
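
The file-name check can also be scripted in Node.js. The following is a rough sketch, assuming a non-exhaustive metacharacter set and two hypothetical helpers, `hasShellMetachars` and `findSuspiciousFiles`, neither of which is part of textract.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Illustrative, non-exhaustive set of characters a POSIX shell treats specially.
const SHELL_METACHARS = /[;&|`$(){}<>"'\\\n]/;

// Returns true if the file name contains any character from the set above.
function hasShellMetachars(fileName) {
  return SHELL_METACHARS.test(fileName);
}

// Recursively collects paths under `dir` whose own name contains a metacharacter.
function findSuspiciousFiles(dir) {
  const hits = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (hasShellMetachars(entry.name)) hits.push(full);
    if (entry.isDirectory()) hits.push(...findSuspiciousFiles(full));
  }
  return hits;
}
```

Calling `findSuspiciousFiles("/path/to/textract/input")` returns an array of matching paths. A hit shows only that a risky file name exists, not that it was exploited.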

Additionally, in an isolated test environment, you can create a test file with a malicious filename such as `test";whoami;".doc` and check whether textract executes the injected command, which would confirm the vulnerability.


What immediate steps should I take to mitigate this vulnerability?

Immediate mitigation steps for this OS command injection vulnerability in textract (up to and including version 2.5.0) include:

  • Avoid processing files with attacker-controlled or untrusted filenames or file paths using the vulnerable textract versions.
  • If you maintain code using textract, remove or disable the use of textract until a fixed version is released.
  • Modify the textract source code to eliminate shell string concatenation in the vulnerable files (lib/extractors/doc.js, rtf.js, dxf.js, images.js, and lib/util.js): replace child_process.exec() calls with child_process.execFile() or child_process.spawn(), which take arguments as an array and do not invoke a shell by default, so metacharacters in a file path are not interpreted.
  • Sanitize or validate all file path inputs rigorously to remove or escape shell metacharacters before passing them to any shell commands.
  • Consider migrating to a maintained and secure text extraction library or solution that does not have this vulnerability.
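
Two of the steps above can be sketched in Node.js: passing the path as an argument array, and staging untrusted files under a generated name. This is an illustration under stated assumptions, not a patch for textract. The helper names `runExtractor` and `stagedName` and the extension allowlist are hypothetical, and the demo uses the Node binary itself as a stand-in for antiword or unrtf so that it runs without those tools installed.

```javascript
const { execFile, execFileSync } = require("node:child_process");
const crypto = require("node:crypto");
const path = require("node:path");

// Hypothetical async wrapper: the path travels as a single argv entry and is
// never parsed by a shell. Resolving to an absolute path also prevents a name
// that starts with "-" from being read as a command-line option.
function runExtractor(binary, filePath, callback) {
  return execFile(binary, [path.resolve(filePath)], { timeout: 30000 }, callback);
}

// Illustrative allowlist (an assumption); adjust to the formats actually handled.
const ALLOWED_EXTENSIONS = new Set([".doc", ".rtf", ".dxf", ".png", ".jpg"]);

// Hypothetical helper: derives a random on-disk name so that the
// attacker-chosen name never reaches the extractor at all.
function stagedName(originalName) {
  const ext = path.extname(originalName).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    throw new Error("unsupported extension: " + ext);
  }
  return crypto.randomBytes(16).toString("hex") + ext;
}

// Demo: a child Node process prints the one argument it received. The
// synchronous variant is used here only to keep the sketch short.
const hostile = 'test";whoami;".doc';
const echoed = execFileSync(
  process.execPath,
  ["-e", "console.log(process.argv[1])", hostile],
  { encoding: "utf8" }
).trim();

console.log(echoed === hostile); // true: the name arrived intact as data
console.log(stagedName(hostile)); // 32 random hex characters followed by ".doc"
```

The argument array removes the shell from the path entirely, which is more robust than trying to escape every metacharacter. Staging under a generated name adds defense in depth if another component still builds shell strings.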
