CVE-2026-26831

Received - Intake

OS Command Injection in textract Extractors via Unsanitized File Paths

Publication date: 2026-03-25

Last updated on: 2026-03-30

Assigner: MITRE

Description

textract through 2.5.0 is vulnerable to OS Command Injection via the file path parameter in multiple extractors. When processing files with malicious filenames, the filePath is passed directly to child_process.exec() in lib/extractors/doc.js, rtf.js, dxf.js, images.js, and lib/util.js with inadequate sanitization.

Meta Information

Field	Value
Published	2026-03-25
Last Modified	2026-03-30
Generated	2026-07-27
AI Q&A	2026-03-25
EPSS Evaluated	2026-07-25
NVD	CVE-2026-26831
EUVD	EUVD-2026-15459

Affected Vendors & Products

Vendor	Product	Version / Range
dbashford	textract	to 2.5.0 (inc)

CWE

CWE ID	Description
CWE-78	The product constructs all or part of an OS command using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the intended OS command when it is sent to a downstream component.
CWE-94	The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.

Executive Summary

This vulnerability exists in textract versions up to 2.5.0 and is an OS Command Injection issue. It occurs because the file path parameter used in multiple extractors is passed directly to the child_process.exec() function without proper sanitization. This happens in several files including lib/extractors/doc.js, rtf.js, dxf.js, images.js, and lib/util.js. If a file with a malicious filename is processed, it can lead to execution of arbitrary operating system commands.

Detection Guidance

This vulnerability arises from the textract library passing unsanitized file path parameters directly into shell commands via child_process.exec(), allowing OS command injection if an attacker controls the file name or path.
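To make the mechanism concrete, the following is a minimal sketch of the unsafe pattern, not a copy of textract's source. The helper name `buildUnsafeCommand` and the exact command string are assumptions chosen for illustration. The code only builds and prints strings; nothing is executed.

```javascript
// Hypothetical helper: mimics the unsafe pattern of concatenating a file
// path into a shell command string. Not taken from textract's source.
function buildUnsafeCommand(filePath) {
  return 'antiword "' + filePath + '"';
}

const benign = buildUnsafeCommand('report.doc');
const hostile = buildUnsafeCommand('test";whoami;".doc');

console.log(benign);  // antiword "report.doc"
console.log(hostile); // antiword "test";whoami;".doc"
```

If the second string were handed to a shell, the quote inside the file name would close the quoted argument early, and the shell would read three separate commands: `antiword "test"`, `whoami`, and `".doc"`.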

To detect exploitation attempts or the presence of this vulnerability on your system or network, monitor for command executions spawned by textract whose file name arguments contain shell metacharacters such as semicolons (;), double quotes ("), backticks (`), or command substitutions ($(...)).

Suggested detection methods include:

Monitor process execution logs or audit logs for textract-related commands invoking antiword, unrtf, or other extractors with suspicious file paths.
Use system command-line tools to look for running processes or command history entries that include suspicious file names, for example:
On Linux, check running processes with: `ps aux | grep textract` or `ps aux | grep antiword`
Search shell history for manually run commands: `grep -E 'antiword|unrtf' ~/.bash_history`. Commands launched through child_process.exec() are not written to interactive shell history, so process execution or audit logs are the more reliable source.
Search input directories for files whose names contain shell metacharacters, e.g., `find /path/to/textract/input -name '*;*' -o -name '*`*' -o -name '*$(*'`

Additionally, in an isolated test environment, you can create a test file with a malicious name such as `test";whoami;".doc`, pass it to textract, and observe whether the injected command runs, which would confirm the vulnerability.
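The file name search suggested above can also be sketched in Node.js. This is a rough illustration under stated assumptions: the helper names `hasShellMetachars` and `findSuspiciousFiles` are hypothetical, and the character list is a sample of common POSIX shell metacharacters rather than an exhaustive set.

```javascript
const fs = require('fs');
const path = require('path');

// Characters a POSIX shell treats specially inside a command string.
// Illustrative assumption, not an exhaustive list.
const SHELL_METACHARS = /[;&|`$"'\\<>(){}!*?\n]/;

// Hypothetical helper: true when a file name contains a shell metacharacter.
function hasShellMetachars(fileName) {
  return SHELL_METACHARS.test(fileName);
}

// Hypothetical helper: list suspicious entries directly inside `dir`
// (non-recursive), returned as full paths.
function findSuspiciousFiles(dir) {
  return fs.readdirSync(dir)
    .filter(hasShellMetachars)
    .map((name) => path.join(dir, name));
}
```

Calling `findSuspiciousFiles` on the directory that feeds textract would return the entries worth reviewing. A hit is not proof of exploitation, only a file name that would be dangerous if it reached a shell.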

Impact Analysis

The vulnerability can allow an attacker to execute arbitrary OS commands on the system where textract is running. This can lead to unauthorized access, data theft, system compromise, or disruption of services depending on the privileges of the process running textract.

Compliance Impact

CVE-2026-26831 is a critical OS command injection vulnerability in the textract library that allows attackers to execute arbitrary OS commands by controlling file paths during text extraction. This can lead to unauthorized access, modification, or destruction of sensitive data.

Such unauthorized access and potential data breaches can directly impact compliance with common standards and regulations like GDPR and HIPAA, which mandate strict controls over data confidentiality, integrity, and availability.

Exploitation of this vulnerability could result in exposure or alteration of personal or protected health information, thereby violating regulatory requirements and potentially leading to legal and financial consequences.

Mitigations such as avoiding passing attacker-controlled file paths, removing unsafe shell command concatenations, or migrating to safer extraction solutions are critical to maintain compliance.

Mitigation Strategies

Immediate mitigation steps for this critical OS command injection vulnerability in textract (up to version 2.5.0) include:

Avoid processing files with attacker-controlled or untrusted filenames or file paths using the vulnerable textract versions.
If you maintain code using textract, remove or disable the use of textract until a fixed version is released.
Modify the textract source code to eliminate shell string concatenation in the vulnerable extractor files (lib/extractors/doc.js, rtf.js, dxf.js, images.js, and lib/util.js) by replacing child_process.exec() calls with child_process.execFile() or child_process.spawn(). These take arguments as an array and do not invoke a shell by default, which prevents shell injection.
Sanitize or validate all file path inputs rigorously to remove or escape shell metacharacters before passing them to any shell commands.
Consider migrating to a maintained and secure text extraction library or solution that does not have this vulnerability.
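The execFile() and input validation steps above might look roughly like the sketch below. It is not a patch for textract: `isSafeFileName` and `extractWithoutShell` are hypothetical names, the extension allowlist is an example only, and the binary to run (for instance antiword) is supplied by the caller.

```javascript
const { execFile } = require('child_process');
const path = require('path');

// Hypothetical allowlist check: accept only base names made of letters,
// digits, dot, underscore, and hyphen, ending in an expected extension.
// The extension list here is an assumption for illustration.
function isSafeFileName(filePath) {
  const base = path.basename(filePath);
  return /^[A-Za-z0-9._-]+\.(doc|rtf|dxf)$/i.test(base) && !base.startsWith('-');
}

// Hypothetical wrapper: run an external extractor without a shell.
function extractWithoutShell(binary, filePath, callback) {
  if (!isSafeFileName(filePath)) {
    callback(new Error('Rejected unsafe file name'));
    return;
  }
  // Each array element becomes one argv entry, so no shell parses the path.
  // Resolving to an absolute path keeps it from being read as an option.
  execFile(binary, [path.resolve(filePath)], (error, stdout) => {
    callback(error, stdout);
  });
}
```

The validation is a second layer, not a substitute: removing the shell is what closes the injection path, while the allowlist narrows what reaches the external tool at all.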
