CVE-2026-33929
Received - Intake
Path Traversal in Apache PDFBox ExtractEmbeddedFiles Example

Publication date: 2026-04-14

Last updated on: 2026-04-20

Assigner: Apache Software Foundation

Description
Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal') vulnerability in Apache PDFBox Examples. This issue affects the ExtractEmbeddedFiles example in Apache PDFBox from 2.0.24 through 2.0.36 and from 3.0.0 through 3.0.7. Users are recommended to update to version 2.0.37 or 3.0.8 once available. Until then, they should apply the fix provided in GitHub PR 427. The ExtractEmbeddedFiles example contained a path traversal vulnerability (CWE-22), described in CVE-2026-23907. However, the change made in releases 2.0.36 and 3.0.7 is flawed because it does not consider the file path separator. As a result, a user with write permission on /home/ABC could fall victim to a malicious PDF that triggers a write attempt to any path starting with /home/ABC, e.g. "/home/ABCDEF". Users who have copied this example into their production code should apply the mentioned change. The example has been changed accordingly and is available in the project repository.
Meta Information
Published: 2026-04-14
Last Modified: 2026-04-20
Generated: 2026-05-07
AI Q&A: 2026-04-14
EPSS Evaluated: 2026-05-05
Affected Vendors & Products
Showing 2 associated CPEs
Vendor    Product    Version / Range
apache    pdfbox     From 2.0.24 (inc) to 2.0.37 (exc)
apache    pdfbox     From 3.0.0 (inc) to 3.0.8 (exc)
CWE ID Description
CWE-22 The product uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the product does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.
AI Powered Q&A
How can this vulnerability impact me?

This vulnerability can allow an attacker to write or overwrite files outside the intended directory on a system where the vulnerable ExtractEmbeddedFiles example is used, potentially leading to unauthorized file modifications.

By crafting a malicious PDF, an attacker could cause files to be written to paths whose names begin with the extraction directory's path but lie outside it (for example /home/ABCDEF when the target is /home/ABC), provided the user running the extraction has write permission there. Depending on which files are written or overwritten, this could lead to data corruption or further compromise of the system.


How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?

The vulnerability allows a malicious PDF to write files outside the intended restricted directory, potentially leading to unauthorized file writes or overwrites.

Such unauthorized file operations could result in exposure or modification of sensitive data, which may impact compliance with data protection standards and regulations like GDPR or HIPAA that require strict control over data access and integrity.

Therefore, if the vulnerable ExtractEmbeddedFiles example is used in production without the fix, it could pose risks to maintaining compliance with these regulations by enabling potential data breaches or unauthorized data manipulation.


Can you explain this vulnerability to me?

This vulnerability is a path traversal issue in the ExtractEmbeddedFiles example of Apache PDFBox. It allows a malicious PDF to cause files to be written outside the intended restricted directory by exploiting insufficient checks on file paths.

Specifically, the check introduced in releases 2.0.36 and 3.0.7 did not take the file path separator into account when verifying that extracted files stayed within the target directory. An attacker could therefore write files to paths that start with the target directory path but are actually outside it, such as "/home/ABCDEF" when the intended directory was "/home/ABC".

The fix improves the directory boundary check by using canonical paths and ensuring that the extracted file's parent directory is either exactly the target directory or a subdirectory within it, preventing unauthorized file writes outside the intended location.
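The difference between the two checks can be illustrated with a minimal Java sketch. It is not the actual PDFBox patch from PR 427; the class and method names (ExtractionBoundaryCheck, isInsideFlawed, isInsideStrict) are hypothetical and only mirror the behaviour described above: a plain string prefix comparison versus a comparison of the canonical parent directory that respects the path separator.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration, not the code shipped in Apache PDFBox.
final class ExtractionBoundaryCheck {

    // Resolves "." and ".." (and symlinks where they exist) into a canonical path.
    private static String canonical(File file) {
        try {
            return file.getCanonicalPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Flawed: a raw string prefix test ignores the path separator, so a
    // sibling such as /home/ABCDEF is accepted for the target /home/ABC.
    static boolean isInsideFlawed(File targetDir, File candidate) {
        return canonical(candidate).startsWith(canonical(targetDir));
    }

    // Stricter: the candidate's canonical parent must be exactly the target
    // directory, or must start with the target path followed by a separator.
    static boolean isInsideStrict(File targetDir, File candidate) {
        String target = canonical(targetDir);
        File parent = new File(canonical(candidate)).getParentFile();
        if (parent == null) {
            return false; // candidate is a filesystem root
        }
        String parentPath = parent.getPath();
        return parentPath.equals(target)
                || parentPath.startsWith(target + File.separator);
    }

    public static void main(String[] args) {
        File target = new File("/home/ABC");
        File sibling = new File(target, "../ABCDEF/evil.txt"); // escapes to a sibling directory
        File nested = new File(target, "sub/report.txt");      // stays inside the target

        System.out.println("flawed accepts sibling: " + isInsideFlawed(target, sibling));
        System.out.println("strict accepts sibling: " + isInsideStrict(target, sibling));
        System.out.println("strict accepts nested:  " + isInsideStrict(target, nested));
    }
}
```

On a typical Unix-like system the flawed check should accept the sibling path while the stricter check rejects it and still accepts the nested file. Code that was copied from the example should be compared against the project's actual fix rather than against this sketch.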


How can this vulnerability be detected on my network or system? Can you suggest some commands?

This vulnerability is related to a path traversal issue in the ExtractEmbeddedFiles example of Apache PDFBox, which allows malicious PDFs to write files outside the intended directory.

Detection would involve monitoring or analyzing PDF files processed by the ExtractEmbeddedFiles example, especially looking for attempts to extract embedded files to paths outside the intended directory.

Since the vulnerability is in code logic rather than network traffic, direct network commands may not detect it. Instead, you can check your system for suspicious file writes or attempts to write files outside expected directories when processing PDFs.

Suggested approaches include:

  • Use file system monitoring tools (e.g., inotifywait on Linux) to watch the parent of the extraction directory for unexpected file creations in sibling directories whose names begin with the extraction directory's name.
  • Search logs or audit trails for file write operations triggered by PDFBox ExtractEmbeddedFiles, especially those targeting paths that start with, but extend beyond, the intended directory (e.g., paths like /home/ABCDEF if /home/ABC is the intended directory).
  • Manually review or scan PDF files for embedded file names containing path traversal sequences (e.g., ../) before processing.
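The log-review step above could be partly automated. The following Java sketch assumes you already have a list of file paths that were written during extraction (for example, taken from an audit log); the class name SiblingPrefixAudit and its method are hypothetical. It flags paths that share the target directory's string prefix but are not actually inside it, which is the pattern this vulnerability produces.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical audit helper; the list of observed paths is assumed to come from your own logs.
final class SiblingPrefixAudit {

    // Returns the observed paths that start with the target directory as a
    // string, yet are not located underneath it.
    static List<String> suspicious(String targetDir, List<String> observedPaths) {
        Path target = Paths.get(targetDir).toAbsolutePath().normalize();
        List<String> hits = new ArrayList<>();
        for (String observed : observedPaths) {
            Path path = Paths.get(observed).toAbsolutePath().normalize();
            // String comparison: true for /home/ABCDEF when the target is /home/ABC.
            boolean sharesStringPrefix = path.toString().startsWith(target.toString());
            // Path.startsWith compares whole name elements, so /home/ABCDEF is not under /home/ABC.
            boolean insideTarget = path.startsWith(target);
            if (sharesStringPrefix && !insideTarget) {
                hits.add(observed);
            }
        }
        return hits;
    }
}
```

For example, calling suspicious("/home/ABC", ...) with the paths /home/ABC/a.txt and /home/ABCDEF/evil.txt should report only the second one. The sketch normalizes paths lexically and does not resolve symbolic links, so it is a screening aid rather than a complete detector.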

What immediate steps should I take to mitigate this vulnerability?

The primary mitigation step is to update Apache PDFBox to version 2.0.37 or 3.0.8 once they are available, as these versions include the fix for this vulnerability.

Until the official update is available, users who have copied the ExtractEmbeddedFiles example into their production code should apply the fix provided in GitHub Pull Request 427.

The fix involves improving the directory boundary check by verifying the canonical path of the target extraction directory to ensure embedded files cannot be extracted outside the intended directory.

Additionally, restrict write permissions on directories used by ExtractEmbeddedFiles to minimize the impact of any exploitation attempts.

