CVE-2026-23907
Received - Intake
Path Traversal in Apache PDFBox ExtractEmbeddedFiles Example

Publication date: 2026-03-10

Last updated on: 2026-03-13

Assigner: Apache Software Foundation

Description
This issue affects the ExtractEmbeddedFiles example in Apache PDFBox from 2.0.24 through 2.0.35 and from 3.0.0 through 3.0.6. The example contains a path traversal vulnerability (CWE-22) because the filename obtained from PDComplexFileSpecification.getFilename() is appended to the extraction path without validation. Users who have copied this example into their production code should review it to ensure that the extraction path is acceptable. The example has been changed accordingly: the initial path and the extraction paths are now converted into canonical paths, and it is verified that each extraction path lies within the initial path. The documentation has also been adjusted.
Meta Information
Published: 2026-03-10
Last Modified: 2026-03-13
Generated: 2026-05-07
AI Q&A: 2026-03-10
EPSS Evaluated: 2026-05-05
Affected Vendors & Products
Showing 2 associated CPEs
Vendor Product Version / Range
apache pdfbox From 2.0.24 (inc) to 2.0.35 (inc)
apache pdfbox From 3.0.0 (inc) to 3.0.6 (inc)
CWE ID Description
CWE-22 The product uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the product does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.
AI Powered Q&A
Can you explain this vulnerability to me?

CVE-2026-23907 is a path traversal vulnerability in the ExtractEmbeddedFiles example code of Apache PDFBox. The issue occurs because the filename obtained from PDComplexFileSpecification.getFilename() is directly appended to the extraction path without proper validation.

This allows an attacker to craft filenames that can traverse directories, potentially causing embedded files to be extracted outside the intended directory.
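
To make the mechanism concrete, here is a minimal sketch using only the Java standard library. It imitates the pattern described above (an untrusted name appended to a base path) and is not the actual PDFBox example code. The class name, the base directory /tmp/extract, and the crafted filename are all hypothetical, and nothing is written to disk.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class TraversalDemo {
    // Imitates the vulnerable pattern: the untrusted name is appended
    // to the base path as-is, then ".." segments are resolved.
    static Path naiveTarget(String baseDir, String untrustedName) {
        return Paths.get(baseDir + "/" + untrustedName).normalize();
    }

    public static void main(String[] args) {
        Path base = Paths.get("/tmp/extract");                        // hypothetical extraction directory
        Path target = naiveTarget("/tmp/extract", "../../etc/cron.d/job"); // hypothetical crafted name

        System.out.println("Resolved target: " + target);
        System.out.println("Inside base dir: " + target.startsWith(base));
    }
}
```

With the crafted name, the resolved target no longer starts with the base directory, which is exactly the condition the fixed example is described as checking for.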

The vulnerability affects Apache PDFBox versions 2.0.24 through 2.0.35 and 3.0.0 through 3.0.6. The example code has been updated to convert both the initial extraction path and the target extraction paths into canonical paths and verify that the extraction path is contained within the initial path, preventing directory traversal.


How can this vulnerability impact me?

This vulnerability can impact you if you have copied the vulnerable ExtractEmbeddedFiles example code into your production environment without proper validation of extraction paths.

An attacker could exploit this vulnerability by crafting filenames that cause files to be extracted outside the intended directory, potentially overwriting or placing files in unauthorized locations.

This could lead to unauthorized file creation or modification, which might be used to compromise system integrity or facilitate further attacks.


How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?

I don't know


How can this vulnerability be detected on my network or system? Can you suggest some commands?

This vulnerability arises from the ExtractEmbeddedFiles example code in Apache PDFBox, where filenames obtained from PDComplexFileSpecification.getFilename() are appended to extraction paths without validation, allowing path traversal.

To detect this vulnerability on your system, review your codebase for uses of the ExtractEmbeddedFiles example, or of similar code that extracts embedded files using PDComplexFileSpecification.getFilename() without validating or canonicalizing the extraction path.

The resources provide no network detection commands. However, you can search your source code for the vulnerable pattern. Because getFilename() is called on an instance rather than on the class name, search for the class name, the method name, and the example name separately, with commands such as:

    grep -rn 'PDComplexFileSpecification' /path/to/your/code
    grep -rn 'getFilename()' /path/to/your/code
    grep -rn 'ExtractEmbeddedFiles' /path/to/your/code

Additionally, monitor file extraction operations for unexpected directory traversal by checking whether files are extracted outside the intended directories. [1]
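
As a complement to searching the source code, embedded file names can be screened before or during extraction. The following is a rough heuristic sketch, not part of PDFBox: the class and method names are hypothetical, and it assumes the names have already been read from the PDF by your own code. It flags names that contain a ".." segment, start with a path separator, or start with a Windows drive letter.

```java
final class EmbeddedNameAudit {
    // Heuristic only: returns true if the name looks like an attempt
    // to leave the extraction directory.
    static boolean looksSuspicious(String name) {
        if (name == null) {
            return false;
        }
        // Treat backslashes like forward slashes so both styles are covered.
        String unified = name.replace('\\', '/');

        // Absolute path such as "/etc/passwd".
        if (unified.startsWith("/")) {
            return true;
        }
        // Windows drive prefix such as "C:/Windows/win.ini".
        if (unified.length() >= 2
                && Character.isLetter(unified.charAt(0))
                && unified.charAt(1) == ':') {
            return true;
        }
        // Any ".." path segment, e.g. "../../evil.txt" or "a/../b".
        for (String segment : unified.split("/")) {
            if (segment.equals("..")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = { "report.pdf", "../../etc/passwd", "notes..v2.txt" };
        for (String sample : samples) {
            System.out.println(sample + " suspicious: " + looksSuspicious(sample));
        }
    }
}
```

A flagged name is a signal to log and investigate, not proof of an attack. This screening does not replace the canonical path check described in the mitigation steps.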


What immediate steps should I take to mitigate this vulnerability?

Immediate mitigation steps include reviewing any use of the ExtractEmbeddedFiles example code in your production environment and ensuring that the extraction path is properly validated.

Specifically, modify your code to convert both the initial extraction path and the target extraction paths into canonical paths and verify that the extraction path is contained within the initial path before extracting files.
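
The check described above can be sketched as follows. This is an illustrative helper based on the description of the fix, not the code shipped in the updated PDFBox example. The class name, method name, and error message are hypothetical, and it relies only on java.io.File from the standard library.

```java
import java.io.File;
import java.io.IOException;

final class SafeExtractionPath {
    /**
     * Resolves untrustedName under baseDir and returns the canonical target file.
     * Throws IOException if the target would fall outside baseDir.
     */
    static File resolveInside(File baseDir, String untrustedName) throws IOException {
        // Canonical form resolves "..", "." and (where possible) symbolic links.
        File canonicalBase = baseDir.getCanonicalFile();
        File canonicalTarget = new File(canonicalBase, untrustedName).getCanonicalFile();

        // Compare path elements rather than raw strings, so that a sibling
        // directory such as "extract-evil" is not mistaken for "extract".
        if (!canonicalTarget.toPath().startsWith(canonicalBase.toPath())) {
            throw new IOException("Refusing to extract outside "
                    + canonicalBase + ": " + untrustedName);
        }
        return canonicalTarget;
    }
}
```

A caller would pass the name returned by getFilename() to resolveInside and write the embedded file only to the returned File. If the call throws, skip the file or abort the extraction.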

If possible, update to a fixed version of Apache PDFBox (later than 2.0.35 or 3.0.6). Updating the library does not change code that you copied from the example, so that code must still be reviewed and fixed separately.

Also, review the updated documentation for guidance on safe extraction path handling.

