CVE-2026-23907

Received - Intake

Path Traversal in Apache PDFBox ExtractEmbeddedFiles Example

Publication date: 2026-03-10

Last updated on: 2026-03-13

Assigner: Apache Software Foundation

Description

This issue affects the ExtractEmbeddedFiles example in Apache PDFBox: from 2.0.24 through 2.0.35, from 3.0.0 through 3.0.6. The ExtractEmbeddedFiles example contains a path traversal vulnerability (CWE-22) because the filename obtained from PDComplexFileSpecification.getFilename() is appended to the extraction path. Users who have copied this example into their production code should review it to ensure that the extraction path is acceptable. The example has been changed accordingly: the initial path and the extraction paths are now converted into canonical paths, and it is verified that each extraction path lies within the initial path. The documentation has also been adjusted.

Meta Information

Published

2026-03-10

Last Modified

2026-03-13

Generated

2026-07-26

AI Q&A

2026-03-10

EPSS Evaluated

2026-07-25

NVD

CVE-2026-23907

EUVD

EUVD-2026-10481

Affected Vendors & Products

Vendor	Product	Version / Range
apache	pdfbox	From 2.0.24 (inc) to 2.0.35 (inc)
apache	pdfbox	From 3.0.0 (inc) to 3.0.6 (inc)

CWE

CWE ID	Description
CWE-22	The product uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the product does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.

Executive Summary

CVE-2026-23907 is a path traversal vulnerability in the ExtractEmbeddedFiles example code of Apache PDFBox. The issue occurs because the filename obtained from PDComplexFileSpecification.getFilename() is directly appended to the extraction path without proper validation.

An attacker can therefore embed a file whose name contains directory traversal sequences (for example ../), causing the embedded file to be extracted outside the intended directory.
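To illustrate how appending an untrusted name can leave the extraction directory, here is a minimal sketch that uses only java.io.File. It is not the PDFBox example code; the class and method names (TraversalDemo, naiveTarget, escapes) are hypothetical and chosen for illustration.

```java
import java.io.File;
import java.io.IOException;

final class TraversalDemo {

    // Mimics the vulnerable pattern: the embedded file name is appended
    // to the base directory without any check, then canonicalized so the
    // real destination becomes visible.
    static String naiveTarget(String baseDir, String embeddedName) throws IOException {
        return new File(baseDir + File.separator + embeddedName).getCanonicalPath();
    }

    // Returns true when the naive destination is not located under baseDir.
    static boolean escapes(String baseDir, String embeddedName) throws IOException {
        String base = new File(baseDir).getCanonicalPath();
        String target = naiveTarget(baseDir, embeddedName);
        return !target.startsWith(base + File.separator);
    }
}
```

With a base directory such as an "extract-out" folder, a name like "report.txt" stays inside it, while "../evil.txt" resolves to the parent directory.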

The vulnerability affects Apache PDFBox versions 2.0.24 through 2.0.35 and 3.0.0 through 3.0.6. The example code has been updated to convert both the initial extraction path and the target extraction paths into canonical paths and verify that the extraction path is contained within the initial path, preventing directory traversal.

Detection Guidance

This vulnerability arises from the ExtractEmbeddedFiles example code in Apache PDFBox, where filenames obtained from PDComplexFileSpecification.getFilename() are appended to extraction paths without validation, allowing path traversal.

To detect it, review your codebase for use of the ExtractEmbeddedFiles example, or for similar code that extracts embedded files using PDComplexFileSpecification.getFilename() without validating or canonicalizing the extraction path.

The resources provide no network detection commands. However, you can search your source code for the vulnerable pattern. Because the method is normally called on a variable rather than on the class name, search for the class name and the method name separately:

grep -rn 'PDComplexFileSpecification' /path/to/your/code
grep -rn 'getFilename()' /path/to/your/code
grep -rn 'ExtractEmbeddedFiles' /path/to/your/code

Additionally, monitor file extraction operations for unexpected directory traversal by checking whether files are extracted outside the intended directories. [1]
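For the monitoring suggestion above, one option is to log embedded file names that look like traversal attempts before extraction. The following is a heuristic sketch only, with a hypothetical class and method name (EmbeddedNameScreen, looksSuspicious); it is meant for flagging and logging, and does not replace a canonical-path containment check.

```java
final class EmbeddedNameScreen {

    // Flags names that are unlikely to be plain file names:
    // missing names, names containing a directory separator,
    // "." or "..", and Windows drive prefixes such as "C:".
    static boolean looksSuspicious(String name) {
        if (name == null || name.isEmpty()) {
            return true;
        }
        if (name.indexOf('/') >= 0 || name.indexOf('\\') >= 0) {
            return true;
        }
        if (name.equals(".") || name.equals("..")) {
            return true;
        }
        // A letter followed by ':' is treated as a drive prefix.
        return name.length() >= 2
                && Character.isLetter(name.charAt(0))
                && name.charAt(1) == ':';
    }
}
```

A name such as "report.pdf" passes this screen, while "../../etc/cron.d/job" is flagged because it contains a separator.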

Impact Analysis

This vulnerability can impact you if you have copied the vulnerable ExtractEmbeddedFiles example code into your production environment without proper validation of extraction paths.

An attacker could exploit this vulnerability by crafting filenames that cause files to be extracted outside the intended directory, potentially overwriting or placing files in unauthorized locations.

This could lead to unauthorized file creation or modification, which might be used to compromise system integrity or facilitate further attacks.

Compliance Impact

No compliance impact assessment is available for this vulnerability.

Mitigation Strategies

Immediate mitigation steps include reviewing any use of the ExtractEmbeddedFiles example code in your production environment and ensuring that the extraction path is properly validated.

Specifically, modify your code to convert both the initial extraction path and the target extraction paths into canonical paths and verify that the extraction path is contained within the initial path before extracting files.
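The canonical-path check described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code shipped in the updated PDFBox example: the class and method names (SafeExtraction, resolveInside) are hypothetical, and the embedded name is assumed to be the string returned by PDComplexFileSpecification.getFilename().

```java
import java.io.File;
import java.io.IOException;

final class SafeExtraction {

    // Resolves embeddedName against baseDir and returns the canonical target,
    // or throws if the target would not lie strictly inside baseDir.
    static File resolveInside(File baseDir, String embeddedName) throws IOException {
        if (embeddedName == null) {
            throw new IOException("Embedded file has no name");
        }
        File base = baseDir.getCanonicalFile();
        File target = new File(base, embeddedName).getCanonicalFile();
        // Path.startsWith compares whole path components, so a sibling
        // directory such as "out-evil" is not accepted as being inside "out".
        if (target.equals(base) || !target.toPath().startsWith(base.toPath())) {
            throw new IOException(
                    "Embedded file name escapes extraction directory: " + embeddedName);
        }
        return target;
    }
}
```

The caller would write the embedded file only to the File returned by resolveInside, and skip or report any entry for which it throws. Comparing Path components rather than raw strings is a design choice made here to avoid prefix-matching mistakes.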

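If possible, update to a fixed version of Apache PDFBox where this issue has been addressed (versions after 2.0.35 and 3.0.6). Note that upgrading only replaces the bundled example; any copy of the example in your own code must be corrected separately.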

Also, review the updated documentation for guidance on safe extraction path handling.
