CVE-2026-12243

Analyzed - Analysis Complete

Path Traversal in NLTK via Percent-Encoded Sequences

Publication date: 2026-06-30

Last updated on: 2026-06-30

Assigner: huntr.dev

Description

NLTK version 3.9.4 is vulnerable to a path traversal attack due to an incomplete fix for GitHub Issue #3504. The `_UNSAFE_NO_PROTOCOL_RE` regex in `nltk/data.py` checks for literal `../` sequences but fails to account for percent-encoded traversal sequences such as `..%2f`. The `url2pathname()` function decodes these sequences after the validation step, allowing an attacker to bypass the protection. This vulnerability enables an attacker to read arbitrary files accessible to the Python process by controlling the resource name parameter passed to `nltk.data.load()` or `nltk.data.find()`. The issue affects applications that rely on NLTK for resource loading, including NLP web applications, Jupyter notebooks, and CLI tools. The default `pathsec.ENFORCE=False` setting exacerbates the impact by not blocking the file read at the `open()` stage.
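The ordering problem described above can be illustrated with a minimal sketch. It is not NLTK's code: `LITERAL_TRAVERSAL_RE` is a simplified, hypothetical stand-in for `_UNSAFE_NO_PROTOCOL_RE`, and `urllib.parse.unquote()` stands in for the percent-decoding that `url2pathname()` performs. The function name `validate_then_decode` and the sample resource name are assumptions made for illustration.

```python
import re
from urllib.parse import unquote

# Hypothetical, simplified stand-in for a literal-only traversal check.
# This is NOT the actual regex from nltk/data.py.
LITERAL_TRAVERSAL_RE = re.compile(r"(^|/)\.\./")


def validate_then_decode(resource_name: str) -> str:
    """Flawed ordering: validate the raw string, then decode it."""
    if LITERAL_TRAVERSAL_RE.search(resource_name):
        raise ValueError("traversal sequence rejected")
    # unquote() stands in for the decoding done later by url2pathname().
    return unquote(resource_name)


# The encoded name contains no literal "../", so the check passes,
# but the decoded result does contain traversal sequences.
encoded = "corpora/..%2f..%2f..%2fetc/passwd"
decoded = validate_then_decode(encoded)
print(decoded)
```

The literal form `corpora/../secret` is rejected by the same function, which shows that the check works only when the input has not been encoded.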

Meta Information

Published: 2026-06-30

Last Modified: 2026-06-30

Generated: 2026-07-20

AI Q&A: 2026-06-30

EPSS Evaluated: 2026-07-18

NVD: CVE-2026-12243

EUVD: EUVD-2026-40240

Affected Vendors & Products

Vendor	Product	Version / Range
nltk	nltk	3.9.4

CWE

CWE ID	Description
CWE-22	The product uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the product does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.

Executive Summary

NLTK version 3.9.4 has a path traversal vulnerability caused by an incomplete fix for an earlier issue (GitHub Issue #3504). The regex that detects unsafe path sequences matches only literal `../` and misses percent-encoded traversal sequences such as `..%2f`. Because these sequences are decoded after validation, an attacker who controls the resource name passed to `nltk.data.load()` or `nltk.data.find()` can bypass the check and read arbitrary files accessible to the Python process.

This affects applications that use NLTK for resource loading, such as NLP web applications, Jupyter notebooks, and command-line tools. In addition, the default setting `pathsec.ENFORCE=False` means the file read is not blocked at the `open()` stage, which makes exploitation easier.

Impact Analysis

An attacker who exploits this path traversal flaw can read any file that the Python process can access. This can lead to unauthorized disclosure of sensitive information, such as configuration files, credentials, or other private data stored on the system.

Because the vulnerability does not require user interaction or privileges (as indicated by the CVSS vector AV:N/AC:L/PR:N/UI:N), it can be exploited remotely and easily, increasing the risk to affected applications.
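The vector fragment quoted above can be expanded into readable metric names with a short sketch. It assumes the abbreviations follow CVSS v3.1 (the page does not state the version), and `describe_vector` is a hypothetical helper name.

```python
# Metric and value abbreviations, assuming CVSS v3.1 naming.
METRIC_NAMES = {
    "AV": "Attack Vector",
    "AC": "Attack Complexity",
    "PR": "Privileges Required",
    "UI": "User Interaction",
}
VALUE_NAMES = {
    "AV": {"N": "Network", "A": "Adjacent", "L": "Local", "P": "Physical"},
    "AC": {"L": "Low", "H": "High"},
    "PR": {"N": "None", "L": "Low", "H": "High"},
    "UI": {"N": "None", "R": "Required"},
}


def describe_vector(fragment: str) -> dict:
    """Map each 'METRIC:VALUE' pair in the fragment to its full names."""
    described = {}
    for part in fragment.split("/"):
        metric, value = part.split(":", 1)
        described[METRIC_NAMES[metric]] = VALUE_NAMES[metric][value]
    return described


summary = describe_vector("AV:N/AC:L/PR:N/UI:N")
for name, value in summary.items():
    print(f"{name}: {value}")
```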

Compliance Impact

This vulnerability allows an attacker to read arbitrary files accessible to the Python process by exploiting a path traversal flaw in NLTK's resource loading functions. Such unauthorized access to files could potentially lead to exposure of sensitive or personal data.

Exposure of sensitive data due to this vulnerability could impact compliance with data protection regulations such as GDPR and HIPAA, which require strict controls over access to personal and health-related information.

The advisory itself does not describe specific compliance implications or how organizations should address this vulnerability under these regulations.

Mitigation Strategies

To mitigate this vulnerability, you should avoid using NLTK version 3.9.4 or earlier versions that contain the incomplete fix for the path traversal issue.

Upgrade to a version of NLTK in which this vulnerability is fixed, or apply a patch that properly validates and sanitizes resource names passed to `nltk.data.load()` or `nltk.data.find()`.
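Until a fixed release is available, an application can screen untrusted resource names itself before handing them to NLTK. The sketch below is one possible application-level check, not an official patch: it decodes first and validates afterwards, the reverse of the flawed ordering. The name `is_safe_resource_name` and the `max_rounds` limit are assumptions, and the check deliberately rejects absolute paths, which may be stricter than some applications need.

```python
from urllib.parse import unquote


def is_safe_resource_name(resource_name: str, max_rounds: int = 5) -> bool:
    """Decode percent-encoding fully, then reject traversal components."""
    decoded = resource_name
    for _ in range(max_rounds):
        next_decoded = unquote(decoded)
        if next_decoded == decoded:
            break  # nothing left to decode
        decoded = next_decoded
    else:
        # Still changing after max_rounds: treat deep nesting as suspicious.
        return False

    # Normalize Windows-style separators before splitting into components.
    decoded = decoded.replace("\\", "/")
    if decoded.startswith("/") or "\x00" in decoded:
        return False
    return ".." not in decoded.split("/")


print(is_safe_resource_name("tokenizers/punkt/english.pickle"))
print(is_safe_resource_name("..%2f..%2fetc/passwd"))
```

Decoding in a loop also catches double-encoded input such as `..%252f`, which a single decoding pass would turn into `..%2f` and let through.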

Additionally, setting pathsec.ENFORCE to True may help block unauthorized file reads at the open() stage.
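As a general illustration of blocking a read at the point where the file is opened, the sketch below resolves the final path and confirms that it stays inside an allowed base directory. This is a generic defense-in-depth pattern and is not how NLTK's `pathsec` is implemented; `resolve_within` is a hypothetical helper, and `Path.is_relative_to()` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_within(base_dir: str, relative_path: str) -> Path:
    """Return the resolved path, or raise if it escapes base_dir."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." components and follows symlinks.
    target = (base / relative_path).resolve()
    if not target.is_relative_to(base):
        raise PermissionError(f"path escapes allowed directory: {relative_path}")
    return target


# Example usage with a hypothetical data directory:
# path = resolve_within("/srv/app/nltk_data", "corpora/brown/ca01")
# with path.open("rb") as handle:
#     data = handle.read()
```

Because the check runs on the fully resolved path, it does not depend on how the traversal sequence was encoded earlier in the request.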

Review and restrict access permissions for the Python process to limit the impact of potential arbitrary file reads.
