CVE-2026-26019
Status: Awaiting Analysis
Improper URL Validation in LangChain RecursiveUrlLoader Enables SSRF

Publication date: 2026-02-11

Last updated on: 2026-02-19

Assigner: GitHub, Inc.

Description
LangChain is a framework for building LLM-powered applications. The RecursiveUrlLoader class in @langchain/community is a web crawler that recursively follows links from a starting URL. Its preventOutside option (enabled by default) is intended to restrict crawling to the same site as the base URL. Prior to 1.1.14, the implementation compared URLs with String.startsWith(), which is a plain string comparison rather than semantic URL validation. An attacker who controls content on a crawled page could therefore include links to domains that share a string prefix with the target (for example, https://example.com.attacker.test when the base URL is https://example.com), causing the crawler to follow links to attacker-controlled or internal infrastructure. Additionally, the crawler performed no validation against private or reserved IP addresses: a crawled page could include links targeting cloud metadata services, localhost, or RFC 1918 addresses, and the crawler would fetch them without restriction. This vulnerability is fixed in 1.1.14.
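The difference between a string-prefix comparison and a semantic URL comparison can be illustrated with a minimal sketch. This is not the @langchain/community source; the function names naivePrefixCheck and sameOriginCheck and the example hostnames are hypothetical, and comparing origins is only one plausible reading of "same site".

```typescript
// Simplified model of a prefix-based check (an assumption for illustration,
// not the library's actual code).
function naivePrefixCheck(baseUrl: string, link: string): boolean {
  return link.startsWith(baseUrl);
}

// Sketch of a semantic alternative: parse both URLs and compare their
// origins (scheme + host + port) instead of their raw strings.
function sameOriginCheck(baseUrl: string, link: string): boolean {
  try {
    const base = new URL(baseUrl);
    const target = new URL(link, base); // resolves relative links against the base
    return target.origin === base.origin;
  } catch {
    return false; // unparsable links are rejected
  }
}

const startUrl = "https://example.com";
// Shares the string prefix, but the host is example.com.attacker.test.
const lookalike = "https://example.com.attacker.test/steal";
// "example.com" is parsed as userinfo here; the host is attacker.test.
const userinfoTrick = "https://example.com@attacker.test/";

console.log(naivePrefixCheck(startUrl, lookalike));     // true
console.log(naivePrefixCheck(startUrl, userinfoTrick)); // true
console.log(sameOriginCheck(startUrl, lookalike));      // false
console.log(sameOriginCheck(startUrl, userinfoTrick));  // false
console.log(sameOriginCheck(startUrl, "https://example.com/docs/page")); // true
```

Both hostile links pass the prefix test because they begin with the same characters as the base URL, while the parsed origins differ.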
Meta Information
Published: 2026-02-11
Last Modified: 2026-02-19
Generated: 2026-05-27
AI Q&A: 2026-02-12
EPSS Evaluated: 2026-05-25
Affected Vendors & Products (1 associated CPE)
Vendor: langchain
Product: langchain_community
Version range: up to 1.1.14 (exclusive)
CWE ID: CWE-918
Description: The web server receives a URL or similar request from an upstream component and retrieves the contents of this URL, but it does not sufficiently ensure that the request is being sent to the expected destination.
AI Powered Q&A
Can you explain this vulnerability to me?

This vulnerability exists in the RecursiveUrlLoader class of the @langchain/community package, part of the LangChain framework, in versions prior to 1.1.14. The class is a web crawler that follows links recursively starting from a given URL. It has a preventOutside option meant to restrict crawling to the same site as the base URL. However, the implementation used a simple string prefix check (String.startsWith()) to compare URLs, which does not validate URLs semantically.

Because of this, an attacker controlling content on a crawled page could insert links to domains that share a string prefix with the target domain, tricking the crawler into following links to attacker-controlled or internal infrastructure. Additionally, the crawler did not validate against private or reserved IP addresses, allowing it to fetch links pointing to cloud metadata services, localhost, or RFC 1918 addresses without restriction.
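The missing private and reserved address validation described above can be sketched as follows. This is an illustrative sketch, not the fix shipped in 1.1.14: the helper names isBlockedHost and isSafeToFetch are hypothetical, only a handful of IPv4 ranges and literal hostnames are covered, and DNS names that resolve to internal addresses would need an additional check at resolution time.

```typescript
// Hypothetical helper: flags loopback, RFC 1918, link-local, and a few
// literal hosts. Not exhaustive (e.g. most IPv6 ranges are not covered).
function isBlockedHost(hostname: string): boolean {
  // URL.hostname keeps the brackets around IPv6 literals; strip them.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "::1" || host === "::") return true; // IPv6 loopback / unspecified

  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false; // not an IPv4 literal
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||                            // "this network"
    a === 10 ||                           // RFC 1918: 10.0.0.0/8
    a === 127 ||                          // loopback
    (a === 169 && b === 254) ||           // link-local, incl. 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||  // RFC 1918: 172.16.0.0/12
    (a === 192 && b === 168)              // RFC 1918: 192.168.0.0/16
  );
}

// Hypothetical gate a crawler could call before fetching a discovered link.
function isSafeToFetch(link: string): boolean {
  try {
    const url = new URL(link);
    if (url.protocol !== "http:" && url.protocol !== "https:") return false;
    return !isBlockedHost(url.hostname);
  } catch {
    return false; // unparsable links are rejected
  }
}

console.log(isSafeToFetch("http://169.254.169.254/latest/meta-data/")); // false
console.log(isSafeToFetch("http://localhost:8080/admin"));              // false
console.log(isSafeToFetch("https://example.com/docs"));                 // true
```

A check like this complements, rather than replaces, the same-site comparison: a link can be on an allowed origin and still be unsafe, or vice versa.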

This flaw could lead to unintended crawling of malicious or sensitive internal resources. The vulnerability was fixed in version 1.1.14.


How can this vulnerability impact me?

This vulnerability can impact you by allowing an attacker to manipulate the web crawler to access unintended domains or internal network resources. This could lead to exposure of sensitive internal infrastructure or cloud metadata services that are normally protected.

Such unauthorized crawling could result in information disclosure, potentially leaking internal IP addresses or sensitive data accessible via internal services. It may also allow attackers to use the crawler as a proxy to access restricted resources.


How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?

I don't know


How can this vulnerability be detected on my network or system? Can you suggest some commands?

I don't know


What immediate steps should I take to mitigate this vulnerability?

To mitigate this vulnerability, upgrade the @langchain/community package to version 1.1.14 or later, where the issue with RecursiveUrlLoader's URL validation is fixed.
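One way to confirm that an installed copy is at or above the fixed release is to compare its version string against 1.1.14. The sketch below assumes the version has already been obtained, for example from the version field of the installed package's package.json; the names FIXED_VERSION, parseVersion, and isPatched are hypothetical, and pre-release suffixes are ignored.

```typescript
// First fixed release, per the advisory text.
const FIXED_VERSION = "1.1.14";

// Parse "major.minor.patch" (optionally prefixed with "v").
// Any pre-release or build suffix after the patch number is ignored.
function parseVersion(v: string): [number, number, number] {
  const m = /^v?(\d+)\.(\d+)\.(\d+)/.exec(v.trim());
  if (!m) throw new Error(`Unrecognized version: ${v}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Returns true when `installed` is at or above `fixed`.
function isPatched(installed: string, fixed: string = FIXED_VERSION): boolean {
  const [a1, a2, a3] = parseVersion(installed);
  const [b1, b2, b3] = parseVersion(fixed);
  if (a1 !== b1) return a1 > b1;
  if (a2 !== b2) return a2 > b2;
  return a3 >= b3;
}

console.log(isPatched("1.1.13")); // false
console.log(isPatched("1.1.14")); // true
console.log(isPatched("1.1.9"));  // false (compared numerically, not as strings)
```

The components are compared as numbers because a plain string comparison would rank "1.1.9" above "1.1.14".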

