CVE-2026-57516

Analyzed - Analysis Complete

Unsafe Deserialization in Ray WebDataset Reader

Publication date: 2026-07-01

Last updated on: 2026-07-14

Assigner: VulnCheck

Description

Ray prior to 2.56.0 contains an unsafe deserialization vulnerability in the WebDataset reader that allows attackers to achieve remote code execution by supplying a malicious tar archive to the read_webdataset() function. The _default_decoder() function in webdataset_datasource.py unconditionally calls pickle.loads() on tar entries with .pkl/.pickle extensions and torch.load() with weights_only=False on .pt/.pth entries, executing arbitrary code inside Ray remote workers on every worker that processes the malicious archive.

Meta Information

Published

2026-07-01

Last Modified

2026-07-14

Generated

2026-07-21

AI Q&A

2026-07-01

EPSS Evaluated

2026-07-20

NVD

CVE-2026-57516

EUVD

EUVD-2026-41089

Affected Vendors & Products

Vendor	Product	Version / Range
anyscale	ray	before 2.56.0 (exclusive)

CWE

CWE ID	Description
CWE-502	The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.

Executive Summary

CVE-2026-57516 is a high-severity security vulnerability in the Ray distributed computing framework versions prior to 2.56.0. It arises from unsafe deserialization in the WebDataset reader component. Specifically, the _default_decoder() function in webdataset_datasource.py automatically executes pickle.loads() on files with .pkl or .pickle extensions and torch.load() with weights_only=False on .pt or .pth files within tar archives processed by the read_webdataset() function.

Because pickle.loads() and torch.load() with weights_only=False can execute arbitrary code embedded in the serialized data, an attacker can craft a malicious tar archive containing such files. When Ray workers process the archive, the embedded code runs, resulting in remote code execution (RCE) on every worker that processes it.
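The mechanism can be illustrated with a deliberately harmless sketch. It uses only the Python standard library, and the Demo class is a hypothetical stand-in for an attacker-controlled object; it is not taken from Ray. The point is that the pickle stream names a callable, and loading the stream invokes it.

```python
import pickle


class Demo:
    """Benign, hypothetical stand-in for an attacker-controlled object."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" instead of the object's state.
        # A malicious archive would name a dangerous callable here instead.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Demo())

# Loading the bytes invokes the stored callable; no Demo instance comes back.
result = pickle.loads(payload)
print(result)  # [1, 2, 3]
```

The caller of pickle.loads() never sees the callable before it runs, which is why inspecting the returned object afterwards offers no protection.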

The vulnerable behavior is the default: there is no opt-in and no warning, so every call to read_webdataset() is affected unless the caller supplies a custom safe decoder. In the fixed release, the unsafe deserialization branches are gated behind the environment variable RAY_DATA_WEBDATASET_ALLOW_UNSAFE_DESERIALIZATION.

Detection Guidance

The vulnerability involves unsafe deserialization when processing WebDataset-format TAR files containing .pkl, .pickle, .pt, or .pth files. Detection involves monitoring or inspecting data loading operations that use the read_webdataset() function in Ray prior to version 2.56.0.

Since the unsafe deserialization happens automatically during data loading, one way to detect exploitation attempts is to monitor for unexpected or suspicious TAR archives being loaded, especially those containing the vulnerable file extensions.

The referenced resources provide no ready-made detection commands. In practice, detection relies on checking the environment variable that controls unsafe deserialization and on auditing read_webdataset() usage and its inputs:

Check if the environment variable RAY_DATA_WEBDATASET_ALLOW_UNSAFE_DESERIALIZATION is set, which controls whether unsafe deserialization is allowed.
Audit your system or network for TAR files with .pkl, .pickle, .pt, or .pth extensions being loaded by Ray workers.
Monitor Ray worker logs for unexpected execution or errors during read_webdataset() calls.

Impact Analysis

This vulnerability allows attackers to achieve remote code execution on Ray remote workers by supplying malicious tar archives to the read_webdataset() function. Any attacker who can provide or influence the input data (for example via S3, HTTP, Hugging Face Hub, email, or other sources) can execute arbitrary code within the Ray processes.

The impact includes potential full compromise of the affected systems running Ray workers, unauthorized access, data manipulation, disruption of services, and further lateral movement within the network.

Because the vulnerability is exploitable remotely without requiring privileges and with low complexity, it poses a significant security risk to any environment using vulnerable Ray versions for distributed data processing.

Compliance Impact

CVE-2026-57516 allows remote code execution through unsafe deserialization in the Ray WebDataset reader, which can lead to unauthorized execution of arbitrary code within Ray remote workers.

Such a vulnerability can impact compliance with common standards and regulations like GDPR and HIPAA because it potentially enables attackers to access, modify, or exfiltrate sensitive data processed by Ray, violating data protection and privacy requirements.

Specifically, the ability to execute arbitrary code remotely increases the risk of data breaches, unauthorized data manipulation, and loss of data integrity, all of which are critical concerns under these regulations.

Organizations using vulnerable versions of Ray without applying the fixes or mitigations may fail to maintain adequate security controls required by these standards.

Mitigation Strategies

To mitigate this vulnerability, upgrade Ray to version 2.56.0 or later, where the unsafe deserialization vulnerability in the WebDataset reader has been fixed.
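Whether an installation falls in the affected range can be checked by comparing its version string against 2.56.0. The sketch below is an assumption-laden illustration: the helper names parse_release and is_affected are hypothetical, and only the leading numeric major.minor.patch is compared, so pre-release or development builds may not be classified correctly.

```python
import re

FIXED_VERSION = (2, 56, 0)  # first fixed release, per the advisory


def parse_release(version):
    """Extract the leading numeric major.minor.patch from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version):
    """Return True if the version is earlier than the fixed release."""
    return parse_release(version) < FIXED_VERSION


for candidate in ("2.55.1", "2.56.0", "2.9.3"):
    print(candidate, is_affected(candidate))
```

In an environment where Ray is installed, the string to pass in would be ray.__version__. Comparing integer tuples rather than raw strings matters here: as text, "2.9.3" sorts after "2.56.0", which would wrongly mark it as fixed.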

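On fixed versions, leave the environment variable RAY_DATA_WEBDATASET_ALLOW_UNSAFE_DESERIALIZATION unset (or set it to 0) so that .pkl, .pickle, .pt, and .pth entries are not deserialized. The gate appears to have been introduced with the fix, so setting the variable should not be relied on to protect versions prior to 2.56.0; if upgrading is not immediately possible, use the options below.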
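A deployment check for this variable might look like the following sketch. The function name gate_needs_review is hypothetical, and the advisory only states that 0 or unset keeps the unsafe branches disabled; treating every other non-empty value as potentially enabling them is a conservative assumption, not documented Ray behavior.

```python
import os

GATE_VARIABLE = "RAY_DATA_WEBDATASET_ALLOW_UNSAFE_DESERIALIZATION"


def gate_needs_review(environ=None):
    """Return True if the gate variable is set to anything other than "0".

    Assumption: any other non-empty value might enable the unsafe branches.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(GATE_VARIABLE, "").strip()
    return value not in ("", "0")


print(gate_needs_review({}))                    # False
print(gate_needs_review({GATE_VARIABLE: "0"}))  # False
print(gate_needs_review({GATE_VARIABLE: "1"}))  # True
```

The result is only meaningful when the check runs in the environment the Ray workers actually use, which may differ from the driver's environment.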

Alternatively, use a custom safe decoder with the decoder parameter in ray.data.read_webdataset() to avoid executing arbitrary code during data loading.
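A custom decoder of this kind could be sketched as follows. It is an illustration under stated assumptions rather than Ray's own code: safe_decoder is a hypothetical name, and the sketch assumes the decoder receives one sample as a dict mapping field names (such as "txt" or "json") to raw bytes, with bookkeeping keys prefixed by "__". Verify that contract against the Ray version in use before relying on it.

```python
import json

# Extensions the advisory identifies as unsafe to deserialize by default.
UNSAFE_EXTENSIONS = {"pkl", "pickle", "pt", "pth"}


def safe_decoder(sample):
    """Decode text and JSON fields; refuse fields that would need pickle/torch.

    Assumption: `sample` maps field names such as "txt" or "json" to raw bytes,
    with bookkeeping keys prefixed by "__".
    """
    decoded = {}
    for key, value in sample.items():
        extension = key.rsplit(".", 1)[-1].lower()
        if key.startswith("__"):
            decoded[key] = value  # metadata such as the sample key
        elif extension in UNSAFE_EXTENSIONS:
            raise ValueError(f"refusing to deserialize field {key!r}")
        elif extension in ("txt", "text"):
            decoded[key] = value.decode("utf-8")
        elif extension == "json":
            decoded[key] = json.loads(value)
        else:
            decoded[key] = value  # leave unknown formats as raw bytes
    return decoded


example = {"__key__": "sample0", "txt": b"hello", "json": b'{"label": 3}'}
print(safe_decoder(example))

# Hypothetical usage, assuming Ray is installed and the path exists:
# ds = ray.data.read_webdataset("s3://bucket/shards/", decoder=safe_decoder)
```

Raising on an unsafe field makes the job fail loudly instead of silently skipping data; a pipeline that prefers to continue could pass such fields through as raw bytes instead, since undecoded bytes are never executed.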

Avoid loading untrusted or unauthenticated TAR archives containing these file types into Ray workers.
