CVE-2026-45311

Deferred - Pending Action

Automatic Test Execution in CodeWhale Agent

Publication date: 2026-05-28

Last updated on: 2026-06-01

Assigner: GitHub, Inc.

Description

CodeWhale is a DeepSeek + MiMo coding agent for the terminal. In versions 0.3.0 up to, but not including, 0.8.23, the run_tests tool executes cargo test in the workspace with ApprovalRequirement::Auto, meaning it runs without any user approval prompt. cargo test compiles and executes arbitrary code: test binaries, build.rs build scripts, and proc macros. Auto-approving test execution is a deliberate design choice, but it creates an inconsistency in the security boundary: in a malicious repository, test code can execute arbitrary shell commands, exfiltrate credentials, or establish persistence with zero approval. The attack is amplified by AGENTS.md (auto-loaded into the system prompt), which can instruct the model to run tests proactively at session start. This vulnerability is fixed in 0.8.23.

Meta Information

Published: 2026-05-28
Last Modified: 2026-06-01
Generated: 2026-07-28
AI Q&A: 2026-05-28
EPSS Evaluated: 2026-07-27
NVD: CVE-2026-45311
EUVD: EUVD-2026-32965

Affected Vendors & Products

Vendor	Product	Version / Range
codewhale	codewhale	to 0.8.23 (inc)
codewhale	codewhale	From 0.3.0 (inc) to 0.8.23 (exc)

CWE

CWE ID	Description
CWE-94	The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.

Executive Summary

CVE-2026-45311 is a critical vulnerability in the run_tests tool of the CodeWhale project, affecting versions 0.3.0 up to, but not including, 0.8.23. The tool executes cargo test with automatic approval, meaning it runs tests without asking the user for permission.

Since cargo test compiles and runs arbitrary code such as test binaries, build scripts, and procedural macros, this auto-approval allows malicious test code embedded in a repository to execute arbitrary shell commands, steal credentials, or establish persistence on the system without any user interaction.

The risk is amplified by the AGENTS.md file, which is auto-loaded into the system prompt and can instruct the model to run tests proactively at session start. A session opened in a malicious repository can therefore lead to code execution through that repository's test files.

The vulnerability is classified as a code injection weakness (CWE-94) and has been fixed in version 0.8.23 by requiring explicit user approval before running tests.

Detection Guidance

This vulnerability involves the `run_tests` tool executing `cargo test` with auto-approval, allowing arbitrary code execution without user consent. To determine whether a system is vulnerable, check the installed CodeWhale version and whether `run_tests` runs tests without an approval prompt.

Check whether the installed version of CodeWhale falls in the affected range: 0.3.0 or later, and earlier than 0.8.23.
Check whether the `run_tests` tool runs with `ApprovalRequirement::Auto` by reviewing its configuration or source code.
Look for `AGENTS.md` files in repositories that might instruct the model to run tests automatically.
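
The version check in the first item can be sketched as a small helper. This is a minimal sketch, not an official detection tool: the function names `parse_version` and `is_affected` are hypothetical, and the range follows the description (fixed in 0.8.23, so 0.8.23 itself is treated as not affected), even though one row of the affected-products table lists 0.8.23 as inclusive.

```python
# Assumed affected range, taken from the advisory description:
# 0.3.0 (inclusive) up to 0.8.23 (exclusive).
FIRST_AFFECTED = (0, 3, 0)
FIRST_FIXED = (0, 8, 23)


def parse_version(text: str) -> tuple:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (optional leading 'v').

    Pre-release or build suffixes are not handled and raise ValueError.
    """
    parts = text.strip().lstrip("v").split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return (major, minor, patch)


def is_affected(version: str) -> bool:
    """Return True if the version lies in the assumed affected range."""
    return FIRST_AFFECTED <= parse_version(version) < FIRST_FIXED
```

The version string passed in would typically come from whatever reports the installed version on the system being checked; tuple comparison gives the expected ordering for plain numeric versions.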

While no specific detection commands are provided, you can use commands like the following to gather relevant information:

Check CodeWhale version: `codewhale --version` or check package versions if installed via package managers.
Search for `AGENTS.md` files: `find /path/to/codewhale/workspace -name AGENTS.md`
Review running processes or logs to detect if `run_tests` is executing tests automatically without user prompts.
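
As a complement to the `find` command above, the following sketch lists `AGENTS.md` files under a workspace and flags those whose text appears to tell the agent to run tests. The phrase list in `TEST_HINT` is a hypothetical heuristic chosen for illustration, not a signature from the advisory, and a clean result does not prove a file is harmless.

```python
import re
from pathlib import Path

# Hypothetical heuristic: phrases suggesting the file asks the agent to run tests.
TEST_HINT = re.compile(
    r"\b(run_tests|cargo\s+test|run\s+(the\s+)?tests?)\b", re.IGNORECASE
)


def find_agents_files(workspace: str) -> list:
    """Return every AGENTS.md file under the workspace, sorted by path."""
    return sorted(p for p in Path(workspace).rglob("AGENTS.md") if p.is_file())


def mentions_test_execution(path: Path) -> bool:
    """Return True if the file contains one of the heuristic phrases."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return TEST_HINT.search(text) is not None
```

Flagged files are candidates for manual review; the instruction to run tests can be phrased in ways this pattern will not catch.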

Impact Analysis

This vulnerability can have severe impacts including unauthorized execution of arbitrary code on your system, which can lead to credential theft, unauthorized persistence, and full system compromise.

Because the attack can be triggered remotely by a malicious repository, it poses a high risk of remote code execution without user consent.

The CVSS score of 9.6 reflects the high severity, indicating that confidentiality, integrity, and availability of your system can be significantly affected.

Compliance Impact

This vulnerability allows malicious test code to execute arbitrary shell commands, exfiltrate credentials, and establish persistence without user approval. Such unauthorized data exfiltration and execution of arbitrary code can lead to breaches of confidentiality, integrity, and availability of sensitive data.

Because of these impacts, organizations using affected versions of CodeWhale may face challenges in maintaining compliance with data protection regulations such as GDPR and HIPAA, which require strict controls over data access, processing, and breach prevention.

Specifically, the risk of credential exfiltration and unauthorized code execution could result in unauthorized access to personal or protected health information, potentially leading to regulatory violations and associated penalties.

Mitigation Strategies

The primary mitigation is to update CodeWhale to version 0.8.23 or later, where the vulnerability is fixed by requiring explicit user approval before executing tests.
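
The difference between auto-approval and explicit approval can be illustrated with a conceptual model. This is a Python sketch of the idea only, not CodeWhale's Rust implementation: apart from the names `ApprovalRequirement`, `Auto`, and `run_tests`, which appear in the advisory, everything here (the `REQUIRED` variant, `run_tool`, and the `ask_user` callback) is hypothetical.

```python
from enum import Enum
from typing import Callable, Optional


class ApprovalRequirement(Enum):
    AUTO = "auto"          # tool runs without asking (the vulnerable behavior)
    REQUIRED = "required"  # hypothetical variant: the user must confirm first


def run_tool(
    name: str,
    requirement: ApprovalRequirement,
    action: Callable[[], str],
    ask_user: Callable[[str], bool],
) -> Optional[str]:
    """Run the action, asking for confirmation first when approval is required.

    Returns the action's result, or None if the user declined.
    """
    if requirement is ApprovalRequirement.REQUIRED and not ask_user(name):
        return None
    return action()
```

With `AUTO`, the `ask_user` callback is never consulted, which mirrors why test code in an untrusted repository could run with zero approval.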

If updating immediately is not possible, you should disable or restrict the use of the `run_tests` tool to prevent automatic execution of tests without user approval.

Additionally, review and remove any untrusted or suspicious `AGENTS.md` files that could instruct the system to run tests proactively.

Consider auditing repositories for malicious test code that could execute arbitrary shell commands.
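
A repository audit could start from a sketch like the one below, which lists the places where code runs during `cargo test`: `build.rs` scripts, crates that declare `proc-macro = true`, and Rust sources that spawn processes. The function name `audit_repo` and the patterns are assumptions made for illustration. The scan is a rough triage aid: it misses build scripts declared under another name via the `build` key in `Cargo.toml`, and malicious code does not need to spawn a process to do harm.

```python
import re
from pathlib import Path

# Matches `proc-macro = true` (or `proc_macro = true`) in a Cargo.toml.
PROC_MACRO = re.compile(r"^\s*proc[-_]macro\s*=\s*true", re.MULTILINE)
# Matches common ways Rust code spawns an external process.
PROCESS_SPAWN = re.compile(r"std::process::Command|Command::new")


def audit_repo(root: str) -> dict:
    """Collect repository-relative paths that deserve review before testing."""
    base = Path(root)
    report = {"build_scripts": [], "proc_macro_crates": [], "process_spawns": []}
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(base).as_posix()
        if path.name == "build.rs":
            report["build_scripts"].append(rel)
        if path.name == "Cargo.toml" or path.suffix == ".rs":
            text = path.read_text(encoding="utf-8", errors="replace")
            if path.name == "Cargo.toml" and PROC_MACRO.search(text):
                report["proc_macro_crates"].append(rel)
            if path.suffix == ".rs" and PROCESS_SPAWN.search(text):
                report["process_spawns"].append(rel)
    return report
```

Each reported path is a starting point for manual review, not a verdict; many legitimate crates use build scripts and procedural macros.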
