CVE-2026-5970
Remote Code Injection in FoundationAgents MetaGPT HumanEvalBenchmark
Publication date: 2026-04-09
Last updated on: 2026-04-29
Assigner: VulDB
Description
CVE-2026-5970 is a code injection vulnerability in the aflow extension of FoundationAgents MetaGPT (deepwisdom/metagpt, versions up to and including 0.8.1). The HumanEvalBenchmark and MBPPBenchmark components execute LLM-generated solution and test code with Python's exec() in the global namespace, without effective sanitization or sandboxing, enabling remote code execution.
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| deepwisdom | metagpt | up to and including 0.8.1 |
Exploitability
| CWE ID | Description |
|---|---|
| CWE-74 | The product constructs all or part of a command, data structure, or record using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify how it is parsed or interpreted when it is sent to a downstream component. |
| CWE-94 | The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment. |
AI Powered Q&A
Can you explain this vulnerability to me?
CVE-2026-5970 is a critical remote code execution vulnerability in the MetaGPT project's aflow extension, specifically in the HumanEvalBenchmark and MBPPBenchmark components. It arises because the system unsafely executes code generated by large language models (LLMs) using Python's exec() function without proper sandboxing or sufficient sanitization.
The vulnerable functions, such as check_solution and exec_code, run LLM-generated solution and test code directly in the global namespace. Although a sanitize() function is applied, it fails to block dangerous imports like os or subprocess, allowing attackers to inject arbitrary commands.
Attackers can exploit this by manipulating the LLM output through prompt injection or poisoned datasets, leading to arbitrary code execution on the server running MetaGPT.
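To make the flaw concrete, the following is a minimal, hypothetical sketch of the vulnerable pattern. The function name check_solution echoes the advisory, but the blocklist and surrounding logic are simplified assumptions, not MetaGPT's actual code:

```python
# Hypothetical sketch of the vulnerable pattern; not MetaGPT's real code.
BLOCKED_TOKENS = ["eval(", "__import__"]  # naive blocklist (assumed)

def sanitize(code: str) -> str:
    # A token blocklist like this misses plain `import os` / `import subprocess`.
    for token in BLOCKED_TOKENS:
        if token in code:
            raise ValueError(f"blocked token: {token}")
    return code

def check_solution(solution: str, test: str) -> None:
    # exec() in the module's global namespace gives injected code access
    # to everything the host process can reach (env vars, files, network).
    exec(sanitize(solution), globals())
    exec(sanitize(test), globals())

# An LLM response poisoned via prompt injection slips straight through:
malicious = "import os\nos.system('touch /tmp/simple_rce_proof')"
# check_solution(malicious, "assert True")  # would run the attacker's command
```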
How can this vulnerability impact me?
This vulnerability can lead to full system compromise by allowing attackers to execute arbitrary Python code on the host system. Potential consequences include:
- Theft of sensitive environment variables such as API keys.
- Unauthorized access to local files.
- Potential data destruction.
- Corruption of module state and interference with subsequent code evaluations within the same process.
The vulnerability is especially dangerous because the affected modules run automatically in optimization loops without human review, enabling attackers to execute malicious code without user interaction.
How can this vulnerability be detected on my network or system? Can you suggest some commands?
Detecting this vulnerability involves identifying whether a vulnerable MetaGPT version (up to and including 0.8.1) is installed and whether the vulnerable functions are invoked with untrusted input. Since the vulnerability arises from unsafe use of Python's exec() on LLM-generated code without proper sandboxing, monitoring execution of the affected functions (check_solution in HumanEvalBenchmark/MBPPBenchmark and exec_code in operator.py) is critical.
There are no explicit detection commands provided in the resources, but you can check for the presence of the vulnerable MetaGPT package and inspect running processes or logs for suspicious execution of Python code involving these modules.
- Check the installed MetaGPT version: `pip show metagpt` or `pip show metagpt | grep Version` (a programmatic version check is sketched after this list).
- Search for usage of the vulnerable functions in your codebase or the installed package: `grep -r 'check_solution' /path/to/metagpt` and `grep -r 'exec_code' /path/to/metagpt`
- Monitor runtime Python processes for exec() calls or suspicious subprocesses that may indicate code injection attempts.
- Look for unexpected file creations such as `/tmp/simple_rce_proof` which was used in the proof of concept to confirm exploitation.
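As referenced above, here is a hedged sketch of a programmatic version check. It assumes the package is installed under the name metagpt and that the third-party packaging library is available:

```python
# Sketch: flag installed metagpt versions in the affected range (<= 0.8.1).
from importlib.metadata import PackageNotFoundError, version
from packaging.version import Version  # assumes 'packaging' is installed

try:
    installed = Version(version("metagpt"))
except PackageNotFoundError:
    print("metagpt is not installed")
else:
    if installed <= Version("0.8.1"):
        print(f"metagpt {installed} is in the affected range (<= 0.8.1)")
    else:
        print(f"metagpt {installed} is outside the affected range")
```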
What immediate steps should I take to mitigate this vulnerability?
Immediate mitigation steps include isolating the execution environment of the vulnerable code and preventing untrusted code from running with full global access.
Specifically, the following actions are recommended based on the fix described:
- Update MetaGPT to a version that includes the fix which runs test code in a subprocess with an empty namespace to prevent access to sensitive globals.
- Ensure that LLM-generated code is executed in an isolated subprocess with a timeout to prevent infinite loops or crashes (a sketch of this pattern follows this list).
- Avoid running MetaGPT optimization loops or benchmarks automatically on untrusted or unreviewed datasets or prompts.
- If an immediate update is not possible, consider restricting or sandboxing the environment where MetaGPT runs, such as using containerization or limiting permissions to reduce impact.
- Monitor for suspicious activity such as unexpected file writes or network connections initiated by MetaGPT processes.
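The following is a minimal sketch of the subprocess-isolation pattern referenced above. It is an illustrative assumption, not the actual MetaGPT patch; the isolated-mode flag, stripped environment, and timeout are generic hardening choices:

```python
# Sketch: run untrusted, LLM-generated code in an isolated subprocess.
import os
import subprocess
import sys
import tempfile

def run_untrusted(code: str, timeout: float = 10.0) -> subprocess.CompletedProcess:
    # Write the code to a temp file and run it in a fresh interpreter,
    # rather than exec()-ing it inside the host process.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        return subprocess.run(
            [sys.executable, "-I", path],  # -I: isolated mode (no inherited sys.path)
            capture_output=True,
            text=True,
            timeout=timeout,  # raises subprocess.TimeoutExpired on infinite loops
            env={},  # strip API keys and other secrets; some platforms need a minimal env
        )
    finally:
        os.unlink(path)
```

Process isolation alone is not a full sandbox; containerization and reduced permissions, as recommended above, remain important defense in depth.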
How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?
The vulnerability in MetaGPT's aflow extension allows attackers to execute arbitrary code on the host system, leading to full system compromise and theft of sensitive environment variables such as API keys. This unauthorized access and potential data theft could result in violations of data protection regulations like GDPR and HIPAA, which mandate strict controls over personal and sensitive data.
Because the vulnerability enables remote code execution without user interaction and can expose sensitive information, it undermines confidentiality and integrity requirements central to compliance frameworks. Organizations using affected versions of MetaGPT may face increased risk of data breaches, unauthorized data access, and potential regulatory penalties if such vulnerabilities are exploited.
In particular, exploitation could lead to unauthorized disclosure or alteration of protected health information (PHI) under HIPAA or of personal data under GDPR.