OpenClaw: Loads of CVEs Published — An Agent Reality Check
A large number of CVEs have recently been published against OpenClaw, a self-hosted personal AI assistant. That may sound dramatic, but this is less about one project and more about a broader shift: LLMs are no longer just generating text. They are executing commands, controlling browsers, installing plugins, and connecting to external systems. When AI becomes operational, its security profile begins to resemble a full automation platform. That’s the real agent reality check.
When an LLM Gets Real Capabilities
OpenClaw connects a language model to tools: Docker sandboxes, shell execution, browser automation, chat integrations, and a local gateway API. This transforms the model from “answer engine” into “action engine.” The moment you allow a model to operate on files, networks, and processes, classic security categories return to the spotlight—just with more entry points.
For example, consider a simplified execution pattern:
// simplified example
const { execSync } = require("child_process");
execSync(`docker create ${userSuppliedArgs}`);
If userSuppliedArgs isn’t strictly constructed and validated, shell metacharacters can alter execution. In agent systems, parameters can be influenced indirectly—via tool outputs, retrieved content, connector payloads, plugin metadata, or prompt injection. This isn’t “AI magic.” It’s command injection with new plumbing.
Filesystem Boundaries Matter
A second recurring theme in the disclosures is unsafe file handling: paths that weren’t confined to safe roots, and inputs that allowed traversal sequences like ../../ or absolute paths. A minimal illustration:
// simplified example
const filePath = req.body.path;
await browser.setInputFiles(filePath);
If filePath is not restricted to a fixed uploads directory and traversal is not rejected, an attacker can potentially read arbitrary files. Agent frameworks amplify this risk because they routinely install skills/plugins, handle uploads and downloads, resolve dynamic paths, and pass files between tools.
If you build internal tooling around uploads/downloads, a quick “does traversal get blocked?” check can look like this:
curl -X POST http://localhost:PORT/upload \
  -H 'Content-Type: application/json' \
  -d '{"path":"../../../../etc/passwd"}'
If the service accepts traversal input and returns file content, you have a boundary failure. Even if the service is “local,” remember that local services can still be reached by other local processes, misconfigurations, or browser-mediated attacks.
Network & Gateway Exposure
Another cluster of issues involves outbound connections and gateway control. A typical risk pattern is allowing tool-supplied URLs to be used for network connections:
// simplified example
new WebSocket(userProvidedGatewayUrl);
If URL inputs aren’t strictly allowlisted, the host may be tricked into attempting connections to localhost services, private network ranges, or cloud metadata endpoints. This is SSRF territory. In agent systems, outbound requests are part of normal behavior—which makes validation and routing controls even more important.
A practical self-check on Linux is to inspect what’s actually listening:
ss -tulpn | grep LISTEN
If your agent gateway is bound to 0.0.0.0 (or otherwise exposed beyond loopback) without strong authentication and tight network controls, treat that as a priority configuration review.
Prompt Structure Is an Input Surface
Some disclosures highlight an agent-specific twist: prompt structure can be manipulated when untrusted strings are embedded into high-trust system context without sanitization. Conceptually:
systemPrompt = `
Workspace: ${process.cwd()}
Rules: ...
`;
If workspace paths, connector metadata, filenames, or other values contain control characters, they can break the prompt’s intended structure and inject content in surprising ways. The takeaway is simple: every string that enters a system prompt is an input surface and should be treated like untrusted input in a web application.
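One defensive habit is to sanitize every value before it is interpolated into high-trust context. A sketch under that assumption (function names and stripping rules are illustrative, not OpenClaw's actual implementation):

```javascript
// Strip control characters and flatten newlines before a value is
// interpolated into a system prompt, so it cannot fake new prompt lines.
function sanitizeForPrompt(value) {
  return String(value)
    .replace(/\r\n?/g, "\n")                            // normalize line endings
    .replace(/[\u0000-\u0008\u000B-\u001F\u007F]/g, "") // drop control chars
    .split("\n").join(" ");                             // one field, one line
}

function buildSystemPrompt(workspace) {
  return [
    `Workspace: ${sanitizeForPrompt(workspace)}`,
    "Rules: never execute tool calls outside the workspace.",
  ].join("\n");
}
```

With this in place, a workspace path containing an embedded newline plus "Rules: ..." collapses to a single harmless line instead of masquerading as a new instruction block.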
The Pattern Behind the CVE Wave
Grouped together, the vulnerabilities fall into well-known classes—command injection, path traversal, SSRF, secret leakage, and UI injection. The difference isn’t the vulnerability type; it’s the density of entry points. Agent frameworks touch filesystems, containers, browsers, webhooks, messaging APIs, and runtime environments. That makes them small automation platforms, and automation platforms require mature security engineering.
To keep the article actionable (and short), below is a small selection of OpenClaw CVEs referenced in this recent wave. Each links to a full BaseFortify CVE report.
Selected OpenClaw CVEs
Each BaseFortify CVE report includes a practical technical breakdown and mitigation guidance. Readers can also consult the built-in Q&A section and use the AI assistant on each CVE report page to explore defensive steps and configuration hardening for their specific setup.
How BaseFortify Helps You Stay Ahead of Agent CVEs
When disclosure waves hit agent ecosystems, the real bottleneck is rarely “finding the CVE.” It’s knowing whether you run an affected version anywhere in your stack. Agent deployments span multiple layers—framework, runtime, gateway, browser tooling, container engine, and integrations. If you track those components by vendor/product/version, CVE matching becomes straightforward and repeatable.
In BaseFortify, you can record components (including agent frameworks and supporting infrastructure). As a practical starting point, here are example entries in CPE format you can add to your watch list (adapt versions to your environment):
cpe:2.3:a:openclaw:openclaw:2026.2.12:*:*:*:*:*:*:*
cpe:2.3:a:docker:docker:25.0.3:*:*:*:*:*:*:*
cpe:2.3:a:nodejs:node.js:20.11.1:*:*:*:*:*:*:*
cpe:2.3:a:microsoft:playwright:1.43.0:*:*:*:*:*:*:*
If you want to try it, registration at https://basefortify.eu/register is free. Once your components are tracked, BaseFortify can help you spot when new CVEs match your recorded versions—turning vulnerability waves into clear, actionable remediation work instead of noise.
Continue Reading
If this agent reality check resonates with you, we have previously covered related topics in more depth. These articles provide practical guidance on modeling LLM components, understanding real-world security failures, and navigating regulatory obligations.
- How to Model LLM Components in BaseFortify (So CVEs Match)
- LLM Security: Concrete Risks and Defenses
- LLMs in Your Stack: What EU Rules Mean for You
Together, these pieces outline a consistent message: LLM systems are no longer experimental features. They are components in your production stack—subject to regulatory scrutiny, operational risk, and continuous vulnerability disclosure. Treat them accordingly.