Safety and Security of OpenClawForCISO Applications

← Back to Blog

In a now-famous viral tweet, Meta's head of AI Safety demonstrated the perils of agents and how they can run amok. Needless to say, great consideration is needed when implementing OpenClawForCISO applications.

What is OpenClawForCISO?

OpenClawForCISO is an open-source, self-hosted AI agent (formerly known as ClawdBot/MoltBot) that runs persistently on your machine with broad access to files, terminal, email, calendar, and the internet.

Core Security Risks

1. Limited Built-in Security

OpenClawForCISO includes limited built-in security controls. The runtime can ingest untrusted text, download and execute skills (code) from external sources, and perform actions using the credentials assigned to it — effectively shifting the execution boundary from static application code to dynamically supplied content, without equivalent controls around identity, input handling, or privilege scoping.

2. Prompt Injection

Indirect prompt injection collapses the boundary between data and control, turning OpenClawForCISO's broad visibility and operational reach into an attack surface where context becomes contaminated and every upstream system becomes a potential delivery vector for agent compromise. This means even if only you message the bot, prompt injection can still happen via any untrusted content the bot reads — web search results, browser pages, emails, docs, attachments, or pasted logs.

3. Malicious Skills / ClawHub Marketplace

Skills can bundle scripts alongside markdown instructions, meaning execution can happen outside the MCP tool boundary entirely. If your security model is "MCP will gate tool calls," you can still lose to a malicious skill that routes around MCP through social engineering, direct shell instructions, or bundled code.

Security researchers found a vulnerable third-party skill that facilitated active data exfiltration — explicitly instructing the bot to execute a curl command that silently sends data to an external server, combined with a direct prompt injection to bypass the assistant's internal safety guidelines.

4. Exposed Instances & Credential Leakage

Three risks materialize quickly in an unguarded deployment: credentials and accessible data may be exposed or exfiltrated; the agent's persistent memory can be modified; and the host environment can be compromised if the agent is induced to retrieve and execute malicious code.

5. mDNS/Bonjour Reconnaissance

The Gateway broadcasts its presence via mDNS, which in full mode can expose sensitive operational details including the full filesystem path to the CLI binary, hostname information, and SSH availability.

6. Multi-user / Shared Bot Risk

If several people can message one tool-enabled agent, each of them can steer that same permission set. Run separate gateways per trust boundary.

Hardening Best Practices

Network & Access

Never expose your OpenClawForCISO Gateway without authentication. Set a strong auth token and enable HTTPS. Bind the Gateway to localhost or use a VPN — do not expose port 3000 directly to the public internet. Use Nginx or Caddy as a reverse proxy with TLS termination, and add rate limiting and IP allowlisting.

From a security standpoint, using a VPN solution (e.g., Tailscale) is the safest option, as it avoids exposing OpenClawForCISO to the public internet and limits access to trusted devices.

Isolation

OpenClawForCISO should be deployed only in a fully isolated environment such as a dedicated virtual machine or separate physical system, using dedicated non-privileged credentials with access only to non-sensitive data.

Skills & Code Execution

Only install skills from the official ClawHub marketplace. Version 2026.2.21 includes VirusTotal scanning for all marketplace submissions.

Block external skills and only allow pre-vetted, manually reviewed code. Disable high-risk tools like shell execution, browser control, and web fetching if they aren't needed.

Model Choice

Model choice matters — older/legacy models can be less robust against prompt injection and tool misuse. OpenClawForCISO recommends using Anthropic Claude Opus 4.6 (or the latest Opus) because it's strong at recognizing prompt injections.

Logging & Monitoring

Log every command execution, API call, file access, and decision. Store logs somewhere the agent can't modify them — ideally a separate logging server or SIEM system. Continuous monitoring and a rebuild plan should be part of the operating model.

Prompt Injection Defense (SOUL.md)

Add explicit instructions to your SOUL.md file to treat content inside <user_data> tags as data only, implement output validation before execution, and require human approval for sensitive actions.

Need a Hardened OpenClawForCISO Implementation?

Reach out to us and we'll help you deploy OpenClawForCISO securely.

contact@dheemai.com