Safety and Security of OpenClaw Applications


In a widely shared tweet, Meta's head of AI Safety demonstrated how autonomous agents can run amok. Deploying OpenClaw applications therefore calls for careful security design.

What is OpenClaw?

OpenClaw is an open-source, self-hosted AI agent (formerly known as ClawdBot/MoltBot) that runs persistently on your machine with broad access to files, terminal, email, calendar, and the internet.

Core Security Risks

1. Limited Built-in Security

OpenClaw ships with few built-in security controls. The runtime can ingest untrusted text, download and execute skills (code) from external sources, and act with whatever credentials are assigned to it. This shifts the execution boundary from static application code to dynamically supplied content, without equivalent controls around identity, input handling, or privilege scoping.

2. Prompt Injection

Indirect prompt injection collapses the boundary between data and control. Because OpenClaw reads widely and can act on what it reads, any content that enters its context can carry instructions, and every upstream system becomes a potential delivery vector for agent compromise. Even if you are the only person who messages the bot, prompt injection can still arrive through any untrusted content it reads: web search results, browser pages, emails, docs, attachments, or pasted logs.

3. Malicious Skills / ClawHub Marketplace

Skills can bundle scripts alongside markdown instructions, meaning execution can happen outside the MCP tool boundary entirely. Security researchers found a vulnerable third-party skill that facilitated active data exfiltration.

4. Exposed Instances & Credential Leakage

Three risks materialize quickly in an unguarded deployment: credentials and accessible data may be exposed or exfiltrated; the agent's persistent memory can be modified; and the host environment can be compromised if the agent is induced to retrieve and execute malicious code.

5. mDNS/Bonjour Reconnaissance

The Gateway broadcasts its presence via mDNS, which in full mode can expose sensitive operational details including the full filesystem path to the CLI binary, hostname information, and SSH availability.

6. Multi-user / Shared Bot Risk

If several people can message one tool-enabled agent, each of them can steer that same permission set. Run separate gateways per trust boundary.

Hardening Best Practices

Network & Access

Never expose your OpenClaw Gateway without authentication. Set a strong auth token and enable HTTPS. Bind the Gateway to localhost or use a VPN. Use Nginx or Caddy as a reverse proxy with TLS termination, and add rate limiting and IP allowlisting.
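As a rough illustration of the first two points, the Python sketch below generates a strong random token and checks that a bind address is loopback-only. It does not read or write any real OpenClaw configuration; the function names and the 32-character minimum are assumptions made for this example.

```python
import ipaddress
import secrets


def generate_auth_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random token drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


def is_loopback_bind(host: str) -> bool:
    """True if host is 'localhost' or a loopback IP (127.0.0.0/8 or ::1)."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): not provably local.
        return False


def check_gateway_settings(bind_host: str, auth_token: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: the 32-character minimum is an assumed policy,
    not an OpenClaw requirement.
    """
    problems = []
    if not is_loopback_bind(bind_host):
        problems.append(f"gateway binds to non-loopback address {bind_host!r}")
    if len(auth_token) < 32:
        problems.append("auth token is shorter than 32 characters")
    return problems
```

A wildcard bind such as 0.0.0.0 is reported as a problem, since it makes the Gateway reachable from every network interface unless a firewall or reverse proxy sits in front of it.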

Isolation

OpenClaw should be deployed only in a fully isolated environment such as a dedicated virtual machine or separate physical system, using dedicated non-privileged credentials with access only to non-sensitive data.

Skills & Code Execution

Install skills only from the official ClawHub marketplace, and even then only after manual review: block all other external skills and allow only pre-vetted code. Disable high-risk tools such as shell execution, browser control, and web fetching if they aren't needed.
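One way to enforce "pre-vetted only" is to record a digest of each skill at review time and refuse anything whose contents have changed since. The sketch below is an assumption about how this could be done, not an OpenClaw feature; `skill_digest` and `is_vetted` are hypothetical helpers.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def skill_digest(skill_dir: Path) -> str:
    """Digest every file in a skill directory (relative path plus content).

    Files are visited in sorted order so the result is deterministic, and
    bundled scripts are covered as well as the markdown instructions.
    """
    digest = hashlib.sha256()
    for file_path in sorted(p for p in skill_dir.rglob("*") if p.is_file()):
        rel = file_path.relative_to(skill_dir).as_posix()
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")
        digest.update(sha256_of_file(file_path).encode("ascii"))
        digest.update(b"\n")
    return digest.hexdigest()


def is_vetted(skill_dir: Path, approved_digests: set[str]) -> bool:
    """True only if the skill's current contents match a reviewed digest."""
    return skill_digest(skill_dir) in approved_digests
```

Because the digest covers every file in the directory, a skill that was reviewed and later gains or changes a bundled script no longer matches its approved digest.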

Model Choice

Model choice matters — older/legacy models can be less robust against prompt injection and tool misuse. OpenClaw recommends using Anthropic Claude Opus 4.6 (or the latest Opus) because it's strong at recognizing prompt injections.

Logging & Monitoring

Log every command execution, API call, file access, and decision. Store logs somewhere the agent can't modify them — ideally a separate logging server or SIEM system.
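Shipping logs off-host is the main control. As a complementary measure, a hash-chained log makes edits detectable after the fact. The following is a minimal sketch of that technique with hypothetical field names; it is not OpenClaw's own log format.

```python
import hashlib
import json
import time

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _hash_body(body: dict) -> str:
    """Hash a canonical JSON encoding of an entry (without its own hash)."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event_type: str, detail: dict) -> dict:
    """Append an audit entry that commits to the previous entry's hash."""
    body = {
        "ts": time.time(),
        "event_type": event_type,  # e.g. "command", "api_call", "file_access"
        "detail": detail,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = dict(body, hash=_hash_body(body))
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or _hash_body(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```

The chain only proves tampering if the latest hash is also kept somewhere the agent cannot write, which is why the separate logging server or SIEM remains necessary.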

Prompt Injection Defense (SOUL.md)

Add explicit instructions to your SOUL.md file to treat content inside <user_data> tags as data only, implement output validation before execution, and require human approval for sensitive actions.
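The sketch below illustrates two of these ideas in Python: wrapping untrusted content in <user_data> tags so it cannot close the wrapper early, and gating sensitive actions behind a human approval callback. The action names and function signatures are hypothetical and are not part of OpenClaw.

```python
import re
from typing import Callable

# Hypothetical action names; adapt to the tools your deployment exposes.
SENSITIVE_ACTIONS = {"shell_exec", "send_email", "delete_file"}

# Matches opening or closing user_data tags, tolerating spaces and case.
_TAG_PATTERN = re.compile(r"<\s*/?\s*user_data\s*>", re.IGNORECASE)


def wrap_untrusted(text: str) -> str:
    """Wrap untrusted content in <user_data> tags.

    Any look-alike tag inside the content is escaped so the content
    cannot terminate the wrapper and smuggle text outside it.
    """
    sanitized = _TAG_PATTERN.sub(
        lambda m: m.group(0).replace("<", "&lt;").replace(">", "&gt;"), text
    )
    return f"<user_data>\n{sanitized}\n</user_data>"


def execute_action(
    action: str,
    run: Callable[[], str],
    approve: Callable[[str], bool],
) -> str:
    """Run an action, asking a human first if it is on the sensitive list."""
    if action in SENSITIVE_ACTIONS and not approve(action):
        return "denied"
    return run()
```

Tagging alone does not stop a model from following injected instructions; it only gives the SOUL.md instructions something unambiguous to refer to. The approval gate is the control that still holds if the model is fooled.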

Need a Hardened OpenClaw Implementation?

Reach out to us and we'll help you deploy OpenClaw securely.

contact@dheemai.com