Transformer Co-Author Illia Polosukhin Unveils IronClaw: A Rust-Based Fortress to Fix OpenClaw's Security Crisis

One of the architects of modern artificial intelligence is now tackling one of its most pressing security problems. Illia Polosukhin, co-author of the seminal 2017 paper "Attention Is All You Need" that introduced the Transformer architecture, has released IronClaw, a ground-up Rust rewrite of the OpenClaw protocol designed to stop the leakage of user credentials and API keys across the internet.

The OpenClaw Security Crisis

OpenClaw, the protocol enabling AI agents to interact with external tools and services, has become what security researchers call a "security dumpster fire." Over 25,000 public instances currently expose user data to the internet without adequate protection, enabling one-click remote code execution, prompt injection attacks, and malicious skill extraction.

Polosukhin, who also co-founded the blockchain platform NEAR Protocol, was an early OpenClaw adopter and acknowledges its transformative potential. "It has already changed the way I interact with computing," he noted in a Reddit discussion. Yet the architecture contains a fundamental flaw: when users give email bearer tokens or API keys to OpenClaw-based agents, that sensitive data travels directly to the servers of the Large Language Model (LLM) provider.

"Everything you have, including data you didn't explicitly authorize, could be accessed by any employee of these companies," Polosukhin explained. "It's not that these companies are malicious, but the reality is that users have no real privacy."

Four Layers of Defense

IronClaw addresses these vulnerabilities through a defense-in-depth architecture built on four critical layers:

Memory Safety by Design: Written entirely in Rust, IronClaw eliminates entire classes of traditional vulnerabilities, including the buffer overflows and use-after-free errors that commonly afflict C and C++ implementations.
WASM Sandbox Isolation: All third-party tools and AI-generated code execute within isolated WebAssembly containers. Even if a tool proves malicious, its destructive capability remains confined to the sandbox, unable to touch the host system.
Encrypted Credential Vault: API keys and passwords are encrypted with AES-256-GCM and bound to strict policies. Each credential is tied to specific domains, so a GitHub token cannot be exfiltrated to an unauthorized server, even if the LLM is compromised via prompt injection.
Trusted Execution Environment (TEE): Hardware-level isolation protects data during cloud deployment, preventing even cloud service providers from accessing sensitive user information.

The critical architectural difference: raw credentials never touch the language model. Credentials are injected into outgoing requests only at the network boundary, when the agent needs to communicate with an external service. If an attacker uses prompt injection to try to steal Google OAuth tokens, the credential layer rejects the request, logs the attempt, and alerts the user.
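The domain-binding idea can be illustrated with a minimal Rust sketch. It is not IronClaw's actual code: the `Vault`, `Credential`, and `InjectError` types and the `store` and `inject` methods are hypothetical names chosen for illustration, and encryption at rest (AES-256-GCM in the real project) is left out to keep the example dependency-free. The sketch shows only the policy check: a secret is released for an outgoing request only when the destination host is on that credential's allowlist, and a blocked attempt is recorded.

```rust
use std::collections::HashMap;

/// Hypothetical vault entry: a secret plus the hosts it may be sent to.
/// (A real vault would store the secret encrypted; omitted here.)
struct Credential {
    secret: String,
    allowed_hosts: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum InjectError {
    UnknownCredential,
    HostNotAllowed { name: String, host: String },
}

struct Vault {
    entries: HashMap<String, Credential>,
    /// Blocked attempts; a real system would also alert the user.
    audit_log: Vec<String>,
}

impl Vault {
    fn new() -> Self {
        Vault { entries: HashMap::new(), audit_log: Vec::new() }
    }

    fn store(&mut self, name: &str, secret: &str, allowed_hosts: &[&str]) {
        let cred = Credential {
            secret: secret.to_string(),
            allowed_hosts: allowed_hosts.iter().map(|h| h.to_string()).collect(),
        };
        self.entries.insert(name.to_string(), cred);
    }

    /// Called at the network boundary, after the model has produced a request
    /// that refers to the credential by name only. Returns the Authorization
    /// header value if, and only if, the destination host is allowed.
    fn inject(&mut self, name: &str, host: &str) -> Result<String, InjectError> {
        let cred = self.entries.get(name).ok_or(InjectError::UnknownCredential)?;
        if cred.allowed_hosts.iter().any(|h| h.eq_ignore_ascii_case(host)) {
            Ok(format!("Bearer {}", cred.secret))
        } else {
            self.audit_log.push(format!("blocked: {} -> {}", name, host));
            Err(InjectError::HostNotAllowed {
                name: name.to_string(),
                host: host.to_string(),
            })
        }
    }
}
```

In this sketch the model only ever sees the credential's name (for example `"github"`), never its value. A request to `api.github.com` receives the header, while a prompt-injected request to any other host fails and leaves an entry in `audit_log`.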

User-Owned AI Vision

IronClaw represents more than a security patch—it serves as the runtime layer for Polosukhin's broader "User-Owned AI" vision through NEAR Protocol. This ecosystem aims to let users maintain complete control over their data and digital assets while AI agents execute tasks within cryptographically verifiable environments.

NEAR has already deployed supporting infrastructure including confidential cloud computing platforms and decentralized GPU marketplaces. Polosukhin's team has also launched market.near.ai, a reputation-based marketplace where specialized agents can hire one another for complex workflows.

Deployment and Future Roadmap

Currently available on GitHub (version 0.15.0) with binaries for macOS, Linux, and Windows, IronClaw supports both local deployment and confidential cloud hosting. Polosukhin argues that pure local deployment faces practical limitations—agents stop when devices sleep, battery drain impacts mobile usage, and complex long-running tasks require persistent infrastructure.

The confidential cloud model offers what he describes as the optimal compromise: privacy guarantees approaching local execution combined with always-on availability. Users can even configure geographic security policies, automatically adding authentication barriers when traveling across borders.

Regarding persistent concerns about prompt injection—the industry-wide Achilles' heel of LLM security—Polosukhin acknowledges ongoing challenges. Current implementations use heuristic pattern detection, with plans to deploy updatable small language classifiers for injection identification. However, he concedes that sophisticated attacks targeting code repositories or communication tools require smarter behavioral analysis systems capable of reviewing agent intent without exposing input content.
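As a rough idea of what heuristic pattern detection can look like, the Rust sketch below flags inputs containing known injection phrases. It is an assumption-laden illustration, not IronClaw's detector: the phrase list and the function names `normalize` and `find_injection_markers` are invented for this example, and a fixed list like this is easy to evade, which is the limitation Polosukhin describes.

```rust
/// Hypothetical phrase list; a production detector would be far broader
/// and updatable without recompiling.
const SUSPICIOUS_PHRASES: [&str; 4] = [
    "ignore previous instructions",
    "ignore all previous instructions",
    "reveal your system prompt",
    "send your api key",
];

/// Lowercase the input and collapse runs of whitespace (including newlines)
/// into single spaces, so trivial spacing tricks do not hide a phrase.
fn normalize(input: &str) -> String {
    input
        .to_lowercase()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
}

/// Return every suspicious phrase found in the input, in list order.
/// An empty result means the heuristic found nothing; it does not
/// prove the input is safe.
fn find_injection_markers(input: &str) -> Vec<&'static str> {
    let text = normalize(input);
    SUSPICIOUS_PHRASES
        .iter()
        .copied()
        .filter(|phrase| text.contains(*phrase))
        .collect()
}
```

A caller might quarantine or ask for user confirmation on any tool output for which `find_injection_markers` returns a non-empty list, rather than passing that text straight to the model.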

The project will undergo professional red-team testing and security audits as the core codebase stabilizes. Polosukhin has invited the security community to contribute to what he calls the "ultimate interface between humans and everything online"—provided it remains secure.

Background and Context

Polosukhin's authority stems from his 2017 work at Google Research. Although he appears last on the Transformer paper's author list, a footnote states: "Equal contribution. Listing order is random." That same year he left Google to establish NEAR Protocol, and he has consistently argued that natural language would replace traditional coding interfaces.

Nine years later, as that prediction materializes through AI agents, Polosukhin is ensuring the infrastructure supporting that transition does not sacrifice user security for convenience.

Resources

GitHub Repository: https://github.com/nearai/ironclaw
Reddit AMA: https://www.reddit.com/r/MachineLearning/comments/1rlnwsk/d_ama_secure_version_of_openclaw/
