SecurityAI AgentsSandboxingKernel EnforcementOpen Source

Why I built nono

Luke Hinds-February 12, 2026

I've been here before.

A few years ago, I watched the software industry wake up to a problem it had been ignoring for decades: the software supply chain. Developers were pulling thousands of dependencies into their projects with zero verification. No signatures. No provenance. No way to know if the package you downloaded was the package the author published. Open source packages were getting typo-squatted, backdoored, and compromised at an alarming rate. And the industry was just... accepting as a fact of life, and the security industry was just... writing blog posts about it.

While at Red Hat, I got asked by Steve Watt who ran the emerging technology group in Red Hat, to look at what we could do about software supply chain security. I spent a good few days mulling over the problem, and an architecture started to take shape. It was a simple idea, really: what if we could make it easy for developers to sign their packages cryptographically, and for users to verify those signatures automatically?

What if we could build a system that made signing and verification free, easy, and default? From there I sought out others with a simple prototype I built and was lucky to be joined by some very smart folks (Dan Lorenc, now the CEO of Chainguard, and Bob Callaway, of the Google open source security team) and together we went on to build what would become Sigstore.

Well, that feeling is back. And this time it's about AI agents. In fact the whole of AI, but especially where it's heading with agents. The technology is moving fast, the potential is huge, but the security implications are being ignored at an alarming rate and are going to bite us hard if we don't do something about it, and soon.

The OpenClaw Wake-Up Call

The moment that crystallised everything for me was watching what happened with OpenClaw.

OpenClaw gained 180,000 GitHub stars in a week. It changed its name three times in three days. And within that same window, security researchers found over 1,800 exposed instances via Shodan - servers with no authentication, leaking API keys, chat histories, and credentials to anyone who looked. Three critical CVEs dropped in three days, including a one-click RCE that let an attacker steal your token, connect via WebSocket, and disable confirmation prompts just by getting you to visit a web page.

Then came ClawHub, the skills marketplace. Researchers found 386 malicious skills masquerading as crypto trading tools, all deploying the same infostealer to harvest SSH keys, wallet credentials, and browser passwords. The barrier to publishing a skill? A markdown file and a week-old GitHub account.

None of this required a sophisticated attack. The agent had broad system access. The skills were treated as trusted configuration. Nobody had drawn a line between what the agent needed and what it could reach.

I don't say this to single out OpenClaw - the whole industry faces the same structural problem. OpenClaw just moved fast enough to make the consequences visible.

The .md File Is the New Attack Surface

Somewhere in the last eighteen months, markdown stopped being documentation and quietly became a control plane.

Look at how modern AI agent frameworks work. You drop a SKILLS.md into your project root and it tells the agent what it can do - what tools to call, what files to read, how to behave. You add a RULES.md and it sets the boundaries. A CLAUDE.md, an AGENTS.md, a PROMPTS.md - pick your flavour. The pattern is the same: a plain text file, written in natural language, that an autonomous process treats as executable instruction.

Let that sit for a moment. We are configuring autonomous systems with prose.

In traditional software, configuration has structure. YAML, TOML, JSON - these formats have schemas, parsers, and validators. If a malformed instruction makes it into your Kubernetes manifest, the parser rejects it. There's a clear line between what the system accepts and what it doesn't. Markdown has none of this. There's no schema to validate against, no parser to reject malformed input. A legitimate instruction and an injected one are syntactically identical. They're both just sentences.

And here's what makes the SKILLS.md pattern particularly dangerous: the instruction and the data live in the same artifact.

A skill file doesn't just tell the agent what to do - it often includes example data, templates, context, sample outputs. The control plane and the data plane aren't merely in the same channel (the model's context window, which is already problematic). They're in the same file. An attacker doesn't need to poison a separate document that the agent might fetch later. They just need to get a few lines into a file the agent already trusts as its own configuration.

This is worse than traditional prompt injection, and it's worth understanding why.

With a standard prompt injection, the malicious payload arrives through a channel the agent knows is external - a user message, a fetched web page, a document it's been asked to summarise. A well-designed agent can, in theory, treat that content with some scepticism. It can apply filters. It can flag anomalies.

A skill file gets no such scrutiny. It's loaded before the conversation starts. It's treated as part of the agent's identity - its capabilities, its permissions, its operational boundaries. From the model's perspective, a skill definition is self. It's not external input to be examined; it's the ground truth about what the agent is and what it's allowed to do.

So when an attacker injects instructions into a skill file - or into any markdown file that gets loaded as configuration - those instructions arrive pre-trusted. The model doesn't see a boundary being crossed, because architecturally, there is no boundary. The model's context window is a single, undifferentiated stream of tokens. System prompts, tool definitions, skill files, and user-supplied content are all serialised into the same payload. The model has no mechanism to distinguish "follow this" from "process this." They're just tokens.

This is the control plane / data plane separation problem, and it's one that decades of software architecture learned to solve the hard way. We learned - through SQL injection, through cross-site scripting, through every category of injection attack in the OWASP Top Ten - that when a system can't tell instructions from data, attackers will exploit the ambiguity. Every time.

The .md file is the new <script> tag. And right now, the industry is treating it like it's harmless.

The Same Problem, a Different Era

What struck me wasn't just the OpenClaw incidents themselves - it was how familiar the pattern felt.

With software supply chains, the core issue was trust without verification. We trusted packages because they had the right name. We trusted registries because everyone else did. We trusted the build process because... well, we just did.

With AI agents, the core issue is access without boundaries. We give agents our full filesystem permissions because that's how Unix works. We give them network access because they need to call APIs. We give them access to our SSH keys, our cloud credentials, our shell history, our browser cookies - not because they need any of that, but because we haven't built the tooling to say "you can have this, but not that."

And just like with supply chain security, the longer we wait to fix this, the worse the incidents get.

Building nono

I started building nono because I wanted something that was simple, opinionated, and impossible to bypass.

The key insight was that security for AI agents can't live at the application layer. If the agent is running inside the application, then the application's security controls are just suggestions. A sufficiently capable model, a well-crafted prompt injection, or even a simple hallucination can route around any filter that lives in the same process.

Real enforcement has to come from the operating system kernel. The kernel doesn't care what the agent thinks it's allowed to do. The kernel doesn't respond to jailbreaks. The kernel doesn't have a system prompt you can override. It enforces access control absolutely, and once a sandbox is applied, it's irreversible. There's no API call to undo it. There's no escape hatch.

That's the foundation nono is built on. On Linux, it uses Landlock - a security module built into the kernel since version 5.13. On macOS, it uses Apple's Seatbelt sandbox framework. Both are battle-tested, both are maintained by their respective OS vendors, and both enforce access control at a level that no userspace process can circumvent.

The philosophy is simple: deny by default, allow explicitly. When you run an agent through nono, it starts with nothing. No file access. No network. No ability to run dangerous commands. You grant exactly what the agent needs - read access to your project directory, write access to an output folder, network access if the agent requires it - and everything else stays locked. Your SSH keys, your cloud credentials, your shell history: blocked automatically, without configuration.

In practice, it looks like this:

bash

# Before: agent has full access to everything
$ agent

# After: agent can only touch your project directory
$ nono run --allow ./project -- agent

One line. That's the difference between an agent that can read system files, your home drive and often a whole lot more, and one that's confined to operate exactly where it needs to be.

What Comes Next

nono today is a sandbox. It blocks what shouldn't happen. That's necessary, but it's not sufficient. There is a lot more to do, so we are busy seperating out the CLI from a core library, to allow other ecosystems to benefit from fast immediate sandboxing. We will over the course look to extend features beyond sandboxing, where relevant and suited.

But here's the thing I keep coming back to: nono is one tool solving one layer of the problem. Agent security needs the same ecosystem that software supply chain security eventually got - standards, shared infrastructure, community-maintained tooling. Sigstore didn't win because it was one project. It won because an entire community decided that signing and verification should be a default, not an afterthought. Agent security needs that same energy.

nono is an open source, kernel-enforced capability sandbox for AI agents. Star us on GitHub or visit nono.sh to get started.

Want to learn more about Always Further?

Come chat witg a founder! Get in touch with us today to explore how we can help you secure your AI agents and infrastructure.

Get in touch