In the OpenClaw ecosystem, where local-first AI assistants and plugin-driven automation thrive, the announcement of Anthropic’s Project Glasswing serves as a critical case study. This initiative restricts access to Claude Mythos, a new model with advanced cybersecurity capabilities, to a select group of security researchers. For OpenClaw users and developers, this move highlights the urgent need to build robust safeguards into local AI frameworks, ensuring that powerful agent tools can be deployed safely without compromising system integrity.
Anthropic did not release Claude Mythos publicly but made it available under Project Glasswing to preview partners. The model is a general-purpose system similar to Claude Opus 4.6, but its cybersecurity research abilities are so potent that Anthropic argues the software industry requires time to prepare. Mythos Preview has already identified thousands of high-severity vulnerabilities across every major operating system and web browser. Given the rapid pace of AI advancement, such capabilities could soon proliferate beyond actors committed to safe deployment, posing significant risks to global infrastructure.
Project Glasswing partners will use Claude Mythos Preview to find and fix vulnerabilities in foundational systems, which represent a large portion of the world’s shared cyberattack surface. This work will focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing. For the OpenClaw community, this underscores the importance of integrating similar security-focused plugins and agent workflows that can operate within controlled, local environments to prevent misuse.
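One way to act on this in a local agent framework is to gate security tooling behind an explicit, user-controlled policy. The sketch below is purely illustrative (`ToolPolicy`, `is_allowed`, and the tool names are hypothetical, not part of any real OpenClaw API), but it shows the shape of an allowlist-plus-sandbox check that runs before any agent-invoked tool executes:

```python
# Hypothetical sketch of a local policy gate for agent-invoked security
# tools. All names here are illustrative, not a real OpenClaw interface.
from dataclasses import dataclass, field


@dataclass
class ToolPolicy:
    """Local allowlist for security tools an agent may invoke."""
    allowed_tools: set[str] = field(default_factory=set)
    require_sandbox: bool = True

    def is_allowed(self, tool: str, sandboxed: bool) -> bool:
        # Deny anything not explicitly allowlisted.
        if tool not in self.allowed_tools:
            return False
        # Refuse to run security tooling outside a sandbox when required.
        if self.require_sandbox and not sandboxed:
            return False
        return True


policy = ToolPolicy(allowed_tools={"static-scan", "dep-audit"})
print(policy.is_allowed("static-scan", sandboxed=True))   # True
print(policy.is_allowed("pen-test", sandboxed=True))      # False: not allowlisted
print(policy.is_allowed("dep-audit", sandboxed=False))    # False: sandbox required
```

The design choice is deny-by-default: a tool runs only if the user has both allowlisted it and satisfied the sandbox requirement, which mirrors the controlled-environment posture described above.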
Technical details from Anthropic’s Red Team blog reveal Mythos Preview’s prowess. In one instance, it wrote a web browser exploit chaining four vulnerabilities, using a complex JIT heap spray to escape both the renderer and OS sandboxes. It autonomously developed local privilege-escalation exploits on Linux and other operating systems by exploiting subtle race conditions and bypassing KASLR. It also wrote a remote code execution exploit against FreeBSD’s NFS server that granted unauthenticated users full root access via a 20-gadget ROP chain split across multiple packets.
Internal evaluations show a stark contrast with Claude Opus 4.6, which had a near-0% success rate at autonomous exploit development. For example, it turned vulnerabilities in Mozilla’s Firefox 147 JavaScript engine—all patched in Firefox 148—into working JavaScript shell exploits only twice in several hundred attempts. Mythos Preview, by contrast, developed working exploits 181 times and achieved register control on 29 more in the same benchmark. This leap in capability signals a shift that OpenClaw must address by embedding safety protocols into its agent automation tools.
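To put those benchmark figures side by side, here is a back-of-envelope comparison. The exact attempt count is only reported as "several hundred", so `N = 300` below is an assumption for illustration, not a figure from Anthropic:

```python
# Back-of-envelope comparison of the benchmark numbers above.
N = 300                  # ASSUMED attempt count ("several hundred" in the source)
opus_successes = 2       # working exploits from Claude Opus 4.6
mythos_successes = 181   # working exploits from Mythos Preview
mythos_partial = 29      # additional cases with register control only

opus_rate = opus_successes / N
mythos_rate = mythos_successes / N

print(f"Opus 4.6:       {opus_rate:.1%}")                        # 0.7%
print(f"Mythos Preview: {mythos_rate:.1%}")                      # 60.3%
print(f"Ratio:          {mythos_successes / opus_successes:.1f}x")  # 90.5x
```

Even under a generous assumed denominator, the success ratio between the two models is roughly two orders of magnitude, which is the "leap in capability" the paragraph above describes.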
Anthropic’s caution with Mythos Preview is not merely a marketing tactic. Recent commentary from security professionals reflects growing concern about LLMs’ vulnerability research abilities. Greg Kroah-Hartman of the Linux kernel observed that AI-generated security reports were, until recently, mostly low-quality “slop,” but that something changed about a month ago: real, high-quality reports began flooding open-source projects. Daniel Stenberg of curl described the challenge as a transition from an “AI slop tsunami” to a “plain security report tsunami,” with many reports being genuinely good and demanding hours of attention every day.
Thomas Ptacek published “Vulnerability Research Is Cooked,” inspired by a podcast conversation with Anthropic’s Nicholas Carlini. In a Project Glasswing video, Carlini emphasized Mythos Preview’s ability to chain vulnerabilities, creating exploits from three to five issues that yield sophisticated outcomes. He stated, “I’ve found more bugs in the last couple of weeks than I found in the rest of my life combined.” The model scanned open-source code, starting with operating systems as the foundation of internet infrastructure.
For OpenBSD, Mythos Preview found a bug that had been present for 27 years: sending specific data could crash any OpenBSD server. On Linux, it discovered vulnerabilities that let users with no permissions escalate to administrator by running a binary. Maintainers were notified and have deployed patches, including an OpenBSD 7.8 reliability fix for TCP packets with invalid SACK options that could crash the kernel. A Linux NFS vulnerability was also recently covered by Michael Lynch.
This evidence suggests a significant shift, with coding agents powered by frontier LLMs tirelessly uncovering long-hidden issues. It points to an industry-wide reckoning, necessitating substantial investments to preempt a barrage of vulnerabilities. Project Glasswing includes $100 million in usage credits and $4 million in direct donations to open-source security organizations, with partners like AWS, Apple, Microsoft, Google, and the Linux Foundation. Involvement from OpenAI, whose GPT-5.4 is known for finding security vulnerabilities, would be beneficial.
For those outside the trusted partner circle, Anthropic states, “We do not plan to make Claude Mythos Preview generally available, but our eventual goal is to enable our users to safely deploy Mythos-class models at scale—for cybersecurity purposes, but also for the myriad other benefits that such highly capable models will bring.” To achieve this, progress is needed in developing cybersecurity safeguards that detect and block dangerous outputs. New safeguards will launch with an upcoming Claude Opus model, allowing refinement with a lower-risk system.
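The safeguards Anthropic describes are learned classifiers over model outputs; the toy sketch below only illustrates the underlying idea of screening a candidate response before it is returned. The `screen_output` function and its keyword rules are stand-ins invented for this example, nothing like a production safeguard:

```python
# Toy illustration of output-level screening: check a candidate model
# response against risk patterns before returning it. Real safeguards use
# trained classifiers; these keyword rules are illustrative stand-ins.
import re

RISK_PATTERNS = [
    re.compile(r"\brop\s+chain\b", re.IGNORECASE),
    re.compile(r"\bheap\s+spray\b", re.IGNORECASE),
    re.compile(r"\bprivilege\s+escalation\b", re.IGNORECASE),
]


def screen_output(text: str) -> tuple[bool, list[str]]:
    """Return (blocked, matched_patterns) for a candidate response."""
    hits = [p.pattern for p in RISK_PATTERNS if p.search(text)]
    return (len(hits) > 0, hits)


blocked, hits = screen_output("Here is a 20-gadget ROP chain for this target")
print(blocked)  # True
```

The point of launching such safeguards alongside a lower-risk Opus model, as the passage notes, is exactly this kind of refinement loop: the screening layer can be tuned against a system whose failures are recoverable before it fronts a Mythos-class model.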
In the OpenClaw ecosystem, this scenario reinforces the value of a local-first approach. By prioritizing on-device AI processing and secure plugin ecosystems, OpenClaw can mitigate risks while harnessing advanced capabilities. As AI models grow more powerful, integrating ethical guidelines and safety measures into agent workflows becomes paramount, ensuring that tools like Claude Mythos can be leveraged responsibly within controlled frameworks.


