Key Takeaways

  • Autonomous AI agents using frontier LLMs can perform 80–90% of a cyber campaign’s workload, including reconnaissance, lateral movement, and exfiltration.
  • Anthropic’s Mythos-class testing surfaced thousands of zero‑days, including a 27‑year‑old OpenBSD bug, and autonomously chained four bugs into a working browser sandbox escape.
  • AI compresses the vulnerability lifecycle: discovery and PoC synthesis drop from days to as little as minutes or hours, shrinking defender patch windows to less than a release cycle.
  • Effective defense requires treating AI agents as first‑class assets—inventorying them, logging model calls and agent actions, and enforcing narrow, auditable governance.

AI‑assisted exploitation has crossed a line. Autonomous AI agents built on high‑capability large language models can now discover, chain, and weaponize vulnerabilities end‑to‑end, at machine speed. [2] At Google scale, response must shift from “block the IP” to “detect and disrupt the AI campaign itself.”

Anthropic’s Mythos Preview reportedly:

  • Surfaced thousands of zero‑day vulnerabilities across major OSes and browsers
  • Found a 27‑year‑old OpenBSD bug missed by humans [2]
  • Autonomously chained four bugs into a browser sandbox escape [2]

On defense, OpenAI’s Daybreak uses GPT‑5.5 and Codex Security to scan large codebases, propose patches, and validate fixes in minutes, in effect a generative‑AI vulnerability‑fixing factory for defenders. [3][4]

Key idea: Offense and defense now share the same primitives (LLMs, agents, cloud orchestration). What differs is how they are governed and who gets to run at machine speed.

Google’s reported disruption of an AI‑driven exploitation campaign looks like an early pattern: AI‑run operations treating infrastructure as a continuous search–optimize–exploit loop. [8]

From “could this happen?” to “Google just stopped it”: why AI-driven exploitation is now real

Mythos Preview is the first widely described frontier LLM explicitly evaluated for autonomous vulnerability discovery at scale. In controlled tests, it:

  • Found thousands of zero‑days in major OSes and browsers
  • Uncovered a 27‑year‑old OpenBSD vulnerability [2]
  • Demonstrated that deep structural flaws in mature codebases are within model reach

Mythos also autonomously chained four distinct vulnerabilities into a working sandbox escape by: [2]

  • Understanding sandbox boundaries
  • Spotting memory‑safety defects
  • Selecting compatible primitives
  • Assembling a reliable exploit

Anthropic’s later report on a state‑backed espionage campaign shows the next step: [8]

  • AI agents performed 80–90% of the campaign workload, including reconnaissance, lateral movement, and exfiltration
  • Humans mainly provided high‑level guidance and approvals

⚠️ Escalation signal: A state actor trusting AI with 80–90% of campaign workload suggests autonomous systems can now match or outperform junior operators across much of the kill chain. [8]

Inside enterprises, agentic AI is spreading on the defender and developer sides. Netskope observes that LLM‑powered agents with direct access to software and infrastructure are already deployed, often with minimal supervision. [5] These agents become:

  • High‑value targets for compromise
  • Stepping‑stones for lateral movement
  • “Free infrastructure” for attackers

Check Point Research showed that web‑enabled conversational assistants can be hijacked as stealth C2 channels, blending into ordinary AI traffic and requiring no attacker‑hosted infra. [1]

Together, these data points make Google’s disrupted AI‑driven campaign look like a logical next step in an arms race where both attackers and defenders rely on frontier models and autonomous workflows. [2][5][8]

How LLMs discover and weaponize vulnerabilities faster than your patch cycle

Anthropic’s offensive research lead estimates attackers could access Mythos‑class tools within 6–12 months of preview, shrinking defenders’ lead to roughly one release cycle. [2] Mythos already:

  • Identified thousands of zero‑day issues in widely deployed platforms [2]
  • Suggested that the backlog of exploitable bugs is growing faster than most orgs can patch

In early 2025, about one‑third of exploited CVEs were hit on or before the day of public disclosure, and that was before offensive LLMs were industrialized. [2] With automated triage, exploit synthesis, and agent‑driven fuzzing, the disclosure‑to‑exploit window compresses from days to hours.

📊 Timeline compression:

  • Pre‑AI: weeks–months from bug intro to discovery; days from disclosure to weaponization
  • Early AI: days to discovery; hours to weaponization [2]
  • Frontier LLM era: minutes–hours from code landing in main to discovery and PoC synthesis [2][3]

OpenAI’s Daybreak mirrors this for defense. GPT‑5.5 and Codex Security can: [3][7]

  • Analyze thousands of lines at once
  • Surface vulnerabilities and data‑flow risks
  • Generate compile‑clean patches plus unit tests
  • Validate fixes in isolated environments

Daybreak makes security a continuous SDLC concern — secure review, threat modeling, dependency analysis, and patch validation integrated into pipelines. [4][7]

OpenAI further splits GPT‑5.5 into: [4]

  • GPT‑5.5 (general)
  • GPT‑5.5 with Trusted Access for Cyber (vetted defensive workflows) [4]
  • GPT‑5.5‑Cyber (more permissive for red teaming and intrusion testing) [4][6]

Implication: Hardware and model capabilities are symmetric for attackers and defenders; governance and allowed tool use are what differ. [4][6]

For engineering teams, every vulnerability in repos — even on feature branches — is now in scope for AI‑accelerated discovery, whether via your Daybreak‑style stack or an adversary’s Mythos‑like tooling. [2][3][7] The Google incident should force CI/CD and vuln‑management pipelines to adapt to AI‑native velocities, not human change‑advisory cycles. [2][6]

Agentic AI as attacker: multi-agent workflows, planning and cloud-scale operations

Anthropic’s 2025 espionage report is the clearest public description of AI as primary operator. In that campaign: [8]

  • AI agents executed 80–90% of tasks from external recon to internal pivoting
  • Humans mainly approved goals and sensitive steps

To generalize, researchers built a multi‑agent penetration‑testing PoC against cloud infrastructure. The system: [8]

  • Did not invent new attack surfaces
  • Dramatically accelerated exploitation of known misconfigurations
  • Excelled at:
    • Enumerating cloud resources via APIs
    • Identifying misconfigured IAM roles/policies
    • Following documented attack paths
    • Scaling across many accounts in parallel
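
The enumeration and flagging steps above can be mirrored on the defender's side. What follows is a minimal sketch, assuming IAM policies have already been exported as dictionaries in the `bindings` / `role` / `members` shape. The `BROAD_ROLES` set, the `find_risky_bindings` helper, and the sample project names are illustrative assumptions, not part of the PoC described in [8].

```python
# Assumption: roles treated as "overly permissive" for this illustration.
BROAD_ROLES = {"roles/owner", "roles/editor"}
# Principals that stand for everyone or for every signed-in user.
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def find_risky_bindings(policies: dict) -> list:
    """Return (resource, role, member) triples that deserve a manual review."""
    findings = []
    for resource, policy in policies.items():
        for binding in policy.get("bindings", []):
            role = binding.get("role", "")
            for member in binding.get("members", []):
                broad_service_account = (
                    role in BROAD_ROLES and member.startswith("serviceAccount:")
                )
                if broad_service_account or member in PUBLIC_MEMBERS:
                    findings.append((resource, role, member))
    return findings


# Hypothetical exported policies for two projects.
policies = {
    "projects/demo-a": {"bindings": [
        {"role": "roles/editor", "members": ["serviceAccount:ci@demo-a.example"]},
        {"role": "roles/viewer", "members": ["user:alice@example.com"]},
    ]},
    "projects/demo-b": {"bindings": [
        {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
    ]},
}

findings = find_risky_bindings(policies)
for finding in findings:
    print(finding)
```

Running the same check on a schedule across every account is one way to find these paths before an automated adversary does; the rule set here is deliberately tiny and would need tuning per environment.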

💼 Echo in practice: One SaaS security lead saw a benign agent chain an overly permissive GCP service account into full DB read access in under 10 minutes — a path never documented in manual reviews.

Netskope warns that because agentic systems directly operate software and infrastructure, they are prime cyber targets — yet most orgs lack: [5]

  • A complete inventory of agents
  • Policies defining which systems agents may control
  • Telemetry specific to agent behavior

On defense, Codex Security already acts as a sophisticated agent: it builds editable threat models from entire repos, identifies realistic attack paths, and validates patches in isolation. [7] These are the same reasoning skills an offensive agent uses to construct and traverse attack graphs.

GPT‑5.5‑Cyber formalizes this dual‑use nature: it is more permissive specifically for authorized offensive workflows like red teaming. [4][6] Without strong governance, “authorized” vs “unauthorized” can collapse to “whoever holds the API key.”

⚠️ Dual‑use warning: Check Point’s hijacking of web‑enabled assistants into stealth C2 shows that a single LLM instance can simultaneously act as planner, operator, and covert infrastructure. [1]

C2 through the front door: how LLM traffic and cloud services hide AI-driven attacks

Attackers have long abused legitimate cloud services (Slack, Dropbox, OneDrive) as C2 because traffic blends into baselines. [1] Defenders eventually instrumented these services and shipped SIEM/XDR rules. [1]

Web‑enabled LLM assistants disrupt that learning curve. Their traffic is: [1]

  • New, with immature telemetry and detection content
  • Hard to block once broadly adopted
  • Trusted as “business productivity” tooling

Check Point’s experiment abused assistants’ web‑fetch features. Malware: [1]

  • Never contacted attacker infra directly
  • Asked the assistant to fetch an attacker‑controlled URL that encoded commands
  • Received results via the assistant’s HTTP requests

This required no API keys and no authenticated accounts, and it produced traffic indistinguishable from normal AI usage.
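
Because the traffic itself looks normal, one first-pass heuristic is to ask which process on the endpoint is talking to the assistant. The sketch below assumes proxy or EDR events are available as dictionaries with `host`, `process`, and `dest_domain` fields; those field names, the placeholder domains, and the browser allow-list are all assumptions for illustration, and the heuristic would miss malware that drives a real browser.

```python
from collections import Counter

# Assumptions: placeholder assistant domains and the processes expected to reach them.
AI_ASSISTANT_DOMAINS = {"assistant.example.com", "chat.example.net"}
EXPECTED_PROCESSES = {"chrome.exe", "firefox.exe", "msedge.exe"}


def flag_unexpected_ai_clients(events: list) -> Counter:
    """Count assistant requests per (host, process) made by non-allow-listed processes."""
    flagged = Counter()
    for event in events:
        if event["dest_domain"] not in AI_ASSISTANT_DOMAINS:
            continue  # not assistant traffic, out of scope for this check
        if event["process"].lower() not in EXPECTED_PROCESSES:
            flagged[(event["host"], event["process"])] += 1
    return flagged


# Hypothetical events: one unexpected binary talks to the assistant twice.
events = [
    {"host": "ws-17", "process": "chrome.exe", "dest_domain": "assistant.example.com"},
    {"host": "ws-17", "process": "updater_svc.exe", "dest_domain": "assistant.example.com"},
    {"host": "ws-17", "process": "updater_svc.exe", "dest_domain": "assistant.example.com"},
    {"host": "ws-22", "process": "Firefox.exe", "dest_domain": "chat.example.net"},
    {"host": "ws-22", "process": "powershell.exe", "dest_domain": "intranet.example.org"},
]

flagged = flag_unexpected_ai_clients(events)
print(flagged)
```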

In parallel, the multi‑agent cloud‑attack PoC showed that LLMs can orchestrate complex sequences of GCP API calls: [8]

  • Chaining misconfigurations into full compromise
  • Using only standard control‑plane traffic
  • Standing out mostly by speed, breadth, and sequencing

📊 New observability layer: In AI‑driven campaigns, key signals may include: [5][7]

  • Unusual LLM usage patterns (prompt types, call volumes, odd timing)
  • Orchestrated sequences of cloud API calls at machine speed
  • Correlation between agent actions and data‑plane anomalies
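
The second signal, orchestrated API sequences at machine speed, can be approximated with a sliding window per principal. This is a minimal sketch, assuming control-plane audit events have been reduced to `(timestamp_seconds, principal, method)` tuples sorted by time; the `detect_bursts` helper, its thresholds, and the method labels are illustrative assumptions rather than tuned detection content.

```python
from collections import defaultdict, deque


def detect_bursts(calls, window_s=10.0, min_distinct=4):
    """Return principals that used at least min_distinct different methods
    within any window_s-second span. `calls` must be sorted by timestamp."""
    recent = defaultdict(deque)  # principal -> deque of (timestamp, method)
    flagged = set()
    for ts, principal, method in calls:
        window = recent[principal]
        window.append((ts, method))
        # Drop events that have aged out of the sliding window.
        while window and ts - window[0][0] > window_s:
            window.popleft()
        if len({m for _, m in window}) >= min_distinct:
            flagged.add(principal)
    return flagged


# Hypothetical audit trail: an agent sweeps four APIs in under five seconds,
# while a human touches the same APIs over many minutes.
calls = [
    (0.0, "sa-agent", "projects.list"),
    (1.5, "sa-agent", "serviceAccounts.list"),
    (2.0, "alice", "buckets.list"),
    (3.0, "sa-agent", "getIamPolicy"),
    (4.2, "sa-agent", "buckets.list"),
    (400.0, "alice", "instances.list"),
    (900.0, "alice", "getIamPolicy"),
    (1500.0, "alice", "projects.list"),
]

bursty = detect_bursts(calls)
print(bursty)
```

Counting distinct methods rather than raw volume is a design choice here: it targets breadth and sequencing, which the text identifies as the distinguishing traits, instead of simply penalizing busy automation.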

Netskope notes that most organizations have not modeled AI agents as first‑class security entities, leaving blind spots around what they access and how outputs are consumed. [5]

At Google scale, disrupting an AI campaign is less about identifying a new malware family and more about correlating: [1][8]

  • Anomalous model calls
  • Strange agent behavior
  • Cloud control‑plane sequences across tenants and data sources

For engineering teams, LLM access logs, model‑usage fingerprints, and agent execution traces must become core observability signals, alongside syscalls and VPC flow logs. [5][7]

Defensive AI stack: Daybreak, Mythos and AI-native vuln pipelines

Anthropic’s Mythos and Glasswing projects, used for industrial‑scale Firefox vuln hunting, showed that frontier models can be aimed at large, hardened codebases and still uncover subtle, long‑lived flaws. [2][4]

OpenAI’s response is Daybreak — a platform combining GPT‑5.5, GPT‑5.5‑Cyber, and Codex Security into a continuous software‑protection stack, explicitly framed against AI‑accelerated attacks. [3][6][7] Key patterns:

  • Security by design: checks on every merge, not post‑release audits [4][7]
  • Whole‑repo reasoning: Codex Security builds an editable threat model from the entire codebase [7]
  • Sandboxed patch validation: generated fixes are tested with verifiable evidence before landing [3][7]

💡 Pattern to emulate: Treat AI security as a continuous service that:

  1. Watches every change (code, infra, dependencies)
  2. Maintains an evolving threat model
  3. Automatically proposes and tests remediations

Codex Security’s ability to reason over attack paths and validate patches matters against AI‑driven exploit chains, which often depend on multi‑step preconditions. [7] If your defensive agents cannot reason over attack graphs, they will trail offensive agents that can.

OpenAI’s launch cadence — GPT‑5.5‑Cyber first, then Daybreak days later — highlights an industry race to build AI‑native cyber platforms that keep pace with offensive AI. [6] For organizations, the lesson is direct: AI‑based vuln discovery and remediation must be as core as CI/CD or observability. [2][3][6]

Without an AI‑native defensive pipeline spanning code, infrastructure, and production telemetry, reproducing a Google‑style disruption of an autonomous campaign will remain unrealistic, regardless of human IR quality. [3][7]

Engineering for the era of AI-driven hacking: architecture, guardrails and operational playbooks

Netskope argues that adapting security to the “agentic economy” is now urgent. [5] Treat AI agents as:

  • Discoverable assets (inventory and SBOM)
  • Subjects of policy (who they can impersonate, what they can access)
  • Continuous telemetry sources (what they actually do) [5]
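
The three roles above map naturally onto a small inventory record. The following is a sketch under stated assumptions: the `AgentRecord` fields, the scope string format, and the `inventory_gaps` checks are hypothetical and not drawn from Netskope's guidance.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentRecord:
    name: str
    owner: Optional[str]            # accountable team or person
    runs_on: str                    # where the agent executes
    scopes: list = field(default_factory=list)  # what it is allowed to touch
    telemetry_enabled: bool = False  # whether its actions are logged


def inventory_gaps(agents):
    """Map each agent name to the governance gaps found in its record."""
    gaps = {}
    for agent in agents:
        problems = []
        if not agent.owner:
            problems.append("no owner")
        if not agent.scopes:
            problems.append("no declared scopes")
        if any(scope.endswith("*") for scope in agent.scopes):
            problems.append("wildcard scope")
        if not agent.telemetry_enabled:
            problems.append("no telemetry")
        if problems:
            gaps[agent.name] = problems
    return gaps


# Hypothetical inventory with one well-governed and one unmanaged agent.
agents = [
    AgentRecord("code-review-bot", "appsec", "ci-runner", ["repo:payments:read"], True),
    AgentRecord("ops-helper", None, "k8s/prod", ["cloud:*"], False),
]

gaps = inventory_gaps(agents)
print(gaps)
```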

Anthropic’s multi‑agent PoC suggests AI’s main offensive advantages are speed and scale, not fundamentally new exploit primitives. Defenders should emphasize: [8]

  • Rate‑limiting automated actions and model calls
  • Anomaly detection over automation patterns (bursts, wide sweeps)
  • Rapid containment (agent kill switches, scoped revocation)
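
Rate limiting and a kill switch can live in the same small component placed in front of an agent's actions. Below is a minimal token-bucket sketch; the `AgentThrottle` class and its parameters are assumptions for illustration, and the injectable `clock` exists only so the example is deterministic.

```python
import time


class AgentThrottle:
    """Token-bucket limiter with a kill switch for one agent's actions."""

    def __init__(self, rate_per_s, burst, clock=time.monotonic):
        self.rate = rate_per_s        # tokens refilled per second
        self.capacity = burst         # maximum tokens held at once
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()
        self.killed = False

    def kill(self):
        """Containment: refuse every further action."""
        self.killed = True

    def allow(self):
        """Return True if the agent may perform one more action right now."""
        if self.killed:
            return False
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Deterministic demo using a fake clock instead of real time.
fake_now = [0.0]
throttle = AgentThrottle(rate_per_s=1.0, burst=3, clock=lambda: fake_now[0])

burst_results = [throttle.allow() for _ in range(5)]  # burst of 3, then refused
fake_now[0] = 2.0                                     # two seconds pass
after_wait = throttle.allow()
throttle.kill()
after_kill = throttle.allow()
print(burst_results, after_wait, after_kill)
```

Refused actions are themselves a useful signal: a sudden run of denials for one agent is the kind of burst the anomaly-detection bullet describes.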

⚠️ Policy gap: Check Point’s LLM‑as‑C2 work implies many enterprises still treat AI assistant traffic as generic HTTPS, with no SIEM rules, EDR thresholds, or egress controls tuned to AI endpoints. [1]

GPT‑5.5 with Trusted Access for Cyber offers a governance blueprint: [4]

  • Confine use to vetted defensive workflows (secure review, malware triage, patch validation)
  • Enforce narrow auth scopes tied to specific repos/environments
  • Log prompts, tools, and outputs with strong retention
  • Require humans in the loop for destructive actions
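
These four controls can be combined in a single gate that every tool call passes through. This is a sketch under assumptions, not a description of how Trusted Access for Cyber is implemented: the `gated_call` function, the `DESTRUCTIVE_TOOLS` set, and the prefix-based scope strings are all hypothetical.

```python
import json
import time

# Assumption: tools that must never run without a named human approver.
DESTRUCTIVE_TOOLS = {"delete_branch", "rotate_credentials", "drop_table"}


def gated_call(agent, tool, target, allowed_scopes, audit_log, approved_by=None):
    """Check scope and approval, append an audit record, and return the decision."""
    in_scope = any(target.startswith(scope) for scope in allowed_scopes)
    needs_approval = tool in DESTRUCTIVE_TOOLS
    allowed = in_scope and (not needs_approval or approved_by is not None)
    # Every attempt is logged, including denied ones.
    audit_log.append(json.dumps({
        "ts": time.time(),
        "agent": agent,
        "tool": tool,
        "target": target,
        "approved_by": approved_by,
        "allowed": allowed,
    }))
    return allowed


audit_log = []
scopes = ["repo:payments/"]  # narrow scope tied to one repository

r_in_scope = gated_call("patch-bot", "run_tests", "repo:payments/api", scopes, audit_log)
r_out_of_scope = gated_call("patch-bot", "run_tests", "repo:billing/api", scopes, audit_log)
r_unapproved = gated_call("patch-bot", "delete_branch", "repo:payments/api", scopes, audit_log)
r_approved = gated_call("patch-bot", "delete_branch", "repo:payments/api", scopes,
                        audit_log, approved_by="alice")
print(r_in_scope, r_out_of_scope, r_unapproved, r_approved, len(audit_log))
```

Prompts and model outputs could be added to the same record; they are left out here to keep the sketch short.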

Daybreak’s workflow integration shows the value of running security agents as persistent, policy‑governed services — like CI jobs or SAST — rather than ad hoc chat tools. [3][7] This makes behavior auditable and impact predictable.

As Mythos and Daybreak compress the vuln lifecycle on both offense and defense, incident playbooks need explicit “AI‑discovered, AI‑exploited” branches. [2][3][8] Those should define:

  • Detection rules (agent anomalies, unusual model usage)
  • Forensic artifacts (LLM logs, agent traces, cloud‑API sequences)
  • Containment steps (agent shutdown, credential rotation, rollbacks)

💼 Operational takeaway: Your SOC should quickly answer: “Which agents touched this system? Which models did they call? What did they ask and do?” If that visibility is missing, it belongs at the top of your engineering backlog. [5][7]
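
Answering those three questions becomes a simple query once agent traces are recorded consistently. The sketch below assumes each trace is a dictionary with `agent`, `model`, `system`, and `action` fields; the field names, the `agents_touching` helper, and the placeholder model names are illustrative assumptions.

```python
def agents_touching(traces, system):
    """Summarize, per agent, the models called and actions taken on one system."""
    summary = {}
    for trace in traces:
        if trace["system"] != system:
            continue
        entry = summary.setdefault(trace["agent"], {"models": set(), "actions": []})
        entry["models"].add(trace["model"])
        entry["actions"].append(trace["action"])
    return summary


# Hypothetical trace records collected from agent execution logs.
traces = [
    {"agent": "patch-bot", "model": "model-a", "system": "payments-db", "action": "read schema"},
    {"agent": "patch-bot", "model": "model-a", "system": "payments-db", "action": "run migration test"},
    {"agent": "ops-helper", "model": "model-b", "system": "payments-db", "action": "export table"},
    {"agent": "ops-helper", "model": "model-b", "system": "build-cache", "action": "purge"},
]

report = agents_touching(traces, "payments-db")
for agent, details in report.items():
    print(agent, sorted(details["models"]), details["actions"])
```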

Conclusion: Google’s incident as your last early warning

Anthropic’s Mythos results, the state‑backed espionage campaign, and Check Point’s LLM‑as‑C2 experiments show that AI‑driven exploitation is becoming standard for well‑resourced actors. [2][8][1] In parallel, OpenAI’s Daybreak, GPT‑5.5‑Cyber, and Codex Security illustrate a defensive ecosystem racing to embed AI into code review, threat modeling, and automated patching from day zero. [3][4][6][7]

Netskope’s warnings about agentic AI and the absence of robust monitoring make clear that the main gap is governance and observability, not raw capability. [5] Google’s disruption of an AI‑driven campaign should be treated as a template: any organization with valuable assets should assume similarly autonomous chains will probe their surface.

Call to action: Treat this as your last early warning. Starting now:

  1. Inventory your AI agents — know where they run, what they touch, and who owns them. [5]
  2. Instrument their behavior — log model usage, tool calls, and access patterns as first‑class security telemetry. [1][5][7]

Frequently Asked Questions

How immediate and widespread is the threat from AI‑driven exploitation?

AI‑driven exploitation is already material and accelerating. Public reports and PoCs show frontier models and agent frameworks discovering thousands of zero‑days, autonomously chaining multi‑step exploits, and enabling state‑level campaigns where agents complete 80–90% of tasks. Anthropic’s Mythos results and its espionage reporting indicate that well‑resourced actors can reach production‑grade offensive automation now, and Anthropic’s own estimates imply attackers could access Mythos‑class capabilities within 6–12 months. Organizations can no longer assume long human lead times for exploit discovery or weaponization: the operational tempo has moved to machine speed, and defensive controls, governance, and observability must adapt immediately.

What are the most effective short‑term defensive steps organizations must take?

Inventory and governance are the highest priority. Quickly discover and catalog every AI agent and model integration, enforce least privilege and narrow auth scopes for model access, log prompts and tool calls with retention policies, and introduce kill switches and rate limits to halt abusive automation. In parallel, add model‑usage telemetry to SIEM/XDR, inject AI checks into CI/CD (continuous SAST and patch validation), and require human approval for destructive or high‑impact actions, so that defenders can detect anomalous model behavior and contain fast‑moving campaigns.

How should engineering and SOC teams change playbooks for AI‑native attack vectors?

Treat AI agents like services: include them in SBOMs, ownership records, and incident response plans. Update detection rules to look for model‑call anomalies, rapid orchestrated cloud API sequences, and odd agent interaction patterns. Capture LLM logs, agent traces, and correlated cloud control‑plane activity as mandatory forensic artifacts. Build CI/CD gates that run AI‑based vuln discovery and automated patch validation (defender agents), while also enforcing strict runtime controls and prompt/output auditing to reduce the chance that an attacker can commandeer agent capabilities.

Sources & References (8)

Generated by CoreProse in 3m 12s
