AI-driven Exploits: Google, Agents & LLM Risks Unveiled

AI-assisted editorial. By Olivier; drafted by CoreProse Auto-Writer. 8 sources verified.

Key Takeaways

In one reported state‑backed campaign, autonomous AI agents built on frontier LLMs performed 80–90% of the workload, including reconnaissance, lateral movement, and exfiltration.
Anthropic’s Mythos Preview testing reportedly surfaced thousands of zero‑days, including a 27‑year‑old OpenBSD bug, and autonomously chained four bugs into a working browser sandbox escape.
AI compresses the vulnerability lifecycle: discovery‑to‑PoC time drops from days to minutes or hours, shrinking defenders’ lead to roughly one release cycle.
Effective defense requires treating AI agents as first‑class assets—inventorying them, logging model calls and agent actions, and enforcing narrow, auditable governance.

AI‑assisted exploitation has crossed a line. We now have autonomous AI agents on top of high‑capability large language models that can discover, chain, and weaponize vulnerabilities end‑to‑end, at machine speed. [2] At Google scale, response must shift from “block the IP” to “detect and disrupt the AI campaign itself.”

Anthropic’s Mythos Preview reportedly:

Surfaced thousands of zero‑day vulnerabilities across major OSes and browsers
Found a 27‑year‑old OpenBSD bug missed by humans [2]
Autonomously chained four bugs into a browser sandbox escape [2]

On defense, OpenAI’s Daybreak uses GPT‑5.5 and Codex Security to scan large codebases, propose patches, and validate fixes in minutes, in effect a generative‑AI remediation pipeline for defenders. [3][4]

Key idea: Offense and defense now share the same primitives (LLMs, agents, cloud orchestration). What differs is how they are governed and who gets to run at machine speed.

Google’s reported disruption of an AI‑driven exploitation campaign looks like an early pattern: AI‑run operations treating infrastructure as a continuous search–optimize–exploit loop. [8]

From “could this happen?” to “Google just stopped it”: why AI-driven exploitation is now real

Mythos Preview is the first widely described frontier LLM explicitly evaluated for autonomous vulnerability discovery at scale. In controlled tests, it:

Found thousands of zero‑days in major OSes and browsers
Uncovered a 27‑year‑old OpenBSD vulnerability [2]
Demonstrated that deep structural flaws in mature codebases are within model reach

Mythos also autonomously chained four distinct vulnerabilities into a working sandbox escape by: [2]

Understanding sandbox boundaries
Spotting memory‑safety defects
Selecting compatible primitives
Assembling a reliable exploit

Anthropic’s 2025 report on a state‑backed espionage campaign shows the next step: [8]

AI agents performed 80–90% of reconnaissance, lateral movement, exfiltration
Humans mainly provided high‑level guidance and approvals

⚠️ Escalation signal: A state actor trusting AI with 80–90% of campaign workload suggests that autonomous systems can now stand in for human operators across much of the kill chain. [8]

Inside enterprises, agentic AI is spreading on the defender and developer sides. Netskope observes that LLM‑powered agents with direct access to software and infrastructure are already deployed, often with minimal supervision. [5] These agents become:

High‑value targets for compromise
Stepping‑stones for lateral movement
“Free infrastructure” for attackers

Check Point Research showed that web‑enabled conversational assistants can be hijacked as stealth C2 channels, blending into ordinary AI traffic and requiring no attacker‑hosted infra. [1]

Together, these data points make Google’s disrupted AI‑driven campaign look like a logical next step in an arms race where both attackers and defenders rely on frontier models and autonomous workflows. [2][5][8]

How LLMs discover and weaponize vulnerabilities faster than your patch cycle

Anthropic’s offensive research lead estimates attackers could access Mythos‑class tools within 6–12 months of preview, shrinking defenders’ lead to roughly one release cycle. [2] Mythos already:

Identified thousands of zero‑day issues in widely deployed platforms [2]
Showed that the backlog of exploitable bugs may be growing faster than most orgs can patch

In early 2025, before offensive LLM tooling was industrialized, about one‑third of exploited CVEs were already hit on or before their public disclosure day. [2] With automated triage, exploit synthesis, and agent‑driven fuzzing, the disclosure‑to‑exploit window compresses from days to hours.

📊 Timeline compression:

Pre‑AI: weeks–months from bug intro to discovery; days from disclosure to weaponization
Early AI: days to discovery; hours to weaponization [2]
Frontier LLM era: minutes–hours from code landing in main to discovery and PoC synthesis [2][3]

OpenAI’s Daybreak mirrors this for defense. GPT‑5.5 and Codex Security can: [3][7]

Analyze thousands of lines at once
Surface vulnerabilities and data‑flow risks
Generate compile‑clean patches plus unit tests
Validate fixes in isolated environments

Daybreak makes security a continuous SDLC concern — secure review, threat modeling, dependency analysis, and patch validation integrated into pipelines. [4][7]

OpenAI further splits GPT‑5.5 into: [4]

GPT‑5.5 (general)
GPT‑5.5 with Trusted Access for Cyber (vetted defensive workflows) [4]
GPT‑5.5‑Cyber (more permissive for red teaming and intrusion testing) [4][6]

⚡ Implication: Hardware and model capabilities are symmetric for attackers and defenders; governance and allowed tool use are what differ. [4][6]

For engineering teams, every vulnerability in repos — even on feature branches — is now in scope for AI‑accelerated discovery, whether via your Daybreak‑style stack or an adversary’s Mythos‑like tooling. [2][3][7] The Google incident should force CI/CD and vuln‑management pipelines to adapt to AI‑native velocities, not human change‑advisory cycles. [2][6]

Agentic AI as attacker: multi-agent workflows, planning and cloud-scale operations

Anthropic’s 2025 espionage report is the clearest public description of AI as primary operator. In that campaign: [8]

AI agents executed 80–90% of tasks from external recon to internal pivoting
Humans mainly approved goals and sensitive steps

To generalize, researchers built a multi‑agent penetration‑testing PoC against cloud infrastructure. The system: [8]

Did not invent new attack surfaces
Dramatically accelerated exploitation of known misconfigurations
Excelled at:
- Enumerating cloud resources via APIs
- Identifying misconfigured IAM roles/policies
- Following documented attack paths
- Scaling across many accounts in parallel
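The same enumeration an offensive agent performs can be run defensively first. The following is a minimal sketch, assuming an IAM policy already exported as a dictionary in the standard `bindings` / `role` / `members` shape; the function name `find_risky_bindings`, the choice of which roles count as "broad", and the example policy are illustrative assumptions, not part of the PoC described above.

```python
from typing import Dict, List

# Basic roles grant wide access and are a common source of over-permissioning.
BROAD_ROLES = {"roles/owner", "roles/editor"}
# Special member identifiers that open a resource to anyone.
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def find_risky_bindings(policy: Dict) -> List[str]:
    """Return human-readable findings for one IAM policy dictionary."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        for member in binding.get("members", []):
            if member in PUBLIC_MEMBERS:
                findings.append(f"{role} is granted to public principal {member}")
            elif role in BROAD_ROLES and member.startswith("serviceAccount:"):
                findings.append(f"service account {member} holds broad role {role}")
    return findings


# Hypothetical policy, used only for illustration.
example_policy = {
    "bindings": [
        {"role": "roles/editor",
         "members": ["serviceAccount:agent@example-project.iam.gserviceaccount.com"]},
        {"role": "roles/storage.objectViewer",
         "members": ["allUsers", "user:dev@example.com"]},
        {"role": "roles/logging.viewer",
         "members": ["user:dev@example.com"]},
    ]
}

for finding in find_risky_bindings(example_policy):
    print(finding)
```

A check like this only covers documented misconfiguration patterns, which is exactly the class of weakness the PoC was reported to exploit fastest.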

💼 Echo in practice: One SaaS security lead saw a benign agent chain an overly permissive GCP service account into full DB read access in under 10 minutes — a path never documented in manual reviews.

Netskope warns that because agentic systems directly operate software and infrastructure, they are prime cyber targets — yet most orgs lack: [5]

A complete inventory of agents
Policies for systems agents may control
Telemetry specific to agent behavior

On defense, Codex Security already acts as a sophisticated agent: it builds editable threat models from entire repos, identifies realistic attack paths, and validates patches in isolation. [7] These are the same reasoning skills an offensive agent uses to construct and traverse attack graphs.

GPT‑5.5‑Cyber formalizes this dual‑use nature: it is more permissive specifically for authorized offensive workflows like red teaming. [4][6] Without strong governance, “authorized” vs “unauthorized” can collapse to “whoever holds the API key.”

⚠️ Dual‑use warning: Check Point’s hijacking of web‑enabled assistants into stealth C2 shows that a single LLM instance can simultaneously act as planner, operator, and covert infrastructure. [1]

C2 through the front door: how LLM traffic and cloud services hide AI-driven attacks

Attackers have long abused legitimate cloud services (Slack, Dropbox, OneDrive) as C2 because traffic blends into baselines. [1] Defenders eventually instrumented these services and shipped SIEM/XDR rules. [1]

Web‑enabled LLM assistants disrupt that learning curve. Their traffic is: [1]

New, with immature telemetry and detection content
Hard to block once broadly adopted
Trusted as “business productivity” tooling

Check Point’s experiment abused assistants’ web‑fetch features. Malware: [1]

Never contacted attacker infra directly
Asked the assistant to fetch an attacker‑controlled URL that encoded commands
Received results via the assistant’s HTTP requests

This required no API keys and no authenticated accounts, and it produced traffic indistinguishable from normal AI usage.
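If the network traffic itself looks normal, one place the pattern may still differ is the process that originates it. The sketch below assumes flow records that carry both a destination host and an originating process name; the host list, the approved-client list, and the field names are hypothetical and would need to come from your own egress and EDR data.

```python
from typing import Dict, Iterable, List

# Hypothetical assistant domains; replace with the services your org actually uses.
AI_ASSISTANT_HOSTS = {"assistant.example.com", "chat.example.ai"}
# Assumption: only these client processes are expected to talk to assistants.
APPROVED_CLIENTS = {"chrome.exe", "msedge.exe", "firefox.exe"}


def is_ai_host(host: str) -> bool:
    """True if host is a listed assistant domain or one of its subdomains."""
    host = host.lower()
    return any(host == h or host.endswith("." + h) for h in AI_ASSISTANT_HOSTS)


def flag_ai_egress(flows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return flows to AI assistant hosts that did not come from an approved client."""
    return [
        flow for flow in flows
        if is_ai_host(flow["dest_host"]) and flow["process"].lower() not in APPROVED_CLIENTS
    ]


# Hypothetical flow records for illustration.
flows = [
    {"process": "chrome.exe", "dest_host": "assistant.example.com"},
    {"process": "updater.exe", "dest_host": "api.assistant.example.com"},
    {"process": "updater.exe", "dest_host": "cdn.example.org"},
]

for alert in flag_ai_egress(flows):
    print(alert["process"], "->", alert["dest_host"])
```

This does not detect the technique Check Point demonstrated in every form; it only turns "generic HTTPS" into a labeled category that rules can be written against.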

In parallel, the multi‑agent cloud‑attack PoC showed that LLMs can orchestrate complex sequences of GCP API calls: [8]

Chaining misconfigurations into full compromise
Using only standard control‑plane traffic
Standing out mostly by speed, breadth, and sequencing

📊 New observability layer: In AI‑driven campaigns, key signals may include: [5][7]

Unusual LLM usage patterns (prompt types, call volumes, odd timing)
Orchestrated sequences of cloud API calls at machine speed
Correlation between agent actions and data‑plane anomalies
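The second signal, machine-speed sequences of control-plane calls, can be approximated with a sliding window per principal. This is a minimal sketch under stated assumptions: events are `(timestamp, principal, method)` tuples, and the thresholds (`max_calls`, `max_distinct_methods`, `window_s`) are placeholder values that would need tuning against a real baseline.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Set, Tuple

Event = Tuple[float, str, str]  # (timestamp in seconds, principal, API method)


def detect_sweeps(events: Iterable[Event], window_s: float = 60.0,
                  max_calls: int = 30, max_distinct_methods: int = 8) -> List[str]:
    """Flag principals whose call volume or method breadth exceeds a threshold
    inside a sliding time window. Each principal is reported once."""
    recent: Dict[str, Deque[Tuple[float, str]]] = defaultdict(deque)
    flagged: List[str] = []
    seen: Set[str] = set()
    for ts, principal, method in sorted(events):
        window = recent[principal]
        window.append((ts, method))
        # Drop calls that have aged out of the window.
        while window and ts - window[0][0] > window_s:
            window.popleft()
        distinct = {m for _, m in window}
        too_fast = len(window) > max_calls
        too_wide = len(distinct) > max_distinct_methods
        if (too_fast or too_wide) and principal not in seen:
            seen.add(principal)
            flagged.append(principal)
    return flagged


# Hypothetical data: an agent sweeping 10 methods in 20 seconds vs. a human
# making one call every 10 minutes.
agent_events = [(i * 0.5, "agent-sa", f"method{i % 10}") for i in range(40)]
human_events = [(i * 600.0, "alice", "compute.instances.list") for i in range(5)]

print(detect_sweeps(agent_events + human_events))
```

Breadth is tracked alongside volume because a wide sweep across many API methods can be suspicious even when the raw call count stays under a rate threshold.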

Netskope notes that most organizations have not modeled AI agents as first‑class security entities, leaving blind spots around what they access and how outputs are consumed. [5]

At Google scale, disrupting an AI campaign is less about identifying a new malware family and more about correlating: [1][8]

Anomalous model calls
Strange agent behavior
Cloud control‑plane sequences across tenants and data sources

For engineering teams, LLM access logs, model‑usage fingerprints, and agent execution traces must become core observability signals, alongside syscalls and VPC flow logs. [5][7]

Defensive AI stack: Daybreak, Mythos and AI-native vuln pipelines

Anthropic’s Mythos and Glasswing projects, used for industrial‑scale Firefox vuln hunting, showed that frontier models can be aimed at large, hardened codebases and still uncover subtle, long‑lived flaws. [2][4]

OpenAI’s response is Daybreak — a platform combining GPT‑5.5, GPT‑5.5‑Cyber, and Codex Security into a continuous software‑protection stack, explicitly framed against AI‑accelerated attacks. [3][6][7] Key patterns:

Security by design: checks on every merge, not post‑release audits [4][7]
Whole‑repo reasoning: Codex Security builds an editable threat model from the entire codebase [7]
Sandboxed patch validation: generated fixes are tested with verifiable evidence before landing [3][7]

💡 Pattern to emulate: Treat AI security as a continuous service that:

Watches every change (code, infra, dependencies)
Maintains an evolving threat model
Automatically proposes and tests remediations

Codex Security’s ability to reason over attack paths and validate patches matters against AI‑driven exploit chains, which often depend on multi‑step preconditions. [7] If your defensive agents cannot reason over attack graphs, they will trail offensive agents that can.

OpenAI’s launch cadence — GPT‑5.5‑Cyber first, then Daybreak days later — highlights an industry race to build AI‑native cyber platforms that keep pace with offensive AI. [6] For organizations, the lesson is direct: AI‑based vuln discovery and remediation must be as core as CI/CD or observability. [2][3][6]

Without an AI‑native defensive pipeline spanning code, infrastructure, and production telemetry, reproducing a Google‑style disruption of an autonomous campaign will remain unrealistic, regardless of human IR quality. [3][7]

Engineering for the era of AI-driven hacking: architecture, guardrails and operational playbooks

Netskope argues that adapting security to the “agentic economy” is now urgent. [5] Treat AI agents as:

Discoverable assets (inventory and SBOM)
Subjects of policy (who they can impersonate, what they can access)
Continuous telemetry sources (what they actually do) [5]
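These three properties map naturally onto a small inventory record. The sketch below is one possible shape, not a standard schema: the `AgentRecord` fields and the `inventory_gaps` helper are assumptions chosen to mirror the list above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgentRecord:
    agent_id: str
    owner: Optional[str]  # accountable team or person; None means unowned
    models: List[str] = field(default_factory=list)           # models the agent may call
    allowed_systems: List[str] = field(default_factory=list)  # systems policy lets it touch
    logs_enabled: bool = False  # whether its actions are emitted as telemetry


def inventory_gaps(records: List[AgentRecord]) -> List[str]:
    """List the missing asset, policy, or telemetry property for each agent."""
    gaps = []
    for r in records:
        if not r.owner:
            gaps.append(f"{r.agent_id}: no owner")
        if not r.allowed_systems:
            gaps.append(f"{r.agent_id}: no access policy")
        if not r.logs_enabled:
            gaps.append(f"{r.agent_id}: no telemetry")
    return gaps


# Hypothetical inventory for illustration.
inventory = [
    AgentRecord("patch-bot", owner="appsec", models=["model-a"],
                allowed_systems=["ci-runner"], logs_enabled=True),
    AgentRecord("shadow-helper", owner=None),
]

for gap in inventory_gaps(inventory):
    print(gap)
```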

The multi‑agent cloud PoC suggests AI’s main offensive advantages are speed and scale, not fundamentally new exploit primitives. Defenders should emphasize: [8]

Rate‑limiting automated actions and model calls
Anomaly detection over automation patterns (bursts, wide sweeps)
Rapid containment (agent kill switches, scoped revocation)
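Rate limiting and a kill switch can live in the same small component that sits in front of an agent's tool calls. The following is a sketch using a token bucket, which is one common choice among several; the class name `AgentThrottle` and its parameters are hypothetical, and the injectable `clock` exists only to make the behavior testable.

```python
import time
from typing import Callable


class AgentThrottle:
    """Token-bucket limiter with a kill switch for one agent's actions."""

    def __init__(self, rate_per_s: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_s
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()
        self.killed = False

    def kill(self) -> None:
        """Containment: refuse every further action until the object is replaced."""
        self.killed = True

    def allow(self) -> bool:
        """Return True if the agent may perform one more action right now."""
        if self.killed:
            return False
        now = self.clock()
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


throttle = AgentThrottle(rate_per_s=2.0, burst=5)
allowed = sum(throttle.allow() for _ in range(20))
print(f"{allowed} of 20 back-to-back actions allowed")
```

In practice the kill switch would also revoke the agent's credentials; the flag here only stops calls that pass through this wrapper.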

⚠️ Policy gap: Check Point’s LLM‑as‑C2 work implies many enterprises still treat AI assistant traffic as generic HTTPS, with no SIEM rules, EDR thresholds, or egress controls tuned to AI endpoints. [1]

GPT‑5.5 with Trusted Access for Cyber offers a governance blueprint: [4]

Confine use to vetted defensive workflows (secure review, malware triage, patch validation)
Enforce narrow auth scopes tied to specific repos/environments
Log prompts, tools, and outputs with strong retention
Require humans in the loop for destructive actions
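The last three controls can be combined in a single authorization gate. This sketch assumes a hypothetical `Grant` object and an in-memory `audit_log`; it is not OpenAI's Trusted Access mechanism, whose internals the sources do not describe, and the set of destructive actions is an illustrative placeholder.

```python
import json
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical set of actions treated as destructive; adjust to your environment.
DESTRUCTIVE_ACTIONS = frozenset({"delete_branch", "force_push", "rotate_credentials"})


@dataclass(frozen=True)
class Grant:
    """Narrow scope for one agent: which repos it may touch and what it may do."""
    agent_id: str
    repos: FrozenSet[str]
    actions: FrozenSet[str]


audit_log: List[str] = []  # stand-in for a retained, append-only log store


def authorize(grant: Grant, repo: str, action: str, prompt: str,
              approved_by: Optional[str] = None) -> bool:
    """Allow an action only if it is in scope and, when destructive, human-approved.
    Every decision is logged, including denials."""
    in_scope = repo in grant.repos and action in grant.actions
    needs_human = action in DESTRUCTIVE_ACTIONS
    allowed = in_scope and (not needs_human or approved_by is not None)
    audit_log.append(json.dumps({
        "agent": grant.agent_id, "repo": repo, "action": action,
        "prompt": prompt, "approved_by": approved_by, "allowed": allowed,
    }))
    return allowed


grant = Grant("review-agent", frozenset({"payments-api"}),
              frozenset({"read", "open_pr", "force_push"}))
print(authorize(grant, "payments-api", "open_pr", "propose fix for input validation"))
print(authorize(grant, "payments-api", "force_push", "rewrite history"))
print(authorize(grant, "payments-api", "force_push", "rewrite history", approved_by="alice"))
print(authorize(grant, "billing-db", "read", "inspect schema"))
```

Logging denials as well as approvals matters here: a run of out-of-scope requests from one agent is itself a detection signal.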

Daybreak’s workflow integration shows the value of running security agents as persistent, policy‑governed services — like CI jobs or SAST — rather than ad hoc chat tools. [3][7] This makes behavior auditable and impact predictable.

As Mythos and Daybreak compress the vuln lifecycle on both offense and defense, incident playbooks need explicit “AI‑discovered, AI‑exploited” branches. [2][3][8] Those should define:

Detection rules (agent anomalies, unusual model usage)
Forensic artifacts (LLM logs, agent traces, cloud‑API sequences)
Containment steps (agent shutdown, credential rotation, rollbacks)

💼 Operational takeaway: Your SOC should quickly answer: “Which agents touched this system? Which models did they call? What did they ask and do?” If that visibility is missing, it belongs at the top of your engineering backlog. [5][7]
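Answering those three questions is a simple query once agent traces exist in a consistent form. The sketch below assumes a hypothetical trace schema (`ts`, `agent`, `model`, `target`, `action`, `prompt`); real agent frameworks will emit different fields, so treat the names as placeholders.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical trace records; field names are illustrative, not a product schema.
TRACES = [
    {"ts": 1, "agent": "patch-bot", "model": "model-a", "target": "db-prod",
     "action": "read_schema", "prompt": "list tables"},
    {"ts": 2, "agent": "patch-bot", "model": "model-a", "target": "ci-runner",
     "action": "run_tests", "prompt": "run unit tests"},
    {"ts": 3, "agent": "triage-bot", "model": "model-b", "target": "db-prod",
     "action": "export_rows", "prompt": "dump customer table"},
    {"ts": 4, "agent": "patch-bot", "model": "model-b", "target": "db-prod",
     "action": "apply_patch", "prompt": "apply migration fix"},
]


def who_touched(traces: List[dict], system: str) -> Dict[str, dict]:
    """For one system, summarize per agent: models called, and (action, prompt)
    pairs in time order."""
    summary: Dict[str, dict] = defaultdict(lambda: {"models": set(), "steps": []})
    for rec in sorted(traces, key=lambda r: r["ts"]):
        if rec["target"] == system:
            entry = summary[rec["agent"]]
            entry["models"].add(rec["model"])
            entry["steps"].append((rec["action"], rec["prompt"]))
    return {agent: {"models": sorted(v["models"]), "steps": v["steps"]}
            for agent, v in summary.items()}


for agent, info in who_touched(TRACES, "db-prod").items():
    print(agent, info["models"], info["steps"])
```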

Conclusion: Google’s incident as your last early warning

Anthropic’s Mythos results, the state‑backed espionage campaign, and Check Point’s LLM‑as‑C2 experiments show that AI‑driven exploitation is becoming standard for well‑resourced actors. [1][2][8] In parallel, OpenAI’s Daybreak, GPT‑5.5‑Cyber, and Codex Security illustrate a defensive ecosystem racing to embed AI into code review, threat modeling, and automated patching from day zero. [3][4][6][7]

Netskope’s warnings about agentic AI and the absence of robust monitoring make clear that the main gap is governance and observability, not raw capability. [5] Google’s disruption of an AI‑driven campaign should be treated as a template: any organization with valuable assets should assume similarly autonomous chains will probe their surface.

⚡ Call to action: Treat this as your last early warning. Starting now:

Inventory your AI agents — know where they run, what they touch, and who owns them. [5]
Instrument their behavior — log model usage, tool calls, and access patterns as first‑class security telemetry. [1][5][7]

Frequently Asked Questions

How immediate and widespread is the threat from AI‑driven exploitation?

AI‑driven exploitation is already material and accelerating. Public reports and PoCs show frontier models and agent frameworks discovering thousands of zero‑days, autonomously chaining multi‑step exploits, and enabling state‑level campaigns where agents complete 80–90% of tasks. Anthropic’s Mythos results and the related espionage reporting indicate that well‑resourced actors can reach production‑grade offensive automation now, and Anthropic’s own estimates imply attackers could access Mythos‑class capabilities within 6–12 months. Organizations can no longer assume long human lead times for exploit discovery or weaponization: the operational tempo has moved to machine speed, and defensive controls, governance, and observability must adapt immediately.

What are the most effective short‑term defensive steps organizations must take?

Inventory and governance are the highest priority. Quickly discover and catalog every AI agent and model integration, enforce least‑privilege and narrow auth scopes for model access, log prompts and tool calls with retention policies, and introduce kill switches and rate limits to halt abusive automation. Parallel investments should add model‑usage telemetry to SIEM/XDR, inject AI checks into CI/CD (continuous SAST/patch validation), and require human approval for destructive or high‑impact actions so that defenders can detect anomalous model behavior and contain fast‑moving campaigns.

How should engineering and SOC teams change playbooks for AI‑native attack vectors?

Treat AI agents like services: include them in SBOMs, ownership, and incident response plans. Update detection rules to look for model‑call anomalies, rapid orchestrated cloud API sequences, and odd agent interaction patterns; capture LLM logs, agent traces, and correlated cloud control‑plane activity as mandatory forensic artifacts. Build CI/CD gates that run AI‑based vuln discovery and automated patch validation (defender agents) while also enforcing strict runtime controls and prompt/output auditing to reduce the chance an attacker can commandeer agent capabilities.

Sources & References (8)

[1] Malware guidé par LLM : comment l'IA réduit le signal observable pour contourner les seuils EDR. IT SOCIAL. Check Point Research demonstrated in a controlled environment that an AI assistant with web‑browsing capabilities can be hijacked as a stealthy command‑and‑control (C2) channel, without an API key or ...

[2] Pipelines et vulnérabilités zero-day découvertes par l'IA. Published 11 May 2026; reading time 8 min.

[3] OpenAI lance Daybreak, l'IA qui détecte et corrige les failles de sécurité en quelques minutes. OpenAI has just unveiled Daybreak, a platform that draws on its most powerful AI models, including GPT-5.5 and the Codex agent, to analyze thousands of lines of code, detect flaws ...

[4] OpenAI dégaine Daybreak : sa plateforme cybersécurité pour concurrencer Anthropic. OpenAI has just launched Daybreak, a cybersecurity platform built on its GPT-5.5 models and its Codex Security agent. The goal: to compete with Anthropic in vulnerability hunting ...

[5] Adapter la sécurité à l'ère de l'IA agentique, une priorité en 2026. Netskope, 15 April 2026, 11:02. Because they can interact with other software or infrastructure, agentic AI systems could become prime targets for cyber...

[6] OpenAI Daybreak : l’IA cyber qui défie Anthropic Mythos. Laurent Delattre, Data / IA, published 12 May ... Subtitle: Daybreak et GPT-5.5-Cyber : L’arme de destruction massive des vulnérabilités logicielles?

[7] Cybersécurité : qu’est-ce que Daybreak, la nouvelle initiative d’OpenAI ? Daybreak is an OpenAI initiative dedicated to cyber defense that brings together its specialized AI models, its Codex Security agent, and an ecosystem of security partners ...

[8] L’IA peut-elle s’attaquer au cloud? Enseignements tirés de la construction d’un système multi-agents offensif autonome dans le cloud. Foreword: the offensive capabilities of large language models (LLMs) had until now been only theoretical risks, frequently raised at conferences ...
