Key Takeaways
- Commercial LLMs are now core offensive tooling: DPRK‑linked HexagonalRodent used LLMs and copilots and reportedly stole over USD 12M in three months by automating job ads, developer workflows, and new malware families.
- AI‑scaled phishing and deepfake pipelines produce unique, context‑fit lures at scale, degrading template‑based detection; modern enterprises see >10,000 alerts/month with ~52% false positives and ~64% redundancy, driving analyst fatigue.
- Agentic AIs create persistent multi‑step attack loops and enable tool hijacking and memory poisoning; defenses must decouple LLM intent from side effects with per‑agent least‑privilege, strict input sanitization, and a policy enforcement layer.
- Effective mitigation requires end‑to‑end AI security: inventory model assets, log all model calls and tool usage, apply the “Rule of Two” for agents, and run continuous red‑teaming and AI‑SPM controls.
Commercial large language models (LLMs) now sit in the core tooling of both red‑teams and criminal groups. The same conversational APIs and copilots your engineers use are being scripted for phishing, malware iteration, deepfake scripts, and covert C2 that looks like normal assistant traffic.[9][1]
For ML and security engineers, this expands the threat surface: you are defending not just against bespoke malware and hand‑crafted phishing, but against programmable abuse of high‑capacity models wired into CI/CD, SaaS, and agent frameworks.[3][9]
💡 Mental model: Treat every commercial LLM—internal or external—as a shared cyber capability that adversaries can also automate against you.
A fintech security lead who enabled generative email assistance saw phishing suddenly mirror internal tone, threading, and calendar flows; traditional rule‑based filters missed it.[9]
This article explains how generative AI industrializes classic attacks, how agentic AI changes campaign economics, and what architectures you can deploy now.
1. From Niche Experiments to Industrialized AI-Assisted Offense
“AI‑assisted attacks” still map to phishing, malware, ATO, and fraud—but with new scale and personalization.[9] This is early‑stage industrialized cybercrime.
Attackers now use LLMs to:[9]
- Generate role‑ and company‑specific phishing in any language
- Iterate malware, droppers, and implants via coding copilots
- Script polished social‑engineering narratives and deepfake scripts
LLMs make scams more fluent and context‑fit, boosting BEC and phishing conversion:[9]
- Maintain conversation state and tone
- Adapt to victim responses and objections
- Produce unique lures at scale, defeating template‑based detection
📊 Deepfake + LLM convergence[9]
- Draft scripts for synthetic audio/video “approvals”
- Match internal jargon and recent events from public sources
- Help bypass voice‑based verification in banking/support
The LLM supplies linguistic and social‑engineering sophistication that many attackers lack.[9]
Advanced threats embed commercial copilots like ChatGPT and Cursor into malware workflows—for code generation, refactoring, debugging, and pretext content (fake websites, executive bios, investor decks).[10] DPRK‑linked “HexagonalRodent” reportedly stole over USD 12M in three months using AI‑generated job ads, VS Code tasks, and new malware families such as BeaverTail, OtterCookie, and InvisibleFerret.[10]
💼 Observed in the wild[10]
Incident responders found repos where attackers had:
- A polished “company” site built with a design copilot
- Onboarding docs and coding tests in flawless English
- Implant code commented like ChatGPT explanations
The social and developer experience looked like a real team’s work—built quickly with commercial tools.[10]
On defense, LLMs help SOCs summarize telemetry, correlate logs, and reduce overload.[5] But the same properties shorten attacker learning loops and lower the expertise needed for sophisticated operations.[5][9]
As LLMs move from passive chat to embedded tools and AI agents in CI/CD, SaaS, and proprietary apps, value shifts from one‑off prompts to instrumented pipelines with tight feedback loops.[3][11][12]
2. Concrete Attack Patterns Using Commercial LLMs
Thinking in terms of real workflows, not abstract “LLM abuse,” helps design defenses.
AI-enhanced phishing factories
A modern phishing pipeline typically:[9]
- Scrapes org structure, roles, and recent events.
- Prompts an LLM for tailored lures (dozens–thousands per scenario).
- Auto‑translates and tunes tone by geography and seniority.
- Uses the LLM again to craft dynamic replies to each victim.
Effects:[9]
- Each email is unique, evading template/signature filters.
- Follow‑ups and threading mimic real customer/internal communication.
- Email stacks see a long tail of “novel but coherent” messages.
⚠️ Impact: Rule‑based filters and static heuristics degrade; traffic looks like normal business email.
HexagonalRodent’s AI-structured kill chain
Expel’s tracking of HexagonalRodent illustrates AI‑scaled supply‑chain and developer‑targeted attacks:[10]
- High‑paying job ads generated and localized by LLMs
- “Code tests” implemented as VS Code tasks executing malware
- Fake corporate façade: AI‑built website, fabricated leadership
- Compromised VS Code extension for distribution
The LLM participates in:[10]
- Pretext crafting (ads, HR comms, onboarding)
- Technical malware development via copilots
- Rapid refinement of lures and docs from victim feedback
AI assistants as covert C2
Check Point Research showed web‑enabled assistants like Grok and Microsoft Copilot can be abused as stealth C2 channels.[1]
Pattern:[1]
- Malware issues innocuous queries (e.g., “summarize this URL”).
- The URL content encodes instructions for the attacker.
- The assistant fetches and “interprets” them, turning replies into C2.
- Exfiltrated data returns inside later assistant‑mediated HTTP requests.
📊 Key property:[1]
- No custom C2 infra; traffic is normal AI assistant usage.
- No direct attacker connection; C2 rides on assistant’s outbound calls.
- Often no explicit attacker API keys involved.
This is powerful because enterprise AI assistant traffic is:[1]
- Hard to block once widely adopted
- Lightly instrumented in SIEM/XDR
- Often treated as “trusted productivity traffic”
LLMs as reverse engineering copilot
Both sides use LLMs to shrink the gap from code/binaries to exploits:[5][7]
- Summarizing large codebases and calling out risky flows
- Explaining decompiled output and crash traces
- Generating PoC snippets and harnesses to test suspected bugs[5][7]
💡 Implication: If your code or configs leak, assume an LLM can turn them into actionable attack plans far faster than a junior analyst could.
All of these attacks ride on mainstream SaaS APIs and HTTP traffic, inheriting platform “legitimacy.” IP reputation, domain blocks, and protocol‑only detections lose effectiveness as primary controls.[1][9]
3. Agentic AI and the Automation of End-to-End Attacks
The move from stateless chat to agentic AI—LLMs that browse, call tools, use the Model Context Protocol (MCP), store memory, and act—creates qualitatively new risks.[3][11][12]
Where classic prompt injection targeted single answers, agents enable:[12]
- Multi‑step prompt injection and persistent memory poisoning
- Tool hijacking and privilege escalation via connectors
- Cascading failures across chained tools and agents
Enterprise guidance flags agents as prime targets because they already operate other systems.[11] Compromised prompts, policies, or connectors become general‑purpose remote ops channels.
⚠️ Agent-specific threats[3][12]
- Tool hijack & escalation: Mis‑binding a “search” intent to “execute SQL.”
- Memory poisoning: Storing malicious instructions or false beliefs.
- Chain‑of‑tool failures: Small deviations compounding through workflows.
- Agent supply chain attacks: Compromised frameworks, connectors, MCP tools.
Databricks notes that agents combining sensitive data, untrusted external inputs, and external actions resemble pre‑built attack chains awaiting prompt injection.[3]
Offensive agent loop
From the attacker’s view, agent frameworks automate full campaigns (recon → access → lateral movement → exfiltration):[3][12]
while True:
goals = update_goals(env_state)
plan = llm.plan(goals=goals, tools=tool_catalog)
for step in plan:
if not policy.allow(step):
continue
result = tools[step.tool].run(step.args)
memory.store(result)
if detect_access(memory):
exfiltrate(memory.snapshot())
If plans and memory are influenced by malicious inputs—docs, user messages, poisoned KB—this loop becomes persistent, adaptive probing.[3][11][12]
💡 Operational challenge: Most enterprises lack baselines, playbooks, and monitoring for real agent behavior. Guidance stresses explicit monitoring and hands‑on training to understand how agents actually interact with data and tools, not just design assumptions.[11][12]
4. LLM Security Fundamentals: What Makes Commercial Models Abusable
LLM security is end‑to‑end: models, data pipelines, infra, and interfaces from training to decommissioning.[2][4]
The OWASP Top 10 for LLM apps highlights:[2][4]
- Prompt injection (user‑ and data‑embedded)
- Training data poisoning
- Model and data theft
- Supply‑chain flaws in plugins, SDKs, frameworks
Key differences from classic software:[4]
- Non‑determinism: Same input can yield different outputs.
- Prompt layering: System, user, and hidden prompts interwoven.
- Executable output: Responses can contain code, shell, or SQL that looks plausible.
Hallucinations—plausible but incorrect outputs—provide cover for malicious content to slip through.[4]
Effective security combines:[2][4]
- Traditional controls: AuthZ, input validation, secure deployment, secrets hygiene.
- AI‑specific measures: Adversarial training, output filtering, behavior monitoring, red‑teaming.
- Strong input sanitization: Normalize encodings, strip homoglyphs, constrain what reaches tools.
AI Security Posture Management (AI‑SPM) tools are emerging to:[2]
- Inventory LLM assets and data flows
- Track risks and misconfigurations
- Enforce policies across clouds and environments
NIST’s AI Risk Management Framework calls out adversarial examples, data poisoning, and model/dataset exfiltration as central threats, not corner cases.[2][4]
💡 Design stance: Do not treat commercial LLM APIs as trusted black boxes. Treat them as partially adversarial components whose inputs, outputs, and training dependencies need explicit review and controls.[2][4]
5. Defensive Use of Commercial Models: SOC, Daybreak, and GPT‑5.5‑Cyber
The same LLMs fueling AI‑scaled attacks are transforming defensive operations and Enterprise AI.
Modern SOCs increasingly use LLMs as reasoning/orchestration layers over telemetry:[5]
- Ingest large volumes of heterogeneous logs
- Correlate with threat intel and historical incidents
- Produce high‑fidelity natural‑language summaries
This shifts scaling from analyst headcount to data quality and model orchestration.[5]
📊 Alert fatigue and AI triage[6]
Large orgs often see:
-
10,000 alerts/month from SIEM and related tools
- ~52% false positives and 64% redundant alerts
- Analyst fatigue and missed real incidents
Playbooks—automated sequences of detection, analysis, remediation—are now standard.[6] LLMs augment them by:[5][6]
- Enriching alerts with context and likely impact
- Normalizing/deduplicating similar events
- Proposing investigation steps and remediation actions
Daybreak and codified AI defense
OpenAI’s Daybreak bundles specialized models, the Codex Security agent, and partners to embed security earlier in the SDLC.[7]
Codex Security can:[7]
- Analyze codebases and track data flows across files
- Build editable threat models and attack paths
- Flag high‑impact vulnerabilities
- Generate and test patches in isolation, surfacing only reproducible issues
GPT‑5.5 and GPT‑5.5‑Cyber, via Trusted Access for Cyber (TAC), are positioned as core defender infrastructure:[8]
- Identity‑ and trust‑based access to advanced cyber capabilities
- Lower refusal rates for legitimate tasks (malware analysis, reverse engineering, detection engineering, patch validation)
- Guardrails to block misuse[8]
💼 Upside for small teams: These copilots function as “virtual senior analysts” for code review, threat modeling, and artifact analysis—if wrapped in strong governance, logging, and containment.[7][8]
6. Architectural and Implementation Patterns to Mitigate AI-Scaled Attacks
Mitigation depends on AI architectures that embed security from day one, not as bolt‑ons.
Databricks’ AI Security Framework and “Rule of Two for Agents” emphasize layered defenses:[3]
- Avoid combining sensitive data, untrusted inputs, and powerful external actions in one agent.
- Enforce strict per‑agent and per‑tool data access controls.
- Validate/sanitize all inputs before use.
- Constrain and review outputs before triggering side‑effectful tools.
These are containment controls: assume compromise is possible, limit blast radius.[3]
📊 Shift-left for AI security[2][4]
Best practices:
- Threat‑model prompts, tools, agents, and data flows early.
- Red‑team model behavior and agent policies.
- Simulate prompt‑injection, data‑poisoning, and exfiltration scenarios.
- Maintain AI‑specific incident response plans.
For agents, guidance stresses:[11][12]
- Continuous monitoring of real‑world behavior
- Clear visibility into which tools and datasets each agent can access
- Strategies assuming tool misuse, memory poisoning, and unintended data exfiltration, not just benign hallucinations
Policy layer for tool-calling agents
A robust pattern is inserting a policy layer between LLM “intent” and actual tool execution:[3][11]
def execute_tool_call(user, agent_id, tool_name, args, context):
decision = policy_engine.evaluate(
user=user,
agent_id=agent_id,
tool=tool_name,
args=args,
data_sensitivity=classify_data(context),
intent=llm_infer_intent(tool_name, args, context),
)
if not decision.allowed:
log_block(user, agent_id, tool_name, args, reason=decision.reason)
return {"error": "action_blocked"}
result = tools[tool_name].run(args)
audit_log(user, agent_id, tool_name, args, result)
return result
- Decouples LLM reasoning from side effects
- Enforces least privilege at the tool boundary
- Provides a clean hook for anomaly detection and forensics
⚠️ End-to-end protection[2][4]
Vendors like SentinelOne and Wiz stress that securing LLMs means securing:
- Training and fine‑tuning data
- Model artifacts and configuration
- Deployment infra and secrets
- Integrations, plugins, agents, and SaaS apps
Attackers will hit the weakest link—data poisoning, prompt tampering, or unsecured plugins—to exfiltrate data or alter behavior.[2][4]
ML and security engineers should fold commercial LLM usage into overall AI security posture by instrumenting:[2][4]
- Model calls (caller, purpose, latency, error/refusal rates)
- Data flows and tool usage
- AI‑specific alerts and incident workflows
Conclusion: Designing for a Shared AI Battlefield
Commercial LLMs have turned from niche tools into shared infrastructure for both attackers and defenders. Offensively, they industrialize phishing, malware development, deepfakes, and C2, and agentic AI automates multi‑step campaigns.[1][3][9][10][12] Defensively, the same capabilities can compress detection, investigation, and remediation cycles—if wrapped in strong governance and containment.[5][7][8]
For ML and security engineers, the path forward is to:
- Treat LLMs as partially adversarial components, not trusted utilities.[2][4]
- Architect agent and assistant systems with strict policies, monitoring, and least privilege from the outset.[3][11][12]
- Integrate AI security into the SDLC and SOC workflows, including red‑teaming and AI‑specific incident response.[2][4][5][7]
In a world where attackers and defenders share the same AI stack, advantage goes to teams that understand these models deeply, instrument them rigorously, and design their architectures assuming intelligent abuse—not just accidental error.
Frequently Asked Questions
What immediate architectural controls stop LLM‑powered attacks?
How should incident response change for agentic AI threats?
Can commercial LLMs be used safely in CI/CD and developer tooling?
Sources & References (10)
- 1Malware guidé par LLM : comment l'IA réduit le signal observable pour contourner les seuils EDR - IT SOCIAL
Check Point Research a démontré en environnement contrôlé qu'un assistant IA doté de capacités de navigation web peut être détourné en canal de commandement et contrôle (C2) furtif, sans clé API ni co...
- 2Sécurité des LLM en entreprise : risques et bonnes pratiques | Wiz
Sécurité des LLM en entreprise : risques et bonnes pratiques Points clés sur la sécurité des LLM - La sécurité des LLM est une discipline de bout en bout qui protège les modèles, les pipelines de do...
- 3Atténuer le risque d'injection de prompt pour les agents IA sur Databricks | Databricks Blog
Résumé - Les agents d'IA autonomes ont besoin de données sensibles, d'entrées non fiables et d'actions externes pour être utiles, mais la combinaison de ces trois éléments crée des chaînes d'attaque ...
- 4Quels sont les risques de sécurité des LLM? Et comment les atténuer
Auteur: SentinelOne Mis à jour: October 24, 2025 Qu'est-ce que les grands modèles de langage et quels sont les risques de sécurité des LLM? Les grands modèles de langage (LLM) sont des systèmes d’IA...
- 5Du triage réactif à la défense autonome : Pourquoi l'intégration des LLM redéfinit le plafond opérationnel du SOC
Pendant des décennies, l'industrie de la cybersécurité a fonctionné sous une contrainte fondamentale : la défense était une fonction linéaire de l'effectif humain et de l'expertise spécialisée. Nous p...
- 6Comment gérer les Faux-Positifs dans un SOC
Le SIEM est l’un des outils les plus importants dans la lutte contre les cyber-attaques, mais avec l’augmentation du volume des données en provenance des différents équipements, le traitement des inci...
- 7Cybersécurité : qu’est-ce que Daybreak, la nouvelle initiative d’OpenAI ?
Daybreak est une initiative lancée par OpenAI pour la cyberdéfense qui regroupe ses modèles IA spécialisés, son agent Codex Security et un écosystème de partenaires de sécurité. L’objectif est d’intég...
- 8Scaling Trusted Access for Cyber with GPT-5.5 and GPT-5.5-Cyber
# Scaling Trusted Access for Cyber with GPT‑5.5 and GPT‑5.5‑Cyber How our latest models help each layer of the defensive ecosystem and accelerate the security flywheel. For years we’ve been chronicl...
- 9Quels sont les principaux cyberattaques et escroqueries assistées par l’IA ?
SIEM & EDR janvier 05, 2026 Les menaces assistées par l’IA ne sont pas un nouveau genre d’attaques. Il s’agit de tactiques familières – phishing, fraude, prise de contrôle de compte et livraison de ...
- 10Le groupe de hackers nord-coréen “HexagonalRodent” utilise l’IA pour lancer des attaques à grande échelle contre les développeurs Web3, volant plus de 12 millions de dollars d’actifs cryptographiques en trois mois.
Selon un rapport de recherche publié par la société de cybersécurité Expel, celle-ci suit actuellement un groupe APT évalué comme étant soutenu par la Corée du Nord (DPRK), nommé "HexagonalRodent", qu...
Key Entities
Generated by CoreProse in 3m 31s
What topic do you want to cover?
Get the same quality with verified sources on any subject.