[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"kb-article-inside-the-first-ai-crafted-zero-day-how-google-blocked-a-2fa-bypass-and-what-it-means-for-your-llm-security-stack-en":3,"ArticleBody_BBkxk3Fk9V1AaAtMPg7pIcFHyL5s2AbZv4EDNT5g":207},{"article":4,"relatedArticles":177,"locale":65},{"id":5,"title":6,"slug":7,"content":8,"htmlContent":9,"excerpt":10,"category":11,"tags":12,"metaDescription":10,"wordCount":13,"readingTime":14,"publishedAt":15,"sources":16,"sourceCoverage":58,"transparency":59,"seo":62,"language":65,"featuredImage":66,"featuredImageCredit":67,"isFreeGeneration":71,"trendSlug":72,"niche":73,"geoTakeaways":76,"geoFaq":85,"entities":95},"6a1740d9cdbfc0b804a68a63","Inside the First AI‑Crafted Zero‑Day: How Google Blocked a 2FA Bypass and What It Means for Your LLM Security Stack","inside-the-first-ai-crafted-zero-day-how-google-blocked-a-2fa-bypass-and-what-it-means-for-your-llm-security-stack","An AI system recently and autonomously assembled a working zero‑day exploit to bypass 2FA on an open‑source admin tool, then ran into a Google‑grade detection pipeline and was stopped.\n\nThis aligns with three visible trends:\n\n- Nation‑state operators using public LLMs for recon and scripting. [2]  \n- Offensive models autonomously discovering and chaining vulnerabilities. [8]  \n- AI‑native malware and [C2](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FC2) channels optimized around LLM behavior. [3][9]  \n\nSecurity and ML teams now must defend **against** AI‑driven attackers while securing **inside** AI‑driven systems. Treating LLMs as “just another tool” in SOC or DevSecOps is no longer tenable. [5][6]\n\n---\n\n## 1. 
From Hypothetical to Real: Why an AI‑Crafted Zero‑Day 2FA Bypass Matters\n\nNation‑states and advanced crime groups already use public LLMs for:\n\n- Protocol and standards analysis.  \n- Scripting and code assistance.  \n- Research on high‑value targets. [2]  \n\n[Microsoft](\u002Fentities\u002F69ea7cace1ca17caac372ea9-microsoft) has observed groups like Forest Blizzard and Salmon Typhoon querying GenAI for satellite, radar, and technical stack details, then using it to refine code and campaigns. [2]\n\n[Anthropic](\u002Fentities\u002F69d05cf64eea09eba3dfcc08-anthropic)’s Mythos Preview model showed that capable LLMs can: [8]\n\n- Ingest large codebases and binaries.  \n- Discover **thousands** of zero‑days, including very old bugs.  \n- Autonomously chain issues (e.g., four bugs into a browser sandbox escape).  \n\nReal‑world pattern:\n\n- A fintech red team used an internal LLM to move from “suspicious SSO plugin” to working account‑takeover PoC in under a day—code review, exploit sketching, and payload tweaks handled largely by the model. [8][10]  \n\nAnthropic’s later cloud PoC advanced this to a multi‑agent system that: [10]\n\n- Autonomously executed 80–90% of a penetration campaign on a misconfigured [GCP](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FGCP) environment.  \n- Covered asset discovery, lateral movement, and privilege escalation with humans mainly supervising.  \n\nSimultaneously, AI‑native malware research anticipates: [3]\n\n- Embedded LLMs driving self‑modifying code.  \n- Environment‑aware evasion and tactic switching.  \n- Continuous payload evolution faster than traditional SIEM rule updates.  
\n\nC2 is shifting as well:\n\n- [Check Point](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FCheck_Point) showed assistants like [Grok](\u002Fentities\u002F6a0b3ab61f0b27c1f426e46f-grok) and [Copilot](\u002Fentities\u002F6a0b3ab61f0b27c1f426e46e-copilot) can be turned into covert C2 channels via web‑fetch features—no dedicated C2 infra, no attacker‑owned API key, and traffic that looks like normal AI usage. [9]\n\nWhy this 2FA bypass matters:\n\n- OWASP’s LLM Top 10 highlights prompt injection, model abuse, and AI‑specific exfiltration—risks not covered by classic AppSec. [6]  \n- Regulators such as CNIL require 72‑hour notification for AI‑related breaches impacting auth and admin consoles. [6]  \n- The first blocked AI‑crafted zero‑day against an open‑source admin tool is a **concrete milestone** in LLMs becoming standard offensive components for nation‑states and cybercrime. [2][8][10]  \n\nOffense has moved from “AI assistant” to “AI operator.” Your defenses—and your own LLM stack—must assume that.\n\n---\n\n## 2. How LLMs Craft Zero‑Day Exploits and 2FA Bypasses in Practice\n\nMythos‑style models show that modern LLMs can: [8]\n\n- Ingest entire repos and binaries.  \n- Reason about edge cases and undefined behavior.  \n- Propose exploit primitives and chains quickly and at scale.  \n\n### A likely AI attacker workflow against 2FA\n\n1. **Code and config ingestion**\n\n   - Clone the open‑source admin tool.  \n   - Extract auth middleware, 2FA handlers, session logic.  \n   - Ask the model to map login → token issuance → 2FA challenge → session persistence. [8]\n\n2. **Vulnerability hypothesis generation**\n\n   - Enumerate plausible 2FA bypass paths: missing CSRF, weak binding, token reuse, flawed “remember device” logic, backup code abuse. [8]  \n   - Cluster issues by exploitability and prerequisites.\n\n3. **Exploit primitive synthesis**\n\n   Echoing Mythos’ vulnerability chaining, an LLM might combine: [8]\n\n   - Session fixation on pre‑auth routes.  
\n   - 2FA tokens bound to sessions, not identities.  \n   - CSRF on verification endpoints.  \n   - Over‑permissive recovery or device‑remember flows.\n\n4. **Payload generation and iteration**\n\n   AI‑native malware work suggests embedded LLMs will: [3]\n\n   - Adjust headers, ordering, and timing based on responses.  \n   - Mutate payloads to evade WAF rules and anomaly models.  \n   - Re‑plan when defenses partially block a chain. [3]\n\n5. **Live tuning via LLM‑enabled C2**\n\n   Using LLMs with web‑fetch as relays, attackers can: [9]\n\n   - Hide instructions inside URLs and benign prompts.  \n   - Tune exploit parameters in real time without visible C2 infra.  \n   - Blend traffic into normal “assistant” usage. [9]\n\nKey implication:\n\n- AI doesn’t introduce new vulnerability classes but compresses discovery → weaponization → deployment into hours, then scales replication once a pattern works. [3][8][10]  \n- Expect more frequent, better‑tuned attacks on 2FA and SSO, rapidly ported across similar stacks. [8][10]  \n\n---\n\n## 3. Inside a Google‑Style Detection Stack: LLM‑Augmented SIEM and UEBA\n\nStopping an AI‑crafted 2FA bypass requires telemetry and analytics operating near attacker speed.\n\nAugmented SIEM architectures typically combine: [1]\n\n- Traditional correlation rules and signatures.  \n- UEBA models that learn user, device, and service baselines.  \n- LLM layers that summarize, hypothesize, and propose new rules.  \n\nThese systems correlate events from identity providers, web frontends, admin tools, and infra, and flag patterns like anomalous token issuance or admin flows. [1]\n\nMicrosoft’s GenAI–SIEM experiments found LLMs can: [2]\n\n- Summarize complex alert clusters.  \n- Suggest likely attack paths and root causes.  \n- Propose candidate detection rules from natural‑language TTPs.  
\n\n### How a 2FA bypass shows up in an LLM‑augmented SIEM\n\nAt scale, a zero‑day 2FA bypass produces recognizable side‑effects: [1]\n\n- Spikes in failed or incomplete 2FA flows by IP range, ASN, or group.  \n- Session tokens with no matching 2FA events or with abnormal device\u002Fbrowser fingerprints.  \n- Sudden high‑risk admin actions (role grants, policy changes) by accounts with no history of such activity.  \n\nUEBA flags deviations, such as: [1]\n\n- A dormant engineer account doing bulk 2FA resets at 3 AM from a new ASN.  \n\nLLMs then: [1][2][9]\n\n- Explain anomalies in analyst‑friendly language.  \n- Correlate with known AI‑assisted behavior, including unusual assistant traffic tied to the same identity.  \n- Draft new correlation rules or enrichment flows for analyst review.\n\nFor sensitive environments using on‑prem AI (e.g., Codex via Dell’s AI Factory), detection agents can run **adjacent to** critical data and admin services, enabling: [4]\n\n- Low‑latency correlation and blocking.  \n- Less exposure of raw auth logs outside the perimeter. [4]\n\nCritical nuance:\n\n- LLM agents are **attack surfaces** and must themselves be monitored: [5][7][9]  \n  - Abnormal tool calls (e.g., repeated 2FA resets).  \n  - Unexpected access to admin consoles.  \n  - Prompts\u002Fresponses suggesting jailbreak or C2 behavior.  \n\nA Google‑style stack combines SIEM, UEBA, and LLM‑based triage, then escalates to automated containment:\n\n- Token revocation.  \n- Forced re‑authentication.  \n- Temporary blocks on suspicious admin flows. [1][2]  \n\n---\n\n## 4. Building Pipelines for AI‑Discovered Zero‑Days: Detection to Patch\n\nCatching an AI‑crafted exploit is only the start; the race is to mitigate the underlying zero‑day before it’s retried with improved variants.\n\nMythos and GitLab analyses indicate: [8]\n\n- AI can find vulnerabilities faster than teams patch.  \n- About one‑third of exploited CVEs in early 2025 were hit on or before disclosure day.  
\n- AI accelerates both volume and speed of new findings. [8]\n\n### Pipeline design for AI‑found zero‑days\n\nKey components:\n\n1. **Automated exploitability and impact classification**\n\n   - Use models to label issue types (RCE, auth logic, info‑leak). [8]  \n   - Attach business impact scores (e.g., 2FA bypass in production admin → critical, externally exploitable). [1][8]\n\n2. **Ownership routing and escalation**\n\n   - Route auth\u002F2FA vulnerabilities directly to identity\u002Fplatform security with strict SLAs. [1][8]  \n\n3. **Rule and model generation**\n\n   To keep pace with AI‑driven malware, defenders must automate going from “idea” to deployable detection: [3]\n\n   - Auto‑generate SIEM rules from observed anomalous auth patterns. [1][3]  \n   - Update UEBA baselines and models to reflect new attack paths. [1]  \n\n4. **Continuous offensive testing**\n\n   Mirror Anthropic‑style agentic systems internally: [10]\n\n   - Red‑team agents in CI\u002FCD that probe 2FA flows, SSO, admin consoles, and recovery flows.  \n   - Automated replay of known exploit patterns against staging.  \n\n5. **Incident response for AI incidents**\n\n   LLM‑specific IR playbooks should include: [5]\n\n   - Capturing prompts and responses involved in the exploit.  \n   - Tracing which agents and data sources were used.  \n   - Tightening guardrails, tools, and access for implicated agents.  \n\n6. **Governance and auditability**\n\n   Operational AI security guidance tied to GDPR\u002FAI Act stresses: [6]\n\n   - Logging AI‑driven decisions, exploit analyses, and auto‑mitigations.  \n   - Ensuring you can meet 72‑hour breach notification and audit demands.  \n\nBy combining AI‑driven discovery, prioritized routing, automated detection, continuous testing, and compliant logging, organizations can shrink time‑to‑mitigation from weeks to days—or hours. [3][6][8]\n\n---\n\n## 5. 
Securing Your Own LLMs, Agents, and Admin Tools in the Crossfire\n\nIn attacks like the 2FA bypass attempt, your **defensive** LLM stack can become the attacker’s pivot.\n\nThe LLM security risk guide highlights key attack surfaces: [5]\n\n- User prompts and uploads.  \n- Internal RAG sources and vector stores.  \n- Tools\u002Fplugins and APIs—especially those touching admin and auth.  \n\nPoorly constrained, any of these can lead directly to admin consoles or 2FA settings. [5]\n\n### Policy and governance gaps\n\nCurrent compliance guidance notes: [6]\n\n- ~74% of enterprises lack AI‑specific security policies.  \n- Classic controls ignore prompt injection, data poisoning, and tool abuse.  \n\nWhen LLMs are wired into admin tools, ticketing systems, or ChatOps that can trigger 2FA or role changes, you effectively place a semi‑autonomous agent next to your highest‑risk controls. [5][6]\n\nJailbreaking research shows: [7]\n\n- Crafted prompts and hidden HTML instructions can override safety training.  \n- LLM‑based email filters, log viewers, and console assistants can be turned into vectors when they parse attacker‑controlled content.  \n\nExamples apply to: [5][7]\n\n- Log viewers powered by LLMs.  \n- Admin console “AI assistants.”  \n- ChatOps bots allowed to run 2FA resets or grant roles.  \n\nCheck Point’s C2 work underscores that: [9]\n\n- AI traffic is often trusted and under‑monitored.  \n- Assistants with backend or log access but weak egress control can serve as ideal C2 relays and exploit tuners.  \n\n### Layered defenses for your AI stack\n\nTreat LLMs and agents as primary security assets:\n\n- **Prompt filtering\u002Fsanitization**  \n  - Strip or neutralize jailbreak patterns and untrusted markup before model input. [5][7]  \n\n- **Tool‑use allowlists**  \n  - Enumerate allowed APIs; exclude 2FA reset and high‑risk admin calls unless absolutely needed. 
[5]  \n\n- **Scoped admin APIs**  \n  - Enforce fine‑grained RBAC, contextual checks, and strong audit for LLM‑triggered admin actions. [5][6]  \n\n- **Behavioral monitoring of agents**  \n  - UEBA‑style analytics for tool‑call sequences, admin action frequency, and unusual access. [1][5]  \n\n- **Detection of jailbreak and C2‑like usage**  \n  - Flag sessions with prompt‑injection markers or repeated web‑fetches to suspicious domains. [7][9]  \n\n---\n\n## 6. Implementation Roadmap for Security & ML Engineers\n\nTo operationalize this, you need a staged program spanning security, ML, and compliance.\n\n### Step 1: Extend your threat model with AI‑specific risks\n\nExplicitly incorporate: [5][6][8][9]\n\n- AI‑assisted zero‑day discovery against your codebases. [8]  \n- Prompt injection against LLM agents with access to admin or IR tools. [5]  \n- C2 over trusted assistant\u002Fcopilot traffic. [6][9]  \n\nMap where LLMs intersect with auth, 2FA, SSO, and admin paths.\n\n### Step 2: Integrate LLM‑augmented analytics into SIEM\u002FUEBA\n\nBased on SIEM augmentation work and Microsoft experiments: [1][2]\n\n- Feed login, 2FA, SSO, and admin logs into UEBA.  \n- Layer LLMs for alert summarization, root‑cause hypotheses, and auto‑drafted rules.  \n- Focus on anomalous 2FA events, privilege escalations, and admin session anomalies.  \n\nQuick win: [1][2]\n\n- Use an LLM to propose SIEM rules for “impossible travel” during 2FA enrollment or bulk 2FA resets, then refine with analysts.\n\n### Step 3: Deploy on‑prem or hybrid AI for sensitive workloads\n\nWhere possible, deploy on‑prem\u002Fhybrid agents like Codex near your most sensitive systems: [4]\n\n- Local code review and exploit triage for auth components.  \n- On‑box anomaly detection over 2FA and admin logs.  \n- Reduced exposure of sensitive data to public APIs. 
[4][6]\n\n---\n\n## Conclusion: Meeting AI‑Driven Attackers at Their Own Speed\n\nAI‑crafted exploits and AI‑assisted attackers are now an operational reality. The blocked 2FA bypass zero‑day shows both risk and opportunity:\n\n- Offensive models can rapidly find and weaponize flaws. [3][8][10]  \n- Defensive stacks that fuse SIEM, UEBA, and LLMs can detect and contain them, if designed with AI in mind. [1][2][5]  \n\nBy extending your threat model, augmenting analytics with LLMs, deploying carefully scoped local agents, and securing your own LLM stack as a high‑value target, you can keep pace with AI‑driven attackers instead of becoming their next pivot. [5][6][8]","\u003Cp>An AI system recently and autonomously assembled a working zero‑day exploit to bypass 2FA on an open‑source admin tool, then ran into a Google‑grade detection pipeline and was stopped.\u003C\u002Fp>\n\u003Cp>This aligns with three visible trends:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Nation‑state operators using public LLMs for recon and scripting. \u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Offensive models autonomously discovering and chaining vulnerabilities. \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>AI‑native malware and \u003Ca href=\"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FC2\" class=\"wiki-link\" target=\"_blank\" rel=\"noopener\">C2\u003C\u002Fa> channels optimized around LLM behavior. 
\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Security and ML teams now must defend \u003Cstrong>against\u003C\u002Fstrong> AI‑driven attackers while securing \u003Cstrong>inside\u003C\u002Fstrong> AI‑driven systems. Treating LLMs as “just another tool” in SOC or DevSecOps is no longer tenable. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fp>\n\u003Chr>\n\u003Ch2>1. From Hypothetical to Real: Why an AI‑Crafted Zero‑Day 2FA Bypass Matters\u003C\u002Fh2>\n\u003Cp>Nation‑states and advanced crime groups already use public LLMs for:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Protocol and standards analysis.\u003C\u002Fli>\n\u003Cli>Scripting and code assistance.\u003C\u002Fli>\n\u003Cli>Research on high‑value targets. \u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>\u003Ca href=\"\u002Fentities\u002F69ea7cace1ca17caac372ea9-microsoft\">Microsoft\u003C\u002Fa> has observed groups like Forest Blizzard and Salmon Typhoon querying GenAI for satellite, radar, and technical stack details, then using it to refine code and campaigns. 
\u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>\u003Ca href=\"\u002Fentities\u002F69d05cf64eea09eba3dfcc08-anthropic\">Anthropic\u003C\u002Fa>’s Mythos Preview model showed that capable LLMs can: \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Ingest large codebases and binaries.\u003C\u002Fli>\n\u003Cli>Discover \u003Cstrong>thousands\u003C\u002Fstrong> of zero‑days, including very old bugs.\u003C\u002Fli>\n\u003Cli>Autonomously chain issues (e.g., four bugs into a browser sandbox escape).\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Real‑world pattern:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>A fintech red team used an internal LLM to move from “suspicious SSO plugin” to working account‑takeover PoC in under a day—code review, exploit sketching, and payload tweaks handled largely by the model. \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003Ca href=\"#source-10\" class=\"citation-link\" title=\"View source [10]\">[10]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Anthropic’s later cloud PoC advanced this to a multi‑agent system that: \u003Ca href=\"#source-10\" class=\"citation-link\" title=\"View source [10]\">[10]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Autonomously executed 80–90% of a penetration campaign on a misconfigured \u003Ca href=\"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FGCP\" class=\"wiki-link\" target=\"_blank\" rel=\"noopener\">GCP\u003C\u002Fa> environment.\u003C\u002Fli>\n\u003Cli>Covered asset discovery, lateral movement, and privilege escalation with humans mainly supervising.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Simultaneously, AI‑native malware research anticipates: \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Embedded LLMs driving 
self‑modifying code.\u003C\u002Fli>\n\u003Cli>Environment‑aware evasion and tactic switching.\u003C\u002Fli>\n\u003Cli>Continuous payload evolution faster than traditional SIEM rule updates.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>C2 is shifting as well:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>\u003Ca href=\"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FCheck_Point\" class=\"wiki-link\" target=\"_blank\" rel=\"noopener\">Check Point\u003C\u002Fa> showed assistants like \u003Ca href=\"\u002Fentities\u002F6a0b3ab61f0b27c1f426e46f-grok\">Grok\u003C\u002Fa> and \u003Ca href=\"\u002Fentities\u002F6a0b3ab61f0b27c1f426e46e-copilot\">Copilot\u003C\u002Fa> can be turned into covert C2 channels via web‑fetch features—no dedicated C2 infra, no attacker‑owned API key, and traffic that looks like normal AI usage. \u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Why this 2FA bypass matters:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>OWASP’s LLM Top 10 highlights prompt injection, model abuse, and AI‑specific exfiltration—risks not covered by classic AppSec. \u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Regulators such as CNIL require 72‑hour notification for AI‑related breaches impacting auth and admin consoles. \u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>The first blocked AI‑crafted zero‑day against an open‑source admin tool is a \u003Cstrong>concrete milestone\u003C\u002Fstrong> in LLMs becoming standard offensive components for nation‑states and cybercrime. 
\u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003Ca href=\"#source-10\" class=\"citation-link\" title=\"View source [10]\">[10]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Offense has moved from “AI assistant” to “AI operator.” Your defenses—and your own LLM stack—must assume that.\u003C\u002Fp>\n\u003Chr>\n\u003Ch2>2. How LLMs Craft Zero‑Day Exploits and 2FA Bypasses in Practice\u003C\u002Fh2>\n\u003Cp>Mythos‑style models show that modern LLMs can: \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Ingest entire repos and binaries.\u003C\u002Fli>\n\u003Cli>Reason about edge cases and undefined behavior.\u003C\u002Fli>\n\u003Cli>Propose exploit primitives and chains quickly and at scale.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch3>A likely AI attacker workflow against 2FA\u003C\u002Fh3>\n\u003Col>\n\u003Cli>\n\u003Cp>\u003Cstrong>Code and config ingestion\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Clone the open‑source admin tool.\u003C\u002Fli>\n\u003Cli>Extract auth middleware, 2FA handlers, session logic.\u003C\u002Fli>\n\u003Cli>Ask the model to map login → token issuance → 2FA challenge → session persistence. \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Vulnerability hypothesis generation\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Enumerate plausible 2FA bypass paths: missing CSRF, weak binding, token reuse, flawed “remember device” logic, backup code abuse. 
\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Cluster issues by exploitability and prerequisites.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Exploit primitive synthesis\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cp>Echoing Mythos’ vulnerability chaining, an LLM might combine: \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Session fixation on pre‑auth routes.\u003C\u002Fli>\n\u003Cli>2FA tokens bound to sessions, not identities.\u003C\u002Fli>\n\u003Cli>CSRF on verification endpoints.\u003C\u002Fli>\n\u003Cli>Over‑permissive recovery or device‑remember flows.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Payload generation and iteration\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cp>AI‑native malware work suggests embedded LLMs will: \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Adjust headers, ordering, and timing based on responses.\u003C\u002Fli>\n\u003Cli>Mutate payloads to evade WAF rules and anomaly models.\u003C\u002Fli>\n\u003Cli>Re‑plan when defenses partially block a chain. \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Live tuning via LLM‑enabled C2\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cp>Using LLMs with web‑fetch as relays, attackers can: \u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Hide instructions inside URLs and benign prompts.\u003C\u002Fli>\n\u003Cli>Tune exploit parameters in real time without visible C2 infra.\u003C\u002Fli>\n\u003Cli>Blend traffic into normal “assistant” usage. 
\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003C\u002Fol>\n\u003Cp>Key implication:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>AI doesn’t introduce new vulnerability classes but compresses discovery → weaponization → deployment into hours, then scales replication once a pattern works. \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003Ca href=\"#source-10\" class=\"citation-link\" title=\"View source [10]\">[10]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Expect more frequent, better‑tuned attacks on 2FA and SSO, rapidly ported across similar stacks. \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003Ca href=\"#source-10\" class=\"citation-link\" title=\"View source [10]\">[10]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Chr>\n\u003Ch2>3. Inside a Google‑Style Detection Stack: LLM‑Augmented SIEM and UEBA\u003C\u002Fh2>\n\u003Cp>Stopping an AI‑crafted 2FA bypass requires telemetry and analytics operating near attacker speed.\u003C\u002Fp>\n\u003Cp>Augmented SIEM architectures typically combine: \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Traditional correlation rules and signatures.\u003C\u002Fli>\n\u003Cli>UEBA models that learn user, device, and service baselines.\u003C\u002Fli>\n\u003Cli>LLM layers that summarize, hypothesize, and propose new rules.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>These systems correlate events from identity providers, web frontends, admin tools, and infra, and flag patterns like anomalous token issuance or admin flows. 
\u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Microsoft’s GenAI–SIEM experiments found LLMs can: \u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Summarize complex alert clusters.\u003C\u002Fli>\n\u003Cli>Suggest likely attack paths and root causes.\u003C\u002Fli>\n\u003Cli>Propose candidate detection rules from natural‑language TTPs.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch3>How a 2FA bypass shows up in an LLM‑augmented SIEM\u003C\u002Fh3>\n\u003Cp>At scale, a zero‑day 2FA bypass produces recognizable side‑effects: \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Spikes in failed or incomplete 2FA flows by IP range, ASN, or group.\u003C\u002Fli>\n\u003Cli>Session tokens with no matching 2FA events or with abnormal device\u002Fbrowser fingerprints.\u003C\u002Fli>\n\u003Cli>Sudden high‑risk admin actions (role grants, policy changes) by accounts with no history of such activity.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>UEBA flags deviations, such as: \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>A dormant engineer account doing bulk 2FA resets at 3 AM from a new ASN.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>LLMs then: \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Explain anomalies in analyst‑friendly language.\u003C\u002Fli>\n\u003Cli>Correlate with known AI‑assisted behavior, including unusual assistant traffic tied to the same identity.\u003C\u002Fli>\n\u003Cli>Draft new correlation 
rules or enrichment flows for analyst review.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>For sensitive environments using on‑prem AI (e.g., Codex via Dell’s AI Factory), detection agents can run \u003Cstrong>adjacent to\u003C\u002Fstrong> critical data and admin services, enabling: \u003Ca href=\"#source-4\" class=\"citation-link\" title=\"View source [4]\">[4]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Low‑latency correlation and blocking.\u003C\u002Fli>\n\u003Cli>Less exposure of raw auth logs outside the perimeter. \u003Ca href=\"#source-4\" class=\"citation-link\" title=\"View source [4]\">[4]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Critical nuance:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>LLM agents are \u003Cstrong>attack surfaces\u003C\u002Fstrong> and must themselves be monitored: \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\n\u003Cul>\n\u003Cli>Abnormal tool calls (e.g., repeated 2FA resets).\u003C\u002Fli>\n\u003Cli>Unexpected access to admin consoles.\u003C\u002Fli>\n\u003Cli>Prompts\u002Fresponses suggesting jailbreak or C2 behavior.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>A Google‑style stack combines SIEM, UEBA, and LLM‑based triage, then escalates to automated containment:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Token revocation.\u003C\u002Fli>\n\u003Cli>Forced re‑authentication.\u003C\u002Fli>\n\u003Cli>Temporary blocks on suspicious admin flows. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Chr>\n\u003Ch2>4. 
Building Pipelines for AI‑Discovered Zero‑Days: Detection to Patch\u003C\u002Fh2>\n\u003Cp>Catching an AI‑crafted exploit is only the start; the race is to mitigate the underlying zero‑day before it’s retried with improved variants.\u003C\u002Fp>\n\u003Cp>Mythos and GitLab analyses indicate: \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>AI can find vulnerabilities faster than teams patch.\u003C\u002Fli>\n\u003Cli>About one‑third of exploited CVEs in early 2025 were hit on or before disclosure day.\u003C\u002Fli>\n\u003Cli>AI accelerates both volume and speed of new findings. \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch3>Pipeline design for AI‑found zero‑days\u003C\u002Fh3>\n\u003Cp>Key components:\u003C\u002Fp>\n\u003Col>\n\u003Cli>\n\u003Cp>\u003Cstrong>Automated exploitability and impact classification\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Use models to label issue types (RCE, auth logic, info‑leak). \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Attach business impact scores (e.g., 2FA bypass in production admin → critical, externally exploitable). \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Ownership routing and escalation\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Route auth\u002F2FA vulnerabilities directly to identity\u002Fplatform security with strict SLAs. 
\u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Rule and model generation\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cp>To keep pace with AI‑driven malware, defenders must automate going from “idea” to deployable detection: \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Auto‑generate SIEM rules from observed anomalous auth patterns. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Update UEBA baselines and models to reflect new attack paths. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Continuous offensive testing\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cp>Mirror Anthropic‑style agentic systems internally: \u003Ca href=\"#source-10\" class=\"citation-link\" title=\"View source [10]\">[10]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Red‑team agents in CI\u002FCD that probe 2FA flows, SSO, admin consoles, and recovery flows.\u003C\u002Fli>\n\u003Cli>Automated replay of known exploit patterns against staging.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Incident response for AI incidents\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cp>LLM‑specific IR playbooks should include: \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Capturing prompts and responses involved in the exploit.\u003C\u002Fli>\n\u003Cli>Tracing which agents and data sources were 
used.\u003C\u002Fli>\n\u003Cli>Tightening guardrails, tools, and access for implicated agents.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Governance and auditability\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cp>Operational AI security guidance tied to GDPR\u002FAI Act stresses: \u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Logging AI‑driven decisions, exploit analyses, and auto‑mitigations.\u003C\u002Fli>\n\u003Cli>Ensuring you can meet 72‑hour breach notification and audit demands.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003C\u002Fol>\n\u003Cp>By combining AI‑driven discovery, prioritized routing, automated detection, continuous testing, and compliant logging, organizations can shrink time‑to‑mitigation from weeks to days—or hours. \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fp>\n\u003Chr>\n\u003Ch2>5. Securing Your Own LLMs, Agents, and Admin Tools in the Crossfire\u003C\u002Fh2>\n\u003Cp>In attacks like the 2FA bypass attempt, your \u003Cstrong>defensive\u003C\u002Fstrong> LLM stack can become the attacker’s pivot.\u003C\u002Fp>\n\u003Cp>The LLM security risk guide highlights key attack surfaces: \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>User prompts and uploads.\u003C\u002Fli>\n\u003Cli>Internal RAG sources and vector stores.\u003C\u002Fli>\n\u003Cli>Tools\u002Fplugins and APIs—especially those touching admin and auth.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Poorly constrained, any of these can lead directly to admin consoles or 2FA settings. 
\u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003C\u002Fp>\n\u003Ch3>Policy and governance gaps\u003C\u002Fh3>\n\u003Cp>Current compliance guidance notes: \u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>~74% of enterprises lack AI‑specific security policies.\u003C\u002Fli>\n\u003Cli>Classic controls ignore prompt injection, data poisoning, and tool abuse.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>When LLMs are wired into admin tools, ticketing systems, or ChatOps that can trigger 2FA or role changes, you effectively place a semi‑autonomous agent next to your highest‑risk controls. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Jailbreaking research shows: \u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Crafted prompts and hidden HTML instructions can override safety training.\u003C\u002Fli>\n\u003Cli>LLM‑based email filters, log viewers, and console assistants can be turned into vectors when they parse attacker‑controlled content.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Examples apply to: \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Log viewers powered by LLMs.\u003C\u002Fli>\n\u003Cli>Admin console “AI assistants.”\u003C\u002Fli>\n\u003Cli>ChatOps bots allowed to run 2FA resets or grant roles.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Check Point’s C2 work underscores that: \u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>AI traffic is 
often trusted and under‑monitored.\u003C\u002Fli>\n\u003Cli>Assistants with backend or log access but weak egress control can serve as ideal C2 relays and exploit tuners.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch3>Layered defenses for your AI stack\u003C\u002Fh3>\n\u003Cp>Treat LLMs and agents as primary security assets:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>\n\u003Cp>\u003Cstrong>Prompt filtering\u002Fsanitization\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Strip or neutralize jailbreak patterns and untrusted markup before model input. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Tool‑use allowlists\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Enumerate allowed APIs; exclude 2FA reset and high‑risk admin calls unless absolutely needed. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Scoped admin APIs\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Enforce fine‑grained RBAC, contextual checks, and strong audit for LLM‑triggered admin actions. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Behavioral monitoring of agents\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>UEBA‑style analytics for tool‑call sequences, admin action frequency, and unusual access. 
\u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003Cli>\n\u003Cp>\u003Cstrong>Detection of jailbreak and C2‑like usage\u003C\u002Fstrong>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Flag sessions with prompt‑injection markers or repeated web‑fetches to suspicious domains. \u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Chr>\n\u003Ch2>6. Implementation Roadmap for Security &amp; ML Engineers\u003C\u002Fh2>\n\u003Cp>To operationalize this, you need a staged program spanning security, ML, and compliance.\u003C\u002Fp>\n\u003Ch3>Step 1: Extend your threat model with AI‑specific risks\u003C\u002Fh3>\n\u003Cp>Explicitly incorporate: \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>AI‑assisted zero‑day discovery against your codebases. \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Prompt injection against LLM agents with access to admin or IR tools. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>C2 over trusted assistant\u002Fcopilot traffic. 
\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Map where LLMs intersect with auth, 2FA, SSO, and admin paths.\u003C\u002Fp>\n\u003Ch3>Step 2: Integrate LLM‑augmented analytics into SIEM\u002FUEBA\u003C\u002Fh3>\n\u003Cp>Based on SIEM augmentation work and Microsoft experiments: \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Feed login, 2FA, SSO, and admin logs into UEBA.\u003C\u002Fli>\n\u003Cli>Layer LLMs for alert summarization, root‑cause hypotheses, and auto‑drafted rules.\u003C\u002Fli>\n\u003Cli>Focus on anomalous 2FA events, privilege escalations, and admin session anomalies.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Quick win: \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Use an LLM to propose SIEM rules for “impossible travel” during 2FA enrollment or bulk 2FA resets, then refine with analysts.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch3>Step 3: Deploy on‑prem or hybrid AI for sensitive workloads\u003C\u002Fh3>\n\u003Cp>Where possible, deploy on‑prem\u002Fhybrid agents like Codex near your most sensitive systems: \u003Ca href=\"#source-4\" class=\"citation-link\" title=\"View source [4]\">[4]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Local code review and exploit triage for auth components.\u003C\u002Fli>\n\u003Cli>On‑box anomaly detection over 2FA and admin logs.\u003C\u002Fli>\n\u003Cli>Reduced exposure of sensitive data to public APIs. 
\u003Ca href=\"#source-4\" class=\"citation-link\" title=\"View source [4]\">[4]\u003C\u002Fa>\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Chr>\n\u003Ch2>Conclusion: Meeting AI‑Driven Attackers at Their Own Speed\u003C\u002Fh2>\n\u003Cp>AI‑crafted exploits and AI‑assisted attackers are now an operational reality. The blocked 2FA bypass zero‑day shows both risk and opportunity:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Offensive models can rapidly find and weaponize flaws. \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003Ca href=\"#source-10\" class=\"citation-link\" title=\"View source [10]\">[10]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Defensive stacks that fuse SIEM, UEBA, and LLMs can detect and contain them, provided they are designed with AI in mind. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-2\" class=\"citation-link\" title=\"View source [2]\">[2]\u003C\u002Fa>\u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>By extending your threat model, augmenting analytics with LLMs, deploying carefully scoped local agents, and securing your own LLM stack as a high‑value target, you can keep pace with AI‑driven attackers instead of becoming their next pivot. 
\u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fp>\n","An AI system recently autonomously assembled a working zero‑day exploit to bypass 2FA on an open‑source admin tool—then ran into a Google‑grade detection pipeline and was stopped.\n\nThis aligns three v...","hallucinations",[],1985,10,"2026-05-27T19:13:11.178Z",[17,22,26,30,34,38,42,46,50,54],{"title":18,"url":19,"summary":20,"type":21},"Détection de Menaces par IA : SIEM Augmenté : Guide","https:\u002F\u002Fayinedjimi-consultants.fr\u002Farticles\u002Fia-detection-menaces-siem-augmente","Détection de Menaces par IA : SIEM Augmenté & UEBA 2026\n\n13 février 2026\n\nMis à jour le 22 mai 2026\n\n17 min de lecture\n\n5099 mots\n\n781 vues\n\nTélécharger le PDF\n\nGuide complet sur la détection de menac...","kb",{"title":23,"url":24,"summary":25,"type":21},"Comment les grands modèles de langage (LLM) évoluent SIEM","https:\u002F\u002Fstellarcyber.ai\u002Ffr\u002Flearn\u002Fintegrating-llms-into-siem\u002F","---TITLE---\nComment les grands modèles de langage (LLM) évoluent SIEM\n---CONTENT---\nComment les grands modèles de langage (LLM) évoluent SIEM\n\nLes attaquants utilisent déjà des LLM contre les systèmes...",{"title":27,"url":28,"summary":29,"type":21},"Logiciels malveillants IA et abus de LLM : La prochaine vague de menaces cybernétiques","https:\u002F\u002Fsocprime.com\u002Ffr\u002Fblog\u002Flatest-threats\u002Flogiciels-malveillants-ia-et-abus-de-llm\u002F","Dernières Menaces\n\nnovembre 14, 2025\n\n11 min de lecture\n\n# Logiciels malveillants IA et abus de LLM : La prochaine vague de menaces cybernétiques\n\nOn s’attend à ce que les menaces basées sur l’IA croi...",{"title":31,"url":32,"summary":33,"type":21},"OpenAI et Dell rapprochent Codex des données 
d’entreprise sur site et en environnement hybride - IT SOCIAL","https:\u002F\u002Fitsocial.fr\u002Fcloud-infrastructure-it\u002Fcloud-infrastructure-it-actualites\u002Fopenai-et-dell-rapprochent-codex-des-donnees-dentreprise-sur-site-et-en-environnement-hybride\u002F","OpenAI et Dell ouvrent le déploiement de Codex aux environnements hybrides et sur site. L'intégration vise la plateforme Dell AI Data Platform et la pile Dell AI Factory, avec pour objectif de rapproc...",{"title":35,"url":36,"summary":37,"type":21},"Sécurité des LLM : Risques et Mitigations Guide 2026","https:\u002F\u002Fayinedjimi-consultants.fr\u002Farticles\u002Fsecurite-llm-agents-guide-pratique","Les modèles de langage (LLM) et leurs agents constituent une nouvelle surface d’attaque. Ils peuvent être détournés par prompt injection, fuite de don.\n\nRésumé exécutif\nLes modèles de langage (LLM) et...",{"title":39,"url":40,"summary":41,"type":21},"Comment sécuriser vos systèmes IA face au RGPD et à l'AI Act : le guide opérationnel 2026","https:\u002F\u002Fwww.2lkatime.com\u002Fblog\u002Fsecurite-systemes-ia-rgpd-ai-act-guide-2026\u002F","# Comment sécuriser vos systèmes IA face au RGPD et à l'AI Act : le guide opérationnel 2026\n\n5 pratiques concrètes pour protéger vos modèles IA, respecter la conformité et anticiper les nouvelles mena...",{"title":43,"url":44,"summary":45,"type":21},"Jailbreaking des LLM : risques et tactiques défensives","https:\u002F\u002Fwww.sentinelone.com\u002Ffr\u002Fcybersecurity-101\u002Fdata-and-ai\u002Fjailbreaking-llms\u002F","Jailbreaking des LLM : risques et tactiques défensives\n\nLes attaques de jailbreaking manipulent les entrées des LLM pour contourner les contrôles de sécurité. 
Découvrez comment l’IA comportementale et...",{"title":47,"url":48,"summary":49,"type":21},"Pipelines et vulnérabilités zero-day découvertes par l'IA","https:\u002F\u002Fabout.gitlab.com\u002Ffr-fr\u002Fblog\u002Fprepare-your-pipeline-for-ai-discovered-zero-days\u002F","# Pipelines et vulnérabilités zero-day découvertes par l'IA\n\nPipelines et vulnérabilités zero-day découvertes par l'IA\n\nDate de publication: 11 mai 2026\n\nTemps de lecture: 8 min\n\n# Vulnérabilités zero...",{"title":51,"url":52,"summary":53,"type":21},"Malware guidé par LLM : comment l'IA réduit le signal observable pour contourner les seuils EDR - IT SOCIAL","https:\u002F\u002Fitsocial.fr\u002Fcybersecurite\u002Fcybersecurite-articles\u002Fmalware-guide-par-llm-comment-lia-reduit-le-signal-observable-pour-contourner-les-seuils-edr\u002F","Check Point Research a démontré en environnement contrôlé qu'un assistant IA doté de capacités de navigation web peut être détourné en canal de commandement et contrôle (C2) furtif, sans clé API ni co...",{"title":55,"url":56,"summary":57,"type":21},"L’IA peut-elle s’attaquer au cloud? Enseignements tirés de la construction d’un système multi-agents offensif autonome dans le cloud","https:\u002F\u002Funit42.paloaltonetworks.com\u002Ffr\u002Fautonomous-ai-cloud-attacks\u002F","Avant-propos\n\nLes capacités offensives des large language models (LLM, grands modèles de langage) n’étaient jusqu’à présent que des risques théoriques: ils étaient fréquemment évoqués lors de conféren...",{"totalSources":14},{"generationDuration":60,"kbQueriesCount":14,"confidenceScore":61,"sourcesCount":14},280788,100,{"metaTitle":63,"metaDescription":64},"AI Zero-Day Threats: Google Blocked 2FA Bypass Explained","AI-crafted zero-day nearly bypassed 2FA. 
See how Google-grade detection stopped it, why LLMs change threat models, and get 5 concrete mitigation steps.","en","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1712081378219-2af1915f5540?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwxfHxpbnNpZGUlMjBmaXJzdCUyMGNyYWZ0ZWQlMjB6ZXJvfGVufDF8MHx8fDE3Nzk5MjI1ODd8MA&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60",{"photographerName":68,"photographerUrl":69,"unsplashUrl":70},"Peter Herrmann","https:\u002F\u002Funsplash.com\u002F@tama66?utm_source=coreprose&utm_medium=referral","https:\u002F\u002Funsplash.com\u002Fphotos\u002Fa-room-filled-with-lots-of-different-types-of-machinery-g2cuffsye18?utm_source=coreprose&utm_medium=referral",false,null,{"key":74,"name":75,"nameEn":75},"ai-engineering","AI Engineering & LLM Ops",[77,79,81,83],{"text":78},"An AI system autonomously assembled a working zero‑day 2FA bypass against an open‑source admin tool and was stopped by a Google‑grade detection pipeline before exploitation.",{"text":80},"Modern LLMs compress discovery→weaponization→deployment from weeks to hours and can autonomously execute 80–90% of a penetration campaign on misconfigured cloud environments.",{"text":82},"About one‑third of exploited CVEs in early 2025 were attacked on or before public disclosure; AI accelerates both volume and speed of such findings.",{"text":84},"Defenders must treat internal LLMs and agents as high‑value assets: enforce prompt sanitization, tool‑use allowlists, scoped admin APIs, and UEBA monitoring for agent behavior.",[86,89,92],{"question":87,"answer":88},"What exactly occurred when the AI‑crafted zero‑day was detected?","The detection pipeline intercepted an AI that had autonomously assembled a 2FA bypass exploit against an open‑source admin tool and prevented its execution. 
Telemetry showed anomalous 2FA flows, session tokens lacking matching 2FA events, and unusual admin actions; an LLM‑augmented SIEM correlated these signals, summarized the attack path, and triggered automated containment (token revocation, forced re‑auth, temporary admin flow blocks). Analysts then captured prompts, agent tool calls, and related logs to trace which models and data sources were used for remediation and audit.",{"question":90,"answer":91},"How do LLMs actually craft zero‑day exploits and bypasses in practice?","LLMs craft exploits by ingesting repos and binaries, hypothesizing vuln classes (e.g., CSRF, session fixation, token misuse), synthesizing exploit primitives, and iterating payloads against live responses; they can chain multiple issues into a working exploit. They also use LLM‑enabled C2 techniques—embedding instructions in benign assistant traffic or web‑fetches—to tune timing, headers, and payloads in real time, mutating to evade WAFs and anomaly detectors. This chain reduces human effort and scales replication across similar stacks once a pattern works.",{"question":93,"answer":94},"What concrete steps should organizations take to secure both their services and their LLM stacks?","Organizations must extend threat models to include AI‑assisted discovery, prompt injection, and assistant‑based C2, then implement layered controls: prompt filtering and sanitization, strict tool‑use allowlists that exclude high‑risk admin APIs, fine‑grained RBAC and contextual checks for any LLM‑triggered admin action, and UEBA monitoring for agent behavior. 
Deploy on‑prem or hybrid AI near sensitive systems when feasible, automate SIEM\u002FUEBA rule generation from observed attack patterns, run internal agentic red teams against 2FA\u002FSSO, and preserve auditable logs (prompts, responses, tool calls) to meet regulatory timelines and support rapid patching.",[96,104,110,116,121,125,131,135,139,147,153,158,163,167,172],{"id":97,"name":98,"type":99,"confidence":100,"wikipediaUrl":101,"slug":102,"mentionCount":103},"6a0e85df07a4fdbfcf5ec3c9","C2","concept",0.95,"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FC2","6a0e85df07a4fdbfcf5ec3c9-c2",2,{"id":105,"name":106,"type":99,"confidence":100,"wikipediaUrl":107,"slug":108,"mentionCount":109},"6a17426aa2d594d36d23729f","zero-day exploit","https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FZero-day_vulnerability","6a17426aa2d594d36d23729f-zero-day-exploit",1,{"id":111,"name":112,"type":99,"confidence":113,"wikipediaUrl":114,"slug":115,"mentionCount":109},"6a17426aa2d594d36d23729e","AI system",0.9,"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FDiella_(AI_system)","6a17426aa2d594d36d23729e-ai-system",{"id":117,"name":118,"type":99,"confidence":119,"wikipediaUrl":72,"slug":120,"mentionCount":109},"6a17426aa2d594d36d2372a0","2FA",0.98,"6a17426aa2d594d36d2372a0-2fa",{"id":122,"name":123,"type":99,"confidence":100,"wikipediaUrl":72,"slug":124,"mentionCount":109},"6a17426aa2d594d36d2372a4","public LLMs","6a17426aa2d594d36d2372a4-public-llms",{"id":126,"name":127,"type":99,"confidence":128,"wikipediaUrl":129,"slug":130,"mentionCount":109},"6a17426aa2d594d36d2372a2","Google‑grade detection pipeline",0.85,"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FList_of_pipeline_accidents_in_the_United_States_in_the_1970s","6a17426aa2d594d36d2372a2-google-grade-detection-pipeline",{"id":132,"name":133,"type":99,"confidence":100,"wikipediaUrl":72,"slug":134,"mentionCount":109},"6a17426ba2d594d36d2372a9","AI-native 
malware","6a17426ba2d594d36d2372a9-ai-native-malware",{"id":136,"name":137,"type":99,"confidence":113,"wikipediaUrl":72,"slug":138,"mentionCount":109},"6a17426ba2d594d36d2372a8","multi-agent system","6a17426ba2d594d36d2372a8-multi-agent-system",{"id":140,"name":141,"type":142,"confidence":143,"wikipediaUrl":144,"slug":145,"mentionCount":146},"69d05cf64eea09eba3dfcc08","Anthropic","organization",0.99,"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FAnthropic","69d05cf64eea09eba3dfcc08-anthropic",19,{"id":148,"name":149,"type":142,"confidence":143,"wikipediaUrl":150,"slug":151,"mentionCount":152},"69ea7cace1ca17caac372ea9","Microsoft","https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FMicrosoft","69ea7cace1ca17caac372ea9-microsoft",3,{"id":154,"name":155,"type":142,"confidence":100,"wikipediaUrl":156,"slug":157,"mentionCount":103},"6a0e85dd07a4fdbfcf5ec3c4","Check Point","https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FCheck_Point","6a0e85dd07a4fdbfcf5ec3c4-check-point",{"id":159,"name":160,"type":142,"confidence":161,"wikipediaUrl":72,"slug":162,"mentionCount":109},"6a17426ba2d594d36d2372a6","Salmon Typhoon",0.8,"6a17426ba2d594d36d2372a6-salmon-typhoon",{"id":164,"name":165,"type":142,"confidence":161,"wikipediaUrl":72,"slug":166,"mentionCount":109},"6a17426ba2d594d36d2372a5","Forest Blizzard","6a17426ba2d594d36d2372a5-forest-blizzard",{"id":168,"name":169,"type":170,"confidence":128,"wikipediaUrl":72,"slug":171,"mentionCount":109},"6a17426ba2d594d36d2372a7","fintech red team","other","6a17426ba2d594d36d2372a7-fintech-red-team",{"id":173,"name":174,"type":170,"confidence":175,"wikipediaUrl":72,"slug":176,"mentionCount":109},"6a17426aa2d594d36d2372a3","nation-state operators",0.92,"6a17426aa2d594d36d2372a3-nation-state-operators",[178,185,192,200],{"id":179,"title":180,"slug":181,"excerpt":182,"category":11,"featuredImage":183,"publishedAt":184},"6a16c2130547ccd7771901b8","Agentic AI at Machine Speed: How Autonomous Agents Break Your Security 
Assumptions","agentic-ai-at-machine-speed-how-autonomous-agents-break-your-security-assumptions","Agentic AI turns your LLM from a chat interface into a machine‑speed operator that can read sensitive data, invoke tools, and mutate production state. These systems do not just predict tokens; they pl...","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1647427060118-4911c9821b82?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwxfHxhZ2VudGljJTIwbWFjaGluZSUyMHNwZWVkJTIwYXV0b25vbW91c3xlbnwxfDB8fHwxNzc5ODkyNDA3fDA&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60","2026-05-27T10:13:19.031Z",{"id":186,"title":187,"slug":188,"excerpt":189,"category":11,"featuredImage":190,"publishedAt":191},"6a1697cdba21b6cd300e4a39","PraisonAI CVE-2026-44338 Auth Bypass: How Threat Actors Weaponized an LLM Agent Platform in Under 4 Hours","praisonai-cve-2026-44338-auth-bypass-how-threat-actors-weaponized-an-llm-agent-platform-in-under-4-hours","When CVE-2026-44338 in PraisonAI’s agent platform was disclosed, workable exploits reportedly appeared on threat forums in under four hours, with live exploitation starting almost immediately.[7] This...","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1659123739225-ebc34dbdab0c?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwxfHxwcmFpc29uYWklMjBjdmV8ZW58MXwwfHx8MTc3OTg3MTEwOHww&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60","2026-05-27T07:11:55.243Z",{"id":193,"title":194,"slug":195,"excerpt":196,"category":197,"featuredImage":198,"publishedAt":199},"6a167b8cba21b6cd300e4943","Inside Google’s Agent Executor: Open Runtime for Production AI Agents","inside-google-s-agent-executor-open-runtime-for-production-ai-agents","Most agent frameworks excel at demos, not at running stateful, tool-calling agents 24\u002F7 under enterprise SLOs. 
Production failures usually come from hallucinations, PII leaks, and behavioral drift tha...","safety","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1573804633927-bfcbcd909acd?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwxfHxpbnNpZGUlMjBnb29nbGUlMjBhZ2VudCUyMGV4ZWN1dG9yfGVufDF8MHx8fDE3Nzk4NTg1NDR8MA&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60","2026-05-27T05:09:04.219Z",{"id":201,"title":202,"slug":203,"excerpt":204,"category":11,"featuredImage":205,"publishedAt":206},"6a14cb57a33b9706f9fe0dd9","An AI Agent Hacked McKinsey’s Lilli in 2 Hours: Inside the Architecture, Exploit Path, and How to Defend Your Own AI Stack","an-ai-agent-hacked-mckinsey-s-lilli-in-2-hours-inside-the-architecture-exploit-path-and-how-to-defend-your-own-ai-stack","When an autonomous AI agent can pivot through your internal RAG assistant, exfiltrate sensitive knowledge, and escalate privileges in under two hours, you no longer have a chatbot problem—you have an...","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1666615435088-4865bf5ed3fd?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwxfHxhZ2VudCUyMGhhY2tlZCUyMG1ja2luc2V5JTIwbGlsbGl8ZW58MXwwfHx8MTc3OTc2ODAzNXww&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60","2026-05-25T22:25:15.803Z",["Island",208],{"key":209,"params":210,"result":212},"ArticleBody_BBkxk3Fk9V1AaAtMPg7pIcFHyL5s2AbZv4EDNT5g",{"props":211},"{\"articleId\":\"6a1740d9cdbfc0b804a68a63\",\"linkColor\":\"red\"}",{"head":213},{}]