[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-threat-actors-are-hijacking-exposed-ai-endpoints-to-power-their-attacks-en":3,"ArticleBody_lUCFUN67CEa848Isiyi2HKSfsnLsyFuzasDh1bBlbY":212},{"article":4,"relatedArticles":181,"locale":60},{"id":5,"title":6,"slug":7,"content":8,"htmlContent":9,"excerpt":10,"category":11,"tags":12,"metaDescription":10,"wordCount":13,"readingTime":14,"publishedAt":15,"sources":16,"sourceCoverage":52,"transparency":54,"seo":57,"language":60,"featuredImage":61,"featuredImageCredit":62,"isFreeGeneration":66,"trendSlug":67,"trendSnapshot":68,"niche":77,"geoTakeaways":80,"geoFaq":89,"entities":99},"6a47f007a616f41b30a9cd4e","Threat Actors Are Hijacking Exposed AI Endpoints to Power Their Attacks","threat-actors-are-hijacking-exposed-ai-endpoints-to-power-their-attacks","Modern AI stacks expose inference endpoints like `\u002Fapi\u002Fgenerate`, `\u002Fapi\u002Fchat`, or `\u002Fv1\u002Fresponses` so apps can call models over HTTP. When self-hosted backends are reachable from the public internet without auth, they effectively become free “LLM-as-a-service” for anyone who finds them. [3]\n\nBetween March and May, Zenity honeypots saw three campaigns doing exactly this, abusing exposed [Ollama](\u002Fentities\u002F69871f06033ff25c8c612c83-ollama) and LiteLLM instances as offensive AI backends with no exploit beyond knowing URL and port. [1][3]\n\n💡 **Key takeaway:** If your AI backend is on the internet without strong auth, assume it will be used as someone else’s attack engine. [1]\n\n---\n\n## 1. How Threat Actors Hijack Exposed AI Endpoints\n\nSelf-hosted AI runtimes expose inference APIs so frontends, agents, and tools can call models. In Ollama, that includes `\u002Fapi\u002Fgenerate` and `\u002Fapi\u002Fchat` on port `11434`; LiteLLM commonly exposes `\u002Fv1\u002Fresponses` on port `4000`. [3] Many teams spin these up for PoCs, bind to `0.0.0.0`, and never add network restrictions or authentication. 
[3]\n\nZenity observed three real-world campaigns abusing such honeypots as backend compute. [1][3] Across them, attackers typically:  \n\n- Scan for reachable Ollama \u002F LiteLLM-style endpoints.  \n- Send a small “hello” prompt to verify model behavior.  \n- Repoint their own agents to the discovered endpoint as the model backend.\n\n⚠️ **Key point:** No RCE, SSRF, or deserialization bug is needed—the attack surface is the *intended* API, misconfigured on the internet. [1][3]\n\nTwo campaigns used autonomous penetration frameworks ([Strix](\u002Fentities\u002F6a460f6c8224e44d5c35476e-strix), HexStrike AI), uploading large orchestration prompts and toolsets into the victim’s Ollama or LiteLLM. [1][3] A third used an OpenAI Codex persona tuned to bypass safety refusals and assist with web reverse‑engineering. [1]\n\nOperationally, the adversary simply reconfigures their client:\n\n```bash\n# Benign\nexport LLM_BASE_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\n\n# Hijack victim endpoint\nexport LLM_BASE_URL=http:\u002F\u002Fvictim.example.com:4000\u002Fv1\n```\n\nThey then send full agent payloads—system prompt, tool schemas, objectives—in the request body, turning the victim’s endpoint into the “brain” of their agent. [1][3] One SaaS company only noticed its exposed Ollama box after cost anomalies and 140k‑character Strix prompts appeared in logs. [1][3]\n\nThe workflow below summarizes how attackers convert an exposed AI endpoint into the backend for their agents. 
[1][3]\n\n```mermaid\nflowchart LR\n    %% Hijacking Exposed AI Endpoints\n    A[Scan open ports] --> B[Test model prompt]\n    B --> C[Repoint AI client]\n    C --> D[Run attacker agents]\n    D --> E[Costs & risk to victim]\n    style A fill:#3b82f6,stroke:#1d4ed8,stroke-width:2px\n    style B fill:#22c55e,stroke:#15803d,stroke-width:2px\n    style C fill:#f59e0b,stroke:#b45309,stroke-width:2px\n    style D fill:#ef4444,stroke:#b91c1c,stroke-width:2px\n    style E fill:#ef4444,stroke:#b91c1c,stroke-width:2px\n```\n\nThe root problem is weak defaults: Ollama ships without authentication, and LiteLLM treats it as opt‑in, so many instances are launched with no real access control. [1]\n\n💼 **Operational impact:** You pay for the GPUs and carry the risk; the attacker gets free offensive AI infrastructure. [1][3]\n\n---\n\n## 2. Why Exposed AI Endpoints Are a High-Impact Attack Surface\n\nThese incidents align with the shift from single LLM calls to agentic apps that plan, call tools, and iterate. [7] When attackers hijack your endpoint, they can run full penetration testing, exploit development, or code‑analysis workflows, not just completions. [5]\n\nResearch on agent security shows most failures stem from insecure patterns and misconfigured tools, not a specific framework. [5] Across [CrewAI](\u002Fentities\u002F697bbf84e28785d1e150709d-crewai) and [AutoGen](\u002Fentities\u002F6960e29a19d266277e150443-autogen), issues like:  \n\n- Poor scope definition,  \n- Unsafe tool \u002F API integrations,  \n- Over‑trusted code interpreters  \n\nall produced similar compromise scenarios. [5] Any generic, exposed LLM backend can therefore be wired into risky agents.\n\n📊 **Data point:** A single agent trajectory may involve dozens of tool calls, browser sessions, and code executions, each a potential abuse vector if the attacker controls the objective. 
[5][7]\n\nAI also compresses exploit timelines: models rapidly analyze codebases, map APIs, and suggest exploits, making compute-rich endpoints highly attractive when someone else pays. [1][8]\n\nAttackers don’t need full environment compromise. By merely repointing their clients to your endpoint, they gain model access, compute, and sometimes network adjacency, while you inherit cost, legal, and reputational exposure if your infra appears in intrusion logs. [1][3]\n\n⚡ **Reality check:** “We’ll lock it down when we productize” is unsafe—attackers already treat test endpoints as infrastructure. [1][3]\n\n---\n\n## 3. Defensive Playbook: Securing AI Endpoints and Investigating Abuse\n\nArchitect for non‑exposure. Do not place Ollama, LiteLLM, or similar runtimes directly on the public internet. Instead: [1][3]\n\n- Keep them on private networks or behind VPNs.  \n- Front them with authenticated API gateways.  \n- Enforce real authentication; reject placeholder \u002F default keys.  \n- Continuously scan cloud, on‑prem, and lab environments for open AI ports; shut down or properly wrap anything unintentionally reachable. [1][3]\n\nStrengthen observability:\n\n- Log full request bodies, not just headers and status codes.  \n- Flag large system prompts, embedded tool definitions, and “mission” descriptions as possible external agent traffic. [1][7]  \n- Watch for prompts mentioning offensive tooling (e.g., “Strix”, “penetration test”, “do not ask permission”) or long JSON tool schemas. [1][3][7]\n\nLeverage existing telemetry stacks. Microsoft Purview, Defender, and Sentinel, for example, can show who initiated AI interactions, when, and which resources were touched, enabling reconstruction of AI activity chains. [6]\n\nUse a scope–context–signal model for investigations: [6]\n\n1. **Scope:** Identities, IPs, and services hitting the suspect endpoint.  \n2. **Context:** Data, tools, and internal systems accessed.  \n3. 
**Signal:** Anomalies such as usage spikes, unusual prompts, or credential exposure.\n\nPrepare the organization:\n\n- Run AI‑specific incident tabletop exercises.  \n- Track CISA JCDC efforts toward an AI Security Incident Collaboration Playbook for shared response patterns. [9]  \n- Align security, engineering, and legal roles before an AI endpoint hijack, when minutes matter. [8][9]\n\n⚠️ **Key point:** Treat AI endpoints as first‑class production services with threat models, runbooks, and incident drills—not experimental sidecars. [6][9]\n\n---\n\n## Conclusion: Inventory, Lock Down, and Practice the Response\n\nExposed inference endpoints are low‑friction, high‑reward targets. Attackers need only a URL to conscript your models and compute, as agentic AI, misconfiguration, and accelerated exploit development converge. [1][5][8]\n\nConcretely, you should:\n\n- Inventory every AI endpoint across environments.  \n- Lock down exposure and enforce strong authentication.  \n- Extend logging to capture agent prompts, tools, and trajectories.  \n- Integrate AI‑focused tabletop exercises into incident response. [1][6][9]\n\nDoing this now is the difference between reading about hijacked AI infrastructure and discovering the infrastructure is yours.","\u003Cp>Modern AI stacks expose inference endpoints like \u003Ccode>\u002Fapi\u002Fgenerate\u003C\u002Fcode>, \u003Ccode>\u002Fapi\u002Fchat\u003C\u002Fcode>, or \u003Ccode>\u002Fv1\u002Fresponses\u003C\u002Fcode> so apps can call models over HTTP. When self-hosted backends are reachable from the public internet without auth, they effectively become free “LLM-as-a-service” for anyone who finds them. 
\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Between March and May, Zenity honeypots saw three campaigns doing exactly this, abusing exposed \u003Ca href=\"\u002Fentities\u002F69871f06033ff25c8c612c83-ollama\">Ollama\u003C\u002Fa> and LiteLLM instances as offensive AI backends with no exploit beyond knowing URL and port. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>💡 \u003Cstrong>Key takeaway:\u003C\u002Fstrong> If your AI backend is on the internet without strong auth, assume it will be used as someone else’s attack engine. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003C\u002Fp>\n\u003Chr>\n\u003Ch2>1. How Threat Actors Hijack Exposed AI Endpoints\u003C\u002Fh2>\n\u003Cp>Self-hosted AI runtimes expose inference APIs so frontends, agents, and tools can call models. In Ollama, that includes \u003Ccode>\u002Fapi\u002Fgenerate\u003C\u002Fcode> and \u003Ccode>\u002Fapi\u002Fchat\u003C\u002Fcode> on port \u003Ccode>11434\u003C\u002Fcode>; LiteLLM commonly exposes \u003Ccode>\u002Fv1\u002Fresponses\u003C\u002Fcode> on port \u003Ccode>4000\u003C\u002Fcode>. \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa> Many teams spin these up for PoCs, bind to \u003Ccode>0.0.0.0\u003C\u002Fcode>, and never add network restrictions or authentication. \u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Zenity observed three real-world campaigns abusing such honeypots as backend compute. 
\u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa> Across them, attackers typically:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Scan for reachable Ollama \u002F LiteLLM-style endpoints.\u003C\u002Fli>\n\u003Cli>Send a small “hello” prompt to verify model behavior.\u003C\u002Fli>\n\u003Cli>Repoint their own agents to the discovered endpoint as the model backend.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>⚠️ \u003Cstrong>Key point:\u003C\u002Fstrong> No RCE, SSRF, or deserialization bug is needed—the attack surface is the \u003Cem>intended\u003C\u002Fem> API, misconfigured on the internet. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Two campaigns used autonomous penetration frameworks (\u003Ca href=\"\u002Fentities\u002F6a460f6c8224e44d5c35476e-strix\">Strix\u003C\u002Fa>, HexStrike AI), uploading large orchestration prompts and toolsets into the victim’s Ollama or LiteLLM. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa> A third used an OpenAI Codex persona tuned to bypass safety refusals and assist with web reverse‑engineering. 
\u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Operationally, the adversary simply reconfigures their client:\u003C\u002Fp>\n\u003Cpre>\u003Ccode class=\"language-bash\"># Benign\nexport LLM_BASE_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\n\n# Hijack victim endpoint\nexport LLM_BASE_URL=http:\u002F\u002Fvictim.example.com:4000\u002Fv1\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Cp>They then send full agent payloads—system prompt, tool schemas, objectives—in the request body, turning the victim’s endpoint into the “brain” of their agent. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa> One SaaS company only noticed its exposed Ollama box after cost anomalies and 140k‑character Strix prompts appeared in logs. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>The workflow below summarizes how attackers convert an exposed AI endpoint into the backend for their agents. 
\u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cpre>\u003Ccode class=\"language-mermaid\">flowchart LR\n    %% Hijacking Exposed AI Endpoints\n    A[Scan open ports] --&gt; B[Test model prompt]\n    B --&gt; C[Repoint AI client]\n    C --&gt; D[Run attacker agents]\n    D --&gt; E[Costs &amp; risk to victim]\n    style A fill:#3b82f6,stroke:#1d4ed8,stroke-width:2px\n    style B fill:#22c55e,stroke:#15803d,stroke-width:2px\n    style C fill:#f59e0b,stroke:#b45309,stroke-width:2px\n    style D fill:#ef4444,stroke:#b91c1c,stroke-width:2px\n    style E fill:#ef4444,stroke:#b91c1c,stroke-width:2px\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Cp>The root problem is weak defaults: Ollama ships without authentication, and LiteLLM treats it as opt‑in, so many instances are launched with no real access control. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>💼 \u003Cstrong>Operational impact:\u003C\u002Fstrong> You pay for the GPUs and carry the risk; the attacker gets free offensive AI infrastructure. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Chr>\n\u003Ch2>2. Why Exposed AI Endpoints Are a High-Impact Attack Surface\u003C\u002Fh2>\n\u003Cp>These incidents align with the shift from single LLM calls to agentic apps that plan, call tools, and iterate. \u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa> When attackers hijack your endpoint, they can run full penetration testing, exploit development, or code‑analysis workflows, not just completions. 
\u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Research on agent security shows most failures stem from insecure patterns and misconfigured tools, not a specific framework. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa> Across \u003Ca href=\"\u002Fentities\u002F697bbf84e28785d1e150709d-crewai\">CrewAI\u003C\u002Fa> and \u003Ca href=\"\u002Fentities\u002F6960e29a19d266277e150443-autogen\">AutoGen\u003C\u002Fa>, issues like:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Poor scope definition,\u003C\u002Fli>\n\u003Cli>Unsafe tool \u002F API integrations,\u003C\u002Fli>\n\u003Cli>Over‑trusted code interpreters\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>all produced similar compromise scenarios. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa> Any generic, exposed LLM backend can therefore be wired into risky agents.\u003C\u002Fp>\n\u003Cp>📊 \u003Cstrong>Data point:\u003C\u002Fstrong> A single agent trajectory may involve dozens of tool calls, browser sessions, and code executions—each a potential abuse vector if the attacker controls the objective. \u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>AI also compresses exploit timelines: models rapidly analyze codebases, map APIs, and suggest exploits, making compute-rich endpoints highly attractive when someone else pays. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Attackers don’t need full environment compromise. 
By merely repointing their clients to your endpoint, they gain model access, compute, and sometimes network adjacency, while you inherit cost, legal, and reputational exposure if your infra appears in intrusion logs. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>⚡ \u003Cstrong>Reality check:\u003C\u002Fstrong> “We’ll lock it down when we productize” is unsafe—attackers already treat test endpoints as infrastructure. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Chr>\n\u003Ch2>3. Defensive Playbook: Securing AI Endpoints and Investigating Abuse\u003C\u002Fh2>\n\u003Cp>Architect for non‑exposure. Do not place Ollama, LiteLLM, or similar runtimes directly on the public internet. Instead: \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Keep them on private networks or behind VPNs.\u003C\u002Fli>\n\u003Cli>Front them with authenticated API gateways.\u003C\u002Fli>\n\u003Cli>Enforce real authentication; reject placeholder \u002F default keys.\u003C\u002Fli>\n\u003Cli>Continuously scan cloud, on‑prem, and lab environments for open AI ports; shut down or properly wrap anything unintentionally reachable. 
\u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Strengthen observability:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Log full request bodies, not just headers and status codes.\u003C\u002Fli>\n\u003Cli>Flag large system prompts, embedded tool definitions, and “mission” descriptions as possible external agent traffic. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Watch for prompts mentioning offensive tooling (e.g., “Strix”, “penetration test”, “do not ask permission”) or long JSON tool schemas. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-3\" class=\"citation-link\" title=\"View source [3]\">[3]\u003C\u002Fa>\u003Ca href=\"#source-7\" class=\"citation-link\" title=\"View source [7]\">[7]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Leverage existing telemetry stacks. Microsoft Purview, Defender, and Sentinel, for example, can show who initiated AI interactions, when, and which resources were touched, enabling reconstruction of AI activity chains. 
\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Use a scope–context–signal model for investigations: \u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003C\u002Fp>\n\u003Col>\n\u003Cli>\u003Cstrong>Scope:\u003C\u002Fstrong> Identities, IPs, and services hitting the suspect endpoint.\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Context:\u003C\u002Fstrong> Data, tools, and internal systems accessed.\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Signal:\u003C\u002Fstrong> Anomalies such as usage spikes, unusual prompts, or credential exposure.\u003C\u002Fli>\n\u003C\u002Fol>\n\u003Cp>Prepare the organization:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Run AI‑specific incident tabletop exercises.\u003C\u002Fli>\n\u003Cli>Track CISA JCDC efforts toward an AI Security Incident Collaboration Playbook for shared response patterns. \u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fli>\n\u003Cli>Align security, engineering, and legal roles before an AI endpoint hijack, when minutes matter. \u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>⚠️ \u003Cstrong>Key point:\u003C\u002Fstrong> Treat AI endpoints as first‑class production services with threat models, runbooks, and incident drills—not experimental sidecars. \u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fp>\n\u003Chr>\n\u003Ch2>Conclusion: Inventory, Lock Down, and Practice the Response\u003C\u002Fh2>\n\u003Cp>Exposed inference endpoints are low‑friction, high‑reward targets. 
Attackers need only a URL to conscript your models and compute, as agentic AI, misconfiguration, and accelerated exploit development converge. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-5\" class=\"citation-link\" title=\"View source [5]\">[5]\u003C\u002Fa>\u003Ca href=\"#source-8\" class=\"citation-link\" title=\"View source [8]\">[8]\u003C\u002Fa>\u003C\u002Fp>\n\u003Cp>Concretely, you should:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>Inventory every AI endpoint across environments.\u003C\u002Fli>\n\u003Cli>Lock down exposure and enforce strong authentication.\u003C\u002Fli>\n\u003Cli>Extend logging to capture agent prompts, tools, and trajectories.\u003C\u002Fli>\n\u003Cli>Integrate AI‑focused tabletop exercises into incident response. \u003Ca href=\"#source-1\" class=\"citation-link\" title=\"View source [1]\">[1]\u003C\u002Fa>\u003Ca href=\"#source-6\" class=\"citation-link\" title=\"View source [6]\">[6]\u003C\u002Fa>\u003Ca href=\"#source-9\" class=\"citation-link\" title=\"View source [9]\">[9]\u003C\u002Fa>\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cp>Doing this now is the difference between reading about hijacked AI infrastructure and discovering the infrastructure is yours.\u003C\u002Fp>\n","Modern AI stacks expose inference endpoints like \u002Fapi\u002Fgenerate, \u002Fapi\u002Fchat, or \u002Fv1\u002Fresponses so apps can call models over HTTP. 
When self-hosted backends are reachable from the public internet without...","trend-radar",[],1005,5,"2026-07-03T17:31:22.207Z",[17,22,25,29,32,36,40,44,48],{"title":18,"url":19,"summary":20,"type":21},"Attackers Hijack Exposed AI Endpoints to Power Offensive Ops","https:\u002F\u002Fdaily.dev\u002Fposts\u002Fattackers-hijack-exposed-ai-endpoints-to-power-offensive-ops-wkmxwlyyc","Zenity researchers documented three distinct attack campaigns between March and May in which threat actors hijacked exposed AI inference endpoints (Ollama and LiteLLM) to power offensive operations — ...","kb",{"title":18,"url":23,"summary":24,"type":21},"https:\u002F\u002Fx.com\u002FTheCyberSecHub\u002Fstatus\u002F2072072333901390278","Attackers Hijack Exposed AI Endpoints to Power Offensive Ops\n\nAttackers don't need any special authentication to reach a target endpoint — they just need to know where it is.\n\nSource: The Cyber Securi...",{"title":26,"url":27,"summary":28,"type":21},"Attackers Seize Exposed AI Endpoints to Power Offensive Ops","https:\u002F\u002Fwww.darkreading.com\u002Fcloud-security\u002Fattackers-hijack-exposed-ai-endpoints-power-offensive-ops","Threat actors are trying to leverage organization-owned AI agents to power complex threat activity.\n\nBetween March and May, Zenity researchers observed three distinct campaigns leveraging its honeypot...",{"title":18,"url":30,"summary":31,"type":21},"https:\u002F\u002Fwww.facebook.com\u002Fdarkreadingcom\u002Fposts\u002Fattackers-hijack-exposed-ai-endpoints-to-power-offensive-ops-by-alexander-culafi\u002F1431939115621298\u002F","Attackers Hijack Exposed AI Endpoints to Power Offensive Ops — by Alexander Culafi\n\nThis article discusses how attackers are exploiting exposed AI endpoints to conduct offensive operations. The piece,...",{"title":33,"url":34,"summary":35,"type":21},"AI Agents Are Here. 
So Are the Threats.","https:\u002F\u002Funit42.paloaltonetworks.com\u002Fagentic-ai-threats\u002F","Executive Summary\n\nAgentic applications are programs that leverage AI agents — software designed to autonomously collect data and take actions toward specific objectives — to drive their functionality...",{"title":37,"url":38,"summary":39,"type":21},"AI systems are now part of everyday work. Investigators need a consistent way to reconstruct what happened within them.","https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fsecurity\u002Fblog\u002F2026\u002F06\u002F09\u002Freconstructing-ai-activity-investigations\u002F","AI interactions generate telemetry across Microsoft Purview, Defender, and Sentinel. That telemetry captures who initiated an interaction, when it occurred, and which resources were involved. It provi...",{"title":41,"url":42,"summary":43,"type":21},"How to Test and Evaluate Agentic Systems for Reliability","https:\u002F\u002Fvirtuslab.com\u002Fblog\u002Fai\u002Ftesting-evaluating-agentic-systems\u002F","Agentic systems—autonomous, goal-directed stacks that plan, call tools, observe results, and iterate—are rapidly becoming a core component of modern products. Examples include travel-booking assistant...",{"title":45,"url":46,"summary":47,"type":21},"AI is changing the economics of both software development and cyberattacks","https:\u002F\u002Fwww.wiz.io\u002Fblog\u002Fai-rewriting-secops-playbook","AI is changing the economics of both software development and cyberattacks. 
Organizations are shipping code faster than ever, increasingly with the help of AI agents and tools that generate, modify, a...",{"title":49,"url":50,"summary":51,"type":21},"Enhancing AI Security Incident Response Through Collaborative Exercises","https:\u002F\u002Fblogs.cisco.com\u002Fsecurity\u002Fenhancing-ai-security-incident-response-through-collaborative-exercises","I had the privilege of participating in an AI Security Incident tabletop exercise led by the Cybersecurity and Infrastructure Security Agency’s (CISA) Joint Cyber Defense Collaborative (JCDC). This ex...",{"totalSources":53},9,{"generationDuration":55,"kbQueriesCount":53,"confidenceScore":56,"sourcesCount":53},267352,100,{"metaTitle":58,"metaDescription":59},"Exposed AI Endpoints Hijacked by Threat Actors Now","Exposed AI endpoints are being hijacked. Learn how attackers repoint public Ollama\u002FLiteLLM APIs, why misconfigurations enable abuse, and steps to mitigate.","en","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1509479200622-4503f27f12ef?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwxfHx0aHJlYXQlMjBhY3RvcnMlMjBoaWphY2tpbmclMjBleHBvc2VkfGVufDF8MHx8fDE3ODMwOTkzOTl8MA&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60",{"photographerName":63,"photographerUrl":64,"unsplashUrl":65},"Max Bender","https:\u002F\u002Funsplash.com\u002F@maxwbender?utm_source=coreprose&utm_medium=referral","https:\u002F\u002Funsplash.com\u002Fphotos\u002Fperson-standing-near-led-sign-XIVDN9cxOVc?utm_source=coreprose&utm_medium=referral",true,"threat-actors-hijacking-exposed-ai-endpoints-to-power-attacks",{"score":69,"type":70,"sourceCount":71,"topSourceDomains":72,"detectedAt":76,"mentionsLast7Days":71},99,"spiking",4,[73,74,75],"darkreading.com","petri.com","thehackernews.com","2026-07-03T12:54:04.448Z",{"key":78,"name":79,"nameEn":79},"ai-engineering","AI Engineering & LLM Ops",[81,83,85,87],{"text":82},"Between March and May, Zenity honeypots observed three distinct campaigns that hijacked 
publicly reachable Ollama and LiteLLM endpoints with no exploit beyond knowing the URL and port.",{"text":84},"Ollama commonly exposes \u002Fapi\u002Fgenerate and \u002Fapi\u002Fchat on port 11434 and LiteLLM exposes \u002Fv1\u002Fresponses on port 4000; default installs often ship without authentication.",{"text":86},"Attackers repoint their agents to victim endpoints and run full agent payloads; one victim logged 140,000‑character Strix prompts and unexpected costs when abuse began.",{"text":88},"If an AI backend is reachable from the public internet without strong authentication, it will be used as someone else’s attack infrastructure.",[90,93,96],{"question":91,"answer":92},"How exactly do attackers hijack exposed AI endpoints?","Attackers simply discover a reachable inference endpoint, verify model behavior with a small “hello” prompt, and repoint their agent’s LLM_BASE_URL (or equivalent) to that URL, requiring no RCE or deserialization bug. They then send full agent payloads—system prompts, tool schemas, and objectives—so the victim’s model runs the attacker’s orchestration. Campaigns observed by Zenity used scans for typical ports (11434 for Ollama, 4000 for LiteLLM), uploaded large orchestration prompts (one case produced 140k‑character Strix payloads), and reused the victim’s compute and model as the backend for autonomous frameworks like Strix or HexStrike AI, turning a misconfigured service into free offensive infrastructure.",{"question":94,"answer":95},"What telemetry and signs indicate my AI endpoint is being abused?","Definitive signs include sudden usage spikes, large request bodies (especially system prompts exceeding normal lengths), appearance of tool\u002FJSON schemas in request payloads, and prompt text referencing offensive tooling or instructions to bypass safeguards. Also watch for cost anomalies on GPU\u002Fmachine billing, new external IPs hitting inference ports, and logged requests containing mission statements or multi‑step objectives. 
Instrument full request body logging, alert on unusually long system prompts or embedded tool definitions, and correlate network\u002Fsource IPs with unexpected compute consumption to detect hijacking early.",{"question":97,"answer":98},"What immediate steps should I take to secure exposed inference endpoints?","Immediately restrict network exposure by placing runtimes on private networks or behind a VPN and front them with authenticated API gateways using strong, non‑default credentials. Perform an inventory scan for open AI ports across cloud and on‑prem environments, shut down any publicly reachable instances, and enable full request body logging and alerts for large or agent‑style prompts. Follow up with incident drills, apply least‑privilege integrations for tool access, and treat AI endpoints as first‑class production services with runbooks, monitoring, and regular audits.",[100,108,112,116,121,126,131,136,140,145,153,158,162,169,175],{"id":101,"name":102,"type":103,"confidence":104,"wikipediaUrl":105,"slug":106,"mentionCount":107},"6a46ee878224e44d5c35575e","\u002Fapi\u002Fchat","concept",0.9,null,"6a46ee878224e44d5c35575e-api-chat",2,{"id":109,"name":110,"type":103,"confidence":104,"wikipediaUrl":105,"slug":111,"mentionCount":107},"6a46ee878224e44d5c35575f","\u002Fv1\u002Fresponses","6a46ee878224e44d5c35575f-v1-responses",{"id":113,"name":114,"type":103,"confidence":104,"wikipediaUrl":105,"slug":115,"mentionCount":107},"6a46ee878224e44d5c35575d","\u002Fapi\u002Fgenerate","6a46ee878224e44d5c35575d-api-generate",{"id":117,"name":118,"type":103,"confidence":104,"wikipediaUrl":105,"slug":119,"mentionCount":120},"6a47f2138224e44d5c359001","agentic apps","6a47f2138224e44d5c359001-agentic-apps",1,{"id":122,"name":123,"type":103,"confidence":124,"wikipediaUrl":105,"slug":125,"mentionCount":120},"6a47f2138224e44d5c359000","OpenAI Codex 
persona",0.85,"6a47f2138224e44d5c359000-openai-codex-persona",{"id":127,"name":128,"type":103,"confidence":129,"wikipediaUrl":105,"slug":130,"mentionCount":120},"6a47f2158224e44d5c359003","exposed inference endpoints",0.95,"6a47f2158224e44d5c359003-exposed-inference-endpoints",{"id":132,"name":133,"type":103,"confidence":134,"wikipediaUrl":105,"slug":135,"mentionCount":120},"6a47f2158224e44d5c359006","public internet without auth",0.92,"6a47f2158224e44d5c359006-public-internet-without-auth",{"id":137,"name":138,"type":103,"confidence":124,"wikipediaUrl":105,"slug":139,"mentionCount":120},"6a47f2158224e44d5c359005","140k-character Strix prompts","6a47f2158224e44d5c359005-140k-character-strix-prompts",{"id":141,"name":142,"type":143,"confidence":104,"wikipediaUrl":105,"slug":144,"mentionCount":120},"6a47f2158224e44d5c359004","three campaigns","event","6a47f2158224e44d5c359004-three-campaigns",{"id":146,"name":147,"type":148,"confidence":149,"wikipediaUrl":150,"slug":151,"mentionCount":152},"69871f06033ff25c8c612c83","Ollama","organization",0.98,"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FOllama","69871f06033ff25c8c612c83-ollama",39,{"id":154,"name":155,"type":148,"confidence":156,"wikipediaUrl":105,"slug":157,"mentionCount":120},"6a47f2148224e44d5c359002","CISA JCDC",0.88,"6a47f2148224e44d5c359002-cisa-jcdc",{"id":159,"name":160,"type":148,"confidence":104,"wikipediaUrl":105,"slug":161,"mentionCount":120},"6a47f2138224e44d5c358fff","Zenity 
honeypots","6a47f2138224e44d5c358fff-zenity-honeypots",{"id":163,"name":164,"type":165,"confidence":129,"wikipediaUrl":166,"slug":167,"mentionCount":168},"697bbf84e28785d1e150709d","CrewAI","product","https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FCrewAI","697bbf84e28785d1e150709d-crewai",55,{"id":170,"name":171,"type":165,"confidence":129,"wikipediaUrl":172,"slug":173,"mentionCount":174},"6960e29a19d266277e150443","AutoGen","https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FAutogen","6960e29a19d266277e150443-autogen",52,{"id":176,"name":177,"type":165,"confidence":178,"wikipediaUrl":105,"slug":179,"mentionCount":180},"69e1522e6db79d4361e0ae1e","LiteLLM",0.99,"69e1522e6db79d4361e0ae1e-litellm",28,[182,190,198,205],{"id":183,"title":184,"slug":185,"excerpt":186,"category":187,"featuredImage":188,"publishedAt":189},"6a49598e09928d6bcf462390","Supreme Court Alarm on AI‑Generated Fake Case Law: Technical, Legal, and Governance Playbook for LLM Systems in Justice","supreme-court-alarm-on-ai-generated-fake-case-law-technical-legal-and-governance-playbook-for-llm-systems-in-justice","As courts flag AI‑generated fake precedents, legal teams face a core risk: LLMs can confidently invent non‑existent cases that look authentic. 
This is not creativity but hallucination, a major reliabi...","hallucinations","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1593115057322-e94b77572f20?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwxfHxzdXByZW1lJTIwY291cnQlMjBhbGFybSUyMGdlbmVyYXRlZHxlbnwxfDB8fHwxNzgzMTkzMjk3fDA&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60","2026-07-04T19:12:57.486Z",{"id":191,"title":192,"slug":193,"excerpt":194,"category":195,"featuredImage":196,"publishedAt":197},"6a48950209928d6bcf4618f5","Inside the Zeta–Palantir Alliance: Architecting AI-Native Enterprise Marketing","inside-the-zeta-palantir-alliance-architecting-ai-native-enterprise-marketing","Enterprise marketing is shifting from channel tweaks to AI-orchestrated journeys that adapt in real time. By 2026, large language models (LLMs) and agentic AI are core infrastructure for automation, R...","safety","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1756908992154-c8a89f5e517f?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwzMXx8YXJ0aWZpY2lhbCUyMGludGVsbGlnZW5jZSUyMHRlY2hub2xvZ3l8ZW58MXwwfHx8MTc4MzEzMzg1M3ww&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60","2026-07-04T05:12:25.078Z",{"id":199,"title":200,"slug":201,"excerpt":202,"category":11,"featuredImage":203,"publishedAt":204},"6a47b0b8a616f41b30a9c789","Databricks Data + AI Summit 2026: Every Major Product Launch That Matters","databricks-data-ai-summit-2026-every-major-product-launch-that-matters","Summit 2026 in Context: Scale, Theme, and Agenda\n\nData + AI Summit 2026 (June 15–18, Moscone Center) brought 30,000+ attendees from 150+ countries, which Databricks calls the world’s largest data 
and...","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1777449425442-adc413f3d873?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHw2MXx8YXJ0aWZpY2lhbCUyMGludGVsbGlnZW5jZSUyMHRlY2hub2xvZ3l8ZW58MXwwfHx8MTc4Mjg4NDcwMHww&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60","2026-07-03T13:01:48.623Z",{"id":206,"title":207,"slug":208,"excerpt":209,"category":195,"featuredImage":210,"publishedAt":211},"6a474357d03ca4ad20bb9ae6","Engineering for Insurability: Inside Mayflower and Hadron’s Affirmative AI Liability Program","engineering-for-insurability-inside-mayflower-and-hadron-s-affirmative-ai-liability-program","AI systems now write code, move money, and influence underwriting, but most enterprise policies still hide LLMs and agents in generic cyber riders never designed for GenAI copilots or autonomous workf...","https:\u002F\u002Fimages.unsplash.com\u002Fphoto-1684930184431-d00fb241bdec?ixid=M3w4OTczNDl8MHwxfHNlYXJjaHwxfHxlbmdpbmVlcmluZyUyMGluc3VyYWJpbGl0eSUyMGluc2lkZSUyMG1heWZsb3dlcnxlbnwxfDB8fHwxNzgzMDU1NDUxfDA&ixlib=rb-4.1.0&w=1200&h=630&fit=crop&crop=entropy&auto=format,compress&q=60","2026-07-03T05:10:51.750Z",["Island",213],{"key":214,"params":215,"result":217},"ArticleBody_lUCFUN67CEa848Isiyi2HKSfsnLsyFuzasDh1bBlbY",{"props":216},"{\"articleId\":\"6a47f007a616f41b30a9cd4e\"}",{"head":218},{}]