Window: 2026-05-08 → 2026-05-21 | Threats analysed: 4 | Combined CVSS: 9.2 average | KEV-listed: 2 of 4
In fourteen days, four pre-authentication remote-code-execution vulnerabilities landed across four distinct layers of the modern AI stack — the vector database that holds your embeddings, the LLM proxy that holds your API keys, the agent framework that holds your workflow configs, and the low-code pipeline that wires them all together. Three of the four ship with public proof-of-concept code. One was weaponised within four hours of public disclosure. Two have already made the CISA Known Exploited Vulnerabilities catalog. Every Fortune 500 SOC currently racing to deploy LLM gateways, vector RAG pipelines and agent orchestrators is shipping the same authentication-ordering, deserialization and missing-authorization bugs the web ecosystem learned about a decade ago — except now the blast radius is the entire commercial-LLM credential set.
This post correlates the four threats into a single architectural story, traces the attack flow across the stack, and provides production-ready SPL, KQL, and Sigma detections for every layer.
The AI Stack, Mapped
Executive Summary
- What: Four pre-authentication RCEs published 2026-05-08 → 2026-05-21: ChromaDB CVE-2026-45829 (CVSS 10.0), LiteLLM CVE-2026-42208 (CVSS 9.8, CISA KEV), PraisonAI CVE-2026-44338 (CVSS 7.3), and the KeyHunter botnet exploiting Langflow CVE-2026-33017 (CVSS 9.8, CISA KEV).
- Where: Every architectural layer of the modern LLM stack: vector DB, agent framework, low-code pipeline, and LLM proxy / gateway.
- Speed: PraisonAI's advisory-to-first-exploit latency was 3 hours 44 minutes — faster than most enterprises' patch SLAs.
- Impact: Full proxy credential database disclosure (LiteLLM), arbitrary code execution against the vector embedding host (ChromaDB), unauthenticated workflow execution (PraisonAI), and a fully automated LLMjacking botnet (KeyHunter) using NATS pub/sub as covert C2.
- Detection: 36 production-ready detection rules across SPL, KQL, and Sigma — nine per threat, covering reconnaissance, exploitation, persistence, and credential exfiltration.
Threat 1 — ChromaDB CVE-2026-45829 "ChromaToast"
TL-2026-0544 · CVSS 10.0 (AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H) · Severity: CRITICAL · Exploitability: Public PoC
HiddenLayer researcher Esteban Tonglet disclosed CVE-2026-45829 ("ChromaToast Served Pre-Auth") on 2026-05-18 — a maximum-severity pre-authentication RCE in ChromaDB's Python FastAPI server distribution. The bug is an authentication-ordering defect: the POST /api/v2/tenants/{tenant}/databases/{db}/collections handler instantiates the embedding function — and therefore downloads and executes an attacker-supplied Hugging Face model with trust_remote_code=True — before the FastAPI authentication dependency fires. The server returns HTTP 500 or 403 only after the payload has already executed inside the ChromaDB process.
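The ordering defect can be illustrated without FastAPI. The following is a minimal sketch of the bug class, not ChromaDB's code: `authenticate`, `load_embedding_function`, the two `create_collection_*` handlers, and the `"valid-token"` value are all hypothetical, and the "side effect" is a list append standing in for the model download and execution.

```python
"""Illustrative only: hypothetical handlers, not ChromaDB source code."""

side_effects = []  # records every time the "dangerous" step runs


class Unauthorized(Exception):
    """Raised when the caller presents no valid credential."""


def authenticate(token):
    # Stand-in for the real auth dependency; the token value is an assumption.
    if token != "valid-token":
        raise Unauthorized("missing or invalid token")


def load_embedding_function(config):
    # Stand-in for fetching and executing a remote model definition.
    side_effects.append(config["model"])


def create_collection_vulnerable(token, config):
    load_embedding_function(config)  # side effect happens first
    authenticate(token)              # rejection arrives too late
    return "created"


def create_collection_fixed(token, config):
    authenticate(token)              # reject before touching caller input
    load_embedding_function(config)
    return "created"


def attempt(handler, token, config):
    """Call a handler and report the outcome the client would see."""
    try:
        return handler(token, config)
    except Unauthorized:
        return "403"
```

Both handlers answer an anonymous caller with a 403, which is why the status code alone says nothing about whether the payload ran; only the vulnerable ordering leaves an entry in `side_effects`.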
The Python FastAPI server ships in the chromadb PyPI package (14 million monthly downloads) and is the default runtime under several embedded deployment paths. The Rust server (default for chroma run and the official Docker image) does not traverse the vulnerable code path. HiddenLayer's scan of internet-exposed ChromaDB instances found roughly 73% running a vulnerable Python-FastAPI version, with adopters including Capital One, UnitedHealthcare, Weights & Biases, Mintlify, and Factory AI. The maintainer received the private report on 2025-11-28 but has not produced a confirmed patch as of disclosure; ChromaDB 1.5.9 shipped to PyPI on 2026-05-04 but its remediation status is unconfirmed. Hadrian published a Nuclei detection template and one-line PoC on 2026-05-19.
The kill chain is brief: scan → POST a collection-create request with a configuration_json.embedding_function referencing an attacker-controlled HuggingFace repo → ChromaDB downloads and executes modeling_evil.py as the uvicorn process → credentials (OpenAI, Anthropic, Cohere, Voyage keys, object-store creds, cloud IAM) reachable from the process are exfiltrated. Mitigations until a confirmed patch lands are: switch to the Rust server, front the Python server with an authenticating reverse proxy, and add a WAF rule that drops any request body containing trust_remote_code.
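The body-filtering mitigation could be prototyped along these lines. This is a sketch under assumptions, not a vendor rule: `contains_flag` and `should_block` are hypothetical names, and the exact layout of the collection-create body is not confirmed here, so the check looks for the key at any depth after JSON decoding instead of relying on a fixed path.

```python
import json

FLAG = "trust_remote_code"


def contains_flag(node):
    """Return True if the flag appears as a key anywhere in decoded JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == FLAG or contains_flag(value):
                return True
        return False
    if isinstance(node, list):
        return any(contains_flag(item) for item in node)
    if isinstance(node, str):
        # A configuration may arrive as a JSON document encoded in a string.
        try:
            return contains_flag(json.loads(node))
        except ValueError:
            return False
    return False


def should_block(raw_body: str) -> bool:
    """Decide whether a request body should be dropped."""
    try:
        parsed = json.loads(raw_body)
    except ValueError:
        # Unparseable body: fall back to a plain substring check.
        return FLAG in raw_body
    return contains_flag(parsed)
```

Decoding before matching matters because a raw substring rule can be sidestepped with JSON escapes such as `trust_remote_\u0063ode`, which the parser normalises back to the real key.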
Threat 2 — LiteLLM CVE-2026-42208 (CISA KEV)
TL-2026-0506 · CVSS 9.8 · Severity: CRITICAL · Exploitability: Active · CISA KEV: 2026-05-08 (FCEB due 2026-05-11)
BerriAI's LiteLLM (~46,800 GitHub stars) is the AI gateway of choice for consolidating access to commercial LLM providers behind a single API. Enterprises route Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, Vertex AI, Cohere, and Mistral traffic through a single proxy that holds the upstream provider credentials, per-key spending budgets, model allow-lists, and (in many deployments) PII. CVE-2026-42208 is a pre-authentication SQL injection in the proxy's API-key verification routine — discovered by Tencent YunDing Security Lab and disclosed via GHSA-r75f-5x8p-qvmc on 2026-04-20.
Between v1.81.16 and v1.83.6, a refactor introduced an error-handling code path inside the LiteLLM_VerificationTokenView (combined_view) lookup that interpolates the raw caller-supplied API key directly into the SQL query string rather than passing it as a Prisma parameter. Under normal authentication, the key is hashed before lookup; under the error-handling fallback, the unsanitised key flows straight into $queryRawUnsafe. A single crafted Authorization: Bearer ' UNION SELECT key,user_id,models,spend FROM "LiteLLM_VerificationToken" -- header against any proxy endpoint (/chat/completions, /v1/embeddings, /key/generate) extracts the entire virtual-key table including the upstream provider credentials.
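The difference between the two code paths can be reproduced in a few lines. The sketch below is illustrative and is not LiteLLM code: it uses the standard-library `sqlite3` module in place of Prisma and Postgres so it runs standalone, and the `tokens` table, its columns, and the function names are assumptions made for the example.

```python
import hashlib
import sqlite3


def make_db():
    """Build an in-memory stand-in for a virtual-key table (schema assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (token TEXT, user_id TEXT)")
    conn.executemany(
        "INSERT INTO tokens VALUES (?, ?)",
        [
            (hashlib.sha256(b"sk-alice").hexdigest(), "alice"),
            (hashlib.sha256(b"sk-bob").hexdigest(), "bob"),
        ],
    )
    return conn


def lookup_interpolated(conn, api_key):
    # Vulnerable pattern: caller input becomes part of the SQL text.
    query = f"SELECT token, user_id FROM tokens WHERE token = '{api_key}'"
    return conn.execute(query).fetchall()


def lookup_parameterized(conn, api_key):
    # Safe pattern: the driver binds the value; it is never parsed as SQL.
    query = "SELECT token, user_id FROM tokens WHERE token = ?"
    return conn.execute(query, (api_key,)).fetchall()


payload = "' UNION SELECT token, user_id FROM tokens --"
conn = make_db()
leaked = lookup_interpolated(conn, payload)   # every row in the table
safe = lookup_parameterized(conn, payload)    # no rows: payload is just a string
```

The same payload that dumps the table through the interpolated query matches nothing when bound as a parameter, which is the property the hot path had and the error-handling path lost.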
CISA added CVE-2026-42208 to the KEV catalog on 2026-05-08 citing in-the-wild exploitation; EPSS currently sits at 37.4% (97th percentile). The fix is v1.83.7-stable, released 2026-04-19. If you cannot upgrade today, set general_settings.disable_error_logs: true, restart, then rotate every virtual key and every upstream provider credential.
Threat 3 — PraisonAI CVE-2026-44338 (4-hour disclosure-to-exploit)
TL-2026-0502 · CVSS 7.3 · Severity: HIGH · Exploitability: Active
PraisonAI is an open-source multi-agent LLM orchestration framework maintained by MervinPraison (~7,100 GitHub stars). Versions 2.5.6 → 4.6.33 shipped the legacy Flask api_server.py entrypoint with two module-level flags set to insecure defaults: AUTH_ENABLED = False and AUTH_TOKEN = None. The helper check_auth() is implemented to return True whenever auth is disabled. The result: every default deployment exposes GET /agents and POST /chat to any unauthenticated network caller, leaking the agent workflow configuration (including embedded API keys in agents.yaml) and allowing arbitrary workflow invocation.
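A fail-open check of this kind, and one way to make it fail closed, might look as follows. This is a sketch, not PraisonAI's source: `check_auth_legacy` paraphrases the behaviour described above (its enabled-auth branch is an assumption), and `check_auth_fail_closed` is a hypothetical replacement.

```python
import hmac

AUTH_ENABLED = False   # insecure default described in the advisory
AUTH_TOKEN = None


def check_auth_legacy(presented_token):
    # Mirrors the described behaviour: disabled auth means everyone passes.
    if not AUTH_ENABLED:
        return True
    return presented_token == AUTH_TOKEN


def check_auth_fail_closed(presented_token, expected_token):
    """Hypothetical replacement: no configured token means no access."""
    if not expected_token or not presented_token:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented_token.encode(), expected_token.encode())
```

The design choice is that a missing configuration denies access instead of granting it, so a deployment that never sets a token is unusable, not exposed.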
The Sysdig Threat Research Team documented the exploitation timeline minute-by-minute. GHSA-6rmh-7xcm-cpxj published at 13:56:16 UTC on 2026-05-11. Generic disclosure-path reconnaissance from 146.190.133.49 (AS14061 DigitalOcean) began at 17:32:50 UTC; a PraisonAI-specific second pass with User-Agent: CVE-Detector/1.0 started eight minutes later at 17:40:53 UTC. Advisory-to-first-exploit latency: 3 hours, 44 minutes, 39 seconds — faster than most patch-management SLAs even contemplate.
Fixed in v4.6.34. The newer FastAPI server binds to 127.0.0.1 by default; the deprecated Flask api_server.py is the entire vulnerability surface. Migration off the legacy entrypoint is the long-term answer.
Threat 4 — KeyHunter / NATS-as-C2 / Langflow CVE-2026-33017
TL-2026-0514 · CVSS 9.8 · Severity: HIGH · Exploitability: Active · CISA KEV: 2026-05-05
Sysdig TRT also disclosed the first publicly documented case of an attacker using NATS — a CNCF-graduated pub/sub messaging server — as the coordination plane for a distributed credential-harvesting botnet. The campaign, tracked as KeyHunter after the Go module path github.com/keyhunter/worker leaked in the binary, combines four notable techniques: (1) abuse of a legitimate ACL-enforced NATS deployment on 45.192.109.25:14222 as covert C2; (2) automated exploitation of Langflow CVE-2026-33017 (unauthenticated Jython execution at /api/v1/build_public_tmp/{flow}) for initial access; (3) post-compromise scraping of public code-development environments (CodePen, JSFiddle, StackBlitz, CodeSandbox) for AWS access keys and Anthropic / OpenAI / Bedrock API keys, with live validation against vendor APIs; and (4) DirtyPipe / DirtyCreds container-escape exploitation chains for breakout from compromised pods.
Captured worker behavior includes aws sts get-caller-identity followed by bedrock list-foundation-models — operator-validation sequences classic to LLMjacking. The novelty is the NATS C2 fabric: worker pods subscribe to subjects like task.scan_cde, task.validate_aws, task.validate_ai and publish back to *.result, all governed by ACL-enforced authorization tokens. Detection requires either DPI on TCP/14222 or workload-tier outbound-network policies that disallow NATS protocol egress from production services by default.
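Because NATS is a line-oriented text protocol, a payload-inspection heuristic can be sketched briefly. The code below is an assumption-laden illustration, not a production DPI rule: the function names are hypothetical, the subject prefixes are the ones reported above, and it only works on unencrypted sessions.

```python
import re

# Verbs that open NATS client or server frames (matched case-insensitively).
NATS_VERBS = (b"INFO ", b"CONNECT ", b"SUB ", b"PUB ", b"PING", b"PONG")
SUBJECT_RE = re.compile(rb"^(?:SUB|PUB)\s+(\S+)", re.MULTILINE | re.IGNORECASE)
# Subject prefixes reported for the KeyHunter workers.
SUSPICIOUS_PREFIXES = (b"task.scan_cde", b"task.validate_aws", b"task.validate_ai")


def looks_like_nats(payload: bytes) -> bool:
    """Heuristic: does the payload begin with a NATS protocol verb?"""
    return payload.lstrip()[:16].upper().startswith(NATS_VERBS)


def extract_subjects(payload: bytes) -> list:
    """Return the subjects named in SUB/PUB frames within the payload."""
    return [match.group(1) for match in SUBJECT_RE.finditer(payload)]


def is_keyhunter_like(payload: bytes) -> bool:
    """Flag NATS traffic whose subjects match the reported task names."""
    return looks_like_nats(payload) and any(
        subject.startswith(SUSPICIOUS_PREFIXES)
        for subject in extract_subjects(payload)
    )
```

If the session is wrapped in TLS none of this is visible on the wire, which is why the egress-policy option is the more durable of the two controls.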
Cross-Threat MITRE Coverage
The four threats converge on the same defensive surface — public-facing exploitation followed by credential access from files and from the LLM-key store, followed by impact via resource hijacking.
| Technique | Where it appears |
|---|---|
| T1190 Exploit Public-Facing Application | All four threats (entry vector) |
| T1595.002 Active Scanning: Vulnerability Scanning | All four (opportunistic scanner reconnaissance) |
| T1059.006 Command and Scripting Interpreter: Python | ChromaDB, PraisonAI, Langflow |
| T1505.003 Web Shell | ChromaDB, LiteLLM, KeyHunter |
| T1552.001 Credentials in Files | ChromaDB (process env), LiteLLM (Postgres), KeyHunter (CDE scraping) |
| T1528 Steal Application Access Token | LiteLLM (virtual-key DB), KeyHunter (cloud API keys) |
| T1611 Escape to Host (DirtyPipe / DirtyCreds) | KeyHunter (container-to-host breakout) |
| T1571 Non-Standard Port | KeyHunter (NATS on TCP/14222) |
| T1496 Resource Hijacking | KeyHunter (Bedrock LLMjacking, stolen-key resale) |
Detection
Each threat ships with three SPL, three KQL, and three Sigma rules on the Threadlinqs Intelligence platform. The three queries below cover the highest-fidelity behaviors across all four threats.
Splunk SPL — ChromaDB pre-auth RCE
(index=proxy OR index=web) sourcetype=access*
uri="*/api/v2/tenants/*/databases/*/collections"
method=POST
| rex field=request_body "(?<trust_remote_code>trust_remote_code\"?\s*[:=]\s*[Tt]rue)"
| rex field=request_body "(?<auto_map>auto_map)"
| eval anomalous=if(isnotnull(trust_remote_code) OR isnotnull(auto_map), 1, 0)
| where anomalous=1
| stats count, values(src_ip) as src_ips, values(http_status) as statuses by host, request_body
| where count >= 1
Microsoft KQL — LiteLLM SQL-injection in Authorization header
let HOSTS = dynamic(["litellm-proxy", "ai-gateway"]);
W3CIISLog
| where csHost in (HOSTS) or csUriStem contains "/v1/chat/completions" or csUriStem contains "/key/generate"
// Assumes custom logging captures the Authorization header in the csCookie field
| extend AuthHeader = extract(@"Authorization: ([^\r\n]+)", 1, csCookie)
| where AuthHeader matches regex @"(?i)(UNION\s+SELECT|--\s|/\*|pg_sleep|information_schema|sleep\()"
| project TimeGenerated, cIP, csHost, csUriStem, AuthHeader, scStatus
| order by TimeGenerated desc
Sigma — Outbound NATS C2 (KeyHunter)
title: KeyHunter NATS-as-C2 Outbound on TCP/14222
id: 8e7d4f1c-5a92-4b03-9b6e-3c2f1e8d4a7b
status: experimental
description: Detects outbound TCP/14222 connections from container or AI workload hosts to the KeyHunter NATS C2 relay (45.192.109.25). NATS pub/sub abused as covert command-and-control plane for distributed credential harvesting.
references:
    - https://sysdig.com/blog/nats-as-c2-inside-a-new-technique
    - https://intel.threadlinqs.com/threat/TL-2026-0514
logsource:
    category: network_connection
    product: linux
detection:
    selection:
        DestinationPort: 14222
        DestinationIp:
            - '45.192.109.25'
    selection_proto:
        DestinationPort: 14222
    filter_nats_binary:
        Image|endswith:
            - '/natsd'
            - '/nats-server'
    condition: selection or (selection_proto and not filter_nats_binary)
fields:
    - SourceIp
    - DestinationIp
    - ProcessName
    - User
level: critical
tags:
    - attack.command_and_control
    - attack.t1571
    - attack.t1090
Indicators of Compromise
| Type | Value | Context |
|---|---|---|
| IP | 146.190.133.49 | PraisonAI exploitation source (AS14061 DigitalOcean), UA CVE-Detector/1.0 |
| IP | 159.89.205.184 | KeyHunter DigitalOcean staging — TCP/8888 worker artifacts |
| IP / Port | 45.192.109.25:14222 | KeyHunter NATS-as-C2 relay (ACL-enforced subjects) |
| URL | /api/v2/tenants/*/databases/*/collections | ChromaDB ChromaToast exploit endpoint (POST + trust_remote_code) |
| URL | /api/v1/build_public_tmp/{flow} | Langflow unauth Jython exec (CVE-2026-33017) |
| URL | /agents, /chat | PraisonAI legacy api_server.py exposed routes |
| File | worker-linux-amd64 | KeyHunter Go worker binary (Bedrock LLMjacking) |
| File | keyhunter_worker.py, deploy.sh | KeyHunter Python orchestrator + systemd/k8s installer |
| Behavior | aws sts get-caller-identity → bedrock list-foundation-models | Operator credential-validation sequence |
| Behavior | HTTP Authorization: Bearer '...UNION SELECT... | LiteLLM SQLi exploitation header |
| UA | CVE-Detector/1.0 | Sysdig-observed opportunistic scanner against newly disclosed AI CVEs |
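For teams without a SIEM rule in place yet, the network and HTTP indicators in the table can be checked with a short script. This is a minimal sketch: `match_iocs` and the shape of the `event` record (keys `src_ip`, `dest_ip`, `user_agent`, `method`, `path`) are assumptions about a normalised log format, and the generic `/agents` and `/chat` routes are left out because they are too common to match on their own.

```python
from fnmatch import fnmatchcase

IOC_IPS = {
    "146.190.133.49": "PraisonAI exploitation source",
    "159.89.205.184": "KeyHunter staging",
    "45.192.109.25": "KeyHunter NATS C2 relay",
}
IOC_USER_AGENTS = {"CVE-Detector/1.0"}
IOC_POST_PATHS = {
    "/api/v2/tenants/*/databases/*/collections": "ChromaDB collection-create endpoint",
    "/api/v1/build_public_tmp/*": "Langflow public build endpoint",
}


def match_iocs(event):
    """Return a label for every indicator a log record matches."""
    hits = []
    for field in ("src_ip", "dest_ip"):
        label = IOC_IPS.get(event.get(field))
        if label:
            hits.append(label)
    if event.get("user_agent") in IOC_USER_AGENTS:
        hits.append("CVE-Detector/1.0 scanner")
    if event.get("method") == "POST":
        for pattern, label in IOC_POST_PATHS.items():
            if fnmatchcase(event.get("path", ""), pattern):
                hits.append(label)
    return hits
```

Path hits on their own are context, not verdicts, since legitimate clients use the same endpoints; they become interesting when combined with an IP or user-agent hit, or with the request-body checks described earlier.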
Why the AI Stack Is Repeating Web-Era Mistakes
Look at the bug classes:
- ChromaToast: authentication-ordering defect; the auth dependency fires after the side-effectful handler body. This is a bug class familiar from 2010-era Java frameworks such as Spring, now ported to FastAPI.
- LiteLLM: string-concatenated SQL inside an error-handling fallback path. The hot path uses parameterized Prisma queries; the cold path was overlooked.
- PraisonAI: auth disabled by default in a server bound to 0.0.0.0. This is the same insecure-default bug that put thousands of MongoDB instances on Shodan in 2015.
- Langflow / KeyHunter: unauthenticated dynamic-code endpoint accepting arbitrary Jython. This is the Struts2 OGNL pattern, reincarnated.
The AI tooling supply chain is shipping with the same defect classes the web ecosystem learned to detect (and largely eliminate) over the 2010-2020 decade. What's different now is the monetization vector: an attacker who lands code execution on any of these components isn't farming credit cards or cryptominers — they are pivoting straight into stolen Anthropic / OpenAI / Bedrock keys, monetized via resold virtual keys and large-scale prompt-completion abuse against the victim's own provider account. Sysdig has tracked individual victims being billed tens of thousands of dollars in a single weekend from a single compromised proxy.
The CISO-level takeaway: an LLM gateway is Tier-0 credential infrastructure. It deserves the same network controls, the same patch cadence, and the same Tier-0 detection posture as your identity provider and your cloud root-account console. Today, almost none of these systems are being treated that way.
Action Items
- This week: Inventory every Langflow / LiteLLM / ChromaDB / PraisonAI deployment in your environment — including unsanctioned shadow-IT instances on developer laptops. Censys + Shodan + internal CMDB; do not rely on Composer / pip lock files alone.
- Upgrade or isolate: LiteLLM ≥ v1.83.7-stable, PraisonAI ≥ 4.6.34, Langflow per vendor CVE-2026-33017 advisory. ChromaDB: switch to the Rust server (default since 1.0.0) or front the Python server with authenticating proxy + WAF rule blocking trust_remote_code.
- Rotate credentials: every LiteLLM virtual key, every upstream provider key (Anthropic, OpenAI, Azure OpenAI, Bedrock, Vertex AI, Mistral, Cohere), every AWS access key reachable from a Langflow or KeyHunter-compromised host.
- Network: Deny outbound TCP/14222 (NATS) from production workloads by default. Block egress to 45.192.109.25 and 159.89.205.184. Front every internet-facing AI component with authenticating reverse proxy.
- Detect: Deploy the SPL/KQL/Sigma rules above; alert on Bedrock InvokeModel spikes per IAM principal; baseline Anthropic / OpenAI billing per project.
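The per-principal spike alert from the Detect item can be prototyped offline before it is wired into a SIEM. The sketch below makes several assumptions: `find_spikes` is a hypothetical helper, its input is a list of `(principal, hour_bucket)` pairs with one entry per InvokeModel call (how those are pulled from logs is left open), and the `factor` and `min_calls` thresholds are placeholders to tune.

```python
from collections import Counter, defaultdict
from statistics import median


def find_spikes(events, factor=5.0, min_calls=50):
    """Flag (principal, hour) buckets far above that principal's median hour."""
    per_bucket = Counter(events)
    by_principal = defaultdict(dict)
    for (principal, hour), count in per_bucket.items():
        by_principal[principal][hour] = count

    spikes = []
    for principal, hours in by_principal.items():
        baseline = median(hours.values())
        for hour, count in sorted(hours.items()):
            # Require both an absolute floor and a multiple of the baseline.
            if count >= min_calls and count > factor * baseline:
                spikes.append((principal, hour, count))
    return spikes
```

A principal with only one observed hour can never exceed its own median, so a real deployment would want a baseline built from a longer history than the window being scored.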
This post correlates four threats from the Threadlinqs Intelligence platform. Full IOC sets, MITRE technique chains, and 36 production detection rules are available via the threat-detail pages linked above.