AI Stack Under Siege — Four Pre-Auth RCEs Hit ChromaDB, LiteLLM, PraisonAI, and Langflow in 14 Days

Window: 2026-05-08 → 2026-05-21 | Threats analysed: 4 | Combined CVSS: 9.2 average | KEV-listed: 2 of 4

In fourteen days, four pre-authentication remote-code-execution vulnerabilities landed across four distinct layers of the modern AI stack — the vector database that holds your embeddings, the LLM proxy that holds your API keys, the agent framework that holds your workflow configs, and the low-code pipeline that wires them all together. Three of the four ship with public proof-of-concept code. One was weaponised within four hours of public disclosure. Two have already made the CISA Known Exploited Vulnerabilities catalog. Every Fortune 500 SOC currently racing to deploy LLM gateways, vector RAG pipelines and agent orchestrators is shipping the same authentication-ordering, deserialization and missing-authorization bugs the web ecosystem learned about a decade ago — except now the blast radius is the entire commercial-LLM credential set.

This post correlates the four threats into a single architectural story, traces the attack flow across the stack, and provides production-ready SPL, KQL, and Sigma detections for every layer.

The AI Stack, Mapped

// fig.1 — Four pre-auth RCEs across three architectural layers. Solid red arrows = initial exploit. Dashed red = post-compromise credential exfiltration. Mint arrows = downstream abuse against the commercial LLM APIs whose keys the AI gateway centralises.

Executive Summary

What: Four pre-authentication RCEs published 2026-05-08 → 2026-05-21 — ChromaDB CVE-2026-45829 (CVSS 10.0), LiteLLM CVE-2026-42208 (CVSS 9.8, CISA KEV), PraisonAI CVE-2026-44338 (CVSS 7.3), and the KeyHunter botnet exploiting Langflow CVE-2026-33017 (CVSS 9.8, CISA KEV).
Where: Every architectural layer of the modern LLM stack — vector DB, agent framework, low-code pipeline, and LLM proxy / gateway.
Speed: PraisonAI's advisory-to-first-exploit latency was 3 hours 44 minutes — faster than most enterprises' patch SLAs.
Impact: Full proxy credential database disclosure (LiteLLM), arbitrary code execution against the vector embedding host (ChromaDB), unauthenticated workflow execution (PraisonAI), and a fully automated LLMjacking botnet (KeyHunter) using NATS pub/sub as covert C2.
Detection: 36 production-ready detection rules across SPL, KQL, and Sigma — nine per threat, covering reconnaissance, exploitation, persistence, and credential exfiltration.

Threat 1 — ChromaDB CVE-2026-45829 "ChromaToast"

TL-2026-0544 · CVSS 10.0 (AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H) · Severity: CRITICAL · Exploitability: Public PoC

HiddenLayer researcher Esteban Tonglet disclosed CVE-2026-45829 ("ChromaToast Served Pre-Auth") on 2026-05-18 — a maximum-severity pre-authentication RCE in ChromaDB's Python FastAPI server distribution. The bug is an authentication-ordering defect: the POST /api/v2/tenants/{tenant}/databases/{db}/collections handler instantiates the embedding function — and therefore downloads and executes an attacker-supplied Hugging Face model with trust_remote_code=True — before the FastAPI authentication dependency fires. The server returns HTTP 500 or 403 only after the payload has already executed inside the ChromaDB process.
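The ordering defect is easier to see stripped of the framework. The sketch below is a plain-Python simplification, not ChromaDB's code: `Server`, `Request`, `authenticate`, `load_embedding_function`, and both handler names are hypothetical, and appending to a list stands in for the model download and execution described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class Unauthorized(Exception):
    """Raised when a request carries no valid credential."""


@dataclass
class Request:
    token: str | None
    embedding_repo: str


@dataclass
class Server:
    valid_token: str
    loaded: list[str] = field(default_factory=list)

    def load_embedding_function(self, repo: str) -> None:
        # Stand-in for the dangerous side effect (remote model code running).
        self.loaded.append(repo)

    def authenticate(self, req: Request) -> None:
        if req.token != self.valid_token:
            raise Unauthorized()

    def create_collection_vulnerable(self, req: Request) -> str:
        self.load_embedding_function(req.embedding_repo)  # side effect first
        self.authenticate(req)  # rejection arrives too late
        return "created"

    def create_collection_fixed(self, req: Request) -> str:
        self.authenticate(req)  # reject before any side effect
        self.load_embedding_function(req.embedding_repo)
        return "created"
```

In the vulnerable ordering the caller still receives an error, but `loaded` already contains the attacker's repository, which mirrors the "error only after execution" behaviour reported for the Python server.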

The Python FastAPI server ships in the chromadb PyPI package (14 million monthly downloads) and is the default runtime under several embedded deployment paths. The Rust server (default for chroma run and the official Docker image) does not traverse the vulnerable code path. HiddenLayer's scan of internet-exposed ChromaDB instances found roughly 73% running a vulnerable Python-FastAPI version, with adopters including Capital One, UnitedHealthcare, Weights & Biases, Mintlify, and Factory AI. The maintainer received the private report on 2025-11-28 but has not produced a confirmed patch as of disclosure; ChromaDB 1.5.9 shipped to PyPI on 2026-05-04 but its remediation status is unconfirmed. Hadrian published a Nuclei detection template and one-line PoC on 2026-05-19.

The kill chain is brief: scan → POST a collection-create request with a configuration_json.embedding_function referencing an attacker-controlled Hugging Face repo → ChromaDB downloads and executes modeling_evil.py as the uvicorn process → credentials (OpenAI, Anthropic, Cohere, Voyage keys, object-store creds, cloud IAM) reachable from the process are exfiltrated. Until a confirmed patch lands, the mitigations are: switch to the Rust server, front the Python server with an authenticating reverse proxy, and add a WAF rule that drops any request body containing trust_remote_code.
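The body-filtering mitigation can be prototyped in a few lines before it is translated into a real WAF rule. This is a minimal sketch under assumptions: `should_block` and `_contains_key` are hypothetical helpers, and the filter fails open on bodies that are not valid JSON, which a production rule may not want.

```python
import json
import re

# Fast path: the literal marker anywhere in the raw body.
_MARKER = re.compile(rb"trust_remote_code", re.IGNORECASE)


def _contains_key(node, key: str) -> bool:
    """Walk a decoded JSON document looking for the key or an embedded mention."""
    if isinstance(node, dict):
        return any(
            k.lower() == key or _contains_key(v, key) for k, v in node.items()
        )
    if isinstance(node, list):
        return any(_contains_key(v, key) for v in node)
    if isinstance(node, str):
        # Covers configuration passed as JSON nested inside a string value.
        return key in node.lower()
    return False


def should_block(body: bytes) -> bool:
    """Return True if the request body mentions trust_remote_code."""
    if _MARKER.search(body):
        return True
    # JSON allows \uXXXX escapes in keys, which evade a raw byte match,
    # so the decoded document is checked as well.
    try:
        doc = json.loads(body)
    except ValueError:
        return False  # assumption: non-JSON bodies are handled elsewhere
    return _contains_key(doc, "trust_remote_code")
```

The second pass exists because a filter that only greps raw bytes is trivially bypassed with an escaped key such as `trust_remote\u005fcode`.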

Threat 2 — LiteLLM CVE-2026-42208 (CISA KEV)

TL-2026-0506 · CVSS 9.8 · Severity: CRITICAL · Exploitability: Active · CISA KEV: 2026-05-08 (FCEB due 2026-05-11)

BerriAI's LiteLLM (~46,800 GitHub stars) is the AI gateway of choice for consolidating access to commercial LLM providers behind a single API. Enterprises route Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, Vertex AI, Cohere, and Mistral traffic through a single proxy that holds the upstream provider credentials, per-key spending budgets, model allow-lists, and (in many deployments) PII. CVE-2026-42208 is a pre-authentication SQL injection in the proxy's API-key verification routine — discovered by Tencent YunDing Security Lab and disclosed via GHSA-r75f-5x8p-qvmc on 2026-04-20.

Between v1.81.16 and v1.83.6, a refactor introduced an error-handling code path inside the LiteLLM_VerificationTokenView (combined_view) lookup that interpolates the raw caller-supplied API key directly into the SQL query string rather than passing it as a Prisma parameter. Under normal authentication, the key is hashed before lookup; under the error-handling fallback, the unsanitised key flows straight into $queryRawUnsafe. A single crafted Authorization: Bearer ' UNION SELECT key,user_id,models,spend FROM "LiteLLM_VerificationToken" -- header against any proxy endpoint (/chat/completions, /v1/embeddings, /key/generate) extracts the entire virtual-key table including the upstream provider credentials.
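The difference between the hot path and the fallback path can be reproduced with the standard library. The sketch below is an illustration only: it uses SQLite rather than LiteLLM's Postgres and Prisma stack, and the `verification_token` table, its two columns, and both lookup functions are hypothetical simplifications.

```python
import hashlib
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build an in-memory table holding one hashed virtual key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE verification_token (token TEXT, user_id TEXT)")
    conn.execute(
        "INSERT INTO verification_token VALUES (?, ?)",
        (hashlib.sha256(b"sk-real-key").hexdigest(), "alice"),
    )
    return conn


def lookup_interpolated(conn: sqlite3.Connection, raw_key: str):
    # Vulnerable pattern: the caller-supplied key becomes part of the SQL text.
    sql = f"SELECT token, user_id FROM verification_token WHERE token = '{raw_key}'"
    return conn.execute(sql).fetchall()


def lookup_parameterized(conn: sqlite3.Connection, raw_key: str):
    # Safe pattern: hash first, then bind the value as a parameter.
    hashed = hashlib.sha256(raw_key.encode()).hexdigest()
    return conn.execute(
        "SELECT token, user_id FROM verification_token WHERE token = ?",
        (hashed,),
    ).fetchall()


# The quote closes the string literal; the trailing comment swallows the
# quote that the query template appends.
PAYLOAD = "' UNION SELECT token, user_id FROM verification_token --"
```

Passing `PAYLOAD` to `lookup_interpolated` returns the stored row without knowing any key, while `lookup_parameterized` returns nothing for the same input because the payload is only ever treated as data.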

CISA added CVE-2026-42208 to the KEV catalog on 2026-05-08 citing in-the-wild exploitation; EPSS currently sits at 37.4% (97th percentile). The fix is v1.83.7-stable, released 2026-04-19. If you cannot upgrade today, set general_settings.disable_error_logs: true, restart, then rotate every virtual key and every upstream provider credential.

Threat 3 — PraisonAI CVE-2026-44338 (4-hour disclosure-to-exploit)

TL-2026-0502 · CVSS 7.3 · Severity: HIGH · Exploitability: Active

PraisonAI is an open-source multi-agent LLM orchestration framework maintained by MervinPraison (~7,100 GitHub stars). Versions 2.5.6 → 4.6.33 shipped the legacy Flask api_server.py entrypoint with two module-level flags set to insecure defaults: AUTH_ENABLED = False and AUTH_TOKEN = None. The helper check_auth() is implemented to return True whenever auth is disabled. The result: every default deployment exposes GET /agents and POST /chat to any unauthenticated network caller, leaking the agent workflow configuration (including embedded API keys in agents.yaml) and allowing arbitrary workflow invocation.
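The insecure default is small enough to show side by side with a fail-closed alternative. This is a sketch, not PraisonAI's source: `check_auth_insecure` models the behaviour described above using the same two flags, and `check_auth_fail_closed` is a hypothetical replacement that refuses every request when no token is configured.

```python
import hmac
from typing import Optional


def check_auth_insecure(
    auth_enabled: bool, auth_token: Optional[str], presented: Optional[str]
) -> bool:
    """Models the described default: auth disabled means every caller passes."""
    if not auth_enabled:
        return True
    return presented == auth_token


def check_auth_fail_closed(
    auth_token: Optional[str], presented: Optional[str]
) -> bool:
    """Hypothetical alternative: no configured token means nobody gets in."""
    if not auth_token or presented is None:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.encode(), auth_token.encode())
```

With the shipped defaults (`AUTH_ENABLED = False`, `AUTH_TOKEN = None`) the first function admits an anonymous caller, whereas the second forces the operator to set a token before the server answers anything.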

The Sysdig Threat Research Team documented the exploitation timeline minute-by-minute. GHSA-6rmh-7xcm-cpxj was published at 13:56:16 UTC on 2026-05-11. Generic disclosure-path reconnaissance from 146.190.133.49 (AS14061 DigitalOcean) began at 17:32:50 UTC; a PraisonAI-specific second pass with User-Agent: CVE-Detector/1.0 started eight minutes later at 17:40:53 UTC. Advisory-to-first-exploit latency: 3 hours, 44 minutes, 39 seconds, faster than most patch-management SLAs even contemplate.

Fixed in v4.6.34. The newer FastAPI server binds to 127.0.0.1 by default; the deprecated Flask api_server.py is the entire vulnerability surface. Migration off the legacy entrypoint is the long-term answer.

Threat 4 — KeyHunter / NATS-as-C2 / Langflow CVE-2026-33017

TL-2026-0514 · CVSS 9.8 · Severity: CRITICAL · Exploitability: Active · CISA KEV: 2026-05-05

Sysdig TRT also disclosed the first publicly documented case of an attacker using NATS, a CNCF-graduated pub/sub messaging server, as the coordination plane for a distributed credential-harvesting botnet. The campaign, tracked as KeyHunter after the Go module path github.com/keyhunter/worker leaked in the binary, combines four notable techniques: (1) abuse of a legitimate ACL-enforced NATS deployment on 45.192.109.25:14222 as covert C2; (2) automated exploitation of Langflow CVE-2026-33017 (unauthenticated Python execution at /api/v1/build_public_tmp/{flow}) for initial access; (3) post-compromise scraping of public code-development environments (CodePen, JSFiddle, StackBlitz, CodeSandbox) for AWS access keys and Anthropic / OpenAI / Bedrock API keys, with live validation against vendor APIs; and (4) DirtyPipe / DirtyCreds container-escape exploitation chains for breakout from compromised pods.

Captured worker behavior includes aws sts get-caller-identity followed by bedrock list-foundation-models — operator-validation sequences classic to LLMjacking. The novelty is the NATS C2 fabric: worker pods subscribe to subjects like task.scan_cde, task.validate_aws, task.validate_ai and publish back to *.result, all governed by ACL-enforced authorization tokens. Detection requires either DPI on TCP/14222 or workload-tier outbound-network policies that disallow NATS protocol egress from production services by default.
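For teams taking the DPI route, the NATS client protocol is plain text, so the first bytes of a flow are often enough to classify it. The sketch below is a rough heuristic under assumptions: `looks_like_nats_client`, `nats_subjects`, and `classify` are hypothetical helpers, the subject patterns come from the names quoted above, and a message payload that happens to start with a protocol verb would be misread as a control line.

```python
import re

# Client-to-server verbs of the NATS text protocol (verbs are case-insensitive).
_CLIENT_LINE = re.compile(
    rb"^(?:CONNECT \{|PUB |SUB |HPUB |UNSUB |PING$)", re.IGNORECASE
)
_SUBJECT = re.compile(rb"^(?:PUB|SUB|HPUB) (\S+)", re.IGNORECASE)


def looks_like_nats_client(payload: bytes) -> bool:
    """True if the first line of the stream is a NATS client operation."""
    first_line = payload.split(b"\r\n", 1)[0]
    return bool(_CLIENT_LINE.match(first_line))


def nats_subjects(payload: bytes) -> list:
    """Extract subjects from PUB / SUB / HPUB control lines."""
    subjects = []
    for line in payload.split(b"\r\n"):
        match = _SUBJECT.match(line)
        if match:
            subjects.append(match.group(1).decode("ascii", "replace"))
    return subjects


def classify(payload: bytes) -> str:
    """Label a captured client stream prefix."""
    if not looks_like_nats_client(payload):
        return "not-nats"
    for subject in nats_subjects(payload):
        if subject.startswith("task.") or subject.endswith(".result"):
            return "keyhunter-like"
    return "nats"
```

Subject names are cheap for an operator to change, so the more durable signal is the middle label: any NATS client traffic leaving a production workload that has no business speaking NATS.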

Cross-Threat MITRE Coverage

The four threats converge on the same defensive surface — public-facing exploitation followed by credential access from files and from the LLM-key store, followed by impact via resource hijacking.

Technique	Where it appears
`T1190` Exploit Public-Facing Application	All four threats — entry vector
`T1595.002` Active Scanning: Vulnerability Scanning	All four — opportunistic scanner reconnaissance
`T1059.006` Command and Scripting Interpreter: Python	ChromaDB, PraisonAI, Langflow
`T1505.003` Web Shell	ChromaDB, LiteLLM, KeyHunter
`T1552.001` Credentials in Files	ChromaDB (process env), LiteLLM (Postgres), KeyHunter (CDE scraping)
`T1528` Steal Application Access Token	LiteLLM (virtual-key DB), KeyHunter (cloud API keys)
`T1611` Escape to Host (DirtyPipe / DirtyCreds)	KeyHunter — container-to-host breakout
`T1571` Non-Standard Port	KeyHunter — NATS on TCP/14222
`T1496` Resource Hijacking	KeyHunter — Bedrock LLMjacking, stolen-key resale

Detection

Each threat ships with three SPL, three KQL, and three Sigma rules on the Threadlinqs Intelligence platform. The three queries below cover the highest-fidelity behaviors for ChromaDB, LiteLLM, and KeyHunter.

Splunk SPL — ChromaDB pre-auth RCE

(index=proxy OR index=web) sourcetype=access*
  uri="*/api/v2/tenants/*/databases/*/collections"
  method=POST
| where match(request_body, "(?i)trust_remote_code\W+true|auto_map")
| stats count, values(src_ip) as src_ips, values(http_status) as statuses by host, request_body

Microsoft KQL — LiteLLM SQL-injection in Authorization header

// Assumption: the front end is configured to record the Authorization header in csCookie.
// The stock W3CIISLog schema has no Authorization column, so adjust the field to your logging setup.
let HOSTS = dynamic(["litellm-proxy", "ai-gateway"]);
W3CIISLog
| where csHost in~ (HOSTS) or csUriStem contains "/v1/chat/completions" or csUriStem contains "/key/generate"
| extend AuthHeader = extract(@"Authorization: ([^\r\n]+)", 1, csCookie)
| where AuthHeader matches regex @"(?i)(UNION\s+SELECT|--\s|/\*|pg_sleep|information_schema|sleep\()"
| project TimeGenerated, cIP, csHost, csUriStem, AuthHeader, scStatus
| order by TimeGenerated desc

Sigma — Outbound NATS C2 (KeyHunter)

title: KeyHunter NATS-as-C2 Outbound on TCP/14222
id: 8e7d4f1c-5a92-4b03-9b6e-3c2f1e8d4a7b
status: experimental
description: Detects outbound TCP/14222 connections from container or AI workload hosts to the KeyHunter NATS C2 relay (45.192.109.25). NATS pub/sub abused as covert command-and-control plane for distributed credential harvesting.
references:
  - https://sysdig.com/blog/nats-as-c2-inside-a-new-technique
  - https://intel.threadlinqs.com/threat/TL-2026-0514
logsource:
  category: network_connection
  product: linux
detection:
  selection_ioc:
    DestinationIp: '45.192.109.25'
    DestinationPort: 14222
  selection_port:
    DestinationPort: 14222
  filter_nats_binaries:
    Image|endswith:
      - '/natsd'
      - '/nats-server'
  condition: selection_ioc or (selection_port and not filter_nats_binaries)
fields:
  - SourceIp
  - DestinationIp
  - Image
  - User
level: critical
tags:
  - attack.command_and_control
  - attack.t1571
  - attack.t1090

Indicators of Compromise

Type	Value	Context
IP	`146.190.133.49`	PraisonAI exploitation source (AS14061 DigitalOcean), UA `CVE-Detector/1.0`
IP	`159.89.205.184`	KeyHunter DigitalOcean staging — TCP/8888 worker artifacts
IP / Port	`45.192.109.25:14222`	KeyHunter NATS-as-C2 relay (ACL-enforced subjects)
URL	`/api/v2/tenants/{tenant}/databases/{db}/collections`	ChromaDB ChromaToast exploit endpoint (POST + trust_remote_code)
URL	`/api/v1/build_public_tmp/{flow}`	Langflow unauth Jython exec (CVE-2026-33017)
URL	`/agents`, `/chat`	PraisonAI legacy `api_server.py` exposed routes
File	`worker-linux-amd64`	KeyHunter Go worker binary (Bedrock LLMjacking)
File	`keyhunter_worker.py`, `deploy.sh`	KeyHunter Python orchestrator + systemd/k8s installer
Behavior	`aws sts get-caller-identity` → `bedrock list-foundation-models`	Operator credential-validation sequence
Behavior	HTTP `Authorization: Bearer '...UNION SELECT...`	LiteLLM SQLi exploitation header
UA	`CVE-Detector/1.0`	Sysdig-observed opportunistic scanner against newly disclosed AI CVEs
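The network and HTTP indicators above can be swept against existing logs with a small matcher. This is a minimal sketch with assumptions: the `Event` shape and `match_iocs` are hypothetical, and a path-only hit is low fidelity because legitimate clients also create ChromaDB collections, so it is best treated as a pivot rather than an alert.

```python
import re
from dataclasses import dataclass

# Values copied from the indicator table.
BAD_IPS = {"146.190.133.49", "159.89.205.184", "45.192.109.25"}
BAD_USER_AGENTS = {"CVE-Detector/1.0"}
SUSPECT_PATHS = [
    re.compile(r"^/api/v2/tenants/[^/]+/databases/[^/]+/collections$"),
    re.compile(r"^/api/v1/build_public_tmp/"),
]


@dataclass
class Event:
    """One normalised log record (hypothetical shape)."""
    src_ip: str
    dst_ip: str
    method: str = ""
    path: str = ""
    user_agent: str = ""


def match_iocs(ev: Event) -> list:
    """Return the indicator types that this event matches."""
    hits = []
    if ev.src_ip in BAD_IPS or ev.dst_ip in BAD_IPS:
        hits.append("ip")
    if ev.user_agent in BAD_USER_AGENTS:
        hits.append("user-agent")
    if ev.method == "POST" and any(p.search(ev.path) for p in SUSPECT_PATHS):
        hits.append("path")
    return hits
```

Checking both source and destination matters here: 146.190.133.49 shows up as an inbound scanner, while the KeyHunter addresses appear as outbound destinations from an already compromised host.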

Why the AI Stack Is Repeating Web-Era Mistakes

Look at the bug classes:

ChromaToast: authentication-ordering defect; the auth dependency fires after the side-effectful handler body. This is a 2010-era Spring (Java framework) bug, now ported to FastAPI.
LiteLLM: string-concatenated SQL inside an error-handling fallback path. The hot path uses parameterized Prisma queries; the cold path was overlooked.
PraisonAI: auth disabled by default in a server bound to 0.0.0.0. This is the same insecure-default bug that put thousands of unauthenticated MongoDB instances on Shodan in 2015.
Langflow / KeyHunter: unauthenticated dynamic-code endpoint accepting arbitrary Python. This is the Struts2 OGNL pattern, reincarnated.

The AI tooling supply chain is shipping with the same defect classes the web ecosystem learned to detect (and largely eliminate) over the 2010-2020 decade. What's different now is the monetization vector: an attacker who lands code execution on any of these components isn't skimming credit cards or dropping cryptominers; they are pivoting straight into stolen Anthropic / OpenAI / Bedrock keys, monetized via resold virtual keys and large-scale prompt-completion abuse against the victim's own provider account. Sysdig has tracked individual victims being billed tens of thousands of dollars in a single weekend from a single compromised proxy.

The CISO-level takeaway: an LLM gateway is Tier-0 credential infrastructure. It deserves the same network controls, the same patch cadence, and the same Tier-0 detection posture as your identity provider and your cloud root-account console. Today, almost none of these systems are being treated that way.

Action Items

This week: Inventory every Langflow / LiteLLM / ChromaDB / PraisonAI deployment in your environment, including unsanctioned shadow-IT instances on developer laptops. Use Censys, Shodan, and your internal CMDB; do not rely on dependency lock files alone.
Upgrade or isolate: LiteLLM ≥ v1.83.7-stable, PraisonAI ≥ 4.6.34, Langflow per vendor CVE-2026-33017 advisory. ChromaDB: switch to the Rust server (default since 1.0.0) or front the Python server with an authenticating proxy plus a WAF rule blocking trust_remote_code.
Rotate credentials: every LiteLLM virtual key, every upstream provider key (Anthropic, OpenAI, Azure OpenAI, Bedrock, Vertex AI, Mistral, Cohere), every AWS access key reachable from a Langflow or KeyHunter-compromised host.
Network: Deny outbound TCP/14222 (NATS) from production workloads by default. Block egress to 45.192.109.25 and 159.89.205.184. Front every internet-facing AI component with an authenticating reverse proxy.
Detect: Deploy the SPL/KQL/Sigma rules above; alert on Bedrock InvokeModel spikes per IAM principal; baseline Anthropic / OpenAI billing per project.
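The Bedrock spike alert in the last item can be prototyped offline before it is wired into a SIEM. The sketch below assumes records shaped like CloudTrail events (`eventSource`, `eventName`, `userIdentity.arn`); the function names, the 5x multiplier, and the floor of 50 calls are hypothetical starting points, not recommended thresholds.

```python
from collections import Counter


def invoke_counts(records) -> Counter:
    """Count Bedrock InvokeModel events per calling principal."""
    counts = Counter()
    for rec in records:
        if (
            rec.get("eventSource") == "bedrock.amazonaws.com"
            and rec.get("eventName") == "InvokeModel"
        ):
            arn = rec.get("userIdentity", {}).get("arn", "unknown")
            counts[arn] += 1
    return counts


def spiking_principals(records, baseline, factor=5.0, floor=50) -> dict:
    """Flag principals whose call count exceeds max(floor, factor * baseline).

    `baseline` maps principal ARN to its usual call count for the same
    window length; principals with no history are judged against `floor`.
    """
    flagged = {}
    for arn, count in invoke_counts(records).items():
        limit = max(floor, factor * baseline.get(arn, 0))
        if count > limit:
            flagged[arn] = count
    return flagged
```

The floor keeps a principal with no history from alerting on its first handful of calls, while still catching a never-before-seen identity that suddenly issues hundreds of invocations, which is the typical shape of a stolen key being validated and then abused.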

This post correlates four threats from the Threadlinqs Intelligence platform. Full IOC sets, MITRE technique chains, and 36 production detection rules are available via the threat-detail pages linked above.