AI-STACK-2026-05 CRITICAL 2026-05-21 Cluster Analysis

AI Stack Under Siege — Four Pre-Auth RCEs Hit ChromaDB, LiteLLM, PraisonAI, and Langflow in 14 Days

Threadlinqs Intelligence 14 min read
Tags: ai-infrastructure, chromadb, litellm, praisonai, langflow, cve-2026-45829, cve-2026-42208, cve-2026-44338, cve-2026-33017, cisa-kev, llmjacking, nats-c2, pre-auth-rce

Window: 2026-05-08 → 2026-05-21  |  Threats analysed: 4  |  Average CVSS: 9.2  |  KEV-listed: 2 of 4

In fourteen days, four pre-authentication remote-code-execution vulnerabilities landed across four distinct layers of the modern AI stack — the vector database that holds your embeddings, the LLM proxy that holds your API keys, the agent framework that holds your workflow configs, and the low-code pipeline that wires them all together. Three of the four ship with public proof-of-concept code. One was weaponised within four hours of public disclosure. Two have already made the CISA Known Exploited Vulnerabilities catalog. Every Fortune 500 SOC currently racing to deploy LLM gateways, vector RAG pipelines and agent orchestrators is shipping the same authentication-ordering, deserialization and missing-authorization bugs the web ecosystem learned about a decade ago — except now the blast radius is the entire commercial-LLM credential set.

This post correlates the four threats into a single architectural story, traces the attack flow across the stack, and provides production-ready SPL, KQL, and Sigma detections for every layer.

The AI Stack, Mapped

[Figure 1: attack-flow diagram. Internet: attacker (scanner + PoC + 0day). Layer 1, orchestration / agent / vector: ChromaDB (CVE-2026-45829, CVSS 10.0, public PoC); PraisonAI (CVE-2026-44338, 4-hour disclosure-to-exploit); Langflow (CVE-2026-33017, KEV), used by the KeyHunter botnet with NATS pub/sub C2 at 45.192.109.25:14222. Layer 2, LLM proxy / gateway: LiteLLM AI Gateway (CVE-2026-42208, CISA KEV), SQLi into the API-key DB that stores Anthropic / OpenAI / Bedrock / Azure keys. Layer 3, commercial LLM APIs: Anthropic API, OpenAI API (virtual key resale), AWS Bedrock (IAM lateral movement, LLMjacking, S3 pivot), Azure OpenAI, Vertex AI, Cohere. Legend: exploit (pre-auth), credential pivot, stolen-key usage.]
// fig.1 — Four pre-auth RCEs across three architectural layers. Solid red arrows = initial exploit. Dashed red = post-compromise credential exfiltration. Mint arrows = downstream abuse against the commercial LLM APIs whose keys the AI gateway centralises.

Executive Summary

Threat 1 — ChromaDB CVE-2026-45829 "ChromaToast"

TL-2026-0544 · CVSS 10.0 (AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H) · Severity: CRITICAL · Exploitability: Public PoC

HiddenLayer researcher Esteban Tonglet disclosed CVE-2026-45829 ("ChromaToast Served Pre-Auth") on 2026-05-18: a maximum-severity pre-authentication RCE in ChromaDB's Python FastAPI server distribution. The bug is an authentication-ordering defect. The POST /api/v2/tenants/{tenant}/databases/{db}/collections handler instantiates the embedding function, and therefore downloads and executes an attacker-supplied Hugging Face model with trust_remote_code=True, before the FastAPI authentication dependency fires. The server returns HTTP 500 or 403 only after the payload has already executed inside the ChromaDB process.

The Python FastAPI server ships in the chromadb PyPI package (14 million monthly downloads) and is the default runtime under several embedded deployment paths. The Rust server (default for chroma run and the official Docker image) does not traverse the vulnerable code path. HiddenLayer's scan of internet-exposed ChromaDB instances found roughly 73% running a vulnerable Python-FastAPI version, with adopters including Capital One, UnitedHealthcare, Weights & Biases, Mintlify, and Factory AI. The maintainer received the private report on 2025-11-28 but has not produced a confirmed patch as of disclosure; ChromaDB 1.5.9 shipped to PyPI on 2026-05-04 but its remediation status is unconfirmed. Hadrian published a Nuclei detection template and one-line PoC on 2026-05-19.

The kill chain is brief: scan → POST a collection-create request with a configuration_json.embedding_function referencing an attacker-controlled Hugging Face repo → ChromaDB downloads and executes modeling_evil.py as the uvicorn process → credentials reachable from the process (OpenAI, Anthropic, Cohere, and Voyage keys, object-store creds, cloud IAM) are exfiltrated. Until a confirmed patch lands, the mitigations are: switch to the Rust server, front the Python server with an authenticating reverse proxy, and add a WAF rule that drops any request body containing trust_remote_code.
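The WAF rule described above can be prototyped as a small request filter. The following is a minimal sketch, assuming the proxy hands the filter the method, path, and raw body; the function names and the example payload shape are hypothetical, and only the endpoint path and the trust_remote_code / auto_map markers come from this post.

```python
import json
import re

# Markers that ask the model loader to run repository-supplied Python.
# "auto_map" is included because the SPL query in this post also flags it.
SUSPICIOUS_KEYS = {"trust_remote_code", "auto_map"}

# Collection-create endpoint named in the advisory text.
COLLECTIONS_PATH = re.compile(
    r"^/api/v2/tenants/[^/]+/databases/[^/]+/collections/?$"
)


def _contains_suspicious_key(node):
    """Recursively walk decoded JSON looking for a marker at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in SUSPICIOUS_KEYS or _contains_suspicious_key(value):
                return True
    elif isinstance(node, list):
        return any(_contains_suspicious_key(item) for item in node)
    elif isinstance(node, str):
        # Configuration may arrive as a JSON document nested inside a string.
        return any(marker in node for marker in SUSPICIOUS_KEYS)
    return False


def should_block(method, path, body):
    """Return True when a collection-create request carries a marker."""
    if method.upper() != "POST" or not COLLECTIONS_PATH.match(path):
        return False
    try:
        decoded = json.loads(body)
    except (ValueError, TypeError):
        # Unparseable body: fall back to a plain substring check.
        if isinstance(body, bytes):
            text = body.decode("utf-8", "replace")
        else:
            text = str(body)
        return any(marker in text for marker in SUSPICIOUS_KEYS)
    return _contains_suspicious_key(decoded)
```

A filter like this is a stopgap rather than a fix: it only helps if it runs in a proxy that sits in front of the Python server, so the request is dropped before the vulnerable handler sees it.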

Threat 2 — LiteLLM CVE-2026-42208 (CISA KEV)

TL-2026-0506 · CVSS 9.8 · Severity: CRITICAL · Exploitability: Active · CISA KEV: 2026-05-08 (FCEB due 2026-05-11)

BerriAI's LiteLLM (~46,800 GitHub stars) is the AI gateway of choice for consolidating access to commercial LLM providers behind a single API. Enterprises route Anthropic, OpenAI, Azure OpenAI, AWS Bedrock, Vertex AI, Cohere, and Mistral traffic through a single proxy that holds the upstream provider credentials, per-key spending budgets, model allow-lists, and (in many deployments) PII. CVE-2026-42208 is a pre-authentication SQL injection in the proxy's API-key verification routine — discovered by Tencent YunDing Security Lab and disclosed via GHSA-r75f-5x8p-qvmc on 2026-04-20.

Between v1.81.16 and v1.83.6, a refactor introduced an error-handling code path inside the LiteLLM_VerificationTokenView (combined_view) lookup that interpolates the raw caller-supplied API key directly into the SQL query string rather than passing it as a Prisma parameter. Under normal authentication, the key is hashed before lookup; under the error-handling fallback, the unsanitised key flows straight into $queryRawUnsafe. A single crafted Authorization: Bearer ' UNION SELECT key,user_id,models,spend FROM "LiteLLM_VerificationToken" -- header against any proxy endpoint (/chat/completions, /v1/embeddings, /key/generate) extracts the entire virtual-key table including the upstream provider credentials.
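The difference between interpolating the key and binding it as a parameter can be shown in a few lines. This is an illustrative sketch using Python's sqlite3 module and a made-up tokens table; it is not LiteLLM's code, schema, or Prisma usage, and only demonstrates the bug class.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory table standing in for a virtual-key store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (token TEXT, user_id TEXT)")
    conn.executemany(
        "INSERT INTO tokens VALUES (?, ?)",
        [("hash-alice", "alice"), ("hash-bob", "bob")],
    )
    return conn


def lookup_interpolated(conn, api_key):
    # VULNERABLE: the caller-supplied key becomes part of the SQL text.
    query = f"SELECT token, user_id FROM tokens WHERE token = '{api_key}'"
    return conn.execute(query).fetchall()


def lookup_parameterized(conn, api_key):
    # SAFE: the key is bound as data and never parsed as SQL.
    query = "SELECT token, user_id FROM tokens WHERE token = ?"
    return conn.execute(query, (api_key,)).fetchall()


conn = make_db()
payload = "' UNION SELECT token, user_id FROM tokens --"
leaked = lookup_interpolated(conn, payload)   # every row in the table
safe = lookup_parameterized(conn, payload)    # no rows: no such token exists
```

With the interpolated query the payload closes the string literal and appends its own SELECT, so the lookup returns the whole table. With the bound parameter the same payload is just an unknown key.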

CISA added CVE-2026-42208 to the KEV catalog on 2026-05-08 citing in-the-wild exploitation; EPSS currently sits at 37.4% (97th percentile). The fix is v1.83.7-stable, released 2026-04-19. If you cannot upgrade today, set general_settings.disable_error_logs: true, restart, then rotate every virtual key and every upstream provider credential.

Threat 3 — PraisonAI CVE-2026-44338 (4-hour disclosure-to-exploit)

TL-2026-0502 · CVSS 7.3 · Severity: HIGH · Exploitability: Active

PraisonAI is an open-source multi-agent LLM orchestration framework maintained by MervinPraison (~7,100 GitHub stars). Versions 2.5.6 → 4.6.33 shipped the legacy Flask api_server.py entrypoint with two module-level flags set to insecure defaults: AUTH_ENABLED = False and AUTH_TOKEN = None. The helper check_auth() is implemented to return True whenever auth is disabled. The result: every default deployment exposes GET /agents and POST /chat to any unauthenticated network caller, leaking the agent workflow configuration (including embedded API keys in agents.yaml) and allowing arbitrary workflow invocation.
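The defect class here is a fail-open authentication helper. The sketch below contrasts that pattern with a fail-closed variant; the function names and signatures are hypothetical and are not taken from PraisonAI's source, which this post describes only at the level of the AUTH_ENABLED and AUTH_TOKEN defaults.

```python
import hmac


def check_auth_fail_open(auth_enabled, expected_token, presented_token):
    """Pattern described above: disabled auth means every caller passes."""
    if not auth_enabled:
        return True
    return presented_token == expected_token


def check_auth_fail_closed(expected_token, presented_token):
    """Hypothetical alternative: no configured token means no access."""
    if not expected_token or not presented_token:
        return False
    # Constant-time comparison avoids leaking token prefixes via timing.
    return hmac.compare_digest(
        expected_token.encode("utf-8"), presented_token.encode("utf-8")
    )


# With the insecure defaults (auth disabled, no token) an anonymous caller
# is accepted by the fail-open helper and rejected by the fail-closed one.
anonymous_open = check_auth_fail_open(False, None, None)
anonymous_closed = check_auth_fail_closed(None, None)
```

The design point is that a missing or unset secret should be treated as a configuration error that denies access, not as a signal that authentication is optional.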

The Sysdig Threat Research Team documented the exploitation timeline minute-by-minute. GHSA-6rmh-7xcm-cpxj was published at 13:56:16 UTC on 2026-05-11. Generic disclosure-path reconnaissance from 146.190.133.49 (AS14061 DigitalOcean) began at 17:32:50 UTC; a PraisonAI-specific second pass with User-Agent: CVE-Detector/1.0 started eight minutes later at 17:40:53 UTC. Advisory-to-first-exploit latency: 3 hours, 44 minutes, 37 seconds (13:56:16 to 17:40:53 UTC), faster than most patch-management SLAs even contemplate.

Fixed in v4.6.34. The newer FastAPI server binds to 127.0.0.1 by default; the deprecated Flask api_server.py is the entire vulnerability surface. Migration off the legacy entrypoint is the long-term answer.

Threat 4 — KeyHunter / NATS-as-C2 / Langflow CVE-2026-33017

TL-2026-0514 · CVSS 9.8 · Severity: HIGH · Exploitability: Active · CISA KEV: 2026-05-05

Sysdig TRT also disclosed the first publicly documented case of an attacker using NATS — a CNCF-graduated pub/sub messaging server — as the coordination plane for a distributed credential-harvesting botnet. The campaign, tracked as KeyHunter after the Go module path github.com/keyhunter/worker leaked in the binary, combines four notable techniques: (1) abuse of a legitimate ACL-enforced NATS deployment on 45.192.109.25:14222 as covert C2; (2) automated exploitation of Langflow CVE-2026-33017 (unauthenticated Jython execution at /api/v1/build_public_tmp/{flow}) for initial access; (3) post-compromise scraping of public code-development environments (CodePen, JSFiddle, StackBlitz, CodeSandbox) for AWS access keys and Anthropic / OpenAI / Bedrock API keys, with live validation against vendor APIs; and (4) DirtyPipe / DirtyCreds container-escape exploitation chains for breakout from compromised pods.

Captured worker behavior includes aws sts get-caller-identity followed by bedrock list-foundation-models — operator-validation sequences classic to LLMjacking. The novelty is the NATS C2 fabric: worker pods subscribe to subjects like task.scan_cde, task.validate_aws, task.validate_ai and publish back to *.result, all governed by ACL-enforced authorization tokens. Detection requires either DPI on TCP/14222 or workload-tier outbound-network policies that disallow NATS protocol egress from production services by default.
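For the DPI option, NATS is easy to recognise because its client protocol is line-oriented text: clients send CONNECT with a JSON body, then SUB and PUB lines naming subjects. The following is a rough sketch of a payload classifier, assuming you already have reassembled client-to-server TCP payloads; the function name is hypothetical and the subject heuristics (the task. prefix and .result suffix) are taken from the subjects quoted above.

```python
import re

# A NATS client line: CONNECT followed by a JSON object, or a
# subscribe / publish operation followed by a subject.
NATS_CLIENT_LINE = re.compile(
    rb"^(?:CONNECT\s+\{|(?:SUB|PUB|HPUB|UNSUB)\s+\S)", re.IGNORECASE
)
SUBJECT_LINE = re.compile(rb"^(?:SUB|PUB|HPUB)\s+(\S+)", re.IGNORECASE)


def inspect_payload(payload):
    """Return (is_nats, suspicious_subjects) for one client payload."""
    is_nats = False
    suspicious = []
    for line in payload.split(b"\r\n"):
        if not NATS_CLIENT_LINE.match(line):
            continue
        is_nats = True
        match = SUBJECT_LINE.match(line)
        if match is None:
            continue
        subject = match.group(1)
        # Subject patterns reported for the KeyHunter workers.
        if subject.startswith(b"task.") or subject.endswith(b".result"):
            suspicious.append(subject.decode("ascii", "replace"))
    return is_nats, suspicious


sample = (
    b'CONNECT {"verbose":false}\r\n'
    b"SUB task.scan_cde 1\r\n"
    b"PUB worker42.result 2\r\nok\r\n"
)
verdict = inspect_payload(sample)
```

This only works on cleartext sessions. If the operator enables TLS on the NATS listener, the egress-policy approach is the one that still holds.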

Cross-Threat MITRE Coverage

The four threats converge on the same defensive surface — public-facing exploitation followed by credential access from files and from the LLM-key store, followed by impact via resource hijacking.

Technique | Where it appears
T1190 Exploit Public-Facing Application | All four threats: entry vector
T1595.002 Active Scanning: Vulnerability Scanning | All four: opportunistic scanner reconnaissance
T1059.006 Python interpreter | ChromaDB, PraisonAI, Langflow
T1505.003 Web Shell | ChromaDB, LiteLLM, KeyHunter
T1552.001 Credentials in Files | ChromaDB (process env), LiteLLM (Postgres), KeyHunter (CDE scraping)
T1528 Steal Application Access Token | LiteLLM (virtual-key DB), KeyHunter (cloud API keys)
T1611 Escape to Host (DirtyPipe / DirtyCreds) | KeyHunter: container-to-host breakout
T1571 Non-Standard Port | KeyHunter: NATS on TCP/14222
T1496 Resource Hijacking | KeyHunter: Bedrock LLMjacking, stolen-key resale

Detection

Each threat ships with three SPL, three KQL, and three Sigma rules on the Threadlinqs Intelligence platform. The three queries below cover the highest-fidelity behaviors for ChromaDB, LiteLLM, and KeyHunter.

Splunk SPL — ChromaDB pre-auth RCE

(index=proxy OR index=web) sourcetype=access*
  uri="*/api/v2/tenants/*/databases/*/collections"
  method=POST
| rex field=request_body "(?<trust_remote_code>trust_remote_code[\"']?\s*[:=]\s*[Tt]rue)"
| rex field=request_body "(?<auto_map>auto_map)"
| where isnotnull(trust_remote_code) OR isnotnull(auto_map)
| stats count, values(src_ip) as src_ips, values(http_status) as statuses by host, request_body

Microsoft KQL — LiteLLM SQL-injection in Authorization header

// Assumption: the Authorization header is logged into csCookie (for example via a custom logging field); swap in whichever column carries it in your pipeline.
let HOSTS = dynamic(["litellm-proxy", "ai-gateway"]);
W3CIISLog
| where csHost in (HOSTS) or csUriStem contains "/v1/chat/completions" or csUriStem contains "/key/generate"
| extend AuthHeader = extract(@"Authorization: ([^\r\n]+)", 1, csCookie)
| where AuthHeader matches regex @"(?i)(UNION\s+SELECT|--\s|/\*|pg_sleep|information_schema|sleep\()"
| project TimeGenerated, cIP, csHost, csUriStem, AuthHeader, scStatus
| order by TimeGenerated desc

Sigma — Outbound NATS C2 (KeyHunter)

title: KeyHunter NATS-as-C2 Outbound on TCP/14222
id: 8e7d4f1c-5a92-4b03-9b6e-3c2f1e8d4a7b
status: experimental
description: Detects outbound TCP/14222 connections from container or AI workload hosts to the KeyHunter NATS C2 relay (45.192.109.25). NATS pub/sub abused as covert command-and-control plane for distributed credential harvesting.
references:
  - https://sysdig.com/blog/nats-as-c2-inside-a-new-technique
  - https://intel.threadlinqs.com/threat/TL-2026-0514
logsource:
  category: network_connection
  product: linux
detection:
  selection_c2:
    DestinationPort: 14222
    DestinationIp: '45.192.109.25'
  selection_port:
    DestinationPort: 14222
    Initiated: 'true'
  filter_nats:
    Image|endswith:
      - '/natsd'
      - '/nats-server'
  condition: selection_c2 or (selection_port and not filter_nats)
fields:
  - SourceIp
  - DestinationIp
  - Image
  - User
level: critical
tags:
  - attack.command_and_control
  - attack.t1571
  - attack.t1090

Indicators of Compromise

Type | Value | Context
IP | 146.190.133.49 | PraisonAI exploitation source (AS14061 DigitalOcean), UA CVE-Detector/1.0
IP | 159.89.205.184 | KeyHunter DigitalOcean staging, TCP/8888 worker artifacts
IP / Port | 45.192.109.25:14222 | KeyHunter NATS-as-C2 relay (ACL-enforced subjects)
URL | /api/v2/tenants/*/databases/*/collections | ChromaDB ChromaToast exploit endpoint (POST + trust_remote_code)
URL | /api/v1/build_public_tmp/{flow} | Langflow unauth Jython exec (CVE-2026-33017)
URL | /agents, /chat | PraisonAI legacy api_server.py exposed routes
File | worker-linux-amd64 | KeyHunter Go worker binary (Bedrock LLMjacking)
File | keyhunter_worker.py, deploy.sh | KeyHunter Python orchestrator + systemd/k8s installer
Behavior | aws sts get-caller-identity, then bedrock list-foundation-models | Operator credential-validation sequence
Behavior | HTTP Authorization: Bearer '...UNION SELECT... | LiteLLM SQLi exploitation header
UA | CVE-Detector/1.0 | Sysdig-observed opportunistic scanner against newly disclosed AI CVEs
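The credential-validation behavior in the table is a sequence rather than a single event, so it needs a small amount of state to detect. Here is a minimal sketch over CloudTrail-style records, assuming they have been flattened into dictionaries with eventTime, eventSource, eventName, and a hypothetical principal field; the five-minute window is an arbitrary starting point, not a value from the Sysdig report.

```python
from datetime import datetime, timedelta

IDENTITY_CHECK = ("sts.amazonaws.com", "GetCallerIdentity")
MODEL_LISTING = ("bedrock.amazonaws.com", "ListFoundationModels")


def parse_time(value):
    """Parse a CloudTrail-style UTC timestamp such as 2026-05-12T10:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def find_validation_sequences(events, window=timedelta(minutes=5)):
    """Return (principal, identity_time, listing_time) for each sequence."""
    last_identity_check = {}
    hits = []
    for event in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        principal = event["principal"]
        when = parse_time(event["eventTime"])
        call = (event["eventSource"], event["eventName"])
        if call == IDENTITY_CHECK:
            last_identity_check[principal] = when
        elif call == MODEL_LISTING:
            started = last_identity_check.get(principal)
            # Flag only when the same principal checked its identity recently.
            if started is not None and when - started <= window:
                hits.append((principal, started, when))
    return hits
```

Either call on its own is routine. The pairing from a principal that has never touched Bedrock before is what makes the sequence worth an alert, so in practice the output would be joined against a per-principal history.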

Why the AI Stack Is Repeating Web-Era Mistakes

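Look at the bug classes:

ChromaDB (CVE-2026-45829): authentication ordering. Attacker-supplied model code loads and runs before the auth dependency fires.
LiteLLM (CVE-2026-42208): SQL injection. The raw caller-supplied key is interpolated into a query string.
PraisonAI (CVE-2026-44338): insecure defaults. The legacy server ships with authentication disabled.
Langflow (CVE-2026-33017): missing authentication on a code-executing endpoint.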

The AI tooling supply chain is shipping with the same defect classes the web ecosystem learned to detect (and largely eliminate) between 2010 and 2020. What's different now is the monetization vector: an attacker who lands code execution on any of these components isn't skimming credit cards or dropping cryptominers. They pivot straight into stolen Anthropic / OpenAI / Bedrock keys, monetized via resold virtual keys and large-scale prompt-completion abuse against the victim's own provider account. Sysdig has tracked individual victims being billed tens of thousands of dollars in a single weekend from a single compromised proxy.

The CISO-level takeaway: an LLM gateway is Tier-0 credential infrastructure. It deserves the same network controls, the same patch cadence, and the same Tier-0 detection posture as your identity provider and your cloud root-account console. Today, almost none of these systems are being treated that way.

Action Items

  1. This week: Inventory every Langflow / LiteLLM / ChromaDB / PraisonAI deployment in your environment — including unsanctioned shadow-IT instances on developer laptops. Censys + Shodan + internal CMDB; do not rely on Composer / pip lock files alone.
  2. Upgrade or isolate: LiteLLM ≥ v1.83.7-stable, PraisonAI ≥ 4.6.34, Langflow per vendor CVE-2026-33017 advisory. ChromaDB: switch to the Rust server (default since 1.0.0) or front the Python server with an authenticating proxy plus a WAF rule blocking trust_remote_code.
  3. Rotate credentials: every LiteLLM virtual key, every upstream provider key (Anthropic, OpenAI, Azure OpenAI, Bedrock, Vertex AI, Mistral, Cohere), every AWS access key reachable from a Langflow or KeyHunter-compromised host.
  4. Network: Deny outbound TCP/14222 (NATS) from production workloads by default. Block egress to 45.192.109.25 and 159.89.205.184. Front every internet-facing AI component with an authenticating reverse proxy.
  5. Detect: Deploy the SPL/KQL/Sigma rules above; alert on Bedrock InvokeModel spikes per IAM principal; baseline Anthropic / OpenAI billing per project.
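The per-principal InvokeModel alert in item 5 can start as a simple baseline comparison. The sketch below assumes you already aggregate hourly InvokeModel counts per IAM principal; the function name, the tenfold factor, and the floor of 50 calls are illustrative assumptions to tune, not recommended thresholds.

```python
from statistics import median


def find_spikes(hourly_counts, factor=10, floor=50):
    """Flag principals whose latest hour far exceeds their own history.

    hourly_counts maps a principal to a list of hourly call counts,
    oldest first, with the most recent hour last.
    """
    spikes = {}
    for principal, counts in hourly_counts.items():
        if len(counts) < 2:
            continue  # no history to compare against
        *history, latest = counts
        baseline = median(history)
        # The floor keeps near-idle principals from alerting on tiny jumps.
        threshold = max(floor, baseline * factor)
        if latest > threshold:
            spikes[principal] = (latest, baseline)
    return spikes


observed = {
    "role/app": [10, 12, 9, 11, 400],
    "role/batch": [100, 120, 110, 130],
}
alerts = find_spikes(observed)
```

Using the median of each principal's own history keeps a single earlier burst from inflating the baseline, which matters when the stolen key belongs to a normally quiet service role.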

This post correlates four threats from the Threadlinqs Intelligence platform. Full IOC sets, MITRE technique chains, and 36 production detection rules are available via the threat-detail pages linked above.