
What is SPL (Search Processing Language)?

SPL is Splunk's proprietary query language used to search, filter, transform, and visualize machine data in real time. It is the primary language SOC analysts, detection engineers, and threat hunters use inside Splunk Enterprise and Splunk Cloud to investigate security events, build detection rules, and create operational dashboards.

Understanding SPL in Depth

Search Processing Language was introduced by Splunk as the core interface between analysts and the vast volumes of machine-generated data that Splunk indexes. Unlike traditional database query languages that operate on structured tables, SPL is purpose-built for semi-structured, time-series log data. It processes events as they are retrieved from indexed data stores, applying a chain of commands through a pipeline architecture.

Every SPL query starts with a search that retrieves a set of events from one or more indexes. From there, the results flow through a series of piped commands that filter, transform, enrich, and aggregate the data. This pipeline model is what makes SPL exceptionally powerful for security operations: an analyst can go from raw syslog events to a statistical summary of attacker behavior in a single query.

SPL is used across every tier of security operations. Tier 1 analysts use it to triage alerts and search for indicators of compromise. Tier 2 analysts build correlation rules and scheduled searches. Detection engineers author production-grade SPL queries that power continuous monitoring. Threat hunters write ad-hoc SPL to explore hypotheses about adversary activity across months of historical data.

SPL Fundamentals

Search Commands

Every SPL query begins with an implicit or explicit search command. The three most common fields used to scope a search are index (the data store the events were written to), sourcetype (the classification of the data's format and origin), and host (the machine that generated the events).
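For example, the three common scoping fields (index, sourcetype, and host) can be combined in one search — the specific names here are illustrative and vary by environment:

```
index=windows sourcetype=WinEventLog:Security host=dc01 EventCode=4624
```

Scoping by index and sourcetype first is also a performance practice: it limits how much indexed data Splunk must retrieve before later pipe stages run.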

The Pipe Operator

The pipe character (|) is the backbone of SPL. It passes the output of one command as input to the next, enabling analysts to build multi-stage queries. Each pipe stage progressively refines, transforms, or aggregates the data. A typical detection query chains 4 to 8 pipe stages together.
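A minimal sketch of this pipeline pattern, assuming a proxy index and field names (src_ip, dest_host, bytes_out) that will differ per environment:

```
index=proxy earliest=-4h
| search action=allowed
| stats count as requests sum(bytes_out) as total_bytes by src_ip, dest_host
| where requests > 100
| sort - total_bytes
```

Each stage receives only the output of the previous stage, so the order of pipes matters: filtering before aggregating is cheaper than the reverse.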

Transforming Commands

These commands aggregate or reshape event data into structured results. The workhorses are stats (aggregate statistics grouped by fields), timechart (aggregation bucketed over time), chart (tabular aggregation with a split-by field), top and rare (most and least frequent field values), and table (column selection for output).
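Two short sketches of the most common transforming commands, assuming a firewall index with action, src_ip, and dest_ip fields:

```
`-- Aggregate: blocked connections per source, with distinct destination count`
index=firewall action=blocked
| stats count as block_count dc(dest_ip) as unique_dests by src_ip
| sort - block_count

`-- Trend: hourly block volume, ready for a dashboard panel`
index=firewall action=blocked
| timechart span=1h count
```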

Eval Functions and Lookups

The eval command creates calculated fields inline. It supports string manipulation, conditional logic (if, case), mathematical operations, and type conversions. Lookups enrich events at search time by joining external reference data (CSV files, KV store collections, or external APIs) into the search results. In security operations, lookups are commonly used to tag known-bad IP addresses, map user identities, or classify assets by criticality.
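A hedged sketch combining eval and a lookup; threat_intel_domains is a hypothetical CSV lookup mapping a domain field to a threat_category field:

```
index=proxy
`-- Normalize the destination before the lookup join`
| eval domain=lower(replace(dest_host, "^www\.", ""))
`-- threat_intel_domains is a hypothetical lookup: domain -> threat_category`
| lookup threat_intel_domains domain OUTPUT threat_category
| where isnotnull(threat_category)
| eval severity=case(threat_category=="c2", "critical", threat_category=="phishing", "high", true(), "medium")
```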

Subsearches

Subsearches are enclosed in square brackets and execute before the outer search. They allow analysts to dynamically generate filter criteria from one dataset and apply it to another. For example, a subsearch might retrieve all IP addresses associated with a known threat actor from a threat intelligence lookup, then feed those IPs into the main search to find matching network connections.
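A sketch of this pattern, assuming a hypothetical threat_actor_ips lookup with an ip column:

```
`-- The subsearch runs first and expands into (src_ip=x.x.x.x OR src_ip=...)`
index=firewall
    [| inputlookup threat_actor_ips | rename ip as src_ip | fields src_ip ]
| stats count as connections values(dest_port) as ports by src_ip
```

Note that Splunk caps subsearch output by default (commonly 10,000 results), so very large intelligence sets are usually applied with lookup instead.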

SPL Detection Example: Suspicious Process Creation

The following SPL query detects potentially malicious process creation events by identifying rare command-line executions spawned from commonly abused parent processes. Each pipe stage is annotated to explain its purpose.

`-- Search Windows process creation events from the last 24 hours`
index=windows sourcetype=WinEventLog:Security EventCode=4688 earliest=-24h

`-- Filter to high-risk parent processes commonly abused by attackers`
| search Parent_Process_Name IN ("cmd.exe", "powershell.exe", "wscript.exe", "mshta.exe", "wmiprvse.exe")

`-- Count executions per unique command line and parent process`
| stats count as exec_count dc(ComputerName) as host_spread values(ComputerName) as hosts by New_Process_Name, CommandLine, Parent_Process_Name

`-- Isolate rare executions seen on very few machines`
| where exec_count < 3 AND host_spread < 2

`-- Flag known suspicious patterns in the command line`
| eval risk_indicator=if(match(CommandLine, "(?i)(invoke-expression|downloadstring|frombase64|hidden|-enc\s)"), "high", "medium")

`-- Format output for SOC triage, listing high-risk results first`
| table risk_indicator, New_Process_Name, Parent_Process_Name, CommandLine, exec_count, host_spread, hosts
| sort risk_indicator

This query demonstrates the typical SPL pipeline pattern used in production detection rules: scope the data, filter to relevant events, aggregate for statistical context, apply threshold logic, enrich with calculated fields, and format the output for analyst review.

SPL vs KQL: Comparison

SPL and KQL (Kusto Query Language) are the two dominant SIEM query languages in enterprise security. SPL powers Splunk, while KQL is used in Microsoft Sentinel, Defender XDR, and Azure Data Explorer. Understanding their differences is essential for detection engineers working across multi-SIEM environments.

| Aspect | SPL (Splunk) | KQL (Microsoft) |
| --- | --- | --- |
| Pipeline syntax | Pipe character \| between commands | Pipe character \| between operators |
| Data source selection | index=name sourcetype=type | TableName at start of query |
| Filtering | search or where command | where operator |
| Aggregation | stats count by field | summarize count() by field |
| String matching | field="*pattern*" or match() | has, contains, matches regex |
| Time scoping | Time picker UI or earliest=-1h | ago(1h) in where clause |
| Platform | Splunk Enterprise, Splunk Cloud | Microsoft Sentinel, Defender XDR, Azure Data Explorer |
| Learning curve | Moderate; extensive command library | Moderate; SQL-like syntax lowers barrier |
| Community size | Large; 20+ years of documentation | Growing rapidly with Sentinel adoption |
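The contrast is easiest to see side by side. A hedged sketch of the same detection — excessive failed logons in the last hour — in both languages; field and table names follow common defaults (Windows Security logs in Splunk, the SecurityEvent table in Sentinel) and may differ per environment:

```
`-- SPL (Splunk)`
index=windows EventCode=4625 earliest=-1h
| stats count as failures by Account_Name
| where failures > 5

// KQL (Microsoft Sentinel)
SecurityEvent
| where EventID == 4625 and TimeGenerated > ago(1h)
| summarize failures = count() by Account
| where failures > 5
```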

SPL in Threat Detection

Detection engineers rely on SPL to write rules that continuously monitor Splunk indexes for indicators of malicious activity. These rules run as scheduled searches (also called correlation searches in Splunk Enterprise Security) and trigger alerts when adversary behavior is identified. SPL detection rules cover the full spectrum of attacker techniques:

Process Creation Monitoring

SPL queries against Windows Event ID 4688 or Sysmon Event ID 1 identify suspicious process execution chains. Detection engineers look for unusual parent-child relationships (e.g., excel.exe spawning powershell.exe), encoded command-line arguments, LOLBin abuse, and process injection artifacts.
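A hedged sketch of an unusual parent-child detection against Sysmon Event ID 1; the sourcetype name and thresholds vary by deployment:

```
index=windows sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventCode=1
| search ParentImage IN ("*\\excel.exe", "*\\winword.exe", "*\\outlook.exe")
         Image IN ("*\\powershell.exe", "*\\cmd.exe", "*\\wscript.exe")
| table _time, host, ParentImage, Image, CommandLine
```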

Network Connection Analysis

SPL parses firewall, proxy, and DNS logs to detect command-and-control beaconing, data exfiltration over DNS tunneling, connections to known-malicious infrastructure, and unusual outbound traffic patterns. Threshold-based and statistical anomaly approaches both leverage SPL aggregation functions.
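One statistical approach to beaconing is to measure how regular a source's request rate is toward a single destination — a sketch with illustrative thresholds, assuming proxy logs with src_ip and dest_host fields:

```
`-- Count requests per minute per src/dest pair, then measure regularity`
index=proxy
| bin _time span=1m
| stats count by _time, src_ip, dest_host
| stats count as active_minutes avg(count) as avg_per_min stdev(count) as jitter by src_ip, dest_host
`-- Many active minutes with near-zero variance suggests automated beaconing`
| where active_minutes > 180 AND jitter < 1
```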

File System Monitoring

SPL rules correlate Sysmon file-creation events to identify ransomware encryption patterns (rapid sequential file modifications), suspicious file drops in temp directories, DLL sideloading attempts, and unauthorized modifications to system binaries or scheduled tasks.
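The rapid-encryption pattern can be sketched against Sysmon Event ID 11 (FileCreate); the threshold and sourcetype are illustrative:

```
`-- Flag processes creating an abnormal number of files per minute per host`
index=windows sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventCode=11
| bin _time span=1m
| stats count as files_written dc(TargetFilename) as unique_files by _time, host, Image
| where files_written > 100
| sort - files_written
```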

Production SPL detections are typically paired with adaptive response actions in Splunk SOAR (formerly Phantom) to automate containment steps such as isolating a host, disabling a user account, or enriching an alert with threat intelligence lookups.

SPL Detection Rules on Threadlinqs

The Threadlinqs Intelligence platform maintains a continuously updated library of production-ready SPL detection rules, written and validated by detection engineers for every threat in the database.

1,144 SPL detection rules · 160+ threats covered · 465 MITRE techniques mapped

Each SPL rule in the library is mapped to specific MITRE ATT&CK techniques, assigned a severity and confidence level, and formatted for direct deployment into Splunk. Analysts can browse by threat, filter by tactic or technique, and copy any rule to their clipboard with a single click for immediate use in their Splunk environment.

Whether you are responding to a new CVE, hunting for a specific threat actor's tradecraft, or building out detection coverage for a MITRE ATT&CK gap analysis, the Threadlinqs SPL library eliminates the time-consuming process of writing detections from scratch.


Frequently Asked Questions

Is SPL similar to SQL?
SPL and SQL share conceptual similarities such as filtering, aggregation, and sorting, but they differ in syntax and purpose. SQL operates on structured relational databases using SELECT, FROM, and WHERE clauses. SPL operates on semi-structured machine data using a pipeline model where the output of one command feeds into the next via the pipe operator. SPL is optimized for time-series log data and real-time search, while SQL is designed for transactional queries against predefined table schemas. If you know SQL, you will find many SPL concepts familiar, but the pipeline-first approach requires a shift in how you structure queries.
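The shift is clearest with an equivalent query written both ways; table and field names are illustrative:

```
-- SQL: failed logons per user, most failures first
SELECT user_name, COUNT(*) AS failures
FROM auth_events
WHERE status = 'failed'
GROUP BY user_name
ORDER BY failures DESC;

`-- SPL: the same logic expressed as a pipeline`
index=auth status=failed
| stats count as failures by user_name
| sort - failures
```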
How do I learn SPL?
Start with Splunk's free training platform and the official SPL documentation. Practice in a Splunk Enterprise trial or Splunk Cloud sandbox with real or sample data. Study production detection rules to understand practical patterns — platforms like Threadlinqs publish over 1,100 pre-written SPL detections that demonstrate real-world query structures. Focus on mastering core commands first: search, stats, eval, where, table, and timechart. Then progress to advanced topics like subsearches, lookups, macros, and data models. The Splunk community on Slack, Splunk Answers, and GitHub also provides extensive examples and troubleshooting guidance.
Can I convert SPL to other SIEM formats?
Yes. SPL can be translated to other SIEM query languages such as Microsoft KQL, Elastic EQL, or vendor-neutral Sigma rules. Tools like pySigma and commercial SIEM migration platforms can automate straightforward translations. However, complex SPL queries that use advanced subsearches, custom macros, lookup tables, or Splunk-specific eval functions may require manual adjustment. The Sigma rule format is widely used as an intermediary standard: a single Sigma rule can be compiled to SPL, KQL, EQL, and several other backends, making it a practical bridge for multi-SIEM environments.
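As a sketch of the intermediary pattern, here is a minimal Sigma rule and an approximate SPL rendering; the exact output of a converter like pySigma depends on the chosen backend and processing pipeline, so field names may be remapped:

```
title: Encoded PowerShell Command Line
logsource:
  product: windows
  category: process_creation
detection:
  selection:
    Image|endswith: '\powershell.exe'
    CommandLine|contains: '-enc'
  condition: selection
```

```
`-- Approximate SPL emitted by a Sigma-to-Splunk backend`
Image="*\\powershell.exe" CommandLine="*-enc*"
```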