index="tutorialdata" sourcetype="access*" status=200 action=purchase [search index="tutorialdata" sourcetype="access*" status=200 action=purchase | top limit=1 clientIP | table clientIP] | stats count by productId

A subsearch in Splunk is a search within a search. Subsearches allow you to run a secondary search and use the results of that search as input for the main (outer) search. Subsearches are enclosed in square brackets ([...]), and their results are passed to the outer search for further processing.

Here’s a breakdown of how subsearches work, some common use cases, and examples:

How Subsearches Work:

  1. Subsearch Execution: Splunk executes the subsearch first and gathers the results.
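  2. Passing Results: Splunk replaces the bracketed subsearch with its results before running the main (outer) search. By default the results become a Boolean expression of field-value pairs, with the fields of each row joined by AND and the rows joined by OR; commands such as return can pass bare values instead.
  3. Subsearch Result Limit: By default, a subsearch returns at most 10,000 results and runs for at most 60 seconds. When either limit is reached, the subsearch is finalized and the outer search uses whatever results were collected. Both limits are set in the [subsearch] stanza of limits.conf (maxout and maxtime).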
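
To see what the outer search actually receives, one option is to run the subsearch on its own and end it with format. A sketch based on the query at the top of this page:

```spl
index="tutorialdata" sourcetype="access*" status=200 action=purchase
| top limit=1 clientIP
| table clientIP
| format
```

The output should be a single field named search holding the generated expression, along the lines of ( ( clientIP="203.0.113.7" ) ). The address shown here is a placeholder, not real output.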

Subsearch Syntax:

Subsearches are enclosed within square brackets [...], and they are placed within the main search.

Basic Syntax:

<main_search> [<subsearch>]

Examples of Subsearch Usage:

1. Simple Subsearch for Filtering Results

This is one of the most common subsearch use cases. A subsearch is used to generate a set of field-value pairs that are passed to the main search as filters.

Example (Find Events Related to Specific IPs):

index=web sourcetype=access_logs [search index=security sourcetype=firewall_logs | fields ip | dedup ip]
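
The filter only matches if the field name returned by the subsearch also exists in the outer events. When the two sources use different names, rename the field inside the subsearch. A sketch, assuming hypothetical field names src_ip (in the firewall logs) and clientip (in the web logs):

```spl
index=web sourcetype=access_logs
    [ search index=security sourcetype=firewall_logs
      | dedup src_ip
      | rename src_ip AS clientip
      | fields clientip ]
```

Here the outer search is filtered on clientip values, even though the firewall events store the address under a different name.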

2. Subsearch with return

The return command, placed at the end of a subsearch, controls which fields and how many rows are passed to the outer search. Given a count and a field name, it returns that many rows as field-value pairs joined by OR. Without a count, it returns only the first row.

Example (Search for Specific Usernames):

index=web sourcetype=access_logs [search index=security sourcetype=user_logs "login_failed" | fields user | dedup user | return 10 user]
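
Running the subsearch on its own shows what return hands back. In the sketch below, the result should be a single field named search whose value has roughly the form (user="alice") OR (user="bob"); the usernames are placeholders, not real output.

```spl
index=security sourcetype=user_logs "login_failed"
| dedup user
| return 10 user
```

Dropping the count (return user) would pass only the first row, so the outer search would filter on a single username.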

3. Using Subsearch for Dynamic Field Matching

A subsearch can compute the filter values at search time, for example with stats and where, instead of hard-coding them in the outer search.

Example (Find Logs from Hosts with High Error Rates):

index=web sourcetype=access_logs [search index=web sourcetype=error_logs error_type=high | stats count by host | where count > 100 | fields host]

4. Subsearch with format

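The format command turns the subsearch results into a single search expression: the field-value pairs of each row are joined by AND, and the rows are joined by OR. Splunk applies format implicitly at the end of every subsearch, so adding it explicitly is mainly useful when you want to inspect or customize the generated expression.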

Example (Search for IPs with Multiple Failed Logins):

index=web sourcetype=access_logs [search index=web sourcetype=auth_logs "login_failed" | stats count by ip | where count > 10 | fields ip | format]

5. Subsearch for Exclusion

Prefix a subsearch with NOT to exclude events that match its results from the outer search.

Example (Exclude Suspicious IPs from Search):

index=web sourcetype=access_logs NOT [search index=security sourcetype=suspicious_ips | fields ip]
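
If the list of addresses to exclude lives in a lookup file rather than an index, the same pattern works with inputlookup. A sketch, assuming a hypothetical lookup file named suspicious_ips.csv with an ip column:

```spl
index=web sourcetype=access_logs
    NOT [ | inputlookup suspicious_ips.csv
          | fields ip ]
```

The leading pipe is needed because inputlookup is a generating command. As with any subsearch filter, the ip column must match a field name in the outer events.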

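6. Subsearch with append

The append command runs a subsearch and adds its results as extra rows after the results of the outer search. It does not match rows on a common field; to combine the two datasets by field, follow append with stats or use the join command.

Example (Append Results from a Second Search):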

index=web sourcetype=access_logs | append [search index=web sourcetype=user_logs | fields user, action]
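
To actually combine the two datasets on a common field, a common pattern is to aggregate each side and then merge the rows with stats. A sketch, assuming both sourcetypes carry a user field (the output field names web_events and user_log_events are made up for illustration):

```spl
index=web sourcetype=access_logs
| stats count AS web_events BY user
| append
    [ search index=web sourcetype=user_logs
      | stats count AS user_log_events BY user ]
| stats values(web_events) AS web_events values(user_log_events) AS user_log_events BY user
```

The final stats collapses the appended rows so that each user ends up on one row with both counts.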

Key Subsearch Commands:

Command   Description
return    Returns the first N rows (1 by default) as field="value" pairs joined by OR, or as bare values.
format    Formats the subsearch results as a single search expression (AND within a row, OR between rows).
fields    Specifies which fields to include in the subsearch results.
append    Appends the subsearch results as additional rows after the outer search results (it does not join on a field).
NOT       Boolean operator (not a command) that excludes events matching the subsearch results from the outer search.

Subsearch Limitations:

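  1. Result Limits: By default, subsearches return a maximum of 10,000 results and run for at most 60 seconds. You can change these with maxout and maxtime in the [subsearch] stanza of limits.conf, though raising them can impact performance. The similarly named subsearch_maxout setting belongs to the [join] stanza and applies to the join command.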
  2. Performance: Since subsearches run before the main search, they can introduce performance overhead, especially when handling large datasets.
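  3. Nesting Subsearches: Subsearches can be nested, with the innermost subsearch running first, but every level is subject to the same result and time limits and adds to the total run time, so deep nesting is best avoided.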
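
When a subsearch is at risk of hitting the result or time limits, a single search with eventstats can sometimes replace it. The sketch below is one possible rewrite of the high-error-hosts example (use case 3), assuming both sourcetypes carry the same host field. The is_high_error and high_errors field names are made up for illustration, and whether this is faster depends on data volume.

```spl
index=web (sourcetype=access_logs OR sourcetype=error_logs)
| eval is_high_error=if(sourcetype="error_logs" AND error_type="high", 1, 0)
| eventstats sum(is_high_error) AS high_errors BY host
| where high_errors > 100 AND sourcetype="access_logs"
```

This reads both sourcetypes in one pass, tags every event with its host's high-error count, and then keeps only the access log events from hosts above the threshold.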

Summary:

Subsearches are a powerful tool in Splunk, enabling complex searches by passing results between searches. However, care should be taken to manage performance, especially with large datasets.