Installing and Managing Splunk Stream in a Distributed Environment

Splunk Stream is a great way to monitor network traffic from a host or via a network tap or SPAN port. The software acts as a network traffic "sniffer." The web interface allows you to choose individual metadata fields that are specific to a network protocol and write that metadata to your Splunk indexers for searching.

This means that you can capture all kinds of useful metadata through Splunk Stream, and even do limited full packet capture! Top data sources for Splunk Stream include DNS and DHCP (both protocols where logging is notoriously weak), but many people use it to capture HTTP transactions, database queries, emails, and more.

This blog post will focus on the bits needed to deploy, configure and manage Splunk Stream in a distributed environment. This may consist of hundreds or thousands of Splunk Universal Forwarders running on endpoints throughout your environment, receiving their initial Splunk Stream Technology Add-On (TA) from your central deployment server, and their subsequent Splunk Stream configuration from a central Splunk Stream server. These two roles (Deployment server and Splunk Stream server) may run on the same host, depending on the size and complexity of your configuration. Much more detail can be found on Splunk Docs, but this post will cover the high-level steps and requirements.

Overview

If you only have a handful of Stream hosts, it's by far easiest to install a heavy forwarder on each and configure it manually; but if you're planning to roll out a fleet of Stream sensors throughout your network, you will want to manage them centrally. While the Stream TA can be deployed via the deployment server, the actual stream configuration is managed via a different model. We will walk through that model below, but the high-level summary is that you can deploy the Stream Technology Add-on (TA) onto Universal Forwarders (heavy forwarders are not required for the TA) and point them all to a central Stream configuration server over the Splunk Web port (HTTP on port 8000 by default).

Implementation

Note that there are two primary components in Splunk Stream. First is the Splunk Stream app, which provides the web interface and allows stream configuration. This component exposes the configuration you build to clients. The client (Splunk_TA_stream) gets its configuration from the Splunk Stream app via REST API.

In a standalone configuration, both of these components are installed on the same host (Splunk_TA_stream comes as part of the Splunk Stream app that you download from Splunkbase), and the request and transfer of configuration information from server to client takes place on the local network stack. In a distributed configuration, the request and transfer of configuration takes place over the wire.

1. Have a Splunk deployment running

To start capturing wire data in a distributed environment, you’ll first need a distributed environment. If you’re not there yet, feel free to read on, but you might want to go back to the guidance on deploying Splunk in Splunk Docs first.

The components you’ll need from the Splunk side are:

Search Head: The Splunk server used to search indexed data.
Indexer(s): The Splunk server used for ingesting and indexing the wire data from the Stream platform.
Deployment server: The central configuration point for Splunk Universal Forwarders in your environment.
Splunk Stream server: A full install of Splunk running the Splunk Stream app. This server will be used to deploy the Stream configuration to each of the universal forwarders. This server should not be configured as an indexer, but rather to forward all internal logs to the indexing tier. No Stream data will be received at this server, but rather will be sent to the indexing tier via the forwarder’s outputs.conf settings. In a smaller environment, this function could also be run on your deployment server.
Splunk Universal Forwarders: “The best piece of software ever written,” according to James Brodsky. They run on multiple operating systems and can capture numerous types of machine data, including wire data captured at your endpoints by the Splunk Stream binaries.

See the following diagram for a breakdown:

2. Configure the Splunk Stream app for distributed management on a standalone server

Following the Splunk Stream installation guide, install Splunk Stream on a full instance of Splunk. You’ll use this host as your configuration point for all Stream configurations that will be pulled to your Universal Forwarders using subsequent configurations. Go ahead and disable all the default Streams that are enabled in the initial install. We will work on establishing a new Metadata Stream in a subsequent step.

3. Set up the Splunk Stream TA to deploy to Universal Forwarders from your deployment server

Using the Splunk deployment server, add the Splunk Stream TA (which is available in the full Splunk Stream app download package, under the install folder of splunk_app_stream) to a server class to be pushed out to relevant Splunk Universal Forwarders. You may choose to deploy to a subset of forwarders, or all of the forwarders in your environment.

The most important step here is to define your Splunk Stream server (see point 2 above) in your TA’s inputs.conf file. The custom inputs.conf that resides in that app should point to your remote Stream server, as below.

[streamfwd://streamfwd]
splunk_stream_app_location = http://remote_stream_server:8000/en-us/custom/splunk_app_stream/
stream_forwarder_id =
disabled = 0

The following is a screenshot of an installed universal forwarder, with an example config file:

Don't forget to modify the protocol if you're using SSL/TLS on your Stream server.
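If you maintain more than one copy of this stanza (say, a test and a production Stream server), it can help to generate it rather than hand-edit the URL. The following is a minimal Python sketch, not part of Splunk Stream itself: the function name and its parameters are hypothetical, and the URL path is simply copied from the example stanza above. The only things it varies are the scheme, host and port.

```python
from urllib.parse import urlunsplit


def stream_inputs_stanza(host: str, port: int = 8000, use_tls: bool = False,
                         forwarder_id: str = "") -> str:
    """Render the [streamfwd://streamfwd] stanza for one Stream server.

    Hypothetical helper: only the scheme, host, port and forwarder ID vary.
    """
    # Switch the protocol when the Stream server's web interface uses SSL/TLS.
    scheme = "https" if use_tls else "http"
    # The path is taken verbatim from the example stanza in this post.
    location = urlunsplit(
        (scheme, f"{host}:{port}", "/en-us/custom/splunk_app_stream/", "", "")
    )
    lines = [
        "[streamfwd://streamfwd]",
        f"splunk_stream_app_location = {location}",
        f"stream_forwarder_id = {forwarder_id}",
        "disabled = 0",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Same server as the example above, but reached over TLS.
    print(stream_inputs_stanza("remote_stream_server", use_tls=True))
```

The printed text can be saved as the inputs.conf in the TA that your deployment server pushes out.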

This will allow your newly minted Splunk Stream TA, running in your Splunk Universal Forwarder, to pull your defined Stream configurations from your central Splunk Stream server. This process is highlighted as follows:

4. Configure streams on your Stream server to deploy to the Universal Forwarder fleet

Using the configuration above, you’ll now have a bunch of Universal Forwarders phoning into the Splunk Stream server waiting for a configuration to become available. You’ll now need to create that configuration so that the Universal Forwarders pull it down in order to start sending Splunk Stream data to your deployment’s indexer tier.

After logging in to your Splunk Stream server, from the configuration page of Splunk Stream, click the “Collect data from other machines” button. You will need to enable the HTTP Event Collector first if you're going to use the Independent Stream Forwarder at a later stage (but if only using Universal Forwarder for forwarding, then you won’t need this):

Click on the “Configure Streams” menu item to begin configuring a stream for your deployment.

For our example, clone the default DNS stream by clicking the clone button:

Give the new Stream a meaningful name:

Now, configure the stream based on your requirements and enable it. Consider which index to send the data to and which fields you are going to capture:

Now that you have a stream configured, go to “Distributed Forwarder Management” under the configuration menu and create a new group:

Choose which forwarders to deploy to using regex to define a group, if required.
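Before saving a group rule, it is worth checking which forwarders a regex would actually select. The following is a rough Python sketch of that check. The forwarder IDs are made up for illustration, and requiring the whole ID to match the rule is an assumption on my part, so confirm how Stream applies the rule in your version before relying on it.

```python
import re


def forwarders_in_group(pattern: str, forwarder_ids: list[str]) -> list[str]:
    """Return the forwarder IDs that a group's regex rule would select."""
    rule = re.compile(pattern)
    # Assumption: the rule must match the entire forwarder ID.
    return [fid for fid in forwarder_ids if rule.fullmatch(fid)]


# Hypothetical forwarder IDs, for illustration only.
fleet = ["dns-syd-01", "dns-syd-02", "web-mel-01", "dns-mel-01"]

if __name__ == "__main__":
    print(forwarders_in_group(r"dns-.*", fleet))
    # ['dns-syd-01', 'dns-syd-02', 'dns-mel-01']
    print(forwarders_in_group(r".*-mel-\d+", fleet))
    # ['web-mel-01', 'dns-mel-01']
```

A consistent naming scheme for your forwarders makes these rules much easier to write and to reason about.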

Now choose which streams to capture under this new group configuration.

You should now start seeing data arriving in your indexing tier.

5. Other considerations and notes

Don't forget that your Stream forwarders will need to connect home to the Splunk Stream server, so network access will be required. If you deploy a large number of them (hundreds or thousands), also consider adjusting how often they call home, which you can do by adding the "pingInterval" setting to streamfwd.conf. The default value is 5 seconds, but in larger environments an interval of many minutes is usually more than sufficient.
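To get a feel for why the interval matters, here is a back-of-the-envelope Python sketch. It assumes each forwarder makes exactly one request per interval and that requests are spread evenly over time, which is a simplification. The function names and the fleet size are hypothetical.

```python
def call_home_rate(forwarders: int, ping_interval_s: float) -> float:
    """Average requests per second arriving at the Stream server."""
    if ping_interval_s <= 0:
        raise ValueError("ping_interval_s must be positive")
    # Assumption: one request per forwarder per interval.
    return forwarders / ping_interval_s


def interval_for_target_rate(forwarders: int, max_requests_per_s: float) -> float:
    """Smallest interval (seconds) that keeps the average rate at or below a target."""
    if max_requests_per_s <= 0:
        raise ValueError("max_requests_per_s must be positive")
    return forwarders / max_requests_per_s


if __name__ == "__main__":
    fleet_size = 5000  # hypothetical fleet
    print(call_home_rate(fleet_size, 5))            # 1000.0 requests/s at the 5-second default
    print(round(call_home_rate(fleet_size, 300), 1))  # 16.7 requests/s at a 5-minute interval
    print(interval_for_target_rate(fleet_size, 10))  # 500.0 seconds to stay near 10 requests/s
```

The trade-off is that a longer interval also means forwarders take longer to pick up a changed stream configuration.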

If you are deploying large Splunk Stream configurations, you may need to raise or remove the Splunk Universal Forwarder's default network output limit of 256KB/s. This limit can bottleneck the forwarder's ability to send data to your indexing tier in real time. To change it, modify the thruput stanza in limits.conf. Check out Splunk Docs for more information.
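To judge whether the default cap is likely to bite, it helps to turn it into a daily ceiling. The following Python sketch does that arithmetic. It treats 1 KB as 1024 bytes, which is an assumption, and the function names and the sample capture rate are hypothetical.

```python
DEFAULT_MAX_KBPS = 256  # Universal Forwarder default cited in this post
SECONDS_PER_DAY = 86_400


def daily_ceiling_gib(max_kbps: float = DEFAULT_MAX_KBPS) -> float:
    """Most data one forwarder can send per day at a sustained cap, in GiB."""
    # Assumption: 1 KB = 1024 bytes.
    return max_kbps * 1024 * SECONDS_PER_DAY / 1024 ** 3


def exceeds_cap(expected_kbps: float, max_kbps: float = DEFAULT_MAX_KBPS) -> bool:
    """True when the expected Stream output rate is above the forwarder's cap."""
    return expected_kbps > max_kbps


if __name__ == "__main__":
    print(daily_ceiling_gib())   # 21.09375 GiB per day at 256 KB/s
    print(exceeds_cap(400))      # True: a sensor producing 400 KB/s would fall behind
    print(exceeds_cap(100))      # False
```

If a sensor's expected output is anywhere near the ceiling, that is a good sign you should revisit the thruput setting for that group of forwarders.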

If you are a Splunk Cloud customer, you can still leverage all of this wire data goodness. The following diagram highlights the changes to the above deployment design that would be required in your install:

What Next?

Head over to Splunkbase and download the new Splunk Essentials for Wire Data app, which showcases 49 example use cases across security, IT ops and fraud, all using data solely from Splunk Stream. Grab it here.

Simon O'Brien

I am a passionate Splunker, traveller, family man, cook, basketballer, social advocate and security professional. I have the best job in the world, and live in the best place in the world.
