August 17, 2021
Around this time last year, I showed how to convert a standard Splunk forwarder event to the Elastic Common Schema. In this blog, we’ll walk through how to convert a much more common data source: Apache Combined logs.
Mark Settle and Mathieu Martin over at Elastic inspired me with their 2019 blog introducing the concept of the Elastic Common Schema with exactly this type of event. Their blog does an amazing job of explaining the whys and hows of the ECS in general, and I’m so thrilled to be able to add to their work with an in-depth explanation of how Cribl Stream handles the practicals of getting you from raw event to perfectly crafted ECS-conforming event.
So in this blog, we're going to walk through the end-to-end process of designing a Cribl Stream pipeline that converts an Apache log event to conform to the Elastic Common Schema.
Before we crack open Stream and fire up a new Apache to Elastic pipeline, let's do a little planning: we'll look at the raw Apache event, map each component of the event to the correct ECS field, and then to the Stream functions that will convert that field.
The Raw Event
10.42.42.42 - - [07/Dec/2018:11:05:07 +0100] "GET /blog HTTP/1.1" 200 2571 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36"
| Field Name | Value | Stream Function | Notes |
|---|---|---|---|
| @timestamp | 2018-12-07T11:05:07.000Z | Parser, Eval | Parser to extract the field, Eval to rename it and convert it to the new format. |
| ecs.version | 1.0.0 | Eval | Eval to create the field and hard-code its value. |
| event.dataset | apache.access | Parser, Eval | Parser to extract the field, Eval to rename it. |
| event.original | 10.42.42.42 - - [07/Dec … | Parser, Eval | Parser to extract the field, Eval to rename it. |
| http.request.method | get | Parser, Eval | Parser to extract the field, Eval to rename it (and lowercase it). |
| http.response.body.bytes | 2571 | Parser, Eval | Parser to extract the field, Eval to rename it. |
| http.response.status_code | 200 | Parser, Eval | Parser to extract the field, Eval to rename it. |
| http.version | 1.1 | Parser, Eval | Parser to extract the field, Eval to rename it. |
| host.hostname | webserver-blog-prod | Eval | Eval to create the field and hard-code its value. If host metadata is available, use that. |
| message | "GET /blog HTTP/1.1" 200 2571 | Parser, Eval | Parser to extract the field, Eval to rename it. |
| service.name | Company blog | Eval | Eval to create the field and hard-code its value. |
| service.type | apache | Eval | Eval to create the field and hard-code its value. |
| source.geo.* | | GeoIP | GeoIP to derive the geo fields from the client IP; see below. |
| source.ip | 10.42.42.42 | Parser, Eval | Parser to extract the field, Eval to rename it. |
| url.original | /blog | Parser, Eval | Parser to extract the field, Eval to rename it. |
| user.name | - | Parser, Eval | Parser to extract the field, Eval to rename it. Can also use Eval to set a default if the value is not in the original event. |
| user_agent.original | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 | Parser, Eval | Parser to extract the field, Eval to rename it. |
A great function to kick off an Apache to Elastic pipeline is often the Parser function. It's the workhorse of Cribl Stream: it's designed to parse events with common structures or delimiters, and it even comes with a library of pre-built field lists for common event types like Palo Alto firewall logs, AWS CloudFront logs, and…wait for it…Apache Combined Logs. It does a lot of the heavy lifting for this use case. The full settings are sketched below.
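Here's a rough YAML rendering of those Parser settings. The key names are assumptions based on the UI labels (Operation Mode, Type, Library, Source Field, Destination Field), so treat this as a sketch rather than a copy-paste config:

```yaml
# Parser function, as it might appear in a pipeline's conf.yml (sketch)
- id: serde                    # Parser's internal function id
  conf:
    mode: extract              # Operation Mode: Extract
    type: clf                  # Type: Common/Combined Log Format
    library: apache_combined   # Library: Apache Combined Logs (id is illustrative)
    srcField: _raw             # Source Field: parse the raw event text
    dstField: __parsed         # Destination Field: see the pro tip below
```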
Pro tip: if you fill out the optional "Destination Field" with something like "__parsed", it packages up all the newly parsed fields into one tidy field. Even better, fields that start with a double underscore are internal, so they don't get passed along to the destination. (Double pro tip: if you do start using internal fields, be sure to turn on "Show Internal Fields" in the Preview Settings.)
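With that in place, the event carries its parsed fields along something like this. The names inside __parsed are hypothetical; the actual names come from the library you picked:

```json
{
  "_raw": "10.42.42.42 - - [07/Dec/2018:11:05:07 +0100] \"GET /blog HTTP/1.1\" 200 2571 \"-\" \"Mozilla/5.0 ...\"",
  "__parsed": {
    "clientip": "10.42.42.42",
    "user": "-",
    "timestamp": "07/Dec/2018:11:05:07 +0100",
    "request_method": "GET",
    "uri_path": "/blog",
    "http_version": "1.1",
    "status": "200",
    "bytes": "2571",
    "referrer": "-",
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) ..."
  }
}
```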
The second step is to work on the fields that need a little sprucing up, like source.geo.*.
Getting source.geo.* entails using the GeoIP function which, in turn, entails uploading the free version of the MaxMind database. (For more information on how to add the MaxMind database to your Stream instance, check Managing Large Lookups in the documentation.)
Once you’ve got that squared away, you can fill out the rest of the GeoIP function so that it looks like this:
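In YAML terms, that works out to roughly the following. Again, the key names are assumptions, and __parsed.clientip is the hypothetical parsed client-IP field from the previous step:

```yaml
# GeoIP function (sketch)
- id: geoip
  conf:
    file: GeoLite2-City.mmdb    # the free MaxMind database uploaded above
    inField: __parsed.clientip  # IP address to look up (hypothetical field name)
    outField: __geoip           # internal field; dropped before the destination
```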
The result will be a field called __geoip, another one of those internal fields that won't get passed along. This sets you up nicely for the final wrap-up of this pipeline.
The other powerhouse function of Cribl Stream is Eval. It's deceptively simple (set this field equal to this value or to that other field), but almost every pipeline uses it for one thing or another.
In the case of this pipeline, several fields need ONLY this function to be created. These are the static fields like ecs.version, service.name, and service.type. Those values aren't in the original event, so I've simply hard-coded them into the Eval statement. In a real-life scenario, these values might be brought in via a lookup table keyed off of the source type, input ID, or some other identifier in either the metadata or the data itself.
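As a sketch, those hard-coded entries might look like this in the pipeline's YAML (the name/value layout is an assumption; the values come straight from the mapping table):

```yaml
# Eval function: hard-coded static ECS fields (sketch)
- id: eval
  conf:
    add:
      - name: "'ecs.version'"      # note the single quotes around names and values
        value: "'1.0.0'"
      - name: "'service.name'"
        value: "'Company blog'"
      - name: "'service.type'"
        value: "'apache'"
```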
The other fields are pulled in from the __parsed field, either directly or with moderate modification. For example, http.request.method is turned to lower case.
We further leverage this final Eval to clean out the unwanted fields, including _raw and _time.
Note that single quotes are used around both field names and static values.
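Putting those pieces together, the rename-and-cleanup portion of the Eval might look like this (the __parsed sub-field names remain hypothetical):

```yaml
# Eval function: renames, tweaks, and cleanup (sketch)
- id: eval
  conf:
    add:
      - name: "'source.ip'"
        value: "__parsed.clientip"
      - name: "'http.request.method'"
        value: "__parsed.request_method.toLowerCase()"  # lowercased, per the table
      - name: "'user.name'"
        value: "__parsed.user || '-'"                   # default when the event lacks a user
    remove:
      - _raw     # the original payload now lives in event.original
      - _time    # Stream's internal timestamp; @timestamp carries it now
```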
After all of these modifications, you’ll get an event that looks like the one below. It has everything the ECS is looking for… and nothing it isn’t looking for.
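Here's a sketch of that final event, assembled from the values in the mapping table. The geo values are placeholders (a private 10.x address won't resolve in a real MaxMind lookup), and whether the dotted keys render flat or nested depends on your destination:

```json
{
  "@timestamp": "2018-12-07T11:05:07.000Z",
  "ecs.version": "1.0.0",
  "event.dataset": "apache.access",
  "event.original": "10.42.42.42 - - [07/Dec/2018:11:05:07 +0100] \"GET /blog HTTP/1.1\" 200 2571 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36\"",
  "host.hostname": "webserver-blog-prod",
  "http.request.method": "get",
  "http.response.body.bytes": 2571,
  "http.response.status_code": 200,
  "http.version": "1.1",
  "message": "\"GET /blog HTTP/1.1\" 200 2571",
  "service.name": "Company blog",
  "service.type": "apache",
  "source.geo.city_name": "Example City",
  "source.geo.country_iso_code": "XX",
  "source.ip": "10.42.42.42",
  "url.original": "/blog",
  "user.name": "-",
  "user_agent.original": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36"
}
```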
While this article took you step-by-step through the specific process of converting an Apache Combined log event to an ECS-compatible event, these same steps can apply to virtually any type of event you may want to convert to ECS. Please reach out to us through Cribl Community Slack if you’d like to noodle with your own events and make them compatible with ECS. We can’t wait to hear from you!
The fastest way to get started with Cribl Stream is to sign up at Cribl.Cloud. You can process up to 1 TB of throughput per day at no cost. Sign up and start using Stream within a few minutes.