Logs, OpenMetrics, and OpenTelemetry All-in-One

The following post is based upon a presentation at KubeCon Europe 2022. To view the entire presentation, scroll down to the embedded video.

One of the core components of observability is moving data: gathering mountains of logs, metrics, and traces from a wide variety of systems and routing them to a host of other systems where they can be analyzed and stored.

This has always been a challenge, but as IT becomes more dynamic, distributed, ephemeral, and ubiquitous, more and more data is coming from more and more places. The challenge is complex, but solving it is mission-critical. Without immediate insight into your systems’ performance and near real-time observability, your operations could suffer.

Fluent Solves the Problem

The Fluent ecosystem was created to meet that challenge. Both Fluentd and Fluent Bit are Cloud Native Computing Foundation (CNCF) graduated projects originally created to aggregate and forward logs. They can complement each other or be used as standalone solutions.

Fluentd and Fluent Bit were never intended to make you rip out and replace components of your infrastructure. Your infrastructure is your business. The Fluent ecosystem is open source and vendor agnostic. Do you want to send data to Splunk? Prometheus? S3? All of those? Fine. The Fluent ecosystem will do that for you. Do you use vanilla Kubernetes or OpenShift? Fluent doesn’t care.

Enrich Logs with Fluent Bit

Fluent Bit was originally created to process logs. In addition to gathering log data from anywhere and sending it anywhere, you can enrich or even redact that data before it reaches its destination. The most common use case for enrichment is Kubernetes logs. The Kubernetes filter in Fluent Bit will automatically enrich the container logs with:

Pod Name
Namespace
Container Name
Container ID

You can also query the Kubernetes API Server to obtain extra metadata for the pod such as pod ID, labels, and annotations. These enriched logs make debugging and troubleshooting much, much faster.
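As a rough illustration of what this kind of enrichment looks like, the sketch below derives the four fields listed above from a container log file path and attaches them to a record. It is not Fluent Bit's implementation. It assumes the file-name layout commonly found under /var/log/containers on a Kubernetes node, and the function name, key names, and sample path are hypothetical.

```python
import re

# Assumption: container log files are named
#   <pod>_<namespace>_<container>-<64-hex-char container id>.log
LOG_NAME = re.compile(
    r"(?P<pod_name>[^_/]+)_(?P<namespace_name>[^_/]+)_"
    r"(?P<container_name>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)


def enrich(record: dict, log_path: str) -> dict:
    """Return a copy of record with a 'kubernetes' key derived from log_path."""
    match = LOG_NAME.search(log_path)
    if match is None:
        return dict(record)  # not a container log: pass through unchanged
    return {**record, "kubernetes": match.groupdict()}


# Hypothetical sample path; the container id is 64 hex characters.
path = "/var/log/containers/web-7d4b9_shop_nginx-" + "ab12" * 16 + ".log"
enriched = enrich({"log": "GET /cart 200"}, path)
print(enriched["kubernetes"]["pod_name"])
```

Labels and annotations cannot be recovered from the file name, which is why that extra metadata has to come from the Kubernetes API Server.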

Filter Data for Privacy and to Reduce Costs

Conversely, there are situations where you may need to remove some data from the logs before sending them to a destination for analysis. For example, personally identifiable information (PII) could be redacted to comply with privacy laws.
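A minimal sketch of the redaction idea, assuming the PII in question is email addresses. The pattern is deliberately simple and will not catch every valid address. The function name and placeholder are hypothetical, and this only illustrates the concept rather than how Fluent Bit performs redaction.

```python
import re

# Deliberately simple email pattern; real PII detection needs more care.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(line: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every email address in line with placeholder."""
    return EMAIL.sub(placeholder, line)


print(redact("login failed for jane.doe@example.com from 10.0.0.7"))
```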

Another common use case is reducing data volume. We often see folks who gather petabytes of data and don’t want to pay to ingest unnecessary data into analytics applications, or who want to send some of it to less expensive, less frequently accessed storage.

View the presentation from KubeCon Europe

Fluent Bit, Prometheus, and OpenMetrics

A while back, we noticed that a lot of folks were gathering logs and then extracting metrics from those logs. Often they were writing gigantic Lua scripts to pull small decimal values out of log lines and add them up, then hacking the results together with Node Exporter or something else to eventually get them into Prometheus, which has become the standard, with OpenMetrics providing the compatibility layer.

Following our Fluent ecosystem philosophy, last year we began working to enable Fluent Bit to support metrics. We added counters, gauges, histograms, and summaries. We created a library called CMetrics; it’s in the project, so you can go and look at it. And then we made sure that all of these metrics were exportable into various formats.

Metrics are also unique in Fluent Bit in that they don’t go through the same pipeline that logs go through. Metrics are treated almost as an independent data type.
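To make the log-to-metric idea in this section concrete, here is a minimal sketch that counts HTTP status codes in access-log lines and renders the result as a counter in the Prometheus text exposition format. It does not use CMetrics or Fluent Bit. The log layout (NGINX "combined" style), the metric name, and the sample lines are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Assumption: the status code follows the quoted request line,
# as in the NGINX "combined" access-log layout.
STATUS = re.compile(r'" (?P<status>\d{3}) ')


def count_statuses(lines):
    """Count log lines per HTTP status code."""
    counts = Counter()
    for line in lines:
        match = STATUS.search(line)
        if match:
            counts[match.group("status")] += 1
    return counts


def to_prometheus_text(counts, name="http_requests_total"):
    """Render the counts as one counter with a 'status' label."""
    out = [f"# TYPE {name} counter"]
    for status in sorted(counts):
        out.append(f'{name}{{status="{status}"}} {counts[status]}')
    return "\n".join(out) + "\n"


# Hypothetical sample lines.
sample = [
    '10.0.0.1 - - [17/May/2022:10:00:00 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.81"',
    '10.0.0.2 - - [17/May/2022:10:00:01 +0000] "GET /x HTTP/1.1" 404 153 "-" "curl/7.81"',
    '10.0.0.1 - - [17/May/2022:10:00:02 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.81"',
]
print(to_prometheus_text(count_statuses(sample)), end="")
```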

Fluent Bit and OpenTelemetry

OpenTelemetry is coming like a giant wave, so this year we have spent a lot of effort to ensure that Fluent Bit can ride that wave and be compatible. The first step we took was to add support for OpenTelemetry metrics, so that if you’re using the OpenTelemetry metrics SDK, we’ll be able to collect that data and send it out. Currently, we are limited to HTTP inputs and outputs, but we have plans for extending that.

Fluent Bit and Traces

Traces are a focus of the Fluent Bit roadmap for 2022.

As with logs and metrics, we treat traces as another independent data source with its own pipeline. The challenge, then, is how to make meaningful correlations or interactions between these three types of data. With logs, for instance, we have the capability of string processing. You can write SQL today on top of your logs. So you could take your NGINX logs and write a query that groups them by error code.
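The group-by-error-code query mentioned above can be sketched with Python's built-in sqlite3 module standing in for a SQL engine. This is only an illustration of the query shape: Fluent Bit runs SQL over streaming records with its own dialect, whereas this example loads a handful of hypothetical, already-parsed NGINX records into an in-memory table. The table and column names are assumptions.

```python
import sqlite3

# Hypothetical parsed NGINX records.
records = [
    {"path": "/", "code": 200},
    {"path": "/cart", "code": 500},
    {"path": "/missing", "code": 404},
    {"path": "/cart", "code": 500},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nginx (path TEXT, code INTEGER)")
conn.executemany("INSERT INTO nginx VALUES (:path, :code)", records)

# Count error responses (4xx and 5xx) per status code.
errors_by_code = conn.execute(
    "SELECT code, COUNT(*) FROM nginx WHERE code >= 400 "
    "GROUP BY code ORDER BY code"
).fetchall()
conn.close()

print(errors_by_code)
```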

Similarly, we’re trying to bring some of that logic over to traces and to metrics. The goal is to have this available in Q4 of this year. That is also the timeline for extending the current OpenTelemetry input/output options.

Next Steps for Your Observability Pipeline

Calyptia, the creator and maintainer of Fluent Bit, offers Fluent Bit-based services and products including:

Calyptia for Fluent Bit — a Long Term Support edition of Fluent Bit for enterprises that require predictable upgrades and an SLA
Calyptia Core — a Kubernetes solution that simplifies data collection, aggregation, and routing at scale

To learn more about how Calyptia can help you get the most out of your observability pipeline, contact us.