Creating custom processing rules for Fluent Bit with Lua

Written by Anurag Gupta, Erik Bledsoe in How to, Fluent Bit on September 28, 2023

Fluent Bit is a widely used open-source data collection agent, processor, and forwarder. It enables you to collect logs, metrics, and traces from various sources, filter and transform them, and then forward them to multiple destinations. Fluent Bit employs a plugin architecture for creating integrations with data sources and destinations as well as filters for in-stream data processing.

Illustration of a Fluent Bit pipeline showing the role of the various plugins it uses

Although there are dozens of supported plugins, there may be times when no out-of-the-box plugin will accomplish the exact processing you need. You may need, for example, to apply some complex business logic to the data before routing and storing for analysis. Or you may need to enrich the data with some sort of computation. Thankfully, for these situations, the official Lua filter plugin for Fluent Bit allows users to write custom Lua scripts to process the records flowing through the data pipeline.  

In this post, we’ll provide an overview of the Lua filter plugin and how it functions. We will also provide some working examples that will demonstrate the plugin and, hopefully, will inspire you to create your own custom Lua scripts.

What you’ll need to get started:

  • Familiarity with Fluent Bit concepts such as inputs, outputs, parsers, and filters. If you’re unfamiliar with these concepts, please refer to the official documentation.

  • A running Fluent Bit instance. We will be using a very basic AWS EC2 instance running Debian. Check the documentation if you need help installing Fluent Bit for your OS.

  • Lua installed on the same machine where Fluent Bit is running. Lua comes preinstalled on many flavors of Linux. Check the Lua documentation for help with installation.

What is Lua?

Lua is a lightweight, high-level, multi-paradigm scripting language designed primarily for embedded use in applications. Its simple syntax is easy to pick up for developers who already know a language such as Python. It is widely used as an extension language, including in apps such as:

  • Roblox

  • World of Warcraft

  • Adobe Photoshop Lightroom

  • Redis

Lua’s high performance, small footprint, and built-in pattern-matching library make it well suited for scripting extensions for Fluent Bit plugins. As we will see, the built-in pattern matching enables us to use Lua for powerful parsing and transformation of records, and the Lua scripts can be much less resource-intensive than complex regular expressions.
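
To make the pattern-matching point concrete, here is a minimal sketch that pulls fields out of a single access-log line using only Lua's built-in string.match. The function name parse_access_line, the sample line, and the assumption that lines follow the common Apache layout are ours, for illustration only.

```lua
-- Pattern for a common-log-style line. The layout is an assumption;
-- adjust the pattern to match your own log format.
local ACCESS_PATTERN =
    '^(%S+) %S+ (%S+) %[[^%]]+%] "(%u+) (%S+)[^"]*" (%d+) (%S+)'

-- Hypothetical helper: return a table of fields, or nil if the line
-- does not have the expected shape.
function parse_access_line(line)
    local host, user, method, path, code, size = line:match(ACCESS_PATTERN)
    if not host then
        return nil
    end
    return {
        host = host, user = user, method = method,
        path = path, code = code, size = size,
    }
end

-- Made-up sample line, modeled on the kind of data used later in this post.
local sample = '97.155.7.33 - - [02/Feb/2022:03:13:26 +0000] '
    .. '"DELETE /explore HTTP/1.0" 200 4981'
local rec = parse_access_line(sample)
print(rec.host, rec.method, rec.path, rec.code, rec.size)
```

Each pair of parentheses in the pattern is a capture, so a single call to string.match returns all six fields at once. Note that every captured value is a string, including the status code and size.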

Configuring the Lua filter plugin for Fluent Bit

To invoke the Lua filter plugin, you must define it in your Fluent Bit configuration file. It requires four parameters:

  • Name: this will always be lua

  • Match: this defines which records should be processed by the filter

  • script or code: these two parameters tell Fluent Bit where to find the Lua script to execute. The script parameter gives the path and filename of an external file containing the script. The code parameter holds the Lua script inline, directly in the configuration file.

  • call: the name of a function defined in the script that should be executed. Only one function can be called from the filter configuration, although that function may call other functions in the script. You could also configure multiple Lua plugins in the configuration file, each calling a different script if needed.

The plugin accepts other parameters as well, but these four are required.

A sample configuration could look like this:

[FILTER]
    name lua
    match *
    script /path/to/your/script/my-script.lua
    call cb_filter

In the above, the match value of * indicates that all records should be processed by this filter. The script parameter points us to the location of the file. Finally, the call parameter identifies that the function to be executed is named cb_filter.

If we wanted to utilize inline scripting rather than an external file, the configuration might look like this:

[FILTER]
    Name    lua
    Match   *
    code    function inline_filter(tag, timestamp, record) record.tag = tag; return 1, timestamp, record end
    call    inline_filter

Understanding the Lua filter plugin

The Lua function takes three arguments, which are automatically supplied by Fluent Bit every time it calls a function on a matching record. The three arguments are:

  • tag: the name of the tag associated with the incoming record

  • timestamp: the timestamp associated with the incoming record, formatted as an epoch timestamp with nanosecond resolution. If the record contains an identifiable timestamp, Fluent Bit will utilize that as the timestamp. If the record does not contain a timestamp, or if Fluent Bit cannot identify the timestamp because the record is unstructured, it will generate a timestamp based on when Fluent Bit received the record. 

  • record: the record itself, formatted as a Lua table
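
As a rough sketch of working with these arguments, the filter below turns the timestamp into a readable UTC string and copies the tag into the record. It assumes the timestamp arrives as a floating-point number of seconds, which is our understanding of the plugin's default behavior. The names split_timestamp, add_iso_time, and time_utc are hypothetical.

```lua
-- Split a floating-point epoch timestamp into whole seconds and
-- nanoseconds. A float cannot hold full nanosecond precision for
-- present-day epochs, so treat nsec as approximate.
function split_timestamp(timestamp)
    local sec = math.floor(timestamp)
    local nsec = math.floor((timestamp - sec) * 1e9 + 0.5)
    return sec, nsec
end

-- Hypothetical filter: add a human-readable UTC time and the tag.
function add_iso_time(tag, timestamp, record)
    local sec = split_timestamp(timestamp)
    -- The leading "!" asks os.date to format in UTC instead of local time.
    record.time_utc = os.date("!%Y-%m-%dT%H:%M:%SZ", sec)
    record.tag = tag
    return 1, timestamp, record
end

local _, _, rec = add_iso_time("apache.access", 1643771334.5, {})
print(rec.time_utc)  -- prints 2022-02-02T03:08:54Z
```

Because the record is an ordinary Lua table, adding a field is just an assignment.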

The Lua function must then return three values:

  • code: the code provides instructions for Fluent Bit about how to process the record being returned. There are four possible values:

    Code   Description
    -1     The record will be dropped from the pipeline; no additional filters will be applied and it will not be routed to any output; this is useful for disposing of unnecessary or noisy data, resulting in storage savings.
    0      The record should not be modified by the Lua filter; the original record initially passed to the Lua function should continue through the pipeline, including through any additional filters, and be routed to the appropriate endpoint(s) as defined in the configuration file; this is useful when not all data needs to be processed.
    1      The original record and its timestamp should be replaced by the timestamp and record values returned by the Lua function; this is useful when strict auditing of all transformations is required.
    2      The original record has been changed and should be replaced with the returned record, but the record timestamp should remain the same as originally passed.

  • timestamp: the timestamp that should be applied to the record being returned; this will only take place if the returned code value is 1

  • record: the original record passed to the function as transformed (or not) by the Lua script
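
The four return codes can be sketched in a single function. The field names method and code match the Apache2 parser output used later in this post, but the rules themselves (dropping OPTIONS requests, flagging 500s) are invented purely to show each code in action.

```lua
-- Hypothetical filter that exercises every return code.
function code_demo_filter(tag, timestamp, record)
    if record.method == "OPTIONS" then
        -- -1: drop the record; the other two values are ignored.
        return -1, 0, 0
    end
    if record.code == nil then
        -- 0: leave the original record exactly as it was.
        return 0, timestamp, record
    end
    if record.code == "500" then
        record.alert = true
        -- 1: replace both the record and its timestamp.
        return 1, os.time(), record
    end
    record.checked = true
    -- 2: replace the record but keep the original timestamp.
    return 2, timestamp, record
end
```

When the timestamp is not being changed, codes 1 and 2 behave the same as long as the function hands back the timestamp it was given, which is what the examples in this post do.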

As you might imagine, the flexibility of Lua greatly expands the possibilities for processing your streaming data. We could, for example, compare the IP address contained in a record to a list of known IP addresses and then drop or tag the records in a particular manner. This would enable us to drop records when Googlebot indexes our website and tag internal traffic for routing to a different destination than external traffic. Or we could identify and mask sensitive data contained within the record.
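
A minimal sketch of those ideas might look like the following. The internal address prefix, the user-agent marker, the traffic field, and the function name traffic_filter are all assumptions made for illustration. A real deployment would more likely check addresses against a maintained list.

```lua
-- Assumed values for illustration only.
local INTERNAL_PREFIX = "10."
local BOT_MARKER = "Googlebot"

-- Hypothetical filter: drop crawler hits, tag internal vs. external
-- traffic, and mask the last octet of the client IP address.
function traffic_filter(tag, timestamp, record)
    local agent = record.agent or ""
    -- The fourth argument (true) requests a plain-text search.
    if agent:find(BOT_MARKER, 1, true) then
        return -1, 0, 0
    end

    local host = record.host or ""
    if host:sub(1, #INTERNAL_PREFIX) == INTERNAL_PREFIX then
        record.traffic = "internal"
    else
        record.traffic = "external"
    end

    -- Replace the final octet of an IPv4 address with "xxx".
    -- gsub also returns a match count, which the assignment discards.
    record.host = host:gsub("^(%d+%.%d+%.%d+)%.%d+$", "%1.xxx")
    return 2, timestamp, record
end
```

The traffic field added here is the kind of value a later filter could use to route records to different destinations.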

Now that we have an understanding of how the plugin works, let’s dive in and start creating some Lua script filters.

Example Lua filters

For the purposes of this post, I’ll be using some sample Apache2 access log data as my input. You can download the data here if you would like to use it to follow along with the examples. 

Before adding the Lua filter, our Fluent Bit config looks like this:

[SERVICE]
    parsers_file /fluent-bit/etc/parsers.conf

[INPUT]
    name tail
    path /var/log/access.log
    read_from_head true
    parser apache2

[OUTPUT]
    name stdout
    format json
    match *

We are loading the standard Fluent Bit parsers. Then we use our sample log data file as input, beginning at the first line (head), and run it through the included Apache2 parser, which structures each line into key-value fields with a timestamp. Finally, we send the output to stdout as JSON so that we can see it.

We can then start Fluent Bit with this command:

/fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf | jq

Note that we are piping our output through jq to make it more readable. 

The output should look something like this:

…
  {
    "date": 1643771334,
    "host": "182.165.233.130",
    "user": "-",
    "method": "PUT",
    "path": "/explore",
    "code": "200",
    "size": "4953",
    "referer": "https://fields.com/list/wp-content/main/faq/",
    "agent": "Mozilla/5.0 (Windows NT 5.1; fy-DE; rv:1.9.2.20) Gecko/2016-03-08 11:51:37 Firefox/3.8"
  },
  {
    "date": 1643771606,
    "host": "97.155.7.33",
    "user": "-",
    "method": "DELETE",
    "path": "/explore",
    "code": "200",
    "size": "4981",
    "referer": "http://skinner-stanley.info/list/faq.htm",
    "agent": "Mozilla/5.0 (iPad; CPU iPad OS 10_3_4 like Mac OS X) AppleWebKit/533.2 (KHTML, like Gecko) FxiOS/11.8c5049.0 Mobile/32P755 Safari/533.2"
  },
  {
    "date": 1643771902,
    "host": "73.137.57.176",
    "user": "-",
    "method": "GET",
    "path": "/search/tag/list",
    "code": "301",
    "size": "5103",
    "referer": "http://www.warner-kramer.info/",
    "agent": "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 6.2; Trident/3.0)"
  }
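
One detail worth noticing in this output is that code and size are strings rather than numbers. If numeric values are needed downstream, a Lua filter could convert them. The sketch below is one possible approach; the function name numeric_filter and the choice of keys are our own.

```lua
-- Hypothetical filter: convert numeric-looking string fields to numbers.
function numeric_filter(tag, timestamp, record)
    for _, key in ipairs({ "code", "size" }) do
        -- tonumber returns nil for missing or non-numeric values such
        -- as "-", in which case the field is left untouched.
        local n = tonumber(record[key])
        if n then
            record[key] = n
        end
    end
    return 2, timestamp, record
end

local _, _, rec = numeric_filter("apache.access", 1643771334,
                                 { code = "200", size = "4953", user = "-" })
print(type(rec.code), type(rec.size), type(rec.user))
```

Converting the fields in one place keeps later comparisons, such as checking for codes above a threshold, from depending on string matching.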

Example: Saying hello

Now let’s create our first Lua script. We’ll start by simply enriching the record with a new key-value pair. The Lua script looks like this:

function hi_filter(tag, timestamp, record)
    record.hello = "Hello world"
    return 1, timestamp, record
end

Since our function is short, we will include it inline in our Fluent Bit configuration, which now looks like this: 

[SERVICE]
    parsers_file /fluent-bit/etc/parsers.conf

[INPUT]
    name tail
    path /var/log/access.log
    read_from_head true
    parser apache2

[FILTER]
    name lua
    match *
    code function hi_filter(tag, timestamp, record) record.hello = "Hello world"; return 1, timestamp, record end
    call hi_filter

[OUTPUT]
    name stdout
    format json
    match *

We then request that Fluent Bit reprocess our sample data:

/fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf | jq

The last few records of our output should look like this:

{
  "date": 1643771606,
  "method": "DELETE",
  "code": "200",
  "path": "/explore",
  "size": "4981",
  "referer": "http://skinner-stanley.info/list/faq.htm",
  "agent": "Mozilla/5.0 (iPad; CPU iPad OS 10_3_4 like Mac OS X) AppleWebKit/533.2 (KHTML, like Gecko) FxiOS/11.8c5049.0 Mobile/32P755 Safari/533.2",
  "user": "-",
  "host": "97.155.7.33",
  "hello": "Hello world"
},
{
  "date": 1643771902,
  "method": "GET",
  "code": "301",
  "path": "/search/tag/list",
  "size": "5103",
  "referer": "http://www.warner-kramer.info/",
  "agent": "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 6.2; Trident/3.0)",
  "user": "-",
  "host": "73.137.57.176",
  "hello": "Hello world"
}

Because our function returned a value of 1 for the code (return 1, timestamp, record), the modifications that our script made to the record were returned, and the modified record continues down the pipeline. If we had returned -1, all of the records would have been dropped, while a value of 0 would have discarded the changes and passed the original records along unmodified.
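
Since a filter is an ordinary Lua function, it can be exercised with the standalone Lua interpreter before it is wired into Fluent Bit. The sketch below repeats hi_filter and adds a hypothetical run_filter helper that calls a filter and reports what its return code means. The helper and its wording are ours, not part of Fluent Bit.

```lua
-- The filter from the example above.
function hi_filter(tag, timestamp, record)
    record.hello = "Hello world"
    return 1, timestamp, record
end

-- Hypothetical helper: call a filter with the same three arguments
-- Fluent Bit supplies and describe the returned code.
function run_filter(filter, tag, timestamp, record)
    local code, new_ts, new_record = filter(tag, timestamp, record)
    local meaning = {
        [-1] = "dropped",
        [0]  = "kept unmodified",
        [1]  = "record and timestamp replaced",
        [2]  = "record replaced, timestamp kept",
    }
    return meaning[code] or "unknown code", new_ts, new_record
end

local outcome, ts, rec = run_filter(hi_filter, "apache.access", 1643771606,
                                    { method = "DELETE", code = "200" })
print(outcome, ts, rec.hello)
```

Saving this to a file and running it with the lua command gives quick feedback on a script's logic without restarting Fluent Bit.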

Example: Enriching data with hostname 

Now let’s enrich our data with something a little more useful than “hello world.” 

First, replace the Lua filter in our configuration file with the following:

[FILTER]
    Name    lua
    Match   *
    script /fluent-bit/etc/script-example.lua
    call    enrich_filter

Since the Lua script we will be using is a bit more complex than our original script, we will store it in a separate file and call it using the script parameter.

Now create the file at the path above with this content:

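-- Cache the hostname so the shell command runs only once.
local cached_hostname

-- Return this machine's hostname, looking it up on first use.
local function get_hostname()
    if cached_hostname == nil then
        local handle = io.popen('hostname')
        -- Read all output and strip trailing whitespace (the newline).
        cached_hostname = handle:read('*a'):gsub('%s+$', '')
        handle:close()
    end
    return cached_hostname
end

-- Filter callback: add the hostname to every record.
function enrich_filter(tag, timestamp, record)
    record.hostname = get_hostname()
    return 1, timestamp, record
end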

This Lua script will grab the hostname of our machine and add it to our record, which is much more useful when examining our logs later than just adding a greeting. 

It also demonstrates how even though a single Lua filter can only call one function, that function can call additional functions.

When we again run Fluent Bit and process our sample log data, the last few records should look something like the following:

{
  "date": 1643771334,
  "code": "200",
  "hostname": "fluent-bit-sandbox",
  "size": "4953",
  "agent": "Mozilla/5.0 (Windows NT 5.1; fy-DE; rv:1.9.2.20) Gecko/2016-03-08 11:51:37 Firefox/3.8",
  "path": "/explore",
  "host": "182.165.233.130",
  "method": "PUT",
  "user": "-",
  "referer": "https://fields.com/list/wp-content/main/faq/"
},
{
  "date": 1643771606,
  "code": "200",
  "hostname": "fluent-bit-sandbox",
  "size": "4981",
  "agent": "Mozilla/5.0 (iPad; CPU iPad OS 10_3_4 like Mac OS X) AppleWebKit/533.2 (KHTML, like Gecko) FxiOS/11.8c5049.0 Mobile/32P755 Safari/533.2",
  "path": "/explore",
  "host": "97.155.7.33",
  "method": "DELETE",
  "user": "-",
  "referer": "http://skinner-stanley.info/list/faq.htm"
},
{
  "date": 1643771902,
  "code": "301",
  "hostname": "fluent-bit-sandbox",
  "size": "5103",
  "agent": "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 6.2; Trident/3.0)",
  "path": "/search/tag/list",
  "host": "73.137.57.176",
  "method": "GET",
  "user": "-",
  "referer": "http://www.warner-kramer.info/"
}

Example: Dropping and routing data 

In this example, we will use a Lua function to examine the HTTP status codes in our logs. If the code is 200, we will drop the record. For the remaining records, we will add a new key-value pair that varies depending on the code.

Append the following code to our existing script-example.lua file:

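function route_filter(tag, timestamp, record)
    if record.code == "200" then
        -- Drop successful requests; the other two values are ignored.
        return -1, 0, 0
    elseif record.code == "404" then
        record.route = "team1"
    elseif record.code == "301" then
        record.route = "team2"
    else
        record.route = "team3"
    end
    -- Pass the modified record on with its original timestamp.
    return 1, timestamp, record
end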

Next, modify our Fluent Bit configuration file to add a second Lua filter that calls our new function:

[SERVICE]
    parsers_file /fluent-bit/etc/parsers.conf

[INPUT]
    name tail
    path /var/log/access.log
    read_from_head true
    parser apache2

[FILTER]
    Name    lua
    Match   *
    script /fluent-bit/etc/script-example.lua
    call    route_filter

[FILTER]
    Name    lua
    Match   *
    script /fluent-bit/etc/script-example.lua
    call    enrich_filter

[OUTPUT]
    name stdout
    format json
    match *

Although each instance of the Lua filter can only call one function, there is no problem with having both filters refer to the same file that contains our scripts. Note that rather than add a second Lua filter to our configuration we could also have modified our script so that the route_filter function called the enrich_filter function as the last step before returning its values. 
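
That single-filter alternative could be sketched as follows. To keep the sketch runnable on its own, enrich_filter is replaced by a simplified stand-in with a fixed hostname, and the combined function is given a new, hypothetical name, route_and_enrich_filter.

```lua
-- Simplified stand-in for enrich_filter; the fixed hostname is a
-- placeholder so this sketch runs anywhere.
function enrich_filter(tag, timestamp, record)
    record.hostname = "fluent-bit-sandbox"
    return 1, timestamp, record
end

-- Variant of route_filter that hands surviving records to enrich_filter.
function route_and_enrich_filter(tag, timestamp, record)
    if record.code == "200" then
        return -1, 0, 0
    elseif record.code == "404" then
        record.route = "team1"
    elseif record.code == "301" then
        record.route = "team2"
    else
        record.route = "team3"
    end
    -- The three values returned by enrich_filter become our own
    -- return values.
    return enrich_filter(tag, timestamp, record)
end
```

With this approach the configuration needs only one Lua filter whose call parameter names the combined function. The trade-off is that the two steps can no longer be enabled, disabled, or reordered independently from the configuration file.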

When we rerun Fluent Bit to process our sample data we see that all of the records with code 200 have been dropped and each remaining record has a new route key with a value of either team1, team2, or team3.

With this logic applied, we can then use a series of rewrite_tag filters to route the data to different destinations. We won’t be going through the specifics of how that works in this post, but for an excellent demonstration of the concept watch this webinar by our tech lead for infrastructure. 

Next steps: Learn more about Fluent Bit

As the creators and maintainers of Fluent Bit, Calyptia regularly offers webinars and training sessions for Fluent Bit users. If you enjoyed this post, you may also be interested in our recent Fluent Bit Summer Webinar series, which covered topics including advanced routing and processing and operational best practices. The advanced processing webinar includes additional examples of Lua filters not covered in this post. All of the webinars in the Fluent Bit Summer series are now available on demand.

Much of this post was based on a recent Fluent Bit half-day training course that allowed participants to gain hands-on experience using a sandbox environment. If you would like to be notified of upcoming webinars and training sessions, please subscribe to our Fluent Bit newsletter.
