Multiline Parsing
In an ideal world, applications would log each message on a single line, but in reality applications generate multiple log lines that sometimes belong to the same context. Processing such information can get complex. Consider application stack traces, which span multiple log lines.
Starting from Calyptia Core Agent v1.8, we have implemented a unified Multiline core functionality to solve all the user corner cases. In this section, you will learn about the features and configuration options available.
Concepts
The Multiline parser engine exposes two ways to configure and use the functionality:
Built-in multiline parser
Configurable multiline parser
Built-in Multiline Parsers
Without any extra configuration, Calyptia Core Agent exposes certain pre-configured parsers (built-in) to solve specific multiline parser cases.
As a Calyptia customer, you can also request additional languages for multiline parsers.
Configurable multiline parsers
Besides the previously listed built-in parsers, you can define your own Multiline parsers, with their own rules, through the configuration files.
A multiline parser is defined in a parsers configuration file by using a [MULTILINE_PARSER] section definition. The Multiline parser must have a unique name and a type, plus other configured properties associated with each type.
To understand which Multiline parser type is required for your use case, you need to know beforehand which conditions in the content determine the beginning of a multiline message and the continuation of subsequent lines. We provide a regex-based configuration that supports states to handle everything from the simplest to the most difficult cases.
Lines and states
Before you start configuring your parser, you need to know the answers to the following questions:
What is the regular expression (regex) that matches the first line of a multiline message?
What are the regular expressions (regex) that match the continuation lines of a multiline message?
When matching a regex, we have to define states. Some states define the start of a multiline message, while others define the continuation of a multiline message. You can have multiple continuation state definitions to solve complex cases.
The state for the first regex, which matches the start of a multiline message, is called start_state. The regexes for continuation lines can use different state names.
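One way to answer the two questions is to try candidate patterns against a few sample lines before writing any rules. The Python sketch below is an illustration only: the two patterns, the sample lines in the usage note, and the helper name classify are assumptions made for this example (Java-style stack traces prefixed with a timestamp), and Python's re module may differ from the agent's regex engine in some syntax details.

```python
import re

# Hypothetical answers to the two questions, for Java-style stack traces.
# First line of a message: starts with a timestamp such as "Dec 14 06:41:08".
START = re.compile(r"^[A-Za-z]+ \d+ \d+:\d+:\d+")
# Continuation line: an indented stack frame such as "    at com.example...".
CONTINUATION = re.compile(r"^\s+at\s")


def classify(line):
    """Report which of the two candidate patterns, if any, matches a line."""
    if START.search(line):
        return "start"
    if CONTINUATION.search(line):
        return "continuation"
    return "unmatched"


for sample in (
    'Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: boom',
    "    at com.example.Main.run(Main.java:22)",
    "single line...",
):
    print(classify(sample), "<-", sample)
```

If a line you expect to belong to a message comes back as unmatched, the corresponding pattern needs adjusting before it goes into a rule.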
Rules definition
A rule specifies how to match a multiline pattern and perform the concatenation. A rule is defined by 3 specific components:
state name
regular expression pattern
next state
A rule might be defined as follows (comments added to simplify the definition):
In the previous example, we have defined two rules. Each one has its own state name, regex pattern, and next state name. Every field that composes a rule must be inside double quotes.
The state name of the first rule must always be start_state, and its regex pattern must match the first line of a multiline message. A next state must also be set to specify what the possible continuation lines look like.
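How the three components of a rule work together can be sketched as a small state machine. The Python model below is an illustration only, not the agent's implementation: the rule table, the patterns, the sample lines, and the names group_multiline and next_state_for are all assumptions made for this sketch, and details such as flush timeouts are left out.

```python
import re

# Hypothetical rule table: (state name, regex pattern, next state).
RULES = [
    ("start_state", r"^[A-Za-z]+ \d+ \d+:\d+:\d+", "cont"),
    ("cont", r"^\s+at\s", "cont"),
]

# Hypothetical input lines: two plain lines around one stack trace.
LINES = [
    "single line...",
    'Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: boom',
    "    at com.example.Main.run(Main.java:22)",
    "    at com.example.Main.main(Main.java:18)",
    "another line...",
]


def next_state_for(compiled, state, line):
    """Return the next state if a rule registered for `state` matches, else None."""
    for regex, next_state in compiled.get(state, []):
        if regex.search(line):
            return next_state
    return None


def group_multiline(lines, rules):
    """Concatenate lines into records by following the rule table."""
    compiled = {}
    for state, pattern, next_state in rules:
        compiled.setdefault(state, []).append((re.compile(pattern), next_state))

    records, buffer, expected = [], [], None
    for line in lines:
        # First, try to extend the message that is currently open.
        if expected is not None:
            following = next_state_for(compiled, expected, line)
            if following is not None:
                buffer.append(line)
                expected = following
                continue
        # The line does not continue the open message, so close that message.
        if buffer:
            records.append("\n".join(buffer))
            buffer = []
        # Then check whether the line starts a new multiline message.
        expected = next_state_for(compiled, "start_state", line)
        if expected is not None:
            buffer.append(line)
        else:
            records.append(line)  # unmatched lines are emitted on their own
    if buffer:
        records.append("\n".join(buffer))
    return records


records = group_multiline(LINES, RULES)
for record in records:
    print(repr(record))
```

In this model the five sample lines become three records: the two plain lines stay separate, and the exception line is joined with its two stack frames.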
To simplify the configuration of regular expressions, you can use the Rubular website. We have posted an example there that uses the regex described above plus a log line that matches the pattern.
Configuration Example
The following example provides a full Calyptia Core Agent configuration file for multiline parsing by using the definition explained above.
The following example files can be located at:
https://github.com/fluent/fluent-bit/tree/master/documentation/examples/multiline/regex-001
Example files content:
This is the primary Calyptia Core Agent configuration file. It includes the parsers_multiline.conf file and tails the file test.log by applying the multiline parser multiline-regex-test. Then it sends the processing to the standard output.
By running Calyptia Core Agent with the given configuration file you will obtain:
The lines that did not match a pattern are not considered as part of the multiline message, while the ones that matched the rules were concatenated properly.
Limitations
The multiline parser is a very powerful feature, but it has some limitations that you should be aware of:
The multiline parser is not affected by the buffer_max_size configuration option, allowing the composed log record to grow beyond this size. Hence, the skip_long_lines option will not be applied to multiline messages.
It is not possible to get the time key from the body of the multiline message. However, it can be extracted and set as a new key by using a filter.
Get structured data from multiline message
Calyptia Core Agent supports the /pat/m option, which allows . to match a newline. This is useful for parsing multiline logs.
The following example extracts date and message from a concatenated log.
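The effect of letting . match a newline can be illustrated outside the agent. The Python sketch below is an analogy under stated assumptions, not agent configuration: the concatenated record, the pattern, and the helper name extract are made up for this example. In Python the behavior described for the m option corresponds to the re.DOTALL flag, and named groups are written (?P<name>...).

```python
import re

# Hypothetical concatenated record, as a multiline parser might emit it.
RECORD = (
    'Dec 14 06:41:08 Exception in thread "main" java.lang.RuntimeException: boom\n'
    "    at com.example.Main.run(Main.java:22)\n"
    "    at com.example.Main.main(Main.java:18)"
)

# Named groups capture the timestamp and everything after it.
PATTERN = r"^(?P<date>[A-Za-z]+ \d+ \d+:\d+:\d+) (?P<message>.*)"


def extract(record, flags=0):
    """Return the named captures as a dict, or None if the pattern does not match."""
    match = re.match(PATTERN, record, flags)
    return match.groupdict() if match else None


without_flag = extract(RECORD)          # "." stops at the first newline
with_flag = extract(RECORD, re.DOTALL)  # "." also matches newlines

print(repr(without_flag["message"]))
print(repr(with_flag["message"]))
```

Without the flag, message ends at the first line break and the stack frames are lost. With it, message holds the whole concatenated body.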
Example files content:
This is the primary Calyptia Core Agent configuration file. It includes the parsers_multiline.conf file and tails the file test.log by applying the multiline parser multiline-regex-test. It also parses the concatenated log by applying the parser named-capture-test. Then it sends the processing to the standard output.
By running Calyptia Core Agent with the given configuration file you will obtain: