Concatenate Multiline or Stack trace log messages. Available on Fluent Bit >= v1.8.2.
The Multiline Filter helps to concatenate messages that originally belong to one context but were split across multiple records or log lines. Common examples are stack traces or applications that print logs in multiple lines.
As part of the built-in functionality, without major configuration effort, you can enable one of ours built-in parsers with auto detection and multi format support:
The usage of this filter depends on a previous configuration of a Multiline Parser definition.
If you wish to concatenate messages read from a log file, it is highly recommended to use the multiline support in the Tail plugin itself. This is because performing concatenation while reading the log file is more performant. Concatenating messages originally split by Docker or CRI container engines, is supported in the Tail plugin.
This filter only performs buffering that persists across different Chunks when Buffer is enabled. Otherwise, the filter will process one Chunk at a time and is not suitable for most inputs which might send multiline messages in separate chunks.
When buffering is enabled, the filter does not immediately emit messages it receives. It uses the in_emitter plugin, same as the Rewrite Tag Filter, and emits messages once they are fully concatenated, or a timeout is reached.
Since concatenated records are re-emitted to the head of the Fluent Bit log pipeline, you can not configure multiple multiline filter definitions that match the same tags. This will cause an infinite loop in the Fluent Bit pipeline; to use multiple parsers on the same logs, configure a single filter definitions with a comma separated list of parsers for multiline.parser. For more, see issue #5235.
Secondly, for the same reason, the multiline filter should be the first filter. Logs will be re-emitted by the multiline filter to the head of the pipeline- the filter will ignore its own re-emitted records, but other filters won't. If there are filters before the multiline filter, they will be applied twice.
Configuration Parameters
The plugin supports the following configuration parameters:
Configuration Example
The following example aims to parse a log file called test.log that contains some full lines, a custom Java stacktrace and a Go stacktrace.
This is the primary Fluent Bit configuration file. It includes the parsers_multiline.conf and tails the file test.log by applying the multiline parsers multiline-regex-test and go. Then it sends the processing to the standard output.
[SERVICE]
flush 1
log_level info
parsers_file parsers_multiline.conf
[INPUT]
name tail
path test.log
read_from_head true
[FILTER]
name multiline
match *
multiline.key_content log
multiline.parser go, multiline-regex-test
[OUTPUT]
name stdout
match *
This second file defines a multiline parser for the example. Note that a second multiline parser called go is used in fluent-bit.conf, but this one is a built-in parser.
[MULTILINE_PARSER]
name multiline-regex-test
type regex
flush_timeout 1000
#
# Regex rules for multiline parsing
# ---------------------------------
#
# configuration hints:
#
# - first state always has the name: start_state
# - every field in the rule must be inside double quotes
#
# rules | state name | regex pattern | next state
# ------|---------------|--------------------------------------------
rule "start_state" "/(Dec \d+ \d+\:\d+\:\d+)(.*)/" "cont"
rule "cont" "/^\s+at.*/" "cont"
An example file with multiline and multiformat content:
The lines that did not match a pattern are not considered as part of the multiline message, while the ones that matched the rules were concatenated properly.
Docker Partial Message Use Case
When Fluent Bit is consuming logs from a container runtime, such as docker, these logs will be split above a certain limit, usually 16KB. If your application emits a 100K log line, it will be split into 7 partial messages. If you are using the Fluentd Docker Log Driver to send the logs to Fluent Bit, they might look like this:
Fluent Bit can re-combine these logs that were split by the runtime and remove the partial message fields. The filter example below is for this use case.
[FILTER]
name multiline
match *
multiline.key_content log
mode partial_message
The two options for mode are mutually exclusive in the filter. If you set the mode to partial_message then the multiline.parser option is not allowed.
Specify one or multiple Multiline Parser definitions to apply to the content. You can specify multiple multiline parsers to detect different formats by separating them with a comma.
multiline.key_content
Key name that holds the content to process. Note that a Multiline Parser definition can already specify the key_content to use, but this option allows to overwrite that value for the purpose of the filter.
mode
Mode can be parser for regex concat, or partial_message to concat split docker logs.
buffer
Enable buffered mode. In buffered mode, the filter can concatenate multilines from inputs that ingest records one by one (ex: Forward), rather than in chunks, re-emitting them into the beggining of the pipeline (with the same tag) using the in_emitter instance. With buffer off, this filter will not work with most inputs, except tail.
flush_ms
Flush time for pending multiline records. Defaults to 2000.
emitter_name
Name for the emitter input instance which re-emits the completed records at the beginning of the pipeline.
emitter_storage.type
The storage type for the emitter input instance. This option supports the values memory (default) and filesystem.
emitter_mem_buf_limit
Set a limit on the amount of memory the emitter can consume if the outputs provide backpressure. The default for this limit is 10M. The pipeline will pause once the buffer exceeds the value of this setting. For example, if the value is set to 10M then the pipeline will pause if the buffer exceeds 10M. The pipeline will remain paused until the output drains the buffer below the 10M limit.