The regex parser lets you define a custom Ruby regular expression that uses named captures to specify which content belongs to which key name.
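As a minimal sketch of how named captures map content to key names, the following Ruby snippet matches a made-up log line. The pattern, the group names (level, message), and the input are hypothetical and chosen only for illustration. They show the regex feature itself, not Fluent Bit's internal implementation.

```ruby
# Hypothetical pattern: each named group becomes a key in the resulting record.
pattern = /^(?<level>[A-Z]+) (?<message>.*)$/

# Hypothetical input line.
match = pattern.match("ERROR disk full")

# named_captures returns a Hash keyed by group name.
record = match.named_captures
puts record.inspect
```

Here the text matched by each group is stored under the group's name, so the record has a level key and a message key.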
Security Warning: Onigmo is a backtracking regex engine. Avoid expensive regex patterns, or Onigmo can take a very long time to perform pattern matching. For details, read the article "ReDoS" on OWASP.
Note: explaining how regular expressions work is outside the scope of this content.
From a configuration perspective, when the format is set to regex, a Regex configuration key is mandatory.
The regex parser supports the following configuration parameters.
The following parser configuration example aims to provide rules that can be applied to an Apache HTTP Server log entry:
Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
Time_Format %d/%b/%Y:%H:%M:%S %z
Types code:integer size:integer
As an example, take the following Apache HTTP Server log entry:
192.168.2.20 - - [29/Jul/2015:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
The above content does not provide a defined structure for Calyptia Fluent Bit, but enabling the proper parser produces a structured representation of it:
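To illustrate what the named groups capture, the following Ruby sketch applies the Regex value from the configuration above to the sample log entry. Ruby is used only because it shares the Onigmo regex syntax. The integer conversion and time parsing are rough emulations of the Types and Time_Format settings, not Fluent Bit's actual code, and the exact record that Fluent Bit emits may differ (for example, in how unmatched optional groups are represented).

```ruby
require "time"

# Regex value copied from the parser configuration above.
APACHE_REGEX = %r{^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$}

# Time_Format value copied from the parser configuration above.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

line = '192.168.2.20 - - [29/Jul/2015:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395'

match = APACHE_REGEX.match(line)
record = match.named_captures

# Assumption: emulate "Types code:integer size:integer" with a plain conversion.
%w[code size].each { |key| record[key] = Integer(record[key], 10) }

# Assumption: emulate Time_Format by parsing the captured time string.
timestamp = Time.strptime(record["time"], TIME_FORMAT)

puts timestamp.utc.iso8601
record.each { |key, value| puts "#{key}: #{value.inspect}" }
```

In this sketch the optional referer and agent groups do not match the sample entry, so Ruby reports them as nil.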
A common pitfall is that group names can contain only letters, numbers, and underscores. For example, a group name like
(?<user-name>.*) will cause an error because it contains an invalid character (-).