Log Normalization and special characters
When trying to normalize log messages via liblognorm and mmnormalize, you need to create a rulebase first. The rulebase is usually a representation of message formats.
Due to the format of these rules, it is necessary to be cautious. Some messages and rule necessities could possibly cause confusion to the configuration interpreter. This mainly applies to clear text passages in single rules.
For example, if you have a log message from a Cisco ASA, the message looks like this:
2012-11-23T10:47:42+01:00 10.10.10.10 : %ASA-3-313001: ...
The only interesting parts are the IP and the numerical code to identify the message. We are not interested in the timestamp or “%ASA”. But when making the rule, the trouble starts there. The percent character is also used to define variables and their values in a rule. Thus it needs to be escaped. This is done with the ASCII code representation of the percent character. The rule would look like this:
rule=: %date:word% %host:ipv4% : \x25ASA-%char1:char-to:-%-%char2:number%: ...
If you write “%ASA” into the rule, the interpreter will think, that a new variable starts there. This will cause confusion to the rest of the rule and render it not working correctly. This needs to be avoided.
The same applies to “:”. But this time, it needs to be escaped when using it as delimiter vor variables. Example:
%variable:char-to:\x3a%
This will fill “variable” with everything until the next “:” occurs. If you just put a “:” here as a delimiter, the rule will not work anymore.