Example: the regular expression used by the apache2 Parser Plugin
expression /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format %d/%b/%Y:%H:%M:%S %z
Example log line:
192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0"
This incoming event is parsed as:
time: 1362020400 (28/Feb/2013:12:00:00 +0900)
record: { "user" : nil, "method" : "GET", "code" : 200, "size" : 777, "host" : "192.168.0.1", "path" : "/", "referer": nil, "agent" : "Opera/12.0" }
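The parse above can be reproduced outside Fluentd. The following is a minimal sketch, assuming plain Ruby (Regexp and Time.strptime) rather than the plugin itself; the variable names are illustrative only. It applies the same expression and time_format to the sample line.

```ruby
require 'time'

# The apache2 expression from the configuration above, as a Ruby Regexp literal.
apache2 = /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format = '%d/%b/%Y:%H:%M:%S %z'

line = '192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0"'

m = apache2.match(line)
raise 'parse error' if m.nil?   # the expression must match the whole line

# Hash of field name => captured string (every raw capture is a String or nil).
fields = m.named_captures

# Convert the captured time string to a Unix timestamp using time_format.
epoch = Time.strptime(fields['time'], time_format).to_i

puts epoch            # 1362020400
puts fields['host']   # 192.168.0.1
puts fields['path']   # /
puts fields['code']   # 200 (still a String at this point)
```

The raw match returns "200" and "777" as strings and "-" for user and referer, while the record shown above contains integers and nil. That difference suggests the plugin converts these values after matching; the sketch does not attempt to imitate that step.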
The regular expression above is analyzed in detail below. Its general structure is:
expression /^(?<field1>[^ ]*)(?<field2>[^\\]*)\\(?<field3>[^ ]*)$/
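(?<field1>[^ ]*) extracts a field named field1 whose content is a run of consecutive non-space characters.
(?<field2>[^\\]*) extracts a field named field2 whose content is a run of consecutive non-backslash characters. The remaining fields follow the same pattern.
Other literal characters or regular expressions, for example [^\d], can be placed between these fields. The whole expression must match the entire event log line; otherwise a parse error is reported.
The (?: )? structure means that the part of the log matched by the enclosed expression may or may not be present. In the example below, the path field is the run of characters between two spaces (the group also covers the protocol token after it), but this part may be missing from the log line. To avoid a parse error in that case, wrap it in (?: )?:
(?: +(?<path>[^ ]*) +\S*)?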
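Both points can be checked with a short Ruby sketch. The input strings, and the variant with the (?: )? wrapper removed, are made up for illustration and are not part of the plugin.

```ruby
# General structure: field1 = non-space run, field2 = non-backslash run,
# then a literal backslash, then field3 = non-space run.
skeleton = /^(?<field1>[^ ]*)(?<field2>[^\\]*)\\(?<field3>[^ ]*)$/
sample = 'abc def\\ghi'            # the characters: abc def\ghi
s = skeleton.match(sample)
puts s[:field1]                    # abc
puts s[:field2].inspect            # " def"
puts s[:field3]                    # ghi

# Request part of the apache2 expression, isolated to show the optional group.
with_optional = /^"(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?"$/
# Hypothetical variant without the (?: )? wrapper, for comparison only.
without_optional = /^"(?<method>\S+) +(?<path>[^ ]*) +\S*"$/

full  = '"GET / HTTP/1.1"'
short = '"-"'                      # a request with no path or protocol

m_full  = with_optional.match(full)
m_short = with_optional.match(short)

puts m_full[:path]                 # /
puts m_short[:method]              # -
puts m_short[:path].inspect        # nil (group skipped, but the match succeeds)

# Without the wrapper the short request does not match at all,
# which is the situation that would surface as a parse error.
puts without_optional.match(short).inspect   # nil
```

When the optional group is skipped, the named capture inside it is nil instead of causing the whole match to fail.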