【Elasticsearch 2.x】issues

Issue #1: mapping conflict for the same field across different types within the same index

Kibana Sense:
POST /index-1/type-1
{
  "age":25
}

GET /index-1/_mapping
{
  "index-1": {
    "mappings": {
      "type-1": {
        "properties": {
          "age": {
            "type": "long"        => age under index-1/type-1 is mapped as long
          }
        }
      }
    }
  }
}

POST /index-1/type-2
{
  "age":"xx"
}

{
   "error": {
      "root_cause": [
         {
            "type": "mapper_parsing_exception",
            "reason": "failed to parse [age]"
         }
      ],
      "type": "mapper_parsing_exception",
      "reason": "failed to parse [age]",
      "caused_by": {
         "type": "number_format_exception",
         "reason": "For input string: \"xx\""
      }
   },
   "status": 400
}
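The failure above follows from an ES 2.x rule: all types within one index share a single mapping per field name. The following is a hypothetical Python sketch of that rule (not Elasticsearch source code; the names are illustrative):

```python
# Sketch of the ES 2.x rule: every type in an index shares one mapping per field.
mappings = {}  # field name -> mapped type, shared by all types in the index

def index_doc(doc):
    for field, value in doc.items():
        inferred = "long" if isinstance(value, int) else "string"
        current = mappings.setdefault(field, inferred)  # first write wins
        if current == "long" and not isinstance(value, int):
            raise ValueError("mapper_parsing_exception: failed to parse [%s]" % field)

index_doc({"age": 25})        # type-1 write: age dynamically mapped as long
try:
    index_doc({"age": "xx"})  # type-2 write: conflicts with the shared mapping
except ValueError as err:
    print(err)
```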

Issue #2: the same field mapped to different types under any type of different indices, causing a Kibana Mapping Conflict

POST /index-21/type-1
{
  "age":"xx",
  "post_date" : "2016-06-03T14:12:12"
}

{
  "index-21": {
    "mappings": {
      "type-1": {
        "properties": {
          "age": {
            "type": "string"       => index-21/type-1/age is mapped as string
          },
          "post_date": {
            "type": "date",
            "format": "strict_date_optional_time||epoch_millis"
          }
        }
      }
    }
  }
}

POST /index-21/type-2
{
  "age":25,
  "post_date" : "2016-06-03T14:12:12"
}

{
  "index-21": {
    "mappings": {
      "type-2": {
        "properties": {
          "age": {
            "type": "long"       => index-21/type-2/age is mapped as long
          },
          "post_date": {
            "type": "date",
            "format": "strict_date_optional_time||epoch_millis"
          }
        }
      }
    }
  }
}

In Kibana --> Settings, using index-2* as the index pattern produces a Mapping Conflict.
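A rough sketch of how such a conflict surfaces (assumed structure, not Kibana's actual code): the mapped type of each field is collected across everything matched by the pattern, and any field resolving to more than one type is flagged.

```python
# Sketch: collect each field's mapped type across all matched index/type pairs;
# a field with more than one distinct type is a Mapping Conflict.
matched = {
    "index-21/type-1": {"age": "string", "post_date": "date"},
    "index-21/type-2": {"age": "long",   "post_date": "date"},
}

field_types = {}
for fields in matched.values():
    for name, ftype in fields.items():
        field_types.setdefault(name, set()).add(ftype)

conflicts = sorted(n for n, types in field_types.items() if len(types) > 1)
print(conflicts)  # only age resolves to two types
```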

Issue #2.1: a concrete case of the same field mapped to different types across indices, and its solution

Our solution for the conflict on the bytes field:

input{
	file{
		path => "/opt/logstash-data/input/logstash-tutorial.log"
	}
}

filter{
	grok{
		match => { "message" => "%{COMBINEDAPACHELOG}" }
	}
}

output{
	stdout{
		codec => rubydebug
	}

	elasticsearch{
		hosts => ["xxx.xxx.xxx.xxx"]
		index => "test-%{+YYYY.MM.dd.HH}"
	}
}

The Logstash configuration above parses Apache access logs and writes them to Elasticsearch, creating one index per hour based on the timestamp.

The document indexed into Elasticsearch looks like this:
{
        "message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
       "@version" => "1",
     "@timestamp" => "2016-06-06T02:28:27.152Z",
           "path" => "/opt/logstash-data/input/logstash-tutorial.log",
           "host" => "xxx.xxx.xxx.xxx",
       "clientip" => "83.149.9.216",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "04/Jan/2015:05:13:42 +0000",
           "verb" => "GET",
        "request" => "/presentations/logstash-monitorama-2013/images/kibana-search.png",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "203023",           ==> note that the bytes field is a string here
       "referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\""
}

Suppose we want the bytes field to be a long; modify the Logstash configuration:
input{
	file{
		path => "/opt/logstash-data/input/logstash-tutorial.log"
	}
}

filter{
	grok{
		match => { "message" => "%{COMBINEDAPACHELOG}" }
	}
        mutate{
		convert => {
			"bytes" => "integer"
		}
	}
}

output{
	stdout{
		codec => rubydebug
	}

	elasticsearch{
		hosts => ["xxx.xxx.xxx.xxx"]
		index => "test-%{+YYYY.MM.dd.HH}"
	}
}

{
        "message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/images/kibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
       "@version" => "1",
     "@timestamp" => "2016-06-06T02:28:27.152Z",
           "path" => "/opt/logstash-data/input/logstash-tutorial.log",
           "host" => "xxx.xxx.xxx.xxx",
       "clientip" => "83.149.9.216",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "04/Jan/2015:05:13:42 +0000",
           "verb" => "GET",
        "request" => "/presentations/logstash-monitorama-2013/images/kibana-search.png",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => 203023,           ==> the bytes field is now a long
       "referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\""
}

Note:
Suppose
the first event is indexed into test-2016.06.06.01,
and the second event also lands in test-2016.06.06.01 => although bytes is now sent as a long, the existing mapping coerces it to string, so there is no conflict.

If instead
the second event lands in test-2016.06.06.02 => bytes will be a long there, and that index's mapping will also be long.

In Kibana, an index pattern of [test]-YYYY.MM.DD.HH will then show a Mapping Conflict.
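The hourly bucketing done by index => "test-%{+YYYY.MM.dd.HH}" can be sketched in Python (the timestamps are made up for illustration): two events a few minutes apart can land in different indices, so a field whose type changed between them ends up mapped differently per index.

```python
from datetime import datetime, timezone

# Sketch of the hourly index naming used by index => "test-%{+YYYY.MM.dd.HH}"
def index_for(ts):
    return ts.strftime("test-%Y.%m.%d.%H")

first  = datetime(2016, 6, 6, 1, 50, tzinfo=timezone.utc)
second = datetime(2016, 6, 6, 2, 10, tzinfo=timezone.utc)
# twenty minutes apart, yet bucketed into two different indices
print(index_for(first), index_for(second))
```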

Solution:
We can standardize bytes as string and convert values of other types to string, or conversely standardize it as long and convert the rest to long.

Assuming we standardize bytes as long, the Logstash configuration can stay unchanged.

We then need to change the bytes field of test-2016.06.06.01 to long.
Elasticsearch does not support modifying the mapping of an existing index in place; the index has to be rebuilt, and without losing any data.

First, create the index test-2016.06.06.01.bak and explicitly write a mapping that makes bytes a long:

PUT /test-2016.06.06.01.bak/
{
  "mappings": {
    "logs": {                ==> the default type is logs; if customized, use the corresponding type
      "properties": {
        "bytes": {
          "type": "long"
        }
      }
    }
  }
}

Use stream2es to copy the data from test-2016.06.06.01 into test-2016.06.06.01.bak:
stream2es es --source http://localhost:9200/test-2016.06.06.01 --target http://localhost:9200/test-2016.06.06.01.bak

Check that the document count is complete.

Delete test-2016.06.06.01:
DELETE /test-2016.06.06.01

Create an alias so that test-2016.06.06.01 points to test-2016.06.06.01.bak:
POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "test-2016.06.06.01.bak",
        "alias": "test-2016.06.06.01"
      }
    }
  ]
}
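The point of the alias step can be sketched as follows (hypothetical dictionaries standing in for the cluster state): after the old index is deleted, anything that still queries the old name resolves through the alias to the rebuilt index.

```python
# Sketch: alias resolution after the rebuild. Only the .bak index exists;
# the old name is now an alias pointing at it.
indices = {"test-2016.06.06.01.bak": {"bytes": "long"}}
aliases = {"test-2016.06.06.01": "test-2016.06.06.01.bak"}

def mapping_for(name):
    # a real cluster resolves aliases transparently; we mimic that lookup
    return indices.get(name) or indices[aliases[name]]

print(mapping_for("test-2016.06.06.01")["bytes"])  # resolves via the alias
```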

Issue #3: the message body is JSON; how do we parse out its key-value pairs?

For example, the message sent is {"k1":1,"k2":"v2"}

[root@hfelkcld0002 conf.d]# cat json.conf
input{
	stdin{
	}
}
output{
	stdout{
		codec => rubydebug
	}
}

Start Logstash and enter {"k1":1,"k2":"v2"} on the console:
{"k1":1,"k2":"v2"}
{
       "message" => "{\"k1\":1,\"k2\":\"v2\"}",        => the JSON is not parsed into fields
      "@version" => "1",
    "@timestamp" => "2016-06-06T04:59:42.792Z",
          "host" => "xxx.xxx.xxx.xxx"
}

Logstash's filter stage provides a json filter:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-json.html

[root@hfelkcld0002 conf.d]# cat json.conf
input{
	stdin{
	}
}
filter{
	json{
		source => "message"
		target => "metrics"
	}
}
output{
	stdout{
		codec => rubydebug
	}
}

{"k1":1,"k2":"v2"}
{
       "message" => "{\"k1\":1,\"k2\":\"v2\"}",
      "@version" => "1",
    "@timestamp" => "2016-06-06T05:35:58.605Z",
          "host" => "xxx.xxx.xxx.xxx",
       "metrics" => {
        "k1" => 1,
        "k2" => "v2"
    }
}

As you can see, the data is actually stored twice, in both the message and metrics fields.
We can instead point target at message, which overwrites the original message:
[root@hfelkcld0002 conf.d]# cat json.conf
input{
	stdin{
	}
}
filter{
	json{
		source => "message"
		target => "message"
	}
}
output{
	stdout{
		codec => rubydebug
	}
}

{"k1":1,"k2":"v2"}
{
       "message" => {
        "k1" => 1,
        "k2" => "v2"
    },
      "@version" => "1",
    "@timestamp" => "2016-06-06T05:40:11.454Z",
          "host" => "xxx.xxx.xxx.xxx"
}