The requirement is an Elasticsearch query that behaves like a MySQL LIKE query: a record whose value is hello中國233 should be found by searching for 中國, and also by searching for llo.
However, Elasticsearch queries operate on tokens. By default, hello中國233 is tokenized into hello, 中, 國, and 233. A search for hello matches the record, but a search for llo does not.
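The mismatch can be reproduced outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch code: `rough_standard_tokens` is a hypothetical helper that only approximates the standard tokenizer for this one example (runs of ASCII letters, runs of digits, single CJK ideographs), and the set lookup stands in for a term-level match.

```python
import re

# Rough stand-in for the standard tokenizer, valid only for inputs like the
# example: runs of ASCII letters, runs of digits, and single CJK ideographs.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+|[0-9]+|[\u4e00-\u9fff]")


def rough_standard_tokens(text):
    """Return the tokens the approximation produces for `text`."""
    return TOKEN_PATTERN.findall(text)


def term_matches(query, text):
    """A query term matches only if it equals a whole indexed token."""
    return query in set(rough_standard_tokens(text))


tokens = rough_standard_tokens("hello中國233")
print(tokens)                                 # ['hello', '中', '國', '233']
print(term_matches("hello", "hello中國233"))  # True
print(term_matches("llo", "hello中國233"))    # False: "llo" is not a whole token
```

Under this simplified model, llo fails because it is only part of the token hello, which is the behavior described above.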
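The tokens are too coarse, so a token-based query cannot match part of a token. The solution is to tokenize the content character by character, that is, to split hello中國233 into h, e, l, l, o, 中, 國, 2, 3, 3.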
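None of Elasticsearch's default analyzers produces this result, but a custom analyzer can: a character filter inserts a space after every character of the string, and the whitespace tokenizer then splits the string into single characters.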
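The two steps can be tried in plain Python before touching the index. This is a sketch that mimics the idea rather than calling Elasticsearch: the pattern `(.+?)` and the "captured character plus a space" replacement are the ones used by the char filter in this post (Python writes the group reference as `\1` instead of `$1`), and `str.split()` stands in for the whitespace tokenizer. The function names are illustrative only.

```python
import re


def split_by_whitespace_filter(text):
    """Mimic the pattern_replace char filter: append a space to every character."""
    # The reluctant group (.+?) matches exactly one character at a time.
    return re.sub(r"(.+?)", r"\1 ", text)


def char_tokens(text):
    """Mimic the whitespace tokenizer applied to the filtered text."""
    return split_by_whitespace_filter(text).split()


filtered = split_by_whitespace_filter("hello中國233")
print(repr(filtered))               # 'h e l l o 中 國 2 3 3 '
print(char_tokens("hello中國233"))  # ['h', 'e', 'l', 'l', 'o', '中', '國', '2', '3', '3']
```

One side effect worth keeping in mind: because the tokenizer splits on whitespace, any spaces in the original value are discarded rather than indexed as tokens.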
First, create an index that uses the default analyzer and index one record:
PUT /like_search
{
  "mappings": {
    "like_search_type": {
      "properties": {
        "name": {
          "type": "text"
        }
      }
    }
  }
}
PUT /like_search/like_search_type/1
{
  "name": "hello中國233"
}
Tokenization result with the default analyzer:
GET /like_search/_analyze
{
  "text": [
    "hello中國233"
  ]
}
Response:
{
  "tokens": [
    {
      "token": "hello",
      "start_offset": 0,
      "end_offset": 5,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "中",
      "start_offset": 5,
      "end_offset": 6,
      "type": "<IDEOGRAPHIC>",
      "position": 1
    },
    {
      "token": "國",
      "start_offset": 6,
      "end_offset": 7,
      "type": "<IDEOGRAPHIC>",
      "position": 2
    },
    {
      "token": "233",
      "start_offset": 7,
      "end_offset": 10,
      "type": "<NUM>",
      "position": 3
    }
  ]
}
Elasticsearch uses the standard analyzer by default. As the query below shows, searching for llo does not find the hello中國233 record:
GET /like_search/_search
{
  "query": {
    "match_phrase": {
      "name": "llo"
    }
  }
}
The index already exists, so delete it, recreate it with the custom analyzer, and index the record again:
DELETE /like_search
PUT /like_search
{
  "settings": {
    "analysis": {
      "analyzer": {
        "char_analyzer": {
          "char_filter": [
            "split_by_whitespace_filter"
          ],
          "tokenizer": "whitespace"
        }
      },
      "char_filter": {
        "split_by_whitespace_filter": {
          "type": "pattern_replace",
          "pattern": "(.+?)",
          "replacement": "$1 "
        }
      }
    }
  },
  "mappings": {
    "like_search_type": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "char_analyzer"
        }
      }
    }
  }
}
PUT /like_search/like_search_type/1
{
  "name": "hello中國233"
}
Tokenization result with the custom analyzer:
GET /like_search/_analyze
{
  "analyzer": "char_analyzer",
  "text": [
    "hello中國233"
  ]
}
Response:
{
  "tokens": [
    {
      "token": "h",
      "start_offset": 0,
      "end_offset": 0,
      "type": "word",
      "position": 0
    },
    {
      "token": "e",
      "start_offset": 1,
      "end_offset": 1,
      "type": "word",
      "position": 1
    },
    {
      "token": "l",
      "start_offset": 2,
      "end_offset": 2,
      "type": "word",
      "position": 2
    },
    {
      "token": "l",
      "start_offset": 3,
      "end_offset": 3,
      "type": "word",
      "position": 3
    },
    {
      "token": "o",
      "start_offset": 4,
      "end_offset": 4,
      "type": "word",
      "position": 4
    },
    {
      "token": "中",
      "start_offset": 5,
      "end_offset": 5,
      "type": "word",
      "position": 5
    },
    {
      "token": "國",
      "start_offset": 6,
      "end_offset": 6,
      "type": "word",
      "position": 6
    },
    {
      "token": "2",
      "start_offset": 7,
      "end_offset": 7,
      "type": "word",
      "position": 7
    },
    {
      "token": "3",
      "start_offset": 8,
      "end_offset": 8,
      "type": "word",
      "position": 8
    },
    {
      "token": "3",
      "start_offset": 9,
      "end_offset": 9,
      "type": "word",
      "position": 9
    }
  ]
}
With the custom analyzer, the same query for llo finds the hello中國233 record:
GET /like_search/_search
{
  "query": {
    "match_phrase": {
      "name": "llo"
    }
  }
}
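Why match_phrase behaves like a substring search once every character is its own token can be modelled in a few lines of Python. This is a simplified sketch, not Elasticsearch's implementation: it assumes the query text is analyzed with the same per-character analyzer as the field, and it treats a phrase match as "the query tokens appear at consecutive positions". The helper names `char_tokens` and `phrase_matches` are hypothetical.

```python
import re


def char_tokens(text):
    """Per-character tokens: a space after each character, then split on whitespace."""
    return re.sub(r"(.+?)", r"\1 ", text).split()


def phrase_matches(query, text, tokenize):
    """True if the query tokens occur as a contiguous run in the text tokens."""
    doc, q = tokenize(text), tokenize(query)
    if not q:
        return False
    return any(doc[i:i + len(q)] == q for i in range(len(doc) - len(q) + 1))


print(phrase_matches("llo", "hello中國233", char_tokens))   # True
print(phrase_matches("中國", "hello中國233", char_tokens))  # True
print(phrase_matches("lo2", "hello中國233", char_tokens))   # False: l, o, 2 are not adjacent
```

The last call illustrates why the post uses match_phrase rather than a plain match query: the characters l, o, and 2 all occur in the record, but not next to each other, so a LIKE-style search should not return it.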