ES Series 1: Installing ES 6.3.1 on CentOS 7 and Integrating the IK Analyzer

Elasticsearch 6.3.1 download URL:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.1.tar.gz

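2. Installation and configuration

1. Copy

Copy the archive to the server and extract it: tar -xvzf elasticsearch-6.3.1.tar.gz. Path after extraction: /home/elasticsearch-6.3.1

3. Create a user

Create the user, create the esdata directories, and grant ownership: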

[root@bogon home]# adduser esuser
[root@bogon home]# cd /home
[root@bogon home]# mkdir -p esdata/data
[root@bogon home]# mkdir -p esdata/log
[root@bogon home]# chown -R esuser elasticsearch-6.3.1 
[root@bogon home]# chown -R esuser esdata

4. Configure the ES node

[root@bogon esdata]# cat /home/elasticsearch-6.3.1/config/elasticsearch.yml
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /home/esdata/data
#
# Path to log files:
#
path.logs: /home/esdata/log
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
# Address to bind to; 0.0.0.0 means clients from any IP can connect
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
# Externally exposed HTTP port
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# IPs of the other cluster nodes; with a single node, use this machine's IP
discovery.zen.ping.unicast.hosts: ["host1", "host2"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes:
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
# Number of nodes in the cluster
gateway.recover_after_nodes: 1
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
action.destructive_requires_name: true

3. Configure system parameters

[root@bogon bin]# vim /etc/security/limits.conf   (append the following at the end of the file)
esuser hard nofile 65536
esuser soft nofile 65536
esuser soft memlock unlimited
esuser hard memlock unlimited

The settings above resolve these errors:

max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
memory locking requested for elasticsearch process but memory is not locked
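
To see whether the new limits are actually in effect for a session, one option is to read them from inside a process started by esuser. The sketch below assumes Linux and uses Python's standard resource module; the helper names and the REQUIRED_NOFILE constant are illustrative, with the threshold taken from the error message above.

```python
import resource

REQUIRED_NOFILE = 65536  # minimum quoted in the bootstrap check message


def limit_ok(value: int, required: int) -> bool:
    """True if a limit is unlimited or at least `required`."""
    return value == resource.RLIM_INFINITY or value >= required


def check_limits() -> dict:
    """Report whether this process's soft limits satisfy the two checks."""
    nofile_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    memlock_soft, _ = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return {
        "nofile": limit_ok(nofile_soft, REQUIRED_NOFILE),
        # memory locking needs the memlock limit to be unlimited
        "memlock": memlock_soft == resource.RLIM_INFINITY,
    }
```

Running check_limits() as esuser after logging in again should give True for both keys if limits.conf was picked up.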

 

Temporary setting: sysctl -w vm.max_map_count=262144
Permanent change:
Edit /etc/sysctl.conf and add the line vm.max_map_count=262144,
then run: sysctl -p

 

The settings above resolve this error:

max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
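
A quick way to confirm the kernel setting took effect is to read it back from procfs. This is a minimal sketch assuming Linux, where the value is exposed at /proc/sys/vm/max_map_count; the function names are hypothetical.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # minimum quoted in the bootstrap check message


def parse_max_map_count(text: str) -> int:
    """Parse the contents of /proc/sys/vm/max_map_count."""
    return int(text.strip())


def read_max_map_count(path: str = "/proc/sys/vm/max_map_count") -> int:
    """Read the current vm.max_map_count value (Linux only)."""
    return parse_max_map_count(Path(path).read_text())


def max_map_count_ok(current: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """True if the current value meets the Elasticsearch minimum."""
    return current >= required
```

Calling max_map_count_ok(read_max_map_count()) after sysctl -p should return True.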

 

[root@bogon logs]# visudo
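...
## Allow root to run any commands anywhere
root    ALL=(ALL)       ALL
esuser  ALL=(ALL)       ALL
...

The setting above resolves cases where esuser cannot read or write files.

ulimit -n shows the maximum number of open files, and ulimit -u shows the maximum number of user processes.

1. Append the following to the end of /etc/security/limits.d/90-nproc.conf (on CentOS 7 this file is usually named 20-nproc.conf):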

 

* soft nproc 204800  
* hard nproc 204800  

  

2. Append the following to the end of /etc/security/limits.d/def.conf:

 

* soft nofile 204800  
* hard nofile 204800  

  

The settings in these two files override the earlier ones. They take effect after a reboot.

The settings above resolve this error: max number of threads [3895] for user [esuser] is too low, increase to at least [4096]

 

Problem 1: warning

[2016-11-06T16:27:21,712][WARN ][o.e.b.JNANatives ] unable to install syscall filter: 

java.lang.UnsupportedOperationException: seccomp unavailable: requires kernel 3.5+ with CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER compiled in
at org.elasticsearch.bootstrap.Seccomp.linuxImpl(Seccomp.java:349) ~[elasticsearch-5.0.0.jar:5.0.0]
at org.elasticsearch.bootstrap.Seccomp.init(Seccomp.java:630) ~[elasticsearch-5.0.0.jar:5.0.0]

This prints a long stack trace, but it is only a warning.

Fix: use a newer CentOS release; CentOS 7 does not have this problem.

 

Problem 2: error

Error:
ERROR: bootstrap checks failed
system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

Cause:
CentOS 6 does not support seccomp, while ES 5.2.0 sets bootstrap.system_call_filter to true by default and checks for it. The check fails, and ES then refuses to start.

Fix:
In elasticsearch.yml, set bootstrap.system_call_filter to false; note that it belongs under the Memory section:
bootstrap.memory_lock: false
bootstrap.system_call_filter: false

4. Start

[root@bogon ~]# cd /home/elasticsearch-6.3.1/bin/
[root@bogon bin]# su esuser
[esuser@bogon bin]$ ./elasticsearch
[2018-07-17T10:17:30,139][INFO ][o.e.n.Node               ] [node-1] initializing ...
[2018-07-17T10:17:30,234][INFO ][o.e.e.NodeEnvironment    ] [node-1] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [22.1gb], net total_space [27.6gb], types [rootfs]
[2018-07-17T10:17:30,234][INFO ][o.e.e.NodeEnvironment    ] [node-1] heap size [1007.3mb], compressed ordinary object pointers [true]
[2018-07-17T10:17:30,236][INFO ][o.e.n.Node               ] [node-1] node name [node-1], node ID [cb69e4JjSBKeHJ9y-q-hNA]
[2018-07-17T10:17:30,236][INFO ][o.e.n.Node               ] [node-1] version[6.3.1], pid[26327], build[default/tar/eb782d0/2018-06-29T21:59:26.107521Z], OS[Linux/3.10.0-514.6.1.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_92/25.92-b14]
[2018-07-17T10:17:30,236][INFO ][o.e.n.Node               ] [node-1] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.F1Jh0AOB, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/home/elasticsearch-6.3.1, -Des.path.conf=/home/elasticsearch-6.3.1/config, -Des.distribution.flavor=default, -Des.distribution.type=tar]
[2018-07-17T10:17:33,136][INFO ][o.e.p.PluginsService     ] [node-1] loaded module [aggs-matrix-stats]
[2018-07-17T10:17:33,136][INFO ][o.e.p.PluginsService     ] [node-1] loaded module [analysis-common]
[2018-07-17T10:17:33,137][INFO ][o.e.p.PluginsService     ] [node-1] loaded module [ingest-common]
...

 

5. Verify

Open http://192.168.20.115:9200/ in a browser (192.168.20.115 is the IP of the ES server; also make sure port 9200 is reachable from outside). It returns:

{
  "name" : "node-1",
  "cluster_name" : "my-application",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "6.3.1",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "eb782d0",
    "build_date" : "2018-06-29T21:59:26.107521Z",
    "build_snapshot" : false,
    "lucene_version" : "7.3.1",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
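
The same check can be scripted instead of done in a browser. The sketch below uses only the Python standard library; the function names are hypothetical, and the address passed in would be your own server (192.168.20.115 is just the example IP from the text).

```python
import json
import urllib.request


def parse_version(body: str) -> str:
    """Extract version.number from the JSON returned by the root endpoint."""
    return json.loads(body)["version"]["number"]


def fetch_version(base_url: str, timeout: float = 5.0) -> str:
    """GET the root endpoint and return the version number it reports."""
    with urllib.request.urlopen(base_url, timeout=timeout) as response:
        return parse_version(response.read().decode("utf-8"))
```

For example, fetch_version("http://192.168.20.115:9200/") should return "6.3.1" for the node shown above if it is reachable.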

 

Of course, the most convenient way to install is to pull the Docker image. Official guide: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html  Steps:

1) Pull the image: docker pull docker.elastic.co/elasticsearch/elasticsearch:6.3.1

2) Run the container: docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:6.3.1

 

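6. Installing ElasticSearch Head

The official examples use curl on the console, which is not very intuitive. You can instead install the head plugin in the Chrome browser and use it to send requests. For the head plugin, see: Installing ES head 6.3.1 on CentOS 7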

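7. Integrating the IK Analyzer

1. Get the ES IK Analyzer plugin

The plugin version must match the ES version (6.3.1).

URL: https://github.com/medcl/elasticsearch-analysis-ik/releases

2. Install the plugin

Extract the IK archive into the plugins/ directory under the ES installation directory (it is best to rename the extracted directory so it does not clash with other plugins installed later), then restart ES.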
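
Because the plugin version has to match the ES version, it can help to check the extracted plugin before restarting. The sketch below assumes the extracted directory contains a plugin-descriptor.properties file with an elasticsearch.version entry, as Elasticsearch plugins of this generation do; the helper names are hypothetical, and the descriptor contents are something to confirm against your own download.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def plugin_matches_es(descriptor_text: str, es_version: str) -> bool:
    """True if the descriptor's elasticsearch.version equals the installed ES version."""
    return parse_properties(descriptor_text).get("elasticsearch.version") == es_version
```

Passing the text of plugin-descriptor.properties and "6.3.1" should return True for a matching release.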

3. Extend the dictionary

To extend the dictionary, edit the configuration file config/IKAnalyzer.cfg.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Configure your own extension dictionaries here -->
    <entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic</entry>
    <!-- Configure your own extension stopword dictionaries here -->
    <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
    <!-- Configure a remote extension dictionary here; a remote dictionary can be hot-reloaded and maintained in one place -->
    <!-- <entry key="remote_ext_dict">words_location</entry> -->
    <!-- Configure a remote extension stopword dictionary here -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
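
The ext_dict and ext_stopwords entries hold semicolon-separated file paths. To double-check which dictionary files a given configuration refers to, something like the following sketch could be used. It relies on Python's standard xml.etree.ElementTree; the function name is hypothetical, and SAMPLE is a trimmed copy of the file above with the comment text in English.

```python
import xml.etree.ElementTree as ET

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic</entry>
    <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
</properties>
"""


def dictionary_paths(xml_bytes: bytes, key: str) -> list:
    """Return the semicolon-separated paths stored under the given entry key."""
    root = ET.fromstring(xml_bytes)
    for entry in root.findall("entry"):
        if entry.get("key") == key:
            return [p.strip() for p in (entry.text or "").split(";") if p.strip()]
    return []  # key absent, e.g. a commented-out remote dictionary
```

Each returned path can then be checked for existence before restarting ES.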

4. Testing IK

1. Create an index

PUT http://start.com:9200/iktest
{
    "mappings": {
        "_doc": {
            "properties": {
                "content": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                }
            }
        }
    }
}
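
If you prefer a script over a browser plugin, the index-creation request can be built with the Python standard library. This is a sketch, not the only way to do it: the function name is hypothetical, the host is the example one from the text, and it assumes the index is created with an HTTP PUT carrying a JSON body and a Content-Type header.

```python
import json
import urllib.request


def build_create_index_request(base_url: str, index: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request creating an index that uses ik_max_word."""
    mapping = {
        "mappings": {
            "_doc": {
                "properties": {
                    "content": {
                        "type": "text",
                        "analyzer": "ik_max_word",
                        "search_analyzer": "ik_max_word",
                    }
                }
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}",
        data=json.dumps(mapping).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

The request can then be sent with urllib.request.urlopen(build_create_index_request("http://start.com:9200", "iktest")).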

2. Test tokenization

POST http://start.com:9200/_analyze
{
  "analyzer":"ik_smart",
  "text":"天團S.H.E昨在兩廳院藝文廣場舉辦17萬人露天音樂會,3人獻唱多首經典好歌,讓現場粉絲聽得如癡如醉"
}

Result:

{
    "tokens": [
        {
            "token": "天",
            "start_offset": 0,
            "end_offset": 1,
            "type": "CN_CHAR",
            "position": 0
        },
        {
            "token": "團",
            "start_offset": 1,
            "end_offset": 2,
            "type": "CN_CHAR",
            "position": 1
        },
        {
            "token": "s.h.e",
            "start_offset": 2,
            "end_offset": 7,
            "type": "LETTER",
            "position": 2
        },
        {
            "token": "昨在",
            "start_offset": 7,
            "end_offset": 9,
            "type": "CN_WORD",
            "position": 3
        },
        {
            "token": "兩廳",
            "start_offset": 9,
            "end_offset": 11,
            "type": "CN_WORD",
            "position": 4
        },
        {
            "token": "院",
            "start_offset": 11,
            "end_offset": 12,
            "type": "CN_CHAR",
            "position": 5
        },
        {
            "token": "藝文",
            "start_offset": 12,
            "end_offset": 14,
            "type": "CN_WORD",
            "position": 6
        },
        {
            "token": "廣場",
            "start_offset": 14,
            "end_offset": 16,
            "type": "CN_WORD",
            "position": 7
        },
        {
            "token": "舉辦",
            "start_offset": 16,
            "end_offset": 18,
            "type": "CN_WORD",
            "position": 8
        },
        {
            "token": "17",
            "start_offset": 18,
            "end_offset": 20,
            "type": "ARABIC",
            "position": 9
        },
        {
            "token": "萬人",
            "start_offset": 20,
            "end_offset": 22,
            "type": "CN_WORD",
            "position": 10
        },
        {
            "token": "露天",
            "start_offset": 22,
            "end_offset": 24,
            "type": "CN_WORD",
            "position": 11
        },
        {
            "token": "音樂會",
            "start_offset": 24,
            "end_offset": 27,
            "type": "CN_WORD",
            "position": 12
        },
        {
            "token": "3人",
            "start_offset": 28,
            "end_offset": 30,
            "type": "TYPE_CQUAN",
            "position": 13
        },
        {
            "token": "獻",
            "start_offset": 30,
            "end_offset": 31,
            "type": "CN_CHAR",
            "position": 14
        },
        {
            "token": "唱",
            "start_offset": 31,
            "end_offset": 32,
            "type": "CN_CHAR",
            "position": 15
        },
        {
            "token": "多首",
            "start_offset": 32,
            "end_offset": 34,
            "type": "CN_WORD",
            "position": 16
        },
        {
            "token": "經典",
            "start_offset": 34,
            "end_offset": 36,
            "type": "CN_WORD",
            "position": 17
        },
        {
            "token": "好歌",
            "start_offset": 36,
            "end_offset": 38,
            "type": "CN_WORD",
            "position": 18
        },
        {
            "token": "讓",
            "start_offset": 39,
            "end_offset": 40,
            "type": "CN_CHAR",
            "position": 19
        },
        {
            "token": "現場",
            "start_offset": 40,
            "end_offset": 42,
            "type": "CN_WORD",
            "position": 20
        },
        {
            "token": "粉絲",
            "start_offset": 42,
            "end_offset": 44,
            "type": "CN_WORD",
            "position": 21
        },
        {
            "token": "聽得",
            "start_offset": 44,
            "end_offset": 46,
            "type": "CN_WORD",
            "position": 22
        },
        {
            "token": "如癡如醉",
            "start_offset": 46,
            "end_offset": 50,
            "type": "CN_WORD",
            "position": 23
        }
    ]
}

For comparison, the standard analyzer:

POST http://start.com:9200/_analyze
{
  "analyzer":"standard",
  "text":"天團S.H.E昨在兩廳院藝文廣場 舉辦17萬人露 天音樂會,3人獻唱多首 經典好歌,讓現場 粉絲聽得如癡如醉"
}

Result:

{
    "tokens": [
        {
            "token": "天",
            "start_offset": 0,
            "end_offset": 1,
            "type": "<IDEOGRAPHIC>",
            "position": 0
        },
        {
            "token": "團",
            "start_offset": 1,
            "end_offset": 2,
            "type": "<IDEOGRAPHIC>",
            "position": 1
        },
        {
            "token": "s.h.e",
            "start_offset": 2,
            "end_offset": 7,
            "type": "<ALPHANUM>",
            "position": 2
        },
        {
            "token": "昨",
            "start_offset": 7,
            "end_offset": 8,
            "type": "<IDEOGRAPHIC>",
            "position": 3
        },
        {
            "token": "在",
            "start_offset": 8,
            "end_offset": 9,
            "type": "<IDEOGRAPHIC>",
            "position": 4
        },
        {
            "token": "兩",
            "start_offset": 9,
            "end_offset": 10,
            "type": "<IDEOGRAPHIC>",
            "position": 5
        },
        {
            "token": "廳",
            "start_offset": 10,
            "end_offset": 11,
            "type": "<IDEOGRAPHIC>",
            "position": 6
        },
        {
            "token": "院",
            "start_offset": 11,
            "end_offset": 12,
            "type": "<IDEOGRAPHIC>",
            "position": 7
        },
        {
            "token": "藝",
            "start_offset": 12,
            "end_offset": 13,
            "type": "<IDEOGRAPHIC>",
            "position": 8
        },
        {
            "token": "文",
            "start_offset": 13,
            "end_offset": 14,
            "type": "<IDEOGRAPHIC>",
            "position": 9
        }
        ...
    ]
}

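The standard analyzer splits all of the Chinese text into single characters, whereas the IK analyzer splits it into a mix of single characters and words.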
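
To make that comparison concrete, the two _analyze responses can be reduced to their token strings. The sketch below uses hypothetical helper names and two small samples trimmed from the results above (offsets 12 to 16 of the test sentence), keeping only the token field.

```python
IK_SAMPLE = {"tokens": [{"token": "藝文"}, {"token": "廣場"}]}
STANDARD_SAMPLE = {"tokens": [{"token": "藝"}, {"token": "文"}]}


def token_texts(analyze_response: dict) -> list:
    """Return just the token strings from an _analyze response body."""
    return [t["token"] for t in analyze_response.get("tokens", [])]


def multi_char_tokens(analyze_response: dict) -> list:
    """Tokens longer than one character, i.e. those kept together as words."""
    return [t for t in token_texts(analyze_response) if len(t) > 1]
```

On these samples, multi_char_tokens keeps both IK tokens and none of the standard ones.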
