(1) Source code
https://github.com/medcl/elasticsearch-analysis-ik
(2) Releases
https://github.com/medcl/elasticsearch-analysis-ik/releases
(3) Copy the zip URL
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.1.1/elasticsearch-analysis-ik-6.1.1.zip
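The zip URL carries the version twice, once in the release tag (v6.1.1) and once in the file name. A minimal Python sketch that builds the link from a version string; it assumes other releases follow the same naming scheme as the 6.1.1 link above, which is not verified here, and the helper name ik_release_url is made up for illustration.

```python
# Hypothetical helper: builds the IK release zip URL for a given version.
# Assumption: every release uses the same naming scheme as the 6.1.1 link.
BASE = "https://github.com/medcl/elasticsearch-analysis-ik/releases/download"


def ik_release_url(version: str) -> str:
    """Return the download URL for the IK plugin zip of the given version."""
    return f"{BASE}/v{version}/elasticsearch-analysis-ik-{version}.zip"


url = ik_release_url("6.1.1")
print(url)
```

The version passed in should be the same as the Elasticsearch version in use (6.1.1 throughout this walkthrough).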
(1) elasticsearch-plugin
[es@node1 elasticsearch-6.1.1]$ bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.1.1/elasticsearch-analysis-ik-6.1.1.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.1.1/elasticsearch-analysis-ik-6.1.1.zip
[=================================================] 100%
-> Installed analysis-ik
[es@node1 elasticsearch-6.1.1]$ ll plugins/
total 0
drwxr-xr-x 2 es es 199 Jan 7 08:52 analysis-ik
[es@node1 elasticsearch-6.1.1]$
(2) View the directory
[es@node1 elasticsearch-6.1.1]$ ll plugins/analysis-ik/
total 1420
-rw-r--r-- 1 es es 263965 Jan 7 08:52 commons-codec-1.9.jar
-rw-r--r-- 1 es es 61829 Jan 7 08:52 commons-logging-1.2.jar
-rw-r--r-- 1 es es 51658 Jan 7 08:52 elasticsearch-analysis-ik-6.1.1.jar
-rw-r--r-- 1 es es 736658 Jan 7 08:52 httpclient-4.5.2.jar
-rw-r--r-- 1 es es 326724 Jan 7 08:52 httpcore-4.4.4.jar
-rw-r--r-- 1 es es 2666 Jan 7 08:52 plugin-descriptor.properties
[es@node1 elasticsearch-6.1.1]$
[es@node1 elasticsearch-6.1.1]$ bin/elasticsearch
[2018-01-07T09:01:17,283][INFO ][o.e.n.Node ] [] initializing ...
[2018-01-07T09:01:17,421][INFO ][o.e.e.NodeEnvironment ] [cNWkQjt] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [14.3gb], net total_space [21.9gb], types [rootfs]
[2018-01-07T09:01:17,422][INFO ][o.e.e.NodeEnvironment ] [cNWkQjt] heap size [1007.3mb], compressed ordinary object pointers [true]
[2018-01-07T09:01:17,484][INFO ][o.e.n.Node ] node name [cNWkQjt] derived from node ID [cNWkQjt9SzKFNtyx8IIu-A]; set [node.name] to override
[2018-01-07T09:01:17,484][INFO ][o.e.n.Node ] version[6.1.1], pid[3445], build[bd92e7f/2017-12-17T20:23:25.338Z], OS[Linux/3.10.0-514.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_112/25.112-b15]
[2018-01-07T09:01:17,485][INFO ][o.e.n.Node ] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/opt/elasticsearch-6.1.1, -Des.path.conf=/opt/elasticsearch-6.1.1/config]
[2018-01-07T09:01:19,000][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [aggs-matrix-stats]
[2018-01-07T09:01:19,000][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [analysis-common]
[2018-01-07T09:01:19,000][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [ingest-common]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-expression]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-mustache]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-painless]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [mapper-extras]
[2018-01-07T09:01:19,001][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [parent-join]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [percolator]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [reindex]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [repository-url]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [transport-netty4]
[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [tribe]
[2018-01-07T09:01:19,003][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded plugin [analysis-ik]
[2018-01-07T09:01:21,678][INFO ][o.e.d.DiscoveryModule ] [cNWkQjt] using discovery type [zen]
[2018-01-07T09:01:22,567][INFO ][o.e.n.Node ] initialized
[2018-01-07T09:01:22,568][INFO ][o.e.n.Node ] [cNWkQjt] starting ...
[2018-01-07T09:01:22,803][INFO ][o.e.t.TransportService ] [cNWkQjt] publish_address {192.168.80.131:9300}, bound_addresses {192.168.80.131:9300}
[2018-01-07T09:01:22,837][INFO ][o.e.b.BootstrapChecks ] [cNWkQjt] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2018-01-07T09:01:25,940][INFO ][o.e.c.s.MasterService ] [cNWkQjt] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{Xvho5gpPTuavakz227C_uA}{192.168.80.131}{192.168.80.131:9300}
[2018-01-07T09:01:25,949][INFO ][o.e.c.s.ClusterApplierService] [cNWkQjt] new_master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{Xvho5gpPTuavakz227C_uA}{192.168.80.131}{192.168.80.131:9300}, reason: apply cluster state (from master [master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{Xvho5gpPTuavakz227C_uA}{192.168.80.131}{192.168.80.131:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-01-07T09:01:25,993][INFO ][o.e.h.n.Netty4HttpServerTransport] [cNWkQjt] publish_address {192.168.80.131:9200}, bound_addresses {192.168.80.131:9200}
[2018-01-07T09:01:25,993][INFO ][o.e.n.Node ] [cNWkQjt] started
[2018-01-07T09:01:26,077][INFO ][o.w.a.d.Monitor ] try load config from /opt/elasticsearch-6.1.1/config/analysis-ik/IKAnalyzer.cfg.xml
[2018-01-07T09:01:26,799][INFO ][o.e.g.GatewayService ] [cNWkQjt] recovered [2] indices into cluster_state
[2018-01-07T09:01:27,526][INFO ][o.e.c.r.a.AllocationService] [cNWkQjt] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[test][2], [test][0]] ...]).
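The line that matters in this startup log is loaded plugin [analysis-ik]. Rather than scanning by eye, the PluginsService lines can be pulled out with a regular expression. The following Python sketch assumes the log format shown above; the function name loaded_components is illustrative, and the sample text is two lines copied from this log.

```python
import re

# Matches PluginsService entries such as:
#   ... [cNWkQjt] loaded plugin [analysis-ik]
LOADED = re.compile(r"loaded (module|plugin) \[([^\]]+)\]")


def loaded_components(log_text: str) -> dict:
    """Return {"module": [...], "plugin": [...]} in the order logged."""
    found = {"module": [], "plugin": []}
    for kind, name in LOADED.findall(log_text):
        found[kind].append(name)
    return found


sample = (
    "[2018-01-07T09:01:19,002][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [tribe]\n"
    "[2018-01-07T09:01:19,003][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded plugin [analysis-ik]\n"
)
result = loaded_components(sample)
print("analysis-ik" in result["plugin"])
```

Because the pattern does not depend on line breaks, it also works on log text that has been pasted as one long line.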
(1) ik_smart
The pretty parameter, which literally means "pretty", asks Elasticsearch to print the JSON response in a readable, indented form.
GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "安徽省長江流域"
}
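The request above is written for the Kibana console. Outside Kibana, the same call can be made over HTTP. Here is a minimal Python sketch using only the standard library; the base URL is the HTTP publish address from the startup log (192.168.80.131:9200) and will differ on other machines, and the function names are illustrative. The request is sent as POST with a JSON content type, which the _analyze endpoint accepts.

```python
import json
import urllib.request


def build_analyze_request(base_url: str, analyzer: str, text: str) -> urllib.request.Request:
    """Build (but do not send) an _analyze request for the given analyzer."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_analyze?pretty",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url: str, analyzer: str, text: str) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_analyze_request(base_url, analyzer, text)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request needs no running server; analyze() does.
req = build_analyze_request("http://192.168.80.131:9200", "ik_smart", "安徽省長江流域")
print(req.get_method(), req.full_url)
```

Calling analyze(...) against a running node should return the same JSON structure as the Kibana console shows.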
Tokenization result
{
  "tokens": [
    { "token": "安徽省", "start_offset": 0, "end_offset": 3, "type": "CN_WORD", "position": 0 },
    { "token": "長江流域", "start_offset": 3, "end_offset": 7, "type": "CN_WORD", "position": 1 }
  ]
}
(2) ik_max_word
GET _analyze?pretty
{
  "analyzer": "ik_max_word",
  "text": "安徽省長江流域"
}
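This request differs from the ik_smart one only in the analyzer name, but the output is much finer grained: ik_smart returned two disjoint tokens for this text, while ik_max_word returns seven, several of which cover the same characters. A small Python sketch of how that overlap can be detected from start_offset and end_offset; the offsets are copied from the two results on this page, and has_overlaps is an illustrative name.

```python
def has_overlaps(tokens) -> bool:
    """True if any two tokens share at least one character position."""
    spans = sorted((t["start_offset"], t["end_offset"]) for t in tokens)
    # After sorting by start, any overlap shows up between neighbours.
    return any(
        next_start < prev_end
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:])
    )


ik_smart_tokens = [
    {"token": "安徽省", "start_offset": 0, "end_offset": 3},
    {"token": "長江流域", "start_offset": 3, "end_offset": 7},
]
ik_max_word_tokens = [
    {"token": "安徽省", "start_offset": 0, "end_offset": 3},
    {"token": "安徽", "start_offset": 0, "end_offset": 2},
    {"token": "省長", "start_offset": 2, "end_offset": 4},
    {"token": "長江流域", "start_offset": 3, "end_offset": 7},
    {"token": "長江", "start_offset": 3, "end_offset": 5},
    {"token": "江流", "start_offset": 4, "end_offset": 6},
    {"token": "流域", "start_offset": 5, "end_offset": 7},
]

print(has_overlaps(ik_smart_tokens), has_overlaps(ik_max_word_tokens))
```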
Tokenization result
{
  "tokens": [
    { "token": "安徽省", "start_offset": 0, "end_offset": 3, "type": "CN_WORD", "position": 0 },
    { "token": "安徽", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 1 },
    { "token": "省長", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 2 },
    { "token": "長江流域", "start_offset": 3, "end_offset": 7, "type": "CN_WORD", "position": 3 },
    { "token": "長江", "start_offset": 3, "end_offset": 5, "type": "CN_WORD", "position": 4 },
    { "token": "江流", "start_offset": 4, "end_offset": 6, "type": "CN_WORD", "position": 5 },
    { "token": "流域", "start_offset": 5, "end_offset": 7, "type": "CN_WORD", "position": 6 }
  ]
}
(3) New words
GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "王者榮耀"
}
Tokenization result
{
  "tokens": [
    { "token": "王者", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 0 },
    { "token": "榮耀", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 1 }
  ]
}
(1) View the existing dictionaries
[es@node1 analysis-ik]$ pwd
/opt/elasticsearch-6.1.1/config/analysis-ik
[es@node1 analysis-ik]$ ll
total 8260
-rw-rw---- 1 es bigdata 5225922 Jan 7 08:52 extra_main.dic
-rw-rw---- 1 es bigdata 63188 Jan 7 08:52 extra_single_word.dic
-rw-rw---- 1 es bigdata 63188 Jan 7 08:52 extra_single_word_full.dic
-rw-rw---- 1 es bigdata 10855 Jan 7 08:52 extra_single_word_low_freq.dic
-rw-rw---- 1 es bigdata 156 Jan 7 08:52 extra_stopword.dic
-rw-rw---- 1 es bigdata 625 Jan 7 08:52 IKAnalyzer.cfg.xml
-rw-rw---- 1 es bigdata 3058510 Jan 7 08:52 main.dic
-rw-rw---- 1 es bigdata 123 Jan 7 08:52 preposition.dic
-rw-rw---- 1 es bigdata 1824 Jan 7 08:52 quantifier.dic
-rw-rw---- 1 es bigdata 164 Jan 7 08:52 stopword.dic
-rw-rw---- 1 es bigdata 192 Jan 7 08:52 suffix.dic
-rw-rw---- 1 es bigdata 752 Jan 7 08:52 surname.dic
[es@node1 analysis-ik]$
(2) Custom dictionary
[es@node1 analysis-ik]$ mkdir custom
[es@node1 analysis-ik]$ vi custom/new_word.dic
[es@node1 analysis-ik]$ cat custom/new_word.dic
老鐵
王者榮耀
洪荒之力
共有產權房
一帶一路
[es@node1 analysis-ik]$
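When the word list comes from somewhere else (a spreadsheet, a database export), the dictionary file can be generated instead of typed in vi. The Python sketch below assumes the format suggested by the cat output above, one word per line, and writes the file as UTF-8. The function name write_dictionary is illustrative, and the example writes into a temporary directory rather than the real config/analysis-ik directory.

```python
import tempfile
from pathlib import Path


def write_dictionary(path, words):
    """Write one word per line as UTF-8, skipping blanks and duplicates."""
    seen, lines = set(), []
    for word in words:
        word = word.strip()
        if word and word not in seen:
            seen.add(word)
            lines.append(word)
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # same effect as "mkdir custom"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


# The five words from this page, plus one duplicate to show it is dropped.
words = ["老鐵", "王者榮耀", "洪荒之力", "共有產權房", "一帶一路", "王者榮耀"]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "custom" / "new_word.dic"
    written = write_dictionary(target, words)
    content = target.read_text(encoding="utf-8")

print(len(written))
```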
(3) Update the configuration
[es@node1 analysis-ik]$ vi IKAnalyzer.cfg.xml
[es@node1 analysis-ik]$ cat IKAnalyzer.cfg.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<comment>IK Analyzer extension configuration</comment>
	<!-- Configure your own extension dictionaries here -->
	<entry key="ext_dict">custom/new_word.dic</entry>
	<!-- Configure your own extension stopword dictionaries here -->
	<entry key="ext_stopwords"></entry>
	<!-- Configure remote extension dictionaries here -->
	<!-- <entry key="remote_ext_dict">words_location</entry> -->
	<!-- Configure remote extension stopword dictionaries here -->
	<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
[es@node1 analysis-ik]$
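IKAnalyzer.cfg.xml is a Java properties file in XML form, so the active settings are simply the entry elements that are not commented out. The Python sketch below reads them with the standard library; the XML is a trimmed copy of the file above, and read_entries is an illustrative name. It can be handy for checking which dictionary path the plugin will be asked to load before restarting.

```python
import xml.etree.ElementTree as ET


def read_entries(xml_text: str) -> dict:
    """Map each active <entry key="..."> to its (stripped) text value."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {entry.get("key"): (entry.text or "").strip() for entry in root.iter("entry")}


config = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <entry key="ext_dict">custom/new_word.dic</entry>
    <entry key="ext_stopwords"></entry>
    <!-- <entry key="remote_ext_dict">words_location</entry> -->
</properties>
"""

entries = read_entries(config)
print(entries["ext_dict"])
```

Commented-out entries such as remote_ext_dict do not appear in the result, which matches how they are inactive in the file.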
(4) Restart elasticsearch
[es@node1 elasticsearch-6.1.1]$ bin/elasticsearch
[2018-01-07T10:00:23,032][INFO ][o.e.n.Node ] [] initializing ...
[2018-01-07T10:00:23,170][INFO ][o.e.e.NodeEnvironment ] [cNWkQjt] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [14.3gb], net total_space [21.9gb], types [rootfs]
[2018-01-07T10:00:23,171][INFO ][o.e.e.NodeEnvironment ] [cNWkQjt] heap size [1007.3mb], compressed ordinary object pointers [true]
[2018-01-07T10:00:23,209][INFO ][o.e.n.Node ] node name [cNWkQjt] derived from node ID [cNWkQjt9SzKFNtyx8IIu-A]; set [node.name] to override
[2018-01-07T10:00:23,210][INFO ][o.e.n.Node ] version[6.1.1], pid[3574], build[bd92e7f/2017-12-17T20:23:25.338Z], OS[Linux/3.10.0-514.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_112/25.112-b15]
[2018-01-07T10:00:23,210][INFO ][o.e.n.Node ] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/opt/elasticsearch-6.1.1, -Des.path.conf=/opt/elasticsearch-6.1.1/config]
[2018-01-07T10:00:24,717][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [aggs-matrix-stats]
[2018-01-07T10:00:24,717][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [analysis-common]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [ingest-common]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-expression]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-mustache]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [lang-painless]
[2018-01-07T10:00:24,718][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [mapper-extras]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [parent-join]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [percolator]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [reindex]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [repository-url]
[2018-01-07T10:00:24,719][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [transport-netty4]
[2018-01-07T10:00:24,720][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded module [tribe]
[2018-01-07T10:00:24,720][INFO ][o.e.p.PluginsService ] [cNWkQjt] loaded plugin [analysis-ik]
[2018-01-07T10:00:27,866][INFO ][o.e.d.DiscoveryModule ] [cNWkQjt] using discovery type [zen]
[2018-01-07T10:00:28,794][INFO ][o.e.n.Node ] initialized
[2018-01-07T10:00:28,795][INFO ][o.e.n.Node ] [cNWkQjt] starting ...
[2018-01-07T10:00:29,047][INFO ][o.e.t.TransportService ] [cNWkQjt] publish_address {192.168.80.131:9300}, bound_addresses {192.168.80.131:9300}
[2018-01-07T10:00:29,093][INFO ][o.e.b.BootstrapChecks ] [cNWkQjt] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2018-01-07T10:00:32,210][INFO ][o.e.c.s.MasterService ] [cNWkQjt] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{N6t0NiDmQp2vlrbx-FtcUQ}{192.168.80.131}{192.168.80.131:9300}
[2018-01-07T10:00:32,217][INFO ][o.e.c.s.ClusterApplierService] [cNWkQjt] new_master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{N6t0NiDmQp2vlrbx-FtcUQ}{192.168.80.131}{192.168.80.131:9300}, reason: apply cluster state (from master [master {cNWkQjt}{cNWkQjt9SzKFNtyx8IIu-A}{N6t0NiDmQp2vlrbx-FtcUQ}{192.168.80.131}{192.168.80.131:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-01-07T10:00:32,285][INFO ][o.e.h.n.Netty4HttpServerTransport] [cNWkQjt] publish_address {192.168.80.131:9200}, bound_addresses {192.168.80.131:9200}
[2018-01-07T10:00:32,286][INFO ][o.e.n.Node ] [cNWkQjt] started
[2018-01-07T10:00:32,326][INFO ][o.w.a.d.Monitor ] try load config from /opt/elasticsearch-6.1.1/config/analysis-ik/IKAnalyzer.cfg.xml
[2018-01-07T10:00:32,905][INFO ][o.w.a.d.Monitor ] [Dict Loading] custom/new_word.dic
[2018-01-07T10:00:33,279][INFO ][o.e.g.GatewayService ] [cNWkQjt] recovered [2] indices into cluster_state
[2018-01-07T10:00:34,092][INFO ][o.e.c.r.a.AllocationService] [cNWkQjt] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[test][3]] ...]).
The output contains the line
[Dict Loading] custom/new_word.dic
which shows that the custom dictionary has been loaded.
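With more than one custom dictionary, it is easy to miss one that failed to load. The Python sketch below compares the dictionaries that were configured with the [Dict Loading] lines in the startup log and reports any that are absent. It assumes the log prints each path exactly as written in ext_dict, as it does in this run; missing_dictionaries is an illustrative name, and the two log samples are lines copied from the first and second startup on this page.

```python
import re

DICT_LOADING = re.compile(r"\[Dict Loading\]\s+(\S+)")


def missing_dictionaries(configured, log_text):
    """Return configured dictionary paths that have no [Dict Loading] line."""
    loaded = set(DICT_LOADING.findall(log_text))
    return [path for path in configured if path not in loaded]


configured = ["custom/new_word.dic"]

first_start = (
    "[2018-01-07T09:01:26,077][INFO ][o.w.a.d.Monitor ] try load config from "
    "/opt/elasticsearch-6.1.1/config/analysis-ik/IKAnalyzer.cfg.xml\n"
)
second_start = (
    "[2018-01-07T10:00:32,326][INFO ][o.w.a.d.Monitor ] try load config from "
    "/opt/elasticsearch-6.1.1/config/analysis-ik/IKAnalyzer.cfg.xml\n"
    "[2018-01-07T10:00:32,905][INFO ][o.w.a.d.Monitor ] [Dict Loading] custom/new_word.dic\n"
)

before = missing_dictionaries(configured, first_start)
after = missing_dictionaries(configured, second_start)
print(before, after)
```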
(5) Restart Kibana
After restarting Kibana, run the following request again:
GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "王者榮耀"
}
Tokenization result
{
  "tokens": [
    { "token": "王者榮耀", "start_offset": 0, "end_offset": 4, "type": "CN_WORD", "position": 0 }
  ]
}