lucene : 倒排索引 以下: 我 (1:1) {0} 表示第一行出現一次,索引位置爲0
elasticsearch 部署 elasticsearch-2.2.1.zip 192.168.112.101 node1 192.168.112.102 node2 192.168.112.103 node3 三臺機器,每臺機器上都部署。 es不能以root用戶啓動(由於es能夠遠程執行腳本,對於主機不安全) ## 因此三臺主機都建立用戶 [root@node2 ~]# useradd sxt [root@node2 ~]# echo sxt | passwd --stdin sxt [root@node2 ~]# mkdir -p /opt/sxt/es [root@node2 ~]# cd /opt/sxt [root@node1 sxt]# cd /opt/sxt/es/ [root@node1 es]# ll total 28740 -rw-r--r--. 1 root root 29428075 Sep 10 21:18 elasticsearch-2.2.1.zip [root@node1 sxt]# chown sxt:sxt es [root@node1 sxt]# su sxt [sxt@node1 sxt]$ cd es [sxt@node1 es]$ ll total 28740 -rw-r--r--. 1 root root 29428075 Sep 10 21:18 elasticsearch-2.2.1.zip [sxt@node1 es]$ unzip elasticsearch-2.2.1.zip [sxt@node1 es]$ cd elasticsearch-2.2.1/config/elasticsearch.yml ## 修改 cluster.name: bjsxt-es node.name: node1 network.host: 192.168.112.101 discovery.zen.ping.multicast.enabled: false ## 放在末尾 discovery.zen.ping.unicast.hosts: ["192.168.112.101","192.168.112.102", "192.168.112.103"] discovery.zen.ping_timeout: 120s client.transport.ping_timeout: 60s [sxt@node1 es]$ scp -r elasticsearch-2.2.1 sxt@node2:`pwd` ## 分發到node2和node3 [sxt@node1 bin]$ cd /opt/sxt/es/elasticsearch-2.2.1/bin [sxt@node1 bin]$ ./elasticsearch ## node2,node3都啓動此命令
配置json內容的格式化ui 02_第二階段 hadoop體系之離線計算\12_EL SEARCH 搜索引擎\01資料\01資料\附件\plugins 將文件夾下的head上傳到 [root@node1 plugins]# pwd /opt/sxt/es/elasticsearch-2.2.1/plugins [root@node1 plugins]# ll total 4 drwxr-xr-x. 6 sxt sxt 4096 Sep 10 21:41 head ## 注意權限head 爲sxt [root@node1 plugins]# chown -R sxt:sxt head
## 若是不當心以root用戶啓動,報錯,以下。此時須要刪除logs文件夾。不然再次以sxt啓動也可能失敗。
[root@node1 plugins]# cd /opt/sxt/es/elasticsearch-2.2.1/bin
[root@node1 bin]# ./elasticsearch
Exception in thread "main" java.lang.RuntimeException: don't run elasticsearch as root.
at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:93)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:144)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:285)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
[root@node1 elasticsearch-2.2.1]# rm -rf logs
## 從新啓動 ### ctrl+c 結束程序
[root@node1 elasticsearch-2.2.1]# su sxt
[sxt@node1 elasticsearch-2.2.1]$ cd /opt/sxt/es/elasticsearch-2.2.1/bin
[sxt@node1 bin]$ ./elasticsearch
## 訪問頁面內容以下;
http://node2:9200/_plugin/head/
橫向擴展sharding切片,縱向擴展搭建ha. 通常lucense的分片不可修改,在規劃時候須要考慮好,一經確認不可修改。(能夠給分片作備份)
經過curl 操做es [root@node1 plugins]# curl -XPUT http://192.168.112.101:9200/bjsxt/ 以下:建立了lucene分片。粗體表明主分片,普通矩形框表示備分片
稱爲建立索引庫 (至關於數據庫)
node3掛掉後,出現短暫的警告,過一下子又從新調整爲以下第二圖(達到健康狀態了,自動備份了)。
再次重啓node3.過一會如圖第三。 * 表明是主。
curl -XPOST http://192.168.112.101:9200/bjsxt/employee -d ' { "first_name" : "bin", "age" : 33, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' 建立type和document. [root@node1 plugins]# curl -XPUT http://192.168.112.101:9200/bjsxt/ {"acknowledged":true}[root@node1 plugins]# curl -XPOST http://192.168.112.101:9200/bjsxt/employee -d ' > { > "first_name" : "bin", > "age" : 33, > "about" : "I love to go rock climbing", > "interests": [ "sports", "music" ] > }' {"_index":"bjsxt","_type":"employee","_id":"AW0brHsbOCeeN2j3g-hG","_version":1,"_shards":{"total":2,"successful":2,"failed":0},"created":true}[root@node1 plugins]#
curl -XPOST http://192.168.112.101:9200/bjsxt/employee -d ' { "first_name" : "gob bin", "age" : 43, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' curl -XPOST http://192.168.112.101:9200/bjsxt/employee -d ' { "first_name" : "pablo2", "age" : 33, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ], "sex": "man" }' #XPUT 必須給出id curl -XPUT http://192.168.112.101:9200/bjsxt/employee/1 -d ' { "first_name" : "god bin", "last_name" : "pang", "age" : 42, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' ## 修改age 44 curl -XPUT http://192.168.112.101:9200/bjsxt/employee/1 -d ' { "first_name" : "god bin", "last_name" : "pang", "age" : 44, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' curl -XPOST http://192.168.112.101:9200/bjsxt/employee/1 -d ' { "first_name" : "pablo2", "age" : 33, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ], "sex": "man" }' ## XPUT,XPOST 均可以作建立和修改。 XPUT 必須給出id,若是id不存在就建立,存在則修改。 XPOST 不用必須給定id [root@node1 plugins]# curl -XGET http://192.168.112.101:9200/bjsxt/employee/1?pretty { "_index" : "bjsxt", "_type" : "employee", "_id" : "1", "_version" : 4, "found" : true, "_source" : { "first_name" : "pablo2", "age" : 33, "about" : "I love to go rock climbing", "interests" : [ "sports", "music" ], "sex" : "man" } }
[root@node1 plugins]# curl -XGET http://192.168.112.101:9200/bjsxt/employee/_search?q=first_name="bin" {"took":31,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":2,"max_score":0.079459734,"hits":[{"_index":"bjsxt","_type":"employee","_id":"AW0brHsbOCeeN2j3g-hG","_score":0.079459734,"_source": { "first_name" : "bin", "age" : 33, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }},{"_index":"bjsxt","_type":"employee","_id":"AW0brvCeOCeeN2j3g-hH","_score":0.01125201,"_source": { "first_name" : "gob bin", "age" : 43, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }}]}}
[root@node1 plugins]# curl -XGET http://192.168.112.101:9200/bjsxt/employee/_search?pretty -d ' > { > "query": > {"match": > {"first_name":"bin"} > } > }' { "took" : 13, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 2, "max_score" : 1.0, "hits" : [ { "_index" : "bjsxt", "_type" : "employee", "_id" : "AW0brHsbOCeeN2j3g-hG", "_score" : 1.0, "_source" : { "first_name" : "bin", "age" : 33, "about" : "I love to go rock climbing", "interests" : [ "sports", "music" ] } }, { "_index" : "bjsxt", "_type" : "employee", "_id" : "AW0brvCeOCeeN2j3g-hH", "_score" : 0.19178301, "_source" : { "first_name" : "gob bin", "age" : 43, "about" : "I love to go rock climbing", "interests" : [ "sports", "music" ] } } ] } }
[root@node1 plugins]# curl -XGET http://192.168.112.101:9200/bjsxt/employee/_search?pretty -d ' > { > "query": > {"multi_match": > { > "query":"bin", > "fields":["last_name","first_name"], > "operator":"and" > } > } > }' { "took" : 13, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 2, "max_score" : 0.5906161, "hits" : [ { "_index" : "bjsxt", "_type" : "employee", "_id" : "AW0brHsbOCeeN2j3g-hG", "_score" : 0.5906161, "_source" : { "first_name" : "bin", "age" : 33, "about" : "I love to go rock climbing", "interests" : [ "sports", "music" ] } }, { "_index" : "bjsxt", "_type" : "employee", "_id" : "AW0brvCeOCeeN2j3g-hH", "_score" : 0.058849156, "_source" : { "first_name" : "gob bin", "age" : 43, "about" : "I love to go rock climbing", "interests" : [ "sports", "music" ] } } ] } }
[root@node1 plugins]# curl -XGET http://192.168.112.101:9200/bjsxt/employee/_search?pretty -d ' > { > "query": > {"bool" : > { > "must" : > {"match": > {"first_name":"bin"} > }, > "must" : > {"match": > {"age":33} > } > } > } > }' { "took" : 10, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 1, "max_score" : 1.163388, "hits" : [ { "_index" : "bjsxt", "_type" : "employee", "_id" : "AW0brHsbOCeeN2j3g-hG", "_score" : 1.163388, "_source" : { "first_name" : "bin", "age" : 33, "about" : "I love to go rock climbing", "interests" : [ "sports", "music" ] } } ] } }
[root@node1 plugins]# curl -XGET http://192.168.112.101:9200/bjsxt/employee/_search?pretty -d ' > { > "query": > {"bool" : > { > "must" : > {"match": > {"first_name":"bin"} > }, > "must_not" : > {"match": > {"age":33} > } > } > } > }' { "took" : 8, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 1, "max_score" : 0.19178301, "hits" : [ { "_index" : "bjsxt", "_type" : "employee", "_id" : "AW0brvCeOCeeN2j3g-hH", "_score" : 0.19178301, "_source" : { "first_name" : "gob bin", "age" : 43, "about" : "I love to go rock climbing", "interests" : [ "sports", "music" ] } } ] } }
[root@node1 plugins]# curl -XGET http://192.168.112.101:9200/bjsxt/employee/_search?pretty -d ' > { > "query": > {"bool" : > { > "must_not" : > {"match": > {"first_name":"bin"} > }, > "must_not" : > {"match": > {"age":33} > } > } > } > }' { "took" : 10, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] } }
以集合的方式思考
[root@node1 plugins]# curl -XGET http://192.168.112.101:9200/bjsxt/employee/_search -d ' > { > "query": > {"bool" : > { > "must" : > {"term" : > { "first_name" : "bin" } > } > , > "must_not" : > {"range": > {"age" : { "from" : 20, "to" : 33 } > } > } > } > } > }' {"took":17,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.19178301,"hits":[{"_index":"bjsxt","_type":"employee","_id":"AW0brvCeOCeeN2j3g-hH","_score":0.19178301,"_source": { "first_name" : "gob bin", "age" : 43, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ]
curl -XPUT 'http://192.168.112.101:9200/test2/' -d'{"settings":{"number_of_replicas":2}}'
curl -XPUT 'http://192.168.112.101:9200/test3/' -d'{"settings":{"number_of_shards":3,"number_of_replicas":3}}'
file segment(段,多個document組成) document(一條記錄,一個對象實例) field(對象的屬性) term(項,分詞以後的詞條) # yes curl -XPUT http://192.168.133.6:9200/bjsxt/ # yes curl -XDELETE http://192.168.133.6:9200/test2/ curl -XDELETE http://192.168.133.6:9200/test3/ #document:yes curl -XPOST http://192.168.133.6:9200/bjsxt/employee -d ' { "first_name" : "bin", "age" : 33, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' curl -XPOST http://192.168.133.6:9200/bjsxt/employee -d ' { "first_name" : "gob bin", "age" : 43, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' curl -XPOST http://192.168.133.6:9200/bjsxt/employee/2 -d ' { "first_name" : "bin", "age" : 45, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' #add field yes curl -XPOST http://192.168.133.6:9200/bjsxt/employee -d ' { "first_name" : "pablo2", "age" : 33, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ], "sex": "man" }' curl -XPOST http://192.168.133.6:9200/bjsxt/employee/1 -d ' { "first_name" : "pablo2", "age" : 35, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ], "sex": "man" }' ---------------------------------------- #put:yes curl -XPUT http://192.168.133.6:9200/bjsxt/employee/1 -d ' { "first_name" : "god bin", "last_name" : "pang", "age" : 42, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' curl -XPUT http://192.168.133.6:9200/bjsxt/employee -d ' { "first_name" : "god bin", "last_name" : "bin", "age" : 45, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' curl -XPUT http://192.168.133.6:9200/bjsxt/employee/2 -d ' { "first_name" : "god bin", "last_name" : "bin", "age" : 45, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' curl -XPUT http://192.168.133.6:9200/bjsxt/employee/1 -d ' { "first_name" : "god bin", "last_name" : "pang", "age" : 40, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }' #根據document的id來獲取數據:(without pretty) curl -XGET http://192.168.133.6:9200/bjsxt/employee/1?pretty #根據field來查詢數據: curl -XGET http://192.168.133.6:9200/bjsxt/employee/_search?q=first_name="bin" #根據field來查詢數據:match curl -XGET http://192.168.133.6:9200/bjsxt/employee/_search?pretty -d ' { "query": {"match": {"first_name":"bin"} } }' #對多個field發起查詢:multi_match curl -XGET http://192.168.133.6:9200/bjsxt/employee/_search?pretty -d ' { "query": {"multi_match": { "query":"bin", "fields":["last_name","first_name"], "operator":"and" } } }' #多個term對多個field發起查詢:bool(boolean) # 組合查詢,must,must_not,should # must + must : 交集 # must +must_not :差集 # should+should : 並集 curl -XGET http://192.168.133.6:9200/bjsxt/employee/_search?pretty -d ' { "query": {"bool" : { "must" : {"match": {"first_name":"bin"} }, "must" : {"match": {"age":33} } } } }' curl -XGET http://192.168.133.6:9200/bjsxt/employee/_search?pretty -d ' { "query": {"bool" : { "must" : {"match": {"first_name":"bin"} }, "must_not" : {"match": {"age":33} } } } }' curl -XGET http://192.168.133.6:9200/bjsxt/employee/_search?pretty -d ' { "query": {"bool" : { "must_not" : {"match": {"first_name":"bin"} }, "must_not" : {"match": {"age":33} } } } }' ##查詢first_name=bin的,或者年齡在20歲到33歲之間的 curl -XGET http://192.168.133.6:9200/bjsxt/employee/_search -d ' { "query": {"bool" : { "must" : {"term" : { "first_name" : "bin" } } , "must_not" : {"range": {"age" : { "from" : 20, "to" : 33 } } } } } }' #修改配置 curl -XPUT 'http://192.168.133.6:9200/test2/' -d'{"settings":{"number_of_replicas":2}}' curl -XPUT 'http://192.168.133.6:9200/test3/' -d'{"settings":{"number_of_shards":3,"number_of_replicas":3}}' curl -XPUT 'http://192.168.133.6:9200/test4/' -d'{"settings":{"number_of_shards":6,"number_of_replicas":4}}' curl -XPOST http://192.168.9.11:9200/bjsxt/person/_mapping -d' { "person": { "properties": { "content": { "type": "string", "store": "no", "term_vector": "with_positions_offsets", "analyzer": "ik_max_word", "search_analyzer": "ik_max_word", "include_in_all": "true", "boost": 8 } } } }'
官網 https://www.elastic.co/guide/index.html https://www.elastic.co/guide/en/elasticsearch/client/index.html https://www.elastic.co/guide/en/elasticsearch/client/java-api/index.html https://www.elastic.co/guide/en/elasticsearch/client/java-api/2.2/transport-client.html
爬取數據,做爲document的原始文件。在linux上 yum install wget ## 以下命令爬取 http://news.cctv.com;而且按照原有網站的url目錄存儲到data下 wget -o /tmp/wget.log -P /root/data --no-parent --no-verbose -m -D news.cctv.com -N --convert-links --random-wait -A html,HTML,shtml,SHTML http://news.cctv.com 配置分詞器 https://github.com/medcl/elasticsearch-analysis-ik 版本必須與es相對應 elasticsearch-2.2.1.zip elasticsearch-analysis-ik-1.8.0.zip ## [sxt@node1 ik]$ pwd /opt/sxt/es/elasticsearch-2.2.1/plugins/ik ## 修改以下配置文件 [sxt@node1 ik]$ cat plugin-descriptor.properties | grep version= elasticsearch.version=2.2.1 ## 版本號也修改對應。 ## 啓動es. ## 運行java程序 createIndex package com.sxt.es; import java.io.File; import java.net.InetAddress; import java.util.HashMap; import java.util.Map; import org.elasticsearch.action.admin.indices.exists.indices.IndicesExistsResponse; import org.elasticsearch.action.admin.indices.mapping.put.PutMappingRequest; import org.elasticsearch.action.search.SearchResponse; import org.elasticsearch.client.Client; import org.elasticsearch.client.Requests; import org.elasticsearch.client.transport.TransportClient; import org.elasticsearch.common.settings.Settings; import org.elasticsearch.common.text.Text; import org.elasticsearch.common.transport.InetSocketTransportAddress; import org.elasticsearch.common.xcontent.XContentBuilder; import org.elasticsearch.common.xcontent.XContentFactory; import org.elasticsearch.index.query.BoolQueryBuilder; import org.elasticsearch.index.query.MatchQueryBuilder; import org.elasticsearch.index.query.MultiMatchQueryBuilder; import org.elasticsearch.index.query.MultiMatchQueryParser; import org.elasticsearch.index.query.RangeQueryBuilder; import org.elasticsearch.search.SearchHit; import org.elasticsearch.search.SearchHits; import org.junit.Test; import org.springframework.stereotype.Service; import com.sxt.util.HtmlTool; @Service public class IndexService { //存放html文件的目錄 // public static String DATA_DIR="C:\\data\\"; public static String DATA_DIR="d:\\data\\"; public static Client client; static { Settings settings = Settings.settingsBuilder() .put("cluster.name", "bjsxt-es").build(); try { client = TransportClient .builder() .settings(settings) .build() .addTransportAddress( new InetSocketTransportAddress(InetAddress .getByName("node1"), 9300)) .addTransportAddress( new InetSocketTransportAddress(InetAddress .getByName("node2"), 9300)) .addTransportAddress( new InetSocketTransportAddress(InetAddress .getByName("node3"), 9300)); } catch (Exception e) { e.printStackTrace(); } } /** * admin():管理索引庫的。client.admin().indices() * * 索引數據的管理:client.prepare * */ @Test public void createIndex() throws Exception { IndicesExistsResponse resp = client.admin().indices().prepareExists("bjsxt").execute().actionGet(); if(resp.isExists()){ client.admin().indices().prepareDelete("bjsxt").execute().actionGet(); } client.admin().indices().prepareCreate("bjsxt").execute().actionGet(); new XContentFactory(); XContentBuilder builder = XContentFactory.jsonBuilder().startObject() .startObject("htmlbean").startObject("properties") .startObject("title").field("type", "string") .field("store", "yes").field("analyzer", "ik_max_word") .field("search_analyzer", "ik_max_word").endObject() .startObject("content").field("type", "string") .field("store", "yes").field("analyzer", "ik_max_word") .field("search_analyzer", "ik_max_word").endObject() // .startObject("url").field("type", "string") // .field("store", "yes").field("analyzer", "ik_max_word") // .field("search_analyzer", "ik_max_word").endObject() .endObject().endObject().endObject(); PutMappingRequest mapping = Requests.putMappingRequest("bjsxt").type("htmlbean").source(builder); client.admin().indices().putMapping(mapping).actionGet(); } /** * 把源數據html文件添加到索引庫中(構建索引文件) */ @Test public void addHtmlToES(){ readHtml(new File(DATA_DIR)); } /** * 遍歷數據文件目錄d:/data ,遞歸方法 * @param file */ public void readHtml(File file){ if(file.isDirectory()){ File[] fs =file.listFiles(); for (int i = 0; i < fs.length; i++) { File f = fs[i]; readHtml(f); } }else{ HtmlBean bean; try { bean = HtmlTool.parserHtml(file.getPath()); if(bean!=null){ Map<String, String> dataMap =new HashMap<String, String>(); dataMap.put("title", bean.getTitle()); dataMap.put("content", bean.getContent()); dataMap.put("url", bean.getUrl()); //寫索引 client.prepareIndex("bjsxt", "htmlbean").setSource(dataMap).execute().actionGet(); } } catch (Throwable e) { e.printStackTrace(); } } } /** * 搜索 * @param kw * @param num * @return */ public PageBean<HtmlBean> search(String kw,int num,int count){ PageBean<HtmlBean> wr =new PageBean<HtmlBean>(); wr.setIndex(num); // //構建查詢條件 // MatchQueryBuilder q1 =new MatchQueryBuilder("title", kw); // MatchQueryBuilder q2 =new MatchQueryBuilder("content", kw); // // //構建一個多條件查詢對象 // BoolQueryBuilder q =new BoolQueryBuilder(); //組合查詢條件對象 // q.should(q1); // q.should(q2); // RangeQueryBuilder q1 =new RangeQueryBuilder("age"); // q1.from(18); // q1.to(40); MultiMatchQueryBuilder q =new MultiMatchQueryBuilder(kw, new String[]{"title","content"}); SearchResponse resp=null; if(wr.getIndex()==1){ resp = client.prepareSearch("bjsxt") .setTypes("htmlbean") .setQuery(q) .addHighlightedField("title") .addHighlightedField("content") .setHighlighterPreTags("<font color=\"red\">") .setHighlighterPostTags("</font>") .setHighlighterFragmentSize(40)//設置顯示結果中一個碎片斷的長度 .setHighlighterNumOfFragments(5)//設置顯示結果中每一個結果最多顯示碎片斷,每一個碎片斷之間用...隔開 .setFrom(0) .setSize(10) .execute().actionGet(); }else{ wr.setTotalCount(count); resp = client.prepareSearch("bjsxt") .setTypes("htmlbean") .setQuery(q) .addHighlightedField("title") .addHighlightedField("content") .setHighlighterPreTags("<font color=\"red\">") .setHighlighterPostTags("</font>") .setHighlighterFragmentSize(40) .setHighlighterNumOfFragments(5) .setFrom(wr.getStartRow()) .setSize(10) .execute().actionGet(); } SearchHits hits= resp.getHits(); wr.setTotalCount((int)hits.getTotalHits()); for(SearchHit hit : hits.getHits()){ HtmlBean bean =new HtmlBean(); if(hit.getHighlightFields().get("title")==null){//title中沒有包含關鍵字 bean.setTitle(hit.getSource().get("title").toString());//獲取原來的title(沒有高亮的title) }else{ bean.setTitle(hit.getHighlightFields().get("title").getFragments()[0].toString()); } if(hit.getHighlightFields().get("content")==null){//title中沒有包含關鍵字 bean.setContent(hit.getSource().get("content").toString());//獲取原來的title(沒有高亮的title) }else{ StringBuilder sb =new StringBuilder(); for(Text text: hit.getHighlightFields().get("content").getFragments()){ sb.append(text.toString()+"..."); } bean.setContent(sb.toString()); } bean.setUrl("http://"+hit.getSource().get("url").toString()); wr.setBean(bean); } return wr; } // @Test // public void del(){ //// client.admin().indices().prepareDelete("bjsxt").execute().actionGet(); // client.admin().indices().prepareDelete("bjsxt2").execute().actionGet(); // } } ## 將linux wget 爬取到的數據存放到D:\\下。 ## 運行addHtmlToES()方法,數據文檔添加到es中 ## 以下時對項目:ES_SEARCH的演示效果。
window 查看端口和pid,殺死pid C:\WINDOWS\system32>netstat -ano | findstr 8080 TCP 0.0.0.0:8080 0.0.0.0:0 LISTENING 9448 TCP [::]:8080 [::]:0 LISTENING 9448 C:\WINDOWS\system32>taskkill /PID 9448 /F