A First Look at elasticsearch

https://github.com/richardwilly98/elasticsearch-river-mongodb

https://github.com/mallocator/Elasticsearch-MySQL-River

https://github.com/BioMedCentralLtd/spring-data-elasticsearch-sample-application/blob/master/src/test/resources/springContext-book-test.xml (example)

http://www.elasticsearch.org/guide/en/elasticsearch/client/java-api/current/search.html

http://docs.spring.io/spring-data/elasticsearch/docs/current/reference/html/#repositories.create-instances.spring

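To replace the current SegmentFault search, which is not pleasant to use, I have started a preliminary evaluation of search engines. The first choice for now is elasticsearch.

elasticsearch requires a Java environment.
Install Java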

sudo aptitude install openjdk-7-jre 

Download elasticsearch
http://www.elasticsearch.org/overview/elkdownloads/
https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.0.deb

Install
My environment is Ubuntu, so I use the deb package directly.

sudo dpkg -i elasticsearch-1.1.0.deb 

Start

sudo /etc/init.d/elasticsearch start 

jdbc river
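A river imports the data to be searched, either on a schedule or in real time.
Our database is MySQL, so we use elasticsearch-river-jdbc, the JDBC river plugin maintained by jprante.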
https://github.com/jprante/elasticsearch-river-jdbc

river jdbc quickstart
https://github.com/jprante/elasticsearch-river-jdbc/wiki/Quickstart

Install

cd /usr/share/elasticsearch
sudo ./bin/plugin --install river-jdbc --url http://bit.ly/1jyXrR9

If the installation fails, download the archive manually and install it from the local file.

sudo ./bin/plugin --install river-jdbc --url file:///tmp/elasticsearch-river-jdbc-1.0.0.1.zip

Create a JDBC river

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "url" : "jdbc:mysql://localhost:3306/test",
        "user" : "root",
        "password" : "",
        "sql" : "select * from question",
        "index" : "question",
        "type" : "question"
    }
}'

Check that the import worked:

curl -XGET 'localhost:9200/question/_search?pretty&q=*' 

or open this URL in a browser:

localhost:9200/question/_search?pretty&q=*
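
The river definition sent with curl above can also be assembled in code, which makes it easier to reuse for other tables. The following is a minimal Python sketch under stated assumptions: it uses only the fields that appear in the curl body, the helper names build_river_definition and build_river_request are hypothetical, and the request is prepared but deliberately not sent.

```python
import json
import urllib.request


def build_river_definition(url, user, password, sql, index, doc_type):
    """Return the river definition as a dict with the same keys as the curl body."""
    return {
        "type": "jdbc",
        "jdbc": {
            "url": url,
            "user": user,
            "password": password,
            "sql": sql,
            "index": index,
            "type": doc_type,
        },
    }


def build_river_request(host, river_name, definition):
    """Prepare (but do not send) the PUT request that registers the river."""
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        "http://%s/_river/%s/_meta" % (host, river_name),
        data=body,
        method="PUT",
    )


definition = build_river_definition(
    url="jdbc:mysql://localhost:3306/test",
    user="root",
    password="",
    sql="select * from question",
    index="question",
    doc_type="question",
)
request = build_river_request("localhost:9200", "my_jdbc_river", definition)

print(request.get_method(), request.full_url)
print(json.dumps(definition, indent=4))
# To actually register the river, pass the request to urllib.request.urlopen().
```

Keeping the definition as a plain dict means the SQL statement and target index can be changed without hand-editing a quoted JSON string in the shell.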

elasticsearch ships with official Chinese word segmentation, but it is not very accurate, so here I use medcl's ik analyzer.
Install elasticsearch-analysis-ik

cd /tmp
wget https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
unzip master.zip
cd elasticsearch-analysis-ik-master/

Build elasticsearch-analysis-ik-1.2.6.jar with the mvn package command

mvn package 

If Maven is not installed, install it first

sudo aptitude install maven 

Copy elasticsearch-analysis-ik-1.2.6.jar (Maven writes build output to the target/ directory by default) to ES_HOME/plugins/analysis-ik

sudo cp elasticsearch-analysis-ik-1.2.6.jar /usr/share/elasticsearch/plugins/analysis-ik 

Copy the ik configuration and dictionaries to ES_HOME/config

sudo cp -R ik /etc/elasticsearch 

Enable ik in the elasticsearch configuration

sudo vim /etc/elasticsearch/elasticsearch.yml

Add one line at the bottom

index.analysis.analyzer.ik.type : 'ik' 

Restart the service to load the configuration

sudo service elasticsearch restart

Test the analyzer

localhost:9200/question/_analyze?analyzer=ik&pretty=true&text=杭州堆棧科技有限公司 

Response

{
  "tokens" : [
    { "token" : "杭州", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 1 },
    { "token" : "堆棧", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 2 },
    { "token" : "科技", "start_offset" : 4, "end_offset" : 6, "type" : "CN_WORD", "position" : 3 },
    { "token" : "有限公司", "start_offset" : 6, "end_offset" : 10, "type" : "CN_WORD", "position" : 4 }
  ]
}

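In the response above, start_offset and end_offset point into the analyzed text, so each token can be checked by slicing the input string. The Python sketch below does that for this exact response. The helper names are illustrative, and the second check (that the tokens rebuild the whole input) is an assumption that holds for this response only, since an analyzer may in general drop characters or emit overlapping tokens.

```python
import json

text = "杭州堆棧科技有限公司"

# The response shown above, pasted as a JSON string.
response = json.loads("""
{
  "tokens": [
    {"token": "杭州", "start_offset": 0, "end_offset": 2, "type": "CN_WORD", "position": 1},
    {"token": "堆棧", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 2},
    {"token": "科技", "start_offset": 4, "end_offset": 6, "type": "CN_WORD", "position": 3},
    {"token": "有限公司", "start_offset": 6, "end_offset": 10, "type": "CN_WORD", "position": 4}
  ]
}
""")


def offset_mismatches(text, tokens):
    """Return the tokens whose offsets do not slice back to the token text."""
    return [
        t for t in tokens
        if text[t["start_offset"]:t["end_offset"]] != t["token"]
    ]


def rebuilds_input(text, tokens):
    """True when the tokens, joined in position order, equal the whole input."""
    ordered = sorted(tokens, key=lambda t: t["position"])
    return "".join(t["token"] for t in ordered) == text


tokens = response["tokens"]
mismatches = offset_mismatches(text, tokens)
complete = rebuilds_input(text, tokens)

print([t["token"] for t in tokens])
print("offset mismatches:", len(mismatches))
print("tokens rebuild the input:", complete)
```

A check like this is a quick way to compare two analyzers on the same sample sentences before reindexing.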
Test the official PHP client
The official PHP client is installed through composer.
Install composer first

curl -s http://getcomposer.org/installer | php
sudo mv composer.phar /usr/bin/composer

Create a composer.json with the following content

{
    "require": {
        "elasticsearch/elasticsearch": "~1.0"
    }
}

Run the installation

composer install --no-dev 

Once the autoloader is required in the project, the client is ready to use

<?php
require 'vendor/autoload.php';

$client = new Elasticsearch\Client();

A mapping that uses the ik Chinese analyzer

$params['index'] = 'question';
$params['type'] = 'question';

$myTypeMapping = array(
    '_source' => array(
        'enabled' => true
    ),
    '_all' => array(
        'indexAnalyzer' => 'ik',
        'searchAnalyzer' => 'ik',
        'term_vector' => 'no',
        'store' => 'false'
    ),
    'properties' => array(
        'text' => array(
            'type' => 'string',
            'term_vector' => 'with_positions_offsets',
            'indexAnalyzer' => 'ik',
            'searchAnalyzer' => 'ik',
            'include_in_all' => 'true',
            'boost' => 8
        ),
        'title' => array(
            'type' => 'string',
            'term_vector' => 'with_positions_offsets',
            'indexAnalyzer' => 'ik',
            'searchAnalyzer' => 'ik',
            'include_in_all' => 'true',
            'boost' => 8
        )
    )
);

$params['body']['question'] = $myTypeMapping;
$response = $client->indices()->putMapping($params);
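
The text and title fields in the mapping above carry identical settings, so the body can be generated from a single helper instead of being written out twice. This Python sketch mirrors the PHP array key for key and only builds and prints the JSON body; it does not contact elasticsearch. The helper names ik_string_field and build_question_mapping are hypothetical.

```python
import json


def ik_string_field(boost):
    """Settings shared by every ik-analyzed string field in the mapping above."""
    return {
        "type": "string",
        "term_vector": "with_positions_offsets",
        "indexAnalyzer": "ik",
        "searchAnalyzer": "ik",
        "include_in_all": "true",
        "boost": boost,
    }


def build_question_mapping(field_names, boost=8):
    """Build the type mapping with one ik-analyzed property per field name."""
    return {
        "_source": {"enabled": True},
        "_all": {
            "indexAnalyzer": "ik",
            "searchAnalyzer": "ik",
            "term_vector": "no",
            "store": "false",
        },
        "properties": {name: ik_string_field(boost) for name in field_names},
    }


# Same shape as $params['body'] in the PHP snippet: the type name is the top-level key.
body = {"question": build_question_mapping(["text", "title"])}
print(json.dumps(body, indent=2))
```

Adding another analyzed field then only requires one more entry in the list of field names.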

Try it out by searching the question text for 'php框架' (PHP framework)

$searchParams = array();
$searchParams['index'] = 'question';
$searchParams['type'] = 'question';
$searchParams['body']['query']['match']['text'] = 'php框架';

// Run the query with the client created earlier
$response = $client->search($searchParams);
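
For comparison, the nested PHP array above corresponds to a small JSON request body. The Python sketch below builds that body and the index/type search path, which is handy for replaying the same query with curl. The helper name build_match_search is hypothetical, and the path assumes the index/type URL form used by the elasticsearch version in this post.

```python
import json


def build_match_search(index, doc_type, field, text):
    """Return the search path and the match-query body for one field."""
    path = "/%s/%s/_search" % (index, doc_type)
    body = {"query": {"match": {field: text}}}
    return path, body


path, body = build_match_search("question", "question", "text", "php框架")

print(path)
# ensure_ascii=False keeps the Chinese query text readable in the output.
print(json.dumps(body, ensure_ascii=False))
```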

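Results (a few of the hits)

  1. segmentfault是用什麼php框架寫的啊? (Which PHP framework is segmentfault written in?) http://segmentfault.com/q/1010000000095152
  2. 剛學PHP,求介紹THINKPHP框架的優劣 (New to PHP: what are the pros and cons of the THINKPHP framework?) http://segmentfault.com/q/1010000000129535
  3. symfony是否是比其餘的php框架功能強大不少? (Is symfony much more powerful than other PHP frameworks?) http://segmentfault.com/q/1010000000095952
  4. python 框架繁多,如何總體把握? (Python has many frameworks: how do I get an overall grasp of them?) http://segmentfault.com/q/1010000000186319
  5. 什麼是ORM,以及在php上的使用? (What is ORM, and how is it used in PHP?) http://segmentfault.com/q/1010000000318125
  6. codeigniter框架php依賴安裝問題 (Problem installing PHP dependencies for the codeigniter framework) http://segmentfault.com/q/1010000000253966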

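Summary: elasticsearch is very easy to install and use, and even the untuned results are better than what the existing search returns. The one drawback is that its documentation is still thin compared with Solr's, and much of it gives only the most basic examples. Tuning, combined queries, and similar topics have to be worked out and looked up on your own.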