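Using ElasticSearch

Before installing, consult https://github.com/richardwilly98/elasticsearch-river-mongodb to determine which elasticsearch version and plugin version you need for your MongoDB version.

1) Install ES

Download elasticsearch-<version>.tar.gz from the official website, then unpack it: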

tar -zvxf elasticsearch-1.1.0.tar.gz 
cd elasticsearch-1.1.0

Optionally install the Marvel plugin, which is used for monitoring:

./bin/plugin -i elasticsearch/marvel/latest

For more about this plugin, see the official documentation:

http://www.elasticsearch.org/guide/en/marvel/current/index.html

 

2) Run the program

./bin/elasticsearch

If you see output like the following, startup succeeded:

[2014-04-09 10:12:41,414][INFO ][node                     ] [Lorna Dane] version[1.1.0], pid[839], build[2181e11/2014-03-25T15:59:51Z]
[2014-04-09 10:12:41,415][INFO ][node                     ] [Lorna Dane] initializing ...
[2014-04-09 10:12:41,431][INFO ][plugins                  ] [Lorna Dane] loaded [], sites []
[2014-04-09 10:12:44,383][INFO ][node                     ] [Lorna Dane] initialized
[2014-04-09 10:12:44,384][INFO ][node                     ] [Lorna Dane] starting ...
[2014-04-09 10:12:44,495][INFO ][transport                ] [Lorna Dane] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/XXXXXX:9300]}
[2014-04-09 10:12:47,522][INFO ][cluster.service          ] [Lorna Dane] new_master [Lorna Dane][Ml-gTu_ZTniHR2mkpbMQ_A][XXXXX][inet[/XXXXXX:9300]], reason: zen-disco-join (elected_as_master)
[2014-04-09 10:12:47,545][INFO ][discovery                ] [Lorna Dane] elasticsearch/Ml-gTu_ZTniHR2mkpbMQ_A
[2014-04-09 10:12:47,572][INFO ][http                     ] [Lorna Dane] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/XXXXX:9200]}
[2014-04-09 10:12:47,607][INFO ][gateway                  ] [Lorna Dane] recovered [0] indices into cluster_state
[2014-04-09 10:12:47,607][INFO ][node                     ] [Lorna Dane] started

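To run it in the background, execute:

./bin/elasticsearch -d

To confirm that the process is running, run:

lsof -i:9200
lsof -i:9300

Port 9200 is the port on which the node serves external clients, and port 9300 is the port for communication between nodes (if there is a cluster).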
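
The same check can be scripted. The following is a minimal Python sketch, assuming the node runs on localhost with the two ports mentioned above; the helper name port_is_open is made up for illustration.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 9200: client-facing port, 9300: node-to-node port (assumed defaults).
    for port in (9200, 9300):
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"port {port}: {state}")
```

An open port only shows that something is listening there; the cluster health request further below is a stronger check.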

 

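3) Build a cluster

The configuration file path is:

.....(your actual path)/config/elasticsearch.yml

By default, every configuration item is commented out.

After my changes, the settings are as follows:

cluster.name: ctoes   # name of the cluster
node.name: "QiangZiGeGe"   # name of the node; note the double quotes
bootstrap.mlockall: true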


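Settings not mentioned here keep their default values. How to set each parameter depends on your specific situation.

After making the changes, start es. If the printed messages include the names of other nodes, the cluster has been established.

Note: es automatically discovers nodes with the same cluster name on the local network.

To check the cluster status, use: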

curl 'http://localhost:9200/_cluster/health?pretty'
The response is as follows:
{
  "cluster_name" : "ctoes",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
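
A script can read the health response instead of a person. The sketch below parses the sample response shown above with Python's json module; the function name summarize_health and the expected_nodes parameter are illustrative assumptions, not part of any Elasticsearch client.

```python
import json

# Sample response copied from the health check above.
SAMPLE = """
{
  "cluster_name" : "ctoes",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
"""


def summarize_health(raw, expected_nodes):
    """Reduce a _cluster/health response to a few yes/no answers."""
    health = json.loads(raw)
    return {
        "cluster": health["cluster_name"],
        "healthy": health["status"] == "green" and not health["timed_out"],
        "all_nodes_present": health["number_of_nodes"] >= expected_nodes,
        "unassigned_shards": health["unassigned_shards"],
    }


if __name__ == "__main__":
    print(summarize_health(SAMPLE, expected_nodes=2))
```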

Next, let's use it to get a hands-on feel.

4) Try it out as a database

Create an index (equivalent to creating a database)

An example:

[deployer@XXXXXXX0013 ~]$ curl -XPUT 'http://localhost:9200/test1?pretty' -d'
> {
>  "settings":{
> "number_of_shards":2,
> "number_of_replicas":1
> }
> }
> '
{
  "acknowledged" : true
}

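Note: number_of_shards is set once at creation and can never be changed afterwards, whereas number_of_replicas can be modified later.

The test1 in the URL above is the name of the index (database) being created; change it as needed.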
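
The same request can be prepared from Python. This is a sketch using only the standard library, assuming the node is reachable at http://localhost:9200; build_create_index_request is a hypothetical helper name. It only builds the request, and the lines that would send it are left commented out because they need a running node.

```python
import json
import urllib.request


def build_create_index_request(base_url, index, shards, replicas):
    """Build (but do not send) the PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,      # fixed once the index exists
            "number_of_replicas": replicas,  # can be changed later
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
    )


if __name__ == "__main__":
    req = build_create_index_request("http://localhost:9200", "test1", 2, 1)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it (requires a running node):
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```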

Create a document

curl -XPUT 'http://localhost:9200/test1/table1/1' -d '
{ "first":"dewmobile",
"last":"technology",
"age":3000,
"about":"hello,world",
"interest":["basketball","music"]
}
'
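The response is as follows:
{"_index":"test1","_type":"table1","_id":"1","_version":1,"created":true}

This indicates that the document was created successfully.

test1: the name of the database (index) that was created

table1: the name of the type that was created; a type corresponds to a table in a relational database

1: the document's primary key, chosen by you; you can also omit it and let the database assign one.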
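
The three URL parts and the optional primary key can be captured in a small helper. The sketch below is illustrative only: document_endpoint and was_created are made-up names, and the base URL is an assumption. When no id is given, it uses POST on the type path, which is how Elasticsearch is asked to generate the id itself.

```python
import json


def document_endpoint(base_url, index, doc_type, doc_id=None):
    """Return (HTTP method, URL) for indexing one document."""
    if doc_id is None:
        # No id given: POST to the type and let Elasticsearch generate one.
        return "POST", f"{base_url}/{index}/{doc_type}"
    # Explicit id: PUT to the full document path, as in the curl call above.
    return "PUT", f"{base_url}/{index}/{doc_type}/{doc_id}"


def was_created(raw_response):
    """Return True if the index response reports a newly created document."""
    return json.loads(raw_response).get("created") is True


if __name__ == "__main__":
    print(document_endpoint("http://localhost:9200", "test1", "table1", 1))
    print(document_endpoint("http://localhost:9200", "test1", "table1"))
```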

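5) Install the database synchronization plugins

Because our data source lives in MongoDB, this section covers only data synchronization from a MongoDB data source.

Plugin source: https://github.com/richardwilly98/elasticsearch-river-mongodb/

MongoDB River Plugin (author: Richard Louapre)

Overview: a mongodb synchronization plugin. mongodb must be deployed as a replica set, because the plugin works by periodically reading the oplog in mongodb to synchronize data.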
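
To make the oplog idea concrete, here is a toy Python model of replaying oplog-style entries into an in-memory "index". It is not the plugin's code. The entry fields op, o, and o2 follow the shape of MongoDB oplog entries, but the integer ts values, the $set-only update handling, and the function names are simplifying assumptions.

```python
def apply_oplog_entry(index, entry):
    """Apply one oplog-style entry to an in-memory index keyed by _id."""
    op, doc = entry["op"], entry["o"]
    if op == "i":
        # Insert: "o" holds the complete new document.
        index[doc["_id"]] = dict(doc)
    elif op == "u":
        # Update: "o2" identifies the target, "o" holds the change.
        doc_id = entry["o2"]["_id"]
        target = index.setdefault(doc_id, {"_id": doc_id})
        target.update(doc.get("$set", {}))
    elif op == "d":
        # Delete: "o" holds the _id of the document to remove.
        index.pop(doc["_id"], None)
    return index


def sync(index, oplog, last_ts):
    """Replay entries newer than last_ts; return the newest timestamp seen.

    Assumes the oplog list is ordered by ts, oldest first.
    """
    for entry in oplog:
        if entry["ts"] > last_ts:
            apply_oplog_entry(index, entry)
            last_ts = entry["ts"]
    return last_ts


if __name__ == "__main__":
    oplog = [
        {"ts": 1, "op": "i", "o": {"_id": "a", "name": "x"}},
        {"ts": 2, "op": "u", "o2": {"_id": "a"}, "o": {"$set": {"name": "y"}}},
        {"ts": 3, "op": "i", "o": {"_id": "b", "name": "z"}},
        {"ts": 4, "op": "d", "o": {"_id": "b"}},
    ]
    index = {}
    checkpoint = sync(index, oplog, last_ts=0)
    print(checkpoint, index)
```

Remembering the last timestamp is what lets a periodic reader pick up only the changes made since its previous pass.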

 

How do you install and use it? Two plugins need to be installed.

1) Plugin 1

./bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/2.0.0

 

2) Plugin 2

./bin/plugin --install com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.0

The installation proceeds as follows:

./bin/plugin --install com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.0
-> Installing com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.0...
Trying http://download.elasticsearch.org/com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/elasticsearch-river-mongodb-2.0.0.zip...
Trying http://search.maven.org/remotecontent?filepath=com/github/richardwilly98/elasticsearch/elasticsearch-river-mongodb/2.0.0/elasticsearch-river-mongodb-2.0.0.zip...
Trying https://oss.sonatype.org/service/local/repositories/releases/content/com/github/richardwilly98/elasticsearch/elasticsearch-river-mongodb/2.0.0/elasticsearch-river-mongodb-2.0.0.zip...
Downloading .............................................................................................DONE
Installed com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/2.0.0 into /usr/local/elasticsearch_1.1.0/elasticsearch/elasticsearch-1.1.0/plugins/river-mongodb

3) Install the elasticsearch MySQL plugin

For details, see the following page, where the binary jar can be downloaded directly:

https://github.com/jprante/elasticsearch-river-jdbc

4) Install the MySQL driver jar (required!)

With that, the plugins are installed.

6) Use the plugin to tell ES to add a database-monitoring task

The template is as follows:

 

curl -XPUT localhost:9200/_river/mongo_resource/_meta -d '
{
"type":"mongodb",
"mongodb":{
"servers":
[{"host":"10.XX.XX.XX","port":"60004"}
],
"db":"zapya_api",
"collection":"resources"
},
"index":{
"name":"mongotest",
"type":"resources"
}}'

 

If you see the following, the task was created successfully:
{"_index":"_river","_type":"mongodb","_id":"_meta","_version":1,"created":true}

The data is then imported into es and the index is built.
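
When several collections need the same kind of task, the template above can be generated rather than hand-edited. The following Python sketch builds the same JSON body and URL; the function names are hypothetical, and the host value is the masked placeholder from the template, not a real address.

```python
import json


def mongodb_river_meta(host, port, db, collection, index_name, index_type):
    """Build the _meta body for a MongoDB river, mirroring the template above."""
    return {
        "type": "mongodb",
        "mongodb": {
            "servers": [{"host": host, "port": port}],
            "db": db,
            "collection": collection,
        },
        "index": {"name": index_name, "type": index_type},
    }


def river_meta_url(base_url, river_name):
    """URL that a river definition is PUT to."""
    return f"{base_url}/_river/{river_name}/_meta"


if __name__ == "__main__":
    body = mongodb_river_meta(
        "10.XX.XX.XX", "60004", "zapya_api", "resources", "mongotest", "resources"
    )
    print(river_meta_url("http://localhost:9200", "mongo_resource"))
    print(json.dumps(body, indent=2))
```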

~~~~~~~~~~~~~~~~

To import from mysql, the template is as follows:

[deployer@XXX0014 ~]$ curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
> "type":"jdbc",
> "jdbc":{
> "url":"jdbc:mysql://localhost:3306/fastooth",
> "user":"XXX",
> "password":"XXX",
> "sql":"select *,base62Decode(display_name) as name from users"
> }
> }
> '

In more detail, the available parameters are:

{
    "jdbc" :{
        "strategy" : "simple",
        "url" : null,
        "user" : null,
        "password" : null,
        "sql" : null,
        "schedule" : null,
        "poolsize" : 1,
        "rounding" : null,
        "scale" : 2,
        "autocommit" : false,
        "fetchsize" : 10, /* Integer.MIN for MySQL */
        "max_rows" : 0,
        "max_retries" : 3,
        "max_retries_wait" : "30s",
        "locale" : Locale.getDefault().toLanguageTag(),
        "index" : "jdbc",
        "type" : "jdbc",
        "bulk_size" : 100,
        "max_bulk_requests" : 30,
        "bulk_flush_interval" : "5s",
        "index_settings" : null,
        "type_mapping" : null
    }
}

The schedule parameter sets the times at which the task runs.

Format reference: http://www.quartz-scheduler.org/documentation/quartz-1.x/tutorials/crontrigger

https://github.com/jprante/elasticsearch-river-jdbc/issues/186
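
As an illustration of the schedule parameter, the Python sketch below builds a JDBC river body with a Quartz-style cron expression. Quartz expressions have six fields (seconds, minutes, hours, day of month, month, day of week) plus an optional seventh for the year, so "0 0/5 * * * ?" means every five minutes. The function name and the field-count check are assumptions added for illustration; the plugin itself decides what it accepts.

```python
import json


def jdbc_river_meta(url, user, password, sql, schedule=None):
    """Build the _meta body for a JDBC river, optionally with a schedule."""
    jdbc = {"url": url, "user": user, "password": password, "sql": sql}
    if schedule is not None:
        # Quartz cron: 6 fields, or 7 with the optional year field.
        if len(schedule.split()) not in (6, 7):
            raise ValueError(f"not a Quartz cron expression: {schedule!r}")
        jdbc["schedule"] = schedule
    return {"type": "jdbc", "jdbc": jdbc}


if __name__ == "__main__":
    body = jdbc_river_meta(
        "jdbc:mysql://localhost:3306/fastooth",
        "XXX",
        "XXX",
        "select *,base62Decode(display_name) as name from users",
        schedule="0 0/5 * * * ?",  # every five minutes
    )
    print(json.dumps(body, indent=2))
```

A five-field Unix cron line such as "*/5 * * * *" is rejected by this check, which is a common mistake when moving from crontab to Quartz.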

Official documentation:

http://elasticsearch-users.115913.n3.nabble.com/Ann-JDBC-River-Plugin-for-ElasticSearch-td4019418.html

https://github.com/jprante/elasticsearch-river-jdbc/wiki/JDBC-River-parameters

https://github.com/jprante/elasticsearch-river-jdbc/wiki/Quickstart (includes how to delete a task)

Appendix: http://my.oschina.net/wenhaowu/blog/215219#OSC_h2_7

 

During testing, the following error may occur:

[7]: index [yyyy], type [rrrr], id [1964986], message [RemoteTransportException[[2sdfsdf][inet[/xxxxxxxxxx:9300]][bulk/shard]]; nested: EsRejectedExecutionException[rejected execution (queue capacity 50) on org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1@3e82ee89]; ]

 

Modify the configuration file by appending the following at the end:

threadpool:
    bulk:
        type: fixed
        size: 60
        queue_size: 1000

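The error above reports a bulk queue capacity of 50, which queue_size raises here. The exact meaning of each parameter is left for the reader to look up.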
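
The rejection itself can be reproduced with a toy model. The Python sketch below offers a burst of tasks to a bounded queue that nothing is draining and counts how many are refused. It is a deliberate simplification: a real node has worker threads consuming the queue, and the burst size of 200 is an arbitrary assumption.

```python
import queue


def count_rejections(queue_size, burst):
    """Offer `burst` tasks to a bounded queue with no consumer; count rejects."""
    q = queue.Queue(maxsize=queue_size)
    rejected = 0
    for task in range(burst):
        try:
            q.put_nowait(task)
        except queue.Full:
            # Analogous to the "rejected execution" error in the log above.
            rejected += 1
    return rejected


if __name__ == "__main__":
    print(count_rejections(queue_size=50, burst=200))    # 150
    print(count_rejections(queue_size=1000, burst=200))  # 0
```

A larger queue absorbs bigger bursts at the cost of memory, so sending smaller or slower bulk requests is the other side of the same trade-off.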

References:

http://stackoverflow.com/questions/20683440/elasticsearch-gives-error-about-queue-size

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-threadpool.html

 

~~~~~~~~~~~~~~~

As for the client, we use the Play framework. Just as every database needs a driver package, we found this one on the official website:

https://github.com/cleverage/play2-elasticsearch

For Chinese word segmentation, you can try Ansj.

~~~~~~~~~~~~~~~~~~~~~

Creating an index:

curl -i -XPUT  'XXX:9200/fasth' -d '
{
   "settings" :
   {
      "number_of_shards" : 3 ,
      "number_of_replicas" : 1
   }
  
}
'

~~~~~~~~~~~

Create a mapping

 

curl -i -XPUT  'http://localhost:9200/fa/users/_mapping' -d '
{

 "properties":
 {
  "_id":
  { 
  "type":"string",
  "index":"not_analyzed"
  },
  "name":
  {
  "type":"string"
  },
  "gender":
  {
  "type":"string",
  "index":"not_analyzed"
  },
  "primary_avatar":
  {
  "type":"string",
  "index":"not_analyzed"
  },
  "signature":
  {
  "type":"string",
  "index":"not_analyzed"
  }
 }

}
'
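
The mapping above repeats one pattern: every field is a string, and all but name are marked not_analyzed, meaning the value is indexed as a single exact term rather than split into words. A body like that can be generated from two lists. The Python sketch below does so; build_mapping is a hypothetical helper, not part of any client library.

```python
import json


def build_mapping(analyzed, not_analyzed):
    """Build a string-field mapping body from two lists of field names."""
    # Fields in `analyzed` are tokenized for full-text search.
    props = {name: {"type": "string"} for name in analyzed}
    # Fields in `not_analyzed` are indexed as exact values.
    for name in not_analyzed:
        props[name] = {"type": "string", "index": "not_analyzed"}
    return {"properties": props}


if __name__ == "__main__":
    body = build_mapping(
        analyzed=["name"],
        not_analyzed=["_id", "gender", "primary_avatar", "signature"],
    )
    print(json.dumps(body, indent=1))
```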


 

Full-import task:

curl -XPUT  'xxx:9200/_river/mysql_users/_meta' -d '
{
 "type":"jdbc",
 "jdbc":
 {
 "url":"jdbc:mysql://XXX:3306/fastooth",
 "user":"XXX",
 "password":"XXX",
 "sql":"select distinct _id,base62Decode(display_name) as name,gender,primary_avatar,signature from users",
 "index":"XXX",
 "type":"XXX"
 }
}
'

 http://www.nosqldb.cn/1368777378160.html
