Environment: CentOS 7.3, Solr 6.6, ZooKeeper 3.4, Tomcat 8.5, JDK 1.8
For ZooKeeper deployment, see: http://www.cnblogs.com/Sunzz/p/8464284.html
[root@solr_1 ~]# tar -xf apache-tomcat-8.5.23.tar.gz -C /opt/
[root@solr_1 ~]# tar -xf solr-6.6.2.tgz -C /opt/
[root@solr_1 ~]# cd /opt
[root@solr_1 opt]# ln -sv apache-tomcat-8.5.23 tomcat
[root@solr_1 opt]# ln -sv solr-6.6.2 solr
[root@solr_1 ~]# cp -r /opt/solr/server/solr-webapp/webapp/ /opt/tomcat/webapps/
[root@solr_1 ~]# mv /opt/tomcat/webapps/webapp /opt/tomcat/webapps/solr
Copy the following JARs into /opt/tomcat/webapps/solr/WEB-INF/lib:
① the JARs under /opt/solr/server/lib/ext;
② the five JARs under /opt/solr/server/lib whose names start with metrics (metrics-core-3.2.2.jar, metrics-ganglia-3.2.2.jar, metrics-graphite-3.2.2.jar, metrics-jetty9-3.2.2.jar, metrics-jvm-3.2.2.jar);
③ the solr-dataimporthandler and solr-dataimporthandler-extras JARs under /opt/solr/dist.
[root@solr_1 ~]# cp /opt/solr/server/lib/ext/*.jar /opt/solr/server/lib/metrics*.jar /opt/solr/dist/solr-dataimporthandler-*.jar /opt/tomcat/webapps/solr/WEB-INF/lib/
[root@solr_1 ~]# mkdir /opt/solr/solr-home
[root@solr_1 ~]# cp -r /opt/solr/server/solr/* /opt/solr/solr-home/
① Find <env-entry>, uncomment it, and set env-entry-value to the solr-home path.
Command:
[root@solr_1 ~]# vim /opt/tomcat/webapps/solr/WEB-INF/web.xml
After the change:
<env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-value>/opt/solr/solr-home</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
② Remove the access restriction, otherwise requests to Solr fail with an "unauthorized" error. Comment out both security-constraint elements.
After the change:
<!--
<security-constraint>
    <web-resource-collection>
        <web-resource-name>Disable TRACE</web-resource-name>
        <url-pattern>/</url-pattern>
        <http-method>TRACE</http-method>
    </web-resource-collection>
    <auth-constraint/>
</security-constraint>
<security-constraint>
    <web-resource-collection>
        <web-resource-name>Enable everything but TRACE</web-resource-name>
        <url-pattern>/</url-pattern>
        <http-method-omission>TRACE</http-method-omission>
    </web-resource-collection>
</security-constraint>
-->
Create a classes directory under WEB-INF and copy /opt/solr/server/resources/log4j.properties into it.
Command:
[root@solr_1 ~]# cd /opt/tomcat/webapps/solr/WEB-INF/
[root@solr_1 WEB-INF]# mkdir classes
[root@solr_1 WEB-INF]# cp -rf /opt/solr/server/resources/log4j.properties ./classes/
Create a collection1 folder under solr-home, copy the conf folder from /opt/solr/server/solr/configsets/basic_configs into it, and create a data folder under collection1.
[root@solr_1 ~]# mkdir /opt/solr/solr-home/collection1
[root@solr_1 ~]# cp -r /opt/solr/server/solr/configsets/basic_configs/conf/ /opt/solr/solr-home/collection1/
[root@solr_1 ~]# mkdir /opt/solr/solr-home/collection1/data
In collection1, create a core.properties file with the following content:
[root@solr_1 ~]# vim /opt/solr/solr-home/collection1/core.properties
name=collection1
config=solrconfig.xml
schema=managed-schema
dataDir=data
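The manual steps for the core directory can be collected into one helper. This is only a minimal sketch: the function name make_core and the scratch directory are assumptions for illustration, and it creates just the data directory and core.properties (copying conf is left to the cp command shown earlier).

```shell
#!/usr/bin/env bash
# Hypothetical helper: create <solr_home>/<core>/data and write core.properties.
make_core() {
    local solr_home="$1" core="$2"
    mkdir -p "$solr_home/$core/data"
    # The same four keys as the core.properties shown above.
    printf '%s\n' \
        "name=$core" \
        "config=solrconfig.xml" \
        "schema=managed-schema" \
        "dataDir=data" > "$solr_home/$core/core.properties"
}

# Demo against a scratch directory rather than /opt/solr/solr-home.
demo_home="$(mktemp -d)"
make_core "$demo_home" collection1
cat "$demo_home/collection1/core.properties"
```

Pointing the first argument at /opt/solr/solr-home would produce the same layout as the manual commands.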
In /opt/solr/solr-home/solr.xml, set hostPort to Tomcat's port (8080 here):
<int name="hostPort">${jetty.port:8080}</int>
Open the following address in a browser: http://192.168.29.110:8080/solr/index.html
[root@solr_1 ~]# unzip ikanalyzer-solr6.5.zip
[root@solr_1 ~]# mv ikanalyzer-solr6.5 /opt/
Copy ext.dic, IKAnalyzer.cfg.xml and stopword.dic into /opt/tomcat/webapps/solr/WEB-INF/classes (mkdir -p is used because the directory may already exist from the log4j step):
[root@solr_1 ~]# mkdir -p /opt/tomcat/webapps/solr/WEB-INF/classes
[root@solr_1 ~]# cp /opt/ikanalyzer-solr6.5/ikanalyzer-solr5/ext.dic /opt/ikanalyzer-solr6.5/ikanalyzer-solr5/IKAnalyzer.cfg.xml /opt/ikanalyzer-solr6.5/ikanalyzer-solr5/stopword.dic /opt/tomcat/webapps/solr/WEB-INF/classes
Copy ik-analyzer-solr5-5.x.jar and solr-analyzer-ik-5.1.0.jar into /opt/tomcat/webapps/solr/WEB-INF/lib:
[root@solr_1 ~]# cp /opt/ikanalyzer-solr6.5/ikanalyzer-solr5/*.jar /opt/tomcat/webapps/solr/WEB-INF/lib/
[root@solr_1 ~]# vim /opt/solr/solr-home/collection1/conf/managed-schema
Add the following configuration before </schema>:
<!-- IK tokenizer -->
<fieldType name="text_ik" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="false"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
    </analyzer>
</fieldType>
Open http://192.168.29.110:8080/solr/index.html to verify.
Copy pinyinTokenFilter-1.1.0-RELEASE.jar, pinyinAnalyzer4.3.1.jar and pinyin4j-2.5.0.jar into /opt/tomcat/webapps/solr/WEB-INF/lib:
[root@solr_1 ~]# cp /opt/ikanalyzer-solr6.5/pinyin* /opt/tomcat/webapps/solr/WEB-INF/lib/
(After the change)
<!-- IK tokenizer -->
<fieldType name="text_ik" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="false"/>
        <filter class="top.pinyin.index.solr.PinyinTokenFilterFactory" pinyin="true" isFirstChar="true" minTermLenght="2"/>
        <filter class="com.shentong.search.analyzers.PinyinNGramTokenFilterFactory" minGram="2" maxGram="20"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
    </analyzer>
</fieldType>
(The two pinyin filter elements in the index analyzer are the additions.)
Restart Tomcat and test.
<fieldType name="text_ik" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="false"/>
        <filter class="top.pinyin.index.solr.PinyinTokenFilterFactory" pinyin="true" isFirstChar="true" minTermLenght="2"/>
        <filter class="com.shentong.search.analyzers.PinyinNGramTokenFilterFactory" minGram="2" maxGram="20"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
hell,二是
誅仙,誅仙2,夢幻誅仙
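First, add the field that suggestions will be drawn from. Assuming we want suggestions on the name field, the configuration is as follows (managed-schema):
<field name="name" type="text_ik" multiValued="false" indexed="true" stored="true"/>
<field name="suggestion" type="text_suggest" indexed="true" stored="true" multiValued="true"/>
<copyField source="name" dest="suggestion"/>
The suggestion field is the one the suggester reads from. It is given the type text_suggest, a custom field type whose purpose and configuration are covered next, and copyField copies name into suggestion. Why not suggest on name directly, instead of creating a dedicated field (and even a dedicated field type) to copy it into? As the configuration shows, name is tokenized by IKAnalyzer for Chinese word segmentation, so suggesting directly on name would return the segmented tokens rather than the whole value. For example, where the expected suggestion is the full record 「先吃水果然後吃雪糕」 ("eat fruit first, then ice cream"), the result would be just 「先吃」 ("eat first").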
Next, create a dedicated field type for the suggest module to work with. The field type is named text_suggest and is configured as follows (managed-schema):
<fieldType name="text_suggest" class="solr.TextField">
    <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
Because we want suggestions on the whole field value, KeywordTokenizerFactory is used as the tokenizer (it emits the entire value as a single token), and LowerCaseFilterFactory makes matching case-insensitive. Substitute your own analyzer as needed.
With the schema in place, we can configure the suggest module.
First, add the suggest component. Edit solrconfig.xml and add the following:
<searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
        <str name="name">suggest</str>
        <str name="lookupImpl">AnalyzingLookupFactory</str>
        <str name="dictionaryImpl">DocumentDictionaryFactory</str>
        <str name="field">suggestion</str>
        <str name="suggestAnalyzerFieldType">text_suggest</str>
        <str name="buildOnStartup">false</str>
    </lst>
</searchComponent>
Notes on this configuration:
name: the name of this suggester;
lookupImpl: the lookup implementation; the default is JaspellLookupFactory;
dictionaryImpl: the dictionary implementation;
field: the field to suggest on;
suggestAnalyzerFieldType: the field type whose analyzer is used for suggestions (required);
buildOnStartup: whether to build the suggester index at startup.
See https://cwiki.apache.org/confluence/display/solr/Suggester for the full set of options.
Next, configure the requestHandler for the suggest module. Edit solrconfig.xml and add the following:
<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
    <lst name="defaults">
        <str name="suggest">true</str>
        <str name="suggest.dictionary">suggest</str>
        <str name="suggest.count">10</str>
    </lst>
    <arr name="components">
        <str>suggest</str>
    </arr>
</requestHandler>
The parameters are as follows. suggest must be true;
suggest.dictionary is the dictionary the suggest operation uses, and must match the name given in the suggester configuration above;
suggest.count is the number of candidates to return, 10 here.
The full reference is in the Solr guide: https://lucene.apache.org/solr/guide/6_6/suggester.html
That completes the suggest configuration. If buildOnStartup is set to false in the suggester configuration, the index has to be built manually once. The build URL looks like this:
http://192.168.29.110:8080/solr/collection1/suggest?suggest=true&suggest.dictionary=suggest&wt=json&suggest.q=Ath&suggest.build=true
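The build and query URLs differ only in the suggest.build flag, which a small helper can make explicit. This is a sketch under assumptions: the function name suggest_url is made up for illustration, and the query prefix is assumed to be plain ASCII (no URL encoding is done).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the /suggest URL for a core.
# $1 = host:port, $2 = core name, $3 = query prefix, $4 = "true" to rebuild the index
suggest_url() {
    local url="http://$1/solr/$2/suggest?suggest=true&suggest.dictionary=suggest&wt=json&suggest.q=$3"
    if [ "${4:-false}" = "true" ]; then
        url="$url&suggest.build=true"
    fi
    printf '%s\n' "$url"
}

# Build once, then query without the build flag.
suggest_url 192.168.29.110:8080 collection1 Ath true
suggest_url 192.168.29.110:8080 collection1 Ath
# To actually send a request: curl -s "$(suggest_url 192.168.29.110:8080 collection1 Ath)"
```

Keeping suggest.build out of ordinary queries avoids rebuilding the suggester index on every request.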
This example uses ZooKeeper 3.4.10.
Upload the configuration with the ZooKeeper client script (zkcli.sh) shipped with Solr:
[root@solr_1 ~]# cd /opt/solr/server/scripts/cloud-scripts/
[root@solr_1 cloud-scripts]# ./zkcli.sh -zkhost 192.168.29.110:2181,192.168.29.120:2181,192.168.29.130:2181 -cmd upconfig -confdir /opt/solr/solr-home/core_shopdemo_product2/conf/ -confname myconf
Check that the configuration files were uploaded:
[root@bogon bin]# bash /usr/local/zookeeper/zoo1/zookeeper-3.4.10/bin/zkCli.sh
Connecting to localhost:2181
[zk: localhost:2181(CONNECTED) 0] ls /
[configs, zookeeper]
[zk: localhost:2181(CONNECTED) 1] ls /configs
[myconf]
[zk: localhost:2181(CONNECTED) 2] ls /configs/myconf
[admin-extra.menu-top.html, currency.xml, protwords.txt, mapping-FoldToASCII.txt, _schema_analysis_synonyms_english.json, _rest_managed.json, solrconfig.xml, _schema_analysis_stopwords_english.json, stopwords.txt, lang, spellings.txt, mapping-ISOLatin1Accent.txt, admin-extra.html, xslt, synonyms.txt, scripts.conf, update-script.js, velocity, elevate.xml, admin-extra.menu-bottom.html, clustering, schema.xml]
In Tomcat's bin/catalina.sh, add -DzkHost to JAVA_OPTS to specify the ZooKeeper server addresses:
JAVA_OPTS="$JAVA_OPTS $JSSE_OPTS"
# Register custom URL handlers
# Do this here so custom URL handles (specifically 'war:...') can be used in the security policy
JAVA_OPTS="$JAVA_OPTS -Djava.protocol.handler.pkgs=org.apache.catalina.webresources"
JAVA_OPTS="$JAVA_OPTS -DzkHost=192.168.29.110:2181,192.168.29.120:2181,192.168.29.130:2181"
(The last line, with -DzkHost, is the addition.)
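Add a collection
Notes:
config set: the configuration files to use
numShards: number of shards
replicationFactor: number of machines serving each shard (normally no more than the total number of machines)
Show advanced: show the advanced settings
maxShardsPerNode: maximum number of shards per node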
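These settings interact: Solr refuses to create a collection when numShards × replicationFactor exceeds the number of live nodes × maxShardsPerNode. A minimal sketch of that check follows; the function name fits_cluster and the three-node cluster are assumptions for illustration, not something this setup prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the requested replicas fit on the cluster.
# $1 = numShards, $2 = replicationFactor, $3 = live Solr nodes, $4 = maxShardsPerNode
fits_cluster() {
    local needed=$(( $1 * $2 ))
    local capacity=$(( $3 * $4 ))
    [ "$needed" -le "$capacity" ]
}

# Assumed cluster of 3 Solr nodes (adjust to your own).
if fits_cluster 2 2 3 1; then
    echo "2 shards x 2 replicas fit with maxShardsPerNode=1"
else
    echo "2 shards x 2 replicas need a larger maxShardsPerNode"
fi
if fits_cluster 2 2 3 2; then
    echo "2 shards x 2 replicas fit with maxShardsPerNode=2"
fi
```

If the check fails, either raise maxShardsPerNode under "Show advanced" or lower numShards or replicationFactor.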
(Steps 5 and 6 are optional.)
Open in a browser:
http://192.168.29.110:8080/solr/admin/collections?action=CREATE&name=collection2&numShards=2&replicationFactor=2
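Parts of the URL to adjust:
ip: the server IP
name: the collection name
numShards: how many shards the collection has
replicationFactor: number of machines serving each shard (normally no more than the total number of machines)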
http://192.168.29.110:8080/solr/admin/collections?action=DELETE&name=collection1
Parts of the URL to adjust:
ip: the server IP
name: the collection name
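Both Collections API calls can be assembled from the same handful of values. The helper below is a sketch only; the function name collections_url is an assumption, and it covers just the CREATE and DELETE actions and parameters used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Collections API URL.
# $1 = host:port, $2 = CREATE or DELETE, $3 = collection name,
# $4 = numShards (CREATE only), $5 = replicationFactor (CREATE only)
collections_url() {
    local url="http://$1/solr/admin/collections?action=$2&name=$3"
    if [ "$2" = "CREATE" ]; then
        url="$url&numShards=$4&replicationFactor=$5"
    fi
    printf '%s\n' "$url"
}

collections_url 192.168.29.110:8080 CREATE collection2 2 2
collections_url 192.168.29.110:8080 DELETE collection1
# To actually send a request: curl -s "$(collections_url 192.168.29.110:8080 DELETE collection1)"
```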
Add the SolrJ dependency to pom.xml:
<dependency>
    <groupId>org.apache.solr</groupId>
    <artifactId>solr-solrj</artifactId>
    <version>6.6.0</version>
</dependency>
Code:
package com.demo.util.solr;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;

// SolrCloud index add, query and delete
public class SolrCloudTest {
    private static CloudSolrClient cloudSolrClient;

    // Lazily create a single shared client for the given ZooKeeper ensemble.
    private static synchronized CloudSolrClient getCloudSolrClient(final String zkHost) {
        if (cloudSolrClient == null) {
            try {
                cloudSolrClient = new CloudSolrClient(zkHost);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        return cloudSolrClient;
    }

    // Index three sample documents and commit.
    private static void addIndex(SolrClient solrClient) {
        try {
            SolrInputDocument doc1 = new SolrInputDocument();
            doc1.addField("id", "421245251215121452521251");
            doc1.addField("name", "張三");
            doc1.addField("age", 30);
            doc1.addField("desc", "張三是個農民,勤勞致富,奔小康");

            SolrInputDocument doc2 = new SolrInputDocument();
            doc2.addField("id", "4224558524254245848524243");
            doc2.addField("name", "李四");
            doc2.addField("age", 45);
            doc2.addField("desc", "李四是個企業家,白手起家,致富一方");

            SolrInputDocument doc3 = new SolrInputDocument();
            doc3.addField("id", "2224558524254245848524299");
            doc3.addField("name", "王五");
            doc3.addField("age", 60);
            doc3.addField("desc", "王五好吃懶作,溜鬚拍馬,跟着李四,也過着小康的日子");

            Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
            docs.add(doc1);
            docs.add(doc2);
            docs.add(doc3);

            solrClient.add(docs);
            solrClient.commit();
        } catch (SolrServerException e) {
            System.out.println("Add docs Exception !!!");
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (Exception e) {
            System.out.println("Unknown Exception!!!!!");
            e.printStackTrace();
        }
    }

    // Run a query and print every matching document.
    public static void search(SolrClient solrClient, String queryString) {
        SolrQuery query = new SolrQuery();
        query.setQuery(queryString);
        try {
            QueryResponse response = solrClient.query(query);
            SolrDocumentList docs = response.getResults();
            System.out.println("Documents found: " + docs.getNumFound());
            System.out.println("Query time: " + response.getQTime());
            for (SolrDocument doc : docs) {
                String id = (String) doc.getFieldValue("id");
                String name = (String) doc.getFieldValue("name");
                Integer age = (Integer) doc.getFieldValue("age");
                String desc = (String) doc.getFieldValue("desc");
                System.out.println("id: " + id);
                System.out.println("name: " + name);
                System.out.println("age: " + age);
                System.out.println("desc: " + desc);
                System.out.println();
            }
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (Exception e) {
            System.out.println("Unknown Exception!!!!");
            e.printStackTrace();
        }
    }

    // Delete every document in the default collection and commit.
    public static void deleteAllIndex(SolrClient solrClient) {
        try {
            solrClient.deleteByQuery("*:*"); // delete everything!
            solrClient.commit();
        } catch (SolrServerException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (Exception e) {
            System.out.println("Unknown Exception !!!!");
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException {
        final String zkHost = "192.168.29.110:2181,192.168.29.120:2181,192.168.29.130:2181";
        final String defaultCollection = "collection1";
        final int zkClientTimeout = 20000;
        final int zkConnectTimeout = 1000;

        CloudSolrClient cloudSolrClient = getCloudSolrClient(zkHost);
        System.out.println("The Cloud cloudSolrClient Instance has been created!");
        cloudSolrClient.setDefaultCollection(defaultCollection);
        cloudSolrClient.setZkClientTimeout(zkClientTimeout);
        cloudSolrClient.setZkConnectTimeout(zkConnectTimeout);
        cloudSolrClient.connect();
        System.out.println("The cloud Server has been connected !!!!");

        // Create the index
        SolrCloudTest.addIndex(cloudSolrClient);
        // Query
        SolrCloudTest.search(cloudSolrClient, "name:李四");
        // Delete, then query again to confirm nothing is left
        SolrCloudTest.deleteAllIndex(cloudSolrClient);
        SolrCloudTest.search(cloudSolrClient, "name:李四");

        cloudSolrClient.close();
    }
}
The name and desc fields use the IK field type added earlier, text_ik;
the age field uses the int type.
Database host and credentials:
mysql: 192.168.29.100:3306 user: root password: 123456
Copy the solr-dataimporthandler and solr-dataimporthandler-extras JARs shipped with Solr, plus mysql-connector-java-5.1.44.jar, into /opt/tomcat/webapps/solr/WEB-INF/lib.
Find <requestHandler name="/select" class="solr.SearchHandler"> and add the following configuration above it:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
        <str name="config">data-config.xml</str>
    </lst>
</requestHandler>
Detailed configuration (data-config.xml):
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
    <dataSource name="source1" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://192.168.29.100:3306/test1" user="root" password="123456"/>
    <document name="salesDoc">
        <entity pk="id" dataSource="source1" name="user"
                query="select id,name,sex,age,insertTime from user"
                deltaQuery="select id,name,sex,age,insertTime from user where insertTime >'${dih.last_index_time}'">
            <field name="id" column="id"/>
            <field name="name" column="name"/>
            <field name="sex" column="sex"/>
            <field name="age" column="age"/>
            <field name="insertTime" column="insertTime"/>
        </entity>
    </document>
</dataConfig>
Configuration notes:
dataSource: defines the data source
document: Solr's basic unit of information, a set of data describing something
entity: corresponds to a database table
    pk: the table's primary key
    dataSource: which data source to use
    name: the table name
    query: the SQL query
    deltaQuery: the SQL query used for incremental updates
    ${dih.last_index_time}: the time of the last update
field: a table column
Contents of dataimport.properties:
#Mon Nov 06 13:03:53 CST 2017
last_index_time=2017-11-06 13\:03\:50
user.last_index_time=2017-11-06 13\:03\:50
user.last_index_time records the last update time of the user table (this form is recommended, because with several tables each one can be updated separately).
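To check from the command line which timestamp the next delta import will start from, the file can be read directly. The sketch below assumes a helper name of my own choosing (dih_prop) and writes a scratch copy of the file shown above; the only processing is undoing the "\:" escaping used in the properties file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one key from dataimport.properties,
# turning the escaped "\:" back into ":".
dih_prop() {
    local file="$1" key="$2"
    awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); gsub(/\\:/, ":"); print }' "$file"
}

# Scratch copy of the file contents shown above.
props="$(mktemp)"
cat > "$props" <<'EOF'
#Mon Nov 06 13:03:53 CST 2017
last_index_time=2017-11-06 13\:03\:50
user.last_index_time=2017-11-06 13\:03\:50
EOF

dih_prop "$props" user.last_index_time
```

On a real installation the first argument would be the dataimport.properties file in the core's conf directory.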
<field name="id" type="int" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="name" type="text_ik" indexed="true" stored="true"/>
<field name="sex" type="int" indexed="true" stored="true"/>
<field name="age" type="int" indexed="true" stored="true"/>
<field name="insertTime" type="int" indexed="true" stored="true"/>
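If the Solr configuration has already been uploaded to ZooKeeper, repeat the first step of "Integrating ZooKeeper" to upload the configuration files again. (Alternatively, use "Update the Solr configuration in ZooKeeper" under "Common commands" to upload a single file.)
7. Restart Tomcat and run the data import
Notes:
full-import: full index
delta-import: incremental index
clean: clear the existing index first
commit: commit after execution
entity: the source table to import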
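The same options can be passed straight to the /dataimport handler over HTTP. The following is a sketch under assumptions: the function name dataimport_url is invented for illustration, commit is fixed to true, and only the two commands listed above are accepted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a DataImportHandler URL.
# $1 = host:port, $2 = core, $3 = full-import|delta-import, $4 = clean (true|false),
# $5 = optional entity name
dataimport_url() {
    local host="$1" core="$2" command="$3" clean="$4" entity="${5:-}"
    case "$command" in
        full-import|delta-import) ;;
        *) echo "unsupported command: $command" >&2; return 1 ;;
    esac
    local url="http://$host/solr/$core/dataimport?command=$command&clean=$clean&commit=true"
    if [ -n "$entity" ]; then
        url="$url&entity=$entity"
    fi
    printf '%s\n' "$url"
}

dataimport_url 192.168.29.110:8080 collection1 full-import true
dataimport_url 192.168.29.110:8080 collection1 delta-import false user
# To actually trigger an import: curl -s "$(dataimport_url 192.168.29.110:8080 collection1 delta-import false)"
```

Using clean=false for delta imports matters: with clean=true the existing index would be wiped before the incremental rows are added.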
Add the following before the servlet element:
<listener>
    <listener-class>org.apache.solr.handler.dataimport.scheduler.ApplicationListener</listener-class>
</listener>
Go into conf and create dataimport.properties there.
dataimport.properties configuration:
[root@solr_1 ~]# vim /opt/solr/solr-home/conf/dataimport.properties
#################################################
#                                               #
#       dataimport scheduler properties         #
#                                               #
#################################################
# to sync or not to sync
# 1 - active; anything else - inactive
syncEnabled=1
# which cores to schedule
# in a multi-core environment you can decide which cores you want synchronized
# leave empty or comment it out if using single-core deployment
syncCores=collection1
# solr server name or IP address
# [defaults to localhost if empty]
server=localhost
# solr server port
# [defaults to 80 if empty]
port=8080
# application name/context
# [defaults to current ServletContextListener's context (app) name]
webapp=solr
# URL params [mandatory]
# remainder of URL
# incremental import
params=/dataimport?command=delta-import&clean=false&commit=true
# schedule interval
# number of minutes between two runs
# [defaults to 30 if empty]
interval=1
# interval between full index rebuilds, in minutes; default 7200 (that is 5 days; one day would be 1440)
# empty, 0, or commented out: never rebuild the index
reBuildIndexInterval=7200
# parameters for the full rebuild
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true
# start time of the rebuild timer; first actual run = reBuildIndexBeginTime + reBuildIndexInterval*60*1000
# two formats: 2012-04-11 03:10:00 or 03:10:00; the latter is completed with the date on which the service started
reBuildIndexBeginTime=03:10:00
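The rebuild timing in the file is plain arithmetic on milliseconds, which is easy to sanity-check. The sketch below uses made-up function names and an arbitrary begin time expressed in epoch milliseconds; it applies the formula from the comment above and also shows that 7200 minutes works out to 5 days.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the scheduler arithmetic.
# first run (ms) = reBuildIndexBeginTime (ms) + reBuildIndexInterval (min) * 60 * 1000
first_rebuild_ms() {
    echo $(( $1 + $2 * 60 * 1000 ))
}

# Whole days covered by an interval given in minutes (1440 minutes per day).
minutes_to_days() {
    echo $(( $1 / 1440 ))
}

begin_ms=1509937800000   # illustrative begin time in epoch milliseconds
first_rebuild_ms "$begin_ms" 7200
minutes_to_days 7200
```

For a daily rebuild, reBuildIndexInterval would therefore be 1440 rather than 7200.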
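Insert a row in MySQL, wait one minute, and check in the Solr admin page whether the new data has appeared.
After modifying the schema.xml configuration file there is no need to log in to ZooKeeper and delete the existing files; they are overwritten automatically. Simply upload again with the following command: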
[root@solr_1 ~]# cd /opt/solr/server/scripts/cloud-scripts/
[root@ cloud-scripts]# ./zkcli.sh -zkhost 192.168.29.110:2181,192.168.29.120:2181,192.168.29.130:2181 -cmd upconfig -confdir /opt/solr/solr-home/core_shopdemo_product2/conf/ -confname myconf
This command is for changing the configuration after it has already been uploaded to ZooKeeper.