Setting Up a Solr Environment, Integrating the IK Analyzer, and Calling SolrJ (Part 3) [Final]

Date: 2019-11-08


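Links to the first two parts:

Setting Up a Solr Environment, Integrating the IK Analyzer, and Calling SolrJ (Part 1) http://my.oschina.net/zimingforever/blog/120732

Setting Up a Solr Environment, Integrating the IK Analyzer, and Calling SolrJ (Part 2) http://my.oschina.net/zimingforever/blog/120928

Part 1 covered how to set up the Solr environment, and Part 2 covered how to add the IK analyzer to Solr. This part introduces SolrJ, the API that Java clients use to call Solr.

First, add SolrJ to the pom: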

<dependency>
   <groupId>org.apache.solr</groupId>
   <artifactId>solr-solrj</artifactId>
   <version>3.6.0</version>
</dependency>
<dependency>
   <groupId>org.apache.solr</groupId>
   <artifactId>solr-core</artifactId>
   <version>3.6.0</version>
</dependency>

Next, the main uses of SolrJ:

A. How to obtain a SolrServer and clear its index

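import java.io.IOException;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class GetSolrServer {
    public static SolrServer getSolrServer() throws IOException, SolrServerException {
        // Connect to Solr
        String solrServerUrl = "http://localhost:8084/solr";
        SolrServer solrServer = new CommonsHttpSolrServer(solrServerUrl);
        // Clear previously created index data (takes effect at the next commit)
        solrServer.deleteByQuery("*:*");
        return solrServer;
    }
}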

B. Creating the index, here using the SolrInputDocument class

SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField( "id", "id1", 1.0f );
doc1.addField( "name", "doc1", 1.0f );
doc1.addField( "price", 10 );

SolrInputDocument doc2 = new SolrInputDocument();
doc2.addField( "id", "id2", 1.0f );
doc2.addField( "name", "doc2", 1.0f );
doc2.addField( "price", 20 );

Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
docs.add( doc1 );
docs.add( doc2 );

server.add( docs );
server.commit();

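This creates two documents, each with three fields: id, name, and price. Note that these three fields are already defined in schema.xml. Any field you define yourself must also be declared in schema.xml; this is why my custom fields had no effect when I later used addBeans.

The two documents are then put into a collection, added to the server, and committed.

At this point Solr has content. You can check by visiting http://localhost:8084/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on which will show the indexed content.

C. Adding to the index with addBeans

The example above adds to the index with SolrInputDocument objects. There is a more convenient way to do it, as follows: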

Collection<SeacheIndexDO> solrInputDocs = new ArrayList<SeacheIndexDO>();
for(SpiderResultDescribeDO spiderResultDescribeDO:spiderResultDescribeDOs){
   SpiderResultInforDO spiderResultInforDO= (SpiderResultInforDO) sqlMapClient.queryForObject("hunter.getSpinderInfor", spiderResultDescribeDO);
   String fileType=spiderResultInforDO.getUrlKey();
   String fileFullName= AddressUtils.appendUrl(spiderResultInforDO.getBaseUrl(),spiderResultInforDO.getFileName());
   String fileContent=FileUtils.getFileStringByPath(fileFullName, Commons.DEFAULT_DB_CHARSET);
   SeacheIndexDO solrInputDoc=new SeacheIndexDO();
   // id is the unique identifier; url is the link to jump to
   solrInputDoc.setId(spiderResultInforDO.getId());
   solrInputDoc.setHunterUrl(spiderResultInforDO.getUrl());
   solrInputDoc.setHunterTitle(spiderResultInforDO.getTitle());
   if(fileType.equals(SpiderSourceType.DBA_WIKI.getStringValue())){
        solrInputDoc.setHunterContent(fileContent);
   }else {
        System.out.println("Unsupported type");
   }
   solrInputDocs.add(solrInputDoc);
}
// Add the documents
solrServer.addBeans(solrInputDocs);
// Commit
solrServer.commit();

Note that SeacheIndexDO is a class I defined myself. Solr does not know its properties or how to index them, so these fields must be declared in schema.xml:

<!-- Custom fields for analyzed search -->
<field name="hunterTitle" type="text" indexed="true" stored="true" />
<field name="hunterAuthor" type="string" indexed="true" stored="true" />
<field name="hunterContent" type="text" indexed="true" stored="true" />
<field name="hunterQuestion" type="text" indexed="true" stored="true" />
<field name="hunterAnswers" type="text" indexed="true" stored="true" />
<field name="hunterCreateTime" type="date" indexed="true" stored="true" />
<field name="hunterUpdateTime" type="date" indexed="true" stored="true" />
<field name="hunterUrl" type="string" indexed="true" stored="true" />
<field name="hunterAll" type="text" indexed="true" stored="false" multiValued="true"/>

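Here the type text is the fieldType I configured to use the IK analyzer, while date and string are default types. hunterAll is a composite field that is filled by the copyField configuration further down in the schema:

<!-- Custom copyField -->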
  <copyField source="hunterContent" dest="hunterAll"/>
  <copyField source="hunterTitle" dest="hunterAll"/>

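This copies both the title and the content into one field so that they can be searched together. After that, searches can target the hunterAll field directly.

One more thing to note: the bean properties that correspond to schema fields must carry the @Field annotation (org.apache.solr.client.solrj.beans.Field).

D. How to read data from Solr. The URL above is in fact a query for *:*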

q=*%3A*&version=2.2&start=0&rows=10&indent=on

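It uses the parameters q, start, and rows. These and a few other commonly used parameters are:

//        q - the query string; required.
//        fl - the fields to return, separated by commas or spaces.
//        start - the offset of the first returned record within the full result set, starting at 0; normally used for paging.
//        rows - the maximum number of records to return; combined with start to implement paging.
//        sort - the sort order, in the form sort=<field name>+<desc|asc>[,<field name>+<desc|asc>]… . Example: (inStock desc, price asc) sorts by "inStock" descending, then by "price" ascending. The default is descending relevance.
//        wt - (writer type) the output format; can be xml, json, php, phps. The later formats were added in Solr 1.3 and are not enabled by default, so they have to be turned on before use.
//        fq - (filter query) restricts the results of q to those that also match fq. For example, q=mm&fq=date_time:[20081001 TO 20091031] finds the keyword mm where date_time is between 20081001 and 20091031.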
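
To make the parameters above concrete, here is a minimal sketch that assembles a select URL by hand using only the JDK. The class name SelectUrlSketch and its methods are hypothetical helpers written for illustration, not part of SolrJ; the base URL is the one used earlier in this post, and the field names are the ones from the SolrInputDocument example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class SelectUrlSketch {

    // Offset of the first row of a 1-based page, for use as the "start" parameter.
    public static int startFor(int page, int rows) {
        return (page - 1) * rows;
    }

    // Joins URL-encoded key=value pairs onto the select handler of the given base URL.
    public static String build(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl).append("/select/?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "*:*");
        params.put("fl", "id,name,price");
        params.put("sort", "price asc");
        params.put("start", String.valueOf(startFor(3, 10))); // third page of 10 rows
        params.put("rows", "10");
        System.out.println(build("http://localhost:8084/solr", params));
    }
}
```

In practice SolrQuery builds these parameters for you; the sketch is only meant to show how q, fl, sort, start, and rows end up in the request.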

E. The query code is as follows:

SolrServer server= GetSolrServer.getSolrServer();
SolrQuery query = new SolrQuery();
query.setQuery( "*:*" );
query.addSortField( "price", SolrQuery.ORDER.asc );
QueryResponse rsp = server.query( query );
SolrDocumentList docs = rsp.getResults();

The last step returns the results as a SolrDocumentList. Alternatively, you can use the getBeans method:

List<Item> beans = rsp.getBeans(Item.class);

F. In my project I also used result highlighting, a feature I had not used before when working with Lucene. The relevant code is:

// Enable highlighting on hunterContent and hunterTitle, rendered in red
        solrQuery.setHighlight(true);
        solrQuery.addHighlightField("hunterTitle");
        solrQuery.addHighlightField("hunterContent");
        solrQuery.setHighlightSimplePre("<font color=\"red\">");
        solrQuery.setHighlightSimplePost("</font>");

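This highlights matches in the title and content, using a red font tag as the highlight format.

The code for retrieving the highlights follows. They come back in a different object from the query results, so the two have to be processed separately, which makes the code a little "unclean".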

Map<String, Map<String, List<String>>> queryResponseHighlighting =queryResponse.getHighlighting();
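
The returned map is keyed first by the document's unique key (id) and then by field name, with each value holding the highlighted snippets for that field. Below is a minimal sketch, using only the JDK, of how the map could be consulted when rendering a result. HighlightSketch and snippetOrDefault are hypothetical names, and the map in main is a hand-built stand-in for queryResponse.getHighlighting() with made-up ids and text.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HighlightSketch {

    // Returns the first highlighted snippet for (docId, field), or the fallback
    // when there is no highlight entry for that document or field.
    public static String snippetOrDefault(Map<String, Map<String, List<String>>> highlighting,
                                          String docId, String field, String fallback) {
        Map<String, List<String>> byField = highlighting.get(docId);
        if (byField == null) {
            return fallback;
        }
        List<String> snippets = byField.get(field);
        if (snippets == null || snippets.isEmpty()) {
            return fallback;
        }
        return snippets.get(0);
    }

    public static void main(String[] args) {
        // Stand-in data shaped like the highlighting map; not real Solr output.
        Map<String, Map<String, List<String>>> highlighting =
                new HashMap<String, Map<String, List<String>>>();
        Map<String, List<String>> doc1 = new HashMap<String, List<String>>();
        doc1.put("hunterTitle", Arrays.asList("<font color=\"red\">solr</font> setup"));
        highlighting.put("id1", doc1);

        // id1 has a highlighted title; id2 has none, so the plain title is used.
        System.out.println(snippetOrDefault(highlighting, "id1", "hunterTitle", "plain title"));
        System.out.println(snippetOrDefault(highlighting, "id2", "hunterTitle", "plain title"));
    }
}
```

Falling back to the stored field value keeps the display code in one place even though highlights and results arrive separately.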

G. As mentioned earlier, I defined a hunterAll field among my custom fields

// Set the field to search in
        solrQuery.set("df",queryFiled);
        // Set the text to search for
        solrQuery.setQuery(queryText);

With the default field (df) set to hunterAll, the query no longer needs a field:value form like *:*; queryText is simply looked up in the hunterAll field.
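
For plain keywords, setting df is roughly the same as writing the field name into the query string yourself. The sketch below, using only the JDK, shows what that explicit form could look like. FieldQuerySketch, escape, and fieldQuery are hypothetical helpers written for this illustration; the escaping covers the characters that have special meaning in the Lucene query syntax, so unlike the df approach it treats the user's text as literal keywords rather than as query operators.

```java
public class FieldQuerySketch {

    // Characters with special meaning in the Lucene query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    // Prefixes every special character with a backslash so it is matched literally.
    public static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Builds field:(text); the parentheses apply the field to every keyword.
    public static String fieldQuery(String field, String text) {
        return field + ":(" + escape(text) + ")";
    }

    public static void main(String[] args) {
        System.out.println(fieldQuery("hunterAll", "solr ik"));
        System.out.println(fieldQuery("hunterAll", "solr:ik"));
    }
}
```

The resulting string could be passed to solrQuery.setQuery in place of the df setting.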

H. The status and elapsed time of a query can be read as follows:

// Get the status (0 means success)
        int responseStauts=queryResponse.getStatus();
        // Get the query time in milliseconds
        int responseTime=queryResponse.getQTime();

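To sum up: this post introduced the common uses of SolrJ. The code above is essentially all I needed to build the index and run queries. Together with the previous two posts, it should be enough to get started with basic Solr.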
