[lucene] A Lucene HelloWorld program

Date: 2019-11-08

Tags: lucene, helloworld, program

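1. Full-text retrieval
1.1 Definition: full-text retrieval, also called full-text search, means building an index over text data and searching that index.
1.2 Characteristics: the text is indexed; matched keywords are highlighted; summaries are extracted from the matching text; search results are more precise; it looks only at the text, not at its semantics.
1.3 Use cases: replacing fuzzy (LIKE-style) database queries to improve query efficiency; serving as the foundation of search engines; vertical search; searching inside content in formats such as Word and PDF; use in various input methods.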
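2. The core of full-text retrieval
2.1 Building the index: create a mapping between words and sentences, so that a word can be used to look up the numbers of the sentences that contain it.
           tokenization --> linguistic processing --> sorting --> deduplication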
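The index-building pipeline can be sketched in plain Java. This is only a toy illustration, not how Lucene stores its index: the class name `ToyIndexBuilder`, the letters-only tokenizer, and the use of lowercasing as the "linguistic processing" step are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy inverted index builder (hypothetical; not Lucene's implementation).
class ToyIndexBuilder {

    // Tokenization plus a minimal "linguistic processing" step:
    // lowercase the text, then split on anything that is not a letter.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Maps each term to the numbers (1-based) of the sentences containing it.
    // TreeMap keeps the terms sorted; TreeSet removes duplicate sentence numbers.
    static Map<String, TreeSet<Integer>> build(List<String> sentences) {
        Map<String, TreeSet<Integer>> index = new TreeMap<>();
        for (int i = 0; i < sentences.size(); i++) {
            int sentenceNo = i + 1;
            for (String term : tokenize(sentences.get(i))) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(sentenceNo);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "hello world", "hello java world", "hello lucene world");
        // Prints:
        // hello -> [1, 2, 3]
        // java -> [2]
        // lucene -> [3]
        // world -> [1, 2, 3]
        build(sentences).forEach((term, ids) -> System.out.println(term + " -> " + ids));
    }
}
```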
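2.2 Searching the index: look the keywords up in the index to find the numbers of the matching sentences.
           enter search keywords --> tokenize the keywords --> search to get the matching numbers --> fetch the sentences by number --> wrap them in objects and pass them to the front end for display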
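The search flow can be sketched in the same toy style. The class `ToySearch` and its `Hit` result object are hypothetical names, and requiring every keyword to match (AND semantics) is an assumption of this sketch rather than something the steps above prescribe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy search over an in-memory inverted index (hypothetical; not Lucene's implementation).
class ToySearch {

    // Result object that would be handed to the front end (hypothetical shape).
    static class Hit {
        final int sentenceNo;
        final String sentence;

        Hit(int sentenceNo, String sentence) {
            this.sentenceNo = sentenceNo;
            this.sentence = sentence;
        }

        @Override
        public String toString() {
            return sentenceNo + ": " + sentence;
        }
    }

    private final List<String> sentences;
    private final Map<String, TreeSet<Integer>> index = new HashMap<>();

    ToySearch(List<String> sentences) {
        this.sentences = sentences;
        for (int i = 0; i < sentences.size(); i++) {
            for (String term : tokenize(sentences.get(i))) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(i + 1);
            }
        }
    }

    // Lowercase, then split on anything that is not a letter.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Tokenize the keywords, look up each term, keep the sentence numbers that
    // contain every term, then fetch the sentences and wrap them in Hit objects.
    List<Hit> search(String keywords) {
        TreeSet<Integer> ids = null;
        for (String term : tokenize(keywords)) {
            TreeSet<Integer> postings = index.getOrDefault(term, new TreeSet<>());
            if (ids == null) {
                ids = new TreeSet<>(postings);
            } else {
                ids.retainAll(postings);
            }
        }
        List<Hit> hits = new ArrayList<>();
        if (ids != null) {
            for (int id : ids) {
                hits.add(new Hit(id, sentences.get(id - 1)));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ToySearch search = new ToySearch(Arrays.asList(
                "hello world", "hello java world", "hello lucene world"));
        System.out.println(search.search("java")); // prints [2: hello java world]
        System.out.println(search.search("hello world").size()); // prints 3
    }
}
```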

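3. Getting started with Lucene
3.1 What Lucene is: Lucene is an implementation of full-text retrieval, a toolkit written in Java.
3.2 Lucene's core API:
   Add, delete, update: IndexWriter (index writer)
   Query: IndexSearcher (index searcher)
3.3 Getting-started steps:
   1) Download Lucene
   2) Import the jar files
   3) Test:
       Build the index:
           create an IndexWriter;
           put the text data to be indexed into the fields of a Document;
           write the Document through the IndexWriter
       Search the index:
           create an IndexSearcher;
           create a Query object, obtained by parsing a specially formatted string
           pass the Query to the IndexSearcher to run the search
           get the document ID from the results, then use it to fetch the Document
           convert the Document into the object we want and return it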
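To illustrate what "parsing a specially formatted string" means for a query such as content:java, here is a minimal plain-Java sketch. It is not Lucene's QueryParser, which supports a far richer syntax; the names `ToyQueryString` and `FieldTerm` are hypothetical, and the sketch handles only a single `field:term` pair, falling back to a default field when no field is given.

```java
import java.util.Locale;

// Toy parser for "field:term" query strings (hypothetical; not Lucene's QueryParser).
class ToyQueryString {

    // Holds the field to search and the normalized term to look for.
    static class FieldTerm {
        final String field;
        final String term;

        FieldTerm(String field, String term) {
            this.field = field;
            this.term = term;
        }

        @Override
        public String toString() {
            return field + ":" + term;
        }
    }

    // Splits at the first colon; without a usable "field:" prefix the default field is used.
    // The term is lowercased, mimicking what a lowercasing analyzer would do (an assumption).
    static FieldTerm parse(String defaultField, String queryString) {
        String q = queryString.trim();
        int colon = q.indexOf(':');
        if (colon > 0 && colon < q.length() - 1) {
            String field = q.substring(0, colon);
            String term = q.substring(colon + 1).toLowerCase(Locale.ROOT);
            return new FieldTerm(field, term);
        }
        return new FieldTerm(defaultField, q.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(parse("content", "content:java")); // prints content:java
        System.out.println(parse("content", "Lucene"));       // prints content:lucene
    }
}
```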

4. Code for building the index:

package practice;

import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.nio.file.Paths;

/**
 * Builds a Lucene index.
 *
 * @author he
 * @date 2018/9/20
 */
public class Writer {
    private static final String PATH = "H:/JAVAEE/ideaCode/lucene/src/main/resources/index";

    public static void main(String[] args) throws Exception {
        String doc1 = "hello world";
        String doc2 = "hello java world";
        String doc3 = "hello lucene world";

        // Create the IndexWriter; try-with-resources closes it and the Directory even on failure
        try (Directory d = FSDirectory.open(Paths.get(PATH));
             IndexWriter indexWriter = new IndexWriter(d, new IndexWriterConfig(new SimpleAnalyzer()))) {

            // Put the text to be indexed into the fields of each Document.
            // The id uses StringField (indexed as a single token, not analyzed):
            // SimpleAnalyzer keeps only letters, so an analyzed "1" would yield no tokens.
            Document document1 = new Document();
            document1.add(new StringField("id", "1", Field.Store.YES));
            document1.add(new TextField("title", "doc1", Field.Store.YES));
            document1.add(new TextField("content", doc1, Field.Store.YES));
            Document document2 = new Document();
            document2.add(new StringField("id", "2", Field.Store.YES));
            document2.add(new TextField("title", "doc2", Field.Store.YES));
            document2.add(new TextField("content", doc2, Field.Store.YES));
            Document document3 = new Document();
            document3.add(new StringField("id", "3", Field.Store.YES));
            document3.add(new TextField("title", "doc3", Field.Store.YES));
            document3.add(new TextField("content", doc3, Field.Store.YES));

            // Write the Documents through the IndexWriter
            indexWriter.addDocument(document1);
            indexWriter.addDocument(document2);
            indexWriter.addDocument(document3);

            indexWriter.commit();
        }
    }
}

Result of running the code: (screenshot not preserved)

5. Code for querying the index:

package practice;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

import java.nio.file.Paths;

/**
 * Searches a Lucene index.
 *
 * @author he
 * @date 2018/9/20
 */
public class Searcher {
    private static final String PATH = "H:/JAVAEE/ideaCode/lucene/src/main/resources/index";

    public static void main(String[] args) throws Exception {
        // Create the IndexSearcher; try-with-resources closes the reader and the Directory
        try (Directory directory = FSDirectory.open(Paths.get(PATH));
             IndexReader r = DirectoryReader.open(directory)) {
            IndexSearcher indexSearcher = new IndexSearcher(r);

            // Query string in "field:term" form
            String parStr = "content:java";

            // Create the query parser, using the same analyzer as at index time
            String defaultField = "content";
            Analyzer analyzer = new SimpleAnalyzer();
            QueryParser queryParser = new QueryParser(defaultField, analyzer);

            // Parse the query string and run the search (at most 10000 hits)
            Query query = queryParser.parse(parStr);
            TopDocs topDocs = indexSearcher.search(query, 10000);
            System.out.println("Total hits: " + topDocs.totalHits);

            // Read the stored fields of each matching document
            ScoreDoc[] scoreDocs = topDocs.scoreDocs;
            for (ScoreDoc scoreDoc : scoreDocs) {
                Document document = indexSearcher.doc(scoreDoc.doc);
                System.out.println("id -> " + document.get("id"));
                System.out.println("title -> " + document.get("title"));
                System.out.println("content -> " + document.get("content"));
            }
        }
    }
}

Result of running the code: (screenshot not preserved)
