lucene

Date: 2019-11-30


Matching Lucene versions to JDK versions:

At the time of writing, the latest release is Lucene 6.2.1, which pairs with the official release of JDK 1.8.

This example uses the last release of JDK 7, so it uses Lucene 5.5.5. The jars referenced are: lucene-core-5.5.5.jar, lucene-queryparser-5.5.5.jar, lucene-analyzers-smartcn-5.5.5.jar, lucene-analyzers-common-5.5.5.jar and commons-io-2.5.jar.

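Part 1: Building the index. An index can be kept in memory or on disk; this example writes the index files to disk. The steps are:

1. Specify the directory where the index is saved.
2. Create an IndexWriter object.
3. Create a Document object, roughly analogous to a row in a database table.
4. Add Fields to the Document (a Field is analogous to a column of the table).
5. Add the Document to the index directory through the IndexWriter, that is, write the record into the specified directory.

The code is as follows: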

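    // Imports this method needs (Lucene 5.5.5):
    import java.io.File;
    import java.io.FileReader;
    import java.io.IOException;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public void index() {
        // 1. Choose the Directory: the folder on disk that holds the index files.
        Path path = Paths.get("E:/myexe/luceneTest01/index01");
        // 2. Create the IndexWriter, which writes Documents to disk.
        IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
        // try-with-resources closes the IndexWriter and the Directory, even on failure.
        try (Directory directory = FSDirectory.open(path);
             IndexWriter writer = new IndexWriter(directory, iwc)) {
            // Delete the previous index before writing; otherwise documents are duplicated.
            writer.deleteAll();
            File[] files = new File("E:/myexe/luceneTest01/example/").listFiles();
            if (files == null) {
                return; // the example directory does not exist or cannot be read
            }
            for (File file : files) {
                if (!file.isFile()) {
                    continue; // skip subdirectories; FileReader cannot open them
                }
                // 3. Create a Document, roughly a row in a database table.
                Document doc = new Document();
                // 4. Use the file's content, name and path as the Fields of the Document.
                //    The Reader-based TextField is indexed but never stored.
                doc.add(new TextField("content", new FileReader(file)));
                doc.add(new TextField("filename", file.getName(), Field.Store.YES));
                doc.add(new TextField("path", file.getAbsolutePath(), Field.Store.YES));
                // 5. Write the Document to the index files in the Directory.
                writer.addDocument(doc);
            }
        } catch (IOException e) {
            // CorruptIndexException and LockObtainFailedException are IOExceptions too.
            e.printStackTrace();
        }
    }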

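Field.Store.YES stores the field's full content in the index files, so the original text can be retrieved later. Field.Store.NO does not store the content in the index files, but the field can still be indexed and searched.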
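To make the difference between "stored" and "indexed" concrete, here is a toy model in plain Java. It is not how Lucene is implemented and it uses no Lucene classes; the class StoredVsIndexedSketch and all of its methods are hypothetical names invented for this illustration. It only shows that a field can be searchable without its original text being retrievable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only (an assumption for illustration), NOT Lucene's real data structures.
class StoredVsIndexedSketch {
    // "Indexed": field:term -> ids of the documents containing that term.
    private final Map<String, List<Integer>> postings = new HashMap<String, List<Integer>>();
    // "Stored": document id -> field name -> original value.
    private final List<Map<String, String>> stored = new ArrayList<Map<String, String>>();

    int newDocument() {
        stored.add(new HashMap<String, String>());
        return stored.size() - 1;
    }

    // Every field is indexed; the original value is kept only when store is true.
    void addField(int docId, String name, String value, boolean store) {
        for (String token : value.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty()) {
                continue;
            }
            String key = name + ":" + token;
            List<Integer> ids = postings.get(key);
            if (ids == null) {
                ids = new ArrayList<Integer>();
                postings.put(key, ids);
            }
            if (!ids.contains(docId)) {
                ids.add(docId);
            }
        }
        if (store) {
            stored.get(docId).put(name, value);
        }
    }

    // Returns the ids of the documents whose field contains the term.
    List<Integer> search(String field, String term) {
        List<Integer> ids = postings.get(field + ":" + term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<Integer>() : new ArrayList<Integer>(ids);
    }

    // Returns the stored value, or null when the field was indexed but not stored.
    String get(int docId, String field) {
        return stored.get(docId).get(field);
    }

    public static void main(String[] args) {
        StoredVsIndexedSketch index = new StoredVsIndexedSketch();
        int id = index.newDocument();
        index.addField(id, "content", "package demo; class App {}", false); // like Store.NO
        index.addField(id, "filename", "App.txt", true);                    // like Store.YES
        System.out.println(index.search("content", "package")); // [0]
        System.out.println(index.get(id, "filename"));          // App.txt
        System.out.println(index.get(id, "content"));           // null
    }
}
```

The search on content still finds the document, yet reading content back yields null, while filename comes back intact because it was stored.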

Part 2: Searching

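    // Imports this method needs (Lucene 5.5.5):
    import java.io.IOException;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.queryparser.classic.ParseException;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public void searcher() {
        Path path = Paths.get("E:/myexe/luceneTest01/index01");
        // 1. Open the Directory that holds the index to search.
        // 2. Create an IndexReader that reads that Directory.
        // try-with-resources closes the reader and the directory when the search is done.
        try (Directory directory = FSDirectory.open(path);
             IndexReader reader = DirectoryReader.open(directory)) {
            // 3. Create an IndexSearcher from the IndexReader.
            IndexSearcher searcher = new IndexSearcher(reader);
            // 4. Create the Query, comparable to an SQL statement.
            //    The default search field is "content", the field indexed earlier.
            QueryParser parser = new QueryParser("content", new StandardAnalyzer());
            // Find documents whose content field contains "package".
            Query query = parser.parse("package");
            // 5. Run the search; the second argument caps the result at 10 hits.
            TopDocs tds = searcher.search(query, 10);
            // 6. The hits are held in the TopDocs object's ScoreDoc array.
            ScoreDoc[] sds = tds.scoreDocs;
            for (ScoreDoc sd : sds) {
                // 7. ScoreDoc.doc is the id of the matching Document; load it by id.
                Document d = searcher.doc(sd.doc);
                // 8. Print the file name, file path and file content of the Document.
                System.out.println(d.get("filename") + "  [" + d.get("path") + "]  " + d.get("content"));
                System.out.println(d.get("content"));
            }
        } catch (IOException | ParseException e) {
            e.printStackTrace();
        }
    }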

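This prints every file under E:\myexe\luceneTest01\example\ whose content contains the word package. Because content was not stored in the index files when the index was built, the content field comes back as null. File content is usually large, so at indexing time doc.add(new TextField("content",new FileReader(file))); indexes the file content without storing it in the index: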

PagerAppoint.java [E:\myexe\luceneTest01\example\PagerAppoint.java] null
null
App.txt [E:\myexe\luceneTest01\example\App.txt] null
null
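If the content should be retrievable after all, one option is to read each file into a String first and add it as a stored field, for example doc.add(new TextField("content", text, Field.Store.YES)), which is the same constructor already used for filename and path. The sketch below covers only the file-reading half, using Java 7 APIs and no Lucene classes. The helper readText and the class FileTextSketch are hypothetical names, and UTF-8 is an assumed encoding for the source files. Storing whole files makes the index larger, so this is a trade-off rather than a recommendation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileTextSketch {
    // Hypothetical helper: loads a whole file into memory as text.
    // Assumes the file is UTF-8 encoded and small enough to fit in memory.
    static String readText(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for one of the example source files.
        Path tmp = Files.createTempFile("lucene-demo", ".txt");
        try {
            Files.write(tmp, "package demo;".getBytes(StandardCharsets.UTF_8));
            String text = readText(tmp);
            // With Lucene on the classpath, the text would go into a stored field:
            //   doc.add(new TextField("content", text, Field.Store.YES));
            System.out.println(text); // package demo;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With the content stored this way, d.get("content") in the search code would return the file text instead of null.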
