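1 Lucene is a full-text search toolkit that makes it easy to implement full-text search.
2 Full-text search: first tokenize the documents to be searched, build an index from the tokens, then search against that index.
3 Full-text search workflow
Indexing flow: collect the data, process the data, build the index
Search flow: enter the query conditions, the Lucene searcher queries the index, and the results are fetched from the index store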
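The two flows above can be sketched with a toy inverted index. This is only an illustration of the idea in plain Java, not Lucene's implementation; the class name `MiniInvertedIndex` and its methods are hypothetical, and the tokenizer is a naive stand-in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy inverted index (hypothetical class): shows the idea, not Lucene's internals.
class MiniInvertedIndex {
    // term -> sorted ids of the documents that contain it
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Indexing flow: take a raw document, tokenize it, record every term.
    int add(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Search flow: analyze the query the same way, then look up its first term.
    List<Integer> search(String query) {
        List<String> tokens = tokenize(query);
        if (tokens.isEmpty()) {
            return new ArrayList<>();
        }
        TreeSet<Integer> hits = postings.get(tokens.get(0));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    String document(int docId) {
        return documents.get(docId);
    }

    // Naive analyzer: lowercase, then split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        MiniInvertedIndex index = new MiniInvertedIndex();
        index.add("Lucene in Action");
        index.add("Solr is built on Lucene");
        index.add("Java concurrency");
        for (int id : index.search("lucene")) {
            System.out.println(id + ": " + index.document(id));
        }
    }
}
```

The search never scans the documents themselves; it only consults the term-to-document map built at indexing time, which is the reason an index is built first.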
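4 IndexWriter is the core component of the indexing process. Through IndexWriter you can create a new index, update the index, and delete from the index. IndexWriter needs a Directory to perform the actual storage operations.
Directory describes where the index is stored. It wraps the underlying I/O operations and is responsible for storing the index. It is an abstract class; its commonly used subclasses include FSDirectory (stores the index on the file system) and RAMDirectory (stores the index in memory).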
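import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.junit.Test;

// Book, BookDao and BookDaoImpl are the project's own classes.
public class IndexManager {

    @Test
    public void createIndex() throws Exception {
        // Collect the data
        BookDao bookDao = new BookDaoImpl();
        List<Book> books = bookDao.queryBooks();

        // Turn each book into a Lucene Document
        List<Document> documents = new ArrayList<>();
        for (Book book : books) {
            Document document = new Document();
            Field id = new TextField("id", book.getId().toString(), Store.YES);
            Field name = new TextField("name", book.getName(), Store.YES);
            Field price = new TextField("price", book.getPrice().toString(), Store.YES);
            Field detail = new TextField("detail", book.getDetail(), Store.YES);
            document.add(id);
            document.add(name);
            document.add(price);
            document.add(detail);
            documents.add(document);
        }

        // Create the IndexWriter on top of a file-system Directory
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_4_10_3, analyzer);
        Directory directory = FSDirectory.open(new File("E:\\index\\"));
        IndexWriter indexWriter = new IndexWriter(directory, config);

        // Write the documents to the index, then release the writer
        for (Document d : documents) {
            indexWriter.addDocument(d);
        }
        indexWriter.close();
    }
}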
5 Query syntax: the operators AND, OR, and NOT must be written in uppercase
public void indexSearch() throws Exception {
    // Parse the query; "detail" is the default field
    QueryParser queryParser = new QueryParser("detail", new StandardAnalyzer());
    Query query = queryParser.parse("detail:好 AND 大");

    // Open the index for reading
    Directory directory = FSDirectory.open(new File("E:\\index\\"));
    IndexReader indexReader = DirectoryReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(indexReader);

    // Fetch the top 10 hits and print their stored fields
    TopDocs docs = searcher.search(query, 10);
    ScoreDoc[] scoreDocs = docs.scoreDocs;
    for (ScoreDoc scoreDoc : scoreDocs) {
        int docId = scoreDoc.doc;
        Document document = searcher.doc(docId);
        System.out.println(document.get("id"));
        System.out.println(document.get("name"));
        System.out.println(document.get("detail"));
    }
    indexReader.close();
}
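Conceptually, the three operators combine the sets of documents that match each term. The following is a minimal sketch in plain Java, assuming each term has already been resolved to a set of document ids; the class `BooleanPostings` and the sample ids are hypothetical, and this is not how Lucene evaluates queries internally.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: boolean operators as set operations on document ids.
class BooleanPostings {
    // a AND b: documents that contain both terms
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // a OR b: documents that contain at least one of the terms
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // a NOT b: documents that contain a but not b
    static Set<Integer> not(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Made-up posting lists for the two terms used in the query above
        Set<Integer> hao = new TreeSet<>(Arrays.asList(0, 1, 3)); // docs containing 好
        Set<Integer> da = new TreeSet<>(Arrays.asList(1, 2, 3));  // docs containing 大
        System.out.println("好 AND 大: " + and(hao, da));
        System.out.println("好 OR 大:  " + or(hao, da));
        System.out.println("好 NOT 大: " + not(hao, da));
    }
}
```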
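5 Field attributes
1 Tokenized: whether the field value is split into terms. Tokenizing exists to support indexing (product name, description, price); a field can also be indexed without being tokenized (product id)
2 Indexed: whether the field is indexed, i.e. whether it can be searched
3 Stored: whether the field value is stored in the document. Values are stored so that they can be displayed: name, price, id, image URL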
public class IndexManager {

    @Test
    public void createIndex() throws Exception {
        BookDao bookDao = new BookDaoImpl();
        List<Book> books = bookDao.queryBooks();

        List<Document> documents = new ArrayList<>();
        for (Book book : books) {
            Document document = new Document();
            // id: indexed as a single term (not tokenized), stored
            Field id = new StringField("id", book.getId().toString(), Store.YES);
            // name: tokenized, indexed, stored
            Field name = new TextField("name", book.getName(), Store.YES);
            // price: indexed as a numeric value, stored
            Field price = new FloatField("price", book.getPrice(), Store.YES);
            // detail: tokenized and indexed but not stored, so it is searchable
            // while document.get("detail") returns null at search time
            Field detail = new TextField("detail", book.getDetail(), Store.NO);
            document.add(id);
            document.add(name);
            document.add(price);
            document.add(detail);
            documents.add(document);
        }

        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_4_10_3, analyzer);
        Directory directory = FSDirectory.open(new File("E:\\index\\"));
        IndexWriter indexWriter = new IndexWriter(directory, config);
        for (Document d : documents) {
            indexWriter.addDocument(d);
        }
        indexWriter.close();
    }
}
6 Updating the index
@Test
public void updateIndex() throws Exception {
    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_4_10_3, analyzer);
    Directory directory = FSDirectory.open(new File("E:\\index\\"));
    IndexWriter indexWriter = new IndexWriter(directory, config);

    // The replacement document
    Document document = new Document();
    document.add(new TextField("name", "fdrr", Store.YES));

    // updateDocument deletes every document containing the term, then adds the new one
    indexWriter.updateDocument(new Term("name", "fddd"), document);
    indexWriter.close();
}
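7 Relevance ranking
Relevance is how well the query keywords match the returned results. The better the match, the higher the result ranks; the ranking is produced by scoring.
Scoring takes two steps: 1 compute the term weights 2 compute the score from the weights
Term weight: a word is a term, and the importance of a term to a document is its weight
Factors that affect the term weight: 1 tf: how often the term occurs in a single document; the higher the tf, the higher the weight
2 df: how many documents the term occurs in; the higher the df, the lower the weight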
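The effect of tf and df on the weight can be illustrated with the textbook tf-idf formula, weight = tf × ln(N / df), where N is the number of documents. This is a simplified sketch under that assumption; Lucene's actual scoring uses additional factors and normalization, and the class `TfIdfSketch` and the sample corpus are hypothetical.

```java
// Hypothetical helper: textbook tf-idf, not Lucene's exact scoring formula.
class TfIdfSketch {
    // tf: how many times the term occurs in one (already tokenized) document
    static int termFrequency(String term, String[] docTokens) {
        int count = 0;
        for (String t : docTokens) {
            if (t.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // df: how many documents in the corpus contain the term
    static int documentFrequency(String term, String[][] corpus) {
        int count = 0;
        for (String[] doc : corpus) {
            if (termFrequency(term, doc) > 0) {
                count++;
            }
        }
        return count;
    }

    // weight = tf * ln(N / df); higher tf raises it, higher df lowers it
    static double weight(String term, String[] doc, String[][] corpus) {
        int tf = termFrequency(term, doc);
        int df = documentFrequency(term, corpus);
        if (tf == 0 || df == 0) {
            return 0.0;
        }
        return tf * Math.log((double) corpus.length / df);
    }

    public static void main(String[] args) {
        String[][] corpus = {
            {"lucene", "search", "lucene"},
            {"search", "server"},
            {"java"}
        };
        // "lucene" is frequent in document 0 and rare in the corpus;
        // "search" occurs once and appears in two documents.
        System.out.println("lucene: " + weight("lucene", corpus[0], corpus));
        System.out.println("search: " + weight("search", corpus[0], corpus));
    }
}
```

In this corpus "lucene" gets a larger weight than "search" for the first document, because it has both a higher tf and a lower df.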
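8 Setting a boost value affects the score.
Boost is a weighting factor whose default is 1.0f. It can be set at index time or at query time.
At query time, the boost values can be set when the MultiFieldQueryParser is created.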
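To illustrate what a boost does, the sketch below treats it as a multiplier on each field's contribution to the score. It is a deliberate simplification in plain Java, not Lucene's scoring code; the class `BoostSketch`, the field names, and the sample scores are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: a boost as a per-field multiplier on the score.
class BoostSketch {
    static final float DEFAULT_BOOST = 1.0f;

    // Sum the per-field scores, each multiplied by its boost (1.0f when none is set).
    static double score(Map<String, Double> fieldScores, Map<String, Float> boosts) {
        double total = 0.0;
        for (Map.Entry<String, Double> entry : fieldScores.entrySet()) {
            float boost = boosts.getOrDefault(entry.getKey(), DEFAULT_BOOST);
            total += boost * entry.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up match scores of one document in two fields
        Map<String, Double> fieldScores = new HashMap<>();
        fieldScores.put("name", 0.5);
        fieldScores.put("detail", 0.5);

        Map<String, Float> boosts = new HashMap<>();
        System.out.println("no boost:      " + score(fieldScores, boosts));

        // Make a match in "name" count twice as much as a match in "detail"
        boosts.put("name", 2.0f);
        System.out.println("name boosted:  " + score(fieldScores, boosts));
    }
}
```

Raising the boost of a field lifts every document that matches in that field, which is why a title or name field is usually boosted above a long description field.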
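Solr
1 A full-text search server built on Lucene.
Indexing: the Solr client sends a POST request to the Solr server. The request body is an XML document containing the field information, and the index is maintained through this document.
Searching: a GET request; the server returns an XML document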
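The two request shapes can be sketched as follows. The code only builds the XML body and the query URL and does not send anything. The `<add><doc><field name="...">` layout is Solr's XML update format; the host, port, and core name `collection1` in the base URL, as well as the class `SolrRequestSketch`, are assumptions for illustration and must be adapted to the actual installation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the payloads only, performs no network I/O.
class SolrRequestSketch {
    // Body of the POST request that adds one document to the index.
    static String addDocumentXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(field.getKey())).append("\">")
               .append(escape(field.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    // URL of the GET request that runs a query and asks for an XML response.
    static String selectUrl(String baseUrl, String query) {
        try {
            return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8") + "&wt=xml";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // Assumed base URL: default Solr port and a core named "collection1"
        String baseUrl = "http://localhost:8983/solr/collection1";

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "1");
        fields.put("name", "Lucene & Solr");
        System.out.println("POST " + baseUrl + "/update");
        System.out.println(addDocumentXml(fields));

        System.out.println("GET " + selectUrl(baseUrl, "name:lucene"));
    }
}
```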