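To search with Lucene you first have to construct a query, so this post walks through the various Query types. IndexSearcher is the main search class; its most commonly used search methods are:
TopDocs search(Query query, int n); // find the top n hits for query
TopDocs search(Query query, Filter filter, int n); // find the top n hits for query, applying filter if non-null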
TermQuery: looks up a single term in a given Field.
Term t = new Term("bookname", "Lucene"); // the Field the term lives in, then the term text
Query q = new TermQuery(t);
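BooleanQuery: made up of several clauses combined with boolean logic (AND, OR, NOT). BooleanClause.Occur is an enum with the values MUST, MUST_NOT and SHOULD. Commonly used combinations:
MUST with MUST: intersection; MUST with MUST_NOT: difference; SHOULD with SHOULD: union.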
void add(Query query, BooleanClause.Occur occur)
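To make the combinations above concrete, here is a plain-Java sketch of the set arithmetic behind them. It is not Lucene code: the class name OccurSetsSketch and the sample doc IDs are made up for illustration, and a real BooleanQuery also scores its hits, which this ignores.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class OccurSetsSketch {
    // MUST + MUST: keep only documents matched by both clauses.
    static Set<Integer> mustAndMust(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // MUST + MUST_NOT: keep documents matched by the first clause but not the second.
    static Set<Integer> mustAndMustNot(Set<Integer> must, Set<Integer> mustNot) {
        Set<Integer> result = new TreeSet<>(must);
        result.removeAll(mustNot);
        return result;
    }

    // SHOULD + SHOULD: keep documents matched by either clause.
    static Set<Integer> shouldOrShould(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical doc IDs matched by two sub-queries.
        Set<Integer> java = new TreeSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> lucene = new TreeSet<>(Arrays.asList(2, 3, 4));
        System.out.println(mustAndMust(java, lucene));    // [2, 3]
        System.out.println(mustAndMustNot(java, lucene)); // [1]
        System.out.println(shouldOrShould(java, lucene)); // [1, 2, 3, 4]
    }
}
```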
NumericRangeQuery/TermRangeQuery: range queries. The range can cover dates, times or numbers. To leave the range open at one end, set the corresponding bound to null and its inclusive flag to false.
//TermRangeQuery
TermRangeQuery(String field, String lowerTerm, String upperTerm, boolean includeLower, boolean includeUpper);
//NumericRangeQuery
static NumericRangeQuery<Double> newDoubleRange(String field, Double min, Double max, boolean minInclusive, boolean maxInclusive);
static NumericRangeQuery<Float> newFloatRange(String field, Float min, Float max, boolean minInclusive, boolean maxInclusive);
static NumericRangeQuery<Integer> newIntRange(String field, Integer min, Integer max, boolean minInclusive, boolean maxInclusive);
static NumericRangeQuery<Integer> newIntRange(String field, int precisionStep, Integer min, Integer max, boolean minInclusive, boolean maxInclusive);
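How the bounds and inclusive flags interact is easier to see as a predicate. The sketch below is plain Java, not Lucene: RangeSketch and inRange are hypothetical names, the parameter names are borrowed from newIntRange, and the real NumericRangeQuery works on indexed terms rather than checking values one at a time.

```java
class RangeSketch {
    // A null bound leaves that end of the range open; the inclusive flags
    // decide whether a value sitting exactly on a bound counts as a hit.
    static boolean inRange(int value, Integer min, Integer max,
                           boolean minInclusive, boolean maxInclusive) {
        if (min != null && (value < min || (value == min && !minInclusive))) {
            return false;
        }
        if (max != null && (value > max || (value == max && !maxInclusive))) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(inRange(1990, 1990, 1998, true, true));  // true
        System.out.println(inRange(1998, 1990, 1998, true, false)); // false
        System.out.println(inRange(2050, 1990, null, true, false)); // true
    }
}
```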
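PhraseQuery: phrase search, where the phrase is made up of more than one term, e.g. 中國 (China) or 鋼鐵 (steel). You can set a slop, the number of other words allowed between the words of the phrase; the default is 0.
void add(Term term); // add a term to the end of the query phrase
void setSlop(int s); // set the number of other words permitted between words in the query phrase
// sample: a bookname containing "中國" is matched; no other combination is
PhraseQuery query = new PhraseQuery();
query.add(new Term("bookname", "中"));
query.add(new Term("bookname", "國"));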
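As a rough illustration of what slop means, the sketch below checks whether two terms appear in order with at most slop other tokens between them. It is a simplification in plain Java with hypothetical names (SlopSketch, matchesInOrder) and English tokens for readability; Lucene's own sloppy matching works on indexed positions and can also match reordered terms at a higher cost, which this ignores.

```java
import java.util.Arrays;
import java.util.List;

class SlopSketch {
    // True if `first` is followed by `second` with at most `slop` other tokens in between.
    static boolean matchesInOrder(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            // Only the next slop + 1 positions after `first` are close enough.
            int last = Math.min(tokens.size() - 1, i + 1 + slop);
            for (int j = i + 1; j <= last; j++) {
                if (tokens.get(j).equals(second)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> adjacent = Arrays.asList("apache", "lucene", "search");
        List<String> gapped = Arrays.asList("apache", "solr", "lucene");
        System.out.println(matchesInOrder(adjacent, "apache", "lucene", 0)); // true
        System.out.println(matchesInOrder(gapped, "apache", "lucene", 0));   // false
        System.out.println(matchesInOrder(gapped, "apache", "lucene", 1));   // true
    }
}
```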
MultiPhraseQuery: for phrases that share a prefix, a suffix or a middle part, e.g. 中國好聲音 and 美國好聲音.
void add(Term term); // add a single term at the next position in the phrase
void add(Term[] terms); // add multiple terms at the next position in the phrase
// sample: every term must be in the same Field
MultiPhraseQuery query = new MultiPhraseQuery();
query.add(new Term[]{new Term("song", "中"), new Term("song", "美")});
query.add(new Term("song", "國"));
query.add(new Term("song", "好"));
query.add(new Term("song", "聲"));
query.add(new Term("song", "音"));
PrefixQuery: prefix matching.
PrefixQuery query = new PrefixQuery(new Term("bookname", "鋼")); // finds booknames that start with 鋼
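FuzzyQuery: fuzzy matching. Two strings are compared by transforming one into the other through single-character operations (insert, delete, replace). Each operation costs a fixed amount, and the total is the distance (fuzziness) between the two strings.
FuzzyQuery(Term term);
FuzzyQuery(Term term, int maxEdits); // maxEdits: must be within an edit distance of at most maxEdits from term
FuzzyQuery(Term term, int maxEdits, int prefixLength); // prefixLength: length of the common (non-fuzzy) prefix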
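The distance described above can be computed with the classic Levenshtein algorithm. The following is a plain-Java sketch of it, with hypothetical names (EditDistanceSketch, distance, withinMaxEdits). It only mirrors the idea of maxEdits and prefixLength; Lucene's own FuzzyQuery is implemented differently and may also count a swap of two adjacent characters as a single edit.

```java
class EditDistanceSketch {
    // Levenshtein distance: minimum number of single-character inserts,
    // deletes and replacements needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int replaceCost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + replaceCost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // The first prefixLength characters must match exactly; the strings as a
    // whole may differ by at most maxEdits edits.
    static boolean withinMaxEdits(String term, String candidate, int maxEdits, int prefixLength) {
        int p = Math.min(prefixLength, term.length());
        if (!candidate.startsWith(term.substring(0, p))) {
            return false;
        }
        return distance(term, candidate) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(distance("work", "word"));             // 1
        System.out.println(withinMaxEdits("work", "fork", 1, 1)); // false
    }
}
```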
WildcardQuery: uses the wildcards '?' (matches exactly one character) and '*' (matches zero or more characters).
WildcardQuery query = new WildcardQuery(new Term("bookname", "?o*"));
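To see which terms a pattern such as "?o*" accepts, here is a small plain-Java matcher for the two wildcards. It is only a sketch with hypothetical names (WildcardSketch, matches); Lucene evaluates wildcard patterns against the term dictionary in its own way.

```java
class WildcardSketch {
    // dp[i][j] is true when the first i pattern characters match the first j text characters.
    static boolean matches(String pattern, String text) {
        boolean[][] dp = new boolean[pattern.length() + 1][text.length() + 1];
        dp[0][0] = true;
        for (int i = 1; i <= pattern.length(); i++) {
            char p = pattern.charAt(i - 1);
            if (p == '*') {
                dp[i][0] = dp[i - 1][0]; // '*' may match the empty string
            }
            for (int j = 1; j <= text.length(); j++) {
                if (p == '*') {
                    // '*' matches nothing (dp[i-1][j]) or one more character (dp[i][j-1]).
                    dp[i][j] = dp[i - 1][j] || dp[i][j - 1];
                } else if (p == '?' || p == text.charAt(j - 1)) {
                    // '?' or an equal literal consumes exactly one character.
                    dp[i][j] = dp[i - 1][j - 1];
                }
            }
        }
        return dp[pattern.length()][text.length()];
    }

    public static void main(String[] args) {
        System.out.println(matches("?o*", "work"));   // true
        System.out.println(matches("?o*", "lucene")); // false
    }
}
```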
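A filter acts as a mandatory condition on a search and is used to restrict the results, for example by the security level of the documents returned. All filters extend org.apache.lucene.search.Filter. Because filter conditions are mostly independent of the query, there is no need to walk the index on every search, so Lucene caches filter results to avoid re-scanning the index to filter documents over and over.
Commonly used filters include NumericRangeFilter, PrefixFilter and TermRangeFilter; CachingWrapperFilter, which wraps another Filter to add caching; and FieldCacheRangeFilter and FieldCacheTermsFilter, which cache on a specific Field.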
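The caching idea can be sketched in a few lines of plain Java. Everything here is hypothetical (the CachingFilterSketch class, the string segment names, the computations counter) and only illustrates "compute the accepted documents once, then reuse them"; it is not how CachingWrapperFilter is actually written.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class CachingFilterSketch {
    // The wrapped, expensive step: works out which doc IDs a segment's filter accepts.
    private final Function<String, BitSet> delegate;
    private final Map<String, BitSet> cache = new HashMap<>();
    int computations = 0; // counts how often the expensive step really ran

    CachingFilterSketch(Function<String, BitSet> delegate) {
        this.delegate = delegate;
    }

    // Returns the cached bit set for a segment, computing it only on first use.
    BitSet acceptedDocs(String segment) {
        return cache.computeIfAbsent(segment, s -> {
            computations++;
            return delegate.apply(s);
        });
    }

    public static void main(String[] args) {
        CachingFilterSketch filter = new CachingFilterSketch(segment -> {
            BitSet accepted = new BitSet();
            accepted.set(1);
            accepted.set(3);
            return accepted;
        });
        filter.acceptedDocs("segment-0");
        filter.acceptedDocs("segment-0"); // served from the cache
        System.out.println(filter.computations); // 1
    }
}
```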
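org.apache.lucene.queryParser parses a query string into a Query. The supported syntax is:
Query ::= ( Clause )*
Clause ::= ["+", "-"] [<TERM> ":"] ( <TERM> | "(" Query ")" )
+ means required, - means excluded, : restricts the search to a given Field, and ? and * are wildcards. Examples:
+bookname:java -bookname:structs, matches docs whose bookname contains java but not structs
publishdate:[1990 TO 1998], publication date between 1990 and 1998
bookname:work~0.5, fuzzy query
bookname:"apache lucene"~5, sloppy phrase query: bookname must contain both apache and lucene, no more than 5 words apart
"God helps", the quotes make this one complete phrase query instead of separate terms
bookname:(java search), several space-separated words need parentheses; otherwise the second word, "search", is not treated as a search on bookname but as a search on the default field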
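The effect of +, - and the field prefix, including the default-field behaviour in the last example, can be imitated with a toy parser. This plain-Java sketch uses hypothetical names (ClauseSketch, parse, the default field "contents") and handles only whitespace-separated [+|-][field:]term clauses, with no quotes, parentheses, ranges or analysis. It assumes unprefixed clauses are optional (SHOULD), which matches the parser's default OR operator.

```java
import java.util.ArrayList;
import java.util.List;

class ClauseSketch {
    // Turns each whitespace-separated clause into "OCCUR field:term".
    static List<String> parse(String query, String defaultField) {
        List<String> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            String occur = "SHOULD"; // assumed default for clauses without + or -
            if (token.startsWith("+")) {
                occur = "MUST";
                token = token.substring(1);
            } else if (token.startsWith("-")) {
                occur = "MUST_NOT";
                token = token.substring(1);
            }
            String field = defaultField; // used when the clause names no field
            int colon = token.indexOf(':');
            if (colon > 0) {
                field = token.substring(0, colon);
                token = token.substring(colon + 1);
            }
            clauses.add(occur + " " + field + ":" + token);
        }
        return clauses;
    }

    public static void main(String[] args) {
        System.out.println(parse("+bookname:java -bookname:structs", "contents"));
        // [MUST bookname:java, MUST_NOT bookname:structs]
        System.out.println(parse("bookname:java search", "contents"));
        // [SHOULD bookname:java, SHOULD contents:search]
    }
}
```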
Commonly used methods:
Query parse(String query);
QueryParser(Version matchVersion, String f, Analyzer a) // the analyzer should be the same one used to build the index
Note:
Once a Query is built, you can call query.toString() to see what will actually be searched.