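When you search with Lucene, the results come back with the most relevant documents first. That ordering is driven by Lucene's scoring mechanism. In production you will almost certainly need a more elaborate custom scoring scheme tailored to the business requirements, but for now let's take a quick look at how a score can be set in Lucene.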
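For reference, the default similarity in Lucene 3.x documents its practical scoring function roughly as follows. This is a summary from the Lucene 3.x `Similarity` documentation rather than anything specific to this post; the point is to show where a document boost enters the score.

```latex
\mathrm{score}(q,d) = \mathrm{coord}(q,d)\cdot \mathrm{queryNorm}(q)\cdot
  \sum_{t \in q} \Big( \mathrm{tf}(t,d)\cdot \mathrm{idf}(t)^{2}\cdot
  \mathrm{boost}(t)\cdot \mathrm{norm}(t,d) \Big)

\mathrm{norm}(t,d) = \mathrm{docBoost}(d)\cdot \mathrm{lengthNorm}(f)\cdot
  \prod_{f \in d} \mathrm{fieldBoost}(f)
```

Here boost(t) is the query-time boost on a term and docBoost(d) is the value passed to `doc.setBoost(...)` at index time. Because the document boost is folded into the norm, it only takes effect on fields that are indexed with norms.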
// Fragment of the IndexUtil class. The fields ids, emails, contents, names,
// attachs, dates and directory, and the setDates() method, are defined elsewhere in the class.
private Map<String,Float> scores = new HashMap<String,Float>();

// Constructor
public IndexUtil() {
    try {
        setDates();
        // Email domains that should receive a relatively high score
        scores.put("itat.org", 2.0f);
        scores.put("zttc.edu", 1.5f);
        directory = FSDirectory.open(new File("d:/lucene/index02"));
    } catch (IOException e) {
        e.printStackTrace();
    }
}

// Build the index
public void index() {
    IndexWriter writer = null;
    try {
        writer = new IndexWriter(directory,
                new IndexWriterConfig(Version.LUCENE_35, new StandardAnalyzer(Version.LUCENE_35)));
        writer.deleteAll();
        Document doc = null;
        for (int i = 0; i < ids.length; i++) {
            doc = new Document();
            doc.add(new Field("id", ids[i], Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
            doc.add(new Field("email", emails[i], Field.Store.YES, Field.Index.NOT_ANALYZED));
            doc.add(new Field("email", "test" + i + "@test.com", Field.Store.YES, Field.Index.NOT_ANALYZED));
            doc.add(new Field("content", contents[i], Field.Store.NO, Field.Index.ANALYZED));
            doc.add(new Field("name", names[i], Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS));
            // Store a number
            doc.add(new NumericField("attach", Field.Store.YES, true).setIntValue(attachs[i]));
            // Store a date as a long timestamp
            doc.add(new NumericField("date", Field.Store.YES, true).setLongValue(dates[i].getTime()));
            // Take the part of the email address after the '@'
            String et = emails[i].substring(emails[i].lastIndexOf("@") + 1);
            // Set the document boost: listed domains get their configured value, all others 0.5
            if (scores.containsKey(et)) {
                doc.setBoost(scores.get(et));
            } else {
                doc.setBoost(0.5f);
            }
            writer.addDocument(doc);
        }
    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (LockObtainFailedException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (writer != null) writer.close();
        } catch (CorruptIndexException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
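The boost-selection step in the indexing loop can be tried out on its own, without Lucene. The following is a minimal sketch in plain Java; the class `BoostLookup` and its methods `domainOf` and `boostFor` are hypothetical names introduced for illustration and are not part of the original `IndexUtil` class. The domain values and the 0.5 default are taken from the code above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the original IndexUtil.
class BoostLookup {
    // Boost the original code assigns to domains that are not listed.
    static final float DEFAULT_BOOST = 0.5f;

    private static final Map<String, Float> SCORES = new HashMap<String, Float>();
    static {
        SCORES.put("itat.org", 2.0f);
        SCORES.put("zttc.edu", 1.5f);
    }

    // Returns the text after the last '@', or the whole string if there is no '@'.
    static String domainOf(String email) {
        return email.substring(email.lastIndexOf('@') + 1);
    }

    // Returns the configured boost for the email's domain, or the default.
    static float boostFor(String email) {
        Float boost = SCORES.get(domainOf(email));
        return boost != null ? boost : DEFAULT_BOOST;
    }

    public static void main(String[] args) {
        System.out.println(boostFor("aa@itat.org"));   // prints 2.0
        System.out.println(boostFor("bb@zttc.edu"));   // prints 1.5
        System.out.println(boostFor("cc@test.com"));   // prints 0.5
    }
}
```

In the indexing loop, the returned value would be what gets passed to `doc.setBoost(...)`.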
Test results
Before boosting, the results looked like this:
After boosting, the results looked like this:
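Note: here setBoost is applied to the doc. However, I previously read a blog post suggesting that from Lucene 4.x onward you can apparently no longer boost a document, only a Field. While working through the book Lucene in Action, I also found that you can boost a Query, which is especially useful with a query such as BooleanQuery that combines several Query objects.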
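To get a feel for what boosting one clause of a multi-clause query does to the ranking, here is a toy model in plain Java. It is a deliberately simplified sketch and not Lucene code: it assumes the combined score is just the boost-weighted sum of per-clause scores, and it ignores the coord and queryNorm factors that Lucene's real scoring applies. The class `QueryBoostSketch`, the method `combinedScore`, and all the numbers are made up for illustration.

```java
// Toy model for illustration only; it does not call Lucene.
class QueryBoostSketch {
    // Assumed simplification: combined score = sum of (clause boost * clause score).
    static double combinedScore(double[] clauseScores, double[] clauseBoosts) {
        double total = 0.0;
        for (int i = 0; i < clauseScores.length; i++) {
            total += clauseBoosts[i] * clauseScores[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up per-clause scores, in the order {name clause, content clause}.
        double[] docA = {0.2, 0.9};
        double[] docB = {0.6, 0.4};

        double[] noBoost = {1.0, 1.0};
        double[] nameBoosted = {3.0, 1.0};

        // Without boosts, docA ranks higher than docB.
        System.out.println(combinedScore(docA, noBoost) > combinedScore(docB, noBoost));         // prints true
        // With the name clause boosted, docB overtakes docA.
        System.out.println(combinedScore(docB, nameBoosted) > combinedScore(docA, nameBoosted)); // prints true
    }
}
```

The design point is that a query-time boost changes ranking without reindexing, whereas a document boost is fixed when the document is written to the index.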