Solr Search Basics

The library and code used in this example both come from:

http://www.cnblogs.com/TerryLiang/archive/2011/04/17/2018962.html

We use C# to simulate searching and creating, deleting, and updating the index. A screenshot of the demo:

[Screenshot: demo application]

Part 1. Preparation:

First, prepare an entity class, Product:

  public class Product
  {
      public string ID { get; set; }
      public string Name { get; set; }
      public String[] Features { get; set; }
      public float Price { get; set; }
      public int Popularity { get; set; }
      public bool InStock { get; set; }
      public DateTime Incubationdate_dt { get; set; }
  }

Next, create a deserializer class, ProductDeserializer, for this entity:

  class ProductDeserializer : IObjectDeserializer<Product>
  {
      public IEnumerable<Product> Deserialize(SolrDocumentList result)
      {
          foreach (SolrDocument doc in result)
          {
              yield return new Product()
              {
                  ID = doc["id"].ToString(),
                  Name = doc["name"].ToString(),
                  Features = (string[])((ArrayList)doc["features"]).ToArray(typeof(string)),
                  Price = (float)doc["price"],
                  Popularity = (int)doc["popularity"],
                  InStock = (bool)doc["inStock"],
                  Incubationdate_dt = (DateTime)doc["incubationdate_dt"]
              };
          }
      }
  }

Add a reference to EasyNet.Solr.dll to the project.

Part 2. Building the search:

Initialize the Solr client:

        #region Initialization
        static List<SolrInputDocument> docs = new List<SolrInputDocument>();
        static OptimizeOptions optimizeOptions = new OptimizeOptions();
        static ISolrResponseParser<NamedList, ResponseHeader> binaryResponseHeaderParser = new BinaryResponseHeaderParser();
        static IUpdateParametersConvert<NamedList> updateParametersConvert = new BinaryUpdateParametersConvert();
        static ISolrUpdateConnection<NamedList, NamedList> solrUpdateConnection = new SolrUpdateConnection<NamedList, NamedList>() { ServerUrl = "http://localhost:8080/solr/" };
        static ISolrUpdateOperations<NamedList> updateOperations = new SolrUpdateOperations<NamedList, NamedList>(solrUpdateConnection, updateParametersConvert) { ResponseWriter = "javabin" };

        static ISolrQueryConnection<NamedList> connection = new SolrQueryConnection<NamedList>() { ServerUrl = "http://localhost:8080/solr/" };
        static ISolrQueryOperations<NamedList> operations = new SolrQueryOperations<NamedList>(connection) { ResponseWriter = "javabin" };

        static IObjectDeserializer<Product> exampleDeserializer = new ProductDeserializer();
        static ISolrResponseParser<NamedList, QueryResults<Product>> binaryQueryResultsParser = new BinaryQueryResultsParser<Product>(exampleDeserializer);
        #endregion

First we simulate a data source. A few built-in records serve as the example:

            List<Product> products = new List<Product>();
            Product juzi = new Product
            {
                ID = "SOLR1000",
                Name = "浙江桔子",
                Features = new String[] { 
                    "色香味兼優", 
                    "既可鮮食,又可加工成以果汁",
                    "果實養分豐富"},
                Price = 2.0f,
                Popularity = 100,
                InStock = true,
                Incubationdate_dt = new DateTime(2006, 1, 17, 0, 0, 0, DateTimeKind.Utc)
            };
            products.Add(juzi);

            var doc = new SolrInputDocument();
            doc.Add("id", new SolrInputField("id", juzi.ID));
            doc.Add("name", new SolrInputField("name", juzi.Name));
            doc.Add("features", new SolrInputField("features", juzi.Features));
            doc.Add("price", new SolrInputField("price", juzi.Price));
            doc.Add("popularity", new SolrInputField("popularity", juzi.Popularity));
            doc.Add("inStock", new SolrInputField("inStock", juzi.InStock));
            doc.Add("incubationdate_dt", new SolrInputField("incubationdate_dt", juzi.Incubationdate_dt));

            docs.Add(doc);

            Product pingguo = new Product
            {
                ID = "SOLR1002",
                Name = "陝西蘋果",
                Features = new String[] {
                    "味道甜美",
                    "光澤鮮豔",
                    "養分豐富"
                },
                Price = 1.7f,
                Popularity = 50,
                InStock = true,
                Incubationdate_dt = new DateTime(2010, 1, 17, 0, 0, 0, DateTimeKind.Utc)
            };
            products.Add(pingguo);
            var doc2 = new SolrInputDocument();
            doc2.Add("id", new SolrInputField("id", pingguo.ID));
            doc2.Add("name", new SolrInputField("name", pingguo.Name));
            doc2.Add("features", new SolrInputField("features", pingguo.Features));
            doc2.Add("price", new SolrInputField("price", pingguo.Price));
            doc2.Add("popularity", new SolrInputField("popularity", pingguo.Popularity));
            doc2.Add("inStock", new SolrInputField("inStock", pingguo.InStock));
            doc2.Add("incubationdate_dt", new SolrInputField("incubationdate_dt", pingguo.Incubationdate_dt));

            docs.Add(doc2);

            dataGridView1.DataSource = products;

The same records are also added to the List<SolrInputDocument>. SolrInputDocument is the document exchange entity written by TerryLiang; it can be found in the source code he provides.

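1. Creating the index:

Creating an index means passing the raw data to Solr, which then writes files in a specific format under the Solr directory. Solr can query these files quickly.

[Screenshot: index files created under the Solr directory]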

In practice, creating the index just means using Update to POST the data to collection1. The code is as follows:

            var result = updateOperations.Update("collection1", "/update", new UpdateOptions() { OptimizeOptions = optimizeOptions, Docs = docs });
            var header = binaryResponseHeaderParser.Parse(result);

            lbl_info.Text= string.Format("Update Status:{0} QTime:{1}", header.Status, header.QTime);

Once indexing succeeds, the documents can be queried in the Solr admin UI.

[Screenshot: query results in the Solr admin UI]

Note: each time you search in the admin UI, the URL used for the search is shown in the upper right corner:

http://localhost:8080/solr/collection1/select?q=*%3A*&wt=json&indent=true

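These parameters are fairly simple: q is the query (*%3A* is the URL-encoded form of *:*, which matches all documents), wt=json selects the response format, and indent=true pretty-prints the response. The Solr documentation covers the rest.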
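
As a rough illustration of how such a URL is put together, the following sketch assembles the same three parameters in C#. The class and method names (SolrUrlSketch, BuildSelectUrl) are made up for this example and are not part of EasyNet.Solr. Depending on the .NET version, Uri.EscapeDataString may also percent-encode the asterisks, so the result is not guaranteed to be character-for-character identical to the URL shown by the admin UI.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of EasyNet.Solr.
public static class SolrUrlSketch
{
    // Builds a /select URL for one core. baseUrl and core are caller-supplied
    // values such as "http://localhost:8080/solr/" and "collection1".
    public static string BuildSelectUrl(string baseUrl, string core, string query)
    {
        var parameters = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("q", query),       // the query expression, e.g. *:*
            new KeyValuePair<string, string>("wt", "json"),     // response format
            new KeyValuePair<string, string>("indent", "true")  // pretty-print the response
        };

        // Percent-encode every value so characters like ':' survive the URL.
        string queryString = string.Join("&",
            parameters.Select(p => p.Key + "=" + Uri.EscapeDataString(p.Value)));

        return baseUrl.TrimEnd('/') + "/" + core + "/select?" + queryString;
    }
}
```

Calling SolrUrlSketch.BuildSelectUrl("http://localhost:8080/solr/", "collection1", "*:*") yields a URL with the same structure as the one above.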

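2. Creating a query

A query is simply a request submitted to the server, followed by waiting for the server to return the result. Any language that can send a request and receive the response will do; here we use the client library.

First create an ISolrQuery object and pass in the search keyword. How to build the keyword expression can be worked out from the Solr admin UI: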

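Suppose we want to find entries whose name contains "蘋果" (apple). In the admin UI, the query is entered into the query field.

[Screenshot: query input in the Solr admin UI]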

To see how Solr builds the query, tick the DebugQuery option to get debugging information.

[Screenshot: debug output from the DebugQuery option]

This means the search is restricted to the name field.

So our code needs to be written like this:

ISolrQuery query = new SolrQuery("name:"+keyWord);

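Note that keyWord is concatenated into the query string unescaped. Handling special characters and other security concerns is left to the reader.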
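
One possible way to address the escaping part, sketched under assumptions: the helper below prefixes a backslash to characters that have special meaning in the Lucene/Solr standard query syntax, and to whitespace. SolrQueryEscaper is a name invented for this example; it is not part of EasyNet.Solr, and it is not a complete security measure.

```csharp
using System.Text;

// Hypothetical helper for illustration only; not part of EasyNet.Solr.
public static class SolrQueryEscaper
{
    // Characters treated as operators or delimiters by the standard query parser.
    private const string SpecialChars = "\\+-!():^[]\"{}~*?|&/";

    // Returns the input with each special character (and whitespace)
    // preceded by a backslash so that it is matched literally.
    public static string Escape(string input)
    {
        var sb = new StringBuilder(input.Length * 2);
        foreach (char c in input)
        {
            if (SpecialChars.IndexOf(c) >= 0 || char.IsWhiteSpace(c))
            {
                sb.Append('\\');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this helper, the query could be built as new SolrQuery("name:" + SolrQueryEscaper.Escape(keyWord)). Ordinary keywords such as "蘋果" pass through unchanged.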

Querying everything is much simpler:

ISolrQuery query = SolrQuery.All;

Sending the query to the server and turning the returned data back into objects for display completes one query operation. The code is as follows:

            ISolrQuery query = SolrQuery.All;
            if (!string.IsNullOrWhiteSpace(keyWord))
            {
                query = new SolrQuery("name:"+keyWord);
            }
            var result = operations.Query("collection1", "/select", query, null);
            var header = binaryResponseHeaderParser.Parse(result);

            var examples = binaryQueryResultsParser.Parse(result);

            lbl_info.Text= string.Format("Query Status:{0} QTime:{1} Total:{2}", header.Status, header.QTime, examples.NumFound);
            dataGridView1.DataSource = examples.ToList();

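3. Incremental indexing

In practice, data is often added or changed, so the index has to be updated promptly for the new data to be searchable. This is incremental indexing. It works the same way as the initial indexing: to update existing data, submit the new version again; to add data, submit documents that do not exist yet. Documents are matched by id, which is a configuration item that can be found in D:\apache-tomcat-7.0.57\webapps\solr\solr_home\collection1\conf\schema.xml: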

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />

It can be thought of as the primary key (in schema.xml, the uniqueKey element names this field).
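
The "same id replaces, new id adds" behaviour can be pictured with an in-memory analogy. The sketch below is not Solr and does not use EasyNet.Solr; it keeps a Dictionary keyed by id, and UniqueKeySketch is a name made up for this example.

```csharp
using System.Collections.Generic;

// In-memory analogy only: a dictionary keyed by id stands in for the index.
public static class UniqueKeySketch
{
    // Applies (id, name) submissions in order and returns the resulting "index".
    public static Dictionary<string, string> Apply(
        IEnumerable<KeyValuePair<string, string>> submissions)
    {
        var index = new Dictionary<string, string>();
        foreach (var doc in submissions)
        {
            // An existing id is overwritten; an unseen id becomes a new entry.
            index[doc.Key] = doc.Value;
        }
        return index;
    }
}
```

Submitting SOLR1002 twice and SOLR1003 once leaves two entries, with SOLR1002 holding the value that was submitted last.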

The incremental indexing code is as follows:

             var docs = new List<SolrInputDocument>();
             Product hetao = new Product
             {
                 ID = "SOLR1003",
                 Name = "陝西山核桃",
                 Features = new String[] {
                     "養分好吃",
                     "微量元素豐富",
                     "補腦"
                 },
                 Price = 1.7f,
                 Popularity = 50,
                 InStock = true,
                 Incubationdate_dt = new DateTime(2010, 1, 17, 0, 0, 0, DateTimeKind.Utc)
             };
             var doc2 = new SolrInputDocument();
             doc2.Add("id", new SolrInputField("id", hetao.ID));
             doc2.Add("name", new SolrInputField("name", hetao.Name));
             doc2.Add("features", new SolrInputField("features", hetao.Features));
             doc2.Add("price", new SolrInputField("price", hetao.Price));
             doc2.Add("popularity", new SolrInputField("popularity", hetao.Popularity));
             doc2.Add("inStock", new SolrInputField("inStock", hetao.InStock));
             doc2.Add("incubationdate_dt", new SolrInputField("incubationdate_dt", hetao.Incubationdate_dt));
             docs.Clear();
             docs.Add(doc2);

             var result = updateOperations.Update("collection1", "/update", new UpdateOptions() { OptimizeOptions = optimizeOptions, Docs = docs });
             var header = binaryResponseHeaderParser.Parse(result);

             lbl_info.Text= string.Format("Update Status:{0} QTime:{1}", header.Status, header.QTime);

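4. Deleting from the index

As with deleting from a database, deletion is done by primary key. Pass delete options carrying the primary key values (DelById in the code below) to the server.

The code is as follows: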

              var result = updateOperations.Update("collection1", "/update", new UpdateOptions() { OptimizeOptions = optimizeOptions, DelById = new string[] { id } });
              var header = binaryResponseHeaderParser.Parse(result);

              lbl_info.Text=string.Format("Update Status:{0} QTime:{1}", header.Status, header.QTime);

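This completes the most basic cycle of creating an index, updating and deleting indexed documents, and querying. Queries in this example are not as fast as running them directly in the admin UI; the reason is the serialization and deserialization overhead. As noted above, any language that can send a request and receive a response can query Solr, so this step can be avoided to improve query efficiency.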
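
A minimal sketch of that direct approach, under assumptions: it fetches the JSON response with HttpClient and reads only the total hit count, skipping the typed deserialization. It assumes a .NET version that ships System.Text.Json and a request made with wt=json; RawSolrQuerySketch and its methods are names invented here and are not part of EasyNet.Solr. No timing claim is made for it.

```csharp
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of EasyNet.Solr.
public static class RawSolrQuerySketch
{
    private static readonly HttpClient Http = new HttpClient();

    // Sends a GET request to a /select URL and returns the raw response body.
    public static async Task<string> FetchJsonAsync(string selectUrl)
    {
        return await Http.GetStringAsync(selectUrl);
    }

    // Reads response.numFound from a Solr JSON response (wt=json).
    public static long ReadNumFound(string json)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            return document.RootElement
                .GetProperty("response")
                .GetProperty("numFound")
                .GetInt64();
        }
    }
}
```

For example, the body returned by FetchJsonAsync for the select URL shown earlier can be passed straight to ReadNumFound.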

