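Elasticsearch 5.5 Essentials: The Java Client (Part 2)

Preface

  • I have always worked in Java development and am particularly fond of the language. Elasticsearch officially offers two ways to access its API from Java, listed below. I chose the Java API, and that is where my run of pitfalls began (the SDK documentation is a headache to read, but once I worked through it step by step it turned out to be fairly simple):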
  • Java API [5.5] — other versions
  • Java REST Client [5.5] — other versions

      Note (official ES client API docs): https://www.elastic.co/guide/en/elasticsearch/client/index.html

  • Related articles:

         Elasticsearch 5.5 Essentials (Part 1)

         Elasticsearch 5.5: Converting SQL Statements to the Java Client, and Related Caveats (Part 3)

1. Setting up the Java project

  • Maven dependencies for the client. I configured logging with slf4j + log4j2; that configuration is too long to paste here.
    <dependency>
    	<groupId>org.elasticsearch</groupId>
    	<artifactId>elasticsearch</artifactId>
    	<version>5.5.1</version>
    </dependency>
    <!-- Required: this is the jar that provides the transport client -->
    <dependency>
    	<groupId>org.elasticsearch.client</groupId>
    	<artifactId>transport</artifactId>
    	<version>5.5.1</version>
    </dependency>
    <!-- Added because the ES jars needed guava in this project -->
    <dependency>
    	<groupId>com.google.guava</groupId>
    	<artifactId>guava</artifactId>
    	<version>18.0</version>
    </dependency>

     

  • Java code for connecting to an ES node:
    Settings settings = Settings.builder()
                        // cluster name
    					.put("cluster.name", "onesearch")
                        // sniff the other cluster nodes automatically
    					.put("client.transport.sniff", true)
    					.put("discovery.type", "zen")
    					.put("discovery.zen.minimum_master_nodes", 1)
    					.put("discovery.zen.ping_timeout", "500ms")
    					.put("discovery.initial_state_timeout", "500ms")
    					.build();
    Client client = new PreBuiltTransportClient(settings)
    					.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(ip), 9300));

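    If the program starts without errors, you have successfully connected to ES.

 2. Working with index data from the Java client

  • When you are starting out, the official documentation is enough to make you want to swear: sometimes it drops an example on you out of nowhere, and sometimes it simply pastes the REST-style JSON instead. In practice the Java API builders describe the same request you would otherwise write as JSON, and the client then talks to ES over netty. Compared with plain HTTP access, the SDK's ability to sniff nodes automatically is a real plus: if one node goes down, another can still be used. An HTTP call pinned to a single IP has no such advantage.

Example 1: How do I write data into ES?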

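/**
* See the official site for the basic ES field types.
* If you write data from a map and let ES create the index, ES infers each field's type from the type of the map value.
* For example, an int such as age becomes a whole-number type (dynamic mapping picks long unless the mapping declares integer explicitly); not covered in detail here.
* Using a map has one big drawback (unless you wrap the data in your own object): when you store a java.util.Date, ES saves it converted to UTC.
* The only way to control this is to define the field's properties in the index mapping through the API, as shown later.
**/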
@Test
public void createData() {
	Map<String, Object> map = new HashMap<String, Object>();
	// map.put("name", "Smith Wang");
	map.put("name", "Smith Chen");
	// map.put("age", 20);
	map.put("age", 5);
	// map.put("interests", new String[]{"sports","film"});
	map.put("interests", new String[] { "reading", "film" });
	// map.put("about", "I love to go rock music");
	map.put("about", "I love to go rock climbing");

	IndexResponse response = client.prepareIndex("megacorp", "employee", UUID.randomUUID().toString())
			.setSource(map).get();
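	System.out.println("index result status=" + response.status().getStatus() + ", id=" + response.getId());
}
  •  Note: the first argument of prepareIndex is the index, the second is the type, and the third is the document ID (using a UUID is not recommended; more on that later).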

 

    After that, a basic query will return the document you just inserted.
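The note about java.util.Date being stored in UTC is easier to see with a concrete value. The sketch below uses only the JDK (no ES client) and renders one instant twice, once in UTC and once in an assumed local zone of UTC+8. The class name `UtcDateSketch` and the chosen zone are illustrative assumptions, not part of the original code.

```java
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;

class UtcDateSketch {
    // ISO-8601 style pattern with an explicit offset ("Z" for UTC, "+08:00" for UTC+8).
    private static final DateTimeFormatter ISO_WITH_OFFSET =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssXXX");

    /** Renders the instant held by a Date as seen from the given time zone. */
    static String render(Date date, ZoneId zone) {
        return ISO_WITH_OFFSET.format(date.toInstant().atZone(zone));
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // a Date is just an instant; it carries no zone of its own
        System.out.println(render(epoch, ZoneOffset.UTC));        // the UTC view
        System.out.println(render(epoch, ZoneOffset.ofHours(8))); // assumed local zone (UTC+8)
    }
}
```

Both lines describe the same instant. If the stored value looks "shifted" when you read it back, it is most likely the UTC view of the local time you wrote.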

---------------------------------------------------------------------------------------------------

 

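Example 2: How do I query ES by field? (My examples are all adapted from Elasticsearch: The Definitive Guide, because the book uses REST throughout rather than the Java API.)

/**
 * Using match: the query text is analyzed (tokenized) before searching
 */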
@Test
public void match() {
	SearchRequestBuilder requestBuilder = client.prepareSearch("megacorp").setTypes("employee")
			.setQuery(QueryBuilders.matchQuery("about", "rock climbing"));
	System.out.println(requestBuilder.toString());

	SearchResponse response = requestBuilder.execute().actionGet();

	System.out.println(response.status());
	if (response.status().getStatus() == 200) {
		for (SearchHit hits : response.getHits().getHits()) {
			System.out.println(hits.getSourceAsString());
		}
	}
}

OK, those are the most basic operations. Nothing difficult so far.

 

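3. Writing more complex queries with the Java API

  • match phrase: exact phrase matching
/**
	 * Using matchPhrase: exact phrase matching
     * Without matchPhraseQuery, "rock climbing" is split into separate terms for the query
	 */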
	@Test
	public void matchPhrase() {
		SearchRequestBuilder requestBuilder = client.prepareSearch("megacorp").setTypes("employee")
				.setQuery(QueryBuilders.matchPhraseQuery("about", "rock climbing"));
		System.out.println(requestBuilder.toString());

		SearchResponse response = requestBuilder.execute().actionGet();
		System.out.println(response.status());
		if (response.status().getStatus() == 200) {
			for (SearchHit hits : response.getHits().getHits()) {
				System.out.println(hits.getSourceAsString());
			}
		}
	}
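To make the difference between match and match phrase concrete, here is a toy model in plain Java. It is not how ES implements either query: the whitespace "analyzer" and the class `MatchVsPhraseSketch` are assumptions made for illustration only. The sample sentences are the two `about` values that appear in the createData example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class MatchVsPhraseSketch {
    /** Toy analyzer (an assumption): lower-case the text and split on whitespace. */
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    /** match-like: true when at least one query token occurs in the field. */
    static boolean matchAny(String field, String query) {
        List<String> fieldTokens = tokens(field);
        for (String queryToken : tokens(query)) {
            if (fieldTokens.contains(queryToken)) {
                return true;
            }
        }
        return false;
    }

    /** match_phrase-like: true when the query tokens occur consecutively and in order. */
    static boolean matchPhrase(String field, String query) {
        return Collections.indexOfSubList(tokens(field), tokens(query)) >= 0;
    }

    public static void main(String[] args) {
        String music = "I love to go rock music";
        String climbing = "I love to go rock climbing";
        System.out.println(matchAny(music, "rock climbing"));       // shares the token "rock"
        System.out.println(matchPhrase(music, "rock climbing"));    // the phrase is not present
        System.out.println(matchPhrase(climbing, "rock climbing")); // the phrase is present
    }
}
```

In this model a document that only mentions "rock" still satisfies the match-style check, which is why the phrase query is the one to use when the words must appear together.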
  • Highlighting
@Test
public void highlight() {
	HighlightBuilder highlightBuilder = new HighlightBuilder();
	// highlightBuilder.preTags(FragmentSettings.prefix);// set the opening tag
	// highlightBuilder.postTags(FragmentSettings.subfix);// set the closing tag
	highlightBuilder.field("about");
	// highlightBuilder.fragmenter(FragmentSettings.SPAN)
	// .fragmentSize(FragmentSettings.HIGHLIGHT_MAX_WORDS).numOfFragments(5);
	SearchRequestBuilder requestBuilder = client.prepareSearch("megacorp").setTypes("employee")
			.setQuery(QueryBuilders.matchPhraseQuery("about", "rock climbing")).highlighter(highlightBuilder);
	System.out.println(requestBuilder.toString());

	SearchResponse response = requestBuilder.execute().actionGet();

	System.out.println(response.status());
	if (response.status().getStatus() == 200) {
		for (SearchHit hits : response.getHits().getHits()) {
			System.out.println(hits.getSourceAsString());
			// overwrite the matching field in the source with the highlighted field here
			System.out.println(hits.getHighlightFields());
		}
	}

}
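The comment in the loop above suggests overwriting source fields with their highlighted versions. A minimal sketch of that merge step in plain Java follows. It assumes the highlight fragments for each field have already been joined into a single string; the class `HighlightOverlaySketch` and the sample fragment are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HighlightOverlaySketch {
    /**
     * Returns a copy of the source in which every field that has a highlighted
     * version is replaced by it. Fields without a highlight stay unchanged, and
     * highlights for fields that are not in the source are ignored.
     */
    static Map<String, Object> overlay(Map<String, Object> source, Map<String, String> highlighted) {
        Map<String, Object> merged = new LinkedHashMap<>(source);
        for (Map.Entry<String, String> entry : highlighted.entrySet()) {
            if (merged.containsKey(entry.getKey())) {
                merged.put(entry.getKey(), entry.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("name", "Smith Chen");
        source.put("about", "I love to go rock climbing");

        Map<String, String> highlighted = new LinkedHashMap<>();
        // Hypothetical fragment; the real markup depends on the configured pre/post tags.
        highlighted.put("about", "I love to go <em>rock</em> <em>climbing</em>");

        System.out.println(overlay(source, highlighted));
    }
}
```

Working on a copy keeps the original source map intact, which is handy if you still need the unhighlighted text elsewhere.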
  • Querying in the style of a relational GROUP BY
@Test
public void aggregation() {
	SearchRequestBuilder searchBuilder = client.prepareSearch("megacorp").setTypes("employee")
			.addAggregation(AggregationBuilders.terms("by_interests").field("interests")
					.subAggregation(AggregationBuilders.terms("by_age").field("age")).size(10));
	System.out.println(searchBuilder.toString());
	SearchResponse response = searchBuilder.execute().actionGet();

	if (response.status().getStatus() == 200) {
		for (SearchHit hits : response.getHits().getHits()) {
			System.out.println(hits.getSourceAsString());
		}
	}
	StringTerms terms = response.getAggregations().get("by_interests");
	for (StringTerms.Bucket bucket : terms.getBuckets()) {
		System.out.println("-interest:" + bucket.getKey() + "," + bucket.getDocCount());
		if (bucket.getAggregations() != null && bucket.getAggregations().get("by_age") != null) {
			LongTerms ageTerms = bucket.getAggregations().get("by_age");
			for (LongTerms.Bucket bucket2 : ageTerms.getBuckets()) {
				System.out.println("--------by age:" + bucket2.getKey() + "," + bucket2.getDocCount());
			}
		}
	}
}
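  • GROUP BY together with an average (or a sum, etc.)
/**
	 * Terms aggregation plus average age
     * For a sum, use AggregationBuilders.sum
     * Note: in AggregationBuilders.terms("by_interests"), by_interests is the key that names the grouping;
     * use the same key to read the values back from the response
	 */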
	@Test
	public void aggregationAvg() {
		SearchRequestBuilder searchBuilder = client.prepareSearch("megacorp").setTypes("employee")
				.addAggregation(AggregationBuilders.terms("by_interests").field("interests")
						.subAggregation(AggregationBuilders.avg("avg_age").field("age")).size(10));
		System.out.println(searchBuilder.toString());
		SearchResponse response = searchBuilder.execute().actionGet();
		if (response.status().getStatus() == 200) {
			for (SearchHit hits : response.getHits().getHits()) {
				System.out.println(hits.getSourceAsString());
			}
		}

		StringTerms terms = response.getAggregations().get("by_interests");
		for (StringTerms.Bucket bucket : terms.getBuckets()) {
			System.out.println("-interest:" + bucket.getKey() + "," + bucket.getDocCount() + ",");
			InternalAvg agg = bucket.getAggregations().get("avg_age");
			System.out.println("---------avg age:" + agg.value() + ",formatted=" + agg.getValueAsString());
		}
	}
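If the bucket structure is hard to picture, the following in-memory analogy may help. It is plain Java, not ES: it groups the two sample employees from the createData example by interest and computes a document count and an average age per bucket, roughly what the `by_interests` and `avg_age` aggregations return. The class `GroupBySketch` and its helpers are illustrative assumptions, and the buckets here are sorted alphabetically rather than in whatever order ES returns them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupBySketch {
    /** Minimal stand-in for an employee document. */
    static final class Employee {
        final String name;
        final int age;
        final List<String> interests;

        Employee(String name, int age, String... interests) {
            this.name = name;
            this.age = age;
            this.interests = Arrays.asList(interests);
        }
    }

    /** Groups ages by interest; a document with two interests lands in two buckets. */
    static Map<String, List<Integer>> agesByInterest(List<Employee> employees) {
        Map<String, List<Integer>> buckets = new TreeMap<>();
        for (Employee employee : employees) {
            for (String interest : employee.interests) {
                buckets.computeIfAbsent(interest, key -> new ArrayList<>()).add(employee.age);
            }
        }
        return buckets;
    }

    /** Average age of one bucket, analogous to the avg_age sub-aggregation. */
    static double averageAge(List<Integer> ages) {
        return ages.stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("Smith Wang", 20, "sports", "film"),
                new Employee("Smith Chen", 5, "reading", "film"));
        for (Map.Entry<String, List<Integer>> bucket : agesByInterest(employees).entrySet()) {
            System.out.println("-interest:" + bucket.getKey() + "," + bucket.getValue().size()
                    + ",avg age:" + averageAge(bucket.getValue()));
        }
    }
}
```

The point to take away is that an array field such as interests puts one document into several buckets, so the bucket counts can add up to more than the number of documents.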

 

4. Index operations with the Java API

  • Below is the officially documented way to create an index and specify field types. It is painful to read.
@Test
	public void createIndexInfo() {
		client.admin().indices().prepareCreate("megacorp")
				.setSettings(Settings.builder().put("index.number_of_shards", 4).put("index.number_of_replicas", 1))
				.addMapping("employee",
						"{\n" + "  \"properties\": {\n" + "    \"age\": {\n" + "      \"type\": \"integer\"\n"
								+ "    },\n" + "    \"name\": {\n" + "      \"type\": \"text\"\n" + "    },\n"
								+ "    \"interests\": {\n" + "      \"type\": \"text\",\n"
								+ "      \"fielddata\": true\n" + "    },\n" + "    \"about\": {\n"
								+ "      \"type\": \"text\"\n" + "    }\n" + "  }\n" + "}",
						XContentType.JSON)
				.get();
	}
  • Of course, the official docs also offer a more elegant solution (XContentBuilder), as follows:
XContentBuilder mapping = JsonXContent.contentBuilder()
.startObject()
	.startObject("productIndex")
		.startObject("properties")
			.startObject("title").field("type", "string").field("store", "yes").endObject()
			.startObject("description").field("type", "string").field("index", "not_analyzed").endObject()
			.startObject("price").field("type", "double").endObject()
			.startObject("onSale").field("type", "boolean").endObject()
			.startObject("type").field("type", "integer").endObject()
			.startObject("createDate").field("type", "date").endObject()
		.endObject()
	.endObject()
.endObject();


This is equivalent to:
{
	"productIndex":{
		"properties": {
			"title":{
				"type":"string",
				"store":"yes"
			},
			..
		}
	}
}

All in all, this approach is somewhat better than concatenating strings, and it feels less crude.

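  • Creating an index entirely through the API. Please bear with this one: it comes from a complete job I built that extracts data from a relational database and writes it to ES. I added some XML-based plumbing that maps database columns to ES fields, so just look at the lines marked "key line" and focus on the basic creation flow first.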
@Test
	public void createIndexWithXML() throws Exception {
        // key line
		IndicesExistsRequestBuilder indices = client.admin().indices().prepareExists("test");
		List<SqlMappingConfig> mappingList = ElasticXMLReader.getSearchInfoList();
		// key line
		if(!indices.execute().actionGet().isExists()) {
            // key line
			XContentBuilder builder = JsonXContent.contentBuilder();
			builder.startObject().startObject("properties");
			SqlMappingConfig mapping = mappingList.get(0);
			for(Column column : mapping.getSearchInfo().getColumns()) {
				builder.startObject(column.getAttriMap().get("index-column"));
					for(Entry<String, String> entry : column.getAttriMap().entrySet()) {
						if(!entry.getKey().equals("index-column") &&  !entry.getKey().equals("sql-column")) {
							builder.field(entry.getKey().equals("data-type")?"type":entry.getKey(), entry.getValue());
						}
					}
				builder.endObject();
			}
			builder.endObject().endObject();

            // key line
			PutMappingRequest mappingRequest = Requests.putMappingRequest(mapping.getSearchInfo().getIndex()).type(mapping.getSearchInfo().getType());
			mappingRequest.source(builder);
			
            // key line
			CreateIndexResponse response = client.admin().indices().prepareCreate(mapping.getSearchInfo().getIndex())
					.setSettings(Settings.builder().put("index.number_of_shards", 8).put("index.number_of_replicas", 1))
					.addMapping(mapping.getSearchInfo().getType(), mappingRequest.source(),XContentType.JSON).execute().actionGet();
			
			System.out.println(response.isAcknowledged());
		}
	}
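The inner loop of createIndexWithXML is the part that turns column attributes into mapping properties. The sketch below isolates that logic in plain Java with nested maps instead of XContentBuilder. The attribute keys (`index-column`, `sql-column`, `data-type`) come from the code above; the class `MappingFromColumnsSketch`, the `column` helper and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MappingFromColumnsSketch {
    /** Helper (an assumption): builds one column's attribute map from key/value pairs. */
    static Map<String, String> column(String... keyValues) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            attributes.put(keyValues[i], keyValues[i + 1]);
        }
        return attributes;
    }

    /**
     * Builds the "properties" part of a mapping: one entry per column, keyed by
     * index-column. The bookkeeping keys index-column and sql-column are skipped,
     * data-type is renamed to type, and every other attribute is copied as-is.
     */
    static Map<String, Map<String, String>> toProperties(List<Map<String, String>> columns) {
        Map<String, Map<String, String>> properties = new LinkedHashMap<>();
        for (Map<String, String> attributes : columns) {
            Map<String, String> field = new LinkedHashMap<>();
            for (Map.Entry<String, String> attribute : attributes.entrySet()) {
                String key = attribute.getKey();
                if (key.equals("index-column") || key.equals("sql-column")) {
                    continue;
                }
                field.put(key.equals("data-type") ? "type" : key, attribute.getValue());
            }
            properties.put(attributes.get("index-column"), field);
        }
        return properties;
    }

    public static void main(String[] args) {
        // Sample columns are made up for illustration.
        List<Map<String, String>> columns = Arrays.asList(
                column("index-column", "age", "sql-column", "AGE", "data-type", "integer"),
                column("index-column", "about", "sql-column", "ABOUT", "data-type", "text"));
        System.out.println(toProperties(columns));
    }
}
```

Keeping this transformation separate from the client calls makes it easy to check the generated mapping before any index is created.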

 

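Closing thoughts

     Many people are purists who prefer to drive the API entirely through SDK code. I stepped into countless pitfalls along the way, and the code above is what I worked out by trial and error. I had joined an ES study group, but, perhaps because my questions were too basic, nobody answered them, so I eventually and regretfully left. Still, I am grateful to that group for one thing I learned there: the Elasticsearch-sql tool. It converts relational-database SQL statements into ES query parameters, which is very convenient. You can take the generated JSON and work backwards from it to write the Java code (awkward, but already a big help).

     Next I will write an article with examples of turning relational-database queries into ES Java code. Stay tuned.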
