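By default Elasticsearch caps query results at 10000 (the index setting max_result_window). A query that pages beyond that, for example one with from set above 10000, returns no results and fails with an error saying the maximum was exceeded.
There are two ways to deal with this.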
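As a rough illustration of the limit, the check can be written as plain arithmetic: a page is rejected when from + size is larger than the window. The class and method names below are made up for this sketch; only the rule itself comes from Elasticsearch.

```java
// Sketch of the paging limit. Names are illustrative; the rule
// "from + size must not exceed max_result_window" is the only part taken from Elasticsearch.
class ResultWindowCheck {
    static final long DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** True when a from/size page would be rejected for the given window. */
    static boolean exceedsWindow(long from, long size, long maxResultWindow) {
        return from + size > maxResultWindow;
    }

    /** Largest "from" offset that still fits inside the window for a page size. */
    static long lastAllowedFrom(long size, long maxResultWindow) {
        return Math.max(0, maxResultWindow - size);
    }

    public static void main(String[] args) {
        System.out.println(exceedsWindow(9_980, 20, DEFAULT_MAX_RESULT_WINDOW));  // false
        System.out.println(exceedsWindow(10_000, 20, DEFAULT_MAX_RESULT_WINDOW)); // true
        System.out.println(lastAllowedFrom(20, DEFAULT_MAX_RESULT_WINDOW));       // 9980
    }
}
```

With 20 hits per page, page 500 (from = 9980) is the last one that can be fetched with from/size paging under the default window.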
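The first is to raise max_result_window. Many people do this because it is simple and blunt. The drawback is exactly that bluntness: it is fine in some situations, but not in others, because a from/size search uses heap memory and time proportional to from + size.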
PUT index/_settings
{
  "index": {
    "max_result_window": 100000000
  }
}
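The same settings update can be issued from Java. The following is a minimal sketch using the JDK's own java.net.http classes (Java 11+), not the Elasticsearch client; the host http://localhost:9200, the index name my_index and the value 100000 are placeholders, and the class name is made up for this example. The request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class MaxResultWindowRequest {
    /** JSON body for the index settings update. */
    static String settingsBody(long maxResultWindow) {
        return "{\"index\":{\"max_result_window\":" + maxResultWindow + "}}";
    }

    /** Builds (but does not send) PUT {baseUrl}/{index}/_settings. */
    static HttpRequest build(String baseUrl, String index, long maxResultWindow) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(maxResultWindow)))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical host and index name; replace with your own.
        HttpRequest request = build("http://localhost:9200", "my_index", 100_000);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(settingsBody(100_000));
        // To send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Keeping the request construction separate from sending makes it easy to inspect the URL and body before touching a real cluster.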
A blog post for reference: "The pitfall of searching more than 10000 records in elasticsearch". See also: "Two ways to set max_result_window".
scroll
The scroll API can be used to retrieve large numbers of results, or even all results, in much the same way as a cursor in a traditional database.
Chinese translation for reference: https://blog.csdn.net/ctwy291314/article/details/82751898
The code below fetches the nid field of every document in ES and writes it to a file. It was written as a unit test, and NID is an inner class.
The code:
// Imports assumed by this fragment (it lives inside a test class):
//   org.elasticsearch.action.search.* (SearchRequest, SearchResponse, SearchScrollRequest,
//       ClearScrollRequest, ClearScrollResponse)
//   org.elasticsearch.client.RequestOptions, org.elasticsearch.client.RestHighLevelClient
//   org.elasticsearch.search.Scroll, org.elasticsearch.search.SearchHit
//   org.elasticsearch.search.builder.SearchSourceBuilder
//   static org.elasticsearch.index.query.QueryBuilders.matchAllQuery
//   TimeValue (package depends on the client version), JSON (presumably fastjson),
//   java.io.*, org.junit.Test

public static class NID {
    private String nid;

    public String getNid() {
        return nid;
    }

    public void setNid(String nid) {
        this.nid = nid;
    }
}

@Test
public void testScroll() throws IOException {
    //RestHighLevelClient client = elasticClient.getRestHighLevelClient();
    RestHighLevelClient client = esConfig.client();

    // Initialise the scroll and set its keep-alive.
    // The keep-alive does not need to be long enough to process all the data, only long
    // enough to process the previous batch. Every scroll request (one that carries the
    // scroll parameter) sets a new expiry time.
    final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L));

    SearchRequest searchRequest = new SearchRequest(esConfig.getCaterIndex()); // search request for the index
    searchRequest.scroll(scroll);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(matchAllQuery());
    searchSourceBuilder.size(5000); // number of hits returned per batch
    searchSourceBuilder.fetchSource(new String[]{"nid"}, null); // included and excluded fields
    searchRequest.source(searchSourceBuilder);

    File outFile = new File("E://cater_nid.csv"); // CSV file to write
    int page = 0;
    String scrollId = null;
    // try-with-resources closes the writer even when a request fails. Exceptions are
    // rethrown rather than printed, so a failed search no longer leads to a
    // NullPointerException on a null response.
    try (BufferedWriter writer = new BufferedWriter(new FileWriter(outFile))) {
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        scrollId = searchResponse.getScrollId();
        SearchHit[] searchHits = searchResponse.getHits().getHits();

        // Walk through the hits until a batch comes back empty
        while (searchHits != null && searchHits.length > 0) {
            page++;
            System.out.println("----- page " + page + " -----");
            for (SearchHit searchHit : searchHits) {
                //System.out.println(searchHit.getSourceAsString());
                NID t = JSON.parseObject(searchHit.getSourceAsString(), NID.class);
                writer.write(t.getNid());
                writer.newLine();
            }

            SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
            scrollRequest.scroll(scroll);
            searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
            scrollId = searchResponse.getScrollId();
            searchHits = searchResponse.getHits().getHits();
        }
    } finally {
        // Clear the scroll context
        if (scrollId != null) {
            ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
            clearScrollRequest.addScrollId(scrollId); // setScrollIds() can clear several scroll ids together
            ClearScrollResponse clearScrollResponse =
                    client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
            System.out.println("succeeded:" + clearScrollResponse.isSucceeded());
        }
    }
}
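The shape of a scroll loop (fetch a batch, process it, ask for the next one, stop when a batch comes back empty) can be tried out without a cluster. The sketch below uses an in-memory list in place of Elasticsearch; BatchSource, inMemory and drain are hypothetical names invented for this illustration and are not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.List;

class ScrollLoopSketch {
    /** Hypothetical stand-in for the scroll calls: returns the next batch, empty when exhausted. */
    interface BatchSource<T> {
        List<T> nextBatch();
    }

    /** In-memory source that hands out a list in fixed-size batches, like successive scroll pages. */
    static <T> BatchSource<T> inMemory(List<T> all, int batchSize) {
        int[] position = {0};
        return () -> {
            int start = position[0];
            int end = Math.min(all.size(), start + batchSize);
            position[0] = end;
            return new ArrayList<>(all.subList(start, end));
        };
    }

    /** Drains every batch through a single loop body and returns the number of pages seen. */
    static <T> int drain(BatchSource<T> source, List<T> sink) {
        int pages = 0;
        for (List<T> batch = source.nextBatch(); !batch.isEmpty(); batch = source.nextBatch()) {
            pages++;
            sink.addAll(batch);
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> nids = List.of("a1", "a2", "a3", "a4", "a5");
        List<String> written = new ArrayList<>();
        int pages = drain(inMemory(nids, 2), written);
        System.out.println(pages + " pages, " + written.size() + " ids"); // 3 pages, 5 ids
    }
}
```

Writing the loop so that the first batch and every later batch go through the same body means the per-hit processing exists in one place only.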
Code reference: http://www.javashuo.com/article/p-wjsxfrea-mc.html