Scroll with RestHighLevelClient

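By default, Elasticsearch serves at most 10,000 results per search: from + size may not exceed 10,000. A request that pages past that window (for example, one with from set above 10,000) returns no hits and fails with an error saying the result window is too large.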
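The limit can be pictured as a plain arithmetic check. The sketch below is only an illustration: ResultWindow and fitsWindow are hypothetical names, not part of any Elasticsearch client API, and the 10,000 default is the value of index.max_result_window.

```java
// Hypothetical helper that mirrors the from + size check; it does not call Elasticsearch.
final class ResultWindow {
    // Default value of index.max_result_window.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // A page can be served only if its last hit still falls inside the window.
    static boolean fitsWindow(int from, int size, int maxResultWindow) {
        // Widen to long so a very large from cannot overflow the sum.
        return (long) from + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        System.out.println(fitsWindow(9_990, 10, DEFAULT_MAX_RESULT_WINDOW));  // true
        System.out.println(fitsWindow(10_000, 10, DEFAULT_MAX_RESULT_WINDOW)); // false
    }
}
```

The last page that fits is the one ending exactly at hit 10,000; asking for anything beyond it is what triggers the error.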

There are two ways to work around this.

Option 1: raise max_result_window

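Many people use this approach because it is quick and blunt. The bluntness is also its weakness: it is fine in some situations, but a search takes heap memory and time proportional to from + size, so it may not hold up in others.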

PUT index/_settings
{
  "index":{
    "max_result_window":100000000
  }
}
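For readers who would rather issue the same settings update from Java, here is a minimal sketch using the JDK's built-in java.net.http classes (Java 11+). It only builds the request and does not send it. The class name MaxResultWindowRequest and the base URL http://localhost:9200 are assumptions for illustration; "index" is the same placeholder index name used above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; not part of the Elasticsearch client libraries.
final class MaxResultWindowRequest {
    // Builds PUT <baseUrl>/<index>/_settings with the max_result_window body.
    static HttpRequest build(String baseUrl, String index, long maxResultWindow) {
        String body = "{\"index\":{\"max_result_window\":" + maxResultWindow + "}}";
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Assumed local node address; replace with the real cluster URL.
        HttpRequest request = build("http://localhost:9200", "index", 100_000_000L);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually apply the setting, the request could be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a reachable cluster.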

A blog post worth consulting: "The pitfall when an Elasticsearch search returns more than 10,000 records: two ways to set max_result_window".


Option 2: Scroll

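The scroll API can be used to retrieve a large number of results, or even all of them, from a single search request, much like a cursor in a traditional database.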
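The cursor analogy can be sketched without a cluster. The class below is a hypothetical in-memory stand-in (FakeScroll is not an Elasticsearch type): it hands out fixed-size batches and remembers where the next one starts, and the caller keeps asking until a batch comes back empty, which is the same loop shape a scroll client uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// In-memory stand-in for a scroll context; it does not talk to Elasticsearch.
final class FakeScroll {
    private final List<String> docs;
    private final int batchSize;
    private int position = 0; // the "cursor": index where the next batch starts

    FakeScroll(List<String> docs, int batchSize) {
        this.docs = docs;
        this.batchSize = batchSize;
    }

    // Returns the next batch and advances the cursor; an empty list means nothing is left.
    List<String> nextBatch() {
        if (position >= docs.size()) {
            return Collections.emptyList();
        }
        int end = Math.min(position + batchSize, docs.size());
        List<String> batch = new ArrayList<>(docs.subList(position, end));
        position = end;
        return batch;
    }

    // Drains the scroll batch by batch, stopping at the first empty batch.
    static List<String> drain(FakeScroll scroll) {
        List<String> all = new ArrayList<>();
        List<String> batch = scroll.nextBatch();
        while (!batch.isEmpty()) {
            all.addAll(batch);
            batch = scroll.nextBatch();
        }
        return all;
    }

    public static void main(String[] args) {
        FakeScroll scroll = new FakeScroll(Arrays.asList("a", "b", "c", "d", "e"), 2);
        System.out.println(drain(scroll)); // [a, b, c, d, e]
    }
}
```

With five documents and a batch size of 2, the caller sees three non-empty batches followed by one empty batch that ends the loop.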

Official documentation for this approach: https://www.elastic.co/guide/en/elasticsearch/reference/7.2/search-request-scroll.html#scroll-search-context

Chinese translation: https://blog.csdn.net/ctwy291314/article/details/82751898

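The code below fetches the nid field of every document in an ES index and writes the values to a file. It was written as a unit test; NID is an inner class.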

The code:

// Imports for the enclosing test class. Package names follow the 7.2 client linked above.
import static org.elasticsearch.index.query.QueryBuilders.matchAllQuery;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import com.alibaba.fastjson.JSON; // assumption: JSON.parseObject comes from fastjson
import org.elasticsearch.action.search.ClearScrollRequest;
import org.elasticsearch.action.search.ClearScrollResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchScrollRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.search.Scroll;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.Test; // assumption: JUnit 4

// Holder for the single field that is fetched from each document
public static class NID {
    private String nid;
    public String getNid() {
        return nid;
    }
    public void setNid(String nid) {
        this.nid = nid;
    }
}

@Test
public void testScroll() throws IOException {
    RestHighLevelClient client = esConfig.client();

    // Keep-alive for the scroll context. It does not need to be long enough to process
    // all of the data, only long enough to process the previous batch. Every scroll
    // request (one that carries the scroll parameter) sets a new expiry time.
    final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L));

    SearchRequest searchRequest = new SearchRequest(esConfig.getCaterIndex()); // search request for the index
    searchRequest.scroll(scroll);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(matchAllQuery());
    searchSourceBuilder.size(5000); // number of hits returned per batch
    searchSourceBuilder.fetchSource(new String[]{"nid"}, null); // fields to include, fields to exclude
    searchRequest.source(searchSourceBuilder);

    File outFile = new File("E://cater_nid.csv"); // output CSV file
    String scrollId = null;
    // try-with-resources closes the writer even when a request fails
    try (BufferedWriter writer = new BufferedWriter(new FileWriter(outFile))) {
        // The initial search returns the first batch together with a scroll id
        SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
        scrollId = searchResponse.getScrollId();
        SearchHit[] searchHits = searchResponse.getHits().getHits();

        int page = 0;
        // Walk through the hits until a batch comes back empty
        while (searchHits != null && searchHits.length > 0) {
            page++;
            System.out.println("----- page " + page + " -----");
            for (SearchHit searchHit : searchHits) {
                NID t = JSON.parseObject(searchHit.getSourceAsString(), NID.class);
                writer.write(t.getNid());
                writer.newLine();
            }

            // Fetch the next batch; always use the most recently returned scroll id
            SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
            scrollRequest.scroll(scroll);
            searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
            scrollId = searchResponse.getScrollId();
            searchHits = searchResponse.getHits().getHits();
        }
    } finally {
        // Clear the scroll context once we are done with it
        if (scrollId != null) {
            ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
            clearScrollRequest.addScrollId(scrollId); // setScrollIds() can clear several scroll ids at once
            ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
            System.out.println("succeeded:" + clearScrollResponse.isSucceeded());
        }
    }
}

Code adapted from: http://www.javashuo.com/article/p-wjsxfrea-mc.html






