A Detailed Look at the "Lazy" Evaluation of Streams in Java 8

Date: 2019-11-09


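Before getting to the main topic, we first need to introduce two very important kinds of operations on the Java 8 Stream type:

Intermediate and Terminal Operations

The Stream type has two kinds of methods:

Intermediate operations
Terminal operations

The official documentation describes them as follows [feel free to skip ahead if you would rather not read the original]: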

Stream operations are divided into intermediate and terminal operations, and are combined to form stream pipelines. A stream pipeline consists of a source (such as a Collection, an array, a generator function, or an I/O channel); followed by zero or more intermediate operations such as Stream.filter or Stream.map; and a terminal operation such as Stream.forEach or Stream.reduce.

Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.

Terminal operations, such as Stream.forEach or IntStream.sum, may traverse the stream to produce a result or a side-effect. After the terminal operation is performed, the stream pipeline is considered consumed, and can no longer be used; if you need to traverse the same data source again, you must return to the data source to get a new stream. In almost all cases, terminal operations are eager, completing their traversal of the data source and processing of the pipeline before returning. Only the terminal operations iterator() and spliterator() are not; these are provided as an "escape hatch" to enable arbitrary client-controlled pipeline traversals in the event that the existing operations are not sufficient to the task.

Processing streams lazily allows for significant efficiencies; in a pipeline such as the filter-map-sum example above, filtering, mapping, and summing can be fused into a single pass on the data, with minimal intermediate state. Laziness also allows avoiding examining all the data when it is not necessary; for operations such as "find the first string longer than 1000 characters", it is only necessary to examine just enough strings to find one that has the desired characteristics without examining all of the strings available from the source. (This behavior becomes even more important when the input stream is infinite and not merely large.)

To be honest, after reading the official documentation I was thoroughly confused, so let me walk through what these paragraphs actually say:

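Paragraph 1: Stream operations are divided into intermediate operations and terminal operations. These two kinds of operations, together with a data source, make up a so-called pipeline.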
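As a small illustration of the three parts of a pipeline, here is a minimal sketch. The class name PipelineAnatomy and the helper totalLengthOfJNames are made up for this example; only the Stream calls themselves come from the JDK.

```java
import java.util.Arrays;
import java.util.List;

class PipelineAnatomy {
    // Hypothetical helper: sums the lengths of the names that start with "J".
    static int totalLengthOfJNames(List<String> names) {
        return names.stream()                          // source: a Collection
                .filter(name -> name.startsWith("J"))  // intermediate: returns a new Stream
                .mapToInt(String::length)              // intermediate: returns an IntStream
                .sum();                                // terminal: produces the result
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Jack", "Kim", "Joe");
        // "Jack" (4) + "Joe" (3)
        System.out.println(totalLengthOfJNames(names));
    }
}
```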

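Paragraph 2: An intermediate operation returns a new stream and is lazy (exactly how lazy, we'll see later). The documentation uses filter as an example: executing filter does not actually perform any filtering. Instead it creates a new stream which, when traversed, contains the elements of the initial stream that match the predicate. (This seems odd at first: isn't that obviously a filtering operation? The point is that at this stage the filtering has only been described, not carried out.) Note that traversal of the source, and with it the intermediate operations, does not begin until the terminal operation of the pipeline is executed (more on this below).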
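The claim that filter "does not actually perform any filtering" can be checked by counting predicate calls. The following is a sketch under my own naming (LazyFilterDemo and countPredicateCalls are hypothetical): the counter is read once after the pipeline is built and once after a terminal operation has run.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyFilterDemo {
    // Returns {predicate calls before the terminal operation, predicate calls after it}.
    static int[] countPredicateCalls() {
        AtomicInteger calls = new AtomicInteger();
        List<String> names = Arrays.asList("Brad", "Kim", "Joe");

        // Building the pipeline only records what should happen later.
        Stream<String> filtered = names.stream()
                .filter(name -> {
                    calls.incrementAndGet();
                    return name.length() == 3;
                });
        int before = calls.get();

        // collect() is a terminal operation: only now is the source traversed.
        filtered.collect(Collectors.toList());
        int after = calls.get();

        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] counts = countPredicateCalls();
        System.out.println("before terminal operation: " + counts[0]);
        System.out.println("after terminal operation:  " + counts[1]);
    }
}
```

The first count is zero, and the second equals the number of names, because collect has to look at every element.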

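Paragraph 3: Terminal operations are eager. When execution reaches the terminal operation, the pipeline starts traversing the data source and running the intermediate operations, without waiting for anything else. Once the terminal operation has completed, the pipeline has done its job. If you need another terminal operation, the old pipeline can no longer be used: you have to go back to the data source, get a new stream, and build the pipeline all over again. There is one sentence here that I really couldn't make sense of:

"Only the terminal operations iterator() and spliterator() are not; these are provided as an "escape hatch" to enable arbitrary client-controlled pipeline traversals in the event that the existing operations are not sufficient to the task."

If anyone can explain it, I would be very grateful!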
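One possible reading of that sentence, offered as a sketch rather than an authoritative explanation: iterator() is a terminal operation that does not traverse the source when it is called. It hands the caller an Iterator, and elements flow through the pipeline only as the caller pulls them. The class and method names below (EscapeHatchDemo, mappedCounts, secondTerminalOperationFails) are made up for this illustration. The second method shows the "can no longer be used" rule from the same paragraph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class EscapeHatchDemo {
    // Returns how many elements had been mapped
    // {right after iterator() was called, after a single next()}.
    static int[] mappedCounts(List<String> names) {
        List<String> mapped = new ArrayList<>();
        Iterator<String> it = names.stream()
                .map(name -> { mapped.add(name); return name.toUpperCase(); })
                .iterator();               // terminal operation, but no traversal yet
        int afterIterator = mapped.size();

        it.next();                         // the caller decides when to pull an element
        int afterFirstNext = mapped.size();

        return new int[] {afterIterator, afterFirstNext};
    }

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondTerminalOperationFails(List<String> names) {
        Stream<String> stream = names.stream();
        stream.count();                    // the first terminal operation consumes the stream
        try {
            stream.count();                // the stream has already been operated upon
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack");
        int[] counts = mappedCounts(names);
        System.out.println("mapped after iterator(): " + counts[0]);
        System.out.println("mapped after one next(): " + counts[1]);
        System.out.println("stream reuse rejected:   " + secondTerminalOperationFails(names));
    }
}
```

In other words, every other terminal operation drives the traversal itself, while iterator() and spliterator() let client code drive it.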

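Paragraph 4: This one praises the benefit of executing streams lazily: efficiency. The operations can be fused into a single pass over the data with minimal intermediate state, and laziness also lets us avoid unnecessary work. The example given is finding the first string longer than 1000 characters in a pile of strings: thanks to laziness we don't have to examine every element, only enough of them to find one that satisfies the condition (though you could do this without a stream pipeline too, couldn't you?). Where laziness really becomes irreplaceable is when the data source is infinite, because a collection in classic Java is always finite. That isn't today's topic, so I won't expand on it here.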
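To make the infinite-source remark concrete, here is a minimal sketch. InfiniteStreamDemo and firstPowerOfTwoAbove are hypothetical names, and the sketch assumes the limit is small enough that doubling does not overflow an int. Stream.iterate produces an endless sequence, and findFirst stops pulling elements as soon as one passes the filter.

```java
import java.util.stream.Stream;

class InfiniteStreamDemo {
    // Assumes limit is well below 2^30; with a larger limit the doubling would overflow
    // and the pipeline would never terminate.
    static int firstPowerOfTwoAbove(int limit) {
        return Stream.iterate(1, n -> n * 2)   // infinite source: 1, 2, 4, 8, ...
                .filter(n -> n > limit)        // lazy: applied one element at a time
                .findFirst()                   // short-circuits on the first match
                .get();                        // safe here: a match always exists under the assumption
    }

    public static void main(String[] args) {
        System.out.println(firstPowerOfTwoAbove(1000)); // 1024
    }
}
```

An eager implementation would have to materialize the whole source before filtering, which is impossible for an infinite one.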

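The original documentation goes on to explain that some intermediate operations are stateful and others are stateless, and that they differ in how they traverse the source data. Interested readers can look into that on their own. Today we'll focus on how intermediate operations are "lazy" and what that lazy process looks like.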
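For readers who want a quick taste of that difference, the following sketch (with made-up names StatefulVsStateless, seenBeforeSorted and seenBeforeFilter) records which source elements are pulled before findFirst returns. A stateful operation such as sorted() has to see every element before it can emit the first one, while a stateless operation such as filter() can pass elements along one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class StatefulVsStateless {
    // sorted() is stateful: it needs all elements before it can produce the smallest one.
    static List<Integer> seenBeforeSorted() {
        List<Integer> seen = new ArrayList<>();
        Stream.of(3, 1, 2)
                .peek(seen::add)     // records each element pulled from the source
                .sorted()
                .findFirst();
        return seen;                 // [3, 1, 2]
    }

    // filter() is stateless: the first matching element ends the traversal.
    static List<Integer> seenBeforeFilter() {
        List<Integer> seen = new ArrayList<>();
        Stream.of(3, 1, 2)
                .peek(seen::add)
                .filter(n -> n > 0)
                .findFirst();
        return seen;                 // [3]
    }

    public static void main(String[] args) {
        System.out.println("with sorted(): " + seenBeforeSorted());
        System.out.println("with filter(): " + seenBeforeFilter());
    }
}
```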

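The secret behind Stream's "laziness" is that every use of a Stream chains several intermediate operations together and ends with a terminal operation. Methods such as map() and filter() are intermediate operations: calling them immediately returns another Stream object. Methods such as reduce() and findFirst() are terminal operations: only when they are called is the real work performed to obtain the required value.

Let's start from an example:

Suppose we need to print the first name of length 3, in upper case: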

import java.util.Arrays;
import java.util.List;

public class LazyStreams {
    // Logs every call so we can see when the filter predicate actually runs.
    private static int length(final String name) {
        System.out.println("getting length for " + name);
        return name.length();
    }

    // Logs every call so we can see when the mapping function actually runs.
    private static String toUpper(final String name) {
        System.out.println("converting to uppercase: " + name);
        return name.toUpperCase();
    }

    public static void main(final String[] args) {
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack", "Joe", "Mike", "Susan", "George", "Robert", "Julia", "Parker", "Benson");

        final String firstNameWith3Letters = names.stream()
            .filter(name -> length(name) == 3)
            .map(name -> toUpper(name))
            .findFirst()
            .get(); // safe here because the list contains a 3-letter name

        System.out.println(firstNameWith3Letters);
    }
}

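You might expect the code above to do a lot of work on the names collection: first traverse it once to collect every name of length 3, then traverse the collection produced by filter to convert each name to upper case, and finally pick the first element of the upper-cased collection and return it. That is how classic, eager Java code would handle it: filter everything, then map everything, then take the first result.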

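For Stream operations, a better way to read the code is from right to left, or from bottom to top: each operation does only as much work as is needed. Read from an eager point of view, the code above might take 15 steps, presumably 12 calls to length (one per name), 2 calls to toUpper (for Kim and Joe), and 1 call to findFirst.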
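One way to arrive at the figure of 15 is to write the eager version out by hand and count. This is my own reconstruction of the count, not code from the original article. EagerEquivalent and countEagerSteps are hypothetical names, and a "step" is assumed to mean one length check, one upper-case conversion, or the final pick of the first element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EagerEquivalent {
    // Counts the steps an eager, collection-at-a-time implementation would take.
    static int countEagerSteps(List<String> names) {
        int steps = 0;

        // Pass 1: filter the whole list.
        List<String> threeLetterNames = new ArrayList<>();
        for (String name : names) {
            steps++;                               // one length check per name
            if (name.length() == 3) {
                threeLetterNames.add(name);
            }
        }

        // Pass 2: map the whole filtered list.
        List<String> upperCased = new ArrayList<>();
        for (String name : threeLetterNames) {
            steps++;                               // one conversion per remaining name
            upperCased.add(name.toUpperCase());
        }

        // Final step: pick the first element, if there is one.
        if (!upperCased.isEmpty()) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack", "Joe", "Mike",
                "Susan", "George", "Robert", "Julia", "Parker", "Benson");
        System.out.println(countEagerSteps(names)); // 12 + 2 + 1 = 15
    }
}
```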

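But that's not what actually happens. Don't forget that Stream is very "lazy" and does no unnecessary work. In fact, filter and map are only triggered when findFirst is called. And filter doesn't filter the whole collection in one go: it tests elements one by one, and as soon as an element passes, that element is handed to the next intermediate operation, map.

The console output shows what really happens: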

getting length for Brad
getting length for Kate
getting length for Kim
converting to uppercase: Kim
KIM

To understand this process better, let's replace the lambda expressions with the classic Java form, anonymous inner classes:

// Requires java.util.function.Predicate and java.util.function.Function.
final String firstNameWith3Letters = names.stream()
            .filter(new Predicate<String>() {
                @Override
                public boolean test(String name) {
                    return length(name) == 3;
                }
            })
            .map(new Function<String, String>() {
                @Override
                public String apply(String name) {
                    return toUpper(name);
                }
            })
            .findFirst()
            .get();

[Figure: execution flow of the pipeline above]

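This confirms the earlier conclusion: filter and map are only triggered when findFirst is called. filter doesn't process the whole collection at once. It tests elements one at a time and passes each matching element on to the next intermediate operation, map.

Once the terminal operation has the answer it needs, the whole computation stops. If it doesn't have an answer yet, it asks the intermediate operations to process more elements, until either the answer is found or the whole collection has been processed.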
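The "no answer" case can be observed with a list that contains no 3-letter name. In this sketch (NoMatchDemo and its counters are hypothetical names), findFirst keeps asking for elements until the list is exhausted and then returns an empty Optional, which is why calling get() directly would be unsafe here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

class NoMatchDemo {
    static final AtomicInteger lengthCalls = new AtomicInteger();
    static final AtomicInteger upperCalls = new AtomicInteger();

    // Same pipeline as in the article, but returning the Optional instead of calling get().
    static Optional<String> firstThreeLetterName(List<String> names) {
        lengthCalls.set(0);
        upperCalls.set(0);
        return names.stream()
                .filter(name -> { lengthCalls.incrementAndGet(); return name.length() == 3; })
                .map(name -> { upperCalls.incrementAndGet(); return name.toUpperCase(); })
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Brad", "Kate", "Jack");
        Optional<String> result = firstThreeLetterName(names);
        System.out.println("result:       " + result.orElse("(none)"));
        System.out.println("length calls: " + lengthCalls.get()); // every element was tested once
        System.out.println("upper calls:  " + upperCalls.get());  // map never ran
    }
}
```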

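The JDK merges the intermediate operations into a single pass; this is known as fusing the operations. So even in the worst case (no element in the collection matches), the collection is traversed only once, rather than several times as we might have imagined. Perhaps this is what the official documentation means by "Processing streams lazily allows for significant efficiencies".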
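Conceptually, the fused pipeline behaves like a single hand-written loop. The sketch below is only a mental model, not how the JDK is implemented internally, and FusedLoopSketch and its method names are made up. It produces the same result as the stream version in one pass.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FusedLoopSketch {
    // Hand-written equivalent: filter, map and findFirst collapsed into one loop.
    static Optional<String> firstThreeLetterNameLoop(List<String> names) {
        for (String name : names) {
            if (name.length() == 3) {                    // the filter step
                return Optional.of(name.toUpperCase());  // the map step, then stop (findFirst)
            }
        }
        return Optional.empty();                         // worst case: exactly one full traversal
    }

    // The stream version from the article, for comparison.
    static Optional<String> firstThreeLetterNameStream(List<String> names) {
        return names.stream()
                .filter(name -> name.length() == 3)
                .map(String::toUpperCase)
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Brad", "Kate", "Kim", "Jack", "Joe");
        System.out.println(firstThreeLetterNameLoop(names));
        System.out.println(firstThreeLetterNameStream(names));
    }
}
```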

To see what happens under the hood, we can split the Stream operations above by type:

Stream<String> namesWith3Letters = names.stream()
    .filter(name -> length(name) == 3)
    .map(name -> toUpper(name));

System.out.println("Stream created, filtered, mapped...");
System.out.println("ready to call findFirst...");

final String firstNameWith3Letters = namesWith3Letters.findFirst().get();

System.out.println(firstNameWith3Letters);

// Output:
// Stream created, filtered, mapped...
// ready to call findFirst...
// getting length for Brad
// getting length for Kate
// getting length for Kim
// converting to uppercase: Kim
// KIM