Java 8 Stream Summary

Date: 2020-04-18

Introduction to Stream

What is a Stream

Classes to support functional-style operations on streams of elements, such as map-reduce transformations on collections.

Stream is a new feature in Java 8 that supports functional-style operations, such as map-reduce, on the elements of a stream.

Let's start with a piece of code:

int sum = widgets.stream()
                 .filter(b -> b.getColor() == RED)
                 .mapToInt(b -> b.getWeight())
                 .sum();

This Java code reads much like a SQL query over a collection:

select sum(weight) from widgets where color='RED';
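
The snippet above leaves out the definition of widgets. A minimal runnable sketch follows; the Widget class and the Color enum are assumptions made up for illustration, not types from the original article.

```java
import java.util.Arrays;
import java.util.List;

public class WidgetSum {
    // Hypothetical types for illustration; the original snippet does not define them.
    public enum Color { RED, BLUE }

    public static final class Widget {
        private final Color color;
        private final int weight;

        public Widget(Color color, int weight) {
            this.color = color;
            this.weight = weight;
        }

        public Color getColor() { return color; }
        public int getWeight() { return weight; }
    }

    // Sums the weights of all red widgets, like the SQL query above.
    public static int sumOfRedWeights(List<Widget> widgets) {
        return widgets.stream()
                .filter(w -> w.getColor() == Color.RED)
                .mapToInt(Widget::getWeight)
                .sum();
    }

    public static void main(String[] args) {
        List<Widget> widgets = Arrays.asList(
                new Widget(Color.RED, 3),
                new Widget(Color.BLUE, 5),
                new Widget(Color.RED, 4));
        System.out.println(sumOfRedWeights(widgets)); // prints 7
    }
}
```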

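Stream types

The java.util.stream package provides the following four types of Stream:

Stream: the stream for object types
IntStream: the stream for the primitive type int
LongStream: the stream for the primitive type long
DoubleStream: the stream for the primitive type double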

How to obtain a Stream

Collection to Stream

Implementations of the Collection interface, such as List and Set, return a Stream object through the Collection.stream() or Collection.parallelStream() method:

List<String> stringList = ...;
Stream<String> stream = stringList.stream();

Array to Stream

An array can be converted to a Stream with the static method Arrays.stream(T[] array) or Stream.of(T... values):

String[] stringArray = ...;
Stream<String> stringStream1 = Arrays.stream(stringArray); // option 1
Stream<String> stringStream2 = Stream.of(stringArray); // option 2

Primitive arrays can be converted to IntStream, LongStream, or DoubleStream in a similar way:

int[] intArray = {1, 2, 3};
IntStream intStream1 = Arrays.stream(intArray);
IntStream intStream2 = IntStream.of(intArray);

In addition, static methods such as Stream.of(T... values) and IntStream.of(int... values) accept varargs (variable-length arguments), so a Stream can be created directly from values:

IntStream intStream = IntStream.of(1, 2, 3);

Map to Stream

Map does not implement Collection, so it has no stream() or parallelStream() method. Instead, Map.entrySet(), Map.keySet(), and Map.values() each return a Collection that can be streamed:

Map<Integer, String> map = ...;
Stream<Map.Entry<Integer, String>> stream = map.entrySet().stream();
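
As a hedged illustration of streaming over a Map's entry set, the sketch below filters entries by value and collects the matching keys. The class name, method name, and sample data are invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MapStreamExample {
    // Returns the keys whose values start with the given prefix, in the map's iteration order.
    public static List<Integer> keysWithValuePrefix(Map<Integer, String> map, String prefix) {
        return map.entrySet().stream()
                .filter(e -> e.getValue().startsWith(prefix))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<Integer, String> map = new LinkedHashMap<>();
        map.put(1, "apple");
        map.put(2, "banana");
        map.put(3, "avocado");
        System.out.println(keysWithValuePrefix(map, "a")); // prints [1, 3]
    }
}
```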

Other sources

A String can be split by character into an IntStream:

String s = "Hello World";
IntStream stringStream = s.chars(); // an IntStream of the string's char values, each widened to int

A BufferedReader can produce a Stream<String> with one element per line:

BufferedReader bufferedReader = ...;
Stream<String> lineStream = bufferedReader.lines();

IntStream and LongStream provide the static method range to generate a stream of consecutive values:

IntStream intStream = IntStream.range(1, 5); // 1, 2, 3, 4 (5 is excluded)

Stream methods

Intermediate operations and terminal operations

Stream operations are divided into intermediate and terminal operations, and are combined to form stream pipelines. A stream pipeline consists of a source (such as a Collection, an array, a generator function, or an I/O channel); followed by zero or more intermediate operations such as Stream.filter or Stream.map; and a terminal operation such as Stream.forEach or Stream.reduce.

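Stream operations are divided into intermediate operations and terminal operations, which combine to form a stream pipeline. A stream pipeline consists of a source, followed by zero or more intermediate operations (for example Stream.filter or Stream.map), and ends with a terminal operation (for example Stream.forEach or Stream.reduce).

An intermediate operation processes and transforms the elements of a Stream; a terminal operation produces the final result.

The example at the beginning of this article contains the following intermediate and terminal operations: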

int sum = widgets.stream()
                 .filter(b -> b.getColor() == RED) // intermediate operation
                 .mapToInt(b -> b.getWeight()) // intermediate operation
                 .sum(); // terminal operation

intermediate operation

Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.

An intermediate operation returns a new Stream, which is why calls can be chained.

Intermediate operations have another important property: they are lazy.

IntStream.of(0, 1, 2, 3).filter(i -> {
    System.out.println(i);
    return i > 1;
});

This code does not print 0 1 2 3. In fact it prints nothing at all, because filter never runs. filter is an intermediate operation, so to make it execute you must add a terminal operation:

IntStream.of(0, 1, 2, 3).filter(i -> {
    System.out.println(i);
    return i > 1;
}).sum();
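
To make the laziness observable without relying on console output, the following sketch records which elements the predicate sees, with and without a terminal operation. The class and method names are made up for this illustration, and the stream is sequential.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class LazyFilterDemo {
    // Builds the pipeline but never runs a terminal operation,
    // then reports how many elements the predicate has seen.
    public static int seenWithoutTerminal() {
        List<Integer> seen = new ArrayList<>();
        IntStream pipeline = IntStream.of(0, 1, 2, 3).filter(i -> {
            seen.add(i);
            return i > 1;
        });
        return seen.size(); // the predicate has not been invoked
    }

    // Same pipeline, but sum() is a terminal operation and triggers the traversal.
    public static List<Integer> seenWithTerminal() {
        List<Integer> seen = new ArrayList<>();
        IntStream.of(0, 1, 2, 3).filter(i -> {
            seen.add(i);
            return i > 1;
        }).sum();
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(seenWithoutTerminal()); // prints 0
        System.out.println(seenWithTerminal());    // prints [0, 1, 2, 3]
    }
}
```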

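Common intermediate operations

filter: filters by a condition, similar to a WHERE clause in SQL
limit(long n): truncates the Stream to its first n elements and returns a new Stream, similar to LIMIT n in MySQL
skip(long n): skips the first n elements; combined with limit as stream.skip(offset).limit(count), it is equivalent to LIMIT offset,count in MySQL
sorted: sorts the elements, similar to ORDER BY in SQL
distinct: removes duplicate elements from the Stream, using the equals method to detect duplicates, similar to DISTINCT in SQL
boxed: converts IntStream, LongStream, DoubleStream to Stream<Integer>, Stream<Long>, Stream<Double>
peek: similar to forEach; the difference is that forEach is a terminal operation while peek is an intermediate operation
map, mapToInt, mapToLong, mapToDouble, mapToObj: these methods take a function as an argument and apply it to every element of the Stream, producing a new Stream of the results. The Xxx in mapToXxx is the element type after conversion, that is, the return type of the function passed in; for example, mapToInt converts every element of the original Stream to an int and returns an IntStream
flatMap, flatMapToInt, flatMapToLong, flatMapToDouble: similar to map and mapToXxx, except that the function converts each element into a Stream containing zero or more elements, and the elements of all those Streams are combined into one new Stream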
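
Several of the operations listed above can be chained in one pipeline. The sketch below is only an illustration with invented names and data: it filters, de-duplicates, sorts, and then pages the result with skip and limit, and separately shows boxed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class IntermediateOpsDemo {
    // Keeps positive values, removes duplicates, sorts ascending, then returns
    // one "page" of results, comparable to LIMIT offset,count in MySQL.
    public static List<Integer> page(List<Integer> values, long offset, long count) {
        return values.stream()
                .filter(v -> v > 0)
                .distinct()
                .sorted()
                .skip(offset)
                .limit(count)
                .collect(Collectors.toList());
    }

    // boxed() turns an IntStream into a Stream<Integer> so it can be collected into a List.
    public static List<Integer> boxedRange(int startInclusive, int endExclusive) {
        return IntStream.range(startInclusive, endExclusive)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(5, 3, -1, 3, 8, 1, 5, 9);
        System.out.println(page(values, 1, 3)); // prints [3, 5, 8]
        System.out.println(boxedRange(1, 4));   // prints [1, 2, 3]
    }
}
```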

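The difference between map and flatMap

map is one-to-one, while flatMap is one-to-zero-or-more:

[Figure: map versus flatMap]

map example

Use mapToInt to get the length of each string in a collection of strings: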
```
Stream<String> stringStream = Stream.of("test1", "test23", "test4");
IntStream intStream = stringStream.mapToInt(String::length);
```
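The String.length function converts each String to an int, and the results form an IntStream. The elements of stringStream and intStream correspond one to one: each string maps to one length, so both Streams contain the same number of elements.

flatMap example

Use flatMapToInt to split every string in a collection into characters and combine them into a new Stream: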
```
Stream<String> stringStream = Stream.of("test1", "test23", "test4");
IntStream intStream = stringStream.flatMapToInt(String::chars);
```
Splitting a string into characters can yield zero or more characters, so the number of elements in intStream may differ from the number of elements in stringStream.
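
The difference in element counts can be checked directly. This is a small sketch with hypothetical method names; the empty string is included to show the "zero elements" case of flatMap.

```java
import java.util.Arrays;
import java.util.List;

public class MapVsFlatMapDemo {
    // map-style conversion: exactly one output element per input string.
    public static long mapCount(List<String> words) {
        return words.stream().mapToInt(String::length).count();
    }

    // flatMap-style conversion: each string contributes zero or more chars.
    public static long flatMapCount(List<String> words) {
        return words.stream().flatMapToInt(String::chars).count();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("test1", "test23", "test4", "");
        System.out.println(mapCount(words));     // prints 4
        System.out.println(flatMapCount(words)); // prints 16 (5 + 6 + 5 + 0)
    }
}
```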

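The following table lists the map-related methods and their conversion rules:

Stream	Method	Function type	Function parameter	Function return value	Resulting stream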
Stream<T>	map	Function	T	R	Stream<R>
Stream<T>	mapToInt	ToIntFunction	T	int	IntStream
Stream<T>	mapToLong	ToLongFunction	T	long	LongStream
Stream<T>	mapToDouble	ToDoubleFunction	T	double	DoubleStream
Stream<T>	flatMap	Function	T	Stream<R>	Stream<R>
Stream<T>	flatMapToInt	Function	T	IntStream	IntStream
Stream<T>	flatMapToLong	Function	T	LongStream	LongStream
Stream<T>	flatMapToDouble	Function	T	DoubleStream	DoubleStream
IntStream	map	IntUnaryOperator	int	int	IntStream
IntStream	mapToLong	IntToLongFunction	int	long	LongStream
IntStream	mapToDouble	IntToDoubleFunction	int	double	DoubleStream
IntStream	mapToObj	IntFunction	int	R	Stream<R>
IntStream	flatMap	IntFunction	int	IntStream	IntStream
LongStream	map	LongUnaryOperator	long	long	LongStream
LongStream	mapToInt	LongToIntFunction	long	int	IntStream
LongStream	mapToDouble	LongToDoubleFunction	long	double	DoubleStream
LongStream	mapToObj	LongFunction	long	R	Stream<R>
LongStream	flatMap	LongFunction	long	LongStream	LongStream
DoubleStream	map	DoubleUnaryOperator	double	double	DoubleStream
DoubleStream	mapToInt	DoubleToIntFunction	double	int	IntStream
DoubleStream	mapToLong	DoubleToLongFunction	double	long	LongStream
DoubleStream	mapToObj	DoubleFunction	double	R	Stream<R>
DoubleStream	flatMap	DoubleFunction	double	DoubleStream	DoubleStream

For example, calling stream.mapToInt(String::length) on a Stream<String> can be understood as passing String::length, a function that takes a String and returns an int, to mapToInt as its argument; the result is an IntStream.

terminal operation

Terminal operations, such as Stream.forEach or IntStream.sum, may traverse the stream to produce a result or a side-effect. After the terminal operation is performed, the stream pipeline is considered consumed, and can no longer be used; if you need to traverse the same data source again, you must return to the data source to get a new stream.

Once a terminal operation has run, the Stream can no longer be used. To traverse the data again, you must create a new Stream:

IntStream intStream = IntStream.of(1, 2, 3);
intStream.forEach(System.out::println); // first terminal operation (forEach): works normally
intStream.forEach(System.out::println); // second call throws IllegalStateException: stream has already been operated upon or closed
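
One common way to "return to the data source" is to wrap stream creation in a Supplier and ask it for a fresh stream each time. The sketch below is an illustration of that idea rather than the only approach; the class and method names are assumptions.

```java
import java.util.function.Supplier;
import java.util.stream.IntStream;

public class StreamReuseDemo {
    // Runs a second terminal operation on the same stream and reports
    // whether it failed with IllegalStateException.
    public static boolean secondUseFails() {
        IntStream stream = IntStream.of(1, 2, 3);
        stream.sum(); // first terminal operation consumes the stream
        try {
            stream.sum(); // second terminal operation on the consumed stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Each call to source.get() creates a new stream, so both sums succeed.
    public static int sumTwice(Supplier<IntStream> source) {
        return source.get().sum() + source.get().sum();
    }

    public static void main(String[] args) {
        System.out.println(secondUseFails());                     // prints true
        System.out.println(sumTwice(() -> IntStream.of(1, 2, 3))); // prints 12
    }
}
```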

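Common terminal operations

forEach: iterates over the Stream
toArray: converts the Stream to an array
max: returns the maximum value
min: returns the minimum value
sum: returns the sum (primitive streams only)
count: returns the number of elements in the Stream
average: returns the arithmetic mean (primitive streams only)
findFirst: returns the first element
findAny: returns some element of the stream
allMatch: whether all elements satisfy the condition
anyMatch: whether any element satisfies the condition
noneMatch: whether no element satisfies the condition
reduce: performs an aggregation; the sum, min, and max methods above are typically implemented on top of reduce
collect: performs aggregations more complex than reduce; the average method above is typically implemented on top of collect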
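
Some of these terminal operations return an Optional type because an empty stream has no maximum or average. The sketch below, with invented helper names, shows one way to handle that on an IntStream.

```java
import java.util.stream.IntStream;

public class TerminalOpsDemo {
    // max() returns an OptionalInt; orElse supplies a value for the empty case.
    public static int maxOrDefault(int[] values, int fallback) {
        return IntStream.of(values).max().orElse(fallback);
    }

    // average() returns an OptionalDouble for the same reason.
    public static double averageOrZero(int[] values) {
        return IntStream.of(values).average().orElse(0.0);
    }

    // allMatch returns true for an empty stream (no element violates the condition).
    public static boolean allPositive(int[] values) {
        return IntStream.of(values).allMatch(v -> v > 0);
    }

    public static void main(String[] args) {
        int[] values = {4, 9, 2};
        int[] empty = {};
        System.out.println(maxOrDefault(values, -1)); // prints 9
        System.out.println(maxOrDefault(empty, -1));  // prints -1
        System.out.println(averageOrZero(values));    // prints 5.0
        System.out.println(allPositive(empty));       // prints true
    }
}
```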

reduce

First, here is reduce used to implement sum:

IntStream intStream = IntStream.of(1, 2, 4, 5, 8);
int sum = intStream.reduce(0, Integer::sum);

Or:

IntStream intStream = IntStream.of(1, 2, 4, 5, 8);
int sum = intStream.reduce(0, (a, b) -> a + b);

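The reduce method in the examples above takes two arguments:

identity: the initial value, which is also returned as the default when the Stream has no elements
accumulator: a function with two parameters and a return value, such as the summing function Integer::sum or (a, b) -> a + b in the code above

The code above is equivalent to: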

int result = identity; // identity is 0 in the example above
for (int element : intArray) // intArray stands for the elements of the stream
    result = Integer.sum(result, element); // or: result = result + element;
return result;
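
The role of identity shows up most clearly on an empty stream. The following sketch (helper names are made up) contrasts the two-argument reduce with the one-argument overload, which has no identity and therefore returns an OptionalInt.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

public class ReduceDemo {
    // Two-argument reduce: the identity 0 is returned when there are no elements.
    public static int sum(int... values) {
        return IntStream.of(values).reduce(0, Integer::sum);
    }

    // One-argument reduce: without an identity the result may be absent.
    public static OptionalInt max(int... values) {
        return IntStream.of(values).reduce(Integer::max);
    }

    public static void main(String[] args) {
        System.out.println(sum(1, 2, 4, 5, 8));             // prints 20
        System.out.println(sum());                          // prints 0
        System.out.println(max(1, 2, 4, 5, 8).getAsInt());  // prints 8
        System.out.println(max().isPresent());              // prints false
    }
}
```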

collect

Consider concatenating the elements of a Stream<String> into a single string. With reduce it can be done like this:

Stream<String> stream = Stream.of("Hello", "World");
String result = stream.reduce("", String::concat); // or: String result = stream.reduce("", (a, b) -> a + b);

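When the Stream contains many elements, repeated string concatenation performs poorly, because each step copies the accumulated string. A StringBuilder is more efficient, and the collect method makes it possible to use one: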

Stream<String> stream = Stream.of("Hello", "World");
StringBuilder result = stream.collect(StringBuilder::new, StringBuilder::append, StringBuilder::append);

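The collect method in the example above takes three arguments:

supplier: a function that creates the container holding the aggregation result (the result container). In the example above, the first argument StringBuilder::new creates a new StringBuilder to hold the concatenated string
accumulator: a function that folds one element of the Stream into the result container. In the example above, the second argument StringBuilder::append appends a string from the Stream to the StringBuilder
combiner: a function that merges two result containers; it is generally used with parallel streams. In the example above, the third argument StringBuilder::append merges two StringBuilders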
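
To see where the combiner matters, the sketch below runs the same three-argument collect on a sequential and on a parallel stream. It assumes an ordered source such as a List, for which the partial StringBuilders are merged in encounter order; the class and method names are invented for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class CollectConcatDemo {
    // Concatenates the parts with the three-argument collect.
    // In a parallel run each thread fills its own StringBuilder (supplier + accumulator),
    // and the combiner appends one partial result to another.
    public static String concat(List<String> parts, boolean parallel) {
        Stream<String> stream = parallel ? parts.parallelStream() : parts.stream();
        return stream
                .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("Hello", "World", "!");
        System.out.println(concat(parts, false)); // prints HelloWorld!
        System.out.println(concat(parts, true));  // prints HelloWorld!
    }
}
```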

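Next, use collect to compute an average:

Computing an average requires two pieces of data: the count and the total. First create a result container that holds these two values and defines the relevant methods: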

public class Averager {
    private int total = 0;
    private int count = 0;

    public double average() {
        return count > 0 ? ((double) total) / count : 0;
    }

    public void accumulate(int i) {
        total += i;
        count++;
    }

    public void combine(Averager other) {
        total += other.total;
        count += other.count;
    }
}

Compute the average with collect:

IntStream intStream = IntStream.of(1, 2, 3, 4);
Averager averager = intStream.collect(Averager::new, Averager::accumulate, Averager::combine);
System.out.println(averager.average()); // 2.5

Collector

The Stream interface also has an overload of collect that takes a single argument: collect(Collector collector).

What a Collector is:

This class encapsulates the functions used as arguments in the collect operation that requires three arguments (supplier, accumulator, and combiner functions).

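A Collector is essentially a class that bundles the supplier, accumulator, and combiner functions, together with a finisher that converts the result container into the final result. This makes common aggregation algorithms abstract and reusable.

For example, concatenating the elements of a Stream<String> into a single string with a Collector: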

public class JoinCollector implements Collector<String, StringBuilder, String> {

    @Override
    public Supplier<StringBuilder> supplier() {
        return StringBuilder::new;
    }

    @Override
    public BiConsumer<StringBuilder, String> accumulator() {
        return StringBuilder::append;
    }

    @Override
    public BinaryOperator<StringBuilder> combiner() {
        return StringBuilder::append;
    }

    @Override
    public Function<StringBuilder, String> finisher() {
        return StringBuilder::toString;
    }

    @Override
    public Set<Characteristics> characteristics() {
        return Collections.emptySet();
    }
}

Alternatively, create a Collector object directly with the static method Collector.of():

Collector<String, StringBuilder, String> joinCollector = Collector.of(StringBuilder::new,
                StringBuilder::append,
                StringBuilder::append,
                StringBuilder::toString);
Stream<String> stream = Stream.of("Hello", "World");
String result = stream.collect(joinCollector);

An even simpler way is to use Collectors.joining():

Stream<String> stream = Stream.of("Hello", "World");
String result = stream.collect(Collectors.joining());
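
Collectors.joining also has overloads that take a delimiter and, optionally, a prefix and suffix. A brief sketch with made-up helper names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JoiningDemo {
    // joining() with no arguments concatenates the elements directly.
    public static String joinPlain(List<String> parts) {
        return parts.stream().collect(Collectors.joining());
    }

    // joining(delimiter, prefix, suffix) separates elements and wraps the result.
    public static String joinBracketed(List<String> parts) {
        return parts.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("Hello", "World");
        System.out.println(joinPlain(parts));     // prints HelloWorld
        System.out.println(joinBracketed(parts)); // prints [Hello, World]
    }
}
```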

Collectors

java.util.stream.Collectors provides many commonly used Collector implementations:

Collectors.toList(): converts a Stream to a List
Collectors.toSet(): converts a Stream to a Set
Collectors.joining(): concatenates the strings in a Stream
Collectors.groupingBy(): groups the elements of a Stream, similar to GROUP BY in SQL
Collectors.counting(): counts the elements of a Stream; stream.collect(Collectors.counting()) is equivalent to stream.count()
Collectors.averagingDouble(), Collectors.averagingInt(), Collectors.averagingLong(): compute an average

The list above covers only some of the methods in Collectors; see the documentation for the rest.

Here are some practical examples using Collectors:

Convert a Stream to a List:

Stream<String> stream = Stream.of("Hello", "World");
List<String> list = stream.collect(Collectors.toList());

Group students (Student) by age and return the list of students for each age:

Stream<Student> stream = ...;
Map<Integer, List<Student>> data = stream.collect(Collectors.groupingBy(Student::getAge));

Group students (Student) by age and return the number of students for each age, giving the same result as the SQL query select age, count(*) from student group by age:
```
Stream<Student> stream = ...;
Map<Integer, Long> data = stream.collect(Collectors.groupingBy(Student::getAge, Collectors.counting()));
```

Compute the average age of the students (Student):

Stream<Student> stream = ...;
Double data = stream.collect(Collectors.averagingInt(Student::getAge));
// or: double average = stream.mapToInt(Student::getAge).average().getAsDouble(); (throws NoSuchElementException if the stream is empty)
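
The Student examples above can be tried end to end with a small stand-in class. The Student type below is hypothetical, since the article does not show its definition, and the TreeMap map factory is an addition used here only so the grouped result prints in a predictable key order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class StudentStats {
    // Hypothetical Student type; only getAge() is needed by the examples.
    public static final class Student {
        private final String name;
        private final int age;

        public Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() { return name; }
        public int getAge() { return age; }
    }

    // Equivalent in spirit to: select age, count(*) from student group by age
    public static Map<Integer, Long> countByAge(List<Student> students) {
        return students.stream()
                .collect(Collectors.groupingBy(Student::getAge, TreeMap::new, Collectors.counting()));
    }

    // averagingInt returns 0.0 for an empty stream instead of throwing.
    public static double averageAge(List<Student> students) {
        return students.stream().collect(Collectors.averagingInt(Student::getAge));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Ann", 18), new Student("Bob", 19),
                new Student("Cid", 18), new Student("Dee", 21));
        System.out.println(countByAge(students)); // prints {18=2, 19=1, 21=1}
        System.out.println(averageAge(students)); // prints 19.0
    }
}
```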