Java 8 in 3 Minutes: The Ultimate Summary of Java 8 New Features, Part 2: Stream API

Date: 2019-12-05

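Overview

1. Stream API: processes collections of data declaratively. You state what you want to achieve (for example, select the low-calorie dishes) rather than how to implement the operation (with control-flow statements such as loops and if conditions).

2. Stream API characteristics

a) Pipelining: many stream operations themselves return a stream, so several operations can be chained into one larger pipeline. This makes optimizations such as laziness and short-circuiting possible.

b) Internal iteration: unlike collections, which are iterated explicitly with an iterator, the iteration over a stream happens behind the scenes.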
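
To make the contrast between external and internal iteration concrete, here is a minimal sketch. The class name IterationStyles, its two methods, and the sample names are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class IterationStyles {

    // External iteration: the caller drives the loop and spells out the filtering.
    static List<String> shortNamesWithLoop(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.length() <= 4) {
                result.add(name.toUpperCase());
            }
        }
        return result;
    }

    // Internal iteration: the pipeline states what is wanted, the library iterates.
    static List<String> shortNamesWithStream(List<String> names) {
        return names.stream()
                .filter(name -> name.length() <= 4)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("pork", "beef", "chicken", "rice", "pizza");
        System.out.println(shortNamesWithLoop(names));   // [PORK, BEEF, RICE]
        System.out.println(shortNamesWithStream(names)); // [PORK, BEEF, RICE]
    }
}
```

Both methods return the same list; only the stream version leaves the iteration strategy to the library.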

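3. Stream: a sequence of elements from a source that supports data processing operations.

a) Sequence of elements: like a collection, a stream provides an interface (java.util.stream.Stream) to an ordered set of values of a specific element type. Collections are data structures, so their main purpose is to store and access elements with a specific time/space complexity (for example ArrayList and LinkedList); the purpose of a stream is to express computations such as filter, sorted, and map.

b) Source: a stream consumes from a data-providing source such as a collection, an array, or an I/O resource. Note that a stream generated from an ordered collection preserves the original order.

c) Data processing operations: streams support database-like operations as well as operations common in functional programming languages, such as filter, map, reduce, find, match, and sort. Stream operations can be executed sequentially or in parallel.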

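4. Categories of stream operations

a) Intermediate operations: stream operations that can be chained together. On their own they produce no result; they are lazy and do not run until a terminal operation is invoked.

b) Terminal operations: operations that close the stream and run the pipeline to return a result.

c) Common intermediate operations

Operation	Return type	Argument	Function descriptor
filter	Stream<T>	Predicate<T>	T -> boolean
map	Stream<R>	Function<T, R>	T -> R
limit	Stream<T>	long
sorted	Stream<T>	Comparator<T>	(T, T) -> int
distinct	Stream<T>

d) Common terminal operations

Operation	Purpose
forEach	Consumes each element of the stream and applies a lambda to it. Returns void.
count	Returns the number of elements in the stream. Returns long.
collect	Reduces the stream to a collection such as a List or a Map, or even to a single value such as an Integer.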
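
The split between intermediate and terminal operations can be observed directly. The following is a minimal sketch, assuming a sequential stream on a standard JDK; the class LazyPipelineDemo and its bookkeeping fields are hypothetical. It records which elements the filter inspects, both before and after the terminal operation runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyPipelineDemo {

    // Elements the filter has looked at so far (bookkeeping for this demo only).
    static final List<Integer> inspected = new ArrayList<>();
    // How many elements had been inspected before the terminal operation ran.
    static int inspectedBeforeTerminal = -1;

    static List<Integer> firstTwoEven(List<Integer> source) {
        inspected.clear();
        // Building the pipeline executes nothing: filter and limit are intermediate.
        Stream<Integer> pipeline = source.stream()
                .filter(n -> {
                    inspected.add(n);
                    return n % 2 == 0;
                })
                .limit(2);
        inspectedBeforeTerminal = inspected.size();
        // collect is the terminal operation that actually pulls elements through.
        return pipeline.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> result = firstTwoEven(Arrays.asList(1, 2, 3, 4, 5, 6));
        System.out.println(inspectedBeforeTerminal); // 0
        System.out.println(result);                  // [2, 4]
        System.out.println(inspected);               // [1, 2, 3, 4]
    }
}
```

Elements 5 and 6 are never inspected: limit(2) short-circuits the pipeline as soon as two elements have passed the filter.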

5. Examples

a) Dish.java (this class is used repeatedly in the later examples)

public class Dish {
    private final String name;
    private final boolean vegetarian;
    private final int calories;
    private final Type type;

    public enum Type {MEAT, FISH, OTHER}

    public Dish(String name, boolean vegetarian, int calories, Type type) {
        this.name = name;
        this.vegetarian = vegetarian;
        this.calories = calories;
        this.type = type;
    }

    public String getName() {
        return name;
    }

    public boolean isVegetarian() {
        return vegetarian;
    }

    public int getCalories() {
        return calories;
    }

    public Type getType() {
        return type;
    }

    @Override
    public String toString() {
        return name;
    }

}

b) DishUtils.java (this class is used repeatedly in the later examples)

import java.util.Arrays;
import java.util.List;

public class DishUtils {

    public static List<Dish> makeMenu() {
        return Arrays.asList(
                new Dish("pork", false, 800, Dish.Type.MEAT),
                new Dish("beef", false, 700, Dish.Type.MEAT),
                new Dish("chicken", false, 400, Dish.Type.MEAT),
                new Dish("french fries", true, 530, Dish.Type.OTHER),
                new Dish("rice", true, 350, Dish.Type.OTHER),
                new Dish("season fruit", true, 120, Dish.Type.OTHER),
                new Dish("pizza", true, 550, Dish.Type.OTHER),
                new Dish("prawns", false, 300, Dish.Type.FISH),
                new Dish("salmon", false, 450, Dish.Type.FISH));
    }

    public static <T> void printList(List<T> list) {
        for (T i : list) {
            System.out.println(i);
        }
    }

}

c) Test.java

import java.util.List;

import static java.util.stream.Collectors.toList;

public class Test {

    public static void main(String[] args) {
        List<String> names = DishUtils.makeMenu().stream() // obtain a stream
                .filter(d -> d.getCalories() > 300) // intermediate: keep high-calorie dishes
                .map(Dish::getName) // intermediate: extract the dish names
                .limit(3) // intermediate: keep the first three
                .collect(toList()); // terminal: store the result in a List
        DishUtils.printList(names);

        DishUtils.makeMenu().stream()
                .filter(d -> d.getCalories() > 300)
                .map(Dish::getName)
                .limit(3)
                .forEach(System.out::println); // terminal: iterate and print
    }

}

d) Diagram (figure not available)

Filtering

1. Filtering methods

a) filter(): filters the elements of the stream with a Predicate.

b) distinct(): removes duplicate elements, using the elements' hashCode() and equals() methods to decide equality.

2. Examples

import java.util.Arrays;
import java.util.List;
import static java.util.stream.Collectors.toList;
// filter()
List<Dish> vegetarianMenu = DishUtils.makeMenu().stream()
        .filter(Dish::isVegetarian)
        .collect(toList());
DishUtils.printList(vegetarianMenu);
System.out.println("-----");
// distinct()
List<Integer> numbers = Arrays.asList(1, 2, 1, 3, 3, 2, 4);
numbers.stream()
        .filter(i -> i % 2 == 0)
        .distinct()
        .forEach(System.out::println); // 2 4

Slicing

1. Slicing methods

a) limit(): returns a stream that is no longer than the given size.

b) skip(): returns a stream that discards the first n elements. If the stream has fewer than n elements, an empty stream is returned.

2. Examples

import java.util.List;
import static java.util.stream.Collectors.toList;
// limit()
List<Dish> dishes1 = DishUtils.makeMenu().stream()
        .filter(d -> d.getCalories() > 300)
        .limit(3)
        .collect(toList());
DishUtils.printList(dishes1);
System.out.println("-----");
// skip()
List<Dish> dishes2 = DishUtils.makeMenu().stream()
        .filter(d -> d.getCalories() > 300)
        .skip(2)
        .collect(toList());
DishUtils.printList(dishes2);

Mapping

1. Mapping methods

a) map(): takes a function as its argument and applies it to each element, mapping it to a new element.

b) flatMap(): takes a function that maps each element to a stream, then flattens all the generated streams into a single stream.

c) Note: neither map() nor flatMap() modifies the original elements.

2. Examples

import java.util.Arrays;
import java.util.List;
import static java.util.stream.Collectors.toList;
// map()
List<Integer> dishNameLengths = DishUtils.makeMenu().stream()
        .map(Dish::getName)
        .map(String::length)
        .collect(toList());
DishUtils.printList(dishNameLengths);
System.out.println("-----");
// flatMap()
String[] arrayOfWords = {"Goodbye", "World"};
Arrays.stream(arrayOfWords)
        .map(w -> w.split("")) // split each word into an array of its letters
        .flatMap(Arrays::stream) // flatten the generated streams into a single stream
        .distinct() // remove duplicates
        .forEach(System.out::println);

Matching

1. Matching methods

a) anyMatch(): checks whether at least one element of the stream matches the given Predicate.

b) allMatch(): checks whether all elements of the stream match the given Predicate.

c) noneMatch(): checks whether no element of the stream matches the given Predicate.

2. Examples

// anyMatch()
if (DishUtils.makeMenu().stream().anyMatch(Dish::isVegetarian)) {
    System.out.println("The menu is (somewhat) vegetarian friendly!!");
}
// allMatch()
boolean isHealthy1 = DishUtils.makeMenu().stream()
        .allMatch(d -> d.getCalories() < 1000);
System.out.println(isHealthy1);
// noneMatch()
boolean isHealthy2 = DishUtils.makeMenu().stream()
        .noneMatch(d -> d.getCalories() >= 1000);
System.out.println(isHealthy2);

Finding

1. Finding methods

a) findAny(): returns an arbitrary element of the current stream. The return type is java.util.Optional, a container class new in Java 8 that represents a value that may be absent and so helps avoid NullPointerException.

b) findFirst(): similar to findAny(), except that it returns the first element.

2. Examples

import java.util.Arrays;
import java.util.List;
import java.util.Optional;
// findAny()
Optional<Dish> dish = DishUtils.makeMenu().stream()
        .filter(Dish::isVegetarian)
        .findAny();
// Usually french fries on a sequential stream, but findAny() gives no guarantee
System.out.println(dish.get());
// findFirst()
List<Integer> someNumbers = Arrays.asList(1, 2, 3, 4, 5);
Optional<Integer> firstSquareDivisibleByThree = someNumbers.stream()
        .map(x -> x * x)
        .filter(x -> x % 3 == 0)
        .findFirst(); // 9
System.out.println(firstSquareDivisibleByThree.get());
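
Calling get() on an empty Optional throws NoSuchElementException, so when a match is not guaranteed it is safer to supply a fallback. The sketch below shows one way to do that with orElse() and Optional.map(); the class OptionalHandling and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class OptionalHandling {

    // Returns the first number above the threshold, or the fallback if there is none.
    static int firstAbove(List<Integer> numbers, int threshold, int fallback) {
        Optional<Integer> found = numbers.stream()
                .filter(n -> n > threshold)
                .findFirst();
        return found.orElse(fallback);
    }

    // Describes the result without ever calling get() on an empty Optional.
    static String describeFirstAbove(List<Integer> numbers, int threshold) {
        return numbers.stream()
                .filter(n -> n > threshold)
                .findFirst()
                .map(n -> "found " + n)
                .orElse("nothing found");
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 5, 9);
        System.out.println(firstAbove(numbers, 4, -1));      // 5
        System.out.println(firstAbove(numbers, 10, -1));     // -1
        System.out.println(describeFirstAbove(numbers, 4));  // found 5
        System.out.println(describeFirstAbove(numbers, 10)); // nothing found
    }
}
```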

Reducing

1. Reduction methods

a) reduce(): combines the elements of a stream into a single value; this is also called a fold.

i. If an initial value is given, the reduced value is returned directly.

ii. If no initial value is given, an Optional is returned, because the stream may be empty.

2. Examples

import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
List<Integer> numbers = new ArrayList<>();
for (int n = 1; n <= 100; n++) {
    numbers.add(n);
}
// Sum of the elements
int sum1 = numbers.stream().reduce(0, (a, b) -> a + b); // with initial value 0
System.out.println(sum1);
Optional<Integer> sum2 = numbers.stream().reduce((a, b) -> a + b); // without an initial value
System.out.println(sum2);
int sum3 = numbers.stream().reduce(0, Integer::sum); // method reference
System.out.println(sum3);
// Maximum
Optional<Integer> max1 = numbers.stream().reduce((a, b) -> a < b ? b : a); // lambda expression
System.out.println(max1);
Optional<Integer> max2 = numbers.stream().reduce(Integer::max); // method reference
System.out.println(max2);
// Counting
int count1 = DishUtils.makeMenu().stream()
        .map(d -> 1)
        .reduce(0, (a, b) -> a + b); // map-reduce pattern, easy to parallelize
System.out.println(count1);
long count2 = DishUtils.makeMenu().stream().count();
System.out.println(count2);
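
The difference between the two reduce() variants shows most clearly on an empty stream. A minimal sketch follows; the class ReduceOnEmpty and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class ReduceOnEmpty {

    // With an initial value, an empty stream simply yields that value.
    static int sumWithIdentity(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Without an initial value there may be nothing to return, hence Optional.
    static Optional<Integer> sumWithoutIdentity(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> empty = Collections.emptyList();
        System.out.println(sumWithIdentity(empty));                        // 0
        System.out.println(sumWithoutIdentity(empty).isPresent());         // false
        System.out.println(sumWithoutIdentity(Arrays.asList(1, 2, 3)));    // Optional[6]
    }
}
```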

Sorting

1. Sorting methods

a) sorted(): sorts the elements according to the given java.util.Comparator.

2. Examples

import static java.util.Comparator.comparing;
DishUtils.makeMenu().stream()
        .sorted(comparing(Dish::getCalories))
        .forEach(System.out::println);
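
Two common variations are sketched below under stated assumptions: sorted() without arguments, which uses the natural ordering of the elements, and a Comparator composed with reversed() and thenComparing(). The class SortingVariants and the sample words are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class SortingVariants {

    // No-argument sorted(): natural (here alphabetical) ordering.
    static List<String> naturalOrder(List<String> words) {
        return words.stream()
                .sorted()
                .collect(Collectors.toList());
    }

    // Longest words first; words of equal length fall back to alphabetical order.
    static List<String> longestFirst(List<String> words) {
        return words.stream()
                .sorted(Comparator.comparingInt(String::length).reversed()
                        .thenComparing(Comparator.<String>naturalOrder()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("rice", "pork", "chicken", "beef");
        System.out.println(naturalOrder(words)); // [beef, chicken, pork, rice]
        System.out.println(longestFirst(words)); // [chicken, beef, pork, rice]
    }
}
```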

Numeric streams

Primitive streams

1. Purpose: avoid the overhead of autoboxing.

2. Methods

a) mapToInt(): converts a stream to the primitive stream IntStream.

b) mapToDouble(): converts a stream to the primitive stream DoubleStream.

c) mapToLong(): converts a stream to the primitive stream LongStream.

d) boxed(): converts a primitive stream back to a stream of objects.

3. Primitive versions of Optional: OptionalInt, OptionalDouble, and OptionalLong.

4. Examples

import java.util.OptionalInt;
import java.util.stream.IntStream;
import java.util.stream.Stream;
// Map to a numeric stream
int calories = DishUtils.makeMenu().stream() // returns Stream<Dish>
        .mapToInt(Dish::getCalories) // returns IntStream
        .sum();
System.out.println(calories);
// Convert back to a stream of objects
IntStream intStream = DishUtils.makeMenu().stream().mapToInt(Dish::getCalories); // Stream<Dish> to IntStream
Stream<Integer> stream = intStream.boxed(); // IntStream to Stream<Integer>
// OptionalInt
OptionalInt maxCalories = DishUtils.makeMenu().stream()
        .mapToInt(Dish::getCalories)
        .max();
int max = maxCalories.orElse(1); // explicit default in case there is no maximum
System.out.println(max);

Numeric ranges

1. Numeric range methods

a) range(): generates the numbers from the start value to the end value, excluding the end value.

b) rangeClosed(): generates the numbers from the start value to the end value, including the end value.

2. Examples

import java.util.stream.IntStream;
IntStream.range(1, 5).forEach(System.out::println); // 1 to 4
IntStream.rangeClosed(1, 5).forEach(System.out::println); // 1 to 5

Building streams

Streams from values

1. Examples

a) Stream.of()

import java.util.stream.Stream;
Stream<String> stream = Stream.of("Java 8 ", "Lambdas ", "In ", "Action");
stream.map(String::toUpperCase).forEach(System.out::println);

b) Empty stream

import java.util.stream.Stream;
Stream<String> emptyStream = Stream.empty();

Streams from arrays

1. Examples

import java.util.Arrays;
int[] numbers = {2, 3, 5, 7, 11, 13};
int sum = Arrays.stream(numbers).sum();
System.out.println(sum); // 41

Streams from files

1. Examples

import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.stream.Stream;
// Count the distinct words in data.txt; try-with-resources closes the file
try (Stream<String> lines = Files.lines(Paths.get("data.txt"), Charset.defaultCharset())) {
    long uniqueWords = lines.flatMap(line -> Arrays.stream(line.split(" ")))
            .distinct()
            .count();
    System.out.println(uniqueWords);
} catch (IOException e) {
    e.printStackTrace();
}

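Streams from functions (creating infinite streams)

1. Infinite stream: a stream with no fixed size.

2. Methods

a) Stream.iterate(): generates an infinite stream. The first argument is the initial value; each following value is produced by applying the lambda given as the second argument to the previous value.

b) Stream.generate(): generates an infinite stream whose values are produced by the lambda (a Supplier) passed as its argument.

3. Note: in general, limit(n) should be applied to an infinite stream so that it does not produce infinitely many values.

4. Examples

import java.util.stream.Stream;
Stream.iterate(0, n -> n + 2)
        .limit(5)
        .forEach(System.out::println); // 0 2 4 6 8
Stream.generate(Math::random)
        .limit(5)
        .forEach(System.out::println);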
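
Because Stream.iterate() derives each value from the previous one, it can also carry a small piece of state. As an illustration (not part of the original examples), the sketch below generates Fibonacci numbers by iterating over pairs; the class FibonacciStream and its method name are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FibonacciStream {

    // Each element is a pair {current, next}; the next pair is {next, current + next}.
    static List<Integer> firstFibonacci(int count) {
        return Stream.iterate(new int[]{0, 1}, pair -> new int[]{pair[1], pair[0] + pair[1]})
                .limit(count)            // bound the infinite stream
                .map(pair -> pair[0])    // keep only the current value of each pair
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstFibonacci(10)); // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
    }
}
```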

Advanced uses of collect()

Reducing and summarizing

1. Examples

a) Finding the maximum by a field of the elements

import java.util.Comparator;
import java.util.Optional;
import static java.util.stream.Collectors.maxBy;
Comparator<Dish> dishCaloriesComparator = Comparator.comparingInt(Dish::getCalories);
Optional<Dish> mostCalorieDish = DishUtils.makeMenu().stream()
        .collect(maxBy(dishCaloriesComparator));
System.out.println(mostCalorieDish);

b) Summing a field of the elements

import static java.util.stream.Collectors.summingInt;
int totalCalories = DishUtils.makeMenu().stream().collect(summingInt(Dish::getCalories));
System.out.println(totalCalories);

c) Averaging a field of the elements

import static java.util.stream.Collectors.averagingInt;
double avgCalories = DishUtils.makeMenu().stream().collect(averagingInt(Dish::getCalories));
System.out.println(avgCalories);

d) Joining strings

import static java.util.stream.Collectors.joining;
String shortMenu = DishUtils.makeMenu().stream().map(Dish::getName).collect(joining(", "));
System.out.println(shortMenu);

e) Generalized reduction

import static java.util.stream.Collectors.reducing;
// Sum of all calories. Equivalent to:
// int totalCalories = DishUtils.makeMenu().stream()
//         .mapToInt(Dish::getCalories) // transformation function
//         .reduce(0, Integer::sum); // initial value, accumulation function
int totalCalories = DishUtils.makeMenu().stream()
        .collect(reducing(
                0, // initial value
                Dish::getCalories, // transformation function
                Integer::sum)); // accumulation function
System.out.println(totalCalories);

Grouping

1. Grouping: similar to GROUP BY in SQL, except that here the groups may or may not be aggregated (aggregation corresponds to SQL's aggregate functions).

2. Examples

a) Simple grouping

import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.groupingBy;
Map<Dish.Type, List<Dish>> dishesByType = DishUtils.makeMenu().stream()
        .collect(groupingBy(Dish::getType));
// Key order may vary:
// {FISH=[prawns, salmon], MEAT=[pork, beef, chicken], OTHER=[french fries, rice, season fruit, pizza]}
System.out.println(dishesByType);

b) Grouping with a custom classifier

import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.groupingBy;
public enum CaloricLevel {DIET, NORMAL, FAT} // declared at class level
Map<CaloricLevel, List<Dish>> dishesByCaloricLevel = DishUtils.makeMenu().stream().collect(
        groupingBy(dish -> {
            if (dish.getCalories() <= 400) return CaloricLevel.DIET;
            else if (dish.getCalories() <= 700) return CaloricLevel.NORMAL;
            else return CaloricLevel.FAT;
        }));
// Key order may vary:
// {NORMAL=[beef, french fries, pizza, salmon], DIET=[chicken, rice, season fruit, prawns], FAT=[pork]}
System.out.println(dishesByCaloricLevel);

c) Multilevel grouping

import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.groupingBy;
public enum CaloricLevel {DIET, NORMAL, FAT} // declared at class level
Map<Dish.Type, Map<CaloricLevel, List<Dish>>> dishesByTypeCaloricLevel = DishUtils.makeMenu().stream().collect(
        groupingBy(Dish::getType, // first-level classifier
                groupingBy(dish -> { // second-level classifier
                    if (dish.getCalories() <= 400) return CaloricLevel.DIET;
                    else if (dish.getCalories() <= 700) return CaloricLevel.NORMAL;
                    else return CaloricLevel.FAT;
                })
        )
);
System.out.println(dishesByTypeCaloricLevel);
// {FISH={NORMAL=[salmon], DIET=[prawns]}, MEAT={NORMAL=[beef], DIET=[chicken], FAT=[pork]}, OTHER={NORMAL=[french fries, pizza], DIET=[rice, season fruit]}}

d) Grouping with aggregation

import java.util.Map;
import java.util.Optional;
import static java.util.Comparator.comparingInt;
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.maxBy;
Map<Dish.Type, Long> typesCount = DishUtils.makeMenu().stream()
        .collect(groupingBy(Dish::getType, counting()));
System.out.println(typesCount); // {FISH=2, MEAT=3, OTHER=4}

Map<Dish.Type, Optional<Dish>> mostCaloricByType1 = DishUtils.makeMenu().stream()
        .collect(groupingBy(Dish::getType, maxBy(comparingInt(Dish::getCalories))));
System.out.println(mostCaloricByType1); // {FISH=Optional[salmon], MEAT=Optional[pork], OTHER=Optional[pizza]}

Map<Dish.Type, Dish> mostCaloricByType2 = DishUtils.makeMenu().stream()
        .collect(groupingBy(Dish::getType, // classifier
                collectingAndThen(
                        maxBy(comparingInt(Dish::getCalories)), // wrapped collector
                        Optional::get))); // finishing function
System.out.println(mostCaloricByType2); // {FISH=salmon, MEAT=pork, OTHER=pizza}

Partitioning

1. Partitioning: a special case of grouping in which a Predicate<T> splits the elements into a true group and a false group, so the keys of the resulting Map are of type Boolean.

2. Examples

import java.util.List;
import java.util.Map;
import java.util.Optional;
import static java.util.Comparator.comparingInt;
import static java.util.stream.Collectors.*;
Map<Boolean, List<Dish>> partitionedMenu = DishUtils.makeMenu().stream()
        .collect(partitioningBy(Dish::isVegetarian));
System.out.println(partitionedMenu);
// {false=[pork, beef, chicken, prawns, salmon], true=[french fries, rice, season fruit, pizza]}

Map<Boolean, Map<Dish.Type, List<Dish>>> vegetarianDishesByType = DishUtils.makeMenu().stream()
        .collect(partitioningBy(Dish::isVegetarian, groupingBy(Dish::getType)));
System.out.println(vegetarianDishesByType);
// {false={FISH=[prawns, salmon], MEAT=[pork, beef, chicken]}, true={OTHER=[french fries, rice, season fruit, pizza]}}

Map<Boolean, Dish> mostCaloricPartitionedByVegetarian = DishUtils.makeMenu().stream()
        .collect(partitioningBy(Dish::isVegetarian, collectingAndThen(maxBy(comparingInt(Dish::getCalories)), Optional::get)));
System.out.println(mostCaloricPartitionedByVegetarian);
// {false=pork, true=pizza}
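
Partitioning works with any predicate, not only with a boolean getter. As a further illustration, the sketch below partitions the integers from 2 to n into primes and non-primes; the class PrimePartition and its methods are hypothetical helpers written for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PrimePartition {

    // A number is prime if no integer from 2 up to its square root divides it.
    static boolean isPrime(int candidate) {
        if (candidate < 2) {
            return false;
        }
        int root = (int) Math.sqrt(candidate);
        return IntStream.rangeClosed(2, root).noneMatch(i -> candidate % i == 0);
    }

    // Key true holds the primes in [2, n], key false holds the rest.
    static Map<Boolean, List<Integer>> partitionPrimes(int n) {
        return IntStream.rangeClosed(2, n)
                .boxed()
                .collect(Collectors.partitioningBy(PrimePartition::isPrime));
    }

    public static void main(String[] args) {
        System.out.println(partitionPrimes(10));
        // {false=[4, 6, 8, 9, 10], true=[2, 3, 5, 7]}
    }
}
```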

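Parallel streams

1. Parallel stream: a stream that splits its elements into multiple chunks and processes each chunk on a different thread.

2. Parallel stream methods

a) parallel(): converts a sequential stream into a parallel stream.

b) sequential(): converts a parallel stream into a sequential stream.

c) Neither method actually changes the stream itself; each only sets an internal boolean flag. The flag applies to the pipeline as a whole, so when both methods are called on the same pipeline, the last call decides whether all operations run in parallel or sequentially.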
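
The effect of the flag can be checked with isParallel(), which reports the mode of the pipeline without consuming it. A minimal sketch, with a hypothetical class name LastCallWins:

```java
import java.util.stream.IntStream;

public class LastCallWins {

    // parallel() is called first, sequential() last: the pipeline is sequential.
    static boolean endsSequential() {
        return IntStream.rangeClosed(1, 10)
                .parallel()
                .filter(n -> n % 2 == 0)
                .sequential()
                .isParallel();
    }

    // sequential() is called first, parallel() last: the pipeline is parallel.
    static boolean endsParallel() {
        return IntStream.rangeClosed(1, 10)
                .sequential()
                .map(n -> n * 2)
                .parallel()
                .isParallel();
    }

    public static void main(String[] args) {
        System.out.println(endsSequential()); // false
        System.out.println(endsParallel());   // true
    }
}
```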

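3. How parallel streams work: by default a parallel stream runs on the common ForkJoinPool, whose default parallelism is based on the number of CPU cores reported by Runtime.getRuntime().availableProcessors() (the common pool itself defaults to one worker fewer, because the thread that invokes the terminal operation also takes part). The value can also be set globally through a system property, which has to be set before the common pool is first used, for example:

System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "12");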
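
The values involved can be inspected at run time. The sketch below only reads them; the class name CommonPoolInfo is hypothetical, and no expected output is shown because the numbers depend on the machine and on JVM settings.

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolInfo {

    // Number of processors the JVM reports as available.
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the common pool used by parallel streams by default.
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("available processors: " + availableCores());
        System.out.println("common pool parallelism: " + commonPoolParallelism());
    }
}
```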

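4. When are parallel streams more effective?

a) Measure: benchmark the sequential and the parallel version on the specific machine that will run the code.

b) Watch out for boxing: automatic boxing and unboxing can hurt performance significantly and should be avoided.

c) Some operations perform worse on a parallel stream than on a sequential one: for example limit() and findFirst(), which depend on the order of the elements and are therefore expensive to execute in parallel.

d) Estimate the total cost of the pipeline: let N be the number of elements to process and Q the approximate cost of processing one element through the pipeline; N*Q is then a rough qualitative estimate of the total cost. A higher Q means a better chance of good performance with a parallel stream.

e) With a small amount of data a parallel stream performs worse than a sequential one, because parallelization has additional overhead.

f) Consider how easily the data structure behind the stream can be split: see the table below.

Data structure	Decomposability
ArrayList	Excellent
LinkedList	Poor
IntStream.range	Excellent
Stream.iterate	Poor
HashSet	Good
TreeSet	Good

g) The characteristics of the stream itself, and the way intermediate operations modify it, can change how well the decomposition performs: for example, before a filter is applied the stream can be split into parts of roughly equal size and parallel execution is efficient; after a filter the parts may differ widely in size, and parallel execution becomes less efficient.

h) The cost of the merge step in the terminal operation: if this step is expensive, the cost of combining the partial results of the substreams may outweigh the performance gained from the parallel stream.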
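
Points b) and f) can be combined in one comparison. The sketch below computes the same sum twice: once from Stream.iterate(), which boxes every value and is hard to split, and once from LongStream.rangeClosed(), which works on primitives and splits easily. The class ParallelSumSources is hypothetical, and the code demonstrates only that both variants give the same result; which one is faster should be measured on the target machine.

```java
import java.util.stream.LongStream;
import java.util.stream.Stream;

public class ParallelSumSources {

    // Boxed Long values from a sequentially dependent source: poor decomposability.
    static long iterateSum(long n) {
        return Stream.iterate(1L, i -> i + 1)
                .limit(n)
                .parallel()
                .reduce(0L, Long::sum);
    }

    // Primitive long range: no boxing, and the range can be split into even chunks.
    static long rangeSum(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(iterateSum(1000)); // 500500
        System.out.println(rangeSum(1000));   // 500500
    }
}
```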

5. Examples

import java.util.stream.Stream;
// Sequential stream
long sum1 = Stream.iterate(1L, i -> i + 1)
        .limit(8)
        .reduce(0L, Long::sum);
System.out.println(sum1); // 36
// Parallel stream
long sum2 = Stream.iterate(1L, i -> i + 1)
        .limit(8)
        .parallel()
        .reduce(0L, Long::sum);
System.out.println(sum2); // 36