Java 8 Stream API Usage Examples

Date: 2019-11-07

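Preface

Built on lambda expressions, the Java Stream API offers a new way to operate on collections. Used properly, it can greatly improve programming efficiency and code readability.

This article introduces the methods of the Stream API and demonstrates their usage in detail through examples.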

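1. Stream Characteristics

A Stream is not a collection element. It is neither a data structure nor a container for data, and is closer to an advanced version of an Iterator. Stream operations can be chained together to form a stream pipeline.

A stream pipeline consists of a data source, zero or more intermediate operations, and one terminal operation. Each intermediate operation transforms the stream in some way, converting one stream into another whose element type may be the same as or different from the input, for example by mapping elements to another type with a function or by filtering out elements that fail a condition. The terminal operation performs the final computation on the stream, such as storing its elements in a collection or iterating over and printing them.

Stream characteristics:

· No storage. A Stream is not a data structure and holds no data; its source can be an array, a Java collection, an I/O channel, and so on.
· Designed for functional programming. Operations on a Stream never modify the data source. Filtering a Stream, for example, does not delete the filtered-out elements but produces a new Stream without them.
· Lazy execution. Intermediate operations on a Stream do not run immediately; they run only when the result is actually needed.
· Single use. A Stream can be consumed only once. After it has been traversed it is invalid, just like a collection's iterator, and a new one must be created to traverse again.

Note: a stream pipeline without a terminal operation is a silent no-op, so do not forget to include one.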
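The laziness and "no modification of the source" points can be checked with a small experiment. The following is only a sketch: the class name LazyStreamDemo and its helper methods are made up for illustration and do not come from the original article. It uses peek() purely as a probe to record which elements were visited.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {

    /** Builds a pipeline with no terminal operation; returns how many elements were visited. */
    static int visitedWithoutTerminal(List<String> source) {
        List<String> visited = new ArrayList<>();
        Stream<String> pipeline = source.stream().peek(visited::add); // nothing runs yet
        return visited.size();
    }

    /** Same pipeline, but collect() is a terminal operation and triggers the traversal. */
    static int visitedWithTerminal(List<String> source) {
        List<String> visited = new ArrayList<>();
        source.stream().peek(visited::add).collect(Collectors.toList());
        return visited.size();
    }

    /** Filtering produces a new result and leaves the source list untouched. */
    static List<String> startingWith(List<String> source, String prefix) {
        return source.stream()
                .filter(s -> s.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> colors = Arrays.asList("Red", "Pink", "Purple");
        System.out.println(visitedWithoutTerminal(colors)); // 0
        System.out.println(visitedWithTerminal(colors));    // 3
        System.out.println(startingWith(colors, "P"));      // [Pink, Purple]
        System.out.println(colors);                         // [Red, Pink, Purple]
    }
}
```

The probe list is mutated from inside the pipeline only to make the traversal observable; production code should avoid that kind of side effect.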

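2. Usage Examples

The examples below use the Person class from the article "Practical Experience with the Java 8 Optional Class" (《Java 8 Optional類使用的實踐經驗》). To keep the code concise, method references are used wherever possible.

2.1 Creating Streams

2.1.1 From a sequence of arguments

For a varargs sequence, create the Stream directly with Stream.of(); there is no need to create an array first.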

IntStream stream = IntStream.of(10, 20, 30, 40, 50); // for primitives, do not use Stream<Integer>
Stream<String> colorStream = Stream.of("Red", "Pink", "Purple");
Stream<Person> personStream = Stream.of(
        new Person("mike", "male", 10),
        new Person("lucy", "female", 4),
        new Person("jason", "male", 5)
);

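2.1.2 From an array

Arrays.stream() is overloaded for the primitive array types, so the call looks the same regardless of element type, but the argument must be an array. IntStream.of() also accepts an int[] directly, as below.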

int[] intNumbers = {10, 20, 30, 40, 50};
IntStream stream = IntStream.of(intNumbers);
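As a rough illustration of the "no need to distinguish primitive types" remark, the sketch below calls Arrays.stream() on int[], double[], and String[] arrays, and also uses the ranged overload Arrays.stream(array, from, to). The class and method names are hypothetical and chosen only for this example.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class ArrayStreamDemo {

    /** int[] yields an IntStream, which offers sum() directly. */
    static int sum(int[] values) {
        return Arrays.stream(values).sum();
    }

    /** double[] yields a DoubleStream through the same method name. */
    static double sum(double[] values) {
        return Arrays.stream(values).sum();
    }

    /** The ranged overload streams only values[from] .. values[to - 1]. */
    static int sumRange(int[] values, int from, int to) {
        return Arrays.stream(values, from, to).sum();
    }

    /** An object array yields an ordinary Stream<String>. */
    static String joinUpper(String[] values) {
        return Arrays.stream(values)
                .map(String::toUpperCase)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        int[] intNumbers = {10, 20, 30, 40, 50};
        System.out.println(sum(intNumbers));                           // 150
        System.out.println(sumRange(intNumbers, 1, 3));                // 50
        System.out.println(sum(new double[] {1.5, 2.5}));              // 4.0
        System.out.println(joinUpper(new String[] {"red", "pink"}));   // RED,PINK
    }
}
```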

2.1.3 From a collection (Collection subclass)

stream() creates a sequential Stream. Call parallelStream() or stream().parallel() to create a parallel Stream.

Stream<Integer> numberStream = Arrays.asList(10, 20, 30, 40, 50).stream();

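2.1.4 From a generator

· Typically used for random numbers, Streams whose elements follow a fixed rule, or generating large volumes of test data.

· Combine with limit() (optionally together with filter()) to bound the Stream size; otherwise the Stream is infinite. Note that filter() alone does not make an infinite Stream finite.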

Stream<Double> randoms = Stream.generate(Math::random).limit(10);
Stream<Integer> ticks = Stream.generate(() -> (int) (System.nanoTime() % 100)).limit(5);

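2.1.5 With iterate

· Creates a Stream by repeatedly applying the given function to a seed value; the elements are seed, f(seed), f(f(seed)), ... without end.

· Typically used for random numbers, Streams whose elements follow a fixed rule, or generating large volumes of test data.

· Combine with limit() (optionally together with filter()) to bound the Stream size; otherwise the Stream is infinite.

// Prints one per line: 0, 5, 10, 15, 20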
Stream.iterate(0, n -> n + 5).limit(5).forEach(System.out::println);

2.1.6 Integer sequences from a range

Available on IntStream and LongStream. range() excludes the end value; rangeClosed() includes it.

LongStream longRange = LongStream.range(-100L, 100L); // elements of the half-open interval [-100, 100)
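The difference between the two factory methods is easiest to see side by side. A minimal sketch (the class name RangeDemo is hypothetical):

```java
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

class RangeDemo {

    /** range(): start inclusive, end exclusive. */
    static int[] halfOpen(int start, int end) {
        return IntStream.range(start, end).toArray();
    }

    /** rangeClosed(): start inclusive, end inclusive. */
    static int[] closed(int start, int end) {
        return IntStream.rangeClosed(start, end).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(halfOpen(1, 5)));        // [1, 2, 3, 4]
        System.out.println(Arrays.toString(closed(1, 5)));          // [1, 2, 3, 4, 5]
        System.out.println(LongStream.range(-100L, 100L).count());  // 200
    }
}
```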

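2.1.7 From I/O

· Suited to reading a text file line by line, walking a directory tree, and similar scenarios.

· Usually combined with the try-with-resources statement so that the resource is closed safely and concisely.

try (Stream<String> lines = Files.lines(Paths.get("./file.txt"), StandardCharsets.UTF_8)) {
    // Skip the first line, then print lines 2 through 4 (three lines in total)
    lines.skip(1).limit(3).forEach(System.out::println);
} catch (IOException e) {
    System.out.println("Oops!");
}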

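2.2 Stream Operations

Common operations fall into two categories:

Intermediate: the result of such an operation is still a Stream

map (mapToInt, flatMap, etc.), filter, distinct, sorted, peek, limit, skip, parallel, sequential, unordered

Terminal: the contents of the Stream are aggregated into a single result by some algorithm

forEach, forEachOrdered, toArray, reduce, collect, min, max, count, anyMatch, allMatch, noneMatch, findFirst, findAny, iterator

The basic usage pattern is Stream.Intermediate.Terminal (SIT). The article 《Java8特性詳解 lambda表達式 Stream》 illustrates this pattern and several intermediate operations with diagrams.

This section covers the commonly used operations with code examples. For the demonstrations, first define the following collection: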

List<Person> persons = Arrays.asList(
        new Person("mike", "male", 10).setLocation("China", "Nanjing"),
        new Person("lucy", "female", 4),
        new Person("jason", "male", 5).setLocation("China", "Xian")
);

2.2.1 map + sum + filter + reduce

Only IntStream, LongStream, and DoubleStream provide sum().

// Sum of ages: totalAge = 19
int totalAge = persons.stream().mapToInt(Person::getAge).sum();
// Sum of ages in parallel; reduce is not recommended here (it is meant for more complex reductions)
persons.stream().parallel().mapToInt(Person::getAge).reduce(0, Integer::sum);
// Sum of the males' ages: the result is 15
persons.stream().filter(person -> "male".equals(person.getGender())).mapToInt(Person::getAge).sum();

2.2.2 map + average + max

average() returns OptionalDouble; max()/min() return OptionalInt or Optional<T>.

// Average age; prints 6.333333333333333
persons.stream().mapToInt(Person::getAge).average().ifPresent(System.out::println);
// Lexicographically greatest name (ignoring case); prints mike
persons.stream().map(Person::getName).max(String::compareToIgnoreCase).ifPresent(System.out::println);
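These operations return Optional types because the stream may be empty, in which case there is no average or maximum to report. The sketch below shows one way to handle that case; the class EmptyStreamDemo and its methods are hypothetical, and plain Integer and String lists stand in for the article's Person objects.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.OptionalDouble;

class EmptyStreamDemo {

    /** The result is empty when the list has no elements. */
    static OptionalDouble averageOf(List<Integer> ages) {
        return ages.stream().mapToInt(Integer::intValue).average();
    }

    /** Returns the longest name, or the fallback when the list is empty. */
    static String longestOrDefault(List<String> names, String fallback) {
        return names.stream()
                .max(Comparator.comparingInt(String::length))
                .orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(averageOf(Collections.<Integer>emptyList()).isPresent());        // false
        System.out.println(averageOf(Arrays.asList(10, 4, 5)).getAsDouble());               // 6.333333333333333
        System.out.println(longestOrDefault(Arrays.asList("mike", "lucy", "jason"), "n/a")); // jason
        System.out.println(longestOrDefault(Collections.<String>emptyList(), "n/a"));       // n/a
    }
}
```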

2.2.3 map + forEach

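// Print each person's name in upper case, one per line: MIKE, LUCY, JASON
persons.stream()
        .map(Person::getName) // map each Person to its name (String)
        .map(String::toUpperCase) // convert the name to upper case
        .forEach(System.out::println); // print each stream element on its own line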

2.2.4 collect

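· collect converts the Stream's elements into another data type, such as a String, List, Set, or Map.

· Java 8 provides a range of built-in collectors through the Collectors class to simplify collect.

// Each collect() call below is an independent example. A Stream can be consumed only once,
// so colorStream and personStream must be recreated before each call.
// Yields the string: Colors: Red&Pink&Purple!
colorStream.collect(Collectors.joining("&", "Colors: ", "!"));
// Yields a List (an ArrayList in the OpenJDK implementation) with elements: Red, Pink, Purple
// Note: to convert a Stream to an array, use the form stream.toArray(String[]::new)
colorStream.collect(Collectors.toList());
// Yields a Set (a HashSet in the OpenJDK implementation) with elements Red, Pink, Purple; iteration order is not guaranteed
colorStream.collect(Collectors.toSet());
// Yields a LinkedList; toCollection() specifies the collection type
colorStream.collect(Collectors.toCollection(LinkedList::new));
// Yields a Map (a HashMap in the OpenJDK implementation): {mike=Person{name='mike'}, jason=Person{name='jason'}, lucy=Person{name='lucy'}}
personStream.collect(Collectors.toMap(Person::getName, Function.identity()));

Collectors also provides computing collectors such as summingInt(), averagingInt(), and summarizingInt().

// Sum of an int property over the stream: 19
persons.stream().collect(Collectors.summingInt(Person::getAge));
// Average of an int property over the stream: 6.333333333333333
persons.stream().collect(Collectors.averagingInt(Person::getAge));
// Summary statistics of an int property: IntSummaryStatistics{count=3, sum=19, min=4, average=6.333333, max=10}
persons.stream().collect(Collectors.summarizingInt(Person::getAge));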

2.2.5 sorted + collect

// Sort by age, ascending: sortedPersons = [Person{name='lucy'}, Person{name='jason'}, Person{name='mike'}]
List<Person> sortedPersons = persons.stream()
        .sorted(Comparator.comparingInt(Person::getAge)) // sort by age
        .collect(Collectors.toList()); // collect into a List
// Sort by name length, ascending; prints one per line: mike: 4, lucy: 4, jason: 5
persons.stream()
        .sorted(Comparator.comparingInt(p -> p.getName().length()))
        .map(Person::getName)
        .map(name -> name + ": " + name.length())
        .forEach(System.out::println);

2.2.6 map + anyMatch

// Check whether anyone is named jason: existed = true
boolean existed = persons.stream()
        .map(Person::getName)
        .anyMatch("jason"::equals); // true if any element matches

2.2.7 groupingBy + map + reduce

// Group everyone by gender and count each group; prints {female=1, male=2}
Map<String, Long> groupBySex = persons.stream().collect(groupingBy(Person::getGender, counting()));
System.out.println(groupBySex);
// Group everyone by gender and find the oldest person in each group; prints Person{name='mike'}
Map<String, Optional<Person>> groupBySexAge = persons.stream().collect(
        groupingBy(Person::getGender, maxBy(Comparator.comparingInt(Person::getAge))));
System.out.println(groupBySexAge.get("male").get());
// Group everyone by gender; prints one per line: female: lucy, male: mike,jason
persons.stream().collect(groupingBy(Person::getGender))
        .forEach((k, v) -> System.out.println(k + ": "
                + v.stream().map(Person::getName)
                .reduce((x, y) -> x + "," + y).get()));

Note that this example uses the static import import static java.util.stream.Collectors.*; to shorten calls such as Collectors.groupingBy(), which makes the code more concise and readable. Also, the forEach() usage shown in this example is not recommended.
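One possible alternative to the nested forEach() and reduce() is to let a downstream collector build the joined names, so that the whole computation stays inside collect(). The sketch below is an assumption about what such a rewrite could look like, not the original author's code. The nested Person class is a hypothetical, minimal stand-in for the article's Person, and a TreeMap is chosen only to make the printed order deterministic.

```java
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.joining;
import static java.util.stream.Collectors.mapping;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupNamesDemo {

    /** Hypothetical minimal stand-in for the article's Person class. */
    static final class Person {
        private final String name;
        private final String gender;

        Person(String name, String gender) {
            this.name = name;
            this.gender = gender;
        }

        String getName() { return name; }
        String getGender() { return gender; }
    }

    /** Same people as in the article's sample list (ages omitted). */
    static List<Person> samplePersons() {
        return Arrays.asList(
                new Person("mike", "male"),
                new Person("lucy", "female"),
                new Person("jason", "male"));
    }

    /** Groups by gender and joins the names of each group with commas. */
    static Map<String, String> namesByGender(List<Person> persons) {
        return persons.stream().collect(
                groupingBy(Person::getGender, TreeMap::new,
                        mapping(Person::getName, joining(","))));
    }

    public static void main(String[] args) {
        System.out.println(namesByGender(samplePersons())); // {female=lucy, male=mike,jason}
    }
}
```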

2.2.8 maps + collect

// Age share of each person: agePercentages = [52.63%, 21.05%, 26.32%]
List<String> agePercentages = persons.stream()
        .mapToInt(Person::getAge) // map each Person to its age (int)
        .mapToDouble(age -> age / (double) totalAge * 100) // compute the age percentage
        .mapToObj(new DecimalFormat("##.00")::format) // DoubleStream -> Stream<String>
        .map(percentage -> percentage + "%") // append the percent sign
        .collect(Collectors.toList());
// With many elements, define formatter = new DecimalFormat("##.00") first, then call mapToObj(formatter::format)

2.2.9 flatMap

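flatMap() flattens the elements of the collections inside a Stream into a single new Stream, achieving the effect of merging them.

// Traditional approach (note the two nested loops)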
private static int countPrefix(List<List<String>> nested, String prefix) {
    int count = 0;
    for (List<String> element : nested) {
        if (element != null) {
            for (String str : element) {
                if (str.startsWith(prefix)) {
                    count++;
                }
            }
        }
    }
    return count;
}
// Stream approach
private static int countPrefixWithStream(List<List<String>> nested, String prefix) {
    return (int) nested.stream()
            .filter(Objects::nonNull)
            .flatMap(Collection::stream)
            .filter(str -> str.startsWith(prefix))
            .count();
}

List<List<String>> lists = Arrays.asList(
        Arrays.asList("Jame"),
        Arrays.asList("Mike", "Jason"),
        Arrays.asList("Jean", "Lucy", "Beth")
);
System.out.println("Names starting with J: " + countPrefixWithStream(lists, "J"));

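3. Rules Summary

Keep the following rules in mind when using Streams:

Avoid reusing a Stream.

Once a Java 8 Stream has been consumed by a terminal operation it cannot be used again; a new Stream chain must be created for every terminal operation. In practice, this mistake is especially easy to make when a shared Stream instance is stored in a field.

Reusing a Stream throws an IllegalStateException with the message stream has already been operated upon or closed.

If the pipeline is needed several times, a Supplier of Streams can create a new Stream with all intermediate operations already applied. For example: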
```
Supplier<Stream<String>> streamSupplier =
        () -> Stream.of("d2", "a2", "b1", "b3", "c")
                .filter(s -> s.startsWith("a"));
streamSupplier.get().anyMatch(s -> true);   // each get() call builds a new stream
streamSupplier.get().noneMatch(s -> true);
```
Note that anyMatch() accepts a Predicate argument, so filter is usually unnecessary; it is used here only for the sake of the example.
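The failure mode and the Supplier workaround can be observed together. This is a sketch with hypothetical names (StreamReuseDemo and its two methods); it catches the IllegalStateException only to show that it occurs.

```java
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamReuseDemo {

    /** Returns true if a second terminal operation on the same Stream instance is rejected. */
    static boolean secondTerminalOperationFails() {
        Stream<String> stream = Stream.of("d2", "a2", "b1", "b3", "c");
        stream.anyMatch(s -> s.startsWith("a")); // first terminal operation consumes the stream
        try {
            stream.noneMatch(s -> s.startsWith("x")); // same instance, second terminal operation
            return false;
        } catch (IllegalStateException e) {
            return true; // stream has already been operated upon or closed
        }
    }

    /** Each get() builds a fresh Stream, so both terminal operations succeed. */
    static boolean supplierAllowsRepeatedUse() {
        Supplier<Stream<String>> supplier = () -> Stream.of("d2", "a2", "b1", "b3", "c");
        boolean anyStartsWithA = supplier.get().anyMatch(s -> s.startsWith("a"));
        boolean noneStartsWithX = supplier.get().noneMatch(s -> s.startsWith("x"));
        return anyStartsWithA && noneStartsWithX;
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOperationFails()); // true
        System.out.println(supplierAllowsRepeatedUse());    // true
    }
}
```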

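Avoid creating infinite streams.

When creating a Stream with iterate or a generator, combine it with limit() to bound its size.

When distinct() is combined with limit(), however, take care that enough elements remain after deduplication to satisfy the limit. For example:

IntStream.iterate(0, i -> (i + 1) % 2) // generates the sequence 0, 1, 0, 1, ...
    .distinct() // only 0 and 1 remain after deduplication
    .limit(10) // limit(10) can never be satisfied, so the stream never terminates
    .forEach(System.out::println);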
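One way to avoid the hang, sketched below under the assumption that "the distinct values among the first n generated elements" is the intended result, is to apply limit() before distinct() so that the infinite source is bounded first. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class BoundedDistinctDemo {

    /** Returns the distinct values among the first n elements of 0, 1, 0, 1, ... */
    static int[] distinctOfFirst(int n) {
        return IntStream.iterate(0, i -> (i + 1) % 2)
                .limit(n)     // bound the infinite source first
                .distinct()   // then deduplicate the finite prefix
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(distinctOfFirst(10))); // [0, 1]
    }
}
```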

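Mind the order of Stream operations; reduce the data size as early as possible with operations such as filter().

Take the following simple code as an example:

Stream.of("a1", "b2", "c3", "d4", "e5").map(s -> {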
    System.out.println("map: " + s);
    return s.toUpperCase();
}).filter(s -> {
    System.out.println("filter: " + s);
    return s.startsWith("A");
}).forEach(s -> System.out.println("forEach: " + s));

The output is:

map: a1
filter: A1
forEach: A1
map: b2
filter: B2
map: c3
filter: C3
map: d4
filter: D4
map: e5
filter: E5

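As shown, map() and filter() are each called five times (once per string), while forEach() is called only once.

Now change the order and move filter() to the front of the Stream chain: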

Stream.of("a1", "b2", "c3", "d4", "e5").filter(s -> {
    System.out.println("filter: " + s);
    return s.startsWith("a");
}).map(s -> {
    System.out.println("map: " + s);
    return s.toUpperCase();
}).forEach(s -> System.out.println("forEach: " + s));

The output is:

filter: a1
map: a1
forEach: A1
filter: b2
filter: c3
filter: d4
filter: e5

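As shown, map() is now called only once. Although the order of the operations does not change the final result here, a sensible order reduces the number of calls actually executed. With large data sets the performance gain can be significant.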
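The call counts read off the traces can also be measured directly. The sketch below counts map() invocations for both orderings; OperationOrderDemo and its methods are hypothetical names, and the AtomicInteger counter is instrumentation only, not a pattern to copy into real pipelines.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OperationOrderDemo {

    /** map() runs before filter(), so it is applied to every element. */
    static int mapCallsWhenMapFirst() {
        AtomicInteger mapCalls = new AtomicInteger();
        Stream.of("a1", "b2", "c3", "d4", "e5")
                .map(s -> { mapCalls.incrementAndGet(); return s.toUpperCase(); })
                .filter(s -> s.startsWith("A"))
                .collect(Collectors.toList());
        return mapCalls.get();
    }

    /** filter() runs first, so map() sees only the elements that pass. */
    static int mapCallsWhenFilterFirst() {
        AtomicInteger mapCalls = new AtomicInteger();
        Stream.of("a1", "b2", "c3", "d4", "e5")
                .filter(s -> s.startsWith("a"))
                .map(s -> { mapCalls.incrementAndGet(); return s.toUpperCase(); })
                .collect(Collectors.toList());
        return mapCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(mapCallsWhenMapFirst());    // 5
        System.out.println(mapCallsWhenFilterFirst()); // 1
    }
}
```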

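Beware of side effects in Stream operations.

Most Stream operations must be non-interfering and stateless.

"Non-interfering" means that the stream's underlying data source is not modified while the pipeline runs. For example, you must not add elements to or remove elements from a collection while streaming over it.

"Stateless" means that the result of a lambda expression must not depend on any mutable variable or state in an enclosing scope that may change while the pipeline executes.

The following code tries to add and remove elements while operating on the stream; both pipelines throw java.util.ConcurrentModificationException at run time: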

List<String> strings = new ArrayList<>(Arrays.asList("one", "two"));
String concatenatedString = strings.stream()
        // Do not do this: the interference happens here
        .peek(s -> strings.add("three"))
        .reduce((a, b) -> a + " " + b)
        .get();
List<Integer> list = IntStream.range(0, 10)
        .boxed() // box the stream elements as Integer
        .collect(Collectors.toCollection(ArrayList::new));
list.stream()
        .peek(list::remove) // Do not do this: the interference happens here
        .forEach(System.out::println);
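A non-interfering way to get "one two three" is to leave the source list alone and combine it with the extra element in the pipeline itself, for instance with Stream.concat(). The sketch below contrasts that with the interfering version, whose exception it catches only to demonstrate the failure on an ArrayList source. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NonInterferenceDemo {

    /** Returns true if mutating the ArrayList source during traversal is detected. */
    static boolean mutatingSourceFails() {
        List<String> strings = new ArrayList<>(Arrays.asList("one", "two"));
        try {
            strings.stream()
                    .peek(s -> strings.add("three")) // interference: modifies the source
                    .reduce((a, b) -> a + " " + b);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Adds the extra element as part of the pipeline; the source list is never modified. */
    static String joinWithExtra(List<String> source, String extra) {
        return Stream.concat(source.stream(), Stream.of(extra))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(mutatingSourceFails());                            // true
        System.out.println(joinWithExtra(Arrays.asList("one", "two"), "three")); // one two three
    }
}
```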

The following code uses a stateful lambda expression on a parallel Stream:

Integer[] intArray = {1, 2, 3, 4, 5, 6, 7, 8};
List<Integer> listOfIntegers = new ArrayList<>(Arrays.asList(intArray));
List<Integer> parallelStorage = new ArrayList<>();
//List<Integer> parallelStorage = Collections.synchronizedList(new ArrayList<>());
listOfIntegers.parallelStream()
        // Do not do this: the lambda expression here is stateful
        .map(e -> { parallelStorage.add(e); return e; })
        .forEachOrdered(e -> System.out.print(e + " "));
System.out.println(": 1st");
parallelStorage.stream().forEachOrdered(e -> System.out.print(e + " "));
System.out.println(": 2nd");

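The run may produce results such as the following:

// When the stream runs in parallel, the order in which map() adds elements differs from the order in which forEachOrdered() prints them
1 2 3 4 5 6 7 8 : 1st
1 6 3 2 7 8 5 4 : 2nd
// Several threads may read the same index n and write to it, so fewer elements than expected are stored (using synchronizedList solves this problem)
1 2 3 4 5 6 7 8 : 1st
1 5 8 3 6 : 2nd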
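Rather than writing to a shared list from inside map(), the elements can be gathered with collect(), which lets the collector handle accumulation across threads and, for an ordered source such as a List, keeps the encounter order. A minimal sketch with hypothetical names follows; the doubling in map() is just a placeholder for any stateless transformation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ParallelCollectDemo {

    /** Stateless map() plus collect(): no shared mutable state is touched by the lambdas. */
    static List<Integer> doubledInParallel(List<Integer> source) {
        return source.parallelStream()
                .map(e -> e * 2)               // depends only on its argument
                .collect(Collectors.toList()); // collector merges per-thread partial results
    }

    public static void main(String[] args) {
        List<Integer> listOfIntegers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(doubledInParallel(listOfIntegers)); // [2, 4, 6, 8, 10, 12, 14, 16]
    }
}
```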

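Effective Java, Third Edition advises against parallelizing stream pipelines unless there is good reason to believe that doing so preserves the correctness of the computation and increases its speed. The cost of inappropriately parallelizing a stream can be a program failure or a performance disaster.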