Streams and Lambda Expressions (Part 2): Stream Collectors and the Collector Interface

Date: 2019-12-07


1. Stream Collectors: the Collector Interface

package com.java.design.java8.Stream;

import com.java.design.java8.entity.Student;
import com.java.design.java8.entity.Students;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;

import java.util.*;
import java.util.stream.Collectors;


/**
 * @author 陳楊
 */

@SpringBootTest
@RunWith(SpringRunner.class)
public class CollectorDetail {

    private List<Student> students;

    @Before
    public void init() {
        students=new Students().init();
    }

    @Test
    public void testCollectorDetail() {


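        //     Collecting with a collector: the Collector interface

        //     T --> the type of the input elements of the reduction, i.e. the stream's element type
        //     A --> the mutable accumulation type of the reduction
        //     R --> the result type of the reduction
        //     public interface Collector<T, A, R>

        //     The Collector interface describes a mutable reduction operation:
        //                    it accumulates input elements into a mutable result container;
        //                    once all input elements have been processed, it can optionally transform the accumulated result into a final representation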
        //                    the reduction can run either sequentially or in parallel
        //     A  mutable reduction operation that  accumulates input elements into a mutable result container,
        //     optionally transforming  the accumulated result into a final representation after all input elements
        //     have been processed.  Reduction operations can be performed either sequentially  or in parallel.


        //     Collectors provides ready-made Collector implementations; it is in effect a Collector factory
        //     The class {@link Collectors}  provides implementations of many common mutable reductions.
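
To make the three type parameters concrete, here is a minimal sketch of a hand-written collector in which T, A and R are all different types. The class name `AverageLengthDemo` and the methods `averagingLength` and `averageLength` are hypothetical names chosen for this illustration; only `Collector.of` comes from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class AverageLengthDemo {

    // T = String   (the stream's element type)
    // A = int[]    (mutable accumulator: {total length, element count})
    // R = Double   (the final result)
    static Collector<String, int[], Double> averagingLength() {
        return Collector.of(
                () -> new int[2],                                     // supplier
                (acc, s) -> { acc[0] += s.length(); acc[1]++; },       // accumulator
                (left, right) -> {                                     // combiner
                    left[0] += right[0];
                    left[1] += right[1];
                    return left;
                },
                acc -> acc[1] == 0 ? 0.0 : (double) acc[0] / acc[1]);  // finisher
    }

    static double averageLength(List<String> words) {
        return words.stream().collect(averagingLength());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Kirito", "Asuna", "Sinon", "Yuuki", "Alice");
        System.out.println(averageLength(names)); // prints 5.2
    }
}
```

The int[] never leaves the collector: callers only ever see the Double produced by the finisher.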

2. The Components of the Collector Interface

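//     A Collector consists of the following four functions, which work together to accumulate entries into a container and optionally perform a final transformation
    //               supplier           creates a new result container
    //               accumulator        folds a new data element into a result container
    //               combiner           merges two result containers (needed when the input is processed in parallel)
    //               finisher           performs an optional final transformation on the container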
    //
    //     A {@code Collector} is specified by four functions that work together to
    //     accumulate entries into a mutable result container, and optionally perform
    //     a final transform on the result.  They are:
    //           creation of a new result container ({@link #supplier()})
    //           incorporating a new data element into a result container ({@link #accumulator()})
    //           combining two result containers into one ({@link #combiner()})
    //           performing an optional final transform on the container ({@link #finisher()})
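
The four functions can also be supplied by implementing the interface directly. The sketch below is an illustrative assumption, not code from the post: `UnmodifiableListCollector` is a hypothetical class that gathers elements into an ArrayList and wraps it in an unmodifiable view in the finisher.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Stream;

class UnmodifiableListCollector<T> implements Collector<T, List<T>, List<T>> {

    @Override
    public Supplier<List<T>> supplier() {
        return ArrayList::new;                       // a fresh, empty container
    }

    @Override
    public BiConsumer<List<T>, T> accumulator() {
        return List::add;                            // fold one element into the container
    }

    @Override
    public BinaryOperator<List<T>> combiner() {
        return (left, right) -> {                    // merge two partial containers
            left.addAll(right);
            return left;
        };
    }

    @Override
    public Function<List<T>, List<T>> finisher() {
        return Collections::unmodifiableList;        // final transformation
    }

    @Override
    public Set<Characteristics> characteristics() {
        // The finisher is not the identity function, so no hints are declared.
        return Collections.emptySet();
    }

    public static void main(String[] args) {
        List<String> names = Stream.of("Kirito", "Asuna")
                .collect(new UnmodifiableListCollector<String>());
        System.out.println(names);
    }
}
```

Because the finisher does real work here, the collector must not declare IDENTITY_FINISH; otherwise an implementation would be free to skip the wrapping step.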

3. combiner

/*
         *     A function that accepts two partial results and merges them.  The
         *     combiner function may fold state from one argument into the other and
         *     return that, or may return a new result container.
         *
         *
         *     BinaryOperator<A> combiner();
         */

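       /*     sequential: supplier creates a single result container --> accumulator is called once per element
              parallel:   the input is partitioned, one container per partition --> combiner merges the partition containers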

              A sequential implementation of a reduction using a collector would
              create a single result container using the supplier function, and invoke the
              accumulator function once for each input element.  A parallel implementation
              would partition the input, create a result container for each partition,
              accumulate the contents of each partition into a subresult for that partition,
              and then use the combiner function to merge the subresults into a combined
              result.
        */
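
The two evaluation strategies described above can be imitated by calling a collector's functions by hand. This is only a single-threaded sketch of the idea, not how the stream library schedules work; `ManualCollect`, `sequential` and `partitioned` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ManualCollect {

    // Sequential style: one container, the accumulator is called once per element.
    static <T, A, R> R sequential(Collector<? super T, A, R> c, List<T> input) {
        A container = c.supplier().get();
        for (T t : input) {
            c.accumulator().accept(container, t);
        }
        return c.finisher().apply(container);
    }

    // Parallel style, simulated on one thread: split the input in two,
    // give each half its own container, then merge them with the combiner.
    static <T, A, R> R partitioned(Collector<? super T, A, R> c, List<T> input) {
        int mid = input.size() / 2;
        A left = c.supplier().get();
        for (T t : input.subList(0, mid)) {
            c.accumulator().accept(left, t);
        }
        A right = c.supplier().get();
        for (T t : input.subList(mid, input.size())) {
            c.accumulator().accept(right, t);
        }
        return c.finisher().apply(c.combiner().apply(left, right));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Kirito", "Asuna", "Sinon", "Yuuki", "Alice");
        System.out.println(sequential(Collectors.toList(), names));
        System.out.println(partitioned(Collectors.toList(), names));
        System.out.println(partitioned(Collectors.joining(","), names));
    }
}
```

For a well-behaved collector both helpers return equal results, which is exactly what the constraints in the next section demand.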

4. The identity and associativity Constraints

/*
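       For sequential and parallel execution to give equivalent results, the collector functions must satisfy two constraints: identity and associativity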
       To ensure that sequential and parallel executions produce equivalent
       results, the collector functions must satisfy an identity and an associativity constraints.
 */

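/*     identity constraint:
       for any partially accumulated result, combining it with an empty result container must produce an equivalent result
       a is equivalent to combiner.apply(a, supplier.get())   (equivalent, not necessarily the same reference)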

       The identity constraint says that for any partially accumulated result,
       combining it with an empty result container must produce an equivalent
       result.  That is, for a partially accumulated result {@code a} that is the
       result of any series of accumulator and combiner invocations, {@code a} must
       be equivalent to {@code combiner.apply(a, supplier.get())}.
 */

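/*     associativity constraint:
       splitting the computation, as a parallel run does, must produce a result equivalent to the unsplit sequential computation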

       The associativity constraint says that splitting the computation must
       produce an equivalent result.  That is, for any input elements {@code t1}
       and {@code t2}, the results {@code r1} and {@code r2} in the computation
       below must be equivalent:

         A a1 = supplier.get();
         accumulator.accept(a1, t1);
         accumulator.accept(a1, t2);
         R r1 = finisher.apply(a1);  // result without splitting

         A a2 = supplier.get();
         accumulator.accept(a2, t1);
         A a3 = supplier.get();
         accumulator.accept(a3, t2);
         R r2 = finisher.apply(combiner.apply(a2, a3));  // result with splitting

 */
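
Both constraints can be spot-checked in code. The sketch below follows the computations quoted above and compares finished results with `equals`, which is an assumption about what "equivalent" means for the result type. `CollectorLaws`, `identityHolds`, `associativityHolds` and the deliberately broken `lossy` collector are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorLaws {

    // identity: finishing a partial result must match finishing combiner(a, emptyContainer).
    static <T, A, R> boolean identityHolds(Collector<T, A, R> c, T t) {
        A plain = c.supplier().get();
        c.accumulator().accept(plain, t);
        R r1 = c.finisher().apply(plain);

        A a = c.supplier().get();
        c.accumulator().accept(a, t);
        R r2 = c.finisher().apply(c.combiner().apply(a, c.supplier().get()));
        return Objects.equals(r1, r2);
    }

    // associativity: the unsplit and the split computation must agree.
    static <T, A, R> boolean associativityHolds(Collector<T, A, R> c, T t1, T t2) {
        A a1 = c.supplier().get();
        c.accumulator().accept(a1, t1);
        c.accumulator().accept(a1, t2);
        R r1 = c.finisher().apply(a1);                           // result without splitting

        A a2 = c.supplier().get();
        c.accumulator().accept(a2, t1);
        A a3 = c.supplier().get();
        c.accumulator().accept(a3, t2);
        R r2 = c.finisher().apply(c.combiner().apply(a2, a3));   // result with splitting
        return Objects.equals(r1, r2);
    }

    // Deliberately broken: the combiner throws away the right-hand container.
    static Collector<String, List<String>, List<String>> lossy() {
        return Collector.of(ArrayList::new, List::add, (left, right) -> left);
    }

    public static void main(String[] args) {
        System.out.println(identityHolds(Collectors.toList(), "a"));           // true
        System.out.println(associativityHolds(Collectors.toList(), "a", "b")); // true
        System.out.println(associativityHolds(lossy(), "a", "b"));             // false
    }
}
```

The lossy collector still passes the identity check, which shows that the two constraints are independent: a collector has to satisfy both.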

5. How Reduction Is Implemented

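//      Two ways to implement a reduction
        //      list.stream().reduce()                        works on immutable values: each step returns a new value
        //      list.stream().collect(Collectors.reducing())  works through a collector, which accumulates into a mutable container
        //      If reduce() is misused with a mutable container as its identity, a sequential stream may still give the expected result, but a parallel stream will not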
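
The difference between the two styles can be sketched as follows. The class and method names are hypothetical, and the last method is included only as an example of what to avoid; it is never run on a parallel stream here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ReduceVsCollect {

    // Immutable reduction: every step returns a new Integer, so parallel use is safe.
    static int sumWithReduce(List<Integer> xs) {
        return xs.parallelStream().reduce(0, Integer::sum);
    }

    // The same reduction expressed as a Collector.
    static int sumWithReducing(List<Integer> xs) {
        return xs.parallelStream().collect(Collectors.reducing(0, Integer::sum));
    }

    // Mutable reduction: the supplier hands each partition its own ArrayList.
    static List<Integer> copyWithCollect(List<Integer> xs) {
        return xs.parallelStream().collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // Anti-pattern: reduce() with a mutable identity. The single ArrayList is shared by
    // every step. That happens to work on a sequential stream but is unsafe on a
    // parallel one, so this sketch only ever uses stream(), never parallelStream().
    static List<Integer> copyWithReduceSequentialOnly(List<Integer> xs) {
        return xs.stream().reduce(
                new ArrayList<Integer>(),
                (acc, x) -> { acc.add(x); return acc; },
                (left, right) -> { left.addAll(right); return left; });
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sumWithReduce(xs));                 // 15
        System.out.println(sumWithReducing(xs));               // 15
        System.out.println(copyWithCollect(xs));               // [1, 2, 3, 4, 5]
        System.out.println(copyWithReduceSequentialOnly(xs));  // [1, 2, 3, 4, 5]
    }
}
```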

        /*

                 Libraries that implement reduction based on {@code Collector}, such as
                 {@link Stream#collect(Collector)}, must adhere to the following constraints:


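                 The first argument passed to the accumulator, both arguments passed to the combiner, and the argument passed to the finisher
                 must be the result of a previous call to one of the functions supplier, accumulator, or combiner
                 In other words, they are all values of the accumulation type A: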
                 Supplier<A> supplier();
                 BiConsumer<A, T> accumulator();
                 BinaryOperator<A> combiner();
                 Function<A, R> finisher();

                 The first argument passed to the accumulator function, both
                 arguments passed to the combiner function, and the argument passed to the
                 finisher function must be the result of a previous invocation of the
                 result supplier, accumulator, or combiner functions


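                The results of supplier, accumulator, and combiner may only be
                passed on to a later accumulator, combiner, or finisher call,
                or returned to the caller of the reduction operation;
                the implementation must do nothing else with them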
                The implementation should not do anything with the result of any of
                the result supplier, accumulator, or combiner functions other than to
                pass them again to the accumulator, combiner, or finisher functions,
                or return them to the caller of the reduction operation


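                If a result is passed to the combiner or finisher and that function does not return the same object, the result is never used again:
                the argument has been consumed and a new object has taken its place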
                 If a result is passed to the combiner or finisher
                 function, and the same object is not returned from that function, it is
                 never used again


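                Once a result has been passed to the combiner or finisher, it is never passed to the accumulator again:
                all elements destined for that container have been delivered, and its accumulation phase is over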
                Once a result is passed to the combiner or finisher function, it
                is never passed to the accumulator function again

                Non-concurrent collectors: each result container is used by only one thread at a time
                For non-concurrent collectors, any result returned from the result
                supplier, accumulator, or combiner functions must be serially
                thread-confined.  This enables collection to occur in parallel without
                the {@code Collector} needing to implement any additional synchronization.
                The reduction implementation must manage that the input is properly
                partitioned, that partitions are processed in isolation, and combining
                happens only after accumulation is complete

                Concurrent collectors: several threads may accumulate into one shared result container
                For concurrent collectors, an implementation is free to (but not
                required to) implement reduction concurrently.  A concurrent reduction
                is one where the accumulator function is called concurrently from
                multiple threads, using the same concurrently-modifiable result container,
                rather than keeping the result isolated during accumulation.
                A concurrent reduction should only be applied if the collector has the
                {@link Characteristics#UNORDERED} characteristics or if the
                originating data is unordered

            */
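
As a rough illustration of the last two cases, the sketch below runs the same grouping twice on a parallel stream: once with the non-concurrent `Collectors.groupingBy`, where partitions are accumulated in isolation and merged by the combiner, and once with `Collectors.groupingByConcurrent`, which the Javadoc describes as a concurrent and unordered collector. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ConcurrentGrouping {

    // Non-concurrent collector: per-partition maps, merged afterwards by the combiner.
    static Map<Boolean, Long> countByParity(List<Integer> xs) {
        return xs.parallelStream()
                .collect(Collectors.groupingBy(x -> x % 2 == 0, Collectors.counting()));
    }

    // Concurrent collector: the implementation may let all threads
    // accumulate into a single ConcurrentMap instead of merging partitions.
    static ConcurrentMap<Boolean, Long> countByParityConcurrent(List<Integer> xs) {
        return xs.parallelStream()
                .collect(Collectors.groupingByConcurrent(x -> x % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> xs = IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());
        System.out.println(countByParity(xs));
        System.out.println(countByParityConcurrent(xs));
    }
}
```

Both calls count 500 even and 500 odd numbers. With a list-valued downstream the concurrent variant would not promise any particular order inside each group.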

6. Characteristics: Performance Hints for Collectors

/*      Characteristics: performance hints for Collectors
             *
             *      Collectors also have a set of characteristics, that provide hints that can be used by a
             *      reduction implementation to provide better performance.
             *
             *
             *      Characteristics indicating properties of a {@code Collector}, which can
             *      be used to optimize reduction implementations.
             *
             *   enum Characteristics {
             *
                  * Indicates that this collector is <em>concurrent</em>, meaning that
                  * the result container can support the accumulator function being
                  * called concurrently with the same result container from multiple
                  * threads.
                  *
                  * If a {@code CONCURRENT} collector is not also {@code UNORDERED},
                  * then it should only be evaluated concurrently if applied to an
                  * unordered data source.

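                 CONCURRENT,  several threads may call the accumulator on one shared container, so the container must be thread-safe; evaluate concurrently only on an unordered source or together with UNORDERED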


                  * Indicates that the collection operation does not commit to preserving
                  * the encounter order of input elements.  (This might be true if the
                  * result container has no intrinsic order, such as a {@link Set}.)

                 UNORDERED,  the result does not preserve encounter order, e.g. the container is an unordered collection


                  * Indicates that the finisher function is the identity function and
                  * can be elided.  If set, it must be the case that an unchecked cast
                  * from A to R will succeed.

                 IDENTITY_FINISH  the finisher is skipped and the container of type A is returned through an unchecked cast to R
             }*/
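
The characteristics of any collector can be inspected through `characteristics()`. The sketch below prints them for a few collectors; the helper method names are hypothetical. What it relies on is documented behaviour: `Collectors.toSet()` is documented as unordered, `Collectors.groupingByConcurrent` as concurrent and unordered, and the `Collector.of` overload without a finisher as producing an IDENTITY_FINISH collector. The exact sets printed for other collectors are implementation details and may vary between JDK versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CharacteristicsDemo {

    static Collector<String, ?, Set<String>> toSet() {
        return Collectors.toSet();
    }

    static Collector<String, ?, ConcurrentMap<Integer, List<String>>> byLengthConcurrent() {
        return Collectors.groupingByConcurrent(String::length);
    }

    // No finisher is given, so Collector.of marks the result as IDENTITY_FINISH.
    static Collector<String, List<String>, List<String>> handRolledList() {
        return Collector.of(ArrayList::new, List::add,
                (left, right) -> { left.addAll(right); return left; });
    }

    public static void main(String[] args) {
        System.out.println("toSet:               " + toSet().characteristics());
        System.out.println("groupingByConcurrent: " + byLengthConcurrent().characteristics());
        System.out.println("hand-rolled list:     " + handRolledList().characteristics());
        System.out.println("joining:              " + Collectors.joining(",").characteristics());
    }
}
```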

7. The Collector Interface and Collectors

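//     Collectors ---> a simple implementation of the Collector interface: the static nested class CollectorImpl
        //     Why is the static nested class CollectorImpl defined inside Collectors?
        //          Collectors is a factory and utility class whose methods are all static
        //          It offers developers the most common Collector implementations through methods called directly on the class name, and those methods build their results with CollectorImpl
        //          CollectorImpl is used only inside Collectors, so the two are kept together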
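
The same design can be sketched in a few lines. This is an illustration of the pattern under assumed names (`MyCollectors`, `SimpleCollector`, `toArrayList`), not a copy of the JDK source: a non-instantiable factory class exposes static methods, and a private nested class holds the five components.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

final class MyCollectors {

    private MyCollectors() { }   // factory class: static methods only

    // Private nested implementation: invisible to callers, used only by the factory methods.
    private static final class SimpleCollector<T, A, R> implements Collector<T, A, R> {
        private final Supplier<A> supplier;
        private final BiConsumer<A, T> accumulator;
        private final BinaryOperator<A> combiner;
        private final Function<A, R> finisher;
        private final Set<Collector.Characteristics> characteristics;

        SimpleCollector(Supplier<A> supplier, BiConsumer<A, T> accumulator,
                        BinaryOperator<A> combiner, Function<A, R> finisher,
                        Set<Collector.Characteristics> characteristics) {
            this.supplier = supplier;
            this.accumulator = accumulator;
            this.combiner = combiner;
            this.finisher = finisher;
            this.characteristics = characteristics;
        }

        @Override public Supplier<A> supplier() { return supplier; }
        @Override public BiConsumer<A, T> accumulator() { return accumulator; }
        @Override public BinaryOperator<A> combiner() { return combiner; }
        @Override public Function<A, R> finisher() { return finisher; }
        @Override public Set<Collector.Characteristics> characteristics() { return characteristics; }
    }

    // Callers depend only on the Collector interface, never on SimpleCollector.
    static <T> Collector<T, ?, List<T>> toArrayList() {
        return new SimpleCollector<T, List<T>, List<T>>(
                ArrayList::new,
                List::add,
                (left, right) -> { left.addAll(right); return left; },
                Function.identity(),
                Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH)));
    }
}
```

A call such as `stream.collect(MyCollectors.toArrayList())` then reads just like the built-in factory methods.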

8. Test Method

// Accumulate names into a List (toList() does not guarantee the List type; in this run it is an ArrayList)
        List<String> snameList = students.stream()
                .map(Student::getName).collect(Collectors.toList());
        System.out.println("Student names accumulated into an ArrayList: " + snameList.getClass());
        System.out.println(snameList);
        System.out.println("-----------------------------------------\n");

        // Accumulate names into a TreeSet
        Set<String> snameTree = students.stream()
                .map(Student::getName).collect(Collectors.toCollection(TreeSet::new));
        System.out.println("Student names accumulated into a TreeSet: " + snameTree.getClass());
        System.out.println(snameTree);
        System.out.println("-----------------------------------------\n");

        // Convert elements to strings and concatenate them, separated by commas (the strings come from Student::toString, so this is not JSON)
        String joinedStudents = students.stream()
                .map(Student::toString).collect(Collectors.joining(","));
        System.out.println("Students joined into one comma-separated string: " + joinedStudents);
        System.out.println("-----------------------------------------\n");

        // Compute sum of salaries of students
        double totalSalary = students.stream()
                .mapToDouble(Student::getSalary).sum();
        System.out.println("Total salary of students: " + totalSalary);
        System.out.println("-----------------------------------------\n");


        // Print the student with the smallest id
        System.out.println("Student with the smallest id:");
        students.stream()
                .min(Comparator.comparingInt(Student::getId))
                .ifPresent(System.out::println);
        System.out.println("-----------------------------------------\n");


        // Print the student with the largest id
        System.out.println("Student with the largest id:");
        students.stream()
                .max(Comparator.comparingInt(Student::getId))
                .ifPresent(System.out::println);
        System.out.println("-----------------------------------------\n");


        // Compute the average age of students
        Double avgAge = students.stream()
                .collect(Collectors.averagingInt(Student::getAge));
        System.out.println("Average age of students: " + avgAge);
        System.out.println("-----------------------------------------\n");


        // Compute summary statistics of the students' ages
        IntSummaryStatistics ageSummaryStatistics = students.stream()
                .collect(Collectors.summarizingInt(Student::getAge));
        System.out.println("Summary statistics of student ages: " + ageSummaryStatistics);
        System.out.println("-----------------------------------------\n");


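        //  Group by sex and take the student with the smallest id in each group
        //  Collectors.minBy on its own yields an Optional<Student>
        //  groupingBy only creates a group once it has an element, so the Optional is never empty here;
        //  Collectors.collectingAndThen with Optional::get unwraps it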
        Map<String, Student> minIdStudent = students.stream().
                collect(Collectors.groupingBy(Student::getSex, Collectors.collectingAndThen
                        (Collectors.minBy(Comparator.comparingInt(Student::getId)), Optional::get)));

        System.out.println(minIdStudent);
        System.out.println("-----------------------------------------\n");

    }
}
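
The `Student` and `Students` classes are not shown in this post, so the test above cannot be run as is. The sketch below is a self-contained stand-in for the last grouping step: `GroupMinDemo` and its nested `Student` class are hypothetical, reduced to the fields the grouping needs, and the sample data is taken from the test output in the next section.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class GroupMinDemo {

    // Hypothetical stand-in for the post's Student entity.
    static final class Student {
        private final int id;
        private final String name;
        private final String sex;
        private final int age;

        Student(int id, String name, String sex, int age) {
            this.id = id;
            this.name = name;
            this.sex = sex;
            this.age = age;
        }

        int getId() { return id; }
        String getName() { return name; }
        String getSex() { return sex; }
        int getAge() { return age; }

        @Override
        public String toString() {
            return "Student(id=" + id + ", name=" + name + ", sex=" + sex + ", age=" + age + ")";
        }
    }

    static List<Student> sample() {
        return Arrays.asList(
                new Student(1, "Kirito", "Male", 18),
                new Student(2, "Asuna", "Female", 17),
                new Student(3, "Sinon", "Female", 16),
                new Student(4, "Yuuki", "Female", 15),
                new Student(5, "Alice", "Female", 14));
    }

    // Group by sex, then reduce each group to its student with the smallest id.
    static Map<String, Student> minIdBySex(List<Student> students) {
        return students.stream().collect(Collectors.groupingBy(
                Student::getSex,
                Collectors.collectingAndThen(
                        Collectors.minBy(Comparator.comparingInt(Student::getId)),
                        Optional::get)));
    }

    static double averageAge(List<Student> students) {
        return students.stream().collect(Collectors.averagingInt(Student::getAge));
    }

    public static void main(String[] args) {
        System.out.println(minIdBySex(sample()));
        System.out.println(averageAge(sample())); // 16.0
    }
}
```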

9. Test Output

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::        (v2.1.2.RELEASE)

2019-02-20 16:11:56.217  INFO 17260 --- [           main] c.j.design.java8.Stream.CollectorDetail  : Starting CollectorDetail on DESKTOP-87RMBG4 with PID 17260 (started by 46250 in E:\IdeaProjects\design)
2019-02-20 16:11:56.223  INFO 17260 --- [           main] c.j.design.java8.Stream.CollectorDetail  : No active profile set, falling back to default profiles: default
2019-02-20 16:11:56.699  INFO 17260 --- [           main] c.j.design.java8.Stream.CollectorDetail  : Started CollectorDetail in 0.678 seconds (JVM running for 1.401)
-----------------------------------------

Student names accumulated into an ArrayList: class java.util.ArrayList
[Kirito, Asuna, Sinon, Yuuki, Alice]
-----------------------------------------

Student names accumulated into a TreeSet: class java.util.TreeSet
[Alice, Asuna, Kirito, Sinon, Yuuki]
-----------------------------------------

Students joined into one comma-separated string: Student(id=1, name=Kirito, sex=Male, age=18, addr=Sword Art Online, salary=9.99999999E8),Student(id=2, name=Asuna, sex=Female, age=17, addr=Sword Art Online, salary=9.99999999E8),Student(id=3, name=Sinon, sex=Female, age=16, addr=Gun Gale Online, salary=9.99999999E8),Student(id=4, name=Yuuki, sex=Female, age=15, addr=Alfheim Online, salary=9.99999999E8),Student(id=5, name=Alice, sex=Female, age=14, addr=Alicization, salary=9.99999999E8)
-----------------------------------------

Total salary of students: 4.999999995E9
-----------------------------------------

Student with the smallest id:
Student(id=1, name=Kirito, sex=Male, age=18, addr=Sword Art Online, salary=9.99999999E8)
-----------------------------------------

Student with the largest id:
Student(id=5, name=Alice, sex=Female, age=14, addr=Alicization, salary=9.99999999E8)
-----------------------------------------

Average age of students: 16.0
-----------------------------------------

Summary statistics of student ages: IntSummaryStatistics{count=5, sum=80, min=14, average=16.000000, max=18}
-----------------------------------------

{Female=Student(id=2, name=Asuna, sex=Female, age=17, addr=Sword Art Online, salary=9.99999999E8), Male=Student(id=1, name=Kirito, sex=Male, age=18, addr=Sword Art Online, salary=9.99999999E8)}
-----------------------------------------