java8-02-Stream-API

Date: 2019-12-08


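0 Introduction to Stream

Home address: java.util.stream.Stream<T>
Date of birth: came into the world when Java 8 was released
Main skills: enough to brag about for three days and three nights...
Main traits
- Does not modify the input source
- Intermediate operations are lazy (lazy evaluation, deferred execution)
- A stream only becomes meaningful once it starts being consumed
- Implicit iteration
...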

Overall, a Stream feels like an evolved version of Iterator. The Java 8 source code documents it like this:

A sequence of elements supporting sequential and parallel aggregate operations

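It makes complex operations on collections, such as traversal, filtering, mapping, reduction, and slicing, convenient. Each intermediate operation returns a new Stream and leaves the original data untouched. These operations are also lazy: as far as possible, all intermediate operations are deferred and carried out in one pass when the terminal operation runs.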
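The laziness described above can be observed directly. The following is a minimal sketch (the class name LazyDemo and the log list are made up for illustration): nothing inside filter or map runs until forEach starts consuming the stream, and each element then flows through the whole pipeline before the next one is touched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class LazyDemo {
    // Records the order in which the pipeline stages actually execute
    static List<String> run() {
        List<String> log = new ArrayList<>();
        // Building the pipeline executes none of the lambdas below
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
            .filter(i -> { log.add("filter " + i); return i % 2 == 1; })
            .map(i -> { log.add("map " + i); return i * 10; });
        log.add("pipeline built");
        // The terminal operation triggers the work, one element at a time
        pipeline.forEach(i -> log.add("consume " + i));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first logged entry is "pipeline built", which shows that the intermediate operations were only recorded, not executed, when the pipeline was assembled.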

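Compared with traditional manipulation of objects and data, Stream focuses on computing over a flow of elements, somewhat like the functional programming everyone talks about.

Just how far it has evolved is best experienced for yourself.

Given a set of input data: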

List<Integer> list = Arrays.asList(1, null, 3, 1, null, 4, 5, null, 2, 0);

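Compute the sum of the non-null odd numbers in the input sequence, counting equal odd numbers only once.

Back when lambda was still in the womb, you might have implemented it like this:

int s = 0;
// Put the elements into a Set first to remove duplicates
Set<Integer> set = new HashSet<>(list);
for (Integer i : set) {
  // Keep only non-null odd numbers (the lowest bit of an odd number is 1)
  if (i != null && (i & 1) == 1) {
    s += i;
  }
}
System.out.println(s);

Once lambda and Stream join forces: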

int sum = list.stream().filter(e -> e != null && (e & 1) == 1).distinct().mapToInt(i -> i).sum();
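To compare the two styles side by side, here is a small self-contained sketch (the class and method names are hypothetical) that wraps each approach in a method and runs both on the input list from above. For that input the distinct odd numbers are 1, 3, and 5.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctOddSum {
    // Pre-Java-8 style: deduplicate with a Set, then loop and test each element
    static int sumWithLoop(List<Integer> list) {
        int s = 0;
        Set<Integer> set = new HashSet<>(list); // HashSet tolerates the null entries
        for (Integer i : set) {
            if (i != null && (i & 1) == 1) {
                s += i;
            }
        }
        return s;
    }

    // Stream style: filter, deduplicate, unbox, sum
    static int sumWithStream(List<Integer> list) {
        return list.stream()
            .filter(e -> e != null && (e & 1) == 1)
            .distinct()
            .mapToInt(Integer::intValue)
            .sum();
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, null, 3, 1, null, 4, 5, null, 2, 0);
        System.out.println(sumWithLoop(list));   // 9
        System.out.println(sumWithStream(list)); // 9
    }
}
```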

1 Obtaining a Stream

Getting a Stream from lambda's other good friends

Since 1.8, interfaces may also contain methods marked with default.

java.util.Collection<E> contains the following declarations:

public interface Collection<E> extends Iterable<E> {
    // Get a sequential stream
    default Stream<E> stream() {
        return StreamSupport.stream(spliterator(), false);
    }
    // Get a parallel stream
    default Stream<E> parallelStream() {
        return StreamSupport.stream(spliterator(), true);
    }
}

java.util.Arrays contains the following declarations:

public static <T> Stream<T> stream(T[] array) {
    return stream(array, 0, array.length);
}

public static IntStream stream(int[] array) {
    return stream(array, 0, array.length);
}

// Other similar methods are not listed one by one

Example

List<String> strs = Arrays.asList("apache", "spark");
Stream<String> stringStream = strs.stream();

IntStream intStream = Arrays.stream(new int[] { 1, 25, 4, 2 });

Getting a Stream through the Stream interface

Stream<String> stream = Stream.of("hello", "world");
Stream<String> stream2 = Stream.of("haha");
Stream<HouseInfo> stream3 = Stream.of(new HouseInfo[] { new HouseInfo(), new HouseInfo() });

Stream<Integer> stream4 = Stream.iterate(1, i -> 2 * i + 1);

Stream<Double> stream5 = Stream.generate(() -> Math.random());

Note: Stream.iterate() and Stream.generate() produce infinite streams, so you usually have to cap them yourself with limit().
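As a rough illustration of that note, the sketch below (class and method names are invented for this example) caps both kinds of infinite stream with limit() before collecting. Without the limit() call, collect would never return.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InfiniteStreams {
    // First n values of the sequence 1, 3, 7, 15, ... produced by i -> 2 * i + 1
    static List<Integer> firstIterated(int n) {
        return Stream.iterate(1, i -> 2 * i + 1)
            .limit(n) // truncate the infinite stream
            .collect(Collectors.toList());
    }

    // n random doubles in [0, 1)
    static List<Double> randoms(int n) {
        return Stream.generate(Math::random)
            .limit(n)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstIterated(5)); // [1, 3, 7, 15, 31]
        System.out.println(randoms(3).size()); // 3
    }
}
```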

2 Transforming a Stream

Filtering and slicing

This part is fairly straightforward; one example is enough.

// Create the stream
Stream<String> stream = Stream.of(//
    null, "apache", null, "apache", "apache", //
    "github", "docker", "java", //
    "hadoop", "linux", "spark", "alifafa");

stream// drop nulls, keep strings that contain "a"
    .filter(e -> e != null && e.contains("a"))//
    .distinct()// remove duplicates; relies on equals() and hashCode(), of course
    .limit(3)// take only the first three that qualify
    .forEach(System.out::println);// consume the stream

map/flatMap

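Stream's map is declared as follows:

<R> Stream<R> map(Function<? super T, ? extends R> mapper);

In other words, it takes one input (T: the element currently being iterated) and outputs a value of another type (R).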

Stream.of(null, "apache", null, "apache", "apache", //
          "hadoop", "linux", "spark", "alifafa")//
  .filter(e -> e != null && e.length() > 0)//
  .map(str -> str.charAt(0))// take the first character
  .forEach(System.out::println);
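The heading also names flatMap, which the example above does not exercise. Its declaration in java.util.stream.Stream is <R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper): each element is mapped to a stream, and the resulting streams are flattened into one. A minimal sketch, with a made-up class name and made-up input lines:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class FlatMapDemo {
    // Splits every line into words and flattens all words into a single stream
    static List<String> distinctWords(List<String> lines) {
        return lines.stream()
            .filter(Objects::nonNull)                        // drop null lines
            .flatMap(line -> Arrays.stream(line.split(" "))) // one line -> many words
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("apache spark", null, "apache hadoop");
        System.out.println(distinctWords(lines)); // [apache, spark, hadoop]
    }
}
```

With map the result would have been a stream of String[] (one array per line); flatMap is what turns it into a flat stream of words.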

sorted

Sorting is also fairly intuitive. There are two variants:

// Sort according to the elements' Comparable implementation
Stream<T> sorted();

// Sort with a custom Comparator
Stream<T> sorted(Comparator<? super T> comparator);

Example:

List<HouseInfo> houseInfos = Lists.newArrayList(//
    new HouseInfo(1, "恆大星級公寓", 100, 1), //
    new HouseInfo(2, "匯智湖畔", 999, 2), //
    new HouseInfo(3, "張江湯臣豪園", 100, 1), //
    new HouseInfo(4, "保利星苑", 23, 10), //
    new HouseInfo(5, "北顧小區", 66, 23), //
    new HouseInfo(6, "北傑公寓", null, 55), //
    new HouseInfo(7, "保利星苑", 77, 66), //
    new HouseInfo(8, "保利星苑", 111, 12)//
);

houseInfos.stream().sorted((h1, h2) -> {
    // Sort by distance first, then by browse count; null values compare as equal
    if (h1 == null || h2 == null)
      return 0;
    if (h1.getDistance() == null || h2.getDistance() == null)
      return 0;
    int ret = h1.getDistance().compareTo(h2.getDistance());
    if (ret == 0) {
      if (h1.getBrowseCount() == null || h2.getBrowseCount() == null)
        return 0;
      return h1.getBrowseCount().compareTo(h2.getBrowseCount());
    }
    return ret;
}).forEach(System.out::println);// sorted() is lazy, so a terminal operation is needed to run it
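The hand-written comparator treats any null as "equal", which can give an inconsistent ordering. One possible alternative, shown here only as a sketch, is to build the comparator from Comparator.comparing, thenComparing, and Comparator.nullsLast. The nested HouseInfo class below is a hypothetical, cut-down stand-in for the article's class (which is not shown in the post), and the sample data is invented.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class SortDemo {
    // Hypothetical stand-in for the article's HouseInfo; only the fields needed here
    static class HouseInfo {
        private final Integer houseId;
        private final Integer browseCount;
        private final Integer distance;

        HouseInfo(Integer houseId, Integer browseCount, Integer distance) {
            this.houseId = houseId;
            this.browseCount = browseCount;
            this.distance = distance;
        }

        Integer getHouseId() { return houseId; }
        Integer getBrowseCount() { return browseCount; }
        Integer getDistance() { return distance; }
    }

    // Invented sample data containing null distance and null browse count
    static List<HouseInfo> sample() {
        return Arrays.asList(
            new HouseInfo(4, 23, null),
            new HouseInfo(3, null, 1),
            new HouseInfo(2, 999, 2),
            new HouseInfo(1, 100, 1));
    }

    // Sort by distance, then by browse count; nulls go last at both levels
    static List<Integer> sortedIds(List<HouseInfo> houses) {
        Comparator<Integer> nullSafe = Comparator.nullsLast(Comparator.naturalOrder());
        Comparator<HouseInfo> byDistanceThenViews = Comparator
            .comparing(HouseInfo::getDistance, nullSafe)
            .thenComparing(HouseInfo::getBrowseCount, nullSafe);
        return houses.stream()
            .sorted(byDistanceThenViews)
            .map(HouseInfo::getHouseId)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortedIds(sample())); // [1, 3, 2, 4]
    }
}
```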

3 Terminating/Consuming a Stream

Condition tests and basic statistics

List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);

// Are all elements greater than zero?
System.out.println(list.stream().allMatch(e -> e > 0));
// Is there any even number?
System.out.println(list.stream().anyMatch(e -> (e & 1) == 0));
// Is no element less than zero?
System.out.println(list.stream().noneMatch(e -> e < 0));

// Find the first element greater than or equal to 4
Optional<Integer> optional = list.stream().filter(e -> e >= 4).findFirst();
// If it exists, run the action passed to ifPresent
optional.ifPresent(System.out::println);

// Number of elements greater than or equal to 4
System.out.println(list.stream().filter(e -> e >= 4).count());
// Get the minimum (returned as an Optional)
System.out.println(list.stream().min(Integer::compareTo));
// Get the maximum (returned as an Optional)
System.out.println(list.stream().max(Integer::compareTo));
// Convert to an IntStream first; max() then needs no comparator
System.out.println(list.stream().mapToInt(i -> i).max());

reduce

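There is no settled Chinese translation for this term; it is variously rendered as "reduction" or "aggregation".

Either way, it gathers up the data left in the stream after a series of transformations, repeatedly applying a reduce function as it goes.

reduce() has three overloads. The two that take an identity value are:

// The return type is not Optional, because the identity argument guarantees a result even when the stream is empty
T reduce(T identity, BinaryOperator<T> accumulator);

<U> U reduce(U identity,
             BiFunction<U, ? super T, U> accumulator,
             BinaryOperator<U> combiner);

The third, Optional<T> reduce(BinaryOperator<T> accumulator), takes no identity and therefore returns an Optional.

Example: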

// Walk the elements, repeatedly applying (i, j) -> i + j
Integer reduce = Stream.iterate(1, i -> i + 1)//1,2,3,...,10,...
    .limit(10)//
    .reduce(0, (i, j) -> i + j);//55

// Without an identity, the result is wrapped in an Optional
Optional<Integer> reduce2 = Stream.iterate(1, i -> i + 1)//
    .limit(10)//
    .reduce((i, j) -> i + j);
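The three-argument overload is listed above but not exercised. It is useful when the result type U differs from the element type T. A minimal sketch follows, assuming a list of words whose lengths are summed; the class and method names are made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class ReduceDemo {
    // U = Integer, T = String: fold each word's length into a running total
    static int totalLength(List<String> words) {
        return words.stream().reduce(
            0,                              // identity
            (len, w) -> len + w.length(),   // accumulator: (U, T) -> U
            Integer::sum);                  // combiner: merges partial sums (used by parallel streams)
    }

    // No identity: an empty stream yields Optional.empty()
    static Optional<Integer> sumOrEmpty(List<Integer> numbers) {
        return numbers.stream().reduce(Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("apache", "spark")));          // 11
        System.out.println(sumOrEmpty(Collections.<Integer>emptyList()).isPresent()); // false
    }
}
```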

collect

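This operation is easy to understand: as the name suggests, it collects the elements of a Stream into one place.

The most general (and least used) collect method

// The most powerful is often the least used; after all, this method is rather hard to grasp
<R> R collect(Supplier<R> supplier,
              BiConsumer<R, ? super T> accumulator,
              BiConsumer<R, R> combiner);
// For the meaning of its parameters, see the example below

The one-argument version

<R, A> R collect(Collector<? super T, A, R> collector);

The key parts of the Collector interface (it is not a functional interface, so a lambda cannot stand in for it) are: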

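public interface Collector<T, A, R> {
    /**
     * Creates a new mutable result container.
     */
    Supplier<A> supplier();

    /**
     * Folds one element into the result container.
     */
    BiConsumer<A, T> accumulator();

    /**
     * Merges two partial results into one.
     */
    BinaryOperator<A> combiner();

    /**
     * Converts the intermediate container type A into the final result type R.
     */
    Function<A, R> finisher();

    /**
     * Describes the properties of this collector.
     */
    Set<Characteristics> characteristics();

}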
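Although Collector cannot be written as a single lambda, the static factory Collector.of accepts its parts as separate functions. The sketch below is only an illustration of how the four functions fit together (toReadOnlyList and doubled are invented names): it accumulates into an ArrayList and wraps the result as an unmodifiable list in the finisher.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;

public class CollectorDemo {
    // A hand-assembled collector: T = element, A = List<T> container, R = read-only List<T>
    static <T> Collector<T, List<T>, List<T>> toReadOnlyList() {
        return Collector.of(
            ArrayList::new,                                        // supplier
            List::add,                                             // accumulator
            (left, right) -> { left.addAll(right); return left; }, // combiner
            Collections::unmodifiableList);                        // finisher
    }

    static List<Integer> doubled(List<Integer> numbers) {
        return numbers.stream().map(i -> i * 2).collect(toReadOnlyList());
    }

    public static void main(String[] args) {
        System.out.println(doubled(Arrays.asList(1, 2, 3))); // [2, 4, 6]
    }
}
```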

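Let's first look at an example of the three-argument collect() method. Unless you have a special case, I guarantee you will never want to use it again after seeing this...

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
ArrayList<Integer> ret1 = numbers.stream()//
    .map(i -> i * 2)// double each element
    .collect(//
    () -> new ArrayList<Integer>(), // argument 1
    (list, e) -> list.add(e), // argument 2
    (list1, list2) -> list1.addAll(list2)// argument 3
);

/***
 * <pre>
 * The three arguments of collect() are:
 * 1. () -> new ArrayList<Integer>()
 *         creates a new collection to hold the result
 * 2. (list, e) -> list.add(e)
 *         list: the new collection created by argument 1
 *         e: the Stream element currently being iterated
 *         this argument adds the element to the newly created collection
 * 3. (list1, list2) -> list1.addAll(list2)
 *         merges two collections
 * </pre>
 ***/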

ret1.forEach(System.out::println);

Without lambdas, the equivalent code would look like this...

List<Integer> ret3 = numbers.stream()//
    .map(i -> i * 2)// double each element
    .collect(new Supplier<List<Integer>>() {
      @Override
      public List<Integer> get() {
        // Only provides a collection to store the elements
        return new ArrayList<>();
      }
    }, new BiConsumer<List<Integer>, Integer>() {
      @Override
      public void accept(List<Integer> list, Integer e) {
        // Add the current element to the container returned by the first argument
        list.add(e);
      }
    }, new BiConsumer<List<Integer>, List<Integer>>() {
      @Override
      public void accept(List<Integer> list1, List<Integer> list2) {
        // Merge the containers
        list1.addAll(list2);
      }
    });

ret3.forEach(System.out::println);

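Feeling a bit queasy yet...

Likewise, when calling Spark's API from Java without lambdas, the code is even uglier than the above...

While I'm at it, a free plug: have a look at my post implementing Spark's HelloWorld in various versions: http://blog.csdn.net/hylexus/..., which shows just how happy a world with lambdas is...

However, once you understand the three-argument collect method, you can use constructor references and method references to make the code more concise:

ArrayList<Integer> ret2 = numbers.stream()//
    .map(i -> i * 2)// double each element
    .collect(//
    ArrayList::new, //
    List::add, //
    List::addAll//
);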

ret2.forEach(System.out::println);

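Using the Collectors utility (advanced statistics)

Both the three-argument and the one-argument collect() methods above are quite complex. The one-argument version is the one used most often, but implementing the Collector yourself is still painful.

Fortunately, java.util.stream.Collectors already provides Collectors for the common collect operations. A very powerful tool...

The following examples all operate on this list:

List<HouseInfo> houseInfos = Lists.newArrayList(//
    new HouseInfo(1, "恆大星級公寓", 100, 1), // community ID, community name, browse count, distance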
    new HouseInfo(2, "匯智湖畔", 999, 2), //
    new HouseInfo(3, "張江湯臣豪園", 100, 1), //
    new HouseInfo(4, "保利星苑", 111, 10), //
    new HouseInfo(5, "北顧小區", 66, 23), //
    new HouseInfo(6, "北傑公寓", 77, 55), //
    new HouseInfo(7, "保利星苑", 77, 66), //
    new HouseInfo(8, "保利星苑", 111, 12)//
);

OK, let the showing off begin ^_^ ...

Extracting community names

// Get all community names into a list
List<String> ret1 = houseInfos.stream()
      .map(HouseInfo::getHouseName).collect(Collectors.toList());
ret1.forEach(System.out::println);

// Get all community names into a set to remove duplicates
// Alternatively, call distinct() first and then collect into a List
Set<String> ret2 = houseInfos.stream()
      .map(HouseInfo::getHouseName).collect(Collectors.toSet());
ret2.forEach(System.out::println);

// Join all community names with _^_
// 恆大星級公寓_^_匯智湖畔_^_張江湯臣豪園_^_保利星苑_^_北顧小區_^_北傑公寓_^_保利星苑_^_保利星苑
String names = houseInfos.stream()
      .map(HouseInfo::getHouseName).collect(Collectors.joining("_^_"));
System.out.println(names);

// Specify ArrayList as the collection type
ArrayList<String> collect = houseInfos.stream()
      .map(HouseInfo::getHouseName)
      .collect(Collectors.toCollection(ArrayList::new));

Extremes

// Get the community with the highest browse count
Optional<HouseInfo> ret3 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// drop entries with a null browse count
  .collect(Collectors.maxBy((h1, h2) -> Integer.compare(h1.getBrowseCount(), h2.getBrowseCount())));
System.out.println(ret3.get());

// Get the highest browse count
Optional<Integer> ret4 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// drop entries with a null browse count
  .map(HouseInfo::getBrowseCount)// extract the browse count
  .collect(Collectors.maxBy(Integer::compare));// method reference that compares browse counts
System.out.println(ret4.get());

Count and sum

// Get the total count
// houseInfos.size() would do here; this only demonstrates the syntax
Long total = houseInfos.stream().collect(Collectors.counting());
System.out.println(total);

// Sum of browse counts
Integer ret5 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// drop entries with a null browse count
  .collect(Collectors.summingInt(HouseInfo::getBrowseCount));
System.out.println(ret5);

// Sum of browse counts
Integer ret6 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// drop entries with a null browse count
  .map(HouseInfo::getBrowseCount).collect(Collectors.summingInt(i -> i));
System.out.println(ret6);

// Sum of browse counts
int ret7 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// drop entries with a null browse count
  .mapToInt(HouseInfo::getBrowseCount)// convert to an IntStream, then call its sum() directly
  .sum();
System.out.println(ret7);

Average

// Average browse count
Double ret8 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// drop entries with a null browse count
  .collect(Collectors.averagingDouble(HouseInfo::getBrowseCount));
System.out.println(ret8);

// Average browse count
OptionalDouble ret9 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// drop entries with a null browse count
  .mapToDouble(HouseInfo::getBrowseCount)// convert to a DoubleStream, then call its average() directly
  .average();
System.out.println(ret9.getAsDouble());

Summary statistics

// Get summary statistics
DoubleSummaryStatistics statistics = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)
  .collect(Collectors.summarizingDouble(HouseInfo::getBrowseCount));
System.out.println("avg:" + statistics.getAverage());
System.out.println("max:" + statistics.getMax());
System.out.println("sum:" + statistics.getSum());
Grouping

// Group by browse count
Map<Integer, List<HouseInfo>> ret10 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null)// drop entries with a null browse count
  .collect(Collectors.groupingBy(HouseInfo::getBrowseCount));
ret10.forEach((count, house) -> {
  System.out.println("BrowseCount:" + count + " " + house);
});

// Multi-level grouping
// Group by browse count first, then by distance at the second level
Map<Integer, Map<String, List<HouseInfo>>> ret11 = houseInfos.stream()//
  .filter(h -> h.getBrowseCount() != null && h.getDistance() != null)//
  .collect(Collectors.groupingBy(
      HouseInfo::getBrowseCount,
      Collectors.groupingBy((HouseInfo h) -> {
          if (h.getDistance() <= 10)
            return "very near";
          else if (h.getDistance() <= 20)
            return "near";
          return "far";
    })));

// The result looks roughly like this
ret11.forEach((count, v) -> {
  System.out.println("browseCount:" + count);
  v.forEach((desc, houses) -> {
    System.out.println("\t" + desc);
    houses.forEach(h -> System.out.println("\t\t" + h));
  });
});
/****
 * <pre>
    browseCount:66
        far
            HouseInfo [houseId=5, houseName=北顧小區, browseCount=66, distance=23]
    browseCount:100
        very near
            HouseInfo [houseId=1, houseName=恆大星級公寓, browseCount=100, distance=1]
            HouseInfo [houseId=3, houseName=張江湯臣豪園, browseCount=100, distance=1]
    browseCount:999
        very near
            HouseInfo [houseId=2, houseName=匯智湖畔, browseCount=999, distance=2]
    browseCount:77
        far
            HouseInfo [houseId=6, houseName=北傑公寓, browseCount=77, distance=55]
            HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
    browseCount:111
        near
            HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
        very near
            HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
 *
 * </pre>
 *
 ****/
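The second argument of groupingBy is a downstream collector, and it does not have to be another groupingBy. As a small sketch of the same mechanism, the code below counts how often each value occurs by passing Collectors.counting() as the downstream. The class name and the sample strings are invented; plain strings are used so the example does not depend on the HouseInfo class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GroupCountDemo {
    // Group equal names together and reduce each group to its size
    static Map<String, Long> countByName(List<String> names) {
        return names.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("apache", "spark", "apache", "hadoop", "apache");
        // Iteration order of the resulting map is not guaranteed
        countByName(names).forEach((name, count) -> System.out.println(name + ": " + count));
    }
}
```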

Partitioning

// Partition by distance (two parts)
Map<Boolean, List<HouseInfo>> ret12 = houseInfos.stream()//
  .filter(h -> h.getDistance() != null)//
  .collect(Collectors.partitioningBy(h -> h.getDistance() <= 20));
/****
 * <pre>
    far
            HouseInfo [houseId=5, houseName=北顧小區, browseCount=66, distance=23]
            HouseInfo [houseId=6, houseName=北傑公寓, browseCount=77, distance=55]
            HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
    near
            HouseInfo [houseId=1, houseName=恆大星級公寓, browseCount=100, distance=1]
            HouseInfo [houseId=2, houseName=匯智湖畔, browseCount=999, distance=2]
            HouseInfo [houseId=3, houseName=張江湯臣豪園, browseCount=100, distance=1]
            HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
            HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
 *
 * </pre>
 ****/
ret12.forEach((t, houses) -> {
  System.out.println(t ? "near" : "far");
  houses.forEach(h -> System.out.println("\t\t" + h));
});


// Two-level partitioning: by distance first, then by browse count
Map<Boolean, Map<Boolean, List<HouseInfo>>> ret13 = houseInfos.stream()//
  .filter(h -> h.getDistance() != null)//
  .collect(
      Collectors.partitioningBy(h -> h.getDistance() <= 20,
          Collectors.partitioningBy(h -> h.getBrowseCount() >= 70))
);

/*****
 * <pre>
    far
        fewer views
            HouseInfo [houseId=5, houseName=北顧小區, browseCount=66, distance=23]
        more views
            HouseInfo [houseId=6, houseName=北傑公寓, browseCount=77, distance=55]
            HouseInfo [houseId=7, houseName=保利星苑, browseCount=77, distance=66]
    near
        fewer views
        more views
            HouseInfo [houseId=1, houseName=恆大星級公寓, browseCount=100, distance=1]
            HouseInfo [houseId=2, houseName=匯智湖畔, browseCount=999, distance=2]
            HouseInfo [houseId=3, houseName=張江湯臣豪園, browseCount=100, distance=1]
            HouseInfo [houseId=4, houseName=保利星苑, browseCount=111, distance=10]
            HouseInfo [houseId=8, houseName=保利星苑, browseCount=111, distance=12]
 * </pre>
 ****/

ret13.forEach((less, value) -> {
  System.out.println(less ? "near" : "far");
  value.forEach((moreCount, houses) -> {
    System.out.println(moreCount ? "\tmore views" : "\tfewer views");
    houses.forEach(h -> System.out.println("\t\t" + h));
  });
});
