take
def take(num: Int): Array[T]
take returns the first num elements of the RDD (indices 0 through num-1) in their existing order, without sorting.
- scala> var rdd1 = sc.makeRDD(Seq(10, 4, 2, 12, 3))
- rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[40] at makeRDD at <console>:21
-
- scala> rdd1.take(1)
- res0: Array[Int] = Array(10)
-
- scala> rdd1.take(2)
- res1: Array[Int] = Array(10, 4)
-
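The behaviour of take at the edges can be sketched as a small standalone program. This is a minimal illustration, not part of the original session: the object name TakeSketch, the app name, and the local[2] master are assumptions chosen so that it runs without spark-shell.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone wrapper; in spark-shell the `sc` value already exists.
object TakeSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("take-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.makeRDD(Seq(10, 4, 2, 12, 3))
      // take keeps the order in which the elements were parallelized.
      println(rdd.take(3).mkString(", "))
      // Asking for more elements than the RDD holds returns all of them.
      println(rdd.take(10).length)
      // take(0) returns an empty array.
      println(rdd.take(0).isEmpty)
    } finally {
      sc.stop()
    }
  }
}
```

Because take collects its result on the driver, it is best kept to small values of num.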
top
def top(num: Int)(implicit ord: Ordering[T]): Array[T]
top returns the first num elements of the RDD according to the default (descending) ordering or a specified ordering.
- scala> var rdd1 = sc.makeRDD(Seq(10, 4, 2, 12, 3))
- rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[40] at makeRDD at <console>:21
-
- scala> rdd1.top(1)
- res2: Array[Int] = Array(12)
-
- scala> rdd1.top(2)
- res3: Array[Int] = Array(12, 10)
-
- // Specify a custom ordering
- scala> implicit val myOrd = implicitly[Ordering[Int]].reverse
- myOrd: scala.math.Ordering[Int] = scala.math.Ordering$$anon$4@767499ef
-
- scala> rdd1.top(1)
- res4: Array[Int] = Array(2)
-
- scala> rdd1.top(2)
- res5: Array[Int] = Array(2, 3)
-
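An implicit val such as myOrd stays in scope for every later call in the session. As an alternative, the ordering can be passed explicitly in the second parameter list, which affects only that one call. The sketch below assumes a standalone program; the object name TopSketch and the local[2] master are illustrative choices, not part of the original session.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone wrapper; in spark-shell the `sc` value already exists.
object TopSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("top-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.makeRDD(Seq(10, 4, 2, 12, 3))
      // Default ordering: the largest elements come first.
      println(rdd.top(2).mkString(", "))                         // 12, 10
      // Reversed ordering passed explicitly, for this call only.
      println(rdd.top(2)(Ordering[Int].reverse).mkString(", "))  // 2, 3
      // The default is untouched by the previous call.
      println(rdd.top(1).mkString(", "))                         // 12
    } finally {
      sc.stop()
    }
  }
}
```

Passing the ordering explicitly avoids surprising results from later top or takeOrdered calls that would otherwise pick up the implicit.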
takeOrdered
def takeOrdered(num: Int)(implicit ord: Ordering[T]): Array[T]
takeOrdered is similar to top, except that it returns elements in the opposite order from top.
- scala> var rdd1 = sc.makeRDD(Seq(10, 4, 2, 12, 3))
- rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[40] at makeRDD at <console>:21
-
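- // The reversed implicit myOrd from the top example is still in scope here,
- // so top returns the smallest elements and takeOrdered returns the largest
- scala> rdd1.top(1)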
- res4: Array[Int] = Array(2)
-
- scala> rdd1.top(2)
- res5: Array[Int] = Array(2, 3)
-
- scala> rdd1.takeOrdered(1)
- res6: Array[Int] = Array(12)
-
- scala> rdd1.takeOrdered(2)
- res7: Array[Int] = Array(12, 10)
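
When no custom implicit Ordering is in scope, takeOrdered returns the smallest elements in ascending order and top returns the largest in descending order. The sketch below shows that default behaviour, plus ordering tuples by one field with Ordering.by. The object name TakeOrderedSketch, the sample pairs, and the local[2] master are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone wrapper; in spark-shell the `sc` value already exists.
object TakeOrderedSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("takeOrdered-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.makeRDD(Seq(10, 4, 2, 12, 3))
      // Default ordering: takeOrdered is ascending, top is descending.
      println(rdd.takeOrdered(2).mkString(", "))  // 2, 3
      println(rdd.top(2).mkString(", "))          // 12, 10

      // Illustrative sample data: order pairs by their second field only.
      val pairs = sc.makeRDD(Seq(("a", 3), ("b", 1), ("c", 2)))
      val bySecond = Ordering.by[(String, Int), Int](_._2)
      println(pairs.takeOrdered(2)(bySecond).mkString(", "))  // (b,1), (c,2)
    } finally {
      sc.stop()
    }
  }
}
```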