Sparkling Water installation


Java and Spark runtime environment
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export SPARK_HOME=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/spark
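Both variables have to be set in the shell that later starts sparkling-shell. The following is a minimal sketch of a sanity check, assuming the launcher relies on bin/java under JAVA_HOME and bin/spark-shell under SPARK_HOME; check_env is a hypothetical helper name, not part of Sparkling Water.

```shell
# check_env: hypothetical helper, not shipped with Sparkling Water.
# Returns 0 when both variables point at directories that contain the
# executables the launcher is assumed to need, 1 otherwise.
check_env() {
    # JAVA_HOME must be set and contain an executable bin/java
    if [ -z "${JAVA_HOME:-}" ] || [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "JAVA_HOME is unset or has no bin/java: ${JAVA_HOME:-<unset>}" >&2
        return 1
    fi
    # SPARK_HOME must be set and contain an executable bin/spark-shell
    if [ -z "${SPARK_HOME:-}" ] || [ ! -x "$SPARK_HOME/bin/spark-shell" ]; then
        echo "SPARK_HOME is unset or has no bin/spark-shell: ${SPARK_HOME:-<unset>}" >&2
        return 1
    fi
    echo "environment looks usable"
}
```

Calling check_env right after the two export lines catches a mistyped path before any Spark job is submitted.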

Download the package sparkling-water-1.6.13.zip
wget http://h2o-release.s3.amazonaws.com/sparkling-water/rel-1.6/13/sparkling-water-1.6.13.zip

Or download it from http://h2o-release.s3.amazonaws.com/sparkling-water/rel-1.6/3/index.html

Unpack the package

Upload the package to /usr/local

cd /usr/local; unzip sparkling-water-1.6.13.zip; cd sparkling-water-1.6.13
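The download and unpack steps can be combined into one small script. This is only a sketch: sw_url and install_sw are hypothetical helper names, and the URL pattern is inferred from the single 1.6.13 link shown here, so it is an assumption for any other release.

```shell
# sw_url and install_sw are hypothetical helpers used for illustration.

# Print the download URL for a release line (e.g. 1.6) and build number (e.g. 13).
# The pattern is an assumption generalized from the 1.6.13 URL.
sw_url() {
    echo "http://h2o-release.s3.amazonaws.com/sparkling-water/rel-$1/$2/sparkling-water-$1.$2.zip"
}

# Fetch the archive into a target directory (default /usr/local) and unpack it.
install_sw() {
    line=$1
    build=$2
    target=${3:-/usr/local}
    archive="sparkling-water-$line.$build.zip"
    cd "$target" || return 1
    # Skip the download when the archive was already uploaded by hand
    [ -f "$archive" ] || wget "$(sw_url "$line" "$build")" || return 1
    # -o overwrites files from an earlier unpack without prompting
    unzip -o "$archive" && cd "sparkling-water-$line.$build"
}
```

With these definitions, `install_sw 1.6 13` would leave the shell in /usr/local/sparkling-water-1.6.13, the directory from which the later commands are run.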

Start sparkling-shell with the launch script
sudo -u hdfs bin/sparkling-shell --num-executors 3 --executor-memory 2g --master yarn-client --conf "spark.dynamicAllocation.enabled=false"
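The launch options are easier to adjust when the command line is assembled in one place. The sketch below only prints the command, passing --master once and keeping dynamic allocation disabled, since H2O expects the set of executors to stay fixed for the lifetime of the H2OContext. sparkling_cmd is a hypothetical helper name, and the sketch assumes sparkling-shell forwards these options to Spark unchanged.

```shell
# sparkling_cmd: hypothetical helper that prints the launch command.
# $1 = number of executors (default 3), $2 = memory per executor (default 2g)
sparkling_cmd() {
    executors=${1:-3}
    memory=${2:-2g}
    # Dynamic allocation stays off so the executor count does not change
    echo "bin/sparkling-shell --master yarn-client --num-executors $executors --executor-memory $memory --conf spark.dynamicAllocation.enabled=false"
}
```

From inside the unpacked sparkling-water-1.6.13 directory, `sudo -u hdfs $(sparkling_cmd 3 2g)` would then start the shell with the same settings as the command above.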

Example run, excerpted from https://github.com/h2oai/sparkling-water/tree/rel-1.6
1. Initialize H2O services on top of Spark cluster:
scala> import org.apache.spark.h2o._
scala> val h2oContext = H2OContext.getOrCreate(sc)
scala> import h2oContext._
scala> import h2oContext.implicits._

2. Load weather data for Chicago international airport (ORD), with help from the RDD API:
scala> import org.apache.spark.examples.h2o._
scala> val weatherDataFile = "/tmp/examples/Chicago_Ohare_International_Airport.csv"
# This path is a path on HDFS
scala> val wrawdata = sc.textFile(weatherDataFile,3).cache()
scala> val weatherTable = wrawdata.map(_.split(",")).map(row => WeatherParse(row)).filter(!_.isWrongRow())

3. Load airlines data using the H2O parser:
scala> import java.io.File
scala> val dataFile = "/usr/local/sparkling-water-1.6.13/examples/smalldata/allyears2k_headers.csv.gz"
# Note: this is a local path, so the machine it is read from changes with the node the resources are allocated on
scala> val airlinesData = new H2OFrame(new File(dataFile))

4. Select flights destined for Chicago (ORD):
scala> val airlinesTable : RDD[Airlines] = asRDD[Airlines](airlinesData)
scala> val flightsToORD = airlinesTable.filter(f => f.Dest==Some("ORD"))

5. Compute the number of these flights:
scala> flightsToORD.count
res0: Long = 2103
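
Step 2 reads the weather file from HDFS, so the file has to be staged there before the shell session starts. A minimal sketch follows, assuming the CSV ships in examples/smalldata of the unpacked distribution next to the airlines file (the notes above do not confirm this) and that the calling user may write to the target directory. stage_weather_data is a hypothetical helper name.

```shell
# stage_weather_data: hypothetical helper that copies the weather CSV to HDFS.
# $1 = local source file (the default location is an assumption)
# $2 = HDFS target directory (default /tmp/examples, as used in step 2)
stage_weather_data() {
    src=${1:-/usr/local/sparkling-water-1.6.13/examples/smalldata/Chicago_Ohare_International_Airport.csv}
    dest_dir=${2:-/tmp/examples}
    # Create the target directory if needed, then overwrite any older copy
    hdfs dfs -mkdir -p "$dest_dir" &&
    hdfs dfs -put -f "$src" "$dest_dir/"
}
```

After `stage_weather_data` succeeds, `hdfs dfs -ls /tmp/examples` should list the CSV that sc.textFile opens in step 2.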

API: http://h2o-release.s3.amazonaws.com/sparkling-water/rel-1.4/1/scaladoc/index.html#org.apache.spark.h2o.H2OContext
