Spark + Livy

Introduction to Apache Livy

 

Apache Livy is an open-source REST service for Apache Spark, contributed by Cloudera Labs. It not only offers a REST-based alternative to Spark's traditional interactive submission model, but also provides the multi-user, security, and fault-tolerance support that enterprise applications cannot do without. Its main features:
  • Long-running SparkContexts that can be shared by multiple Spark jobs and multiple clients;
  • Manages multiple SparkContexts at once and runs them on the cluster (YARN/Mesos) rather than on the Livy server itself, for better fault tolerance and concurrency;
  • Spark jobs can be submitted to a remote Spark cluster as precompiled JARs, as code snippets, or through the Java/Scala client API.


Setting Up the Test Environment

 

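The setup amounts to installing Livy alongside an existing Spark installation. A minimal sketch, assuming Spark is already installed (the Livy version and the SPARK_HOME path below are illustrative, not from the original environment):

# Download and unpack Apache Livy (the version shown is illustrative)
wget https://archive.apache.org/dist/incubator/livy/0.5.0-incubating/livy-0.5.0-incubating-bin.zip
unzip livy-0.5.0-incubating-bin.zip && cd livy-0.5.0-incubating-bin

# Point Livy at the local Spark installation, then start the server
# (the REST API listens on port 8998 by default)
export SPARK_HOME=/usr/lib/spark
./bin/livy-server start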

Using the Livy REST API

 

Creating an Interactive Session

curl -X POST -d '{"kind": "spark"}' -H "Content-Type: application/json" http://10.211.55.101:8998/sessions
{
    "id":0,
    "appId":null,
    "owner":null,
    "proxyUser":null,
    "state":"starting",
    "kind":"spark",
    "appInfo":{
        "driverLogUrl":null,
        "sparkUiUrl":null
    },
    "log":[
        "stdout: ",
        "
stderr: "
    ]
}

 

Session 0 was created successfully, with kind set to spark. If a code snippet submitted later does not specify its own kind, this session-level kind is used as the default.
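A newly created session starts out in the starting state and must reach idle before it will run statements. The state alone can be polled with the /sessions/{sessionId}/state endpoint, which here returns something like:

curl http://10.211.55.101:8998/sessions/0/state
{"id":0,"state":"idle"}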

Listing Interactive Sessions

curl http://10.211.55.101:8998/sessions
{
    "from":0,
    "total":1,
    "sessions":[
        {
            "id":0,
            "appId":null,
            "owner":null,
            "proxyUser":null,
            "state":"idle",
            "kind":"spark",
            "appInfo":{
                "driverLogUrl":null,
                "sparkUiUrl":null
            },
            "log":[
                "2018-07-18 03:19:16 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy",
                "2018-07-18 03:19:16 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, node1, 37891, None)",
                "2018-07-18 03:19:16 INFO BlockManagerMasterEndpoint:54 - Registering block manager node1:37891 with 366.3 MB RAM, BlockManagerId(driver, node1, 37891, None)",
                "2018-07-18 03:19:16 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, node1, 37891, None)",
                "2018-07-18 03:19:16 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, node1, 37891, None)",
                "2018-07-18 03:19:16 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6bc3c1d4{/metrics/json,null,AVAILABLE,@Spark}",
                "2018-07-18 03:19:17 INFO EventLoggingListener:54 - Logging events to hdfs://node1/user/spark/applicationHistory/local-1531883956147",
                "2018-07-18 03:19:17 INFO SparkEntries:53 - Spark context finished initialization in 1925ms",
                "2018-07-18 03:19:17 INFO SparkEntries:87 - Created Spark session (with Hive support).",
                "
stderr: "
            ]
        }
    ]
}

 

Submitting a Scala Code Snippet

curl -X POST -H 'Content-Type: application/json' -d '{"code":"123+321"}' http://10.211.55.101:8998/sessions/0/statements
{
    "id":0,
    "code":"123+321",
    "state":"waiting",
    "output":null,
    "progress":0
}

 

Fetching the Result

curl http://10.211.55.101:8998/sessions/0/statements/0
{
    "id":0,
    "code":"123+321",
    "state":"available",
    "output":{
        "status":"ok",
        "execution_count":0,
        "data":{
            "text/plain":"res0: Int = 444"
        }
    },
    "progress":1
}
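Multi-line Scala snippets can be submitted the same way by embedding \n in the code string; a hedged sketch against the same session (the snippet itself is hypothetical):

curl -X POST -H 'Content-Type: application/json' -d '{"code":"val x = 123\nval y = 321\nx + y"}' http://10.211.55.101:8998/sessions/0/statements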

 

Submitting a SQL Snippet

curl http://10.211.55.101:8998/sessions/0/statements -X POST -H 'Content-Type: application/json' -d '{"code":"show tables","kind":"sql"}'
{
    "id":1,
    "code":"show tables",
    "state":"waiting",
    "output":null,
    "progress":0
}

 

Fetching the Result

curl http://10.211.55.101:8998/sessions/0/statements/1
{
    "id":1,
    "code":"show tables",
    "state":"available",
    "output":{
        "status":"ok",
        "execution_count":1,
        "data":{
            "application/json":{
                "schema":{
                    "type":"struct",
                    "fields":[
                        {
                            "name":"database",
                            "type":"string",
                            "nullable":false,
                            "metadata":{

                            }
                        },
                        {
                            "name":"tableName",
                            "type":"string",
                            "nullable":false,
                            "metadata":{

                            }
                        },
                        {
                            "name":"isTemporary",
                            "type":"boolean",
                            "nullable":false,
                            "metadata":{

                            }
                        }
                    ]
                },
                "data":[
                    [
                        "default",
                        "emp",
                        false
                    ],
                    [
                        "default",
                        "emp2",
                        false
                    ],
                    [
                        "default",
                        "emp3",
                        false
                    ],
                    [
                        "default",
                        "t1",
                        false
                    ],
                    [
                        "default",
                        "t2",
                        false
                    ],
                    [
                        "default",
                        "t4",
                        false
                    ],
                    [
                        "default",
                        "tv",
                        false
                    ],
                    [
                        "default",
                        "yqu1",
                        false
                    ],
                    [
                        "default",
                        "yqu2",
                        false
                    ]
                ]
            }
        }
    },
    "progress":1
}

In my testing, submitting multiple SQL statements in a single request is not supported for now; a workaround is sketched below.
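The workaround is simply to issue one POST per statement; each gets its own statement id within the session (the two statements below are illustrative, reusing the emp table from the result above):

curl -X POST -H 'Content-Type: application/json' -d '{"code":"show databases","kind":"sql"}' http://10.211.55.101:8998/sessions/0/statements
curl -X POST -H 'Content-Type: application/json' -d '{"code":"select * from emp limit 10","kind":"sql"}' http://10.211.55.101:8998/sessions/0/statements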

Submitting a Batch Job

hadoop fs -put /home/vagrant/HelloSparkHive/build/libs/hello-spark-hive-0.1.0.jar /tmp/
curl -H "Content-Type: application/json" -X POST -d '{ "conf": {"spark.master":"local[2]"}, "file":"/tmp/hello-spark-hive-0.1.0.jar", "className":"com.yqu.sparkhive.HelloSparkHiveDriver", "args":["yqu9"] }' http://10.211.55.101:8998/batches

{
    "id":3,
    "state":"running",
    "appId":null,
    "appInfo":{
        "driverLogUrl":null,
        "sparkUiUrl":null
    },
    "log":[
        "stdout: ",
        "
stderr: "
    ]
}
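Note that the "file" value resolves against HDFS here, which is why the JAR is first uploaded with hadoop fs -put; passing a path on the Livy server's local filesystem fails by default (see below).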

 

Listing Batch Sessions

curl http://10.211.55.101:8998/batches
{
    "from":0,
    "total":2,
    "sessions":[
        {
            "id":1,
            "state":"dead",
            "appId":null,
            "appInfo":{
                "driverLogUrl":null,
                "sparkUiUrl":null
            },
            "log":[
                "       at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:692)",
                "       at org.apache.spark.deploy.DependencyUtils$.downloadFile(DependencyUtils.scala:131)",
                "       at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:401)",
                "       at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:401)",
                "       at scala.Option.map(Option.scala:146)",
                "       at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:400)",
                "       at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:170)",
                "       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)",
                "       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)",
                "
stderr: "
            ]
        },
        {
            "id":3,
            "state":"running",
            "appId":null,
            "appInfo":{
                "driverLogUrl":null,
                "sparkUiUrl":null
            },
            "log":[
                "2018-07-18 03:48:45 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!",
                "2018-07-18 03:48:45 INFO MemoryStore:54 - MemoryStore cleared",
                "2018-07-18 03:48:45 INFO BlockManager:54 - BlockManager stopped",
                "2018-07-18 03:48:45 INFO BlockManagerMaster:54 - BlockManagerMaster stopped",
                "2018-07-18 03:48:45 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!",
                "2018-07-18 03:48:45 INFO SparkContext:54 - Successfully stopped SparkContext",
                "2018-07-18 03:48:45 INFO ShutdownHookManager:54 - Shutdown hook called",
                "2018-07-18 03:48:45 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-ffa75451-735a-4edc-9572-32d1a07ba748",
                "2018-07-18 03:48:45 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-7567b293-c81e-4516-a644-531e9fc584d1",
                "
stderr: "
            ]
        }
    ]
}

 

I also tried file:///home/vagrant/HelloSparkHive/build/libs/hello-spark-hive-0.1.0.jar, but it failed with: "requirement failed: Local path /home/vagrant/HelloSparkHive/build/libs/hello-spark-hive-0.1.0.jar cannot be added to user sessions." It seems local JAR files are not supported out of the box.
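If I recall correctly, Livy rejects local paths unless their directories are explicitly whitelisted; a hedged sketch of the relevant livy.conf setting (the directory shown is this example's build path):

# conf/livy.conf — allow JARs/files to be added from this local directory
# (livy.file.local-dir-whitelist is empty by default, rejecting all local paths)
livy.file.local-dir-whitelist = /home/vagrant/HelloSparkHive/build/libs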

Querying a Specific Batch Session

curl http://10.211.55.101:8998/batches/3
{
    "id":3,
    "state":"success",
    "appId":null,
    "appInfo":{
        "driverLogUrl":null,
        "sparkUiUrl":null
    },
    "log":[
        "2018-07-18 03:48:45 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!",
        "2018-07-18 03:48:45 INFO MemoryStore:54 - MemoryStore cleared",
        "2018-07-18 03:48:45 INFO BlockManager:54 - BlockManager stopped",
        "2018-07-18 03:48:45 INFO BlockManagerMaster:54 - BlockManagerMaster stopped",
        "2018-07-18 03:48:45 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!",
        "2018-07-18 03:48:45 INFO SparkContext:54 - Successfully stopped SparkContext",
        "2018-07-18 03:48:45 INFO ShutdownHookManager:54 - Shutdown hook called",
        "2018-07-18 03:48:45 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-ffa75451-735a-4edc-9572-32d1a07ba748",
        "2018-07-18 03:48:45 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-7567b293-c81e-4516-a644-531e9fc584d1",
        "
stderr: "
    ]
}

 

Querying a Specific Batch Session's State

curl http://10.211.55.101:8998/batches/3/state
{
    "id":3,
    "state":"success"
}
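Because batches run asynchronously, a client typically polls this endpoint until the state is terminal. A minimal sketch, assuming bash and the jq tool are available (the set of terminal states is an assumption here):

# Poll batch 3 every 5 seconds until it reaches a terminal state
while true; do
  state=$(curl -s http://10.211.55.101:8998/batches/3/state | jq -r .state)
  echo "batch 3: $state"
  case "$state" in success|dead|killed) break ;; esac
  sleep 5
done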

 

Viewing the Web UI

[Screenshot: the Livy web UI] The web UI is served on the same port as the REST API (http://10.211.55.101:8998 here) and lists the interactive and batch sessions.

Deleting Sessions

curl -X DELETE http://10.211.55.101:8998/sessions/0
{
    "msg":"deleted"
}
curl -X DELETE http://10.211.55.101:8998/batches/3
{
    "msg":"deleted"
}
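After deletion, the sessions drop out of the corresponding lists, which is easy to verify:

curl http://10.211.55.101:8998/sessions
curl http://10.211.55.101:8998/batches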