Using Spark to Connect to Hive and Read HBase-Mapped Tables on Ambari

[TOC]

When using Spark SQL on Ambari to connect to Hive and read HBase-mapped tables, the configuration must be changed as follows.

Component versions

  • Spark 2.3.0
  • Hive 3.0.0
  • HBase 2.0.0
  • Ambari 2.7.1
  • HDP 3.0

Create symlinks

On every node with a Spark client, add symlinks to the HBase jars in the Spark client's jars directory:

ln -s /usr/hdp/current/hbase-client/lib/hbase-client.jar /usr/hdp/current/spark2-client/jars/hbase-client.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-protocol.jar /usr/hdp/current/spark2-client/jars/hbase-protocol.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-common.jar /usr/hdp/current/spark2-client/jars/hbase-common.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-server.jar /usr/hdp/current/spark2-client/jars/hbase-server.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-annotations.jar /usr/hdp/current/spark2-client/jars/hbase-annotations.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-backup.jar /usr/hdp/current/spark2-client/jars/hbase-backup.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-endpoint.jar /usr/hdp/current/spark2-client/jars/hbase-endpoint.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-examples.jar /usr/hdp/current/spark2-client/jars/hbase-examples.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-external-blockcache.jar /usr/hdp/current/spark2-client/jars/hbase-external-blockcache.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-hadoop-compat.jar /usr/hdp/current/spark2-client/jars/hbase-hadoop-compat.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-hadoop2-compat.jar /usr/hdp/current/spark2-client/jars/hbase-hadoop2-compat.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-http.jar /usr/hdp/current/spark2-client/jars/hbase-http.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-it.jar /usr/hdp/current/spark2-client/jars/hbase-it.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-mapreduce.jar /usr/hdp/current/spark2-client/jars/hbase-mapreduce.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-metrics-api.jar /usr/hdp/current/spark2-client/jars/hbase-metrics-api.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-metrics.jar /usr/hdp/current/spark2-client/jars/hbase-metrics.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-procedure.jar /usr/hdp/current/spark2-client/jars/hbase-procedure.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-protocol-shaded.jar /usr/hdp/current/spark2-client/jars/hbase-protocol-shaded.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-replication.jar /usr/hdp/current/spark2-client/jars/hbase-replication.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-resource-bundle.jar /usr/hdp/current/spark2-client/jars/hbase-resource-bundle.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-rest.jar /usr/hdp/current/spark2-client/jars/hbase-rest.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-rsgroup.jar /usr/hdp/current/spark2-client/jars/hbase-rsgroup.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-shaded-client.jar /usr/hdp/current/spark2-client/jars/hbase-shaded-client.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-shaded-mapreduce.jar /usr/hdp/current/spark2-client/jars/hbase-shaded-mapreduce.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-spark.jar /usr/hdp/current/spark2-client/jars/hbase-spark.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-thrift.jar /usr/hdp/current/spark2-client/jars/hbase-thrift.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-zookeeper.jar /usr/hdp/current/spark2-client/jars/hbase-zookeeper.jar
ln -s /usr/hdp/current/hbase-client/lib/hbase-shaded-netty-2.1.0.jar /usr/hdp/current/spark2-client/jars/hbase-shaded-netty-2.1.0.jar
ln -s /usr/hdp/current/hbase-client/lib/metrics-core-3.2.1.jar /usr/hdp/current/spark2-client/jars/metrics-core-3.2.1.jar
ln -s /usr/hdp/3.0.1.0-187/hbase/lib/hbase-shaded-miscellaneous-2.1.0.jar /usr/hdp/current/spark2-client/jars/hbase-shaded-miscellaneous-2.1.0.jar
ln -s /usr/hdp/3.0.1.0-187/hbase/lib/hbase-shaded-protobuf-2.1.0.jar /usr/hdp/current/spark2-client/jars/hbase-shaded-protobuf-2.1.0.jar
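
The same links can be created in one pass with a small loop. This is a convenience sketch rather than part of the original steps: it assumes the HDP layout above, links every hbase-*.jar it finds (a superset of the curated list above), and uses ln -sfn so any stale link is replaced.

# Link all HBase client jars (plus metrics-core) into the Spark2 jars directory.
SRC=/usr/hdp/current/hbase-client/lib
DST=/usr/hdp/current/spark2-client/jars
for jar in "$SRC"/hbase-*.jar "$SRC"/metrics-core-*.jar; do
  ln -sfn "$jar" "$DST/$(basename "$jar")"
done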

Add symlinks to the Hive jars:

ln -s /usr/hdp/current/hive-server2/lib/hive-hbase-handler.jar /usr/hdp/current/spark2-client/jars/hive-hbase-handler.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-accumulo-handler.jar /usr/hdp/current/spark2-client/jars/hive-accumulo-handler.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-beeline.jar /usr/hdp/current/spark2-client/jars/hive-beeline.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-classification.jar /usr/hdp/current/spark2-client/jars/hive-classification.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-cli.jar /usr/hdp/current/spark2-client/jars/hive-cli.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-common.jar /usr/hdp/current/spark2-client/jars/hive-common.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-contrib.jar /usr/hdp/current/spark2-client/jars/hive-contrib.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-druid-handler.jar /usr/hdp/current/spark2-client/jars/hive-druid-handler.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-exec.jar /usr/hdp/current/spark2-client/jars/hive-exec.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-hcatalog-core.jar /usr/hdp/current/spark2-client/jars/hive-hcatalog-core.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-hcatalog-server-extensions.jar /usr/hdp/current/spark2-client/jars/hive-hcatalog-server-extensions.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-hplsql.jar /usr/hdp/current/spark2-client/jars/hive-hplsql.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-jdbc-handler.jar /usr/hdp/current/spark2-client/jars/hive-jdbc-handler.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-jdbc.jar /usr/hdp/current/spark2-client/jars/hive-jdbc.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-kryo-registrator.jar /usr/hdp/current/spark2-client/jars/hive-kryo-registrator.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-llap-client.jar /usr/hdp/current/spark2-client/jars/hive-llap-client.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-llap-common.jar /usr/hdp/current/spark2-client/jars/hive-llap-common.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-llap-ext-client.jar /usr/hdp/current/spark2-client/jars/hive-llap-ext-client.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-llap-server.jar /usr/hdp/current/spark2-client/jars/hive-llap-server.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-llap-tez.jar /usr/hdp/current/spark2-client/jars/hive-llap-tez.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-metastore.jar /usr/hdp/current/spark2-client/jars/hive-metastore.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-pre-upgrade.jar /usr/hdp/current/spark2-client/jars/hive-pre-upgrade.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-serde.jar /usr/hdp/current/spark2-client/jars/hive-serde.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-service-rpc.jar /usr/hdp/current/spark2-client/jars/hive-service-rpc.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-service.jar /usr/hdp/current/spark2-client/jars/hive-service.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-shims-common.jar /usr/hdp/current/spark2-client/jars/hive-shims-common.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-shims-scheduler.jar /usr/hdp/current/spark2-client/jars/hive-shims-scheduler.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-shims.jar /usr/hdp/current/spark2-client/jars/hive-shims.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-standalone-metastore.jar /usr/hdp/current/spark2-client/jars/hive-standalone-metastore.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-storage-api.jar /usr/hdp/current/spark2-client/jars/hive-storage-api.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-streaming.jar /usr/hdp/current/spark2-client/jars/hive-streaming.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-testutils.jar /usr/hdp/current/spark2-client/jars/hive-testutils.jar
ln -s /usr/hdp/current/hive-server2/lib/hive-vector-code-gen.jar /usr/hdp/current/spark2-client/jars/hive-vector-code-gen.jar
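
After creating the links, it is worth checking that none of them are dangling (jar names can differ between HDP releases). With GNU find, broken symlinks in the Spark jars directory show up as:

find /usr/hdp/current/spark2-client/jars -xtype l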

Add a hive-hbase-handler symlink under $SPARK_HOME/standalone-metastore:

ln -s /usr/hdp/current/hive-server2/lib/hive-hbase-handler.jar /usr/hdp/current/spark2-client/standalone-metastore/hive-hbase-handler.jar
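
To confirm this link resolves, readlink -e prints the resolved target, and nothing at all if the link is dangling:

readlink -e /usr/hdp/current/spark2-client/standalone-metastore/hive-hbase-handler.jar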

Project POM

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>XXXXX</groupId>
    <artifactId>XXXXXXX</artifactId>
    <version>1.0</version>

    <properties>
        <spark.version>2.3.1</spark.version>
        <hive.version>3.1.0</hive.version>
        <hbase.version>2.0.0</hbase.version>
        <scala.version>2.11</scala.version>
        <zookeeper.version>3.4.13</zookeeper.version>
        <hadoop.version>3.1.0</hadoop.version>

    </properties>

    <dependencies>
        <!-- Spark support -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <!-- HBase support -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <artifactId>httpclient</artifactId>
                    <groupId>org.apache.httpcomponents</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>log4j</artifactId>
                    <groupId>log4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>${hbase.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-mapreduce</artifactId>
            <version>${hbase.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.glassfish</groupId>
                    <artifactId>javax.el</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-procedure</artifactId>
            <version>${hbase.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-protocol-shaded</artifactId>
            <version>${hbase.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase.thirdparty</groupId>
            <artifactId>hbase-shaded-miscellaneous</artifactId>
            <version>2.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase.thirdparty</groupId>
            <artifactId>hbase-shaded-netty</artifactId>
            <version>2.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hbase.thirdparty</groupId>
            <artifactId>hbase-shaded-protobuf</artifactId>
            <version>2.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.htrace</groupId>
            <artifactId>htrace-core4</artifactId>
            <version>4.2.0-incubating</version>
        </dependency>

        <!-- Hive support -->
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-hbase-handler</artifactId>
            <version>${hive.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.glassfish</groupId>
                    <artifactId>javax.el</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <!-- Logging (logback) -->
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>1.1.2</version>
        </dependency>
        <dependency>
            <groupId>com.typesafe.scala-logging</groupId>
            <artifactId>scala-logging-slf4j_2.11</artifactId>
            <version>2.1.1</version>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src/main/java</sourceDirectory>
        <!--<testSourceDirectory>src/test/java</testSourceDirectory>-->
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.5.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.3</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>com.bcht.bigdata.streaming.ApplicationRabbitMQ</mainClass>
                                </transformer>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                    <resource>reference.conf</resource>
                                </transformer>
                            </transformers>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <goals>
                            <goal>compile</goal>
                        </goals>
                        <configuration>
                            <includes>
                                <include>**/*.scala</include>
                            </includes>
                        </configuration>
                    </execution>
                    <execution>
                        <id>scala-test-compile</id>
                        <goals>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

        </plugins>
    </build>
</project>
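
With this POM, the job is built with a plain Maven package run (assuming Maven 3 and JDK 8 on the build host). Note that the shade plugin keeps the unshaded jar as original-<artifact>-<version>.jar, which is the file referenced by the spark-submit example later in this post.

mvn clean package -DskipTests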

Configure Spark

In Ambari, open Spark2 -> CONFIGS -> Advanced spark2-defaults and set spark.driver.extraLibraryPath and spark.executor.extraLibraryPath as follows:

spark.driver.extraLibraryPath={{spark_hadoop_lib_native}}:/usr/hdp/current/spark2-client/standalone-metastore/hive-hbase-handler.jar
spark.executor.extraLibraryPath={{spark_hadoop_lib_native}}:/usr/hdp/current/spark2-client/standalone-metastore/hive-hbase-handler.jar
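
Restart the affected Spark2 components when Ambari prompts for it. Before packaging a job, a quick sanity check is to query the mapped table from the Spark SQL CLI on a client node; this sketch assumes that hive_vio_violation, the example table used in the code below, already exists in Hive:

spark-sql --master yarn -e "select * from hive_vio_violation limit 10"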

Test with spark-submit

The application code (Scala) is:

import org.apache.spark.sql.SparkSession

// System.setProperty("hadoop.home.dir", "D:\\hadoop-3.1.0") // only needed for local runs on Windows
val warehouseLocation: String = "hdfs://nn1.bcht:8020/user/hive/warehouse"
val ss = SparkSession.builder().master("local[*]").appName("XXXXXXX") // a master set in code overrides spark-submit's --master
  .config("spark.sql.warehouse.dir", warehouseLocation)
  .enableHiveSupport().getOrCreate()
// Register the HBase-mapped Hive table as a temp view and query it
ss.sqlContext.table("hive_vio_violation").createOrReplaceTempView("hive_vio_violation")
ss.sqlContext.sql("select * from hive_vio_violation limit 10").show(10)

Finally, package the project, upload the jar to the cluster, and test it with spark-submit:

spark-submit --master yarn --deploy-mode cluster --files /usr/local/bcht_lhyjg/hive-site.xml --class com.bcht.bigdata.lhyjg.Application_ydfx_bak /usr/local/bcht_lhyjg/original-LHYJG-1.0.jar
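
In cluster mode the driver runs inside YARN, so its output does not appear on the local console. Once the run finishes, pull the logs with yarn logs, using the application id that spark-submit prints:

yarn logs -applicationId <application_id>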