Hive Installation and Integration with Spark

 

This post walks through the basics of using Spark SQL with Hive as the metastore, covering:

1. Hive installation

2. Integrating Spark with Hive

3. Spark SQL operations

Note: Hadoop and Spark must already be installed to follow along.

Hadoop installation is covered at: https://my.oschina.net/u/729917/blog/1556872

Spark installation is covered at: https://my.oschina.net/u/729917/blog/1556871

1. Hive installation

a) Install a MySQL database; this step is left to the reader. A minimal sketch is shown below.
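
For reference, one way to get a local MySQL server running on Ubuntu is sketched here (an assumption about the environment; adjust for your distribution, and keep note of the root password, since it goes into hive-site.xml later):

sudo apt-get install mysql-server
sudo service mysql start
mysql -u root -p -e "SELECT VERSION();"   # confirm the server is reachable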

b) Download Hive from the official mirrors: http://mirror.bit.edu.cn/apache/hive/. The author saved the archive to /home/hadoop/tools/apache-hive-2.2.0-bin.tar.gz.

c) Move the archive to the target directory and extract it. The author extracts it under /usr/local/, which is also where Hadoop and Spark are installed.

sudo mv /home/hadoop/tools/apache-hive-2.2.0-bin.tar.gz /usr/local/
cd /usr/local
sudo tar -zxvf apache-hive-2.2.0-bin.tar.gz

d) Configure the environment variables:

vim ~/.bashrc
export HIVE_HOME=/usr/local/apache-hive-2.2.0-bin
export PATH=$PATH:${HIVE_HOME}/bin

Make the environment variables take effect:

source ~/.bashrc
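
Assuming Hadoop is already installed and on the PATH, a quick way to confirm that the Hive installation is now visible is to print its version:

hive --version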

e) Create a hive-site.xml in the conf directory and configure Hive to store its metadata in MySQL.

hadoop@Master:/usr/local/apache-hive-2.2.0-bin/conf$ touch hive-site.xml

The contents of hive-site.xml:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
   <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>0000</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
        <description>
        Enforce metastore schema version consistency.
        True: Verify that version information stored in metastore matches with one from Hive jars.  Also disable automatic
              schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
              proper metastore schema migration. (Default)
        False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
        </description>
    </property>
</configuration>
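
Two steps the listing above does not cover are usually needed before Hive 2.x can use MySQL as its metastore: the MySQL JDBC driver jar must be placed in Hive's lib directory, and the metastore schema must be initialized with schematool. A rough sketch (the connector jar name and location are assumptions; use whichever mysql-connector-java jar you downloaded):

# copy the MySQL JDBC driver onto Hive's classpath
sudo cp /home/hadoop/tools/mysql-connector-java-5.1.44-bin.jar /usr/local/apache-hive-2.2.0-bin/lib/
# create the metastore tables in the MySQL database configured in hive-site.xml
schematool -dbType mysql -initSchema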

f) Start Hive: simply run the hive command.

A successful start looks like this:

hadoop@Master:~$ hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.2.0-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/apache-hive-2.2.0-bin/lib/hive-common-2.2.0.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
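
As a quick sanity check that the MySQL-backed metastore is working, a throwaway table can be created and dropped at the hive> prompt (the table name is arbitrary):

hive> CREATE TABLE test_hive (id INT, name STRING);
hive> SHOW TABLES;
hive> DROP TABLE test_hive;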

2. Integrating Spark with Hive

a) Create a hive-site.xml file in Spark's conf directory:

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/conf$ touch hive-site.xml
<configuration>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://Master:9083</value>
    </property>
</configuration>

b) Start Hadoop and Spark.

c) Start the Hive metastore service:

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/bin$ hive --service metastore&
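
Before moving on, it is worth confirming that the metastore service is listening on the port configured in Spark's hive-site.xml (9083), for example:

netstat -tlnp | grep 9083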

d) Launch spark-sql to test:

hadoop@Master:/usr/local/spark-2.2.0-bin-hadoop2.7/bin$ ./spark-sql

Partial output after a successful start:

17/11/19 21:50:37 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a
17/11/19 21:50:37 INFO SessionState: Created local directory: /tmp/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a
17/11/19 21:50:37 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/ce95f463-74ca-42de-ac85-3a283aa1520a/_tmp_space.db
17/11/19 21:50:37 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/usr/local/spark-2.2.0-bin-hadoop2.7/bin/spark-warehouse
17/11/19 21:50:37 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
17/11/19 21:50:38 INFO SessionState: Created local directory: /tmp/2110b645-b83e-4b65-87a8-5e9f1482699e_resources
17/11/19 21:50:38 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e
17/11/19 21:50:38 INFO SessionState: Created local directory: /tmp/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e
17/11/19 21:50:38 INFO SessionState: Created HDFS directory: /tmp/hive/hadoop/2110b645-b83e-4b65-87a8-5e9f1482699e/_tmp_space.db
17/11/19 21:50:38 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/usr/local/spark-2.2.0-bin-hadoop2.7/bin/spark-warehouse
spark-sql>
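
With the shared metastore in place, tables created from spark-sql are also visible from the Hive CLI and vice versa. A minimal sketch of the Spark SQL operations mentioned at the start of this post (table and column names are arbitrary):

spark-sql> SHOW DATABASES;
spark-sql> CREATE TABLE src (key INT, value STRING);
spark-sql> INSERT INTO src VALUES (1, 'one'), (2, 'two');
spark-sql> SELECT * FROM src;
spark-sql> SELECT count(*) FROM src;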