Integrating Hadoop Hive with HBase + Thrift

1. Introduction

Hive is a data warehouse tool built on top of Hadoop. It maps structured data files to database tables and provides full SQL query support by translating SQL statements into MapReduce jobs. Its main advantage is a low learning curve: simple MapReduce statistics can be produced quickly with SQL-like statements, without writing dedicated MapReduce applications, which makes it well suited to statistical analysis of data warehouses.
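For example, a simple SQL-like aggregation over the pokes test table created later in this article is compiled by Hive into a MapReduce job (a usage sketch; the column names come from the pokes example below):

hive> SELECT bar, COUNT(*) FROM pokes GROUP BY bar;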

The Hive–HBase integration is implemented through the public APIs that each system exposes; the two sides communicate mainly via the hive_hbase-handler.jar utility class, roughly as shown in the figure below:

(figure: Hive and HBase communicating through hive-hbase-handler.jar)

2. Hive Project Overview

 

 

(figure: Hive project structure)

Hive configuration files
•hive-site.xml — the main Hive configuration file
•hive-env.sh — the Hive runtime environment file
•hive-default.xml.template — default configuration template
•hive-env.sh.template — default template for hive-env.sh
•hive-exec-log4j.properties.template — default exec log4j configuration
•hive-log4j.properties.template — default log4j configuration
hive-site.xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
   <description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
   <value>test</value>
   <description>password to use against metastore database</description>
</property>
  

hive-env.sh
•Set the path to Hive's configuration directory:
•export HIVE_CONF_DIR=your path
•Set the Hadoop installation path:
•HADOOP_HOME=your hadoop home

We cover the installation separately for each way of storing the metadata.

 

 

3. Installing with the Derby Database

 

What is the Derby installation mode?
•Apache Derby is a database written entirely in Java, so it runs on any platform, but it needs a JVM
•Derby is an open source product distributed under the Apache License 2.0
•In this mode the metadata is stored in a Derby database; this is also Hive's default installation mode

 

 

1. Hadoop and HBase are already installed and working

Hadoop cluster setup: http://blog.csdn.net/hguisu/article/details/723739

HBase installation and configuration: http://blog.csdn.net/hguisu/article/details/7244413

2. Download Hive

The latest Hive release at the time of writing is 0.12. First download hive-0.12.0.tar.gz from http://mirror.bit.edu.cn/apache/hive/hive-0.12.0/. Note, however, that this release is built against Hadoop 1.x and HBase 0.94 (if you run Hadoop 2.x, the corresponding contents must be replaced).

3. Install:

tar zxvf hive-0.12.0.tar.gz 

 cd hive-0.12.0

 

4. Replace jars to match HBase 0.96 and Hadoop 2.2

   Because the Hive we downloaded targets Hadoop 1.x and HBase 0.94, while our HBase 0.96 runs on Hadoop 2.2, we must first fix Hive's Hadoop version: every Hive binary on the official site is compiled against a 1.x release, so we download the source and recompile Hive against Hadoop 2.x ourselves. The process is straightforward; just follow these steps:
 
    1. Check out the source from http://svn.apache.org/repos/asf/hive/branches/branch-0.12 or from http://svn.apache.org/repos/asf/hive/trunk into /home/hadoop/branch-0.12, as shown below.
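The checkout command would be (a sketch; assuming the svn client is installed and the target directory from the text):

    svn co http://svn.apache.org/repos/asf/hive/branches/branch-0.12 /home/hadoop/branch-0.12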
 
    2. branch-0.12 builds with Ant, while trunk builds with Maven. If Maven is not installed, download it from http://maven.apache.org/download.cgi, or install it with yum install maven. Unpack it and add $maven_home/bin to your PATH (or create a symlink: ln -s $maven_home/bin/mvn /usr/local/bin/mvn). The mvn command is then available; run mvn -v to check that Maven is configured correctly.
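A minimal sketch of the PATH setup for a tarball install (assuming Maven was unpacked to /usr/local/apache-maven; adjust the path to your installation):

    export MAVEN_HOME=/usr/local/apache-maven
    export PATH=$MAVEN_HOME/bin:$PATH
    mvn -v    # prints the Maven and Java versions if the setup is correct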
 
    3. With Maven configured, compile Hive: cd into the checked-out branch-0.12 directory and run mvn clean package -DskipTests -Phadoop-2.
 
    4. The newly built jars end up in each module's target directory, all named hive-***-0.13.0-SNAPSHOT.jar, where *** is the Hive module name. Copy them into hive-0.12.0/lib:
    find /home/hadoop/branch-0.12 -name "hive*SNAPSHOT.jar" | xargs -i cp {} /home/hadoop/hive-0.12.0/lib
    After copying, compare the jars and delete the corresponding 0.12 versions from lib.
  
    5. Next, synchronize the HBase jars: cd into hive-0.12.0/lib, delete the two jars whose names start with hbase-0.94, then copy in every jar starting with hbase from /home/hadoop/hbase-0.96.0-hadoop2/lib:
     find /home/hadoop/hbase-0.96.0-hadoop2/lib -name "hbase*.jar" | xargs -i cp {} ./
 
    6. The basic synchronization is now done. In particular, double-check that the zookeeper and protobuf jars match the ones HBase uses; if they do not,

       copy protobuf.**.jar and zookeeper-3.4.5.jar into hive/lib.

  
   7. If MySQL will serve as the metastore database,
      do not forget to copy a MySQL JDBC jar such as mysql-connector-java-3.1.12-bin.jar into hive-0.12.0/lib as well.
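For example (a sketch; assuming the connector jar was downloaded to the current directory):

    cp mysql-connector-java-3.1.12-bin.jar /home/hadoop/hive-0.12.0/lib/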
 

5. Configure Hive

•Enter the hive-0.12/conf directory
•Create hive-env.sh from hive-env.sh.template:
•cp hive-env.sh.template hive-env.sh
•Edit hive-env.sh:
•Set the path to Hive's configuration directory:
•export HIVE_CONF_DIR=/home/hadoop/hive-0.12/conf
•Set the Hadoop path:
•HADOOP_HOME=/home/hadoop/hadoop-2.2.0
 
 
 
hive-site.xml
 
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>

<!-- Hive Execution Parameters -->

<property>
  <name>hive.exec.reducers.bytes.per.reducer</name>
  <value>1000000000</value>
  <description>size per reducer.The default is 1G, i.e if the input size is 10G, it will use 10 reducers.</description>
</property>

<property>
  <name>hive.exec.reducers.max</name>
  <value>999</value>
  <description>max number of reducers will be used. If the one
        specified in the configuration parameter mapred.reduce.tasks is
        negative, hive will use this one as the max number of reducers when
        automatically determine number of reducers.</description>
</property>

<property>
  <name>hive.exec.scratchdir</name>
  <value>/hive/scratchdir</value>
  <description>Scratch space for Hive jobs</description>
</property>

<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/${user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.PersistenceManagerFactoryClass</name>
  <value>org.datanucleus.api.jdo.JDOPersistenceManagerFactory</value>
  <description>class implementing the jdo persistence</description>
</property>

<property>
  <name>javax.jdo.option.DetachAllOnCommit</name>
  <value>true</value>
  <description>detaches all objects from session so that they can be used after transaction is committed</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>APP</value>
  <description>username to use against metastore database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>mine</value>
  <description>password to use against metastore database</description>
</property>

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive/warehousedir</value>
  <description>location of default database for the warehouse</description>
</property>


<property>
 <name>hive.aux.jars.path</name>
  <value>
  file:///home/hadoop/hive-0.12.0/lib/hive-ant-0.13.0-SNAPSHOT.jar,
  file:///home/hadoop/hive-0.12.0/lib/protobuf-java-2.4.1.jar,
  file:///home/hadoop/hive-0.12.0/lib/hbase-client-0.96.0-hadoop2.jar,
  file:///home/hadoop/hive-0.12.0/lib/hbase-common-0.96.0-hadoop2.jar,
  file:///home/hadoop/hive-0.12.0/lib/zookeeper-3.4.5.jar,
  file:///home/hadoop/hive-0.12.0/lib/guava-11.0.2.jar
  </value>
</property>
 

 

Hive uses Hadoop, which means you must either have the hadoop binaries on your PATH or export HADOOP_HOME=<hadoop-install-dir>.
In addition, before creating any Hive tables you must create /tmp and /hive/warehousedir (a.k.a. hive.metastore.warehouse.dir) on HDFS and make them group-writable with chmod g+w. The commands to do this:
$ $HADOOP_HOME/bin/hadoop fs -mkdir /tmp
$ $HADOOP_HOME/bin/hadoop fs -mkdir /hive/warehousedir
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
$ $HADOOP_HOME/bin/hadoop fs -chmod g+w /hive/warehousedir
    I also found it useful, though not strictly required, to set HIVE_HOME:
$ export HIVE_HOME=<hive-install-dir>
    To use the Hive command line interface (CLI) from the shell:
$ $HIVE_HOME/bin/hive

6. Start Hive

1) Single-node start:

#bin/hive -hiveconf hbase.master=master:490001

2) Cluster start:

#bin/hive -hiveconf hbase.zookeeper.quorum=node1,node2,node3

If hive.aux.jars.path is not configured in hive-site.xml, Hive can be started as follows instead:

bin/hive --auxpath /usr/local/hive/lib/hive-hbase-handler-0.96.0.jar,/usr/local/hive/lib/hbase-0.96.jar,/usr/local/hive/lib/zookeeper-3.3.2.jar -hiveconf hbase.zookeeper.quorum=node1,node2,node3

Running plain #bin/hive also works.

7. Test Hive

•Create the test table pokes:
hive> CREATE TABLE pokes (foo INT, bar STRING);
OK
Time taken: 1.842 seconds
hive> show tables;                             
OK
pokes
Time taken: 0.182 seconds, Fetched: 1 row(s)
 
•Load data into pokes:
hive> LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;
 
Then look at the files on Hadoop:
bin/hadoop dfs -ls /hive/warehousedir
A new entry has appeared:
drwxr-xr-x   - hadoop supergroup     0  09:06 /hive/warehousedir/pokes

Note: with the Derby storage mode, running hive creates a derby file and a metastore_db directory in the current directory. The drawback of this mode is that only one Hive client at a time can use the database from a given directory; a second client fails with an error.

 

 

4. Installing with the MySQL Database

Install MySQL
• On Ubuntu, install it with apt-get:
• sudo apt-get install mysql-server
• Create the metastore database:
• create database hivemeta;
• Create the hive user and grant it privileges:
• grant all on hivemeta.* to hive@'%' identified by 'hive';
• flush privileges;

Then we only need to edit hive-site.xml.

Edit hive-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>


<property>
  <name>hive.exec.scratchdir</name>
  <value>/hive/scratchdir</value>
  <description>Scratch space for Hive jobs</description>
</property>


<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/${user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.1.214:3306/hiveMeta?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
  <description>password to use against metastore database</description>
</property>


<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive/warehousedir</value>
  <description>location of default database for the warehouse</description>
</property>

<property>
 <name>hive.aux.jars.path</name>
  <value>
  file:///home/hadoop/hive-0.12.0/lib/hive-ant-0.13.0-SNAPSHOT.jar,
  file:///home/hadoop/hive-0.12.0/lib/protobuf-java-2.4.1.jar,
  file:///home/hadoop/hive-0.12.0/lib/hbase-client-0.96.0-hadoop2.jar,
  file:///home/hadoop/hive-0.12.0/lib/hbase-common-0.96.0-hadoop2.jar,
  file:///home/hadoop/hive-0.12.0/lib/zookeeper-3.4.5.jar,
  file:///home/hadoop/hive-0.12.0/lib/guava-11.0.2.jar
  </value>
</property>
 
jdbc:mysql://192.168.1.214:3306/hiveMeta?createDatabaseIfNotExist=true
Here hiveMeta is the MySQL database name, and createDatabaseIfNotExist makes it be created automatically if it does not exist.
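Before starting Hive it is worth verifying that MySQL is reachable with the configured credentials (a sketch; host, user, and password taken from the configuration above):

    mysql -h 192.168.1.214 -u hive -phive -e 'show databases;'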
 

Starting Hive with a local MySQL:

 

Just run #bin/hive directly.

 

Starting Hive with a remote MySQL:

 

On the server side (192.168.1.214, the master):

 

     Start a MetaStoreServer on the server; clients then reach the metadata database through the MetaStoreServer over the Thrift protocol.

 
   Starting Hive here splits into starting the metastore and the hiveserver: the metastore talks to MySQL when table structures are created or updated, and the hiveserver accepts client connections. Both must be running. The commands:
Start the metastore (required when MySQL is remote):
hive --service metastore -hiveconf hbase.zookeeper.quorum=node1,node2,node3 -hiveconf hbase.zookeeper.property.clientPort=2222

Start the hiveserver (so that jdbc:hive can connect; the default port is 10000, and the trailing options are mandatory, otherwise Eclipse cannot connect):
hive --service hiveserver -hiveconf hbase.zookeeper.quorum=node1,node2,node3 -hiveconf hbase.zookeeper.property.clientPort=2222
 Once these are up we can connect from Eclipse via jdbc:hive, for example:
        Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection("jdbc:hive://server1:10000/hiveMeta", "root", "111111");
        return conn;
In practice this feels much like an ordinary database, apart from some differences in the table-creation statements.
 
Of course, you can also run, from hive-0.12.0/bin:
hive -hiveconf hive.root.logger=DEBUG,console -hiveconf hbase.zookeeper.quorum=server2,server3 -hiveconf hbase.zookeeper.property.clientPort=2222
where hbase.zookeeper.property.clientPort is the ZooKeeper port configured in hbase-site.xml.
 
The hive-site.xml configuration file for the Hive client:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 
<configuration>

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive/warehousedir</value>
</property>
 
<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://192.168.1.214:9083</value>
</property>

</configuration>
In this step we added the configuration:
<property>  
  <name>hive.metastore.uris</name>  
  <value>thrift://192.168.1.214:9083</value>  
</property>  
This configures the port used for Thrift access; thrift://192.168.1.214:9083 is the URI through which Hive's metadata is reached.
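To confirm that the metastore is actually listening on that port (a sketch):

    netstat -nl | grep 9083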

Start the Hive client and run show tables;
 
At this point you can test from any shell on Linux, or connect to Hive from Eclipse; it works just like connecting to an ordinary database over JDBC.
 
The Hive server and client can also run on the same machine:
hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>


<property>
  <name>hive.exec.scratchdir</name>
  <value>/hive/scratchdir</value>
  <description>Scratch space for Hive jobs</description>
</property>


<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/${user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.1.214:3306/hiveMeta?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>


<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
  <description>password to use against metastore database</description>
</property>


<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive/warehousedir</value>
  <description>location of default database for the warehouse</description>
</property>

<property>
 <name>hive.aux.jars.path</name>
  <value>
  file:///home/hadoop/hive-0.12.0/lib/hive-ant-0.13.0-SNAPSHOT.jar,
  file:///home/hadoop/hive-0.12.0/lib/protobuf-java-2.4.1.jar,
  file:///home/hadoop/hive-0.12.0/lib/hbase-client-0.96.0-hadoop2.jar,
  file:///home/hadoop/hive-0.12.0/lib/hbase-common-0.96.0-hadoop2.jar,
  file:///home/hadoop/hive-0.12.0/lib/zookeeper-3.4.5.jar,
  file:///home/hadoop/hive-0.12.0/lib/guava-11.0.2.jar
  </value>
</property>

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://192.168.1.214:9083</value>
</property>
</configuration>




 

5. Integrating with HBase

 

The tables we created while testing so far were local Hive tables, not HBase-backed ones. Now we return to the HBase integration.

1. Create a table that HBase can recognize:

CREATE TABLE hbase_table_1(key int, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");  

hbase.table.name sets the table name on the HBase side.

hbase.columns.mapping defines the mapping to HBase column families, as sketched below.
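As a slightly larger sketch (hypothetical table and column names), each Hive column after the key maps to one family:qualifier entry in hbase.columns.mapping:

CREATE TABLE hbase_table_multi(key int, name string, age int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:age")
TBLPROPERTIES ("hbase.table.name" = "multi_xyz");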

 

The table is also visible from HBase, and data added on either side shows up on the other in real time.

You can log into HBase to inspect the data:
#bin/hbase shell
hbase(main):001:0> describe 'xyz'  
hbase(main):002:0> scan 'xyz'  
hbase(main):003:0> put 'xyz','100','cf1:val','www.360buy.com'

At this point the row just inserted from HBase is visible in Hive.

2. Load data with SQL

 
An HBase-backed table cannot be loaded with LOAD DATA like a local table; it must be populated from an existing table:
INSERT OVERWRITE TABLE hbase_table_1 SELECT * FROM pokes;
Note that the column types of the two tables must match, otherwise a statement like insert overwrite table hivetest select * from table_hive; will not load any data.
 

Loading hbase_table_1 with SQL:

hive> INSERT OVERWRITE TABLE hbase_table_1 SELECT * FROM pokes WHERE foo=86; 

 

3. Accessing an existing HBase table from Hive

Use CREATE EXTERNAL TABLE:

CREATE EXTERNAL TABLE hbase_table_2(key int, value string)      
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "cf1:val")
TBLPROPERTIES("hbase.table.name" = "some_existing_table");

Reference: http://wiki.apache.org/hadoop/Hive/HBaseIntegration

 

6. Problems

 

bin/hive fails on show tables with:

Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

If you installed with Derby, check whether

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive/warehousedir</value>
  <description>location of default database for the warehouse</description>
</property>

is configured correctly, or whether

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

is accessible with the right permissions.

If you configured the MySQL metastore, check the permissions by running:

 bin/hive  -hiveconf hive.root.logger=DEBUG,console  

and show tables will then surface errors like java.sql.SQLException: Access denied for user 'hive'@'××××8' (using password: YES).

 

 

Running:

CREATE TABLE hbase_table_1(key int, value string)  
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'  
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")  
TBLPROPERTIES ("hbase.table.name" = "xyz");

fails with:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:org.apache.hadoop.hbase.MasterNotRunningException: Retried 10 times

This error is caused by a conflict between the HBase jars we copied in and the HBase jar that ships with Hive: delete hbase-0.94.×××.jar from hive/lib and the error goes away.

Also move the hive-0.12**.jar packages out of the way, as sketched below.
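For example (a sketch; the exact jar names and glob patterns depend on your build):

    cd /home/hadoop/hive-0.12.0/lib
    rm hbase-0.94*.jar           # remove the bundled HBase 0.94 jars
    mv hive-*-0.12.0.jar /tmp/   # move remaining 0.12 jars aside, keeping the rebuilt SNAPSHOT jars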

 

Running:

hive> select uid from user limit 100;

java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.

 

Fix: edit the $HIVE_HOME/conf/hive-env.sh file and add:

export HADOOP_HOME=<your hadoop install directory>

 


 

7. Accessing Hive through Thrift (with PHP as the client)

 

Connecting to Hive from PHP and running SQL queries

 

Prerequisites for connecting to Hive from PHP:

 

 

1. Download Thrift

 

wget http://mirror.bjtu.edu.cn/apache//thrift/0.9.1/thrift-0.9.1.tar.gz

2. Unpack

tar -xzf thrift-0.9.1.tar.gz

3. Build and install:

If you build from a source checkout, you must first run ./bootstrap.sh to generate the ./configure script; the tarball we downloaded already includes a configure file. (See the README:)

If you are building from the first time out of the source repository, you will
need to generate the configure scripts.  (This is not necessary if you
downloaded a tarball.)  From the top directory, do:
./bootstrap.sh

./configure

Install Thrift as follows:

#  ./configure --without-ruby

(--without-ruby skips the Ruby bindings)

 

make ; make install

If libevent and libevent-devel are not installed, install those two dependency libraries first: yum -y install libevent libevent-devel

Thrift itself is really a tool for generating client- and server-side code; that capability is not used here.

 

Once installed, start the Hive Thrift service:

# ./hive --service hiveserver >/dev/null 2>/dev/null &

Check whether hiveserver's default port 10000 is open; if it is, the startup succeeded. The official wiki has an article on this: https://cwiki.apache.org/confluence/display/Hive/HiveServer
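For example (a sketch):

    netstat -nl | grep 10000    # a LISTEN entry here means hiveserver is up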

 

Thrift Hive Server

 

HiveServer is an optional service that allows a remote client to submit requests to Hive, using a variety of programming languages, and retrieve results. HiveServer is built on Apache Thrift (http://thrift.apache.org/), therefore it is sometimes called the Thrift server, although this can lead to confusion because a newer service named HiveServer2 is also built on Thrift.

Thrift's interface definition language (IDL) file for HiveServer is hive_service.thrift, which is installed in $HIVE_HOME/service/if/.

WARNING!


HiveServer cannot handle concurrent requests from more than one client. This is actually a limitation imposed by the Thrift interface that HiveServer exports, and can't be resolved by modifying the HiveServer code.
HiveServer2 is a rewrite of HiveServer that addresses these problems, starting with Hive 0.11.0. See HIVE-2935.

Once Hive has been built using steps in Getting Started, the Thrift server can be started by running the following:

0.8 and Later
$ build/dist/bin/hive --service hiveserver --help
usage: hiveserver
  -h,--help                        Print help information
     --hiveconf <property=value>   Use value for given property
     --maxWorkerThreads <arg>      maximum number of worker threads,
                                   default:2147483647
     --minWorkerThreads <arg>      minimum number of worker threads,
                                   default:100
  -p <port>                        Hive Server port number, default:10000
  -v,--verbose                     Verbose mode

$ bin/hive --service hiveserver

 

 

Download the PHP client package:

The PHP lib bundled with hive-0.12 turned out, in my tests, to contain PHP syntax errors; the namespace name is actually empty.

I have uploaded a working PHP client package: http://download.csdn.net/detail/hguisu/6913673 (original source: http://download.csdn.net/detail/jiedushi/3409880)

 

PHP client code for connecting to Hive

 

<?php
// path to the Thrift dependency packages for connecting to Hive from PHP
ini_set('display_errors', 1);
error_reporting(E_ALL);
$GLOBALS['THRIFT_ROOT'] = dirname(__FILE__). "/";
// load the required files for connecting to Hive
require_once $GLOBALS['THRIFT_ROOT'] . 'packages/hive_service/ThriftHive.php';
require_once $GLOBALS['THRIFT_ROOT'] . 'transport/TSocket.php';
require_once $GLOBALS['THRIFT_ROOT'] . 'protocol/TBinaryProtocol.php';
// Set up the transport/protocol/client
$transport = new TSocket('192.168.1.214', 10000);
$protocol = new TBinaryProtocol($transport);

//$protocol = new TBinaryProtocolAccelerated($transport);

$client = new ThriftHiveClient($protocol);
$transport->open();

// run queries, metadata calls etc

$client->execute('show tables');
var_dump($client->fetchAll());
$transport->close();

?>

 

 

Open http://localhost/Thrift/test.php in a browser to see the query results.
