Once Hive is installed, if you only need to use it locally, start the metastore service:
nohup hive --service metastore &
[hadoop@master1 usr]$ hive
Logging initialized using configuration in file:/data/usr/hive/conf/hive-log4j.properties
hive> use fmcm;
OK
Time taken: 0.874 seconds
If Hive needs to be called from scripts, start HiveServer2 and make sure port 10000 is being listened on (the port can be changed in hive-site.xml):
nohup hive --service hiveserver2 &
[hadoop@master1 usr]$ netstat -an|grep 10000
tcp        0      0 0.0.0.0:10000        0.0.0.0:*        LISTEN
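The same port check can also be done from a script before it tries to connect. The following is a minimal Python 3 sketch using only the standard library; the function name is_port_open and the timeout value are illustrative choices, not part of Hive.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection performs the full TCP handshake, so success
        # means some process is accepting connections on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, is_port_open("localhost", 10000) should return True once HiveServer2 is up on the default port. Note that this only shows that something is listening there, not that it is HiveServer2.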
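HiveServer2 gives clients an interface for running Hive queries remotely. It is implemented over Thrift RPC and also provides multi-user concurrency and authentication. From Python, the pyhs2 module can currently be used to connect to HiveServer2, run queries, and fetch the results.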
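However, pyhs2 is no longer maintained. If you want something current, consider these two good Python packages instead (pyhs2 has been shown to have performance bottlenecks, so it is best to switch to PyHive as soon as possible):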
https://github.com/dropbox/PyHive
https://github.com/cloudera/impyla
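As a rough idea of what the switch to PyHive looks like, here is a minimal Python 3 sketch. It assumes PyHive is installed and that HiveServer2 accepts its default SASL authentication with a username only. The helper names open_hive_connection and fetch_rows are hypothetical, introduced just for this illustration.

```python
def open_hive_connection(host, username, database="default", port=10000):
    """Open a HiveServer2 connection through PyHive (PyHive must be installed)."""
    # Imported lazily so that fetch_rows below stays usable without PyHive.
    from pyhive import hive
    return hive.connect(host=host, port=port, username=username, database=database)


def fetch_rows(conn, sql):
    """Run one statement on a DB-API 2.0 connection and return all rows."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        # Release the server-side operation even if the query fails.
        cursor.close()
```

With the host and database used elsewhere in this post, usage would look like conn = open_hive_connection("10.24.33.3", "hadoop", database="fmcm"), followed by fetch_rows(conn, "select * from fm_news_newsaction limit 10") and conn.close(). Because fetch_rows relies only on the generic DB-API cursor methods, it does not depend on which driver opened the connection.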
If installing sasl fails, install these packages first:
yum install gcc-c++ python-devel.x86_64 cyrus-sasl-devel.x86_64
The pyhs2 project is hosted on GitHub at https://github.com/BradRuderman/pyhs2, or it can be downloaded directly from https://pypi.python.org/pypi/pyhs2/0.2
If the installation does not succeed, try installing the following components first:
yum install cyrus-sasl-plain
yum install cyrus-sasl-devel
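After installing the system packages and retrying the Python install, a quick way to see which Python modules are still missing is to ask the interpreter directly. This is a small Python 3 sketch; the function name missing_modules is hypothetical, and the list of module names to check (for example sasl, thrift and pyhs2) is an assumption about what your setup needs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    # find_spec returns None when no importable module with that name exists.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling missing_modules(["sasl", "thrift", "pyhs2"]) returns an empty list when all three can be imported.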
If you hit this error during installation:
error: sasl/sasl.h: No such file or directory
try installing sasl first. On Ubuntu, use sudo apt-get install libsasl2-dev. On CentOS, install it with Anaconda's pip, or build it with the following steps:
curl -O -L ftp://ftp.cyrusimap.org/cyrus-sasl/cyrus-sasl-2.1.26.tar.gz
tar xzf cyrus-sasl-2.1.26.tar.gz
cd cyrus-sasl-2.1.26
./configure && make install

Finally, here is the test code:
# -*- coding:utf-8 -*-
'''
Connect to the database through Hive and Thrift
'''
import pyhs2
import sys

# Python 2 only: make utf8 the default string encoding
reload(sys)
sys.setdefaultencoding('utf8')


class HiveClient:
    def __init__(self, db_host, user, password, database, port=10000, authMechanism="PLAIN"):
        self.conn = pyhs2.connect(host=db_host,
                                  port=port,
                                  authMechanism=authMechanism,
                                  user=user,
                                  password=password,
                                  database=database,
                                  )

    def query(self, sql):
        # Run one statement and return all fetched rows
        with self.conn.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetch()

    def close(self):
        self.conn.close()


def main():
    """
    main process
    @rtype:
    @return:
    @note:
    """
    hive_client = HiveClient(db_host='10.24.33.3', port=10000, user='hadoop', password='hadoop',
                             database='fmcm', authMechanism='PLAIN')
    result = hive_client.query('select * from fm_news_newsaction limit 10')
    print(result)
    hive_client.close()


if __name__ == '__main__':
    main()