Operating Hive with Python

Preface

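HiveServer2 provides an interface that lets clients run Hive queries remotely. It is implemented over Thrift RPC and also supports concurrent multi-user access and authentication. Python users can currently connect to HiveServer2 through the pyhs2 module to run queries and fetch the results. The Python client in this article is built on pyhs2.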

Installing the Python modules

  1. Install pip: https://pip.pypa.io/en/stable/installing/
  2. Install the dependencies
    • yum install cyrus-sasl-plain
    • yum install cyrus-sasl-devel
    • yum install python-devel
  3. pip install pyhs2
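
After the steps above, a quick import check can confirm that the modules are visible to the interpreter. This is a minimal sketch and not part of the original article: `missing_modules` is a hypothetical helper, and the module names `sasl` and `thrift` are an assumption about what pyhs2 needs at runtime, so adjust the list for your environment.

```python
def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == '__main__':
    # 'sasl' and 'thrift' are assumed dependencies of pyhs2 (not stated in the article)
    missing = missing_modules(['pyhs2', 'sasl', 'thrift'])
    if missing:
        print('missing modules: %s' % ', '.join(missing))
    else:
        print('all modules importable')
```

If any name is reported as missing, rerunning the corresponding install step is the first thing to try.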

Python client code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# hive util with hive server2
"""
@author: wyf
@create: 2016-06-29 16:55
"""
__author__ = 'wyf'
__version__ = '0.1'

import pyhs2
import sys

default_encoding = 'utf-8'
if sys.getdefaultencoding() != default_encoding:
    reload(sys)
    sys.setdefaultencoding(default_encoding)

class HiveClient:
    def __init__(self, db_host, user, password, database, port=10000, authMechanism="PLAIN"):
        """
        create connection to hive server2
        """
        self.conn = pyhs2.connect(host=db_host,
                                  port=port,
                                  authMechanism=authMechanism,
                                  user=user,
                                  password=password,
                                  database=database,
                                  )

    def query(self, sql):
        """
        execute sql and return the fetched rows
        """
        with self.conn.cursor() as cursor:
            cursor.execute(sql)
            return cursor.fetch()

    def close(self):
        """
        close connection
        """
        self.conn.close()


def main():
    """
    main process
    """
    try:
        hive_client = HiveClient(db_host='192.168.1.13', port=10000, user='hive', password='hive',
                                 database='default', authMechanism='PLAIN')

        sql = 'select * from record limit 10'  # example SQL statement
        result = hive_client.query(sql)
        hive_client.close()
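    except Exception as e:  # catch broadly: connection and query failures may raise different types
        print('%s' % e)
        sys.exit(1)
    # NOTE: writeXlwt is not defined in this listing; it must be supplied separately
    writeXlwt('test.xls', result)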

if __name__ == '__main__':  
    main()
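
The listing calls `writeXlwt`, which it never defines. As a stand-in, the sketch below writes query rows to a CSV file with the standard library instead of producing an .xls file. It is not the author's helper: `write_rows_csv` and its parameters are hypothetical names, it assumes the query result is a list of row sequences, and it is written for Python 3 (under Python 2 the `csv` module expects the file to be opened in binary mode instead).

```python
import csv


def write_rows_csv(path, rows, header=None):
    """Write rows (an iterable of sequences) to a CSV file.

    Returns the number of data rows written (the header is not counted).
    """
    count = 0
    # newline='' lets the csv module control line endings (Python 3)
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        for row in rows:
            writer.writerow(row)
            count += 1
    return count
```

With this in place, a call such as `write_rows_csv('test.csv', result)` could take the place of the `writeXlwt` call in `main()`.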

Result of operating Hive with Python
