HBase Python API

Through Thrift, HBase can be programmed from many languages. Requests travel over a network port, so Python is a good choice for a client.

A quick gripe

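I first tried to set up HBase on my Mac, but ZooKeeper kept throwing errors, whereas on an Ubuntu virtual machine the whole thing was done in 10 minutes... Writing Java code inside a VM without an IDE is still inconvenient, though, so the idea of connecting from the Mac host to the VM came up naturally. That way I can happily keep using the IDE on the host~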

1. Start the HBase Thrift RPC service on the server

There are many ways to start HBase, so I won't go over them here. Once HBase is running on Ubuntu, start the Thrift server:

hbase-daemon.sh start thrift

The default service port is 9090.
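
Before writing any client code, it can help to confirm that the Thrift port is reachable from the client machine. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper name, not part of HBase or Thrift, and it only checks that something accepts TCP connections on the port, not that it speaks Thrift.

```python
import socket


def is_port_open(host, port=9090, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out
        return False
    sock.close()
    return True
```

Calling `is_port_open("10.211.55.7")` with the VM address used later in this post should return True once the Thrift server is up. If it returns False, the VM's network mode or firewall is the first thing to check.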

2. Install the client dependencies

sudo pip install thrift
sudo pip install hbase-thrift
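
To check that both packages landed in the interpreter you actually run, a short import probe is enough. This is only a sketch: `missing_modules` is a hypothetical helper, and the module names `thrift` and `hbase` are taken from the import statements in the client code in this post.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except (ImportError, SyntaxError):
            # SyntaxError covers packages written for a different Python version
            missing.append(name)
    return missing
```

Running `missing_modules(["thrift", "hbase"])` should give an empty list when the installation succeeded.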

3. Write the client code

# coding=utf-8
from thrift.transport import TSocket
from thrift.transport.TTransport import TBufferedTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import ColumnDescriptor
from hbase.ttypes import Mutation


class HBaseClient(object):
    def __init__(self, ip, port=9090):
        """
        Open a connection to the Thrift server.
        """
        # Server address and port, wrapped in a buffered transport
        self.__transport = TBufferedTransport(TSocket.TSocket(ip, port))
        # Wire protocol
        protocol = TBinaryProtocol.TBinaryProtocol(self.__transport)
        # Client stub
        self.__client = Hbase.Client(protocol)
        # Open the connection
        self.__transport.open()

    def __del__(self):
        self.__transport.close()

    def get_tables(self):
        """
        List all tables.
        :return: list of table names
        """
        return self.__client.getTableNames()

    def create_table(self, table, *columns):
        """
        Create a table.
        :param table: table name
        :param columns: column family names
        """
        # Build a real list (not a lazy map object) so Thrift can serialize it
        column_families = [ColumnDescriptor(col) for col in columns]
        self.__client.createTable(table, column_families)

    def put(self, table, row, columns):
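        """
        Write cells to a row.
        :param table: table name
        :param row: row key
        :param columns: dict mapping "family:qualifier" column names to values
        """
        # A comprehension avoids the Python 2-only `lambda (k, v)` syntax
        mutations = [Mutation(column=k, value=v) for k, v in columns.items()]
        self.__client.mutateRow(table, row, mutations)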

    def delete(self, table, row, column):
        """
        Delete a cell.
        :param table: table name
        :param row: row key
        :param column: column name ("family:qualifier")
        """
        self.__client.deleteAll(table, row, column)

    def scan(self, table, start_row="", columns=None):
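        """
        Scan rows, yielding one dict per row.
        :param table: table name
        :param start_row: row key to start from
        :param columns: columns or column families to return
        """
        scanner = self.__client.scannerOpen(table, start_row, columns)
        try:
            while True:
                r = self.__client.scannerGet(scanner)
                if not r:
                    break
                yield dict((k, v.value) for k, v in r[0].columns.items())
        finally:
            # Release the server-side scanner even if iteration stops early
            self.__client.scannerClose(scanner)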


if __name__ == '__main__':
    client = HBaseClient("10.211.55.7")

    # client.create_table('student', 'name', 'course')
    client.put("student", "1",
               {"name:": "Jack",
                "course:art": "88",
                "course:math": "12"})

    client.put("student", "2",
               {"name:": "Tom", "course:art": "90",
                "course:math": "100"})

    client.put("student", "3",
               {"name:": "Jerry"})
    client.delete('student', '1', 'course:math')
    for v in client.scan('student'):
        print(v)
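
The `put` calls above address every cell with a flat "family:qualifier" key, including the empty qualifier in "name:". When the data starts out grouped by column family, a small conversion step keeps the call sites readable. This is a sketch only; `flatten_columns` is a hypothetical helper, not part of the hbase-thrift package.

```python
def flatten_columns(nested):
    """Turn {family: {qualifier: value}} into {"family:qualifier": value}."""
    flat = {}
    for family, cells in nested.items():
        for qualifier, value in cells.items():
            flat["%s:%s" % (family, qualifier)] = value
    return flat


# Same cells as the first put() call in the example above
row = flatten_columns({"name": {"": "Jack"},
                       "course": {"art": "88", "math": "12"}})
```

The resulting `row` dict can be passed directly as the `columns` argument of `put`.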

4. Test results

{'course:art': '88', 'name:': 'Jack'}
{'course:art': '90', 'name:': 'Tom', 'course:math': '100'}
{'name:': 'Jerry'}
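
Each scanned row comes back as a flat dict keyed by "family:qualifier". If nested access per column family is more convenient, the keys can be split again. The sketch below assumes the keys are plain strings, as in the output shown here; `group_by_family` is a hypothetical helper name.

```python
def group_by_family(row):
    """Turn {"family:qualifier": value} into {family: {qualifier: value}}."""
    grouped = {}
    for column, value in row.items():
        # Split on the first colon only; a qualifier may itself contain colons
        family, _, qualifier = column.partition(":")
        grouped.setdefault(family, {})[qualifier] = value
    return grouped
```

A column written as "name:" ends up under the empty-string qualifier of the "name" family.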

5. Summary

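With a Python interface, writing simple task scripts becomes very convenient. Much of the credit goes to the RPC mechanism, which cleanly decouples the client from the server and makes it easier for developers to work together.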
