Hadoop + HBase Distributed Cluster Architecture: The Complete Guide

This article is part of the series "Linux Operations: Enterprise Architecture in Practice".

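Preface: this post is my own write-up, put together after hitting countless pitfalls, repeatedly consulting references, and building the cluster step by step. I am sharing it in the hope that it is useful to you.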

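1. Getting to Know Hadoop and HBase

1.1 A Brief Introduction to Hadoop

  Hadoop is an open-source Apache framework written in Java. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models. Applications built on the Hadoop framework run in an environment that provides distributed storage and computation across the cluster. Hadoop is designed to scale from a single server to thousands of machines, each offering local computation and storage.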

 

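1.2 Hadoop Architecture

The Hadoop framework consists of the following four modules:

  •  Hadoop Common: the Java libraries and utilities required by the other Hadoop modules. These libraries provide file system and operating system level abstractions and contain the Java files and scripts needed to start Hadoop.
  •  Hadoop YARN: a framework for job scheduling and cluster resource management.
  •  Hadoop Distributed File System (HDFS): a distributed file system that provides high-throughput access to application data.
  •  Hadoop MapReduce: a YARN-based system for parallel processing of large data sets.

The following diagram can be used to describe these four components of the Hadoop framework.

  Since 2012, the term "Hadoop" usually refers not only to the base modules above but also to the additional software packages that can be installed on top of or alongside Hadoop, such as Apache Pig, Apache Hive, Apache HBase, and Apache Spark.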

 

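1.3 How Does Hadoop Work?

(1) Stage 1

  A user or application submits a job to Hadoop (through the Hadoop job client) by specifying the following items:

  •  The locations of the input and output files in the distributed file system.
  •  The Java classes, packaged as a jar file, that implement the map and reduce functions.
  •  The job configuration, set through job-specific parameters.

(2) Stage 2

  The Hadoop job client then submits the job (jar, executable, etc.) and its configuration to the JobTracker. The JobTracker is responsible for distributing the software and configuration to the slave nodes, scheduling and monitoring the tasks, and providing status and diagnostic information to the job client.

(3) Stage 3

  The TaskTrackers on the various nodes execute the tasks according to the MapReduce implementation and store the output of the reduce function in output files on the file system. (JobTracker and TaskTracker are the classic MRv1 names; under YARN their work is done by the ResourceManager, per-job ApplicationMasters, and the NodeManagers.)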
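
The map, shuffle, and reduce steps described above can be imitated on a single machine with an ordinary shell pipeline. This is only a sketch for building intuition, not how Hadoop actually runs a job; the function name wordcount is made up for this example.

```shell
# Single-machine imitation of a MapReduce word count.
# wordcount is a hypothetical helper name, not a Hadoop command.
#   map:     tr/grep emit one word (key) per line
#   shuffle: sort brings identical keys together
#   reduce:  uniq -c counts each group; awk prints "word<TAB>count"
wordcount() {
    tr -s '[:space:]' '\n' |
        grep -v '^$' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}
```

For example, `printf 'to be or not to be\n' | wordcount` prints one line per distinct word. In a real cluster the map and reduce steps run as separate tasks on different nodes, and the shuffle moves data between them over the network.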

 

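1.4 Advantages of Hadoop

  •  The Hadoop framework lets users write and test distributed systems quickly. It is efficient: it automatically distributes data and work across the machines and, in turn, exploits the underlying parallelism of the CPU cores.
  •  Hadoop does not rely on hardware to provide fault tolerance and high availability (FTHA); instead, the Hadoop library itself is designed to detect and handle failures at the application layer.
  •  Servers can be added to or removed from the cluster dynamically, and Hadoop keeps running without interruption.
  •  Another major advantage of Hadoop is that, besides being open source, it is compatible with all platforms because it is Java-based.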

 

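1.5 Introduction to HBase

  HBase stands for Hadoop Database, that is, the database of Hadoop. It is a distributed storage system. HBase uses Hadoop's HDFS as its file storage system, uses Hadoop's MapReduce to process the massive data held in HBase, and uses ZooKeeper as its coordination tool.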

 

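1.6 HBase Architecture

Client

  •  Contains the interfaces for accessing HBase and maintains a cache to speed up access to HBase.

ZooKeeper

  •  Ensures that there is only one active master in the cluster at any time.
  •  Stores the addressing entry point for all regions.
  •  Monitors region servers coming online and going offline in real time, and notifies the Master.
  •  Stores the HBase schema and table metadata.

Master

  •  Assigns regions to region servers.
  •  Balances the load across region servers.
  •  Detects failed region servers and reassigns their regions.
  •  Manages users' create, delete, and alter operations on tables.

RegionServer

  •  A region server maintains regions and handles the I/O requests to those regions.
  •  A region server splits regions that grow too large at runtime.

HLog (WAL log)

  •  An HLog file is an ordinary Hadoop SequenceFile. The key of the SequenceFile is an HLogKey object, which records where the written data belongs: besides the table and region names, it also contains a sequence number and a timestamp. The timestamp is the write time; the sequence number starts at 0, or at the most recent sequence number persisted to the file system.
  •  The value of the HLog SequenceFile is an HBase KeyValue object, which corresponds to the KeyValue in an HFile.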

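Region

  •  HBase automatically partitions a table horizontally into multiple regions, each holding a contiguous range of the table's rows. Every table starts with a single region; as data is inserted the region grows, and once it reaches a threshold it splits into two new regions of equal size.
  •  As the number of rows in a table grows, there are more and more regions, so a complete table ends up stored across multiple region servers.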
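
The growth-and-split behaviour described above can be illustrated with a toy calculation. This is a deliberately simplified sketch: it assumes data is spread evenly, so all regions split at the same time, whereas a real cluster splits each region individually. The name region_count and the size units are assumptions made for this example.

```shell
# Toy model: a table starts as one region, and the regions are halved
# whenever the average region size exceeds the split threshold.
# region_count is a hypothetical helper; sizes are in arbitrary units.
region_count() (
    total=$1
    threshold=$2   # must be a positive integer
    regions=1
    while [ "$total" -gt $((threshold * regions)) ]; do
        regions=$((regions * 2))
    done
    echo "$regions"
)
```

With a threshold of 30 units, `region_count 100 30` reports 4 regions: 100 splits into 2 x 50, which is still too large, and then into 4 x 25.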

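MemStore and StoreFile

  •  A region consists of multiple stores; each store corresponds to one column family (CF).
  •  A store comprises a MemStore in memory and StoreFiles on disk. Writes go to the MemStore first; when the data in the MemStore reaches a threshold, the HRegionServer starts a flush that writes it out to a StoreFile, and each flush produces a separate StoreFile.
  •  When the number of StoreFiles grows past a threshold, the system compacts them (minor and major compaction). During compaction, versions are merged and deletes are applied (major compaction), producing larger StoreFiles.
  •  When the total size of all StoreFiles in a region exceeds a threshold, the region is split in two, and the HMaster assigns the new regions to region servers to balance the load.
  •  When a client reads data, it looks in the MemStore first and falls back to the StoreFiles if the data is not there.
  •  An HRegion is the smallest unit of distributed storage and load balancing in HBase. "Smallest unit" means that different HRegions can be placed on different HRegion servers.
  •  An HRegion consists of one or more stores; each store holds one column family.
  •  Each store in turn consists of one MemStore and zero or more StoreFiles.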

 

2. Installing Hadoop

2.1 Configuration Overview

The cluster is built from three machines, as follows:

Hostname   IP               Roles
hadoop01   192.168.10.101   DataNode, NodeManager, ResourceManager, NameNode
hadoop02   192.168.10.102   DataNode, NodeManager, SecondaryNameNode
hadoop03   192.168.10.103   DataNode, NodeManager

 

2.2 Pre-installation Preparation

2.2.1 Machine Configuration

$ cat /etc/redhat-release 
CentOS Linux release 7.3.1611 (Core)   
$ uname -r
3.10.0-514.el7.x86_64

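Note: all processes in this cluster are started by the along user, and the steps below must be performed on every server in the cluster.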

 

2.2.2 Disable SELinux and the Firewall

[along@hadoop01 ~]$ sestatus 
SELinux status:                 disabled
[root@hadoop01 ~]$ iptables -F
[along@hadoop01 ~]$ systemctl status firewalld.service 
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

  

2.2.3 Prepare the User

$ id along
uid=1000(along) gid=1000(along) groups=1000(along)

  

2.2.4 Edit the hosts File for Name Resolution

$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.10.101 hadoop01
192.168.10.102 hadoop02
192.168.10.103 hadoop03
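
Since every later step depends on these names resolving, it can be worth checking the file on each node before moving on. The following is a minimal sketch; check_hosts is a hypothetical helper name, and it only looks at the hosts file itself (it does not query DNS).

```shell
# check_hosts FILE HOST... : report every HOST that has no entry in FILE.
# Hypothetical helper; exits non-zero if at least one host is missing.
check_hosts() (
    file=$1
    shift
    missing=0
    for host in "$@"; do
        # A hosts line is "IP name [alias...]"; skip comment lines.
        if ! awk -v h="$host" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
            END { exit (found ? 0 : 1) }' "$file"; then
            echo "missing: $host"
            missing=1
        fi
    done
    exit "$missing"
)
```

On each node this could be run as `check_hosts /etc/hosts hadoop01 hadoop02 hadoop03`; it prints nothing when all three names are present.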

  

2.2.5 Synchronize Time

$ yum -y install ntpdate
$ sudo ntpdate cn.pool.ntp.org

  

2.2.6 Configure SSH Mutual Trust

1) Generate a key pair; just press Enter at every prompt

[along@hadoop01 ~]$ ssh-keygen 

2) Make sure every server has the public keys of the others

--- as the along user
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub 127.0.0.1
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop01
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop02
[along@hadoop01 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop03
--- as the root user
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub 127.0.0.1
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop01
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop02
[root@hadoop01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop03

Note: perform these steps on every server in the cluster.

3) Verify passwordless login

[along@hadoop02 ~]$ ssh along@hadoop01
[along@hadoop02 ~]$ ssh along@hadoop02
[along@hadoop02 ~]$ ssh along@hadoop03
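
Because the same ssh-copy-id command is repeated for every host and on every server, it may be convenient to generate the commands in a loop. The sketch below only prints them, so it is safe to try; copy_id_commands is a hypothetical helper name, and the key path is the default one used above.

```shell
# copy_id_commands HOST... : print one ssh-copy-id command per host.
# Printing instead of executing keeps the sketch harmless;
# pipe the output to "sh" to actually run the commands.
copy_id_commands() {
    for host in "$@"; do
        printf 'ssh-copy-id -i ~/.ssh/id_rsa.pub %s\n' "$host"
    done
}
```

For example, `copy_id_commands 127.0.0.1 hadoop01 hadoop02 hadoop03 | sh` would run the four commands listed in step 2 for the current user.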

  

2.3 Configure the JDK

Perform these steps on all three machines.

[root@hadoop01 ~]# tar -xvf jdk-8u201-linux-x64.tar.gz -C /usr/local
[root@hadoop01 ~]# chown along.along -R /usr/local/jdk1.8.0_201/
[root@hadoop01 ~]# ln -s /usr/local/jdk1.8.0_201/ /usr/local/jdk
[root@hadoop01 ~]# cat /etc/profile.d/jdk.sh
export JAVA_HOME=/usr/local/jdk
PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
[root@hadoop01 ~]# source /etc/profile.d/jdk.sh
[along@hadoop01 ~]$ java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)

  

2.4 Install Hadoop

[root@hadoop01 ~]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
[root@hadoop01 ~]# tar -xvf hadoop-3.2.0.tar.gz -C /usr/local/
[root@hadoop01 ~]# chown along.along -R /usr/local/hadoop-3.2.0/
[root@hadoop01 ~]# ln -s /usr/local/hadoop-3.2.0/  /usr/local/hadoop

  

3. Configuring and Starting Hadoop

3.1 hadoop-env.sh: Hadoop Environment Variables

[along@hadoop01 ~]$ cd /usr/local/hadoop/etc/hadoop/
[along@hadoop01 hadoop]$ vim hadoop-env.sh
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

  

3.2 core-site.xml: HDFS Configuration

[along@hadoop01 hadoop]$ vim core-site.xml
<configuration>
    <!-- Default HDFS (NameNode) address -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop01:9000</value>
    </property>
    <!-- Storage path for the files Hadoop generates at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop/tmp</value>
    </property>
</configuration>
[root@hadoop01 ~]# mkdir /data/hadoop
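
After editing a site file it is easy to leave a typo in a property name or value. The following sketch reads a value back out of the file so it can be compared with what was intended. It assumes the layout used in this article (each `<name>` and `<value>` on its own line) and is not a general XML parser; get_prop is a hypothetical helper name.

```shell
# get_prop FILE NAME : print the <value> of the property called NAME.
# Assumes <name> comes before <value> inside each <property>, as in the
# files shown in this article. Prints nothing if NAME is not found.
get_prop() {
    awk -v want="$2" '
        /<name>/ {
            line = $0
            sub(/.*<name>/, "", line); sub(/<\/name>.*/, "", line)
            hit = (line == want)
        }
        hit && /<value>/ {
            line = $0
            sub(/.*<value>/, "", line); sub(/<\/value>.*/, "", line)
            print line
            exit
        }' "$1"
}
```

Usage: `get_prop /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS`. Once Hadoop is installed, `hdfs getconf -confKey fs.defaultFS` reports the value the cluster actually resolves, which is the more authoritative check.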

  

3.3 hdfs-site.xml: NameNode Configuration

[along@hadoop01 hadoop]$ vim hdfs-site.xml 
<configuration>
    <!-- HTTP address of the NameNode -->
    <property>
        <name>dfs.namenode.http-address</name>
        <value>hadoop01:50070</value>
    </property>

    <!-- HTTP address of the SecondaryNameNode -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop02:50090</value>
    </property>

    <!-- Storage path for NameNode metadata -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/data/hadoop/name</value>
    </property>

    <!-- Number of HDFS replicas -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- Storage path for DataNode data -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/data/hadoop/datanode</value>
    </property>

    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>
[root@hadoop01 ~]# mkdir /data/hadoop/name -p
[root@hadoop01 ~]# mkdir /data/hadoop/datanode -p

  

3.4 mapred-site.xml: MapReduce Framework Configuration

[along@hadoop01 hadoop]$ vim mapred-site.xml
<configuration>
    <!-- Tell the MapReduce framework to run on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>
        /usr/local/hadoop/etc/hadoop,
        /usr/local/hadoop/share/hadoop/common/*,
        /usr/local/hadoop/share/hadoop/common/lib/*,
        /usr/local/hadoop/share/hadoop/hdfs/*,
        /usr/local/hadoop/share/hadoop/hdfs/lib/*,
        /usr/local/hadoop/share/hadoop/mapreduce/*,
        /usr/local/hadoop/share/hadoop/mapreduce/lib/*,
        /usr/local/hadoop/share/hadoop/yarn/*,
        /usr/local/hadoop/share/hadoop/yarn/lib/*
        </value>
    </property>
</configuration>

  

3.5 yarn-site.xml: ResourceManager Configuration

[along@hadoop01 hadoop]$ vim yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop01</value>
    </property>

    <property>
        <description>The http address of the RM web application.</description>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>${yarn.resourcemanager.hostname}:8088</value>
    </property>

    <property>
        <description>The address of the applications manager interface in the RM.</description>
        <name>yarn.resourcemanager.address</name>
        <value>${yarn.resourcemanager.hostname}:8032</value>
    </property>

    <property>
        <description>The address of the scheduler interface.</description>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>${yarn.resourcemanager.hostname}:8030</value>
    </property>

    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>${yarn.resourcemanager.hostname}:8031</value>
    </property>

    <property>
        <description>The address of the RM admin interface.</description>
        <name>yarn.resourcemanager.admin.address</name>
        <value>${yarn.resourcemanager.hostname}:8033</value>
    </property>
</configuration>

  

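3.6 Configure masters & slaves

[along@hadoop01 hadoop]$ echo 'hadoop02' >> /usr/local/hadoop/etc/hadoop/masters
[along@hadoop01 hadoop]$ echo 'hadoop03 hadoop01'  >> /usr/local/hadoop/etc/hadoop/slaves

Note: Hadoop 3.x reads the worker host list from etc/hadoop/workers; slaves is the file name used by Hadoop 2.x.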

  

3.7 Pre-start Preparation

3.7.1 Prepare the Startup Scripts

The startup scripts are all located in /usr/local/hadoop/sbin:

1) Add the following to start-dfs.sh and stop-dfs.sh:

[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/start-dfs.sh 
[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/stop-dfs.sh
HDFS_DATANODE_USER=along
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=along
HDFS_SECONDARYNAMENODE_USER=along

2) Add the following to start-yarn.sh and stop-yarn.sh:

[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/start-yarn.sh 
[along@hadoop01 ~]$ vim /usr/local/hadoop/sbin/stop-yarn.sh
YARN_RESOURCEMANAGER_USER=along
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=along

  

3.7.2 Set Ownership

[root@hadoop01 ~]# chown -R along.along /usr/local/hadoop-3.2.0/
[root@hadoop01 ~]# chown -R along.along /data/hadoop/

  

3.7.3 Configure Environment Variables for the hadoop Commands

[root@hadoop01 ~]# vim /etc/profile.d/hadoop.sh 
[root@hadoop01 ~]# cat /etc/profile.d/hadoop.sh
export HADOOP_HOME=/usr/local/hadoop
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

  

3.7.4 Cluster Initialization

[root@hadoop01 ~]# vim /data/hadoop/rsync.sh
# Create the required directories on every machine in the cluster
for i in hadoop02 hadoop03
    do 
         sudo rsync -a /data/hadoop $i:/data/
done 

# Copy the Hadoop configuration to the other machines
for i in hadoop02 hadoop03
    do 
         sudo rsync -a  /usr/local/hadoop-3.2.0/etc/hadoop $i:/usr/local/hadoop-3.2.0/etc/
done 
[root@hadoop01 ~]# /data/hadoop/rsync.sh

  

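3.8 Start the Hadoop Cluster

3.8.1 Format Before the First Start (run here on every server in the cluster)

Note: strictly speaking, only the NameNode host (hadoop01) needs hdfs namenode -format; the DataNodes hold no NameNode metadata.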

[along@hadoop01 ~]$ hdfs namenode -format
... ...
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop01/192.168.10.101
************************************************************/
[along@hadoop02 ~]$ hdfs namenode -format
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop02/192.168.10.102
************************************************************/
[along@hadoop03 ~]$ hdfs namenode -format
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop03/192.168.10.103
************************************************************/

  

3.8.2 Start and Verify the Cluster

1) Start the NameNode and DataNodes

[along@hadoop01 ~]$ start-dfs.sh 
[along@hadoop02 ~]$ start-dfs.sh 
[along@hadoop03 ~]$ start-dfs.sh 
[along@hadoop01 ~]$ jps
4480 DataNode
4727 Jps
4367 NameNode
[along@hadoop02 ~]$ jps
4082 Jps
3958 SecondaryNameNode
3789 DataNode
[along@hadoop03 ~]$ jps
2689 Jps
2475 DataNode

2) Start YARN

[along@hadoop01 ~]$ start-yarn.sh
[along@hadoop02 ~]$ start-yarn.sh
[along@hadoop03 ~]$ start-yarn.sh
[along@hadoop01 ~]$ jps
4480 DataNode
4950 NodeManager
5447 NameNode
5561 Jps
4842 ResourceManager
[along@hadoop02 ~]$ jps
3958 SecondaryNameNode
4503 Jps
3789 DataNode
4367 NodeManager
[along@hadoop03 ~]$ jps
12353 Jps
12226 NodeManager
2475 DataNode
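
Reading jps listings by eye on three machines gets tedious. The sketch below checks a jps listing against the daemons a node is expected to run, according to the role table in section 2.1. has_daemons is a hypothetical helper name; it only inspects the text it is given on standard input.

```shell
# has_daemons NAME... : read `jps` output on stdin and succeed only if
# every NAME appears as a process name (the second column).
has_daemons() (
    listing=$(cat)
    for name in "$@"; do
        if ! printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { ok = 1 } END { exit (ok ? 0 : 1) }'; then
            echo "not running: $name"
            exit 1
        fi
    done
)
```

On hadoop01 this could be used as `jps | has_daemons NameNode DataNode ResourceManager NodeManager`, and on hadoop02 as `jps | has_daemons SecondaryNameNode DataNode NodeManager`.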

  

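3.9 Cluster Started Successfully

1) Open http://hadoop01:8088 in a browser

This is the ResourceManager management UI, which shows the three Active Nodes in the cluster.

2) Open http://hadoop01:50070/dfshealth.html#tab-datanode in a browser

This is the NameNode management page.

The Hadoop cluster is now fully set up!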

 

4. Installing and Configuring HBase

4.1 Install HBase

[root@hadoop01 ~]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/1.4.9/hbase-1.4.9-bin.tar.gz
[root@hadoop01 ~]# tar -xvf hbase-1.4.9-bin.tar.gz -C /usr/local/
[root@hadoop01 ~]# chown -R along.along /usr/local/hbase-1.4.9/
[root@hadoop01 ~]# ln -s /usr/local/hbase-1.4.9/ /usr/local/hbase

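Note: as of this writing (2019.03.08) I could not get hbase-2.1 to work. It may well have been my own configuration, but HBase failed to start, so I downgraded to hbase-1.4.9.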

 

4.2 Configure HBase

4.2.1 hbase-env.sh: HBase Environment Variables

[root@hadoop01 ~]# cd /usr/local/hbase/conf/
[root@hadoop01 conf]# vim hbase-env.sh
export JAVA_HOME=/usr/local/jdk
export HBASE_CLASSPATH=/usr/local/hbase/conf

  

4.2.2 hbase-site.xml: HBase Configuration

[root@hadoop01 conf]# vim hbase-site.xml
<configuration>
<property>
    <name>hbase.rootdir</name>
    <!-- Directory where HBase stores its data -->
    <value>hdfs://hadoop01:9000/hbase/hbase_db</value>
    <!-- The port must match the one in Hadoop's fs.defaultFS -->
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <!-- Whether this is a distributed deployment -->
    <value>true</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <!-- Nodes that run ZooKeeper; use an odd number of them -->
    <value>hadoop01,hadoop02,hadoop03</value>
</property>
<property>
    <!-- Where ZooKeeper stores its configuration, logs, etc.; the directory must already exist -->
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/data/hbase/zookeeper</value>
</property>
<property>
    <!--hbase master -->
    <name>hbase.master</name>
    <value>hadoop01</value>
</property>
<property>
    <!-- HBase web UI port -->
    <name>hbase.master.info.port</name>
    <value>16666</value>
</property>
</configuration>

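 Note: ZooKeeper has the following property:

  •  The ensemble as a whole stays available as long as more than half of its machines are working normally.
  •  So with 2 ZooKeeper nodes, if 1 dies the ensemble is unusable, because the 1 remaining node is not more than half; the failure tolerance of 2 nodes is therefore 0.
  •  Likewise, with 3 ZooKeeper nodes, if 1 dies there are still 2 healthy ones, which is more than half; the failure tolerance of 3 nodes is therefore 1.
  •  A few more: 2->0; 3->1; 4->1; 5->2; 6->2. The pattern is that 2n and 2n-1 nodes have the same tolerance, namely n-1, so for efficiency there is no point in adding that one extra ZooKeeper node.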
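
The majority rule above reduces to one line of integer arithmetic: with N nodes, the ensemble survives as long as fewer than half have failed, so the tolerance is (N - 1) / 2 rounded down. A minimal sketch, with zk_tolerance as a hypothetical helper name:

```shell
# zk_tolerance N : how many of N ZooKeeper nodes may fail while a
# strict majority (more than half) is still alive.
zk_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}
```

Running `for n in 2 3 4 5 6; do echo "$n -> $(zk_tolerance "$n")"; done` reproduces the list given above, which is why ensembles are normally sized at 3, 5, or 7 nodes.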

 

4.2.3 Specify the Cluster Nodes

[root@hadoop01 conf]# vim regionservers
hadoop01
hadoop02
hadoop03

  

5. Starting the HBase Cluster

5.1 Configure Environment Variables for the hbase Commands

[root@hadoop01 ~]# vim /etc/profile.d/hbase.sh
export HBASE_HOME=/usr/local/hbase
PATH=$HBASE_HOME/bin:$PATH

  

5.2 Pre-start Preparation

[root@hadoop01 ~]# mkdir -p /data/hbase/zookeeper
[root@hadoop01 ~]# vim /data/hbase/rsync.sh 
# Create the required directories on every machine in the cluster
for i in hadoop02 hadoop03
    do 
         sudo rsync -a /data/hbase $i:/data/
         sudo scp -p /etc/profile.d/hbase.sh $i:/etc/profile.d/
done 

# Copy the HBase installation (the 1.4.9 version installed in section 4.1) to the other machines
for i in hadoop02 hadoop03
    do 
         sudo rsync -a  /usr/local/hbase-1.4.9 $i:/usr/local/
done
[root@hadoop01 conf]# chown -R along.along /data/hbase
[root@hadoop01 ~]# /data/hbase/rsync.sh
hbase.sh                                                        100%   62     0.1KB/s   00:00    
hbase.sh                                                        100%   62     0.1KB/s   00:00    

  

5.3 Start HBase

Note: this only needs to be done on the hadoop01 server.

1) Start

[along@hadoop01 ~]$ start-hbase.sh 
hadoop03: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop03.out
hadoop01: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop01.out
hadoop02: running zookeeper, logging to /usr/local/hbase/logs/hbase-along-zookeeper-hadoop02.out
... ...

2) Verify

--- HBase master node
[along@hadoop01 ~]$ jps
4480 DataNode
23411 HQuorumPeer		# ZooKeeper process
4950 NodeManager
24102 Jps
5447 NameNode
23544 HMaster			# HBase master process
4842 ResourceManager
23711 HRegionServer
--- the 2 slave nodes
[along@hadoop02 ~]$ jps
12948 HRegionServer		# HBase slave (region server) process
3958 SecondaryNameNode
13209 Jps
12794 HQuorumPeer		# ZooKeeper process
3789 DataNode
4367 NodeManager
[along@hadoop03 ~]$ jps
12226 NodeManager
19559 Jps
19336 HRegionServer		# HBase slave (region server) process
19178 HQuorumPeer		# ZooKeeper process
2475 DataNode

  

5.4 Check HBase Status in the Web UI

Open http://hadoop01:16666 in a browser

 

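6. Basic HBase Operations

6.1 Basic hbase shell Commands

Operation                                 Command
Create a table                            create 'table', 'column_family1', 'column_family2', ...
Add a record                              put 'table', 'row', 'column_family:qualifier', 'value'
View a record                             get 'table', 'row'
Count the records in a table              count 'table'
Delete a record                           delete 'table', 'row', 'column_family:qualifier'
Drop a table                              (1) disable 'table'  (2) drop 'table'
View all records                          scan 'table'
View all data in one column of a table    scan 'table', {COLUMNS => 'column_family:qualifier'}
Update a record                           Write it again with put to overwrite the old value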

 

6.2 General Operations

1) Start the HBase client

[along@hadoop01 ~]$ hbase shell    # this takes a little while
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hbase-1.4.9/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 1.4.9, rd625b212e46d01cb17db9ac2e9e927fdb201afa1, Wed Dec  5 11:54:10 PST 2018

hbase(main):001:0> 

  

2) Query the cluster status

hbase(main):001:0> status 
1 active master, 0 backup masters, 3 servers, 0 dead, 0.6667 average load

  

3) Query the HBase version

hbase(main):002:0> version
1.4.9, rd625b212e46d01cb17db9ac2e9e927fdb201afa1, Wed Dec  5 11:54:10 PST 2018

  

6.3 DDL Operations

1) Create a demo table with two column families, id and info

hbase(main):001:0> create 'demo','id','info'
0 row(s) in 23.2010 seconds

=> Hbase::Table - demo

  

2) Get the table description

hbase(main):002:0> list
TABLE                                                                                             
demo                                                                                              
1 row(s) in 0.6380 seconds

=> ["demo"]
--- get the detailed description
hbase(main):003:0> describe 'demo'
Table demo is ENABLED                                                                             
demo                                                                                              
COLUMN FAMILIES DESCRIPTION                                                                       
{NAME => 'id', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 
'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '
0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                         
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS =
> 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS =>
 '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                       
2 row(s) in 0.3500 seconds

  

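3) Delete a column family

Note: schema-level deletions (removing a column family or dropping a table) are done with the table disabled first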

hbase(main):004:0> disable 'demo'
0 row(s) in 2.5930 seconds

hbase(main):006:0> alter 'demo',{NAME=>'info',METHOD=>'delete'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 4.3410 seconds

hbase(main):007:0> describe 'demo'
Table demo is DISABLED                                                                              
demo                                                                                                
COLUMN FAMILIES DESCRIPTION                                                                         
{NAME => 'id', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'F
ALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', 
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                               
1 row(s) in 0.1510 seconds

  

4) Drop a table

Disable the table first, then drop it.

hbase(main):008:0> list
TABLE                                                                                               
demo                                                                                                
1 row(s) in 0.1010 seconds

=> ["demo"]
hbase(main):009:0> disable 'demo'
0 row(s) in 0.0480 seconds

hbase(main):010:0> is_disabled 'demo'   # check whether the table is disabled
true                                                                                                
0 row(s) in 0.0210 seconds

hbase(main):013:0> drop 'demo'
0 row(s) in 2.3270 seconds

hbase(main):014:0> list   # the table has been dropped
TABLE                                                                                               
0 row(s) in 0.0250 seconds

=> []
hbase(main):015:0> is_enabled 'demo'   # check whether the demo table still exists

ERROR: Unknown table demo!

  

6.4 DML Operations

1) Insert data

hbase(main):024:0> create 'demo','id','info'
0 row(s) in 10.0720 seconds

=> Hbase::Table - demo
hbase(main):025:0> is_enabled 'demo'
true                                                                                                
0 row(s) in 0.1930 seconds

hbase(main):030:0> put 'demo','example','id:name','along'
0 row(s) in 0.0180 seconds

hbase(main):039:0> put 'demo','example','id:sex','male'
0 row(s) in 0.0860 seconds

hbase(main):040:0> put 'demo','example','id:age','24'
0 row(s) in 0.0120 seconds

hbase(main):041:0> put 'demo','example','id:company','taobao'
0 row(s) in 0.3840 seconds

hbase(main):042:0> put 'demo','taobao','info:addres','china'
0 row(s) in 0.1910 seconds

hbase(main):043:0> put 'demo','taobao','info:company','alibaba'
0 row(s) in 0.0300 seconds

hbase(main):044:0> put 'demo','taobao','info:boss','mayun'
0 row(s) in 0.1260 seconds
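
Typing one put per cell quickly becomes tedious. As a rough sketch, put statements like the ones above can be generated from a tab-separated file and then fed to the HBase shell. puts_from_tsv and rows.tsv are hypothetical names, and the sketch does not escape single quotes inside values.

```shell
# puts_from_tsv TABLE : read "row<TAB>column<TAB>value" lines on stdin
# and print one HBase shell put statement per line.
# Hypothetical helper; lines without exactly three fields are skipped.
puts_from_tsv() {
    awk -F '\t' -v t="$1" -v q="'" 'NF == 3 {
        print "put " q t q "," q $1 q "," q $2 q "," q $3 q
    }'
}
```

A possible use is `puts_from_tsv demo < rows.tsv | hbase shell`, since the shell also reads commands from standard input. For large volumes, HBase's bulk-loading tools are a better fit than generated put statements.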

  

2) Read data from the demo table

hbase(main):045:0> get 'demo','example'
COLUMN                     CELL                                                                     
 id:age                    timestamp=1552030411620, value=24                                        
 id:company                timestamp=1552030467196, value=taobao                                    
 id:name                   timestamp=1552030380723, value=along                                     
 id:sex                    timestamp=1552030392249, value=male                                      
1 row(s) in 0.8850 seconds

hbase(main):046:0> get 'demo','taobao'
COLUMN                     CELL                                                                     
 info:addres               timestamp=1552030496973, value=china                                     
 info:boss                 timestamp=1552030532254, value=mayun                                     
 info:company              timestamp=1552030520028, value=alibaba                                   
1 row(s) in 0.2500 seconds

hbase(main):047:0> get 'demo','example','id'
COLUMN                     CELL                                                                     
 id:age                    timestamp=1552030411620, value=24                                        
 id:company                timestamp=1552030467196, value=taobao                                    
 id:name                   timestamp=1552030380723, value=along                                     
 id:sex                    timestamp=1552030392249, value=male                                      
1 row(s) in 0.3150 seconds

hbase(main):048:0> get 'demo','example','info'
COLUMN                     CELL                                                                     
0 row(s) in 0.0200 seconds

hbase(main):049:0> get 'demo','taobao','id'
COLUMN                     CELL                                                                     
0 row(s) in 0.0410 seconds

hbase(main):053:0> get 'demo','taobao','info'
COLUMN                     CELL                                                                     
 info:addres               timestamp=1552030496973, value=china                                     
 info:boss                 timestamp=1552030532254, value=mayun                                     
 info:company              timestamp=1552030520028, value=alibaba                                   
1 row(s) in 0.0240 seconds

hbase(main):055:0> get 'demo','taobao','info:boss'
COLUMN                     CELL                                                                     
 info:boss                 timestamp=1552030532254, value=mayun                                     
1 row(s) in 0.1810 seconds

  

3) Update a record

hbase(main):056:0> put 'demo','example','id:age','88'
0 row(s) in 0.1730 seconds

hbase(main):057:0> get 'demo','example','id:age'
COLUMN                     CELL                                                                     
 id:age                    timestamp=1552030841823, value=88                                        
1 row(s) in 0.1430 seconds

  

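4) Read data by timestamp

You have probably noticed the timestamp field in the output above; a specific version of a cell can be requested by its timestamp.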

hbase(main):059:0> get 'demo','example',{COLUMN=>'id:age',TIMESTAMP=>1552030841823}
COLUMN                     CELL                                                                     
 id:age                    timestamp=1552030841823, value=88                                        
1 row(s) in 0.0200 seconds

hbase(main):060:0> get 'demo','example',{COLUMN=>'id:age',TIMESTAMP=>1552030411620}
COLUMN                     CELL                                                                     
 id:age                    timestamp=1552030411620, value=24                                        
1 row(s) in 0.0930 seconds

  

5) Scan the whole table

hbase(main):061:0> scan 'demo'
ROW                        COLUMN+CELL                                                              
 example                   column=id:age, timestamp=1552030841823, value=88                         
 example                   column=id:company, timestamp=1552030467196, value=taobao                 
 example                   column=id:name, timestamp=1552030380723, value=along                     
 example                   column=id:sex, timestamp=1552030392249, value=male                       
 taobao                    column=info:addres, timestamp=1552030496973, value=china                 
 taobao                    column=info:boss, timestamp=1552030532254, value=mayun                   
 taobao                    column=info:company, timestamp=1552030520028, value=alibaba              
2 row(s) in 0.3880 seconds

  

6) Delete the 'id:age' column of the row example

hbase(main):062:0> delete 'demo','example','id:age'
0 row(s) in 1.1360 seconds

hbase(main):063:0> get 'demo','example'
COLUMN                     CELL                                                                                                           
 id:company                timestamp=1552030467196, value=taobao                                    
 id:name                   timestamp=1552030380723, value=along                                     
 id:sex                    timestamp=1552030392249, value=male

  

7) Delete an entire row

hbase(main):070:0> deleteall 'demo','taobao'
0 row(s) in 1.8140 seconds

hbase(main):071:0> get 'demo','taobao'
COLUMN                     CELL                                                                     
0 row(s) in 0.2200 seconds

  

8) Add an 'id:age' column back to the row example and increment it as a counter

hbase(main):072:0> incr 'demo','example','id:age'
COUNTER VALUE = 1
0 row(s) in 3.2200 seconds

hbase(main):073:0> get 'demo','example','id:age'
COLUMN                     CELL                                                                     
 id:age                    timestamp=1552031388997, value=\x00\x00\x00\x00\x00\x00\x00\x01          
1 row(s) in 0.0280 seconds

hbase(main):074:0> incr 'demo','example','id:age'
COUNTER VALUE = 2
0 row(s) in 0.0340 seconds

hbase(main):075:0> incr 'demo','example','id:age'
COUNTER VALUE = 3
0 row(s) in 0.0420 seconds

hbase(main):076:0> get 'demo','example','id:age'
COLUMN                     CELL                                                                     
 id:age                    timestamp=1552031429912, value=\x00\x00\x00\x00\x00\x00\x00\x03          
1 row(s) in 0.0690 seconds

hbase(main):077:0> get_counter 'demo','example','id:age'   # get the current counter value
COUNTER VALUE = 3

  

9) Truncate the whole table

hbase(main):078:0> truncate 'demo'
Truncating 'demo' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 33.0820 seconds

As the output shows, HBase truncates a table by first disabling it, then dropping it, and finally recreating it.
