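Overview
As in an RDBMS, HBase has the concept of a namespace for table creation: you can create a table in a specified namespace, or create it without one, in which case it goes into the default namespace.
For data manipulation, HBase supports four main operations:
Put: add a row or update a row;
Delete: delete a row, delete a given column family, delete multiple versions of a given column, delete a specific version of a given column, and so on;
Get: fetch a single row;
Scan: fetch multiple rows.
All four are classes in the org.apache.hadoop.hbase.client package; see the official API documentation for details. This article summarizes only the commonly used methods, aiming to let readers master 80% of the everyday functionality in 20% of the time.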
1. Namespace
2. Creating a table
3. Deleting a table
4. Modifying a table
5. Inserting and updating data: Put
6. Deleting data: Delete
7. Fetching a single row: Get
8. Fetching multiple rows: Scan
1. Namespace
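As in relational database systems, a namespace is a logical grouping of tables; tables in the same group serve similar purposes. The namespace concept lays the groundwork for the upcoming multi-tenancy features.
1.1. Namespace management
Namespaces can be created, removed, and altered.
A table's namespace membership is determined when the table is created, using the following format:
<namespace>:<table>
Example: creating a namespace, creating a table in that namespace, dropping the namespace, and altering the namespace in the hbase shell
#Create a namespace
create_namespace 'my_ns'
#create my_table in my_ns namespace
create 'my_ns:my_table', 'fam'
#drop namespace
drop_namespace 'my_ns'
#alter namespace
alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
1.2. Predefined namespaces
There are two predefined, built-in namespaces: hbase, the system namespace that holds HBase's internal tables, and default, which holds tables created without an explicit namespace.
Example: an explicit namespace versus the default namespace
#namespace=foo and table qualifier=bar
create 'foo:bar', 'fam'
#namespace=default and table qualifier=bar
create 'bar', 'fam'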
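The naming rule can be sketched in plain Java. The split helper below is hypothetical and exists only for illustration (in real client code, TableName.valueOf does the parsing); it assumes a name without a colon belongs to the default namespace.

```java
public class QualifiedName {
    // Hypothetical helper for illustration only; not part of the HBase client API.
    // Returns {namespace, tableQualifier}.
    public static String[] split(String name) {
        int idx = name.indexOf(':');
        if (idx < 0) {
            // No namespace given: the table lands in "default".
            return new String[] {"default", name};
        }
        return new String[] {name.substring(0, idx), name.substring(idx + 1)};
    }

    public static void main(String[] args) {
        String[] a = split("foo:bar");
        String[] b = split("bar");
        System.out.println(a[0] + " / " + a[1]); // foo / bar
        System.out.println(b[0] + " / " + b[1]); // default / bar
    }
}
```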
2. Creating a table
Without further ado, here is the boilerplate code; notes and key points follow it:
Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
//create namespace named "my_ns"
admin.createNamespace(NamespaceDescriptor.create("my_ns").build());
//create tableDesc, with namespace name "my_ns" and table name "mytable"
HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf("my_ns:mytable"));
tableDesc.setDurability(Durability.SYNC_WAL);
//add a column family "mycf"
HColumnDescriptor hcd = new HColumnDescriptor("mycf");
tableDesc.addFamily(hcd);
admin.createTable(tableDesc);
admin.close();
Key points:
3. Deleting a table
Deleting a table involves far less than creating one, so straight to the code:
Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
String tablename = "my_ns:mytable";
if (admin.tableExists(tablename)) {
    try {
        admin.disableTable(tablename);
        admin.deleteTable(tablename);
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }
}
admin.close();
Note: a table must be disabled before it can be deleted.
4. Modifying a table
4.1. Example code
(1) Removing and adding column families
Before the change, there are four column families:
hbase(main):014:0> describe 'rd_ns:itable'
DESCRIPTION ENABLED
'rd_ns:itable', {NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', V true
ERSIONS => '10', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'newcf', DATA_BLOCK_ENCODING => 'NONE
', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '10', TTL => '2147483647',
MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'tr
ue'}, {NAME => 'note', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS =>
'10', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE
=> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'sysinfo', DATA_BLOCK_ENCODING => 'NONE', BLOOM
FILTER => 'ROW', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '10', TTL => '2147483647', MIN_VERS
IONS => '0', KEEP_DELETED_CELLS => 'true', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.0450 seconds
The following code modifies the table, removing three column families and adding one:
Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
String tablename = "rd_ns:itable";
if (admin.tableExists(tablename)) {
    try {
        admin.disableTable(tablename);
        //get the TableDescriptor of target table
        HTableDescriptor newtd = admin.getTableDescriptor(Bytes.toBytes("rd_ns:itable"));
        //remove 3 useless column families
        newtd.removeFamily(Bytes.toBytes("note"));
        newtd.removeFamily(Bytes.toBytes("newcf"));
        newtd.removeFamily(Bytes.toBytes("sysinfo"));
        //create HColumnDescriptor for new column family
        HColumnDescriptor newhcd = new HColumnDescriptor("action_log");
        newhcd.setMaxVersions(10);
        newhcd.setKeepDeletedCells(true);
        //add the new column family(HColumnDescriptor) to HTableDescriptor
        newtd.addFamily(newhcd);
        //modify target table structure
        admin.modifyTable(Bytes.toBytes("rd_ns:itable"), newtd);
        admin.enableTable(tablename);
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }
}
admin.close();
After the change:
hbase(main):015:0> describe 'rd_ns:itable'
DESCRIPTION ENABLED
'rd_ns:itable', {NAME => 'action_log', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => true
'0', COMPRESSION => 'NONE', VERSIONS => '10', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'tr
ue', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'info', DATA_BLOCK_ENCODING => '
NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '10', COMPRESSION => 'NONE', MIN_VERSIONS => '
0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE =>
'true'}
1 row(s) in 0.0400 seconds
The logic is simple:
(2) Changing an attribute of an existing column family (setMaxVersions)
Configuration conf = HBaseConfiguration.create();
HBaseAdmin admin = new HBaseAdmin(conf);
String tablename = "rd_ns:itable";
if (admin.tableExists(tablename)) {
    try {
        admin.disableTable(tablename);
        //get the TableDescriptor of target table
        HTableDescriptor htd = admin.getTableDescriptor(Bytes.toBytes("rd_ns:itable"));
        HColumnDescriptor infocf = htd.getFamily(Bytes.toBytes("info"));
        infocf.setMaxVersions(100);
        //modify target table structure
        admin.modifyTable(Bytes.toBytes("rd_ns:itable"), htd);
        admin.enableTable(tablename);
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }
}
admin.close();
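5. Inserting and updating data: Put
5.1. Common constructors:
(1) Specify the row key
public Put(byte[] row)
Parameter: row, the row key
(2) Specify the row key and a timestamp
public Put(byte[] row, long ts)
Parameters: row, the row key; ts, the timestamp
(3) Use a slice of a byte array as the row key
Put(byte[] rowArray, int rowOffset, int rowLength)
(4) Use a slice of a byte array as the row key, plus a timestamp
Put(byte[] rowArray, int rowOffset, int rowLength, long ts)
5.2. Common methods:
(1) Add a value for the given column family and qualifier
add(byte[] family, byte[] qualifier, byte[] value)
(2) Add a value for the given column family, qualifier, and timestamp
add(byte[] family, byte[] qualifier, long ts, byte[] value)
(3) Set the WAL (Write-Ahead Log) durability level
public void setDurability(Durability d)
The parameter is an enum with the following choices:
ASYNC_WAL: write the WAL asynchronously when data changes
SYNC_WAL: write the WAL synchronously when data changes
FSYNC_WAL: write the WAL synchronously when data changes, and force the data to disk
SKIP_WAL: do not write the WAL
5.3. Example code
(1) Inserting a row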
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Put put = new Put(Bytes.toBytes("100001"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("lion"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("address"), Bytes.toBytes("shangdi"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("30"));
put.setDurability(Durability.SYNC_WAL);
table.put(put);
table.close();
(2) Updating a row
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Put put = new Put(Bytes.toBytes("100001"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("lee"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("address"), Bytes.toBytes("longze"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("31"));
put.setDurability(Durability.SYNC_WAL);
table.put(put);
table.close();
Every Put constructor requires a row key. If the row key is new, a row is inserted; if it already exists, the existing row is updated.
(3) Building a Put whose row key is a slice of a byte array
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Put put = new Put(Bytes.toBytes("100001_100002"), 7, 6);
put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("show"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("address"), Bytes.toBytes("caofang"));
put.add(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes("30"));
table.put(put);
table.close();
Note on Put put = new Put(Bytes.toBytes("100001_100002"), 7, 6): the second argument is the offset and the third is the length, so the row key is the 6 bytes starting at offset 7, that is, "100002".
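The offset and length arguments of this constructor select a slice of the byte array. A minimal sketch in plain Java shows which bytes end up as the row key; the slice helper is hypothetical and only mimics the (rowArray, rowOffset, rowLength) convention, it is not HBase code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RowKeySlice {
    // Hypothetical helper: copies rowLength bytes starting at rowOffset.
    public static byte[] slice(byte[] rowArray, int rowOffset, int rowLength) {
        return Arrays.copyOfRange(rowArray, rowOffset, rowOffset + rowLength);
    }

    public static void main(String[] args) {
        byte[] source = "100001_100002".getBytes(StandardCharsets.UTF_8);
        // Offset 7 skips "100001_"; length 6 takes the six bytes that follow.
        byte[] rowKey = slice(source, 7, 6);
        System.out.println(new String(rowKey, StandardCharsets.UTF_8)); // prints 100002
    }
}
```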
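6. Deleting data: Delete
The Delete class removes data from a row of a table; the operation is executed via HTable.delete.
When a Delete is executed, HBase does not remove the data immediately. Instead it marks the data to be deleted with a "tombstone", and the tombstoned data is purged only when the StoreFiles are compacted.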
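The tombstone behaviour can be modelled in a few lines of plain Java. This is a deliberately simplified sketch, not HBase's storage code: it assumes a single column, one kind of delete marker that masks every value whose timestamp is at or below the marker's, and a compact() step standing in for StoreFile compaction. All class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class TombstoneSketch {
    // One stored entry: a value, or a delete marker ("tombstone") when value is null.
    private static final class Entry {
        final long ts;
        final String value;
        Entry(long ts, String value) { this.ts = ts; this.value = value; }
        boolean isTombstone() { return value == null; }
    }

    private List<Entry> entries = new ArrayList<>();

    public void put(long ts, String value) { entries.add(new Entry(ts, value)); }

    // A delete removes nothing; it appends a marker masking values with timestamp <= ts.
    public void delete(long ts) { entries.add(new Entry(ts, null)); }

    private boolean masked(Entry e) {
        for (Entry t : entries) {
            if (t.isTombstone() && e.ts <= t.ts) return true;
        }
        return false;
    }

    // What a normal read returns: values not masked by any tombstone.
    public List<String> visibleValues() {
        List<String> out = new ArrayList<>();
        for (Entry e : entries) {
            if (!e.isTombstone() && !masked(e)) out.add(e.value);
        }
        return out;
    }

    public int storedEntries() { return entries.size(); }

    // Models compaction: masked values and the markers themselves are physically dropped.
    public void compact() {
        List<Entry> kept = new ArrayList<>();
        for (Entry e : entries) {
            if (!e.isTombstone() && !masked(e)) kept.add(e);
        }
        entries = kept;
    }

    public static void main(String[] args) {
        TombstoneSketch cell = new TombstoneSketch();
        cell.put(1L, "caofang");
        cell.put(2L, "caofang1");
        cell.delete(1L);
        System.out.println(cell.visibleValues() + " stored=" + cell.storedEntries()); // [caofang1] stored=3
        cell.compact();
        System.out.println(cell.visibleValues() + " stored=" + cell.storedEntries()); // [caofang1] stored=1
    }
}
```

Before compact() the deleted value and its marker are still stored, which is why a raw scan can still return them.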
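To delete an entire row, simply initialize a Delete object with the row key. To narrow down what is deleted, use the following methods of the Delete object:
The constructors and common methods are described in detail below:
6.1. Constructors
(1) Specify the row key to delete
Delete(byte[] row)
Deletes the data of the row identified by the row key.
If nothing further is specified, this constructor deletes all versions of all columns in all column families of that row!
(2) Specify the row key to delete and a timestamp
Delete(byte[] row, long timestamp)
Deletes the data identified by the row key together with the timestamp.
If nothing further is specified, this constructor deletes, in all columns of all column families of the row, every version whose timestamp is less than or equal to the given timestamp.
Note: this timestamp applies only to the row-level delete. If you go on to specify column families or columns, you must give each of them its own timestamp.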
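The "less than or equal to the timestamp" rule can be illustrated with a sorted map of versions. This is a sketch of the rule only, in plain Java; deleteUpTo is a hypothetical helper, and the timestamps are made-up values.

```java
import java.util.TreeMap;

public class DeleteUpToTimestamp {
    // Hypothetical helper: returns the versions that survive a delete with the
    // given timestamp, i.e. only those strictly newer than ts.
    public static TreeMap<Long, String> deleteUpTo(TreeMap<Long, String> versions, long ts) {
        TreeMap<Long, String> remaining = new TreeMap<>(versions);
        remaining.headMap(ts, true).clear(); // inclusive bound: ts itself is deleted too
        return remaining;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> versions = new TreeMap<>();
        versions.put(100L, "a");
        versions.put(200L, "b");
        versions.put(300L, "c");
        System.out.println(deleteUpTo(versions, 200L)); // {300=c}
    }
}
```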
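(3) Given a byte array, the offset of the target row key, and the length to take
Delete(byte[] rowArray, int rowOffset, int rowLength)
(4) Given a byte array, the offset of the target row key, the length to take, and a timestamp
Delete(byte[] rowArray, int rowOffset, int rowLength, long ts)
6.2. Common methods
6.3. Example code
(1) Delete all column families, all columns, and all versions of an entire row
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Delete delete = new Delete(Bytes.toBytes("000"));
table.delete(delete);
table.close();
(2) Delete the latest version of a given column
Below is the data before the delete. Note info:address in row 100003: this is the column's latest version, with value caofang1; the previous version's value is caofang: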
hbase(main):007:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405390959464, value=caofang1
100003 column=info:age, timestamp=1405390959464, value=301
100003 column=info:name, timestamp=1405390959464, value=show1
3 row(s) in 0.0270 seconds
Run the following code:
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Delete delete = new Delete(Bytes.toBytes("100003"));
delete.deleteColumn(Bytes.toBytes("info"), Bytes.toBytes("address"));
table.delete(delete);
table.close();
Then check the data: the info:address column of row 100003 now shows the previous version's value, caofang! All other values are unchanged:
hbase(main):008:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405390728175, value=caofang
100003 column=info:age, timestamp=1405390959464, value=301
100003 column=info:name, timestamp=1405390959464, value=show1
3 row(s) in 0.0560 seconds
(3) Delete all versions of a given column
Continuing from the scenario above, run the following code:
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Delete delete = new Delete(Bytes.toBytes("100003"));
delete.deleteColumns(Bytes.toBytes("info"), Bytes.toBytes("address"));
table.delete(delete);
table.close();
Now the entire info:address column of row 100003 is gone:
hbase(main):009:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:age, timestamp=1405390959464, value=301
100003 column=info:name, timestamp=1405390959464, value=show1
3 row(s) in 0.0240 seconds
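(4) Delete, for all columns in a given column family, the versions whose timestamp equals the given timestamp
To demonstrate the effect, I have inserted a new value into the info:address column of row 100003: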
hbase(main):010:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405391883886, value=shangdi
100003 column=info:age, timestamp=1405390959464, value=301
100003 column=info:name, timestamp=1405390959464, value=show1
3 row(s) in 0.0250 seconds
Now the goal is to delete, in the info column family, all column data whose timestamp is 1405390959464:
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Delete delete = new Delete(Bytes.toBytes("100003"));
delete.deleteFamilyVersion(Bytes.toBytes("info"), 1405390959464L);
table.delete(delete);
table.close();
hbase(main):011:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405391883886, value=shangdi
100003 column=info:age, timestamp=1405390728175, value=30
100003 column=info:name, timestamp=1405390728175, value=show
3 row(s) in 0.0250 seconds
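As you can see, the info column family of row 100003 no longer contains data with timestamp 1405390959464, so the earlier versions are returned instead, while the address column in the info family, whose timestamp is not 1405390959464, is unaffected by this delete.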
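The three delete variants shown above differ only in which versions they target. The sketch below models one column family of one row as a map of columns to timestamped versions and replays the walkthrough with the same values. It is an in-memory illustration in plain Java; the class and method names are hypothetical stand-ins for deleteColumn, deleteColumns, and deleteFamilyVersion, and it ignores tombstones and compaction.

```java
import java.util.Map;
import java.util.TreeMap;

public class ColumnDeleteSketch {
    // column name -> (timestamp -> value) for a single column family of a single row
    private final Map<String, TreeMap<Long, String>> family = new TreeMap<>();

    public void put(String column, long ts, String value) {
        family.computeIfAbsent(column, k -> new TreeMap<>()).put(ts, value);
    }

    // Like deleteColumn(family, qualifier): drop only the newest version.
    public void deleteLatest(String column) {
        TreeMap<Long, String> v = family.get(column);
        if (v != null && !v.isEmpty()) v.pollLastEntry();
    }

    // Like deleteColumns(family, qualifier): drop every version of the column.
    public void deleteAll(String column) { family.remove(column); }

    // Like deleteFamilyVersion(family, ts): in every column, drop the version at exactly ts.
    public void deleteFamilyVersion(long ts) {
        for (TreeMap<Long, String> v : family.values()) v.remove(ts);
    }

    // Newest surviving value of the column, or null if none is left.
    public String latest(String column) {
        TreeMap<Long, String> v = family.get(column);
        return (v == null || v.isEmpty()) ? null : v.lastEntry().getValue();
    }

    public static void main(String[] args) {
        ColumnDeleteSketch info = new ColumnDeleteSketch();
        info.put("address", 1405390728175L, "caofang");
        info.put("address", 1405390959464L, "caofang1");
        info.put("age", 1405390728175L, "30");
        info.put("age", 1405390959464L, "301");

        info.deleteLatest("address");
        System.out.println(info.latest("address")); // caofang

        info.deleteAll("address");
        System.out.println(info.latest("address")); // null

        info.put("address", 1405391883886L, "shangdi");
        info.deleteFamilyVersion(1405390959464L);
        System.out.println(info.latest("age"));     // 30
        System.out.println(info.latest("address")); // shangdi
    }
}
```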
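7. Fetching a single row: Get
To fetch an entire row, initializing a Get object with the row key is enough. To narrow the range of data fetched, use the following methods of the Get object:
The constructor and common methods are described in detail below:
7.1. Constructor
Get's constructor is simple; there is only one: Get(byte[] row), whose parameter is the row key.
7.2. Common methods
7.3. Tested code
All data in the test table: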
hbase(main):016:0> scan 'rd_ns:leetable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405304843114, value=longze
100001 column=info:age, timestamp=1405304843114, value=31
100001 column=info:name, timestamp=1405304843114, value=leon
100002 column=info:address, timestamp=1405305471343, value=caofang
100002 column=info:age, timestamp=1405305471343, value=30
100002 column=info:name, timestamp=1405305471343, value=show
100003 column=info:address, timestamp=1405407883218, value=qinghe
100003 column=info:age, timestamp=1405407883218, value=28
100003 column=info:name, timestamp=1405407883218, value=shichao
3 row(s) in 0.0250 seconds
(1) Fetch the latest version of all columns in all column families of the row identified by the row key
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Get get = new Get(Bytes.toBytes("100003"));
Result r = table.get(get);
for (Cell cell : r.rawCells()) {
    System.out.println(
        "Rowkey : " + Bytes.toString(r.getRow()) +
        " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
        " Value : " + Bytes.toString(CellUtil.cloneValue(cell))
    );
}
table.close();
Output:
Rowkey : 100003 Familiy:Quilifier : address Value : qinghe
Rowkey : 100003 Familiy:Quilifier : age Value : 28
Rowkey : 100003 Familiy:Quilifier : name Value : shichao
(2) Fetch the latest version of a given column in the row identified by the row key
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Get get = new Get(Bytes.toBytes("100003"));
get.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"));
Result r = table.get(get);
for (Cell cell : r.rawCells()) {
    System.out.println(
        "Rowkey : " + Bytes.toString(r.getRow()) +
        " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
        " Value : " + Bytes.toString(CellUtil.cloneValue(cell))
    );
}
table.close();
Output:
Rowkey : 100003 Familiy:Quilifier : name Value : shichao
(3) Fetch the data with a given timestamp in the row identified by the row key
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:leetable");
Get get = new Get(Bytes.toBytes("100003"));
get.setTimeStamp(1405407854374L);
Result r = table.get(get);
for (Cell cell : r.rawCells()) {
    System.out.println(
        "Rowkey : " + Bytes.toString(r.getRow()) +
        " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
        " Value : " + Bytes.toString(CellUtil.cloneValue(cell))
    );
}
table.close();
The code outputs historical data that the scan output above did not show:
Rowkey : 100003 Familiy:Quilifier : address Value : huangzhuang
Rowkey : 100003 Familiy:Quilifier : age Value : 32
Rowkey : 100003 Familiy:Quilifier : name Value : lily
(4) Fetch all versions of the data in the row identified by the row key
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
Get get = new Get(Bytes.toBytes("100003"));
get.setMaxVersions();
Result r = table.get(get);
for (Cell cell : r.rawCells()) {
    System.out.println(
        "Rowkey : " + Bytes.toString(r.getRow()) +
        " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
        " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
        " Time : " + cell.getTimestamp()
    );
}
table.close();
Output:
Rowkey : 100003 Familiy:Quilifier : address Value : xierqi Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : address Value : shangdi Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : address Value : longze Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : age Value : 29 Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : 30 Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : age Value : 31 Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : name Value : leon Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : name Value : lee Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : name Value : lion Time : 1405417448414
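Note:
Multiple versions can be returned only if the column family is configured to keep multiple versions. The number of versions a column family keeps is set with HColumnDescriptor's setMaxVersions(int) method.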
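The effect of a version limit can be sketched as a bounded, newest-first map. This is a simplification in plain Java with made-up names: it trims old versions eagerly on every write, whereas HBase, as far as I understand it, filters excess versions at read time and physically removes them during compaction.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

public class MaxVersionsSketch {
    private final int maxVersions;
    // Sorted newest timestamp first.
    private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());

    public MaxVersionsSketch(int maxVersions) { this.maxVersions = maxVersions; }

    public void put(long ts, String value) {
        versions.put(ts, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry in reverse order = oldest version
        }
    }

    // All retained versions, newest first.
    public List<String> allVersions() { return new ArrayList<>(versions.values()); }

    public static void main(String[] args) {
        MaxVersionsSketch name = new MaxVersionsSketch(3);
        name.put(1405417448414L, "lion");
        name.put(1405417477465L, "lee");
        name.put(1405417500485L, "leon");
        System.out.println(name.allVersions()); // [leon, lee, lion]
        name.put(1405494141522L, "liyang");
        System.out.println(name.allVersions()); // [liyang, leon, lee]
    }
}
```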
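8. Fetching multiple rows: Scan
A Scan object returns multiple rows that satisfy given conditions. To fetch all rows, simply initialize a Scan object. To restrict the range of rows scanned, use the following methods:
Below is an introductory example from the official documentation. Suppose the table has rows with keys "row1", "row2", "row3", plus some rows with keys "abc1", "abc2", and "abc3"; the goal is to return the rows whose keys start with "row":
HTable htable = ... // instantiate HTable
Scan scan = new Scan();
scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("attr"));
scan.setStartRow(Bytes.toBytes("row")); // start key is inclusive
// stop key is exclusive; "rox" is "row" with its last byte incremented
// (a stop key of "row" + (char)0 sorts before "row1" and would exclude it)
scan.setStopRow(Bytes.toBytes("rox"));
ResultScanner rs = htable.getScanner(scan);
try {
    for (Result r = rs.next(); r != null; r = rs.next()) {
        // process result...
    }
} finally {
    rs.close(); // always close the ResultScanner!
}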
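Row keys are compared byte by byte, the start key is inclusive, and the stop key is exclusive. The sketch below models that rule in plain Java (no HBase dependency; scan and compare are hypothetical helpers) to show which stop keys actually select the rows prefixed with "row": a stop key of "row" followed by a zero byte excludes "row1", whereas "rox" includes all three.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowRangeSketch {
    // Unsigned lexicographic byte comparison, the ordering used for row keys.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    // Returns the keys with start <= key < stop.
    public static List<String> scan(List<String> keys, String start, String stop) {
        byte[] s = start.getBytes(StandardCharsets.UTF_8);
        byte[] e = stop.getBytes(StandardCharsets.UTF_8);
        List<String> out = new ArrayList<>();
        for (String k : keys) {
            byte[] kb = k.getBytes(StandardCharsets.UTF_8);
            if (compare(kb, s) >= 0 && compare(kb, e) < 0) out.add(k);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("abc1", "abc2", "abc3", "row1", "row2", "row3");
        System.out.println(scan(keys, "row", "row" + (char) 0)); // []
        System.out.println(scan(keys, "row", "rox"));            // [row1, row2, row3]

        // Appending a character to the last wanted key makes that key fall inside the range.
        List<String> ids = Arrays.asList("100001", "100002", "100003");
        System.out.println(scan(ids, "100001", "1000020"));      // [100001, 100002]
    }
}
```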
8.1. Common constructors
(1) Create a Scan over all rows
Scan()
(2) Create a Scan that starts at the given row
Scan(byte[] startRow)
Parameter: startRow, the row key
Note: if the given row does not exist, the scan starts from the next closest row
(3) Create a Scan with a start row and a stop row
Scan(byte[] startRow, byte[] stopRow)
Parameters: startRow, the start row; stopRow, the stop row
Note: startRow <= result set < stopRow
(4) Create a Scan with a start row and a filter
Scan(byte[] startRow, Filter filter)
Parameters: startRow, the start row; filter, the filter
Note: for what filters do and how to construct them, see http://blog.csdn.net/u010967382/article/details/37653177
8.2. Common methods
void setRaw(boolean raw): enables or disables raw mode for this scan. With raw mode enabled, the scan returns all delete markers and deleted rows that have not yet been collected. This is mostly useful for scans on column families that have KEEP_DELETED_CELLS enabled, that is, families configured with hcd.setKeepDeletedCells(true). It is an error to specify any column when raw mode is set.
8.3. Tested code
(1) Scan the latest version of all rows in the table
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
Scan s = new Scan();
ResultScanner rs = table.getScanner(s);
for (Result r : rs) {
    for (Cell cell : r.rawCells()) {
        System.out.println(
            "Rowkey : " + Bytes.toString(r.getRow()) +
            " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
            " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
            " Time : " + cell.getTimestamp()
        );
    }
}
rs.close();
table.close();
Output:
Rowkey : 100001 Familiy:Quilifier : address Value : anywhere Time : 1405417403438
Rowkey : 100001 Familiy:Quilifier : age Value : 24 Time : 1405417403438
Rowkey : 100001 Familiy:Quilifier : name Value : zhangtao Time : 1405417403438
Rowkey : 100002 Familiy:Quilifier : address Value : shangdi Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : age Value : 28 Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : name Value : shichao Time : 1405417426693
Rowkey : 100003 Familiy:Quilifier : address Value : xierqi Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : 29 Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : name Value : leon Time : 1405417500485
(2) Scan a given row key range, appending 0 to the stop row so that the result set includes it
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
Scan s = new Scan();
s.setStartRow(Bytes.toBytes("100001"));
s.setStopRow(Bytes.toBytes("1000020"));
ResultScanner rs = table.getScanner(s);
for (Result r : rs) {
    for (Cell cell : r.rawCells()) {
        System.out.println(
            "Rowkey : " + Bytes.toString(r.getRow()) +
            " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
            " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
            " Time : " + cell.getTimestamp()
        );
    }
}
rs.close();
table.close();
Output:
Rowkey : 100001 Familiy:Quilifier : address Value : anywhere Time : 1405417403438
Rowkey : 100001 Familiy:Quilifier : age Value : 24 Time : 1405417403438
Rowkey : 100001 Familiy:Quilifier : name Value : zhangtao Time : 1405417403438
Rowkey : 100002 Familiy:Quilifier : address Value : shangdi Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : age Value : 28 Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : name Value : shichao Time : 1405417426693
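(3) Return all data that has been marked for deletion but not yet actually removed
This test targets row 100003 of the rd_ns:itable table.
Using Get together with setMaxVersions() returns all data that has not been deleted; the output is: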
Rowkey : 100003 Familiy:Quilifier : address Value : huilongguan Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : address Value : shangdi Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : age Value : new29 Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : name Value : liyang Time : 1405494141522
However, with Scan's powerful s.setRaw(true) method, you can also retrieve all data that has been marked for deletion but not yet actually removed.
The code is as follows:
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
Scan s = new Scan();
s.setStartRow(Bytes.toBytes("100003"));
s.setRaw(true);
s.setMaxVersions();
ResultScanner rs = table.getScanner(s);
for (Result r : rs) {
    for (Cell cell : r.rawCells()) {
        System.out.println(
            "Rowkey : " + Bytes.toString(r.getRow()) +
            " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
            " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
            " Time : " + cell.getTimestamp()
        );
    }
}
rs.close();
table.close();
The output is as follows:
Rowkey : 100003 Familiy:Quilifier : address Value : huilongguan Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : address Value : Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : address Value : xierqi Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : address Value : shangdi Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : address Value : Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : address Value : longze Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : age Value : new29 Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : age Value : Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : 29 Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : age Value : 30 Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : age Value : 31 Time : 1405417448414
Rowkey : 100003 Familiy:Quilifier : name Value : liyang Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : name Value : Time : 1405493879419
Rowkey : 100003 Familiy:Quilifier : name Value : leon Time : 1405417500485
Rowkey : 100003 Familiy:Quilifier : name Value : lee Time : 1405417477465
Rowkey : 100003 Familiy:Quilifier : name Value : lion Time : 1405417448414
(4) Use filters to fetch all rows whose age is between 25 and 30
Current data:
hbase(main):049:0> scan 'rd_ns:itable'
ROW COLUMN+CELL
100001 column=info:address, timestamp=1405417403438, value=anywhere
100001 column=info:age, timestamp=1405417403438, value=24
100001 column=info:name, timestamp=1405417403438, value=zhangtao
100002 column=info:address, timestamp=1405417426693, value=shangdi
100002 column=info:age, timestamp=1405417426693, value=28
100002 column=info:name, timestamp=1405417426693, value=shichao
100003 column=info:address, timestamp=1405494141522, value=huilongguan
100003 column=info:age, timestamp=1405494999631, value=29
100003 column=info:name, timestamp=1405494141522, value=liyang
3 row(s) in 0.0240 seconds
Code:
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "rd_ns:itable");
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
    Bytes.toBytes("info"),
    Bytes.toBytes("age"),
    CompareOp.GREATER_OR_EQUAL,
    Bytes.toBytes("25")
);
SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
    Bytes.toBytes("info"),
    Bytes.toBytes("age"),
    CompareOp.LESS_OR_EQUAL,
    Bytes.toBytes("30")
);
filterList.addFilter(filter1);
filterList.addFilter(filter2);
Scan scan = new Scan();
scan.setFilter(filterList);
ResultScanner rs = table.getScanner(scan);
for (Result r : rs) {
    for (Cell cell : r.rawCells()) {
        System.out.println(
            "Rowkey : " + Bytes.toString(r.getRow()) +
            " Familiy:Quilifier : " + Bytes.toString(CellUtil.cloneQualifier(cell)) +
            " Value : " + Bytes.toString(CellUtil.cloneValue(cell)) +
            " Time : " + cell.getTimestamp()
        );
    }
}
rs.close();
table.close();
Output:
Rowkey : 100002 Familiy:Quilifier : address Value : shangdi Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : age Value : 28 Time : 1405417426693
Rowkey : 100002 Familiy:Quilifier : name Value : shichao Time : 1405417426693
Rowkey : 100003 Familiy:Quilifier : address Value : huilongguan Time : 1405494141522
Rowkey : 100003 Familiy:Quilifier : age Value : 29 Time : 1405494999631
Rowkey : 100003 Familiy:Quilifier : name Value : liyang Time : 1405494141522
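One caveat about this filter: the ages are stored as strings, and a filter given a byte[] value compares bytes lexicographically rather than numerically. The range check above works because every age in the table has two digits. The plain-Java sketch below (hypothetical helper names, no HBase dependency) reproduces the comparison and shows where it stops matching numeric intuition.

```java
import java.nio.charset.StandardCharsets;

public class StringAgeCompare {
    // Unsigned lexicographic byte comparison.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    // True when low <= value <= high under byte ordering (GREATER_OR_EQUAL and LESS_OR_EQUAL).
    public static boolean inRange(String value, String low, String high) {
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        return compare(v, low.getBytes(StandardCharsets.UTF_8)) >= 0
            && compare(v, high.getBytes(StandardCharsets.UTF_8)) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(inRange("24", "25", "30"));  // false
        System.out.println(inRange("28", "25", "30"));  // true
        System.out.println(inRange("29", "25", "30"));  // true
        System.out.println(inRange("3", "25", "30"));   // true: "3" sorts between "25" and "30"
        System.out.println(inRange("100", "25", "30")); // false: "100" sorts before "25"
    }
}
```

If ages could have differing digit counts, storing them zero-padded or as fixed-width binary numbers would keep byte order and numeric order aligned.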