HBase Filters and Their Shell Equivalents

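Comparison operator: CompareFilter.CompareOp
A comparison operator defines the relation to test. The following values are available:

  • EQUAL equal to
  • GREATER greater than
  • GREATER_OR_EQUAL greater than or equal to
  • LESS less than
  • LESS_OR_EQUAL less than or equal to
  • NOT_EQUAL not equal to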
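
To make the operators concrete, here is a minimal plain-Java sketch of how a comparison result could be turned into a pass/fail decision. It is not HBase code: `CompareOpSketch`, `Op` and `satisfies` are hypothetical names, and the sketch assumes values are compared as unsigned bytes in lexicographic order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CompareOpSketch {
    // Hypothetical stand-in for CompareFilter.CompareOp (not the HBase enum).
    enum Op { EQUAL, GREATER, GREATER_OR_EQUAL, LESS, LESS_OR_EQUAL, NOT_EQUAL }

    // True when "cell <op> expected" holds under unsigned lexicographic byte order.
    static boolean satisfies(byte[] cell, Op op, byte[] expected) {
        int c = Arrays.compareUnsigned(cell, expected);
        switch (op) {
            case EQUAL:            return c == 0;
            case GREATER:          return c > 0;
            case GREATER_OR_EQUAL: return c >= 0;
            case LESS:             return c < 0;
            case LESS_OR_EQUAL:    return c <= 0;
            case NOT_EQUAL:        return c != 0;
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        byte[] age = "24".getBytes(StandardCharsets.UTF_8);
        byte[] fifteen = "15".getBytes(StandardCharsets.UTF_8);
        byte[] eighteen = "18".getBytes(StandardCharsets.UTF_8);
        System.out.println(satisfies(age, Op.GREATER_OR_EQUAL, fifteen)); // true
        System.out.println(satisfies(age, Op.EQUAL, eighteen));           // false
    }
}
```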

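Comparator: ByteArrayComparable
Comparators allow different kinds of matching against the target. The following subclasses are available:

  • BinaryComparator matches the complete byte array
  • BinaryPrefixComparator matches a prefix of the byte array
  • BitComparator rarely used
  • NullComparator rarely used
  • RegexStringComparator matches a regular expression
  • SubstringComparator matches a substring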
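
The matching rules of the three simplest comparators can be sketched in plain Java as follows. This is only an illustration of the semantics described in the list, not the HBase classes: `ComparatorSketch` and its method names are made up, and the case-insensitive substring rule is modelled by lower-casing both sides.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

class ComparatorSketch {
    // BinaryComparator-like: the whole byte array must be identical.
    static boolean binary(byte[] value, byte[] expected) {
        return Arrays.equals(value, expected);
    }

    // BinaryPrefixComparator-like: the value must start with the given bytes.
    static boolean binaryPrefix(byte[] value, byte[] prefix) {
        return value.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(value, prefix.length), prefix);
    }

    // SubstringComparator-like: case-insensitive "contains".
    static boolean substring(String value, String needle) {
        return value.toLowerCase(Locale.ROOT).contains(needle.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        byte[] family = "address".getBytes(StandardCharsets.UTF_8);
        System.out.println(binary(family, "address".getBytes(StandardCharsets.UTF_8)));   // true
        System.out.println(binaryPrefix(family, "add".getBytes(StandardCharsets.UTF_8))); // true
        System.out.println(substring("Hangzhou", "ZHOU"));                                // true
    }
}
```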

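1. Multiple filters -- FilterList (no Shell filter of this name; the Shell filter string combines filters with AND / OR)
A FilterList represents a chain of filters. It holds a set of filters to be applied to the target data set, related by either FilterList.Operator.MUST_PASS_ALL (every filter must pass) or FilterList.Operator.MUST_PASS_ONE (at least one filter must pass).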
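
The two operators behave like logical AND and OR over the contained filters. The following plain-Java sketch models that with predicates; `FilterListSketch`, `combine` and `AGE_FILTERS` are hypothetical names used only for illustration and are not part of HBase.

```java
import java.util.List;
import java.util.function.Predicate;

class FilterListSketch {
    // Hypothetical stand-in for FilterList.Operator.
    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

    // Two sample "filters" on an integer age: >= 15 and <= 30.
    static final List<Predicate<Integer>> AGE_FILTERS =
            List.of(age -> age >= 15, age -> age <= 30);

    // Combine the filters: ALL behaves like AND, ONE behaves like OR.
    static <T> Predicate<T> combine(Operator op, List<Predicate<T>> filters) {
        return row -> op == Operator.MUST_PASS_ALL
                ? filters.stream().allMatch(f -> f.test(row))
                : filters.stream().anyMatch(f -> f.test(row));
    }

    public static void main(String[] args) {
        Predicate<Integer> all = combine(Operator.MUST_PASS_ALL, AGE_FILTERS);
        Predicate<Integer> one = combine(Operator.MUST_PASS_ONE, AGE_FILTERS);
        System.out.println(all.test(24)); // true: passes both filters
        System.out.println(all.test(38)); // false: fails "<= 30"
        System.out.println(one.test(38)); // true: passes ">= 15"
    }
}
```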

// Combine filters: fetch all rows whose age is between 15 and 30
private static void scanFilter() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    
    // And
    FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    // >=15
    SingleColumnValueFilter filter1 = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(), CompareOp.GREATER_OR_EQUAL, "15".getBytes());
    // <=30
    SingleColumnValueFilter filter2 = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(), CompareOp.LESS_OR_EQUAL, "30".getBytes());
    filterList.addFilter(filter1);
    filterList.addFilter(filter2);        
    
    Scan scan = new Scan();
    // set Filter
    scan.setFilter(filterList);
    
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}    
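
One caveat with the range example above: the ages are stored as strings ("15", "30"), so the comparison is on bytes, not on numbers. The plain-Java sketch below illustrates the effect under the assumption that values are compared as unsigned bytes in lexicographic order; `StringAgeRangeSketch` and `inRangeAsBytes` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class StringAgeRangeSketch {
    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Same test as the two filters above, applied to string-encoded ages.
    static boolean inRangeAsBytes(String age) {
        byte[] v = bytes(age);
        return Arrays.compareUnsigned(v, bytes("15")) >= 0
                && Arrays.compareUnsigned(v, bytes("30")) <= 0;
    }

    public static void main(String[] args) {
        for (String age : new String[] {"24", "2", "9", "100"}) {
            System.out.println(age + " -> " + inRangeAsBytes(age));
        }
        // 24 -> true
        // 2 -> true   ("2" sorts between "15" and "30" as bytes)
        // 9 -> false
        // 100 -> false
    }
}
```

If numeric ranges matter, one common way around this is to store fixed-width values (for example zero-padded strings) so that byte order and numeric order agree.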

2. Column value filter -- SingleColumnValueFilter
Tests a column value for equality (CompareOp.EQUAL), inequality (CompareOp.NOT_EQUAL), or a one-sided range (e.g. CompareOp.GREATER). Constructors:
2.1. The comparison value is a byte array (possibly no direct Shell form)
SingleColumnValueFilter(byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value)

// SingleColumnValueFilter example
private static void scanFilter01() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    
    SingleColumnValueFilter scvf = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(), CompareOp.EQUAL, "18".getBytes());
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}

2.2. The comparison value is a comparator (ByteArrayComparable)
SingleColumnValueFilter(byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, ByteArrayComparable comparator)

// SingleColumnValueFilter example 2 -- RegexStringComparator
private static void scanFilter02() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    
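    // Regular-expression match on the value -- RegexStringComparator
    // ".4" matches info:age values that contain any character followed by '4' (e.g. "24"); use "4$" to require a trailing '4'
    RegexStringComparator comparator = new RegexStringComparator(".4");
    // Only the fourth argument differs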
    SingleColumnValueFilter scvf = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(), CompareOp.EQUAL, comparator);
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}
Shell equivalent:
hbase(main):032:0> scan 'users',{FILTER=>"SingleColumnValueFilter('info','age',=,'regexstring:.4')"}
ROW                                 COLUMN+CELL                                                                                         
 xiaoming01                         column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD                      
 xiaoming01                         column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD                     
 xiaoming01                         column=info:age, timestamp=1441998917568, value=24                                                  
 xiaoming02                         column=info:age, timestamp=1441998917594, value=24                                                  
 xiaoming03                         column=info:age, timestamp=1441998919607, value=24                                                  
3 row(s) in 0.0130 seconds
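
The pattern ".4" deserves a closer look. The sketch below uses only java.util.regex and assumes the comparator searches for the pattern anywhere in the value ("find" semantics) rather than requiring the whole value to match; that assumption, and the name `RegexFindSketch`, are not taken from the original text.

```java
import java.util.regex.Pattern;

class RegexFindSketch {
    // True when the pattern occurs anywhere in the value.
    static boolean found(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(found(".4", "24"));  // true: '2' followed by '4'
        System.out.println(found(".4", "245")); // true, although the value does not end in '4'
        System.out.println(found(".4", "4"));   // false: nothing precedes the '4'
        System.out.println(found("4$", "245")); // false: "4$" requires a trailing '4'
        System.out.println(found("4$", "4"));   // true
    }
}
```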

// SingleColumnValueFilter example 3 -- SubstringComparator
private static void scanFilter03() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    
    // Test whether a substring occurs in the value (case-insensitive) -- SubstringComparator
    // Match rows whose age value contains '4'
    SubstringComparator comparator = new SubstringComparator("4");
    // Only the fourth argument differs
    SingleColumnValueFilter scvf = new SingleColumnValueFilter("info".getBytes(), "age".getBytes(), CompareOp.EQUAL, comparator);
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}
Shell equivalent:
hbase(main):033:0> scan 'users',{FILTER=>"SingleColumnValueFilter('info','age',=,'substring:4')"}
ROW                                 COLUMN+CELL                                                                                         
 xiaoming01                         column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD                      
 xiaoming01                         column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD                     
 xiaoming01                         column=info:age, timestamp=1441998917568, value=24                                                  
 xiaoming02                         column=info:age, timestamp=1441998917594, value=24                                                  
 xiaoming03                         column=info:age, timestamp=1441998919607, value=24                                                  
3 row(s) in 0.0180 seconds

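3. Column-name filters
Because HBase stores its data internally as key-value pairs, a column-name filter tests whether a given column name (ColumnFamily:Qualifier) exists in a row. It is the counterpart of the column-value filter described in the previous section.

3.1. Filtering by column family: FamilyFilter
FamilyFilter(CompareFilter.CompareOp familyCompareOp, ByteArrayComparable familyComparator)

Notes:
1. If you are looking for a known column family, scan.addFamily(family) is more efficient than a filter.
2. Because HBase's support for multiple column families is currently limited, this filter is of little use for now.

// FamilyFilter: filter data by column family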
private static void scanFilter04() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Keep only the column family equal to 'address'
    //FamilyFilter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator("address".getBytes()));
    
    // Keep only column families whose name starts with 'add'
    FamilyFilter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryPrefixComparator("add".getBytes()));
    
    Scan scan = new Scan();
    scan.setFilter(familyFilter);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}
Shell equivalent:
hbase(main):021:0> scan 'users',{FILTER=>"FamilyFilter(=,'binaryprefix:add')"}
ROW                                 COLUMN+CELL                                                                                         
 xiaoming                           column=address:city, timestamp=1441997498965, value=hangzhou                                        
 xiaoming                           column=address:contry, timestamp=1441997498911, value=china                                         
 xiaoming                           column=address:province, timestamp=1441997498939, value=zhejiang                                    
 xiaoming01                         column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD                      
 xiaoming01                         column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD                     
 zhangyifei                         column=address:city, timestamp=1441997499108, value=jieyang                                         
 zhangyifei                         column=address:contry, timestamp=1441997499077, value=china                                         
 zhangyifei                         column=address:province, timestamp=1441997499093, value=guangdong                                   
 zhangyifei                         column=address:town, timestamp=1441997500711, value=xianqiao                                        
3 row(s) in 0.0400 seconds

3.2. Filtering by qualifier (column name): QualifierFilter
QualifierFilter(CompareFilter.CompareOp op, ByteArrayComparable qualifierComparator)

Note: this filter is likely to be used more often than FamilyFilter.

// QualifierFilter: filter data by qualifier (column name)
private static void scanFilter05() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    
    // Keep, in every row, the columns named exactly 'age'
    //QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator("age".getBytes()));
    
    // Keep the columns whose name starts with 'age' (including 'age' itself)
    //QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryPrefixComparator("age".getBytes()));
    
    // Keep the columns whose name contains 'age' (including 'age' itself)
    //QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new SubstringComparator("age"));
    
    // Keep the columns whose name matches the regular expression '.ge'
    QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new RegexStringComparator(".ge"));
    
    Scan scan = new Scan();
    scan.setFilter(qualifierFilter);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}
Shell equivalent:
hbase(main):020:0> scan 'users',{FILTER=>"QualifierFilter(=,'regexstring:.ge')"}
ROW                                 COLUMN+CELL                                                                                         
 xiaoming                           column=info:age, timestamp=1441997971945, value=38                                                  
 xiaoming01                         column=info:age, timestamp=1441998917568, value=24                                                  
 xiaoming02                         column=info:age, timestamp=1441998917594, value=24                                                  
 xiaoming03                         column=info:age, timestamp=1441998919607, value=24                                                  
 zhangyifei                         column=info:age, timestamp=1442247255446, value=18                                                  
5 row(s) in 0.0460 seconds

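3.3. Filtering by column-name prefix: ColumnPrefixFilter (QualifierFilter can do the same)
ColumnPrefixFilter(byte[] prefix)
Note: the same column name can appear in several column families; this filter returns the matching columns from all of them.

// ColumnPrefixFilter example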
private static void scanFilter06() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    
    // Match all columns whose name starts with 'ag'
    ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter("ag".getBytes());
            
    Scan scan = new Scan();
    scan.setFilter(columnPrefixFilter);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}
Shell equivalent:
hbase(main):018:0> scan 'users',{FILTER=>"ColumnPrefixFilter('ag')"}
ROW                                 COLUMN+CELL                                                                                         
 xiaoming                           column=info:age, timestamp=1441997971945, value=38                                                  
 xiaoming01                         column=info:age, timestamp=1441998917568, value=24                                                  
 xiaoming02                         column=info:age, timestamp=1441998917594, value=24                                                  
 xiaoming03                         column=info:age, timestamp=1441998919607, value=24                                                  
 zhangyifei                         column=info:age, timestamp=1442247255446, value=18                                                  
5 row(s) in 0.0280 seconds

3.4. Filtering by multiple column-name prefixes: MultipleColumnPrefixFilter
MultipleColumnPrefixFilter behaves much like ColumnPrefixFilter, but accepts several prefixes.
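
The "any of several prefixes" rule can be sketched in plain Java as below. `MultiPrefixSketch` and `matchesAnyPrefix` are hypothetical names, and the qualifiers are the ones that appear in the sample 'users' table of this article.

```java
import java.util.ArrayList;
import java.util.List;

class MultiPrefixSketch {
    // True when the qualifier starts with at least one of the prefixes.
    static boolean matchesAnyPrefix(String qualifier, String... prefixes) {
        for (String prefix : prefixes) {
            if (qualifier.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] qualifiers = {"age", "birthday", "city", "company", "contry", "province", "town"};
        List<String> kept = new ArrayList<>();
        for (String q : qualifiers) {
            if (matchesAnyPrefix(q, "a", "c")) {
                kept.add(q);
            }
        }
        System.out.println(kept); // [age, city, company, contry]
    }
}
```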

// MultipleColumnPrefixFilter example
private static void scanFilter07() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Match all columns whose name starts with 'a' or 'c' (prefixes passed as a two-dimensional byte array)
    byte[][] prefixes = new byte[][]{"a".getBytes(), "c".getBytes()};
    MultipleColumnPrefixFilter multipleColumnPrefixFilter = new MultipleColumnPrefixFilter(prefixes);

    Scan scan = new Scan();
    scan.setFilter(multipleColumnPrefixFilter);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}
Shell equivalent:
hbase(main):017:0> scan 'users',{FILTER=>"MultipleColumnPrefixFilter('a','c')"}
ROW                                 COLUMN+CELL                                                                                         
 xiaoming                           column=address:city, timestamp=1441997498965, value=hangzhou                                        
 xiaoming                           column=address:contry, timestamp=1441997498911, value=china                                         
 xiaoming                           column=info:age, timestamp=1441997971945, value=38                                                  
 xiaoming                           column=info:company, timestamp=1441997498889, value=alibaba                                         
 xiaoming01                         column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD                      
 xiaoming01                         column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD                     
 xiaoming01                         column=info:age, timestamp=1441998917568, value=24                                                  
 xiaoming02                         column=info:age, timestamp=1441998917594, value=24                                                  
 xiaoming03                         column=info:age, timestamp=1441998919607, value=24                                                  
 zhangyifei                         column=address:city, timestamp=1441997499108, value=jieyang                                         
 zhangyifei                         column=address:contry, timestamp=1441997499077, value=china                                         
 zhangyifei                         column=info:age, timestamp=1442247255446, value=18                                                  
 zhangyifei                         column=info:company, timestamp=1441997499039, value=alibaba                                         
5 row(s) in 0.0430 seconds

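3.5. Filtering by column range (not row range): ColumnRangeFilter

  1. Retrieves a range of columns. For example, if a row has a million columns but you only want to see those whose names fall between bbbb and dddd.
  2. Introduced in HBase 0.92.
  3. The same column name can appear in several column families; this filter returns the matching columns from all of them.

Constructor:
ColumnRangeFilter(byte[] minColumn, boolean minColumnInclusive, byte[] maxColumn, boolean maxColumnInclusive)
Parameters:

  • minColumn - lower bound of the column range; if null, there is no lower bound
  • minColumnInclusive - whether the range includes minColumn
  • maxColumn - upper bound of the column range; if null, there is no upper bound
  • maxColumnInclusive - whether the range includes maxColumn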
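
The four parameters interact as in the following plain-Java sketch. It is a model of the rule described above, not the HBase implementation: `ColumnRangeSketch` and `inRange` are hypothetical names, and String.compareTo stands in for byte comparison, which gives the same order for ASCII column names.

```java
class ColumnRangeSketch {
    // True when the qualifier lies inside [min, max], honouring the inclusive flags.
    // A null bound means "unbounded" on that side.
    static boolean inRange(String qualifier, String min, boolean minInclusive,
                           String max, boolean maxInclusive) {
        if (min != null) {
            int c = qualifier.compareTo(min);
            if (c < 0 || (c == 0 && !minInclusive)) {
                return false;
            }
        }
        if (max != null) {
            int c = qualifier.compareTo(max);
            if (c > 0 || (c == 0 && !maxInclusive)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Range 'a' (inclusive) to 'c' (exclusive).
        System.out.println(inRange("age", "a", true, "c", false));      // true
        System.out.println(inRange("birthday", "a", true, "c", false)); // true
        System.out.println(inRange("city", "a", true, "c", false));     // false: "city" sorts after "c"
        System.out.println(inRange("town", "a", true, null, false));    // true: no upper bound
    }
}
```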
// ColumnRangeFilter example
private static void scanFilter08() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Match all columns from 'a' (inclusive) up to 'c' (exclusive)
    ColumnRangeFilter columnRangeFilter = new ColumnRangeFilter("a".getBytes(), true, "c".getBytes(), false);

    Scan scan = new Scan();
    scan.setFilter(columnRangeFilter);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}
Shell equivalent:
hbase(main):016:0> scan 'users',{FILTER=>"ColumnRangeFilter('a',true,'c',false)"}
ROW                                 COLUMN+CELL                                                                                         
 xiaoming                           column=info:age, timestamp=1441997971945, value=38                                                  
 xiaoming                           column=info:birthday, timestamp=1441997498851, value=1987-06-17                                     
 xiaoming01                         column=info:age, timestamp=1441998917568, value=24                                                  
 xiaoming02                         column=info:age, timestamp=1441998917594, value=24                                                  
 xiaoming03                         column=info:age, timestamp=1441998919607, value=24                                                  
 zhangyifei                         column=info:age, timestamp=1442247255446, value=18                                                  
 zhangyifei                         column=info:birthday, timestamp=1441997498990, value=1987-4-17                                      
5 row(s) in 0.0340 seconds

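4. Row key filter: RowFilter
When you need a range of rows based on a row-key pattern, Scan's startRow and stopRow are more efficient. However, startRow and stopRow can only match the leading bytes of the row key, not characters in the middle. For more complex row-key filtering, use RowFilter.
Constructor: RowFilter(CompareFilter.CompareOp rowCompareOp, ByteArrayComparable rowComparator)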
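
The difference between a start/stop range and a row-key filter can be modelled in plain Java with a sorted set. This is only a sketch: `RowKeyScanSketch`, `rangeScan` and `substringScan` are hypothetical names, the row keys are those of the sample 'users' table, and a case-sensitive contains() stands in for the substring comparator.

```java
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class RowKeyScanSketch {
    // Row keys are kept in sorted order, as in an HBase table.
    static final TreeSet<String> ROWS = new TreeSet<>(
            List.of("xiaoming", "xiaoming01", "xiaoming02", "xiaoming03", "zhangyifei"));

    // Range scan: start inclusive, stop exclusive; only keys inside the range are visited.
    static List<String> rangeScan(String startRow, String stopRow) {
        return List.copyOf(ROWS.subSet(startRow, true, stopRow, false));
    }

    // Filter scan: every key is visited and tested.
    static List<String> substringScan(String needle) {
        return ROWS.stream().filter(key -> key.contains(needle)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(rangeScan("xiaoming0", "xiaoming1")); // [xiaoming01, xiaoming02, xiaoming03]
        System.out.println(substringScan("01"));                 // [xiaoming01]
    }
}
```

The range scan can only express "keys that start with ...", while the filter can match anywhere in the key at the cost of testing every row.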

// RowFilter example
private static void scanFilter09() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Match all rows whose row key contains '01'
    RowFilter rowFilter = new RowFilter(CompareOp.EQUAL, new SubstringComparator("01"));
    
    Scan scan = new Scan();
    scan.setFilter(rowFilter);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}
Shell equivalent:
hbase(main):013:0> scan 'users',{FILTER=>"RowFilter(=,'substring:01')"}
ROW                                 COLUMN+CELL                                                                                         
 xiaoming01                         column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD                      
 xiaoming01                         column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD                     
 xiaoming01                         column=info:age, timestamp=1441998917568, value=24                                                  
1 row(s) in 0.0190 seconds

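5. PageFilter (Shell support unconfirmed)
Sets a page size in rows and returns a result set of that many rows.
Note that this filter cannot guarantee that the number of rows returned is at most the page size: the filter is applied separately on each region server, so it only guarantees that each region returns no more than the page size.
Constructor: PageFilter(long pageSize)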
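
A simplified plain-Java model shows why the total can exceed the page size. The split of the sample rows into two regions is an assumption made up for this sketch, as are the names `PageFilterSketch`, `scanWithPerRegionLimit` and `clientCap`; real region boundaries depend on the cluster.

```java
import java.util.ArrayList;
import java.util.List;

class PageFilterSketch {
    // Each inner list holds the rows of one region, in key order.
    // The page limit is enforced per region, not across the whole scan.
    static List<String> scanWithPerRegionLimit(List<List<String>> regions, int pageSize) {
        List<String> out = new ArrayList<>();
        for (List<String> region : regions) {
            out.addAll(region.subList(0, Math.min(pageSize, region.size())));
        }
        return out;
    }

    // Client-side cap: keep at most pageSize rows of whatever came back.
    static List<String> clientCap(List<String> rows, int pageSize) {
        return rows.subList(0, Math.min(pageSize, rows.size()));
    }

    public static void main(String[] args) {
        List<List<String>> regions = List.of(
                List.of("xiaoming", "xiaoming01", "xiaoming02"), // hypothetical region 1
                List.of("xiaoming03", "zhangyifei"));            // hypothetical region 2
        List<String> rows = scanWithPerRegionLimit(regions, 2);
        System.out.println(rows.size());        // 4, although the page size is 2
        System.out.println(clientCap(rows, 2)); // [xiaoming, xiaoming01]
    }
}
```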

// PageFilter example
private static void scanFilter10() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");

    // Starting at row key "xiaoming" (inclusive), take 3 rows
    PageFilter pageFilter = new PageFilter(3L);
    
    Scan scan = new Scan();
    scan.setStartRow("xiaoming".getBytes());
    scan.setFilter(pageFilter);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}

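Note: because this filter cannot guarantee that the number of rows returned is at most the page size, a better way to return a fixed number of rows is ResultScanner.next(int nbRows):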

// Modified version of the demo above
private static void scanFilter11() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
   
    // Starting at row key "xiaoming" (inclusive), take 3 rows
    //PageFilter pageFilter = new PageFilter(3L);
    
    Scan scan = new Scan();
    scan.setStartRow("xiaoming".getBytes());
    //scan.setFilter(pageFilter);
    ResultScanner rs = ht.getScanner(scan);
    // Return at most 3 rows
    for(Result result : rs.next(3)){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}

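6. SkipFilter (no Shell filter of this name; the Shell filter string has a SKIP operator)
Filters on every column of a row: if any single column fails the condition, the whole row is dropped.
Constructor: SkipFilter(Filter filter)

For example, if all the columns of a row hold the weights of different items, every value should be greater than zero in a real scenario, and we want to drop any row that contains a column whose value is 0. In this case, ValueFilter and SkipFilter together do the job:
scan.setFilter(new SkipFilter(new ValueFilter(CompareOp.NOT_EQUAL, new BinaryComparator(Bytes.toBytes(0)))));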
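
The row-level effect can be modelled in plain Java as follows. This is a sketch of the rule only: `SkipFilterSketch`, `skipRows` and `sampleRows` are hypothetical names, the item weights are invented, and values are plain strings rather than the 4-byte integer used in the line above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

class SkipFilterSketch {
    // Keep a row only if every cell value passes the wrapped per-cell test.
    static Map<String, Map<String, String>> skipRows(Map<String, Map<String, String>> rows,
                                                     Predicate<String> cellTest) {
        Map<String, Map<String, String>> kept = new LinkedHashMap<>();
        rows.forEach((rowKey, cells) -> {
            if (cells.values().stream().allMatch(cellTest)) {
                kept.put(rowKey, cells);
            }
        });
        return kept;
    }

    // Invented sample data: item weights per box.
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        rows.put("box1", Map.of("apple", "3", "pear", "5"));
        rows.put("box2", Map.of("apple", "0", "pear", "7"));
        return rows;
    }

    public static void main(String[] args) {
        // The per-cell test plays the role of ValueFilter(NOT_EQUAL, 0).
        Map<String, Map<String, String>> kept = skipRows(sampleRows(), v -> !v.equals("0"));
        System.out.println(kept.keySet()); // [box1]: box2 is dropped because one weight is 0
    }
}
```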

// SkipFilter example
private static void scanFilter12() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    
    // Drop every row that contains any cell whose value equals "24"
    SkipFilter skipFilter = new SkipFilter(new ValueFilter(CompareOp.NOT_EQUAL, new BinaryComparator("24".getBytes())));
    
    Scan scan = new Scan();
    scan.setFilter(skipFilter);
    ResultScanner rs = ht.getScanner(scan);
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
        }
    }
    ht.close();
}

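7. Utility -- FirstKeyOnlyFilter
This filter returns only the first cell of each row, which makes it useful for counting rows efficiently. It is probably of limited practical use otherwise.
Constructor: public FirstKeyOnlyFilter()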
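
The idea behind row counting with this filter can be sketched in plain Java: keep one cell per row, then count the rows. `FirstKeyOnlySketch`, `firstCells` and `sampleRows` are hypothetical names, and the sketch assumes the cells of a row are sorted by "family:qualifier" and that every row has at least one cell.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class FirstKeyOnlySketch {
    // For each row, keep only the name of its first cell in sorted order.
    static Map<String, String> firstCells(Map<String, TreeMap<String, String>> rows) {
        Map<String, String> out = new LinkedHashMap<>();
        rows.forEach((rowKey, cells) -> out.put(rowKey, cells.firstKey()));
        return out;
    }

    // Two rows modelled on the sample 'users' table.
    static Map<String, TreeMap<String, String>> sampleRows() {
        Map<String, TreeMap<String, String>> rows = new TreeMap<>();
        rows.put("xiaoming", new TreeMap<>(Map.of("info:age", "38", "address:city", "hangzhou")));
        rows.put("xiaoming02", new TreeMap<>(Map.of("info:age", "24")));
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> first = firstCells(sampleRows());
        System.out.println(first);        // {xiaoming=address:city, xiaoming02=info:age}
        System.out.println(first.size()); // 2: one cell per row, so this is the row count
    }
}
```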

// FirstKeyOnlyFilter example
private static void scanFilter13() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    
    // Return only the first cell of each row
    FirstKeyOnlyFilter firstKeyOnlyFilter = new FirstKeyOnlyFilter();

    Scan scan = new Scan();
    scan.setFilter(firstKeyOnlyFilter);
    ResultScanner rs = ht.getScanner(scan);
    int i = 0;
    for(Result result : rs){
        for(Cell cell : result.rawCells()){
            System.out.println(new String(CellUtil.cloneRow(cell))+"\t"
                    +new String(CellUtil.cloneFamily(cell))+"\t"
                    +new String(CellUtil.cloneQualifier(cell))+"\t"
                    +new String(CellUtil.cloneValue(cell),"UTF-8")+"\t"
                    +cell.getTimestamp());
            i++;
        }
    }
    // Print the total number of rows (each row contributes exactly one cell here)
    System.out.println(i);
    ht.close();
}
Shell equivalent:
hbase(main):009:0> scan 'users',{FILTER=>'FirstKeyOnlyFilter()'}
ROW                                COLUMN+CELL                                                                                         
 xiaoming                          column=address:city, timestamp=1441997498965, value=hangzhou                                        
 xiaoming01                        column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD                      
 xiaoming02                        column=info:age, timestamp=1441998917594, value=24                                                  
 xiaoming03                        column=info:age, timestamp=1441998919607, value=24                                                  
 zhangyifei                        column=address:city, timestamp=1441997499108, value=jieyang                                         
5 row(s) in 0.0240 seconds