HBase Study Notes (12): Generating HFiles Without MapReduce and Importing Them into HBase

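Recently the boss of someone in my chat group asked him to look into HBase and push its write throughput above 50,000 records/s. That was a real headache: on a cluster made up of 4 personal computers, he spent a long time tuning multi-threaded writes and still only reached about 10,000 records/s, nowhere near the target. That led to the idea of writing HFiles directly and bulk loading them. Most of what is online does this with MapReduce, however, and the requirement here is real-time ingestion without generating input files first, so it had to be implemented in our own code. I searched a lot of material online without finding anything. Eventually, following a pointer from another user, I read the source code and found how HFiles are generated. After implementing it, single-threaded ingestion only reached about 14,000 records/s, roughly the same as the earlier multi-threaded maximum. While I was puzzling over this, I changed the code so that the Bytes.toBytes(cols) call for each column name is made only once instead of once per row, and throughput immediately rose to about 30,000 records/s, a very noticeable improvement. That is the speed on my own computer; it will probably be somewhat faster on his cluster. The code is shared below.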
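The effect of converting the column names only once can be tried without HBase on the classpath. The following is a minimal sketch, not the author's code: it assumes `String.getBytes(StandardCharsets.UTF_8)` as a stand-in for `Bytes.toBytes(String)`, and the class and method names are hypothetical. Absolute timings depend on the JVM and machine, so none are quoted here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original post.
class QualifierHoistDemo {
    static final String[] COLS = {"cn", "dt", "ic", "if", "ip", "le", "mn", "pi"};

    // Stand-in for Bytes.toBytes(String) (assumption: UTF-8 encoding).
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Slow variant: re-encodes every column name for every row.
    static long encodePerRow(int rows) {
        long total = 0;
        for (int i = 0; i < rows; i++) {
            for (String col : COLS) {
                total += toBytes(col).length;
            }
        }
        return total;
    }

    // Fast variant: encodes each column name once and reuses the byte arrays.
    static long encodeOnce(int rows) {
        byte[][] qualifiers = new byte[COLS.length][];
        for (int c = 0; c < COLS.length; c++) {
            qualifiers[c] = toBytes(COLS[c]);
        }
        long total = 0;
        for (int i = 0; i < rows; i++) {
            for (byte[] q : qualifiers) {
                total += q.length;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int rows = 3_000_000; // same row count as the post's loop
        long t0 = System.nanoTime();
        long a = encodePerRow(rows);
        long t1 = System.nanoTime();
        long b = encodeOnce(rows);
        long t2 = System.nanoTime();
        System.out.println("per-row encoding: " + (t1 - t0) / 1_000_000 + " ms, bytes=" + a);
        System.out.println("encode once:      " + (t2 - t1) / 1_000_000 + " ms, bytes=" + b);
    }
}
```

Both variants process the same number of bytes; the difference is only how many temporary arrays are allocated and encoded inside the hot loop.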


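        // Fragment from inside a method, written against the older HBase API in which
        // StoreFile.WriterBuilder takes a block size. Values marked "assumed" were lost
        // from the original listing and are restored from context.
        String tableName = "taglog";
        byte[] family = Bytes.toBytes("logs");

        // Cluster configuration
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.master", "192.168.1.133:60000");
        conf.set("hbase.zookeeper.quorum", "192.168.1.135");
        conf.set("hbase.metrics.showTableName", "false");

        // HFiles are written to <outputdir>/<family>/, the layout the bulk loader expects
        String outputdir = "hdfs://hadoop.Master:8020/user/SEA/hfiles/";
        Path dir = new Path(outputdir);
        Path familydir = new Path(outputdir, Bytes.toString(family));
        FileSystem fs = familydir.getFileSystem(conf);
        BloomType bloomType = BloomType.NONE;                           // assumed
        HFileDataBlockEncoder encoder = NoOpDataBlockEncoder.INSTANCE;  // assumed
        int blockSize = 64000;
        Configuration tempConf = new Configuration(conf);
        tempConf.set("hbase.metrics.showTableName", "false");
        tempConf.setFloat(HConstants.HFILE_BLOCK_CACHE_SIZE_KEY, 1.0f);

        // StoreFile.Writer is a thin wrapper around the HFile writer
        // (builder arguments and chained calls assumed)
        StoreFile.Writer writer = new StoreFile.WriterBuilder(conf, new CacheConfig(tempConf), fs, blockSize)
                .withOutputDir(familydir)
                .withBloomType(bloomType)
                .withDataBlockEncoder(encoder)
                .build();
        long start = System.currentTimeMillis();

        // Zero-padded counter appended to the timestamp to form the row key
        DecimalFormat df = new DecimalFormat("0000000");

        KeyValue kv1 = null;
        KeyValue kv2 = null;
        KeyValue kv3 = null;
        KeyValue kv4 = null;
        KeyValue kv5 = null;
        KeyValue kv6 = null;
        KeyValue kv7 = null;
        KeyValue kv8 = null;

        // Converting the column names is expensive, so do it once, outside the loop
        byte[] cn = Bytes.toBytes("cn");
        byte[] dt = Bytes.toBytes("dt");
        byte[] ic = Bytes.toBytes("ic");
        byte[] ifs = Bytes.toBytes("if");
        byte[] ip = Bytes.toBytes("ip");
        byte[] le = Bytes.toBytes("le");
        byte[] mn = Bytes.toBytes("mn");
        byte[] pi = Bytes.toBytes("pi");

        int maxLength = 3000000;
        for (int i = 0; i < maxLength; i++) {
            String currentTime = "" + System.currentTimeMillis() + df.format(i);
            long current = System.currentTimeMillis();
            byte[] row = Bytes.toBytes(currentTime);

            // Row keys and columns must be appended in lexicographic order,
            // otherwise the writer rejects the cell
            kv1 = new KeyValue(row, family, cn, current, KeyValue.Type.Put, Bytes.toBytes("3"));
            kv2 = new KeyValue(row, family, dt, current, KeyValue.Type.Put, Bytes.toBytes("6"));
            kv3 = new KeyValue(row, family, ic, current, KeyValue.Type.Put, Bytes.toBytes("8"));
            kv4 = new KeyValue(row, family, ifs, current, KeyValue.Type.Put, Bytes.toBytes("7"));
            kv5 = new KeyValue(row, family, ip, current, KeyValue.Type.Put, Bytes.toBytes("4"));
            kv6 = new KeyValue(row, family, le, current, KeyValue.Type.Put, Bytes.toBytes("2"));
            kv7 = new KeyValue(row, family, mn, current, KeyValue.Type.Put, Bytes.toBytes("5"));
            kv8 = new KeyValue(row, family, pi, current, KeyValue.Type.Put, Bytes.toBytes("1"));

            writer.append(kv1);
            writer.append(kv2);
            writer.append(kv3);
            writer.append(kv4);
            writer.append(kv5);
            writer.append(kv6);
            writer.append(kv7);
            writer.append(kv8);
        }
        writer.close();

        // Load the generated HFile into HBase
        HTable table = new HTable(conf, tableName);
        LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
        loader.doBulkLoad(dir, table);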
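The row key in the code above is the current time in milliseconds followed by a counter formatted with `DecimalFormat("0000000")`. Since an HFile writer expects cells in ascending key order, the zero padding matters: without it, a key ending in "10" would sort before one ending in "9". The sketch below checks this with plain Java. It is an illustration under assumptions: the timestamp is fixed so the run is repeatable, `compareBytes` is a hypothetical stand-in for an unsigned byte-wise comparison such as `Bytes.compareTo`, and the class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.text.DecimalFormat;

// Hypothetical demo class; not part of the original post.
class RowKeyOrderDemo {
    // Unsigned lexicographic comparison of two byte arrays
    // (assumed stand-in for HBase's byte-wise key ordering).
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Builds row keys as <millis><counter>, with or without zero padding.
    static String[] buildKeys(long millis, int count, boolean padded) {
        DecimalFormat df = new DecimalFormat("0000000");
        String[] keys = new String[count];
        for (int i = 0; i < count; i++) {
            keys[i] = millis + (padded ? df.format(i) : String.valueOf(i));
        }
        return keys;
    }

    // True if every key is strictly larger than the one before it.
    static boolean isStrictlyAscending(String[] keys) {
        for (int i = 1; i < keys.length; i++) {
            byte[] prev = keys[i - 1].getBytes(StandardCharsets.UTF_8);
            byte[] cur = keys[i].getBytes(StandardCharsets.UTF_8);
            if (compareBytes(prev, cur) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long millis = 1400000000000L; // fixed timestamp for a repeatable run
        System.out.println("padded ascending:   " + isStrictlyAscending(buildKeys(millis, 12, true)));
        System.out.println("unpadded ascending: " + isStrictlyAscending(buildKeys(millis, 12, false)));
    }
}
```

The eight column qualifiers in the post (cn, dt, ic, if, ip, le, mn, pi) are already in lexicographic order, which is why the cells of each row can be appended in the order they are declared.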

  

  Finally, a note on inspecting HFiles: comparing a known-good HFile against the one you generated yourself makes problems easier to track down.
 