Recently I ran into a case where an application, concurrently fetching HBase data with a very large set of columns specified, hung and stopped responding. This post is a write-up of the investigation.
Background: in refund export, to obtain the item SKU code we need to pull data from HBase table T. T stores item data in columns named table:field:id. Because the table is very large and shared with the item-detail service, scanByPrefixFilter was not a comfortable option: we worried it would destabilize access to the table and in turn hurt the overall stability of both detail and export.
To fetch the specified column fields of multiple orders via multiGet, we have to generate the column-name set dynamically and pass it to the HBase read API. For example, if order E contains three item IDs I001, I002 and I003, the table name is item and the field name is sku, we need to generate the column names item:sku:I001, item:sku:I002 and item:sku:I003.
Given a List&lt;Record&gt;, where each Record has an id field and corresponds to one order, we can extract the id values and fill them into the column template tablename:fieldname:id to produce the set of HBase column names to fetch.
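A minimal sketch of this generation step (the Record type and the generateCols signature below are illustrative assumptions, not the project's actual code):

```java
import java.util.List;
import java.util.stream.Collectors;

public class ColumnNameDemo {

    // Hypothetical stand-in for the project's Record type.
    record Record(String id) {}

    // Build "tablename:fieldname:id" column names for every record.
    static List<String> generateCols(String table, String field, List<Record> records) {
        return records.stream()
                .map(r -> table + ":" + field + ":" + r.id())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Record> records = List.of(new Record("I001"), new Record("I002"), new Record("I003"));
        // Prints [item:sku:I001, item:sku:I002, item:sku:I003]
        System.out.println(generateCols("item", "sku", records));
    }
}
```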
However, when the specified HBase column-name set got large, something clearly went wrong: the heap memory blew up.
The CPU curve spiked sharply at the same time.
The problem could be reproduced easily in the pre-production environment, which made troubleshooting much more convenient.
The first task in troubleshooting is to narrow the scope: find out which change introduced the problem. The error logs made it obvious that the HBase fetch was stuck, and the change in question had added a plugin that concurrently fetches data for a specified set of HBase columns. The plugin reuses the data-fetching capability of the existing HAHBaseService, which has always run stably in production; the difference is that this time it specifies a large set of column names in the query. Could HBase have trouble fetching data when a large column-name set is specified? I asked our data expert, Brother Yuan, and the answer was no. Then why? Time for an experiment.
The original code:
```java
private List<Result> fetchDataFromHBase(List<OneRecord> data, List<String> rowKeys,
                                        HBaseDataConf hbaseDataConf) {
    List<Result> hbaseResults = multiTaskExecutor.exec(rowKeys,
            subRowkeys -> haHbaseService.getRawData(subRowkeys, hbaseDataConf.getTable(), "cf",
                    generateCols(hbaseDataConf.getFetchDataConf(), data), "", true),
            200);
    return hbaseResults;
}
```
It uses a generic concurrent-fetch helper, multiTaskExecutor.exec, which only needs to be given a processing function; see the "extracting concurrent processing" section of 「精練代碼:一次Java函數式編程的重構之旅」 for details.
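For context, a helper like multiTaskExecutor.exec boils down to "partition the list, run the function on each partition in a thread pool, merge the results". A rough sketch under that assumption (not the actual implementation from that article):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class MultiTaskExecutor {

    private final ExecutorService pool = Executors.newFixedThreadPool(8);

    // Split `data` into chunks of `batchSize`, apply `task` to each chunk concurrently,
    // and flatten the per-chunk results into one list.
    public <T, R> List<R> exec(List<T> data, Function<List<T>, List<R>> task, int batchSize)
            throws InterruptedException, ExecutionException {
        List<Future<List<R>>> futures = new ArrayList<>();
        for (int i = 0; i < data.size(); i += batchSize) {
            List<T> chunk = data.subList(i, Math.min(i + batchSize, data.size()));
            futures.add(pool.submit(() -> task.apply(chunk)));
        }
        List<R> results = new ArrayList<>();
        for (Future<List<R>> f : futures) {
            results.addAll(f.get()); // propagates failures from any chunk
        }
        return results;
    }
}
```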
The problem is in the line `subRowkeys -> haHbaseService.getRawData(subRowkeys, hbaseDataConf.getTable(), "cf", generateCols(hbaseDataConf.getFetchDataConf(), data), "", true)`. Here data is the full record set, so generateCols builds the column set for the item IDs of all orders, while subRowkeys is only the sub-list of HBase rowkeys produced by splitting the work into tasks. That means every subtask fetches HBase data with the complete column set. If data has 8,000 records and each subtask handles 200 rowkeys, that is 40 tasks, and each task's generateCols(hbaseDataConf.getFetchDataConf(), data) call builds a dynamic column set with tens of thousands of entries. Clearly, the data passed to generateCols should be only the sub-records corresponding to the split subRowkeys, not the full record set: the dynamic column count should be 200 × (number of specified column fields), not 8,000 × (number of specified column fields).
Let's first try shrinking the column set and see whether that fixes the problem.
The modified code:
```java
private List<Result> fetchDataFromHBase(List<OneRecord> data, HBaseDataConf hbaseDataConf) {
    List<Result> hbaseResults = multiTaskExecutor.exec(data,
            partData -> fetchDataFromHBasePartially(partData, hbaseDataConf), 200);
    return hbaseResults;
}

private List<Result> fetchDataFromHBasePartially(List<OneRecord> partData,
                                                 HBaseDataConf hbaseDataConf) {
    List<String> rowKeys = RowkeyUtil.buildRowKeys(partData, hbaseDataConf.getRowkeyConf());
    logger.info("hbase-rowkeys: {}", rowKeys.size());
    return haHbaseService.getRawData(rowKeys, hbaseDataConf.getTable(), "cf",
            generateCols(hbaseDataConf.getFetchDataConf(), partData), "", true);
}
```
Now the dynamic column set built by generateCols corresponds only to the split sub-records. With this change, the problem was gone.
So why did HBase blow up the heap when tens of thousands of column names were specified? Does HBase simply not support fetching a large set of specified columns?
Debug logging is the first tool of troubleshooting. Add logs around the HBase fetch:
```java
String cf = (cfName == null) ? "cf" : cfName;
logger.info("columns: {}", columns);
List<Get> gets = buildGets(rowKeyList, cf, columns, columnPrefixFilters);
logger.info("after buildGet: {}", gets.size());
Result[] results = getFromHbaseFunc.apply(tableName, gets);
logger.info("after getHBase: {}", results.length);
```
The result: the columns line was printed, but after buildGet never appeared; the program was stuck. We can infer that the buildGets step hung, which was not what I expected. I had assumed buildGets was an unlikely culprit and that the data fetch itself was the more probable one, but reality says plainly: buildGets is stuck. And since that step is pure CPU work, it fits the earlier CPU spike perfectly.
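For reference, a buildGets-style helper is presumably little more than a loop that calls Get.addColumn from the standard HBase client API for every rowkey × column pair. A minimal sketch under that assumption (the actual implementation may differ):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Get;

public class GetBuilder {

    // Build one Get per rowkey; every Get carries the full set of specified columns.
    // Get.addColumn puts each qualifier into a NavigableSet (a TreeSet), so this loop
    // performs rowKeys.size() * columns.size() sorted-set insertions.
    static List<Get> buildGets(List<String> rowKeys, String cf, List<String> columns) {
        byte[] family = cf.getBytes(StandardCharsets.UTF_8);
        List<Get> gets = new ArrayList<>(rowKeys.size());
        for (String rowKey : rowKeys) {
            Get get = new Get(rowKey.getBytes(StandardCharsets.UTF_8));
            for (String column : columns) {
                get.addColumn(family, column.getBytes(StandardCharsets.UTF_8));
            }
            gets.add(get);
        }
        return gets;
    }
}
```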
Let's write a unit test and run a small experiment. First a serial one: 1,000 orders, with the column count growing from 2,000 to 24,000.
@Test def "testMultiGetsSerial"() { expect: def columnSize = 12 def rowkeyNums = 1000 def rowkeys = (1..rowkeyNums).collect { "E001" + it } (1..columnSize).each { colsSize -> def columns = (1..(colsSize*2000)).collect { "item:sku:" + it } def start = System.currentTimeMillis() List<Get> gets = new HAHbaseService().invokeMethod("buildGets", [rowkeys, "cf", columns, null]) gets.size() == rowkeyNums def end = System.currentTimeMillis() def cost = end - start println "num = $rowkeyNums , colsSize = ${columns.size()}, cost (ms) = $cost" } }
The timings:
```
num = 1000 , colsSize = 2000, cost (ms) = 2143
num = 1000 , colsSize = 4000, cost (ms) = 3610
num = 1000 , colsSize = 6000, cost (ms) = 5006
num = 1000 , colsSize = 8000, cost (ms) = 8389
num = 1000 , colsSize = 10000, cost (ms) = 8921
num = 1000 , colsSize = 12000, cost (ms) = 12467
num = 1000 , colsSize = 14000, cost (ms) = 11845
num = 1000 , colsSize = 16000, cost (ms) = 12589
num = 1000 , colsSize = 18000, cost (ms) = 20068
java.lang.OutOfMemoryError: GC overhead limit exceeded
```
Next, an experiment reflecting the actual concurrent execution: 1,000 to 4,000 orders, with column sets from 1,000 to 9,000 entries (matching the test parameters below), building the Gets concurrently.
@Test def "testMultiGetsConcurrent"() { expect: def num = 4 def columnSize = 9 (1..num).each { n -> def rowkeyNums = n*1000 def rowkeys = (1..rowkeyNums).collect { "E001" + it } (1..columnSize).each { colsSize -> def columns = (1..(colsSize*1000)).collect { "tc_order_item:sku_code:" + it } def start = System.currentTimeMillis() List<Get> gets = taskExecutor.exec( rowkeys, { new HAHbaseService().invokeMethod("buildGets", [it, "cf", columns, null]) } as Function, 200) gets.size() == rowkeyNums def end = System.currentTimeMillis() def cost = end - start println "num = $rowkeyNums , colsSize = ${columns.size()}, cost (ms) = $cost" println "analysis:$rowkeyNums,${columns.size()},$cost" } } }
The timings:
```
num = 1000 , colsSize = 1000, cost (ms) = 716
num = 1000 , colsSize = 2000, cost (ms) = 1180
num = 1000 , colsSize = 3000, cost (ms) = 1378
num = 1000 , colsSize = 4000, cost (ms) = 2632
num = 1000 , colsSize = 5000, cost (ms) = 2130
num = 1000 , colsSize = 6000, cost (ms) = 4328
num = 1000 , colsSize = 7000, cost (ms) = 4524
num = 1000 , colsSize = 8000, cost (ms) = 5612
num = 1000 , colsSize = 9000, cost (ms) = 5804
num = 2000 , colsSize = 1000, cost (ms) = 1416
num = 2000 , colsSize = 2000, cost (ms) = 1486
num = 2000 , colsSize = 3000, cost (ms) = 2434
num = 2000 , colsSize = 4000, cost (ms) = 4925
num = 2000 , colsSize = 5000, cost (ms) = 5176
num = 2000 , colsSize = 6000, cost (ms) = 7217
num = 2000 , colsSize = 7000, cost (ms) = 9298
num = 2000 , colsSize = 8000, cost (ms) = 11979
num = 2000 , colsSize = 9000, cost (ms) = 20156
num = 3000 , colsSize = 1000, cost (ms) = 1837
num = 3000 , colsSize = 2000, cost (ms) = 2460
num = 3000 , colsSize = 3000, cost (ms) = 4516
num = 3000 , colsSize = 4000, cost (ms) = 7556
num = 3000 , colsSize = 5000, cost (ms) = 6169
num = 3000 , colsSize = 6000, cost (ms) = 19211
num = 3000 , colsSize = 7000, cost (ms) = 180950
……
```
As the numbers show, cost grows roughly linearly with the number of rowkeys, but as the specified column set grows, the cost rises super-linearly and fluctuates. The super-linear growth comes from the algorithm; the fluctuation is presumably caused by thread-pool scheduling.
With 8,800 orders and 24,000 specified columns, you can imagine how slow it gets. Even God would be stuck in the queue.
Looking at the buildGets code, the prime suspect is the addColumn method. When a column is added, it is inserted into a NavigableSet&lt;byte[]&gt;. NavigableSet is a sorted set, and the implementation HBase uses is TreeSet, which is backed by a red-black tree. Finding an element in a red-black tree costs O(log n), so inserting N elements costs N · O(log N). Adding a very large number of columns therefore burns a lot of CPU, and concurrency only amplifies it.
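This cost is easy to observe in isolation, without HBase at all: insert qualifiers sharing a prefix into a TreeSet&lt;byte[]&gt; and time it. A minimal sketch (assumes Java 9+ for Arrays::compare; absolute numbers will vary by machine):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.TreeSet;

public class TreeSetInsertCost {
    public static void main(String[] args) {
        for (int n = 2_000; n <= 24_000; n += 2_000) {
            // Same idea as HBase's Bytes.BYTES_COMPARATOR: lexicographic byte order.
            TreeSet<byte[]> qualifiers = new TreeSet<>(Arrays::compare);
            long start = System.nanoTime();
            for (int i = 0; i < n; i++) {
                // Every qualifier shares the "item:sku:" prefix, so each of the
                // O(log n) comparisons per insert re-compares the same prefix bytes.
                qualifiers.add(("item:sku:" + i).getBytes(StandardCharsets.UTF_8));
            }
            long costMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("columns = " + n + ", insert cost (ms) = " + costMs);
        }
    }
}
```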
Then why does HBase keep the specified column set in a sorted Set rather than an ordinary Set? When fetching data for specified columns, HBase takes each candidate cell and matches it against the specified columns, which requires column lookup. The qualifiers are byte[], and Java arrays use identity-based equals/hashCode, so a plain HashSet could not match qualifiers by content at all; a TreeSet with a byte comparator matches by content, and its sorted order lets the matcher walk the cells, which arrive in sorted order, in step with the qualifiers.
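A small demonstration of why an ordinary HashSet is a non-starter for byte[] qualifiers, while a comparator-based TreeSet works:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class ByteArraySetDemo {
    public static void main(String[] args) {
        byte[] a = "item:sku:I001".getBytes(StandardCharsets.UTF_8);
        byte[] b = "item:sku:I001".getBytes(StandardCharsets.UTF_8);

        Set<byte[]> hashSet = new HashSet<>();
        hashSet.add(a);
        // false: byte[] uses identity equals/hashCode, so an equal-content
        // qualifier is not found — useless for column matching.
        System.out.println(hashSet.contains(b));

        Set<byte[]> treeSet = new TreeSet<>(Arrays::compare); // content-based byte order
        treeSet.add(a);
        // true: the comparator compares contents, as HBase's Bytes.BYTES_COMPARATOR does.
        System.out.println(treeSet.contains(b));
    }
}
```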
Back to the serial test testMultiGetsSerial: for each column count, print the size of the familyMap column structure inside each generated Get:
```java
try {
    ObjectInfo objectInfo = new ClassIntrospector().introspect(gets.get(0).getFamilyMap());
    System.out.println("columnSize: " + columns.size() + ", columnMap: " + objectInfo.getDeepSize());
} catch (IllegalAccessException e) {
}
```
The output:
```
columnSize: 2000, columnMap: 137112
columnSize: 4000, columnMap: 275112
columnSize: 6000, columnMap: 413112
columnSize: 8000, columnMap: 551112
columnSize: 10000, columnMap: 689112
columnSize: 12000, columnMap: 829112
columnSize: 14000, columnMap: 969112
columnSize: 16000, columnMap: 1109112
columnSize: 18000, columnMap: 1249112
columnSize: 20000, columnMap: 1389112
columnSize: 22000, columnMap: 1529112
```
In other words, with 22,000 specified column names, the column structure of every single Get occupies about 1.46 MB, roughly 69 bytes per column on average. 1,000 orders then occupy about 1.46 GB, and in the serial case 8,000 orders would occupy about 11.7 GB. If that memory is not released in time, the heap inevitably blows up.
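Spelling out the arithmetic behind these figures (binary units, hence slightly below the rounded numbers above):

```java
public class MemoryEstimate {
    public static void main(String[] args) {
        long perGetBytes = 1_529_112L;             // measured deep size of one familyMap at 22,000 columns
        double perColumn = perGetBytes / 22_000.0; // ≈ 69.5 bytes per column
        double perGetMiB = perGetBytes / (1024.0 * 1024);                 // ≈ 1.46 MiB per Get
        double gib1000 = 1_000L * perGetBytes / (1024.0 * 1024 * 1024);   // ≈ 1.42 GiB for 1,000 Gets
        double gib8000 = 8_000L * perGetBytes / (1024.0 * 1024 * 1024);   // ≈ 11.4 GiB for 8,000 Gets
        System.out.printf("per column: %.1f B, per Get: %.2f MiB, 1000 Gets: %.2f GiB, 8000 Gets: %.2f GiB%n",
                perColumn, perGetMiB, gib1000, gib8000);
    }
}
```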
The column structure of an HBase Get is Map&lt;byte[], NavigableSet&lt;byte[]&gt;&gt; familyMap, and NavigableSet is implemented on top of TreeMap. Adding a large number of column names is therefore a red-black-tree construction process that involves a huge amount of comparison work (the column names share the same prefix, so the prefix bytes are re-compared on every comparison). It is CPU-intensive, which is why the CPU curve spiked. From the timings above, each additional column adds roughly 1 ms to the total build time for the 1,000 Gets, roughly independent of how many elements the tree already holds; 22,000 columns therefore cost around 20 s.
The implementation of TreeMap and red-black trees will be discussed in a dedicated article.
Below is a utility found online for measuring the memory footprint of an object.
```java
package zzz.study.util;

import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

import sun.misc.Unsafe;

public class ClassIntrospector {

    private static final Unsafe unsafe;
    /** Size of any Object reference */
    private static final int objectRefSize;

    static {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            unsafe = (Unsafe) field.get(null);
            // From an Object[] array we can tell whether oops are compressed (4 bytes) or not (8 bytes)
            objectRefSize = unsafe.arrayIndexScale(Object[].class);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Sizes of all primitive values */
    private static final Map<Class<?>, Integer> primitiveSizes;

    static {
        primitiveSizes = new HashMap<Class<?>, Integer>(10);
        primitiveSizes.put(byte.class, 1);
        primitiveSizes.put(char.class, 2);
        primitiveSizes.put(int.class, 4);
        primitiveSizes.put(long.class, 8);
        primitiveSizes.put(float.class, 4);
        primitiveSizes.put(double.class, 8);
        primitiveSizes.put(boolean.class, 1);
    }

    /**
     * Get object information for any Java object. Do not pass primitives to
     * this method because they will boxed and the information you will get will
     * be related to a boxed version of your value.
     *
     * @param obj Object to introspect
     * @return Object info
     * @throws IllegalAccessException
     */
    public ObjectInfo introspect(final Object obj) throws IllegalAccessException {
        try {
            return introspect(obj, null);
        } finally {
            // clean visited cache before returning in order to make this object reusable
            m_visited.clear();
        }
    }

    // we need to keep track of already visited objects in order to support
    // cycles in the object graphs
    private IdentityHashMap<Object, Boolean> m_visited = new IdentityHashMap<Object, Boolean>(100);

    private ObjectInfo introspect(final Object obj, final Field fld) throws IllegalAccessException {
        // use Field type only if the field contains null. In this case we will
        // at least know what's expected to be stored in this field. Otherwise,
        // if a field has interface type, we won't see what's really stored in it.
        // Besides, we should be careful about primitives, because they are
        // passed as boxed values in this method (first arg is object) -
        // for them we should still rely on the field type.
        boolean isPrimitive = fld != null && fld.getType().isPrimitive();
        boolean isRecursive = false; // will be set to true if we have already seen this object
        if (!isPrimitive) {
            if (m_visited.containsKey(obj))
                isRecursive = true;
            m_visited.put(obj, true);
        }

        final Class<?> type = (fld == null || (obj != null && !isPrimitive))
                ? obj.getClass() : fld.getType();
        int arraySize = 0;
        int baseOffset = 0;
        int indexScale = 0;
        if (type.isArray() && obj != null) {
            baseOffset = unsafe.arrayBaseOffset(type);
            indexScale = unsafe.arrayIndexScale(type);
            arraySize = baseOffset + indexScale * Array.getLength(obj);
        }

        final ObjectInfo root;
        if (fld == null) {
            root = new ObjectInfo("", type.getCanonicalName(), getContents(obj, type), 0,
                    getShallowSize(type), arraySize, baseOffset, indexScale);
        } else {
            final int offset = (int) unsafe.objectFieldOffset(fld);
            root = new ObjectInfo(fld.getName(), type.getCanonicalName(), getContents(obj, type),
                    offset, getShallowSize(type), arraySize, baseOffset, indexScale);
        }

        if (!isRecursive && obj != null) {
            if (isObjectArray(type)) {
                // introspect object arrays
                final Object[] ar = (Object[]) obj;
                for (final Object item : ar)
                    if (item != null)
                        root.addChild(introspect(item, null));
            } else {
                for (final Field field : getAllFields(type)) {
                    if ((field.getModifiers() & Modifier.STATIC) != 0) {
                        continue;
                    }
                    field.setAccessible(true);
                    root.addChild(introspect(field.get(obj), field));
                }
            }
        }

        root.sort(); // sort by offset
        return root;
    }

    // get all fields for this class, including all superclasses fields
    private static List<Field> getAllFields(final Class<?> type) {
        if (type.isPrimitive())
            return Collections.emptyList();
        Class<?> cur = type;
        final List<Field> res = new ArrayList<Field>(10);
        while (true) {
            Collections.addAll(res, cur.getDeclaredFields());
            if (cur == Object.class)
                break;
            cur = cur.getSuperclass();
        }
        return res;
    }

    // check if it is an array of objects. I suspect there must be a more
    // API-friendly way to make this check.
    private static boolean isObjectArray(final Class<?> type) {
        if (!type.isArray())
            return false;
        if (type == byte[].class || type == boolean[].class || type == char[].class
                || type == short[].class || type == int[].class || type == long[].class
                || type == float[].class || type == double[].class)
            return false;
        return true;
    }

    // advanced toString logic
    private static String getContents(final Object val, final Class<?> type) {
        if (val == null)
            return "null";
        if (type.isArray()) {
            if (type == byte[].class)
                return Arrays.toString((byte[]) val);
            else if (type == boolean[].class)
                return Arrays.toString((boolean[]) val);
            else if (type == char[].class)
                return Arrays.toString((char[]) val);
            else if (type == short[].class)
                return Arrays.toString((short[]) val);
            else if (type == int[].class)
                return Arrays.toString((int[]) val);
            else if (type == long[].class)
                return Arrays.toString((long[]) val);
            else if (type == float[].class)
                return Arrays.toString((float[]) val);
            else if (type == double[].class)
                return Arrays.toString((double[]) val);
            else
                return Arrays.toString((Object[]) val);
        }
        return val.toString();
    }

    // obtain a shallow size of a field of given class (primitive or object reference size)
    private static int getShallowSize(final Class<?> type) {
        if (type.isPrimitive()) {
            final Integer res = primitiveSizes.get(type);
            return res != null ? res : 0;
        } else
            return objectRefSize;
    }
}
```
```java
package zzz.study.util;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ObjectInfo {
    /** Field name */
    public final String name;
    /** Field type name */
    public final String type;
    /** Field data formatted as string */
    public final String contents;
    /** Field offset from the start of parent object */
    public final int offset;
    /** Memory occupied by this field */
    public final int length;
    /** Offset of the first cell in the array */
    public final int arrayBase;
    /** Size of a cell in the array */
    public final int arrayElementSize;
    /** Memory occupied by underlying array (shallow), if this is array type */
    public final int arraySize;
    /** This object fields */
    public final List<ObjectInfo> children;

    public ObjectInfo(String name, String type, String contents, int offset, int length,
                      int arraySize, int arrayBase, int arrayElementSize) {
        this.name = name;
        this.type = type;
        this.contents = contents;
        this.offset = offset;
        this.length = length;
        this.arraySize = arraySize;
        this.arrayBase = arrayBase;
        this.arrayElementSize = arrayElementSize;
        children = new ArrayList<ObjectInfo>(1);
    }

    public void addChild(final ObjectInfo info) {
        if (info != null)
            children.add(info);
    }

    /**
     * Get the full amount of memory occupied by a given object. This value may be slightly less than
     * an actual value because we don't worry about memory alignment - possible padding after the last object field.
     *
     * The result is equal to the last field offset + last field length + all array sizes + all child objects deep sizes
     * @return Deep object size
     */
    public long getDeepSize() {
        //return length + arraySize + getUnderlyingSize( arraySize != 0 );
        return addPaddingSize(arraySize + getUnderlyingSize(arraySize != 0));
    }

    long size = 0; // accumulated across recursive getUnderlyingSize calls

    private long getUnderlyingSize(final boolean isArray) {
        //long size = 0;
        for (final ObjectInfo child : children)
            size += child.arraySize + child.getUnderlyingSize(child.arraySize != 0);
        if (!isArray && !children.isEmpty()) {
            int tempSize = children.get(children.size() - 1).offset
                    + children.get(children.size() - 1).length;
            size += addPaddingSize(tempSize);
        }
        return size;
    }

    private static final class OffsetComparator implements Comparator<ObjectInfo> {
        @Override
        public int compare(final ObjectInfo o1, final ObjectInfo o2) {
            return o1.offset - o2.offset; // safe because offsets are small non-negative numbers
        }
    }

    // sort all children by their offset
    public void sort() {
        Collections.sort(children, new OffsetComparator());
    }

    @Override
    public String toString() {
        final StringBuilder sb = new StringBuilder();
        toStringHelper(sb, 0);
        return sb.toString();
    }

    private void toStringHelper(final StringBuilder sb, final int depth) {
        depth(sb, depth).append("name=").append(name).append(", type=").append(type)
                .append(", contents=").append(contents).append(", offset=").append(offset)
                .append(", length=").append(length);
        if (arraySize > 0) {
            sb.append(", arrayBase=").append(arrayBase);
            sb.append(", arrayElemSize=").append(arrayElementSize);
            sb.append(", arraySize=").append(arraySize);
        }
        for (final ObjectInfo child : children) {
            sb.append('\n');
            child.toStringHelper(sb, depth + 1);
        }
    }

    private StringBuilder depth(final StringBuilder sb, final int depth) {
        for (int i = 0; i < depth; ++i)
            sb.append("\t");
        return sb;
    }

    private long addPaddingSize(long size) {
        if (size % 8 != 0) {
            return (size / 8 + 1) * 8;
        }
        return size;
    }
}
```
A rather crude coding mistake blew up the heap, and chasing that mistake gave me a deeper look into what HBase does when a column-name set is specified for a fetch. At its core, this was a problem of data structures and algorithms, which shows how much they still matter in everyday work.
TODO: study the implementation of TreeMap and red-black trees.
[End]