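Don't ask me why; it's out of love, hahaha... On to the topic: while working on a project recently I studied the Hive metastore source code along the way, so here is a summary of what I learned.
The overall architecture of the Hive metastore is shown in the figure:
1. Components:
As the figure shows, the Hive metastore is made up of a client and a server. Let's go through them one by one:
(1) Client
From the code's point of view: there is a crazy amount of it... Starting from the entry point, the Hive class, we can find where the MetaStoreClient is created: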
private IMetaStoreClient createMetaStoreClient() throws MetaException {

  // The loader resolves the HiveMetaHook for a table from its storage handler.
  HiveMetaHookLoader hookLoader = new HiveMetaHookLoader() {
      @Override
      public HiveMetaHook getHook(
        org.apache.hadoop.hive.metastore.api.Table tbl)
        throws MetaException {

        try {
          if (tbl == null) {
            return null;
          }
          HiveStorageHandler storageHandler =
            HiveUtils.getStorageHandler(conf,
              tbl.getParameters().get(META_TABLE_STORAGE));
          if (storageHandler == null) {
            return null;
          }
          return storageHandler.getMetaHook();
        } catch (HiveException ex) {
          LOG.error(StringUtils.stringifyException(ex));
          throw new MetaException(
            "Failed to load storage handler: " + ex.getMessage());
        }
      }
    };
  // The client is wrapped in a retrying proxy.
  return RetryingMetaStoreClient.getProxy(conf, hookLoader, metaCallTimeMap,
      SessionHiveMetaStoreClient.class.getName());
}
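As we can see, creating the MetaStoreClient also creates a HiveMetaHookLoader, which supplies the HiveMetaHook for a table. The hook comes into play whenever the metadata is modified. Take createTable: if the table is not stored as plain files (for example, when integrating with HBase), HiveMetaStoreClient calls the hook's preCreateTable method to prepare before the table is created, which includes checking whether the table is external or managed. If anything fails along the way, the hook's rollbackCreateTable is called to roll back.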
public void createTable(Table tbl, EnvironmentContext envContext) throws AlreadyExistsException,
    InvalidObjectException, MetaException, NoSuchObjectException, TException {
  HiveMetaHook hook = getHook(tbl);
  if (hook != null) {
    hook.preCreateTable(tbl);
  }
  boolean success = false;
  try {
    // Subclasses can override this step (for example, for temporary tables)
    create_table_with_environment_context(tbl, envContext);
    if (hook != null) {
      hook.commitCreateTable(tbl);
    }
    success = true;
  } finally {
    if (!success && (hook != null)) {
      hook.rollbackCreateTable(tbl);
    }
  }
}
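To make the pre/commit/rollback flow in createTable easier to see in isolation, here is a minimal sketch. It assumes a simplified, hypothetical TableHook interface and a Runnable standing in for the metastore call; neither is the real Hive API, only the control flow is taken from the method above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for HiveMetaHook (not the real Hive interface).
interface TableHook {
    void preCreateTable(String table);
    void commitCreateTable(String table);
    void rollbackCreateTable(String table);
}

class HookLifecycleSketch {
    // Records the order of hook calls so the flow can be inspected.
    static class RecordingHook implements TableHook {
        final List<String> calls = new ArrayList<>();
        public void preCreateTable(String table) { calls.add("pre"); }
        public void commitCreateTable(String table) { calls.add("commit"); }
        public void rollbackCreateTable(String table) { calls.add("rollback"); }
    }

    // Same shape as createTable: pre -> create -> commit, with rollback on any failure.
    static void createTable(String table, TableHook hook, Runnable metastoreCall) {
        if (hook != null) {
            hook.preCreateTable(table);
        }
        boolean success = false;
        try {
            metastoreCall.run(); // stands in for create_table_with_environment_context
            if (hook != null) {
                hook.commitCreateTable(table);
            }
            success = true;
        } finally {
            if (!success && hook != null) {
                hook.rollbackCreateTable(table);
            }
        }
    }

    // Runs createTable once and returns the hook calls that were recorded.
    static List<String> run(boolean fail) {
        RecordingHook hook = new RecordingHook();
        try {
            createTable("t", hook, () -> {
                if (fail) {
                    throw new IllegalStateException("metastore call failed");
                }
            });
        } catch (IllegalStateException expected) {
            // The exception still reaches the caller after the rollback has run.
        }
        return hook.calls;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [pre, commit]
        System.out.println(run(true));  // [pre, rollback]
    }
}
```

The success flag plus the finally block means the rollback runs for any kind of failure, including unchecked exceptions, without swallowing the original error.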
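If the HBase table does not exist, you cannot use CREATE EXTERNAL TABLE; it fails with "doesn't exist while the table is declared as an external table". In that case, use a plain CREATE TABLE to create a Hive table that points to HBase.
The CREATE TABLE statement looks like this:
CREATE TABLE hbase_table_1(key int, value string)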
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "tableName", "hbase.mapred.output.outputtable" = "tableName");
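For the code, see the preCreateTable method of HBaseStorageHandler; I won't paste it here.
Now back to the Hive class, which can be seen as the top-level abstraction over all metadata DDL operations. HiveMetaStoreClient implements the IMetaStoreClient interface. When a HiveMetaStoreClient is created, it sets up the connection to the server-side HiveMetaStore and decides whether to create the server locally. Older versions checked hive.metastore.local for this; in the code below the decision is made from hive.metastore.uris via HiveConfUtil.isEmbeddedMetaStore. Here the server is created locally: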
public HiveMetaStoreClient(HiveConf conf, HiveMetaHookLoader hookLoader)
  throws MetaException {

  this.hookLoader = hookLoader;
  if (conf == null) {
    conf = new HiveConf(HiveMetaStoreClient.class);
  }
  this.conf = conf;
  filterHook = loadFilterHooks();

  String msUri = conf.getVar(HiveConf.ConfVars.METASTOREURIS);
  localMetaStore = HiveConfUtil.isEmbeddedMetaStore(msUri);
  if (localMetaStore) {
    // instantiate the metastore server handler directly instead of connecting
    // through the network
    client = HiveMetaStore.newRetryingHMSHandler("hive client", conf, true);
    isConnected = true;
    snapshotActiveConf();
    return;
  }
  // (the rest of the constructor is not included in this excerpt)
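The local-versus-remote decision can be illustrated with a small sketch. It assumes that an empty or missing hive.metastore.uris value means embedded mode, which is how I read the isEmbeddedMetaStore call above; the class name, the helper names and the example URI are all made up for illustration.

```java
class MetastoreModeSketch {
    // Assumption: no configured URI means the metastore runs inside the client process.
    static boolean isEmbedded(String msUri) {
        return msUri == null || msUri.trim().isEmpty();
    }

    // Describes which path the client constructor would take for a given setting.
    static String describe(String msUri) {
        if (isEmbedded(msUri)) {
            return "embedded: call the handler in-process";
        }
        return "remote: connect over Thrift to " + msUri.trim();
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        System.out.println(describe("   "));
        // Hypothetical URI, used only as an example value.
        System.out.println(describe("thrift://metastore-host:9083"));
    }
}
```

In embedded mode the client holds a handler object directly, so a metastore call is an ordinary method call rather than a network round trip.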
Next, the server-side HiveMetaStore.HMSHandler is created. HMSHandler implements IHMSHandler, which in turn extends ThriftHiveMetastore.Iface, and HMSHandler provides the implementation of every externally exposed operation:
public class ThriftHiveMetastore {

  /**
   * This interface is live.
   */
  public interface Iface extends com.facebook.fb303.FacebookService.Iface {

    public String getMetaConf(String key) throws MetaException, org.apache.thrift.TException;

    public void setMetaConf(String key, String value) throws MetaException, org.apache.thrift.TException;

    public void create_database(Database database) throws AlreadyExistsException, InvalidObjectException, MetaException, org.apache.thrift.TException;

    public Database get_database(String name) throws NoSuchObjectException, MetaException, org.apache.thrift.TException;

    public void drop_database(String name, boolean deleteData, boolean cascade) throws NoSuchObjectException, InvalidOperationException, MetaException, org.apache.thrift.TException;

    public List<String> get_databases(String pattern) throws MetaException, org.apache.thrift.TException;

    public List<String> get_all_databases() throws MetaException, org.apache.thrift.TException;

    public void alter_database(String dbname, Database db) throws MetaException, NoSuchObjectException, org.apache.thrift.TException;

    public Type get_type(String name) throws MetaException, NoSuchObjectException, org.apache.thrift.TException;

    public boolean create_type(Type type) throws AlreadyExistsException, InvalidObjectException, MetaException, org.apache.thrift.TException;

    public boolean drop_type(String type) throws MetaException, NoSuchObjectException, org.apache.thrift.TException;
    ......
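In the init method that runs when the HiveMetaStore handler is created, three kinds of listeners are set up to observe each step: MetaStorePreEventListener, MetaStoreEventListener and MetaStoreEndFunctionListener. Before that, any configured MetaStoreInitListener instances are loaded and invoked.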
initListeners = MetaStoreUtils.getMetaStoreListeners(
    MetaStoreInitListener.class, hiveConf,
    hiveConf.getVar(HiveConf.ConfVars.METASTORE_INIT_HOOKS));
for (MetaStoreInitListener singleInitListener: initListeners) {
    MetaStoreInitContext context = new MetaStoreInitContext();
    singleInitListener.onInit(context);
}

String alterHandlerName = hiveConf.get("hive.metastore.alter.impl",
    HiveAlterHandler.class.getName());
alterHandler = (AlterHandler) ReflectionUtils.newInstance(MetaStoreUtils.getClass(
    alterHandlerName), hiveConf);
wh = new Warehouse(hiveConf);

synchronized (HMSHandler.class) {
  if (currentUrl == null || !currentUrl.equals(MetaStoreInit.getConnectionURL(hiveConf))) {
    createDefaultDB();
    createDefaultRoles();
    addAdminUsers();
    currentUrl = MetaStoreInit.getConnectionURL(hiveConf);
  }
}

if (hiveConf.getBoolean("hive.metastore.metrics.enabled", false)) {
  try {
    Metrics.init();
  } catch (Exception e) {
    // log exception, but ignore inability to start
    LOG.error("error in Metrics init: " + e.getClass().getName() + " "
        + e.getMessage(), e);
  }
}

preListeners = MetaStoreUtils.getMetaStoreListeners(MetaStorePreEventListener.class,
    hiveConf,
    hiveConf.getVar(HiveConf.ConfVars.METASTORE_PRE_EVENT_LISTENERS));
listeners = MetaStoreUtils.getMetaStoreListeners(MetaStoreEventListener.class, hiveConf,
    hiveConf.getVar(HiveConf.ConfVars.METASTORE_EVENT_LISTENERS));
listeners.add(new SessionPropertiesListener(hiveConf));
endFunctionListeners = MetaStoreUtils.getMetaStoreListeners(
    MetaStoreEndFunctionListener.class, hiveConf,
    hiveConf.getVar(HiveConf.ConfVars.METASTORE_END_FUNCTION_LISTENERS));
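How the three listener lists might be used around a single operation can be sketched roughly as follows. This is my own simplified reading, not code from Hive: the three nested interfaces are hypothetical stand-ins for MetaStorePreEventListener, MetaStoreEventListener and MetaStoreEndFunctionListener, and the exact call order inside HMSHandler is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerSketch {
    // Hypothetical, simplified listener shapes (the real Hive classes carry event objects).
    interface PreListener { void onPreEvent(String op) throws Exception; }
    interface OpListener { void onEvent(String op, boolean success); }
    interface EndFunctionListener { void onEndFunction(String op, boolean success); }

    final List<PreListener> preListeners = new ArrayList<>();
    final List<OpListener> listeners = new ArrayList<>();
    final List<EndFunctionListener> endFunctionListeners = new ArrayList<>();

    // Pre-listeners run first and may veto by throwing; the others are always notified.
    boolean execute(String op, Runnable body) {
        boolean success = false;
        try {
            for (PreListener p : preListeners) {
                p.onPreEvent(op);
            }
            body.run();
            success = true;
        } catch (Exception e) {
            // This sketch reports failure via the return value instead of rethrowing.
        } finally {
            for (OpListener l : listeners) {
                l.onEvent(op, success);
            }
            for (EndFunctionListener l : endFunctionListeners) {
                l.onEndFunction(op, success);
            }
        }
        return success;
    }

    // Wires one listener of each kind and returns the order in which things happened.
    static List<String> demo(boolean veto) {
        List<String> log = new ArrayList<>();
        ListenerSketch handler = new ListenerSketch();
        handler.preListeners.add(op -> {
            log.add("pre:" + op);
            if (veto) {
                throw new Exception("vetoed");
            }
        });
        handler.listeners.add((op, ok) -> log.add("event:" + op + ":" + ok));
        handler.endFunctionListeners.add((op, ok) -> log.add("end:" + op + ":" + ok));
        handler.execute("create_table", () -> log.add("body"));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```

Keeping the lists separate lets a pre-listener block an operation (authorization checks, for instance) while the other two kinds only observe the outcome.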
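An AlterHandler is created at the same time. It is the interface that HiveAlterHandler implements, and it pulls the alter-table and alter-partition operations out into a separate implementation (altering a table is really complicated...).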
public interface AlterHandler extends Configurable {

  /**
   * handles alter table
   *
   * @param msdb
   *          object to get metadata
   * @param wh
   *          TODO
   * @param dbname
   *          database of the table being altered
   * @param name
   *          original name of the table being altered. same as
   *          <i>newTable.tableName</i> if alter op is not a rename.
   * @param newTable
   *          new table object
   * @throws InvalidOperationException
   *           thrown if the newTable object is invalid
   * @throws MetaException
   *           thrown if there is any other error
   */
  public abstract void alterTable(RawStore msdb, Warehouse wh, String dbname,
      String name, Table newTable) throws InvalidOperationException,
      MetaException;

  /**
   * handles alter table, the changes could be cascaded to partitions if applicable
   *
   * @param msdb
   *          object to get metadata
   * @param wh
   *          Hive Warehouse where table data is stored
   * @param dbname
   *          database of the table being altered
   * @param name
   *          original name of the table being altered. same as
   *          <i>newTable.tableName</i> if alter op is not a rename.
   * @param newTable
   *          new table object
   * @param cascade
   *          if the changes will be cascaded to its partitions if applicable
   * @throws InvalidOperationException
   *           thrown if the newTable object is invalid
   * @throws MetaException
   *           thrown if there is any other error
   */
  public abstract void alterTable(RawStore msdb, Warehouse wh, String dbname,
      String name, Table newTable, boolean cascade) throws InvalidOperationException,
      MetaException;
  // (the rest of the interface is not included in this excerpt)
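Most important of all is the creation of the RawStore. RawStore defines the full set of low-level physical operations, and its implementation, ObjectStore, uses JDO to persist each object as a row in a table. The transaction mechanism in ObjectStore is also built on the transactions that JDO provides: if the commit fails, all operations are rolled back.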
@Override
public void createDatabase(Database db) throws InvalidObjectException, MetaException {
  boolean commited = false;
  MDatabase mdb = new MDatabase();
  mdb.setName(db.getName().toLowerCase());
  mdb.setLocationUri(db.getLocationUri());
  mdb.setDescription(db.getDescription());
  mdb.setParameters(db.getParameters());
  mdb.setOwnerName(db.getOwnerName());
  PrincipalType ownerType = db.getOwnerType();
  mdb.setOwnerType((null == ownerType ? PrincipalType.USER.name() : ownerType.name()));
  try {
    openTransaction();
    pm.makePersistent(mdb);
    commited = commitTransaction();
  } finally {
    if (!commited) {
      rollbackTransaction();
    }
  }
}

@SuppressWarnings("nls")
private MDatabase getMDatabase(String name) throws NoSuchObjectException {
  MDatabase mdb = null;
  boolean commited = false;
  try {
    openTransaction();
    name = HiveStringUtils.normalizeIdentifier(name);
    Query query = pm.newQuery(MDatabase.class, "name == dbname");
    query.declareParameters("java.lang.String dbname");
    query.setUnique(true);
    mdb = (MDatabase) query.execute(name);
    pm.retrieve(mdb);
    commited = commitTransaction();
  } finally {
    if (!commited) {
      rollbackTransaction();
    }
  }
  if (mdb == null) {
    throw new NoSuchObjectException("There is no database named " + name);
  }
  return mdb;
}
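The open/commit/rollback idiom used by both methods can be tried out without JDO. The sketch below assumes a hypothetical in-memory store where writes are staged and only become visible on commit; TxSketch, failNextCommit and the example paths are invented for illustration and are not part of ObjectStore.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class TxSketch {
    private final Map<String, String> committed = new HashMap<>();
    private Map<String, String> staged; // non-null only while a transaction is open
    boolean failNextCommit = false;     // test switch to simulate a failed commit

    void openTransaction() {
        staged = new HashMap<>();
    }

    // Returns true only if the staged writes were actually applied.
    boolean commitTransaction() {
        if (failNextCommit) {
            failNextCommit = false;
            return false;
        }
        committed.putAll(staged);
        staged = null;
        return true;
    }

    void rollbackTransaction() {
        staged = null; // drop everything written since openTransaction
    }

    // Same idiom as createDatabase above: the flag decides whether finally rolls back.
    void createDatabase(String name, String locationUri) {
        boolean ok = false;
        try {
            openTransaction();
            staged.put(name.toLowerCase(Locale.ROOT), locationUri);
            ok = commitTransaction();
        } finally {
            if (!ok) {
                rollbackTransaction();
            }
        }
    }

    boolean exists(String name) {
        return committed.containsKey(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        TxSketch store = new TxSketch();
        store.createDatabase("Sales", "/warehouse/sales.db");
        store.failNextCommit = true;
        store.createDatabase("tmp", "/warehouse/tmp.db");
        System.out.println(store.exists("sales")); // true
        System.out.println(store.exists("tmp"));   // false
    }
}
```

Because the flag is only set from the return value of commitTransaction, an exception thrown before the commit and a commit that reports failure both end up in the same rollback branch.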
That's all for tonight... I worked this out on my own, so if anything is wrong, please point it out. Thanks~