FROM: http://blog.csdn.net/hi_box/article/details/40820341
First, create a table with the most basic CREATE TABLE statement:
- hive>create table test(id int, name string) row format delimited fields terminated by ',';
Test an insert:
- insert into table test values (1,'row1'),(2,'row2');
It fails with an error:
- java.io.FileNotFoundException: File does not exist: hdfs:
- apache-hive-0.14.0-SNAPSHOT-bin/lib/curator-client-2.6.0.jar
- at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
- at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
- at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
- at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
- at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
- at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
- at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
- at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
- at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
- at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
- at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
- at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
- at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
- at java.security.AccessController.doPrivileged(Native Method)
- ......
It seems Hive is looking for the jar files on HDFS. A minor issue: just upload the jars under lib to HDFS:
- hadoop fs -mkdir -p /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
- hadoop fs -put $HIVE_HOME/lib/* /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
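As an optional sanity check, you can list the uploaded directory to confirm that the jar named in the error message is now on HDFS (same path as in the commands above):
- hadoop fs -ls /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/ | grep curator-client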
Run the insert again and it works fine. Next, test a delete:
- hive>delete from test where id = 1;
It throws an error!
- FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
It says the transaction manager currently in use does not support update and delete operations.
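You can check which transaction manager is active from the Hive CLI (a quick check; on a default setup this typically shows DummyTxnManager, which does not support these operations):
- hive> set hive.txn.manager;
- hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager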
It turns out that supporting update and delete requires some additional configuration; see:
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-NewConfigurationParametersforTransactions
Configure hive-site.xml as instructed (a sample hive-site.xml snippet follows the list):
- hive.support.concurrency – true
- hive.enforce.bucketing – true
- hive.exec.dynamic.partition.mode – nonstrict
- hive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
- hive.compactor.initiator.on – true
- hive.compactor.worker.threads – 1
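For reference, here is roughly how these entries look in hive-site.xml (a minimal sketch built from the values listed above; adjust hive.compactor.worker.threads to suit your cluster):
- <property>
- <name>hive.support.concurrency</name>
- <value>true</value>
- </property>
- <property>
- <name>hive.enforce.bucketing</name>
- <value>true</value>
- </property>
- <property>
- <name>hive.exec.dynamic.partition.mode</name>
- <value>nonstrict</value>
- </property>
- <property>
- <name>hive.txn.manager</name>
- <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
- </property>
- <property>
- <name>hive.compactor.initiator.on</name>
- <value>true</value>
- </property>
- <property>
- <name>hive.compactor.worker.threads</name>
- <value>1</value>
- </property>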
With these configured I expected things to run smoothly, but instead it started throwing the following error:
- FAILED: LockException [Error 10280]: Error communicating with the metastore
Something is wrong with the metastore database. Change the log level to DEBUG to see the specific error:
- 2014-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(265)) - Going to execute query <select cq_id,
- cq_database, cq_table, cq_partition, cq_type, cq_run_as from COMPACTION_QUEUE where cq_state = 'r'>
- 2014-11-04 14:20:14,367 ERROR [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(285)) - Unable to select next element for cleaning,
- Table 'hive.COMPACTION_QUEUE' doesn't exist
- 2014-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(287)) - Going to rollback
- 2014-11-04 14:20:14,368 ERROR [Thread-8]: compactor.Cleaner (Cleaner.java:run(143)) - Caught an exception in the main loop of compactor cleaner, MetaException(message
- :Unable to connect to transaction database com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'hive.COMPACTION_QUEUE' doesn't exist
- at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source)
- at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
- at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
- at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
The COMPACTION_QUEUE table cannot be found in the metastore database. A quick check in MySQL confirms the table really is not there. Why would it be missing? After searching for a long time without finding the cause, it was time to read the source code.
The CREATE TABLE statements live in the TxnDbUtil class under org.apache.hadoop.hive.metastore.txn. Following the trail, the method below is what invokes them:
- private void checkQFileTestHack() {
-   boolean hackOn = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) ||
-       HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEZ_TEST);
-   if (hackOn) {
-     LOG.info("Hacking in canned values for transaction manager");
-     // TxnDbUtil.prepDb() runs the CREATE TABLE statements for the transaction
-     // tables (COMPACTION_QUEUE among them).
-     TxnDbUtil.setConfValues(conf);
-     try {
-       TxnDbUtil.prepDb();
-     } catch (Exception e) {
-       // Ignored when the tables already exist.
-       if (!e.getMessage().contains("already exists")) {
-         throw new RuntimeException("Unable to set up transaction database for" +
-             " testing: " + e.getMessage());
-       }
-     }
-   }
- }
What does this mean? Running those CREATE TABLE statements has an extra precondition: HIVE_IN_TEST or HIVE_IN_TEZ_TEST must be set. In other words, delete and update are only enabled in test environments, which is understandable given that the feature was not yet fully developed.
The cause is finally found, and the fix is simple: add the following to hive-site.xml:
- <property>
- <name>hive.in.test</name>
- <value>true</value>
- </property>
OK, restart the services and run the delete again:
- hive>delete from test where id = 1;
Another error:
- FAILED: SemanticException [Error 10297]: Attempt to do update or delete on table default.test that does not use an AcidOutputFormat or is not bucketed
It says the table test being deleted from does not use an AcidOutputFormat or is not bucketed. So presumably the output format must be an AcidOutputFormat and the table must be bucketed.
A search online confirms this: at the moment only the ORC file format supports AcidOutputFormat, and on top of that the table must be created with the property ('transactional' = 'true'). It feels like quite a hassle...
So, create the table following the example found online:
- hive>create table test(id int, name string) clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
Insert:
- hive>insert into table test values (1,'row1'),(2,'row2'),(3,'row3');
Delete:
- hive>delete from test where id = 1;
Update:
- hive>update test set name = 'Raj' where id = 2;
OK! Everything now runs, though it seems quite slow, roughly 30 seconds per statement. There is presumably room for tuning; something to look into further.
One last problem: show tables reports an error:
- hive> show tables;
- OK
- tab_name
- Failed with exception java.io.IOException:java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: fcitx-socket-:0
- Time taken: 0.064 seconds
It appears to be related to the fcitx-socket-:0 file name under /tmp/. Still to be resolved...