Hive Extended Features (9): Row-Level Update Operations in Hive (Update)

Software environment:

Linux: CentOS 6.7
Hadoop version: 2.6.5
ZooKeeper version: 3.4.8


Host configuration:

There are three hosts in total: m1, m2, and m3. The user name on every host is centos.

192.168.179.201: m1 
192.168.179.202: m2 
192.168.179.203: m3 

m1: Zookeeper, Namenode, DataNode, ResourceManager, NodeManager, Master, Worker
m2: Zookeeper, Namenode, DataNode, ResourceManager, NodeManager, Worker
m3: Zookeeper, DataNode, NodeManager, Worker

References:

Official documentation:
Update reference  <=>      https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
Join reference    <=>      https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins

Online references:
Update reference  <=>      http://www.aboutyun.com/thread-12155-1-1.html


I. Configuring Hive for Update

1. Edit the hive-site.xml file:

<property>
    <name>hive.optimize.sort.dynamic.partition</name>
    <value>false</value>
</property>
<property>
    <name>hive.support.concurrency</name>
    <value>true</value>
</property>
<property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
</property>
<property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
</property>
<property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
</property>
<property>
    <name>hive.in.test</name>
    <value>true</value>
</property>
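
Some of the properties above are client-side settings, so for a quick experiment they can also be set for a single session instead of in hive-site.xml. The following is a minimal sketch, assuming a Hive CLI or Beeline session. The two hive.compactor.* properties are read by the metastore service, so they are left in hive-site.xml, and hive.in.test is omitted here because it is a testing flag.

```sql
-- Session-level equivalents of the client-side properties above (a sketch,
-- not a replacement for the hive-site.xml configuration).
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;  -- per the Hive transactions docs, not required as of Hive 2.0
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Print the effective value of a property to confirm the setting took effect.
SET hive.txn.manager;
```

Session-level settings disappear when the session ends, which is why the article keeps them in hive-site.xml.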


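II. Update syntax

1. Table creation statement

Hive places specific requirements on any table that is to be updated:
(1) The table must be created with buckets (a CLUSTERED BY ... INTO n BUCKETS clause).
(2) The table must use a supported storage format. Other formats, such as Parquet, are not supported yet; currently only the ORC file format and AcidOutputFormat are supported.
(3) The table must be created with the table property ('transactional' = 'true').
Example: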

create table student (id bigint,name string) clustered by (name) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
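
To check that the table was created with the three requirements listed above, its definition can be inspected after creation. This is a small sketch using standard Hive statements; the exact layout of the output depends on the Hive version.

```sql
-- Show the stored definition: look for CLUSTERED BY (name) INTO 2 BUCKETS,
-- the ORC input/output formats, and 'transactional'='true' in TBLPROPERTIES.
SHOW CREATE TABLE student;

-- Alternative view of the same metadata, including the table parameters.
DESCRIBE FORMATTED student;
```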

2. Update statement:

update student set id=444 where name='tom';
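
The update can be tried end to end with a few rows. The sketch below assumes the student table from the example above and uses made-up sample rows ('tom' and 'jerry' are illustrative values, not data from the original setup).

```sql
-- Hypothetical sample rows; INSERT ... VALUES is available from Hive 0.14.
INSERT INTO TABLE student VALUES (1, 'tom'), (2, 'jerry');

-- Update the non-bucketing column id for the matching row.
UPDATE student SET id = 444 WHERE name = 'tom';

-- The row with name 'tom' should now show id 444; 'jerry' keeps id 2.
SELECT id, name FROM student;
```

Note that the statement changes id rather than name: Hive does not allow bucketing columns (here name) or partitioning columns to be updated.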