How to Dynamically Partition a Hive Table by a Column

Date: 2019-11-13

Tags: hive, dynamic partitioning by a table column. Category: Hadoop

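When storing data in Hive, you usually need to partition it. For example, if you receive data from Kafka and store each day's data in its own partition (partitioning by day), the partition should be chosen dynamically from a column in the data. The alternative is to write the data to a temporary directory and then load it into one specific, explicitly named partition, which is static partitioning.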

The steps for dynamic partitioning in Hive are as follows:

1. Create a source table to simulate the data source, and insert some data

create table t_test_p_source (
    id string,
    name string,
    birthday string
) 
row format delimited fields terminated by '\t'
stored as textfile;

insert into t_test_p_source values ('a1', 'zhangsan', '2018-01-01');
insert into t_test_p_source values ('a2', 'lisi', '2018-01-02');
insert into t_test_p_source values ('a3', 'zhangsan', '2018-01-03');
insert into t_test_p_source values ('a4', 'wangwu', '2018-01-04');
insert into t_test_p_source values ('a5', 'sanzang', '2018-01-05');
insert into t_test_p_source values ('a6', 'zhangsan2', '2018-01-01');
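
It can help to preview which partition values the source data will produce before loading it anywhere. The following is a minimal sketch that uses only the table and column defined above; the alias row_count is introduced here purely for illustration.

```sql
-- Count rows per distinct birthday value.
-- Each distinct value is expected to become one partition when the data is loaded by ds.
select birthday, count(*) as row_count
from t_test_p_source
group by birthday
order by birthday;
```

With the six sample rows above, this should show five distinct dates, with 2018-01-01 appearing twice.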

2. Create a partitioned table (partitioned by the ds column)

create table t_test_p_target (
    id string,
    name string
)
partitioned by (ds string)
row format delimited fields terminated by '\t'
stored as textfile;
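
For contrast, a static-partition load into this table names the partition value explicitly and needs one statement per day. The following is a sketch of that approach for a single date, using the tables defined above; the date literal is simply one of the sample values.

```sql
-- Static partitioning: the partition value is hard-coded in the PARTITION clause,
-- so the SELECT returns only the non-partition columns (id, name).
insert into table t_test_p_target partition (ds = '2018-01-01')
select id, name
from t_test_p_source
where birthday = '2018-01-01';
```

Repeating this for every date is the work that dynamic partitioning avoids. Treat the statement as illustration only: because insert into appends, running it in addition to the dynamic insert in step 3 would load the rows for that date twice.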

3. Insert data into the partitioned table

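-- Enable dynamic partitioning. The default is false in older Hive releases (true since Hive 0.9.0), so set it explicitly.
SET hive.exec.dynamic.partition=true;
-- Dynamic partition mode. The default is strict, which requires at least one static partition; nonstrict allows all partition columns to be dynamic.
SET hive.exec.dynamic.partition.mode=nonstrict;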

insert into table t_test_p_target partition (ds) select id, name, birthday as ds from t_test_p_source;
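
To check the result, you can list the partitions Hive created and query one of them. This is a minimal sketch using the tables above, and it assumes the dynamic insert is the only load that has been run against t_test_p_target.

```sql
-- List the partitions created by the dynamic insert (one per distinct ds value).
show partitions t_test_p_target;

-- Query a single partition; filtering on ds lets Hive read only that partition's data.
select id, name, ds
from t_test_p_target
where ds = '2018-01-01';
```

With the six sample rows, the first statement should list five partitions (ds=2018-01-01 through ds=2018-01-05), and the query should return rows a1 and a6.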