Hive-1.2.1_03_DDL Operations

 

Hive official documentation: Home - UserDocumentation

Hive DDL official documentation: LanguageManual DDL

Reference: Hive User Guide

 

  Note: mind the version annotations on each statement; some features were introduced after hive-1.2.1 and cannot be used in hive-1.2.1.

  Note: this article covers only the commonly used features; see the official documentation for more details.

 

Common commands

select current_database();       # show the database currently in use
show databases;                  # list all databases
show tables;                     # list the tables in the current database
describe extended table_name;    # extended table info, similar to describe formatted table_name;
describe formatted table_name;   # recommended ★★★  e.g. describe formatted t_sz01; describes the table, including whether it is a MANAGED_TABLE or an EXTERNAL_TABLE
describe database test_db;       # show database info
show partitions t_sz03_part;     # show the partitions of a partitioned table

 

Notes

The Hive operations in this article were run sometimes in the Hive CLI and sometimes in beeline,
so don't be surprised to see two styles of command-line prompt.

 

 

1. DDL - Database Operations

1.1. Create Database

CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
  [COMMENT database_comment]
  [LOCATION hdfs_path]
  [WITH DBPROPERTIES (property_name=property_value, ...)];
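For reference, a sketch that exercises all the optional clauses at once (the location path and property values are illustrative, not taken from the session below):

create database if not exists test_db2
comment 'demo db with explicit location'
location '/user02/hive/test_db2.db'
with dbproperties ('creator'='yun', 'created'='2018-07-07');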

 

Creating a database

 

# create a database
create database if not exists test_db
comment 'my first db';

0: jdbc:hive2://mini01:10000> describe database test_db;  # database info
+----------+--------------+----------------------------------------------------+-------------+-------------+-------------+--+
| db_name  |   comment    |                      location                      | owner_name  | owner_type  | parameters  |
+----------+--------------+----------------------------------------------------+-------------+-------------+-------------+--+
| test_db  | my first db  | hdfs://mini01:9000/user/hive/warehouse/test_db.db  | yun         | USER        |             |
+----------+--------------+----------------------------------------------------+-------------+-------------+-------------+--+

 

 

1.2. Drop Database

DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
# The default is RESTRICT: the database cannot be dropped while it still contains tables. With CASCADE, the tables are dropped along with the database.
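For example, a minimal sketch (test001 is a database from the session below; whether it still contains tables is assumed):

drop database if exists test001 cascade;   # drops test001 together with any tables it contains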

 

1.3. Use Database

1 USE database_name;        # 使用哪一個庫
2 USE DEFAULT;            # 使用默認庫

 

For example:

 

hive (default)> show databases;        # list the databases
OK
default
test001
test_db
zhang
Time taken: 0.016 seconds, Fetched: 4 row(s)
hive (default)> use test_db;    # switch to that database
OK
Time taken: 0.027 seconds
hive (test_db)> select current_database();    # show the current database
OK
test_db
Time taken: 1.232 seconds, Fetched: 1 row(s)

 

 

2. DDL-Table

2.1. Create Table

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name 
  [(col_name data_type [COMMENT col_comment], ... [constraint_specification])]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]   # PARTITIONED BY: partition columns
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]  # CLUSTERED BY: bucketing columns; BUCKETS: number of buckets
  [SKEWED BY (col_name, col_name, ...) 
     ON ((col_value, col_value, ...), (col_value, col_value, ...), ...)
     [STORED AS DIRECTORIES]]
  [
   [ROW FORMAT row_format] 
   [STORED AS file_format]
     | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]   # TBLPROPERTIES: table properties
  [AS select_statement];   -- (Note: Available in Hive 0.5.0 and later; not supported for external tables)
 
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  LIKE existing_table_or_view_name
  [LOCATION hdfs_path];
 
data_type
  : primitive_type
  | array_type
  | map_type
  | struct_type
  | union_type  -- (Note: Available in Hive 0.7.0 and later)
 
primitive_type
  : TINYINT
  | SMALLINT
  | INT
  | BIGINT
  | BOOLEAN
  | FLOAT
  | DOUBLE
  | DOUBLE PRECISION -- (Note: Available in Hive 2.2.0 and later)
  | STRING
  | BINARY      -- (Note: Available in Hive 0.8.0 and later)
  | TIMESTAMP   -- (Note: Available in Hive 0.8.0 and later)
  | DECIMAL     -- (Note: Available in Hive 0.11.0 and later)
  | DECIMAL(precision, scale)  -- (Note: Available in Hive 0.13.0 and later)
  | DATE        -- (Note: Available in Hive 0.12.0 and later)
  | VARCHAR     -- (Note: Available in Hive 0.12.0 and later)
  | CHAR        -- (Note: Available in Hive 0.13.0 and later)
 
array_type
  : ARRAY < data_type >
 
map_type
  : MAP < primitive_type, data_type >
 
struct_type
  : STRUCT < col_name : data_type [COMMENT col_comment], ...>
 
union_type
   : UNIONTYPE < data_type, data_type, ... >  -- (Note: Available in Hive 0.7.0 and later)
 
row_format
  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]
        [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
        [NULL DEFINED AS char]   -- (Note: Available in Hive 0.13 and later)
  | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, ...)]
 
file_format:
  : SEQUENCEFILE
  | TEXTFILE    -- (Default, depending on hive.default.fileformat configuration)
  | RCFILE      -- (Note: Available in Hive 0.6.0 and later)
  | ORC         -- (Note: Available in Hive 0.11.0 and later)
  | PARQUET     -- (Note: Available in Hive 0.13.0 and later)
  | AVRO        -- (Note: Available in Hive 0.14.0 and later)
  | JSONFILE    -- (Note: Available in Hive 4.0.0 and later)
  | INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname
 
constraint_specification:
  : [, PRIMARY KEY (col_name, ...) DISABLE NOVALIDATE ]
    [, CONSTRAINT constraint_name FOREIGN KEY (col_name, ...) REFERENCES table_name(col_name, ...) DISABLE NOVALIDATE ]

  

       Note: CREATE TABLE raises an error if a table or view with the same name already exists. You can use IF NOT EXISTS to skip the error.

 

  •  Table names and column names are case insensitive, but SerDe (short for Serializer/Deserializer; Hive uses a SerDe to serialize and deserialize rows) and property names are case sensitive.
  •  Table and column comments are string literals (single-quoted).
  •  A table created without the EXTERNAL clause is called a managed table, because Hive manages its data. To check whether a table is managed or external, look at the tableType in the output of DESCRIBE EXTENDED table_name [or run describe formatted table_name;]. [See example 1]
  •  The TBLPROPERTIES clause lets you tag the table definition with your own metadata key/value pairs. Some predefined table properties also exist, such as last_modified_user and last_modified_time, which are added and managed automatically by Hive. [See example 1]

 

# Example 1
0: jdbc:hive2://mini01:10000> create table t_sz05 (id int, name string) tblproperties ('key001'='value1', 'key200'='value200');  # create a table
No rows affected (0.224 seconds)
0: jdbc:hive2://mini01:10000> describe formatted t_sz05;  # view the table info
+-------------------------------+-------------------------------------------------------------+-----------------------+--+
|           col_name            |                          data_type                          |        comment        |
+-------------------------------+-------------------------------------------------------------+-----------------------+--+
| # col_name                    | data_type                                                   | comment               |
|                               | NULL                                                        | NULL                  |
| id                            | int                                                         |                       |
| name                          | string                                                      |                       |
|                               | NULL                                                        | NULL                  |
| # Detailed Table Information  | NULL                                                        | NULL                  |
| Database:                     | zhang                                                       | NULL                  |
| Owner:                        | yun                                                         | NULL                  |
| CreateTime:                   | Sat Jul 07 20:13:53 CST 2018                                | NULL                  |
| LastAccessTime:               | UNKNOWN                                                     | NULL                  |
| Protect Mode:                 | None                                                        | NULL                  |
| Retention:                    | 0                                                           | NULL                  |
| Location:                     | hdfs://mini01:9000/user/hive/warehouse/zhang.db/t_sz05      | NULL                  |
| Table Type:                   | MANAGED_TABLE [EXTERNAL_TABLE for an external table]        | NULL                  |
| Table Parameters:             | NULL                                                        | NULL                  |
|                               | key001 [user-defined]                                       | value1                |
|                               | key200 [user-defined]                                       | value200              |
|                               | transient_lastDdlTime                                       | 1530965633            |
|                               | NULL                                                        | NULL                  |
| # Storage Information         | NULL                                                        | NULL                  |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe          | NULL                  |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat                    | NULL                  |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat  | NULL                  |
| Compressed:                   | No                                                          | NULL                  |
| Num Buckets:                  | -1                                                          | NULL                  |
| Bucket Columns:               | []                                                          | NULL                  |
| Sort Columns:                 | []                                                          | NULL                  |
| Storage Desc Params:          | NULL                                                        | NULL                  |
|                               | serialization.format                                        | 1                     |
+-------------------------------+-------------------------------------------------------------+-----------------------+--+
29 rows selected (0.153 seconds)

 

2.1.1. Managed Table

Creating the table

 

# table details can be viewed with: describe formatted table_name;
hive (test_db)> create table t_sz01 (id int, name string comment 'person name')
                comment 'a table of name'
                row format delimited fields terminated by ',';
OK
Time taken: 0.311 seconds
hive (test_db)> show tables;
OK
t_sz01
Time taken: 0.031 seconds, Fetched: 1 row(s)

 

 

       From the output of desc formatted t_sz01; we can see that Location: is hdfs://mini01:9000/user/hive/warehouse/test_db.db/t_sz01

 

Loading data into the table

 

[yun@mini01 hive]$ pwd
/app/software/hive
[yun@mini01 hive]$ cat t_sz01.dat 
1,zhnagsan
3,李四
5,wangwu
7,趙六
2,sunqi
4,周八
6,kkkkk
8,zzzzz
[yun@mini01 hive]$ cp -a t_sz01.dat t_sz01.dat2  # make a copy
[yun@mini01 hive]$ hadoop fs -put t_sz01.dat /user/hive/warehouse/test_db.db/t_sz01  # upload the data
[yun@mini01 hive]$ hadoop fs -put t_sz01.dat2 /user/hive/warehouse/test_db.db/t_sz01 # upload the data
[yun@mini01 hive]$ hadoop fs -ls /user/hive/warehouse/test_db.db/t_sz01  # a table directory can hold multiple data files
Found 2 items
-rw-r--r--   2 yun supergroup         71 2018-07-12 21:58 /user/hive/warehouse/test_db.db/t_sz01/t_sz01.dat
-rw-r--r--   2 yun supergroup         71 2018-07-12 22:30 /user/hive/warehouse/test_db.db/t_sz01/t_sz01.dat2

 

Viewing the data through Hive

0: jdbc:hive2://mini01:10000> select * from t_sz01;
+------------+--------------+--+
| t_sz01.id  | t_sz01.name  |
+------------+--------------+--+
| 1          | zhnagsan     |
| 3          | 李四         |
| 5          | wangwu       |
| 7          | 趙六         |
| 2          | sunqi        |
| 4          | 周八         |
| 6          | kkkkk        |
| 8          | zzzzz        |
| 1          | zhnagsan     |
| 3          | 李四         |
| 5          | wangwu       |
| 7          | 趙六         |
| 2          | sunqi        |
| 4          | 周八         |
| 6          | kkkkk        |
| 8          | zzzzz        |
+------------+--------------+--+
16 rows selected (0.159 seconds)

 

2.1.2. External Tables

       An external table is one where you supply your own LOCATION instead of using the default table path; when the table is dropped, its data is not deleted.

 

Creating the table

 

# the hdfs://mini01:9000 prefix in the location can be omitted, i.e. written as /user02/hive/database/ext_table
# if the location directory does not exist, Hive creates it
hive (test_db)> create external table t_sz02_ext (id int, name string)
                comment 'a ext table'
                row format delimited fields terminated by '\t'
                location 'hdfs://mini01:9000/user02/hive/database/ext_table';
OK
Time taken: 0.065 seconds
hive (test_db)> show tables;
OK
t_sz01
t_sz02_ext
Time taken: 0.03 seconds, Fetched: 2 row(s)
0: jdbc:hive2://mini01:10000> select * from t_sz02_ext;   # no data yet
+----------------+------------------+--+
| t_sz02_ext.id  | t_sz02_ext.name  |
+----------------+------------------+--+
+----------------+------------------+--+
No rows selected (0.094 seconds)

# desc formatted t_sz02_ext; shows the table Location, which is:
# hdfs://mini01:9000/user02/hive/database/ext_table

 

 

Loading data

 

[yun@mini01 hive]$ pwd
/app/software/hive
[yun@mini01 hive]$ ll
total 12
-rw-rw-r-- 1 yun yun 56 Jul  3 21:26 sz.dat
-rw-rw-r-- 1 yun yun 71 Jul 12 21:53 t_sz01.dat
-rw-rw-r-- 1 yun yun 79 Jul 12 22:15 t_sz02_ext.dat
[yun@mini01 hive]$ cat t_sz02_ext.dat # note the trailing blank line 
1    劉晨
2    王敏
3    張立
4    劉剛
5    孫慶
6    易思玲
7    李娜
8    夢圓圓

[yun@mini01 hive]$ hadoop fs -put t_sz02_ext.dat /user02/hive/database/ext_table  # upload the data
[yun@mini01 hive]$ hadoop fs -ls /user02/hive/database/ext_table
Found 1 items
-rw-r--r--   2 yun supergroup         79 2018-07-12 22:16 /user02/hive/database/ext_table/t_sz02_ext.dat

 

Viewing the data through Hive

 

0: jdbc:hive2://mini01:10000> select * from t_sz02_ext;
+----------------+------------------+--+
| t_sz02_ext.id  | t_sz02_ext.name  |
+----------------+------------------+--+
| 1              | 劉晨               |
| 2              | 王敏               |
| 3              | 張立               |
| 4              | 劉剛               |
| 5              | 孫慶               |
| 6              | 易思玲             |
| 7              | 李娜               |
| 8              | 夢圓圓             |
| NULL           | NULL             | # caused by the trailing blank line in the data file
+----------------+------------------+--+
9 rows selected (0.14 seconds)

 

 

2.1.3. Partitioned Tables

       Partitioned tables are created with the PARTITIONED BY clause. A table can have one or more partition columns, and a separate data directory is created for each distinct combination of partition column values.

       If you get the error "FAILED: Error in semantic analysis: Column repeated in partitioning columns" when creating a partitioned table, it means you tried to include a partition column in the table's own column data. You probably really do have such a column. However, the partition you create makes a pseudo-column on which you can query, so you must rename your table column to something else (that users should not query on).

 

       For example, suppose your original unpartitioned table had three columns: id, date, and name.

id     int,
date   date,
name   varchar

 

       Now you want to partition on date. Your Hive definition could use "dtDontQuery" as the column name, so that "date" can be used for partitioning (and querying).

create table table_name (
  id                int,
  dtDontQuery       string,
  name              string
)
partitioned by (date string)

 

       Now your users will still query on "where date = '...'", while the second column dtDontQuery holds the original values.

 

Creating the table

# date cannot be used as a table column name here because date is a keyword
hive (test_db)> create table t_sz03_part (id int, name string)
                comment 'This is a partitioned table'
                partitioned by (dt string, country string)
                row format delimited fields terminated by ',';

 

Loading data

 

[yun@mini01 hive]$ pwd
/app/software/hive
[yun@mini01 hive]$ cat t_sz03_20180711.dat1
1,張三_20180711
2,lisi_20180711
3,Wangwu_20180711
[yun@mini01 hive]$ cat t_sz03_20180711.dat2
11,Tom_20180711
12,Dvid_20180711
13,cherry_20180711
[yun@mini01 hive]$ cat t_sz03_20180712.dat1
1,張三_20180712
2,lisi_20180712
3,Wangwu_20180712
[yun@mini01 hive]$ cat t_sz03_20180712.dat2
11,Tom_20180712
12,Dvid_20180712
13,cherry_20180712
#### load the data in Hive
hive (test_db)> load data local inpath '/app/software/hive/t_sz03_20180711.dat1' into table t_sz03_part partition (dt='20180711', country='CN');
Loading data to table test_db.t_sz03_part partition (dt=20180711, country=CN)
Partition test_db.t_sz03_part{dt=20180711, country=CN} stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 0.406 seconds
hive (test_db)> load data local inpath '/app/software/hive/t_sz03_20180711.dat2' into table t_sz03_part partition (dt='20180711', country='US');
Loading data to table test_db.t_sz03_part partition (dt=20180711, country=US)
Partition test_db.t_sz03_part{dt=20180711, country=US} stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 0.453 seconds
hive (test_db)> load data local inpath '/app/software/hive/t_sz03_20180712.dat1' into table t_sz03_part partition (dt='20180712', country='CN');
Loading data to table test_db.t_sz03_part partition (dt=20180712, country=CN)
Partition test_db.t_sz03_part{dt=20180712, country=CN} stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 0.381 seconds
hive (test_db)> load data local inpath '/app/software/hive/t_sz03_20180712.dat2' into table t_sz03_part partition (dt='20180712', country='US');
Loading data to table test_db.t_sz03_part partition (dt=20180712, country=US)
Partition test_db.t_sz03_part{dt=20180712, country=US} stats: [numFiles=1, numRows=0, totalSize=52, rawDataSize=0]
OK
Time taken: 0.506 seconds

 

 

Browser view of the partition directories in HDFS (screenshot)

 

 

 

Viewing the data through Hive

 

0: jdbc:hive2://mini01:10000> select * from t_sz03_part;
+-----------------+-------------------+-----------------+----------------------+--+
| t_sz03_part.id  | t_sz03_part.name  | t_sz03_part.dt  | t_sz03_part.country  |
+-----------------+-------------------+-----------------+----------------------+--+
| 1               | 張三_20180711     | 20180711        | CN                   |
| 2               | lisi_20180711     | 20180711        | CN                   |
| 3               | Wangwu_20180711   | 20180711        | CN                   |
| 11              | Tom_20180711      | 20180711        | US                   |
| 12              | Dvid_20180711     | 20180711        | US                   |
| 13              | cherry_20180711   | 20180711        | US                   |
| 1               | 張三_20180712     | 20180712        | CN                   |
| 2               | lisi_20180712     | 20180712        | CN                   |
| 3               | Wangwu_20180712   | 20180712        | CN                   |
| 11              | Tom_20180712      | 20180712        | US                   |
| 12              | Dvid_20180712     | 20180712        | US                   |
| 13              | cherry_20180712   | 20180712        | US                   |
+-----------------+-------------------+-----------------+----------------------+--+
12 rows selected (0.191 seconds)
0: jdbc:hive2://mini01:10000> show partitions t_sz03_part;  # show the partitions of the partitioned table
+-------------------------+--+
|        partition        |
+-------------------------+--+
| dt=20180711/country=CN  |
| dt=20180711/country=US  |
| dt=20180712/country=CN  |
| dt=20180712/country=US  |
+-------------------------+--+
4 rows selected (0.164 seconds)

 

 

2.1.4. Create Table As Select (CTAS)

       Tables can also be created and populated by the results of a query in one create-table-as-select (CTAS) statement. The table created by CTAS is atomic, meaning that the table is not seen by other users until all the query results are populated. So other users will either see the table with the complete query results, or not see the table at all.

 

CTAS restrictions:

       The target table cannot be a partitioned table.

       The target table cannot be an external table.

       The target table cannot be a bucketed table.

Here, the target table is the table being created.

# Example:
CREATE TABLE new_key_value_store
   ROW FORMAT SERDE "org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe"
   STORED AS RCFile
   AS
SELECT (key % 1024) new_key, concat(key, value) key_value_pair
FROM key_value_store
SORT BY new_key, key_value_pair;

 

Example

 

hive (test_db)> create table t_sz02_ext_new
                   row format delimited fields terminated by '#'
                   AS
                SELECT id , concat(name, '_', id) name2 
                FROM t_sz02_ext
                SORT BY name2;

0: jdbc:hive2://mini01:10000> show tables;
+-----------------+--+
|    tab_name     |
+-----------------+--+
| t_sz01          |
| t_sz02_ext      |
| t_sz02_ext_new  |
| t_sz03_part     |
| t_sz100_ext     |
| t_sz101_ext     |
+-----------------+--+
6 rows selected (0.069 seconds)
0: jdbc:hive2://mini01:10000> select * from t_sz02_ext_new;
+--------------------+-----------------------+--+
| t_sz02_ext_new.id  | t_sz02_ext_new.name2  |
+--------------------+-----------------------+--+
| NULL               | NULL                  |
| 4                  | 劉剛_4                |
| 1                  | 劉晨_1                |
| 5                  | 孫慶_5                |
| 3                  | 張立_3                |
| 6                  | 易思玲_6              |
| 7                  | 李娜_7                |
| 8                  | 夢圓圓_8              |
| 2                  | 王敏_2                |
+--------------------+-----------------------+--+
9 rows selected (0.094 seconds)

# the location path can be obtained with desc formatted t_sz02_ext_new;
hive (test_db)> dfs -ls /user/hive/warehouse/test_db.db/t_sz02_ext_new/;
Found 1 items
-rwxr-xr-x   2 yun supergroup        100 2018-07-12 23:50 /user/hive/warehouse/test_db.db/t_sz02_ext_new/000000_0
hive (test_db)> dfs -cat /user/hive/warehouse/test_db.db/t_sz02_ext_new/000000_0;
\N#\N
4#劉剛_4
1#劉晨_1
5#孫慶_5
3#張立_3
6#易思玲_6
7#李娜_7
8#夢圓圓_8
2#王敏_2

 

 

Viewing in HDFS

 

# the location path can be obtained with desc formatted t_sz02_ext_new;
[yun@mini01 hive]$ hadoop fs -ls /user/hive/warehouse/test_db.db/t_sz02_ext_new;
Found 1 items
-rwxr-xr-x   2 yun supergroup        100 2018-07-12 23:44 /user/hive/warehouse/test_db.db/t_sz02_ext_new/000000_0
[yun@mini01 hive]$ hadoop fs -cat /user/hive/warehouse/test_db.db/t_sz02_ext_new/000000_0
\N#\N
4#劉剛_4
1#劉晨_1
5#孫慶_5
3#張立_3
6#易思玲_6
7#李娜_7
8#夢圓圓_8
2#王敏_2

 

 

2.1.5. Create Table Like

       Copy the structure of an existing table to create a new table (without copying its data).

CREATE TABLE empty_key_value_store
LIKE key_value_store [TBLPROPERTIES (property_name=property_value, ...)];

 

Example

 

hive (test_db)> create table t_sz03_part_new
                like t_sz03_part tblproperties ('proper1'='value1', 'proper2'='value2');
OK
Time taken: 0.083 seconds
0: jdbc:hive2://mini01:10000> select * from t_sz03_part_new;  # only the table structure is copied, not the data 
+---------------------+-----------------------+---------------------+--------------------------+--+
| t_sz03_part_new.id  | t_sz03_part_new.name  | t_sz03_part_new.dt  | t_sz03_part_new.country  |
+---------------------+-----------------------+---------------------+--------------------------+--+
+---------------------+-----------------------+---------------------+--------------------------+--+
No rows selected (0.087 seconds)
hive (test_db)> create table t_sz04_like
                like t_sz02_ext  tblproperties ('proper1'='value1', 'proper2'='value2');
No rows affected (0.153 seconds)

0: jdbc:hive2://mini01:10000> select * from t_sz04_like;
+-----------------+-------------------+--+
| t_sz04_like.id  | t_sz04_like.name  |
+-----------------+-------------------+--+
+-----------------+-------------------+--+

 

 

       desc formatted tab_name; shows that t_sz03_part and t_sz03_part_new have identical structure; only the tblproperties differ.

       Even if the table referenced by LIKE is an external table, the table produced is a MANAGED_TABLE.

 

2.1.6. Bucketed Sorted Tables

 

 

CREATE TABLE page_view(viewTime INT, userid BIGINT,
     page_url STRING, referrer_url STRING,
     ip STRING COMMENT 'IP Address of the User')
 COMMENT 'This is the page view table'
 PARTITIONED BY(dt STRING, country STRING)
 CLUSTERED BY(userid) SORTED BY(viewTime) INTO 32 BUCKETS
 ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\001'
   COLLECTION ITEMS TERMINATED BY '\002'
   MAP KEYS TERMINATED BY '\003'
 STORED AS SEQUENCEFILE;

 

  CLUSTERED BY

  For every table or partition, Hive can further organize the data into buckets; buckets divide the data at a finer granularity than partitions. Bucketing is organized on a particular column: Hive hashes the column value and takes the result modulo the number of buckets to decide which bucket each record is stored in.

  There are two reasons to organize a table (or partition) into buckets:

    (1) More efficient query processing. Buckets impose extra structure on the table that Hive can exploit for some queries. In particular, a join of two tables bucketed on the same (join) columns can be implemented efficiently as a map-side join. For a JOIN on a column shared by two tables, if both tables are bucketed on that column, joining only the buckets that hold the same column values is enough, which greatly reduces the amount of data involved in the join.

    (2) More efficient sampling. When working with large datasets, it is very convenient to be able to try out queries on a small slice of the data while developing and refining them.

 

Notes

  1. order by performs a global sort over the whole input, so there is only one reducer, which can take a long time when the input is large.

  2. sort by is not a global sort; it sorts the data before it enters each reducer. If you sort with sort by and set mapred.reduce.tasks > 1, sort by only guarantees that each reducer's output is sorted, not that the overall output is sorted.

  3. distribute by (column) distributes rows to different reducers based on the given column, using hash partitioning.

  4. cluster by (column) has the functionality of distribute by plus sorting on that column.

         Therefore, when the bucketing column and the sort column are the same, cluster by = distribute by + sort by.
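For illustration, a sketch against the t_sz05 table created later in this section:

select id, name from t_sz05 cluster by (id);                     -- same result as the next line
select id, name from t_sz05 distribute by (id) sort by (id);     -- distribute and sort on the same column
select id, name from t_sz05 distribute by (id) sort by (name);   -- distribute by id but sort by name (cluster by cannot express this)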

 

      Note: a bucketed table is populated by inserting the results of a query on another table.

 

The main purpose of bucketed tables is to improve the efficiency of join operations.

(Think about this:

select a.id,a.name,b.addr from a join b on a.id = b.id;

if tables a and b are already bucketed tables, and the bucketing column is the id column,

does this join still require a full Cartesian product of the two tables?)
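It does not: matching id values land in matching buckets, so Hive can join bucket against bucket on the map side. A sketch, assuming the bucket count of one table is a multiple of the other's (hive.optimize.bucketmapjoin is the standard switch, off by default):

set hive.optimize.bucketmapjoin = true;   -- allow bucket map joins
select a.id, a.name, b.addr from a join b on a.id = b.id;   -- each bucket of a is joined only against the corresponding bucket of b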

 

Example

# create a bucketed table
hive (test_db)> create table t_sz05_buck (id int, name string)
                clustered by(id) sorted by(id) into 4 buckets
                row format delimited fields terminated by ','; 
OK
Time taken: 0.267 seconds
0: jdbc:hive2://mini01:10000> select * from t_sz05_buck;
+-----------------+-------------------+--+
| t_sz05_buck.id  | t_sz05_buck.name  |
+-----------------+-------------------+--+
+-----------------+-------------------+--+
No rows selected (0.095 seconds)

 

 

Preparing the data

 

[yun@mini01 hive]$ pwd
/app/software/hive
[yun@mini01 hive]$ cat t_sz05_buck.dat
1,mechanics
2,mechanical engineering
3,Instrument Science and technology
4,Material science and Engineering
5,Dynamic engineering and Engineering Thermophysics
6,Electrical engineering
7,Control science and Engineering
8,Computer science and technology
9,Architecture
10,irrigation works
11,Nuclear science and technology
12,Environmental Science and Engineering
13,Urban and rural planning
14,Landscape architecture
15,Electronic Science and technology
16,Information and Communication Engineering
17,Civil Engineering
18,Chemical engineering and technology
19,software engineering
20,biomedical engineering
21,Safety science and Engineering
22,Aeronautical and Astronautical Science and Technology
23,Transportation Engineering
24,Optical engineering
##############  load the data into an ordinary Hive table
hive (test_db)> create table t_sz05 (id int, name string)
                row format delimited fields terminated by ',';
OK
Time taken: 0.091 seconds
hive (test_db)> load data local inpath '/app/software/hive/t_sz05_buck.dat' into table t_sz05; # load the data
Loading data to table test_db.t_sz05
Table test_db.t_sz05 stats: [numFiles=1, totalSize=753]
OK
Time taken: 0.276 seconds
hive (test_db)> select * from t_sz05;  # the query returns the expected data
……………………

 

 

Setting the bucketing parameters

 

# original values, in the Hive CLI or at the 0: jdbc:hive2 prompt
hive (test_db)> set hive.enforce.bucketing;
hive.enforce.bucketing=false
hive (test_db)> set mapreduce.job.reduces;
mapreduce.job.reduces=-1
# set the parameters
hive (test_db)> set hive.enforce.bucketing = true;
hive (test_db)> set mapreduce.job.reduces = 4;
hive (test_db)> set hive.enforce.bucketing;
hive.enforce.bucketing=true
hive (test_db)> set mapreduce.job.reduces;
mapreduce.job.reduces=4
######### query t_sz05 ###############################
0: jdbc:hive2://mini01:10000> select id, name from t_sz05 cluster by (id);  
INFO  : Number of reduce tasks not specified. Defaulting to jobconf value of: 4
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1531531561184_0009
INFO  : The url to track the job: http://mini02:8088/proxy/application_1531531561184_0009/
INFO  : Starting Job = job_1531531561184_0009, Tracking URL = http://mini02:8088/proxy/application_1531531561184_0009/
INFO  : Kill Command = /app/hadoop/bin/hadoop job  -kill job_1531531561184_0009
INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 4
INFO  : 2018-07-14 10:56:10,711 Stage-1 map = 0%,  reduce = 0%
INFO  : 2018-07-14 10:56:15,963 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.42 sec
INFO  : 2018-07-14 10:56:30,998 Stage-1 map = 100%,  reduce = 25%, Cumulative CPU 4.94 sec
INFO  : 2018-07-14 10:56:32,196 Stage-1 map = 100%,  reduce = 50%, Cumulative CPU 11.03 sec
INFO  : 2018-07-14 10:56:35,630 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 22.46 sec
INFO  : MapReduce Total cumulative CPU time: 22 seconds 460 msec
INFO  : Ended Job = job_1531531561184_0009
+-----+--------------------------------------------------------+--+
| id  |                          name                          |
+-----+--------------------------------------------------------+--+
| 4   | Material science and Engineering                       |
| 8   | Computer science and technology                        |
| 12  | Environmental Science and Engineering                  |
| 16  | Information and Communication Engineering              |
| 20  | biomedical engineering                                 |
| 24  | Optical engineering                                    |
| 1   | mechanics                                              |
| 5   | Dynamic engineering and Engineering Thermophysics      |
| 9   | Architecture                                           |
| 13  | Urban and rural planning                               |
| 17  | Civil Engineering                                      |
| 21  | Safety science and Engineering                         |
| 2   | mechanical engineering                                 |
| 6   | Electrical engineering                                 |
| 10  | irrigation works                                       |
| 14  | Landscape architecture                                 |
| 18  | Chemical engineering and technology                    |
| 22  | Aeronautical and Astronautical Science and Technology  |
| 3   | Instrument Science and technology                      |
| 7   | Control science and Engineering                        |
| 11  | Nuclear science and technology                         |
| 15  | Electronic Science and technology                      |
| 19  | software engineering                                   |
| 23  | Transportation Engineering                             |
+-----+--------------------------------------------------------+--+
24 rows selected (33.214 seconds)

 

 

Loading data into the bucketed table ★★★★★

 

# here,  select id, name from t_sz05 cluster by (id);
# is equivalent to select id, name from t_sz05 distribute by (id) sort by (id);
# so when the bucketing column and the sort column are the same, cluster by = distribute by + sort by
# to bucket by one column and sort by another, use the latter form
0: jdbc:hive2://mini01:10000> insert into table t_sz05_buck
                              select id, name from t_sz05 cluster by (id);
INFO  : Number of reduce tasks determined at compile time: 4
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1531531561184_0010
INFO  : The url to track the job: http://mini02:8088/proxy/application_1531531561184_0010/
INFO  : Starting Job = job_1531531561184_0010, Tracking URL = http://mini02:8088/proxy/application_1531531561184_0010/
INFO  : Kill Command = /app/hadoop/bin/hadoop job  -kill job_1531531561184_0010
INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 4
INFO  : 2018-07-14 11:01:40,785 Stage-1 map = 0%,  reduce = 0%
INFO  : 2018-07-14 11:01:46,089 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.49 sec
INFO  : 2018-07-14 11:02:03,691 Stage-1 map = 100%,  reduce = 25%, Cumulative CPU 10.35 sec
INFO  : 2018-07-14 11:02:04,962 Stage-1 map = 100%,  reduce = 75%, Cumulative CPU 25.76 sec
INFO  : 2018-07-14 11:02:05,987 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 32.36 sec
INFO  : MapReduce Total cumulative CPU time: 32 seconds 360 msec
INFO  : Ended Job = job_1531531561184_0010
INFO  : Loading data to table test_db.t_sz05_buck from hdfs://mini01:9000/user/hive/warehouse/test_db.db/t_sz05_buck/.hive-staging_hive_2018-07-14_11-01-34_616_4744571023046037741-1/-ext-10000
INFO  : Table test_db.t_sz05_buck stats: [numFiles=4, numRows=24, totalSize=753, rawDataSize=729]
No rows affected (33.762 seconds)
0: jdbc:hive2://mini01:10000> select * from t_sz05_buck;  
+-----------------+--------------------------------------------------------+--+
| t_sz05_buck.id  |                    t_sz05_buck.name                    |
+-----------------+--------------------------------------------------------+--+
| 4               | Material science and Engineering                       |
| 8               | Computer science and technology                        |
| 12              | Environmental Science and Engineering                  |
| 16              | Information and Communication Engineering              |
| 20              | biomedical engineering                                 |
| 24              | Optical engineering                                    |
| 1               | mechanics                                              |
| 5               | Dynamic engineering and Engineering Thermophysics      |
| 9               | Architecture                                           |
| 13              | Urban and rural planning                               |
| 17              | Civil Engineering                                      |
| 21              | Safety science and Engineering                         |
| 2               | mechanical engineering                                 |
| 6               | Electrical engineering                                 |
| 10              | irrigation works                                       |
| 14              | Landscape architecture                                 |
| 18              | Chemical engineering and technology                    |
| 22              | Aeronautical and Astronautical Science and Technology  |
| 3               | Instrument Science and technology                      |
| 7               | Control science and Engineering                        |
| 11              | Nuclear science and technology                         |
| 15              | Electronic Science and technology                      |
| 19              | software engineering                                   |
| 23              | Transportation Engineering                             |
+-----------------+--------------------------------------------------------+--+
24 rows selected (0.097 seconds)

 

 

Viewing the bucket files

 

 

0: jdbc:hive2://mini01:10000> dfs -cat /user/hive/warehouse/test_db.db/t_sz05_buck/000000_0;
+-----------------------------------------------+--+
|                  DFS Output                   |
+-----------------------------------------------+--+
| 4,Material science and Engineering            |
| 8,Computer science and technology             |
| 12,Environmental Science and Engineering      |
| 16,Information and Communication Engineering  |
| 20,biomedical engineering                     |
| 24,Optical engineering                        |
+-----------------------------------------------+--+
6 rows selected (0.016 seconds)
0: jdbc:hive2://mini01:10000> dfs -cat /user/hive/warehouse/test_db.db/t_sz05_buck/000001_0;
+------------------------------------------------------+--+
|                      DFS Output                      |
+------------------------------------------------------+--+
| 1,mechanics                                          |
| 5,Dynamic engineering and Engineering Thermophysics  |
| 9,Architecture                                       |
| 13,Urban and rural planning                          |
| 17,Civil Engineering                                 |
| 21,Safety science and Engineering                    |
+------------------------------------------------------+--+
6 rows selected (0.015 seconds)
0: jdbc:hive2://mini01:10000> dfs -cat /user/hive/warehouse/test_db.db/t_sz05_buck/000002_0;
+-----------------------------------------------------------+--+
|                        DFS Output                         |
+-----------------------------------------------------------+--+
| 2,mechanical engineering                                  |
| 6,Electrical engineering                                  |
| 10,irrigation works                                       |
| 14,Landscape architecture                                 |
| 18,Chemical engineering and technology                    |
| 22,Aeronautical and Astronautical Science and Technology  |
+-----------------------------------------------------------+--+
6 rows selected (0.014 seconds)
0: jdbc:hive2://mini01:10000> dfs -cat /user/hive/warehouse/test_db.db/t_sz05_buck/000003_0;
+---------------------------------------+--+
|              DFS Output               |
+---------------------------------------+--+
| 3,Instrument Science and technology   |
| 7,Control science and Engineering     |
| 11,Nuclear science and technology     |
| 15,Electronic Science and technology  |
| 19,software engineering               |
| 23,Transportation Engineering         |
+---------------------------------------+--+
6 rows selected (0.016 seconds)

 

2.1.7. Temporary Tables

       A table created as a temporary table is visible only to the current session. Its data is stored in the user's scratch directory and deleted at the end of the session.

       If a temporary table is created with the database/table name of a permanent table that already exists in the database, then within that session any reference to that name resolves to the temporary table rather than the permanent table. The user cannot access the original table within the session without either dropping the temporary table or renaming it to a non-conflicting name.

 

Temporary tables have the following restrictions:

       Partition columns are not supported.

       Creating indexes is not supported.

 

Example

 

0: jdbc:hive2://mini01:10000> create temporary table t_sz06_temp (id int, name string)
0: jdbc:hive2://mini01:10000> row format delimited fields terminated by ',';
No rows affected (0.062 seconds)
0: jdbc:hive2://mini01:10000> select * from t_sz06_temp ;
+-----------------+-------------------+--+
| t_sz06_temp.id  | t_sz06_temp.name  |
+-----------------+-------------------+--+
+-----------------+-------------------+--+
No rows selected (0.042 seconds)
0: jdbc:hive2://mini01:10000> load data local inpath '/app/software/hive/t_sz01.dat' into table t_sz06_temp ; # load data
INFO  : Loading data to table test_db.t_sz06_temp from file:/app/software/hive/t_sz01.dat
INFO  : Table test_db.t_sz06_temp stats: [numFiles=1, totalSize=71]
No rows affected (0.09 seconds)
0: jdbc:hive2://mini01:10000> select * from t_sz06_temp ;
+-----------------+-------------------+--+
| t_sz06_temp.id  | t_sz06_temp.name  |
+-----------------+-------------------+--+
| 1               | zhnagsan          |
| 3               | 李四              |
| 5               | wangwu            |
| 7               | 趙六              |
| 2               | sunqi             |
| 4               | 周八              |
| 6               | kkkkk             |
| 8               | zzzzz             |
+-----------------+-------------------+--+
8 rows selected (0.062 seconds)
# the Location is hdfs://mini01:9000/tmp/hive/yun/4bf7c650-56ed-4f93-8a73-23420c7292a1/_tmp_space.db/158ad837-ee43-4870-9e34-6ad5cd4b292a
0: jdbc:hive2://mini01:10000> desc formatted t_sz06_temp; 
+-------------------------------+------------------------------------------------------------+-----------------------+--+
|           col_name            |                       data_type                            |        comment        |
+-------------------------------+------------------------------------------------------------+-----------------------+--+
| # col_name                    | data_type                                                  | comment               |
|                               | NULL                                                       | NULL                  |
| id                            | int                                                        |                       |
| name                          | string                                                     |                       |
|                               | NULL                                                       | NULL                  |
| # Detailed Table Information  | NULL                                                       | NULL                  |
| Database:                     | test_db                                                    | NULL                  |
| Owner:                        | yun                                                        | NULL                  |
| CreateTime:                   | Sat Jul 14 11:47:14 CST 2018                               | NULL                  |
| LastAccessTime:               | UNKNOWN                                                    | NULL                  |
| Protect Mode:                 | None                                                       | NULL                  |
| Retention:                    | 0                                                          | NULL                  |
| Location:                     | hdfs://mini01:9000/tmp/hive/yun/UUID/_tmp_space.db/UUID    | NULL                  |
| Table Type:                   | MANAGED_TABLE                                              | NULL                  |
| Table Parameters:             | NULL                                                       | NULL                  |
|                               | COLUMN_STATS_ACCURATE                                      | true                  |
|                               | numFiles                                                   | 1                     |
|                               | totalSize                                                  | 71                    |
|                               | NULL                                                       | NULL                  |
| # Storage Information         | NULL                                                       | NULL                  |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe         | NULL                  |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat                   | NULL                  |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                  |
| Compressed:                   | No                                                         | NULL                  |
| Num Buckets:                  | -1                                                         | NULL                  |
| Bucket Columns:               | []                                                         | NULL                  |
| Sort Columns:                 | []                                                         | NULL                  |
| Storage Desc Params:          | NULL                                                       | NULL                  |
|                               | field.delim                                                | ,                     |
|                               | serialization.format                                       | ,                     |
+-------------------------------+------------------------------------------------------------+-----------------------+--+
30 rows selected (0.024 seconds)

 

2.1.8. Constraints

       Hive includes support for unvalidated primary and foreign key constraints. Some SQL tools generate more efficient queries when constraints are present. Since these constraints are not validated, an upstream system needs to ensure data integrity before the data is loaded into Hive.

 

# Example:
create table pk(id1 integer, id2 integer,
  primary key(id1, id2) disable novalidate);
 
create table fk(id1 integer, id2 integer,
  constraint c1 foreign key(id1, id2) references pk(id2, id1) disable novalidate);

 

 

2.2. Drop Table

DROP TABLE [IF EXISTS] table_name [PURGE];

 

       DROP TABLE removes the metadata and data for the table. The data is actually moved to the .Trash/Current directory if Trash is configured (and PURGE is not specified). The metadata is completely lost.

       When an EXTERNAL table is dropped, its data is not deleted from the filesystem.

       No warning is given when dropping a table referenced by views.

       Table information is removed from the metastore and the raw data is removed as if by 'hadoop dfs -rm'. In most cases this moves the data into the .Trash folder in the user's home directory; a user who drops a table by mistake can therefore recover the lost data by re-creating the table with the same schema, re-creating any necessary partitions, and then manually moving the data back into place with Hadoop. Since this solution depends on the underlying implementation, it may change over time or across installations; users are strongly advised not to drop tables casually.

 

       If PURGE is specified, the table data is not moved to the .Trash/Current directory, so it cannot be recovered after a mistaken DROP. The purge option can also be specified with the table property auto.purge.
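For example, a sketch (t_sz05 is the demo table from earlier; with PURGE the data skips the Trash and cannot be recovered):

drop table if exists t_sz05 purge;   # metadata and data removed immediately, bypassing .Trash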

 

2.3. Truncate Table

TRUNCATE TABLE table_name [PARTITION partition_spec];
 
partition_spec:
  : (partition_column = partition_col_value, partition_column = partition_col_value, ...)

 

       Removes all rows from a table or partition(s). The rows go to the filesystem Trash if it is enabled, and are deleted outright otherwise. Currently the target table should be a native/managed table, or an exception is thrown. The user can specify a partial partition_spec to truncate multiple partitions at once, and omitting partition_spec truncates all partitions in the table.
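For example, a few sketches against the demo tables from earlier (the partial partition_spec in the first statement covers both country values under that date):

truncate table t_sz03_part partition (dt='20180711');                -- truncate every partition with dt=20180711
truncate table t_sz03_part partition (dt='20180712', country='CN'); -- truncate a single partition
truncate table t_sz05;                                              -- truncate a whole managed table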

 

 

3. Alter Table/Partition/Column

Alter Table
    Rename Table
    Alter Table Properties
        Alter Table Comment
    Add SerDe Properties
    Alter Table Storage Properties
    Alter Table Skewed or Stored as Directories
        Alter Table Skewed
        Alter Table Not Skewed
        Alter Table Not Stored as Directories
        Alter Table Set Skewed Location
    Alter Table Constraints
    Additional Alter Table Statements
Alter Partition
    Add Partitions
        Dynamic Partitions
    Rename Partition
    Exchange Partition
    Recover Partitions (MSCK REPAIR TABLE)
    Drop Partitions
    (Un)Archive Partition
Alter Either Table or Partition
    Alter Table/Partition File Format
    Alter Table/Partition Location
    Alter Table/Partition Touch
    Alter Table/Partition Protections
    Alter Table/Partition Compact
    Alter Table/Partition Concatenate
    Alter Table/Partition Update columns
Alter Column
    Rules for Column Names
    Change Column Name/Type/Position/Comment
    Add/Replace Columns
    Partial Partition Specification

 

See: Alter Table/Partition/Column
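A few of the common forms as a sketch (table names reuse the demos above; the new column and partition values are illustrative):

alter table t_sz01 rename to t_sz01_renamed;                                        -- rename a table
alter table t_sz01_renamed add columns (addr string comment 'address');             -- append a column
alter table t_sz03_part add if not exists partition (dt='20180713', country='CN');  -- add a partition
alter table t_sz03_part drop if exists partition (dt='20180711', country='US');     -- drop a partition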

 

 

4. Show

Show Databases
Show Tables/Views/Partitions/Indexes
    Show Tables
    Show Views
    Show Partitions
    Show Table/Partition Extended
    Show Table Properties
    Show Create Table
    Show Indexes
Show Columns
Show Functions
Show Granted Roles and Privileges
Show Locks
Show Conf
Show Transactions
Show Compactions

 

See: Show
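A few of these in use, as a sketch against the demo tables from earlier sections:

show create table t_sz01;                               -- the DDL that would recreate the table
show tblproperties t_sz05;                              -- table properties, e.g. the key001/key200 pairs from example 1
show partitions t_sz03_part partition (dt='20180711');  -- partitions matching a partial spec
show functions;                                         -- all registered functions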

 

 

5. Describe

Describe Database
Describe Table/View/Column
    Display Column Statistics
Describe Partition

 

See: Describe
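For example, a sketch (the partition spec matches a partition created in section 2.1.3):

describe database extended test_db;                                      -- database info including dbproperties
describe formatted t_sz03_part partition (dt='20180711', country='CN'); -- details of a single partition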
