Getting Started with Hive

Date: 2019-11-19

Tags: hive, getting started. Category: Hadoop

Connecting and logging in

!connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000: hadoop
Enter password for jdbc:hive2://localhost:10000: mysql

Creating a table

One difference between Hive and MySQL is that you have to specify the data format when you create a table.

create table t_sz01(id int, name string) row format delimited fields terminated by ',';

Importing data

[hadoop@mini2 study]$ hadoop fs -put sz4.dat /user/hive/warehouse/myhive.db/t_sz02

1,oo
2,pp
3,ll
4,i9i
5,kkj
6,ujn
7,aa
8,zx
9,sdfa
10,4sad
11,3d3
12,sadf
13,gdh
14,asdf4
15,asdfsadf
16,asdf
17,asddd

Then run the query: select * from t_sz02;
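To make the role of "fields terminated by ','" more concrete, here is a rough Python sketch of how a line of the data file maps onto the columns (id int, name string). It only illustrates the idea of applying the table's format when the file is read; parse_row is a hypothetical helper, not anything Hive provides, and the NULL handling reflects the usual behavior of Hive's text format rather than something shown in this post.

```python
# Illustrative only: parse_row is a hypothetical helper, not part of Hive.
# It mimics mapping one delimited text line onto (id int, name string).

def parse_row(line, delimiter=","):
    """Split one text line into (id, name) using the field delimiter."""
    fields = line.rstrip("\n").split(delimiter)
    raw_id = fields[0] if len(fields) > 0 else None
    name = fields[1] if len(fields) > 1 else None
    try:
        row_id = int(raw_id)
    except (TypeError, ValueError):
        # A value that does not fit the declared int type is read back as NULL.
        row_id = None
    return row_id, name


sample = ["1,oo", "4,i9i", "x,bad id"]
rows = [parse_row(line) for line in sample]
for row in rows:
    print(row)
```

The file itself is never rewritten; choosing the wrong delimiter in the table definition just changes how the same bytes are interpreted at query time.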

The table creation statement:
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION hdfs_path]

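CREATE TABLE creates a table with the given name. If a table with the same name already exists, an exception is thrown; the IF NOT EXISTS option suppresses that exception.
The EXTERNAL keyword creates an external table and lets you specify a path to the actual data (LOCATION) at creation time. For an internal (managed) table, Hive moves loaded data into the path managed by the data warehouse; for an external table, Hive only records where the data lives and does not move it. When a table is dropped, an internal table's metadata and data are deleted together, whereas dropping an external table deletes only the metadata and leaves the data in place.
If the data files are plain text, use STORED AS TEXTFILE. If the data needs to be compressed, use STORED AS SEQUENCEFILE.
A partitioned table is created with the PARTITIONED BY clause. A table can have one or more partition columns, and each partition is stored in its own directory. In addition, tables and partitions can be bucketed with CLUSTERED BY on one or more columns, which distributes rows into buckets according to a hash of those columns. SORTED BY can also be used to keep the data inside each bucket sorted. This can improve performance for particular workloads.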
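As a rough illustration of what CLUSTERED BY ... INTO num_buckets BUCKETS does, the sketch below assigns rows to buckets by hashing an int column. It rests on an assumption: that an int column hashes to its own value and that the bucket is (hash & Integer.MAX_VALUE) % num_buckets, which is how Hive's classic bucketing is usually described. The actual hash function depends on the Hive version and the column type, and bucket_for_int is a hypothetical name, not a Hive API.

```python
# Illustrative only: bucket_for_int is a hypothetical helper, not a Hive API.
# Assumption: an int column hashes to its own value and the bucket index is
# (hash & Integer.MAX_VALUE) % num_buckets.

INT_MAX = 0x7FFFFFFF  # Java's Integer.MAX_VALUE


def bucket_for_int(value, num_buckets):
    """Return the bucket index for an int column value."""
    return (value & INT_MAX) % num_buckets


# Distribute ids 1..17 (the ids used in sz4.dat) over 4 buckets.
buckets = {}
for row_id in range(1, 18):
    buckets.setdefault(bucket_for_int(row_id, 4), []).append(row_id)

for index in sorted(buckets):
    print(index, buckets[index])
```

Rows with the same value in the bucketing column always land in the same bucket, which is what makes sampling and some joins cheaper.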

Creating a regular (managed) table

create table if not exists mytable(id int,name string) row format delimited fields terminated by '\0005' stored as textfile;

External table (data is imported the same way)

create external table if not exists myexternaltable(id int,name string) row format delimited fields terminated by ',' location 'hdfs://mini2:9000/user/myhive/warehouse/myexternaltable';

desc extended myexternaltable; (shows more detailed table information)

desc formatted myexternaltable; (shows the detailed information in a formatted layout)

Loading data

0: jdbc:hive2://localhost:10000> load data local inpath '/home/hadoop/study/sz4.dat' overwrite into table myexternaltable;
(overwrite replaces the table's existing data; leave it out if you want to append instead)

Viewing HDFS from within Hive

0: jdbc:hive2://localhost:10000> dfs -ls /user/hive/warehouse/myhive.db/;

Partitioned tables

0: jdbc:hive2://localhost:10000> create table parttable(id int, name string) partitioned by (country string) row format delimited fields terminated by ',';

When loading data, you must specify which partition to load it into.

load data local inpath '/home/hadoop/study/sz4.dat' into table parttable partition(country='US');

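Query: select * from parttable where country='US';

The country column in the query result is a pseudo-column: its value comes from the partition, not from the data file.

Before any data has been inserted, you can alter the table to add partitions.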
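The sketch below illustrates why country behaves like a pseudo-column: each partition lives in its own directory named after the partition value, and that value is appended to every row read from the directory. The warehouse path and both helper names are assumptions made for illustration; only the column=value directory naming is standard Hive behavior.

```python
# Illustrative only: partition_dir and read_partition are hypothetical helpers.

def partition_dir(table_dir, spec):
    """Build the directory for a partition, e.g. .../parttable/country=US."""
    parts = [f"{col}={val}" for col, val in spec.items()]
    return "/".join([table_dir.rstrip("/")] + parts)


def read_partition(lines, spec):
    """Parse 'id,name' lines and append the partition values as extra columns."""
    rows = []
    for line in lines:
        row_id, name = line.split(",")
        rows.append((int(row_id), name) + tuple(spec.values()))
    return rows


table_dir = "/user/hive/warehouse/parttable"  # assumed table location
spec = {"country": "US"}

print(partition_dir(table_dir, spec))
print(read_partition(["1,oo", "2,pp"], spec))
```

Because the value is stored in the directory name, a filter such as country='US' lets Hive read only that directory instead of scanning the whole table.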

ALTER TABLE table_name ADD [IF NOT EXISTS] partition_spec [LOCATION 'location1']
    partition_spec [LOCATION 'location2'] ...

partition_spec: PARTITION (partition_col = partition_col_value, partition_col = partition_col_value, ...)

ALTER TABLE table_name DROP partition_spec, partition_spec, ...

A concrete example

alter table t1 add partition(part='a') partition(part='b');

Differences between partitioning and bucketing

http://www.cnblogs.com/xiohao/p/6429305.html

Describing a table

http://blog.csdn.net/lskyne/article/details/38427895

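Viewing a table's partitions (they can also be seen in the web UI)

show partitions parttable;

The date_sub function: in a script, as long as the date argument is in the correct format, it will be parsed.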

hive> select date_sub('2017-07-01',11) from dual;
OK
2017-06-20
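For readers who want to check the arithmetic outside Hive, here is a small Python sketch that mirrors date_sub for a 'yyyy-MM-dd' string. The Python function is a stand-in written for this post, not Hive's implementation, and it assumes the input is already in that date format.

```python
# Illustrative only: a Python stand-in for Hive's date_sub on 'yyyy-MM-dd' strings.
from datetime import date, timedelta


def date_sub(start, days):
    """Return the date `days` days before `start`, as a 'yyyy-MM-dd' string."""
    year, month, day = (int(part) for part in start.split("-"))
    return (date(year, month, day) - timedelta(days=days)).isoformat()


print(date_sub("2017-07-01", 11))  # 2017-06-20, matching the Hive output above
```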

Listing files on HDFS from within Hive is very fast

hive> dfs -ls /test;
Found 4 items
drwxr-xr-x   - root supergroup          0 2017-09-01 12:22 /test/outpt
drwxr-xr-x   - root supergroup          0 2017-09-01 13:22 /test/outpt1
drwxr-xr-x   - root supergroup          0 2017-09-01 13:33 /test/outpt2
drwxr-xr-x   - root supergroup          0 2017-09-03 09:09 /test/outptx
