A Summary of Simple Hive HQL

Date: 2019-11-24

Tags: hive, simple, hql, summary. Category: Hadoop


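DDL operations: create, drop, and alter tables and databases (in Hive, DDL statements can operate on databases as well as tables).

DML operations: insert, delete, and modify data.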

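Special collection column types in Hive:

Struct, e.g. struct<first:string, last:string>: a single column composed of the named fields first and last.

Map, e.g. map<string, float>: a column of key/value pairs; the declaration gives the key type first and the value type second.

Array, e.g. array<string>: a column holding an ordered list of elements of the same type.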

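After installing Hive (the installation itself is not covered here), type hive to enter the Hive CLI.

Once inside, you start in the default database.

use mydb; -- switch to a database you created yourself

set hive.cli.print.current.db=true; -- show the current database name in the Hive prompt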

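Hive database operations

Create a database

create database if not exists mydb

comment 'my databases' -- description of the database; the comment clause must come before location

location '/my/databases' -- where the data files are stored. By default this is derived from hive.metastore.warehouse.dir, but it can be set explicitly here

with dbproperties('creator' = 'pang','date' = '2015-4-15'); -- creation metadata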

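List databases

show databases; -- or show databases like 'my.*' to list the databases whose names start with my (.* is the wildcard part). To list only the tables inside mydb, use show tables in mydb;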

Show database information

describe database extended mydb; -- without extended, the comment is shown but the dbproperties are not

Drop a database

drop database if exists financials cascade; -- drops the database. cascade also drops its contents: a database that still contains tables can only be dropped with cascade

Alter a database

alter database mydb set dbproperties('edited-by' = 'pang'); -- change the database properties

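Hive table operations

Create an external table

create external table mytable(name string,age int); -- creates an external table; useful when programs other than Hive also need to use the table's data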
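An external table is usually pointed at data that already exists, which the statement above does not show. The following is a minimal sketch of that pattern; the table name ext_people, the comma delimiter, and the path /data/ext_people are hypothetical and chosen only for illustration.

```sql
-- Hypothetical example: ext_people and /data/ext_people are illustrative names.
create external table if not exists ext_people (
  name string,
  age  int
)
row format delimited
fields terminated by ','
location '/data/ext_people';  -- Hive reads the files already in this directory

-- Dropping an external table removes only the metadata;
-- the files under /data/ext_people are left in place.
drop table if exists ext_people;
```

Because the files survive the drop, other tools can keep reading and writing the same directory independently of Hive.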

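Create a table:

create table mytable(name string,age int,friends array<string>,deduction map<string,float>,address struct<street:string,city:string,state:string,zip:int>)

row format delimited -- rows are stored as delimited text

fields terminated by '\001' -- fields are separated by \001, i.e. ^A

collection items terminated by '\002' -- elements of an array, struct, or map are separated by \002, i.e. ^B

map keys terminated by '\003' -- a map key and its value are separated by \003, i.e. ^C

lines terminated by '\n' -- each row ends with a newline

stored as textfile; -- store the data files in textfile format

The delimiters shown here are Hive's defaults. If you omit the delimiter clauses after the create statement, Hive uses exactly these delimiters.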
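Once the collection columns are defined, their elements can be read individually in a query. The sketch below assumes the mytable definition above; the map key 'federal_tax' is a hypothetical key used only for illustration.

```sql
-- Assumes mytable as created above; 'federal_tax' is a hypothetical map key.
select
  name,
  friends[0]               as first_friend,   -- array elements are indexed from 0
  size(friends)            as friend_count,   -- number of elements in the array
  deduction['federal_tax'] as federal_tax,    -- map lookup by key; null if the key is absent
  address.city             as city            -- struct fields use dot notation
from mytable
limit 10;
```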

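Create a new table by copying an existing schema

create table if not exists mydb.mytable1 like mytable; -- the mydb. prefix selects the database

Describe a table

describe extended mydb.mytable; -- append a column, as in mydb.mytable.name, to see the description of a single column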

Rename a table

alter table mytables rename to tables;

Change a column's name and type

alter table mytable change column hms hours_minutes_seconds int after severity;

Add columns

alter table mytable add columns(

app_name string comment 'application name',

session_id bigint

);

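Hive traditionally does not support row-level insert, delete, or update (these only became available later, for transactional tables). For ordinary tables, the only way to put data into a table is a bulk load.

Load a local file into the matching partition of a table

load data local inpath '${env:HOME}/california-employees' overwrite into table mytable partition(country='us');

Insert data selected from another table

insert overwrite table mytable partition (country='us') select * from mytables;

Create a table and populate it in the same statement

create table mytable as select name,age from students where age=18;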

Replace the column list (replace columns removes all existing columns and defines the new set)

alter table mytable replace columns(

message String comment 'the rest of the message'

);

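Hive table partitions

Create a partitioned table

create table mytable(name string,age int,friends array<string>,deduction map<string,float>,address struct<street:string,city:string,state:string,zip:int>)

partitioned by(country string); -- partition by country. The partition key is not declared among the table's regular columns. Each partition is stored as a set of files under its own subdirectory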

Queries on partitioned tables are subject to a strict mode.

set hive.mapred.mode=strict;

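select e.name,e.salary from mytable e limit 100; -- fails with the following error

FAILED: Error in semantic analysis: No partition predicate found for alias "e" table "mytable"

set hive.mapred.mode=nonstrict; -- after this, the query runs

However, since the table is partitioned, you should filter on the partition whenever possible. A query with no partition predicate scans every partition and is inefficient.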
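As a sketch of the recommended form, the query below keeps strict mode on and adds a predicate on the partition column. It assumes the partitioned mytable defined above, with country as its partition key.

```sql
-- Assumes mytable is partitioned by country, as created above.
set hive.mapred.mode=strict;

-- The predicate on the partition column satisfies strict mode
-- and limits the scan to the country='us' subdirectory.
select e.name, e.age
from mytable e
where e.country = 'us'
limit 100;
```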

List partitions

show partitions mytable partition (country='us'); -- lists the partitions under country='us'

Add a partition

alter table mytables add if not exists

partition(year = 2015 ,month =4 ,day =1) location '2015/4/1';
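The same statement can be applied to the country-partitioned mytable from earlier and then checked. This is only a sketch: the partition value 'ca' and the path /my/databases/mytable/country=ca are hypothetical.

```sql
-- Hypothetical partition value and path, for illustration only.
alter table mytable add if not exists
partition (country = 'ca')
location '/my/databases/mytable/country=ca';

-- Confirm that the partition is now registered in the metastore.
show partitions mytable;

-- Inspect the new partition's metadata, including its location.
describe formatted mytable partition (country = 'ca');
```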