Options that run without entering the interactive impala-shell prompt
impala-shell accepts many options on the command line:
-h  Show the help text
impala-shell -h
[root@node03 hive-1.1.0-cdh5.14.0]# impala-shell -h
Usage: impala_shell.py [options]

Options:
  -h, --help            show this help message and exit
  -i IMPALAD, --impalad=IMPALAD
                        <host:port> of impalad to connect to
                        [default: node03.hadoop.com:21000]
  -q QUERY, --query=QUERY
                        Execute a query without the shell [default: none]
  -f QUERY_FILE, --query_file=QUERY_FILE
                        Execute the queries in the query file, delimited by ;.
                        If the argument to -f is "-", then queries are read
                        from stdin and terminated with ctrl-d. [default: none]
  -k, --kerberos        Connect to a kerberized impalad [default: False]
  -o OUTPUT_FILE, --output_file=OUTPUT_FILE
                        If set, query results are written to the g
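-r  Refresh all metadata after connecting (it runs invalidate metadata); with a large amount of metadata this puts a heavy load on the server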
impala-shell -r
# Output
[root@node03 hive-1.1.0-cdh5.14.0]# impala-shell -r
Starting Impala Shell without Kerberos authentication
Connected to node03.hadoop.com:21000
Server version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690)
Invalidating Metadata
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.11.0-cdh5.14.0 (d682065) built on Sat Jan 6 13:27:16 PST 2018)

The HISTORY command lists all shell commands in chronological order.
***********************************************************************************
+==========================================================================+
| DEPRECATION WARNING:                                                     |
| -r/--refresh_after_connect is deprecated and will be removed in a future |
| version of Impala shell.                                                 |
+==========================================================================+
Query: invalidate metadata
Query submitted at: 2019-08-22 14:45:28 (Coordinator: http://node03.hadoop.com:25000)
Query progress can be monitored at: http://node03.hadoop.com:25000/query_plan?query_id=ce4db858e1dfd774:814fabac00000000
Fetched 0 row(s) in 5.04s
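-B  Turn off pretty-printed output and emit plain delimited text; this can improve performance when a query returns a large number of rows
--print_header  Print the column names when output is delimited (used together with -B)
--output_delimiter  Field delimiter for delimited output (the default is a tab)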
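These three formatting options are usually combined. The following is a minimal sketch rather than a definitive recipe: it assumes the hivesql.ods_click_pageviews table used elsewhere in this post, and the function name `export_delimited`, the file name `pageviews.csv`, and the `IMPALA_SHELL` override variable are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# IMPALA_SHELL is a hypothetical override so the function can be dry-run
# (for example IMPALA_SHELL=echo) on a machine without Impala installed.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# export_delimited QUERY OUTPUT_FILE
# Runs QUERY without pretty-printing (-B), keeps the column names
# (--print_header), separates fields with commas, and writes to OUTPUT_FILE.
export_delimited() {
    "$IMPALA_SHELL" -B --print_header --output_delimiter=',' -q "$1" -o "$2"
}

# Only call the real binary when it is actually installed.
if command -v "$IMPALA_SHELL" >/dev/null 2>&1; then
    export_delimited 'select * from hivesql.ods_click_pageviews limit 10' pageviews.csv
else
    echo "$IMPALA_SHELL not found; nothing was run" >&2
fi
```

Wrapping the flags in a function keeps the quoting of the query in one place, which matters once queries contain spaces or asterisks.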
-v  Show the version
impala-shell -v -V
# Output
[root@node03 hive-1.1.0-cdh5.14.0]# impala-shell -v -V
Impala Shell v2.11.0-cdh5.14.0 (d682065) built on Sat Jan 6 13:27:16 PST 2018
-f  Run the queries in a file
--query_file  Path of the query file
cd /export/servers vim impala-shell.sql #寫入下面兩段話 use weblog; select * from ods_click_pageviews limit 10; #賦予可執行權限 chmod 755 imapala-shell.sql #經過-f 參數來執行執行的查詢文件 impala-shell -f impala-shell.sql #結果 [root@node03 hivedatas]# impala-shell -f imapala-shell.sql Starting Impala Shell without Kerberos authentication Connected to node03.hadoop.com:21000 Server version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690) Query: use hivesql Query: select * from ods_click_pageviews limit 10 Query submitted at: 2019-08-22 15:29:54 (Coordinator: http://node03.hadoop.com:25000) Query progress can be monitored at: http://node03.hadoop.com:25000/query_plan?query_id=6a4d51930cf99b9d:21f02c4e00000000 +--------------------------------------+-----------------+-------------+---------------------+----------------------------+------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------+----------+ | session | remote_addr | remote_user | time_local | request | visit_step | page_staylong | http_referer | http_user_agent | body_bytes_sent | status | datestr | 
+--------------------------------------+-----------------+-------------+---------------------+----------------------------+------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------+----------+ | d1328698-d475-4973-86ee-15ad9da8c860 | 1.80.249.223 | - | 2013-09-18 07:57:33 | /hadoop-hive-intro/ | 1 | 60 | "http://www.google.com.hk/url?sa=t&rct=j&q=hive%E7%9A%84%E5%AE%89%E8%A3%85&source=web&cd=2&ved=0CC4QFjAB&url=%68%74%74%70%3a%2f%2f%62%6c%6f%67%2e%66%65%6e%73%2e%6d%65%2f%68%61%64%6f%6f%70%2d%68%69%76%65%2d%69%6e%74%72%6f%2f&ei=5lw5Uo-2NpGZiQfCwoG4BA&usg=AFQjCNF8EFxPuCMrm7CvqVgzcBUzrJZStQ&bvm=bv.52164340,d.aGc&cad=rjt" | "Mozilla/5.0(WindowsNT5.2;rv:23.0)Gecko/20100101Firefox/23.0" | 14764 | 200 | 20130918 | | 0370aa09-ebd6-4d31-b6a5-469050a7fe61 | 101.226.167.201 | - | 2013-09-18 09:30:36 | /hadoop-mahout-roadmap/ | 1 | 60 | "http://blog.fens.me/hadoop-mahout-roadmap/"
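The help text shown earlier also says that `-f -` reads the queries from stdin. A small sketch of that variant, reusing the hivesql database and ods_click_pageviews table from this example; `run_from_stdin` and the `IMPALA_SHELL` override variable are hypothetical names, not part of impala-shell.

```shell
#!/bin/sh
# IMPALA_SHELL is a hypothetical override, useful for dry runs without Impala.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# run_from_stdin
# Feeds whatever arrives on stdin to impala-shell as a list of queries.
run_from_stdin() {
    "$IMPALA_SHELL" -f -
}

# Only call the real binary when it is actually installed.
if command -v "$IMPALA_SHELL" >/dev/null 2>&1; then
    # The quoted heredoc delimiter keeps the shell from expanding anything in the SQL.
    run_from_stdin <<'SQL'
use hivesql;
select * from ods_click_pageviews limit 10;
SQL
else
    echo "$IMPALA_SHELL not found; nothing was run" >&2
fi
```

This avoids leaving a temporary .sql file behind when the statements are generated by another script.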
-i  Connect to a specific impalad
--impalad  The impalad (host:port) that should run the query
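As a rough illustration of `-i`, the sketch below sends the same statement to two impalad daemons in turn. The hosts node02 and node03 and port 21000 are taken from the sessions in this post; `run_on` and the `IMPALA_SHELL` override variable are hypothetical helper names.

```shell
#!/bin/sh
# IMPALA_SHELL is a hypothetical override, useful for dry runs without Impala.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# run_on HOST:PORT [impala-shell options...]
# Connects to the given impalad and passes the remaining options through.
run_on() {
    host=$1
    shift
    "$IMPALA_SHELL" -i "$host" "$@"
}

# Only call the real binary when it is actually installed.
if command -v "$IMPALA_SHELL" >/dev/null 2>&1; then
    for h in node02:21000 node03:21000; do
        run_on "$h" -q 'show databases'
    done
else
    echo "$IMPALA_SHELL not found; nothing was run" >&2
fi
```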
-o  Save the query results to a file
--output_file  Name of the output file
impala-shell -f impala-shell.sql -o fizz.txt
# Output
[root@node03 hivedatas]# impala-shell -f imapala-shell.sql -o fizz.txt
Starting Impala Shell without Kerberos authentication
Connected to node03.hadoop.com:21000
Server version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690)
Query: use hivesql
Query: select * from ods_click_pageviews limit 10
Query submitted at: 2019-08-22 15:31:45 (Coordinator: http://node03.hadoop.com:25000)
Query progress can be monitored at: http://node03.hadoop.com:25000/query_plan?query_id=7c421ab5d208f3b1:dec5a09300000000
Fetched 10 row(s) in 0.13s
# The current directory now contains a new file, fizz.txt
[root@node03 hivedatas]# ll
total 2592
-rw-r--r-- 1 root root     511 Aug 21  2017 dim_time_dat.txt
-rw-r--r-- 1 root root    9926 Aug 22 15:31 fizz.txt
-rwxr-xr-x 1 root root      57 Aug 22 15:29 imapala-shell.sql
-rwxrwxrwx 1 root root     133 Aug 20 00:36 movie.txt
-rw-r--r-- 1 root root   18372 Jun 17 18:33 pageview2
-rwxr-xr-x 1 root root     154 Aug 20 00:32 test.txt
-rw-r--r-- 1 root root     327 Aug 20 02:37 user_table
-rw-r--r-- 1 root root   10361 Jun 18 09:00 visit2
-rw-r--r-- 1 root root 2587511 Jun 17 18:05 weblog2
-p  Show the query profile (which includes the query plan) after each query runs
impala-shell -f impala-shell.sql -p
-q  Run a SQL snippet given directly on the command line
impala-shell -q "use hivesql;select * from ods_click_pageviews limit 10;" [root@node03 hivedatas]# impala-shell -q "use hivesql;select * from ods_click_pageviews limit 10;" Starting Impala Shell without Kerberos authentication Connected to node03.hadoop.com:21000 Server version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690) Query: use hivesql Query: select * from ods_click_pageviews limit 10 Query submitted at: 2019-08-22 15:36:58 (Coordinator: http://node03.hadoop.com:25000) Query progress can be monitored at: http://node03.hadoop.com:25000/query_plan?query_id=b443d56565419f60:a149235700000000 +--------------------------------------+-----------------+-------------+---------------------+----------------------------+------------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------+----------+ | session | remote_addr | remote_user | time_local | request | visit_step | page_staylong | http_referer | http_user_agent | body_bytes_sent | status | datestr |
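A common use of `-q` in scripts is to capture a single value in a shell variable. The sketch below combines it with `-B` so that only the bare value is printed. It assumes that impala-shell writes its status messages (the "Starting Impala Shell" and "Query:" lines) to stderr, which is why stderr is discarded; `row_count` and the `IMPALA_SHELL` override variable are hypothetical names.

```shell
#!/bin/sh
# IMPALA_SHELL is a hypothetical override, useful for dry runs without Impala.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# row_count TABLE
# Prints the number of rows in TABLE as plain text.
# Assumption: status messages go to stderr, so stdout holds only the result.
row_count() {
    "$IMPALA_SHELL" -B -q "select count(*) from $1" 2>/dev/null
}

# Only call the real binary when it is actually installed.
if command -v "$IMPALA_SHELL" >/dev/null 2>&1; then
    n=$(row_count hivesql.ods_click_pageviews)
    echo "ods_click_pageviews has $n rows"
else
    echo "$IMPALA_SHELL not found; nothing was run" >&2
fi
```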
Statements available inside the interactive impala-shell prompt
Enter impala-shell:
impala-shell    # from any directory
# Output
[root@node03 hivedatas]# impala-shell
Starting Impala Shell without Kerberos authentication
Connected to node03.hadoop.com:21000
Server version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.11.0-cdh5.14.0 (d682065) built on Sat Jan 6 13:27:16 PST 2018)

To see more tips, run the TIP command.
***********************************************************************************
[node03.hadoop.com:21000] >
Help
[node03.hadoop.com:21000] > help;

Documented commands (type help <topic>):
========================================
compute  describe  explain  profile  rerun   set    show  unset  values   with
connect  exit      history  quit     select  shell  tip   use    version

Undocumented commands:
======================
alter   delete  drop  insert  source  summary  upsert
create  desc    help  load    src     update
connect hostname  Connect to a specific host and run statements there
connect node02;
# Output
[node03.hadoop.com:21000] > connect node02;
Connected to node02:21000
Server version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690)
[node02:21000] >
refresh dbname.tablename  Incremental refresh: reloads the metadata of a single table. Use it mainly when the data of an existing table has been changed through Hive.
refresh movie_info;
# Output
[node03:21000] > refresh movie_info;
Query: refresh movie_info
Query submitted at: 2019-08-22 15:49:24 (Coordinator: http://node03.hadoop.com:25000)
Query progress can be monitored at: http://node03.hadoop.com:25000/query_plan?query_id=f74330d533ff2402:27364f7600000000
Fetched 0 row(s) in 0.27s
invalidate metadata  Full refresh: expensive. Use it mainly after a database or table has been created in Hive.
invalidate metadata;
# Output
[node03:21000] > invalidate metadata;
Query: invalidate metadata
Query submitted at: 2019-08-22 15:48:04 (Coordinator: http://node03.hadoop.com:25000)
Query progress can be monitored at: http://node03.hadoop.com:25000/query_plan?query_id=6a431748d41bc369:7eeb053400000000
Fetched 0 row(s) in 2.87s
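The choice between the two statements can be scripted. The sketch below is one possible wrapper, not an official tool: it issues REFRESH when the data of an existing table changed, and a table-scoped INVALIDATE METADATA (the statement also accepts a table name, which avoids reloading everything) when the table was newly created in Hive. `sync_table`, its `data`/`new` modes, and the `IMPALA_SHELL` override variable are hypothetical names; hivesql.user_table is the table from the explain example in this post.

```shell
#!/bin/sh
# IMPALA_SHELL is a hypothetical override, useful for dry runs without Impala.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# sync_table MODE DB.TABLE
#   MODE=data : data of an existing table changed in Hive -> REFRESH
#   MODE=new  : table was created in Hive -> INVALIDATE METADATA for that table
sync_table() {
    case "$1" in
        data) stmt="refresh $2" ;;
        new)  stmt="invalidate metadata $2" ;;
        *)    echo "usage: sync_table data|new db.table" >&2
              return 2 ;;
    esac
    "$IMPALA_SHELL" -q "$stmt"
}

# Only call the real binary when it is actually installed.
if command -v "$IMPALA_SHELL" >/dev/null 2>&1; then
    sync_table data hivesql.user_table
else
    echo "$IMPALA_SHELL not found; nothing was run" >&2
fi
```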
explain  Shows the execution plan of a SQL statement
explain select * from user_table;
# Output
[node03:21000] > explain select * from user_table;
Query: explain select * from user_table
+------------------------------------------------------------------------------------+
| Explain String |
+------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=0B |
| Per-Host Resource Estimates: Memory=32.00MB |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| hivesql.user_table |
| |
| PLAN-ROOT SINK |
| | |
| 01:EXCHANGE [UNPARTITIONED] |
| | |
| 00:SCAN HDFS [hivesql.user_table] |
| partitions=1/1 files=1 size=327B |
+------------------------------------------------------------------------------------+
Fetched 11 row(s) in 3.99s
explain_level can be set to 0, 1, 2, or 3; level 3 is the highest and prints the most complete information
set explain_level=3;
# Output
[node03:21000] > set explain_level=3;
EXPLAIN_LEVEL set to 3
[node03:21000] >
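The same setting can be applied for a single non-interactive run by putting the `set` statement in front of the `explain` inside one `-q` string. A minimal sketch, assuming the hivesql.user_table table from the example above; `explain_at_level` and the `IMPALA_SHELL` override variable are hypothetical names.

```shell
#!/bin/sh
# IMPALA_SHELL is a hypothetical override, useful for dry runs without Impala.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# explain_at_level LEVEL QUERY
# Sets explain_level for this one invocation, then prints the plan of QUERY.
explain_at_level() {
    "$IMPALA_SHELL" -q "set explain_level=$1; explain $2"
}

# Only call the real binary when it is actually installed.
if command -v "$IMPALA_SHELL" >/dev/null 2>&1; then
    explain_at_level 3 'select * from hivesql.user_table'
else
    echo "$IMPALA_SHELL not found; nothing was run" >&2
fi
```

Because the option is set inside the one-off session, it does not affect other sessions on the same impalad.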
profile  Run it right after a SQL statement to print a more detailed breakdown of the execution steps; it is mainly used to inspect how a query ran and to tune the cluster
select * from user_table;
profile;
# Partial output
[node03:21000] > profile;
Query Runtime Profile:
Query (id=ff4799938b710fbb:7997836800000000):
  Summary:
    Session ID: a14d3b3894050309:7f300ddf8dcd8584
    Session Type: BEESWAX
    Start Time: 2019-08-22 15:58:22.786612000
    End Time: 2019-08-22 15:58:24.558806000
    Query Type: QUERY
    Query State: FINISHED
    Query Status: OK
    Impala Version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690)
    User: root
    Connected User: root
    Delegated User:
    Network Address: ::ffff:192.168.52.120:48318
    Default Db: hivesql
    Sql Statement: select * from user_table
    Coordinator: node03.hadoop.com:22000
    Query Options (set by configuration): EXPLAIN_LEVEL=3
    Query Options (set by configuration and planner): EXPLAIN_LEVEL=3,MT_DOP=0
    Plan:
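Note: data inserted from the Hive shell, and databases or tables created there, cannot be queried from Impala right away; the metadata has to be refreshed first. Data inserted through impala-shell can be queried from Impala immediately, with no refresh. This is handled by the catalog service (catalogd), a component added in Impala 1.2 whose main job is to synchronize metadata across the Impala nodes.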
impala-shell    # enter the Impala interactive prompt
show databases;
Create a database
CREATE DATABASE IF NOT EXISTS mydb1;
drop database if exists mydb;
Create the student table
CREATE TABLE IF NOT EXISTS mydb1.student (name STRING, age INT, contact INT );
Create the employee table
create table employee (Id INT, name STRING, age INT,address STRING, salary BIGINT);
insert into employee (ID,NAME,AGE,ADDRESS,SALARY)VALUES (1, 'Ramesh', 32, 'Ahmedabad', 20000 );
insert into employee values (2, 'Khilan', 25, 'Delhi', 15000 );
Insert into employee values (3, 'kaushik', 23, 'Kota', 30000 );
Insert into employee values (4, 'Chaitali', 25, 'Mumbai', 35000 );
Insert into employee values (5, 'Hardik', 27, 'Bhopal', 40000 );
Insert into employee values (6, 'Komal', 22, 'MP', 32000 );
Overwriting data
Insert overwrite employee values (1, 'Ram', 26, 'Vishakhapatnam', 37000 );
After the overwrite, this is the only row left in the table
Another way to create a table
create table customer as select * from employee;
select * from employee;
select name,age from employee;
DROP table mydb1.employee;
truncate employee;
CREATE VIEW IF NOT EXISTS employee_view AS select name, age from employee;
select * from employee_view;
Basic syntax
select * from table_name ORDER BY col_name [ASC|DESC] [NULLS FIRST|NULLS LAST]
Select * from employee ORDER BY id asc;
Select name, sum(salary) from employee Group BY name;
Basic syntax
select col_name, agg_func(col_name2) from table_name GROUP BY col_name HAVING condition
Group the table by age, take the maximum salary of each group, and show only the maximums greater than 20000
select max(salary) from employee group by age having max(salary) > 20000;
select * from employee order by id limit 4;
Method 1: load data from HDFS into Impala with load data
create table user(id int ,name string,age int ) row format delimited fields terminated by "\t";
Prepare the data file user.txt and upload it to the /user/impala path on HDFS
Upload user.txt to Hadoop:
hdfs dfs -put user.txt /user/impala/
Check that the upload succeeded:
hdfs dfs -ls /user/impala
Contents of user.txt (fields separated by a tab):
1    kasha    15
2    fizz     20
3    pheonux  30
4    manzi    50
Load the data
load data inpath '/user/impala/' into table user;
Query the loaded data
select * from user;
If the query returns no data, refresh the table
refresh user;
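The steps of the first method can be strung together in one script. This is only a sketch under the assumptions of this section: the table user already exists with the tab-delimited definition shown above, /user/impala is the upload directory, and both the hdfs and impala-shell commands are on the PATH. The script skips the cluster steps when either command is missing.

```shell
#!/bin/sh
# Build the tab-separated sample file from this section.
printf '1\tkasha\t15\n2\tfizz\t20\n3\tpheonux\t30\n4\tmanzi\t50\n' > user.txt

# Run the cluster steps only when both command-line tools are installed.
if command -v hdfs >/dev/null 2>&1 && command -v impala-shell >/dev/null 2>&1; then
    # Create the target directory if needed and upload, overwriting an old copy.
    hdfs dfs -mkdir -p /user/impala &&
    hdfs dfs -put -f user.txt /user/impala/ &&
    # Load the file (the load moves it out of /user/impala), refresh, then query.
    impala-shell -q "load data inpath '/user/impala/user.txt' into table user; refresh user; select * from user;"
else
    echo "hdfs or impala-shell not found; only user.txt was written" >&2
fi
```

Naming the file in the load statement, instead of the whole directory, keeps unrelated files in /user/impala from being moved into the table.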
Method 2:
create table user2 as select * from user;
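Method 3:
insert into    # not recommended, because it produces a large number of small files
Never use Impala as if it were an ordinary database
Method 4:
insert into ... select    # used quite often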
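To make the fourth method concrete, the sketch below writes a small query file that copies rows in bulk with insert into ... select and then runs it with `-f`. The table user comes from this section; the target table user3 and the file name copy_user.sql are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Write the statements to a query file (copy_user.sql is a hypothetical name).
# The quoted heredoc delimiter keeps the shell from touching the SQL text.
cat > copy_user.sql <<'SQL'
create table if not exists user3 like user;
insert into user3 select * from user;
SQL

# Run the file only when impala-shell is actually installed.
if command -v impala-shell >/dev/null 2>&1; then
    impala-shell -f copy_user.sql
else
    echo "impala-shell not found; only copy_user.sql was written" >&2
fi
```

A single insert into ... select writes the copied rows in bulk, which is why it is preferred over many single-row insert into ... values statements that each leave a small file behind.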