1. Purpose of This Document
This document describes how to use Sentry to manage permissions on Hive external tables, based on the following assumptions:
1. Operating system: Red Hat 6.5
2. CM version: CM 5.11.1
3. Kerberos and Sentry are enabled on the cluster
4. Operations are performed as the ec2-user user, which has sudo privileges
2. Prerequisites
2.1 Create the Parent Directory for External Table Data
1. Authenticate to Kerberos as the hive user
[root@ip-172-31-8-141 1874-hive-HIVESERVER2]# kinit -kt hive.keytab hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
[root@ip-172-31-8-141 1874-hive-HIVESERVER2]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
Valid starting     Expires            Service principal
09/01/17 11:10:54  09/02/17 11:10:54  krbtgt/CLOUDERA.COM@CLOUDERA.COM
        renew until 09/06/17 11:10:54
[root@ip-172-31-8-141 1874-hive-HIVESERVER2]#
2. Create the HDFS directory
Use the following commands to create the Hive external table data directory /extwarehouse under the HDFS root:
[root@ip-172-31-8-141 ec2-user]# hadoop fs -mkdir /extwarehouse
[root@ip-172-31-8-141 ec2-user]# hadoop fs -ls /
drwxr-xr-x   - hive   supergroup          0 2017-09-01 11:27 /extwarehouse
drwxrwxrwx   - user_r supergroup          0 2017-08-23 03:23 /fayson
drwx------   - hbase  hbase               0 2017-09-01 02:59 /hbase
drwxrwxrwt   - hdfs   supergroup          0 2017-08-31 06:18 /tmp
drwxrwxrwx   - hdfs   supergroup          0 2017-08-30 03:48 /user
[root@ip-172-31-8-141 ec2-user]# hadoop fs -chown hive:hive /extwarehouse
[root@ip-172-31-8-141 ec2-user]# hadoop fs -chmod 771 /extwarehouse
[root@ip-172-31-8-141 ec2-user]# hadoop fs -ls /
drwxrwx--x   - hive   hive                0 2017-09-01 11:27 /extwarehouse
drwxrwxrwx   - user_r supergroup          0 2017-08-23 03:23 /fayson
drwx------   - hbase  hbase               0 2017-09-01 02:59 /hbase
drwxrwxrwt   - hdfs   supergroup          0 2017-08-31 06:18 /tmp
drwxrwxrwx   - hdfs   supergroup          0 2017-08-30 03:48 /user
[root@ip-172-31-8-141 ec2-user]#
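The chmod 771 above gives the hive user and the hive group full access to /extwarehouse while leaving only the execute (directory-traverse) bit for everyone else, which is why the directory later lists as drwxrwx--x. A quick local sketch decoding the octal digits (illustration only, not part of the Hadoop tooling):

```python
def decode_octal_mode(mode):
    """Turn an octal mode string like '771' into rwx triplets for user/group/other."""
    triplets = []
    for digit in mode:
        bits = int(digit, 8)
        triplets.append(
            ("r" if bits & 4 else "-")
            + ("w" if bits & 2 else "-")
            + ("x" if bits & 1 else "-")
        )
    return triplets

print(decode_octal_mode("771"))  # ['rwx', 'rwx', '--x']
```

Because "other" only has the execute bit, users outside the hive group can traverse into subdirectories they are explicitly allowed to reach, but cannot list or read /extwarehouse itself.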
2.2 Configure ACL Synchronization for the External Table Data Parent Directory
1. Make sure Sentry is enabled for HDFS and ACL synchronization is turned on
2. Configure the Sentry synchronization path prefix (the Hive external table data directory created in 2.1)
3. When the configuration is complete, restart the affected services.
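If the prefix is set through an HDFS configuration snippet rather than the dedicated Cloudera Manager field, it typically takes the following form (the property name comes from Cloudera's HDFS/Sentry synchronization documentation; the exact value shown here, with the default warehouse path plus the directory created in 2.1, is an assumption for illustration):

```xml
<property>
  <name>sentry.hdfs.integration.path.prefixes</name>
  <value>/user/hive/warehouse,/extwarehouse</value>
</property>
```

Only paths under these prefixes have Sentry privileges mirrored to HDFS ACLs, so the external table directory must be listed explicitly.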
3. Create the Hive External Table
1. Connect to Hive with the beeline command line and create the external table
Table DDL:
create external table if not exists student(
  name string,
  age int,
  addr string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/extwarehouse/student';
終端操做:
[root@ip-172-31-8-141 1874-hive-HIVESERVER2]# beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
...
0: jdbc:hive2://localhost:10000/> create external table if not exists student(
. . . . . . . . . . . . . . . . > name string,
. . . . . . . . . . . . . . . . > age int,
. . . . . . . . . . . . . . . . > addr string
. . . . . . . . . . . . . . . . > )
. . . . . . . . . . . . . . . . > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
. . . . . . . . . . . . . . . . > LOCATION '/extwarehouse/student';
...
INFO  : OK
No rows affected (0.236 seconds)
0: jdbc:hive2://localhost:10000/>
2. Load data into the student table
Prepare the test data:
[root@ip-172-31-8-141 student]# pwd
/home/ec2-user/student
[root@ip-172-31-8-141 student]# ll
total 4
-rw-r--r-- 1 root root 39 Sep  1 11:37 student.txt
[root@ip-172-31-8-141 student]# cat student.txt
zhangsan,18,guangzhou
lisi,20,shenzhen
[root@ip-172-31-8-141 student]#
Put the student.txt file into the /tmp/student directory on HDFS:
[root@ip-172-31-8-141 student]# hadoop fs -mkdir /tmp/student
[root@ip-172-31-8-141 student]# ll
total 4
-rw-r--r-- 1 hive hive 39 Sep  1 11:37 student.txt
[root@ip-172-31-8-141 student]# hadoop fs -put student.txt /tmp/student
[root@ip-172-31-8-141 student]# hadoop fs -ls /tmp/student
Found 1 items
-rw-r--r--   3 hive supergroup         39 2017-09-01 11:57 /tmp/student/student.txt
[root@ip-172-31-8-141 student]#
In the beeline command line, load the data into the student table:
0: jdbc:hive2://localhost:10000/> load data inpath '/tmp/student' into table student;
...
INFO  : Table default.student stats: [numFiles=1, totalSize=39]
INFO  : Completed executing command(queryId=hive_20170901115858_5a76aa76-1b24-40ce-8254-42991856c05b); Time taken: 0.263 seconds
INFO  : OK
No rows affected (0.41 seconds)
0: jdbc:hive2://localhost:10000/>
After the load command completes, query the table:
0: jdbc:hive2://localhost:10000/> select * from student;
...
INFO  : OK
+---------------+--------------+---------------+--+
| student.name  | student.age  | student.addr  |
+---------------+--------------+---------------+--+
| zhangsan      | 18           | guangzhou     |
| lisi          | 20           | shenzhen      |
+---------------+--------------+---------------+--+
2 rows selected (0.288 seconds)
0: jdbc:hive2://localhost:10000/>
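The query returns the two rows from student.txt because the DDL's ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' maps each comma-separated line to the (name, age, addr) columns in order. A local sketch of that split, with no Hive dependency (illustration only):

```python
# Each line of student.txt is split on ',' and zipped against the declared columns,
# which is how the delimited rows become the table rows shown above.
lines = ["zhangsan,18,guangzhou", "lisi,20,shenzhen"]
columns = ("name", "age", "addr")
records = [dict(zip(columns, line.split(","))) for line in lines]
for rec in records:
    print(rec["name"], rec["age"], rec["addr"])
```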
4. Verification as the fayson User in beeline and impala-shell
Initialize a Kerberos ticket with the fayson user's principal:
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ kinit fayson
Password for fayson@CLOUDERA.COM:
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: fayson@CLOUDERA.COM
Valid starting     Expires            Service principal
09/01/17 12:27:39  09/02/17 12:27:39  krbtgt/CLOUDERA.COM@CLOUDERA.COM
        renew until 09/08/17 12:27:39
[ec2-user@ip-172-31-8-141 cdh-shell-master]$
4.1 Access the HDFS Directory
[ec2-user@ip-172-31-8-141 ~]$ hadoop fs -ls /extwarehouse/student
ls: Permission denied: user=fayson, access=READ_EXECUTE, inode="/extwarehouse/student":hive:hive:drwxrwx--x
[ec2-user@ip-172-31-8-141 ~]$
4.2 Query from the beeline Command Line
[ec2-user@ip-172-31-8-141 ~]$ beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
...
INFO  : OK
+-----------+--+
| tab_name  |
+-----------+--+
+-----------+--+
No rows selected (0.295 seconds)
0: jdbc:hive2://localhost:10000/> select * from student;
Error: Error while compiling statement: FAILED: SemanticException No valid privileges
 User fayson does not have privileges for QUERY
 The required privileges: Server=server1->Db=default->Table=student->Column=addr->action=select; (state=42000,code=40000)
0: jdbc:hive2://localhost:10000/>
4.3 Query from the impala-shell Command Line
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ impala-shell
...
[Not connected] > connect ip-172-31-10-156.ap-southeast-1.compute.internal:21000;
Connected to ip-172-31-10-156.ap-southeast-1.compute.internal:21000
Server version: impalad version 2.8.0-cdh5.11.1 RELEASE (build 3382c1c488dff12d5ca8d049d2b59babee605b4e)
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > show tables;
Query: show tables
ERROR: AuthorizationException: User 'fayson@CLOUDERA.COM' does not have privileges to access: default.*
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > select * from student;
Query: select * from student
Query submitted at: 2017-09-01 12:33:06 (Coordinator: http://ip-172-31-10-156.ap-southeast-1.compute.internal:25000)
ERROR: AuthorizationException: User 'fayson@CLOUDERA.COM' does not have privileges to execute 'SELECT' on: default.student
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] >
4.4 Test Summary
For the external table created by the hive user, before the fayson user is granted read permission on the student table, fayson cannot access the table's HDFS data directory (/extwarehouse/student), and cannot query the student table's data from either the beeline or impala-shell command line.
5. Grant the fayson User Read Permission on the student Table
Note: all of the following operations are performed as a Hive administrator user.
1. Create the student_read role
0: jdbc:hive2://localhost:10000/> create role student_read;
...
INFO  : Executing command(queryId=hive_20170901124848_927878ba-0217-4a32-a508-bf29fed67be8): create role student_read
...
INFO  : OK
No rows affected (0.104 seconds)
0: jdbc:hive2://localhost:10000/>
2. Grant the SELECT privilege on the student table to the student_read role
0: jdbc:hive2://localhost:10000/> grant select on table student to role student_read;
...
INFO  : Executing command(queryId=hive_20170901125252_8702d99d-d8eb-424e-929d-5df352828e2c): grant select on table student to role student_read
...
INFO  : OK
No rows affected (0.111 seconds)
0: jdbc:hive2://localhost:10000/>
3. Grant the student_read role to the fayson group
0: jdbc:hive2://localhost:10000/> grant role student_read to group fayson;
...
INFO  : Executing command(queryId=hive_20170901125454_5f27a87e-2f63-46d9-9cce-6f346a0c415c): grant role student_read to group fayson
...
INFO  : OK
No rows affected (0.122 seconds)
0: jdbc:hive2://localhost:10000/>
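The two grants above wire together the chain Sentry evaluates at query time: a user belongs to groups, groups are granted roles, and roles hold privileges. A minimal sketch of that lookup (hypothetical helper and dictionary names, not Sentry's actual API):

```python
# Sentry-style authorization chain: user -> group -> role -> privilege.
user_groups = {"fayson": ["fayson"]}                      # OS/LDAP group membership
group_roles = {"fayson": ["student_read"]}                # grant role ... to group ...
role_privs = {"student_read": [("default", "student", "SELECT")]}  # grant select ... to role ...

def can_select(user, db, table):
    """Return True if any role reachable from the user's groups grants SELECT."""
    for group in user_groups.get(user, []):
        for role in group_roles.get(group, []):
            if (db, table, "SELECT") in role_privs.get(role, []):
                return True
    return False

print(can_select("fayson", "default", "student"))  # True: fayson -> fayson -> student_read -> SELECT
print(can_select("someone_else", "default", "student"))  # False: no group/role chain
```

Note that Sentry roles are always granted to groups, never directly to users, which is why step 3 targets the fayson group rather than the fayson principal.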
6. Retest
Log in to Kerberos as the fayson user.
6.1 Access the HDFS Directory
Access the HDFS directory /extwarehouse/student where the student data resides:
[ec2-user@ip-172-31-8-141 ~]$ hadoop fs -ls /extwarehouse/student
Found 1 items
-rwxrwx--x+  3 hive hive         39 2017-09-01 14:42 /extwarehouse/student/student.txt
[ec2-user@ip-172-31-8-141 ~]$
6.2 Query the student Table in beeline
[ec2-user@ip-172-31-8-141 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: fayson@CLOUDERA.COM
Valid starting     Expires            Service principal
09/01/17 12:58:59  09/02/17 12:58:59  krbtgt/CLOUDERA.COM@CLOUDERA.COM
        renew until 09/08/17 12:58:59
[ec2-user@ip-172-31-8-141 ~]$
[ec2-user@ip-172-31-8-141 ~]$ beeline
Beeline version 1.1.0-cdh5.11.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-8-141.ap-southeast-1.compute.internal@CLOUDERA.COM
...
INFO  : OK
+-----------+--+
| tab_name  |
+-----------+--+
| student   |
+-----------+--+
1 row selected (0.294 seconds)
0: jdbc:hive2://localhost:10000/> select * from student;
...
INFO  : OK
+---------------+--------------+---------------+--+
| student.name  | student.age  | student.addr  |
+---------------+--------------+---------------+--+
| zhangsan      | 18           | guangzhou     |
| lisi          | 20           | shenzhen      |
+---------------+--------------+---------------+--+
2 rows selected (0.241 seconds)
0: jdbc:hive2://localhost:10000/>
6.3 Query the student Table in impala-shell
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: fayson@CLOUDERA.COM
Valid starting     Expires            Service principal
09/01/17 12:58:59  09/02/17 12:58:59  krbtgt/CLOUDERA.COM@CLOUDERA.COM
        renew until 09/08/17 12:58:59
[ec2-user@ip-172-31-8-141 cdh-shell-master]$ impala-shell
...
[Not connected] > connect ip-172-31-10-156.ap-southeast-1.compute.internal:21000;
Connected to ip-172-31-10-156.ap-southeast-1.compute.internal:21000
Server version: impalad version 2.8.0-cdh5.11.1 RELEASE (build 3382c1c488dff12d5ca8d049d2b59babee605b4e)
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > show tables;
Query: show tables
+---------+
| name    |
+---------+
| student |
+---------+
Fetched 1 row(s) in 0.02s
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] > select * from student;
...
+----------+-----+-----------+
| name     | age | addr      |
+----------+-----+-----------+
| zhangsan | 18  | guangzhou |
| lisi     | 20  | shenzhen  |
+----------+-----+-----------+
Fetched 2 row(s) in 0.13s
[ip-172-31-10-156.ap-southeast-1.compute.internal:21000] >
6.4 Test Summary
For the external table created by the hive user, after the fayson user is granted read permission on the student table, fayson can access the table's HDFS data directory (/extwarehouse/student), and can query the student table's data from both the beeline and impala-shell command lines.
7. Summary: Managing Hive External Table Permissions with Sentry
Once ACL synchronization is enabled for the external table's data parent directory, there is no need to maintain permissions on the external table data directories separately: the Sentry grants are automatically reflected as HDFS ACLs, as the "+" in the -rwxrwx--x+ listing above shows.
Reference:
https://www.cloudera.com/documentation/enterprise/latest/topics/sg_hdfs_sentry_sync.html