For installing Kerberos and configuring Kerberos authentication for HDFS, see "HDFS配置kerberos認證".
For installing Kerberos and configuring Kerberos authentication for YARN, see "YARN配置kerberos認證".
For installing Kerberos and configuring Kerberos authentication for Hive, see "Hive配置kerberos認證".
Finish configuring Kerberos authentication for HDFS, YARN, and Hive first, and only then configure Impala to integrate with Kerberos!
Install the Hadoop cluster by following "使用yum安裝CDH Hadoop集羣" (installing a CDH Hadoop cluster with yum). The cluster has three nodes; each node's IP, hostname, and deployed components are as follows:
192.168.56.121 cdh1 NameNode, Hive, ResourceManager, HBase, impala-state-store, impala-catalog, Kerberos Server
192.168.56.122 cdh2 DataNode, SSNameNode, NodeManager, HBase, impala-server
192.168.56.123 cdh3 DataNode, HBase, NodeManager, impala-server
Note: use lower-case hostnames, otherwise you will run into errors when integrating Kerberos.
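Mixed-case hostnames break Kerberos because the impala/_HOST@REALM principal is built from the machine's FQDN and principal names are case-sensitive. A small pre-flight check like the following can be run on each node (a sketch; the function names are ours, not part of any CDH tooling):

```shell
# to_lower: normalize a hostname to lower case; Kerberos principal names
# are case-sensitive, so CDH1 and cdh1 map to different principals.
to_lower() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# check_hostname: print "ok" when the name is already lower case,
# otherwise print a warning so the host can be renamed before kerberizing.
check_hostname() {
  if [ "$1" = "$(to_lower "$1")" ]; then
    echo "ok"
  else
    echo "WARNING: mixed-case hostname: $1"
  fi
}
```

Run it against the local FQDN with `check_hostname "$(hostname -f)"`.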
Run the following commands on every node:
$ yum install python-devel openssl-devel python-pip cyrus-sasl cyrus-sasl-gssapi cyrus-sasl-devel -y
$ pip-python install ssl
On cdh1, the KDC server node, run the following commands:
$ cd /var/kerberos/krb5kdc/
kadmin.local -q "addprinc -randkey impala/cdh1@JAVACHEN.COM"
kadmin.local -q "addprinc -randkey impala/cdh2@JAVACHEN.COM"
kadmin.local -q "addprinc -randkey impala/cdh3@JAVACHEN.COM"
kadmin.local -q "xst -k impala-unmerge.keytab impala/cdh1@JAVACHEN.COM"
kadmin.local -q "xst -k impala-unmerge.keytab impala/cdh2@JAVACHEN.COM"
kadmin.local -q "xst -k impala-unmerge.keytab impala/cdh3@JAVACHEN.COM"
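The six kadmin.local commands above repeat one pattern per host. A generator loop like this (a sketch; it only prints the commands so you can review them, then pipe the output to sh on the KDC host) avoids copy-paste mistakes when the node list grows:

```shell
# gen_impala_princ_cmds REALM HOST...: print the kadmin.local commands that
# create an impala principal per host and export each one into
# impala-unmerge.keytab. Nothing is executed; pipe the output to sh to apply.
gen_impala_princ_cmds() {
  realm=$1; shift
  for host in "$@"; do
    echo "kadmin.local -q \"addprinc -randkey impala/${host}@${realm}\""
    echo "kadmin.local -q \"xst -k impala-unmerge.keytab impala/${host}@${realm}\""
  done
}
```

Usage on the KDC node: `gen_impala_princ_cmds JAVACHEN.COM cdh1 cdh2 cdh3 | sh`.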
Additionally, if you use HAProxy for load balancing (see the official document "Using Impala through a Proxy for High Availability"), you also need to generate proxy.keytab:
$ cd /var/kerberos/krb5kdc/
# proxy is the host where HAProxy is installed
kadmin.local -q "addprinc -randkey impala/proxy@JAVACHEN.COM"
kadmin.local -q "xst -k proxy.keytab impala/proxy@JAVACHEN.COM"
Merge proxy.keytab and impala-unmerge.keytab into impala.keytab:
$ ktutil
ktutil: rkt proxy.keytab
ktutil: rkt impala-unmerge.keytab
ktutil: wkt impala.keytab
ktutil: quit
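The interactive ktutil session above can also be scripted, which helps when keytabs have to be regenerated later. A sketch (assuming MIT Kerberos ktutil on the PATH; the merge_keytabs name is ours):

```shell
# merge_keytabs OUT IN...: feed ktutil one "rkt" line per input keytab and
# a final "wkt" line, producing the merged keytab non-interactively.
merge_keytabs() {
  out=$1; shift
  {
    for kt in "$@"; do
      echo "rkt $kt"
    done
    echo "wkt $out"
    echo "quit"
  } | ktutil
}
```

Usage for the merge above: `merge_keytabs impala.keytab proxy.keytab impala-unmerge.keytab`.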
Copy the impala.keytab file to the /etc/impala/conf directory on each node:
$ scp impala.keytab cdh1:/etc/impala/conf
$ scp impala.keytab cdh2:/etc/impala/conf
$ scp impala.keytab cdh3:/etc/impala/conf
and set permissions by running the following on cdh1, cdh2, and cdh3:
$ ssh cdh1 "cd /etc/impala/conf/;chown impala:hadoop *.keytab ;chmod 400 *.keytab"
$ ssh cdh2 "cd /etc/impala/conf/;chown impala:hadoop *.keytab ;chmod 400 *.keytab"
$ ssh cdh3 "cd /etc/impala/conf/;chown impala:hadoop *.keytab ;chmod 400 *.keytab"
Because a keytab is effectively a permanent credential that requires no password (if the principal's password is changed in the KDC, the keytab becomes invalid), any user with read access to the file could impersonate the principals in it when accessing Hadoop. Therefore, make sure the keytab file is readable only by its owner (mode 0400).
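A quick audit can confirm that no keytab is readable by anyone but its owner (a sketch; it assumes GNU stat as shipped with RHEL/CentOS, and the function name is ours):

```shell
# keytab_mode_ok FILE: succeed only when FILE's permission bits are
# exactly 0400 (owner read-only), the mode recommended for keytabs.
keytab_mode_ok() {
  [ "$(stat -c '%a' "$1")" = "400" ]
}
```

Usage: `keytab_mode_ok /etc/impala/conf/impala.keytab || echo "fix keytab permissions"`.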
Edit /etc/default/impala on cdh1 and add the following parameters to IMPALA_CATALOG_ARGS, IMPALA_SERVER_ARGS, and IMPALA_STATE_STORE_ARGS:
-kerberos_reinit_interval=60
-principal=impala/_HOST@JAVACHEN.COM
-keytab_file=/etc/impala/conf/impala.keytab
If you use HAProxy (for the HAProxy setup see "Hive使用HAProxy配置HA"), the IMPALA_SERVER_ARGS parameters need to be changed as follows (proxy is the hostname of the HAProxy machine; here I installed HAProxy on the cdh1 node):
-kerberos_reinit_interval=60
-be_principal=impala/_HOST@JAVACHEN.COM
-principal=impala/proxy@JAVACHEN.COM
-keytab_file=/etc/impala/conf/impala.keytab
Add the following to IMPALA_CATALOG_ARGS:
-state_store_host=${IMPALA_STATE_STORE_HOST} \
Sync the modified file to the other nodes. The final /etc/default/impala is shown below; to avoid mixed-case hostnames, a hostname variable is used in place of _HOST:
IMPALA_CATALOG_SERVICE_HOST=cdh1
IMPALA_STATE_STORE_HOST=cdh1
IMPALA_STATE_STORE_PORT=24000
IMPALA_BACKEND_PORT=22000
IMPALA_LOG_DIR=/var/log/impala
IMPALA_MEM_DEF=$(free -m |awk 'NR==2{print $2-5120}')
hostname=`hostname -f |tr "[:upper:]" "[:lower:]"`
IMPALA_CATALOG_ARGS=" -log_dir=${IMPALA_LOG_DIR} -state_store_host=${IMPALA_STATE_STORE_HOST} \
-kerberos_reinit_interval=60\
-principal=impala/${hostname}@JAVACHEN.COM \
-keytab_file=/etc/impala/conf/impala.keytab
"
IMPALA_STATE_STORE_ARGS=" -log_dir=${IMPALA_LOG_DIR} -state_store_port=${IMPALA_STATE_STORE_PORT}\
-statestore_subscriber_timeout_seconds=15 \
-kerberos_reinit_interval=60 \
-principal=impala/${hostname}@JAVACHEN.COM \
-keytab_file=/etc/impala/conf/impala.keytab
"
IMPALA_SERVER_ARGS=" \
-log_dir=${IMPALA_LOG_DIR} \
-catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} \
-state_store_port=${IMPALA_STATE_STORE_PORT} \
-use_statestore \
-state_store_host=${IMPALA_STATE_STORE_HOST} \
-be_port=${IMPALA_BACKEND_PORT} \
-kerberos_reinit_interval=60 \
-be_principal=impala/${hostname}@JAVACHEN.COM \
-principal=impala/cdh1@JAVACHEN.COM \
-keytab_file=/etc/impala/conf/impala.keytab \
-mem_limit=${IMPALA_MEM_DEF}m
"
ENABLE_CORE_DUMPS=false
Sync the modified file to the other nodes, cdh2 and cdh3:
$ scp /etc/default/impala cdh2:/etc/default/impala
$ scp /etc/default/impala cdh3:/etc/default/impala
Update the files under the Impala configuration directory and sync them to the other nodes:
cp /etc/hadoop/conf/core-site.xml /etc/impala/conf/
cp /etc/hadoop/conf/hdfs-site.xml /etc/impala/conf/
cp /etc/hive/conf/hive-site.xml /etc/impala/conf/
scp -r /etc/impala/conf cdh2:/etc/impala
scp -r /etc/impala/conf cdh3:/etc/impala
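The copy-to-every-node pattern above repeats for each configuration change. A tiny helper keeps the node list in one place (a sketch; it assumes passwordless scp between the nodes, and the sync_file name is ours):

```shell
# sync_file FILE NODE...: copy FILE to the identical path on each node.
sync_file() {
  f=$1; shift
  for node in "$@"; do
    scp "$f" "${node}:${f}"
  done
}
```

Usage: `sync_file /etc/default/impala cdh2 cdh3`.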
impala-state-store runs as the impala user, so on cdh1 obtain a ticket for the impala user before starting the service:
$ kinit -k -t /etc/impala/conf/impala.keytab impala/cdh1@JAVACHEN.COM
$ service impala-state-store start
Then check the log to confirm that it started successfully:
$ tailf /var/log/impala/statestored.INFO
impala-catalog also runs as the impala user, so on cdh1 obtain a ticket for the impala user before starting the service:
$ kinit -k -t /etc/impala/conf/impala.keytab impala/cdh1@JAVACHEN.COM
$ service impala-catalog start
Then check the log to confirm that it started successfully:
$ tailf /var/log/impala/catalogd.INFO
impala-server likewise runs as the impala user, so on cdh1 obtain a ticket for the impala user before starting the service:
$ kinit -k -t /etc/impala/conf/impala.keytab impala/cdh1@JAVACHEN.COM
$ service impala-server start
Then check the log to confirm that it started successfully:
$ tailf /var/log/impala/impalad.INFO
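Besides tailing the logs, a quick TCP probe tells you whether each daemon is actually listening. A bash sketch (it relies on bash's /dev/tcp redirection; the ports are the statestore and backend ports from the /etc/default/impala file above, and the function name is ours):

```shell
# wait_for_port HOST PORT [TRIES]: poll once per second until a TCP
# connect to HOST:PORT succeeds; fail after TRIES attempts (default 10).
wait_for_port() {
  host=$1; port=$2; tries=${3:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Usage: `wait_for_port cdh1 24000 30 && echo "statestore is listening"`, and similarly for the backend port 22000 and the shell port 21000.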
After Kerberos is enabled, impala-shell must be run with the -k flag:
$ impala-shell -k
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
Connected to cdh1:21000
Server version: impalad version 1.3.1-cdh4 RELEASE (build 907481bf45b248a7bb3bb077d54831a71f484e5f)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.
Copyright (c) 2012 Cloudera, Inc. All rights reserved.
(Shell build version: Impala Shell v1.3.1-cdh4 (907481b) built on Wed Apr 30 14:23:48 PDT 2014)
[cdh1:21000] >
[cdh1:21000] > show tables;
Query: show tables
+------+
| name |
+------+
| a |
| b |
| c |
| d |
+------+
Returned 4 row(s) in 0.08s
If you hit the following exception:
[cdh1:21000] > select * from test limit 10;
Query: select * from test limit 10
ERROR: AnalysisException: Failed to load metadata for table: default.test
CAUSED BY: TableLoadingException: Failed to load metadata for table: test
CAUSED BY: TTransportException: java.net.SocketTimeoutException: Read timed out
CAUSED BY: SocketTimeoutException: Read timed out
then add the following parameter (the Hive Metastore client socket timeout, in seconds) to hive-site.xml:
<property>
<name>hive.metastore.client.socket.timeout</name>
<value>3600</value>
</property>
This part records the process of integrating Impala and Hive with Sentry on a CDH 5.2 Hadoop cluster, including installing and configuring Sentry and testing the integration with Impala and Hive.
To manage cluster permissions with Sentry, Kerberos must already be configured on the cluster.
For configuring Kerberos and LDAP on a Hadoop cluster, see the following articles on this blog:
Sentry will be installed on the three-node Hadoop cluster; each node's IP, hostname, and deployed components are as follows:
192.168.56.121 cdh1 NameNode, Hive, ResourceManager, HBase, impala-state-store, impala-catalog, Kerberos Server, sentry-store
192.168.56.122 cdh2 DataNode, SSNameNode, NodeManager, HBase, impala-server
192.168.56.123 cdh3 DataNode, HBase, NodeManager, impala-server
Sentry can be used in two ways: file-based storage (SimpleFileProviderBackend) and database-backed storage (SimpleDbProviderBackend). If you use file-based storage you only need to install sentry; otherwise you also need to install sentry-store.
Install the sentry-store service on the cdh1 node:
yum install sentry sentry-store -y
Edit Sentry's configuration file /etc/sentry/conf/sentry-store-site.xml. The configuration below follows the example configuration in the Sentry source code:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
<name>sentry.service.admin.group</name>
<value>impala,hive,hue</value>
</property>
<property>
<name>sentry.service.allow.connect</name>
<value>impala,hive,hue</value>
</property>
<property>
<name>sentry.verify.schema.version</name>
<value>true</value>
</property>
<property>
<name>sentry.service.server.rpc-address</name>
<value>cdh1</value>
</property>
<property>
<name>sentry.service.server.rpc-port</name>
<value>8038</value>
</property>
<property>
<name>sentry.store.jdbc.url</name>
<value>jdbc:postgresql://cdh1/sentry</value>
</property>
<property>
<name>sentry.store.jdbc.driver</name>
<value>org.postgresql.Driver</value>
</property>
<property>
<name>sentry.store.jdbc.user</name>
<value>sentry</value>
</property>
<property>
<name>sentry.store.jdbc.password</name>
<value>redhat</value>
</property>
<property>
<name>sentry.hive.server</name>
<value>server1</value>
</property>
<property>
<name>sentry.store.group.mapping</name>
<value>org.apache.sentry.provider.common.HadoopGroupMappingService</value>
</property>
</configuration>
Create the database; for reference see "Hadoop自動化安裝shell腳本" (a shell script for automated Hadoop installation):
yum install postgresql-server postgresql-jdbc -y
ln -s /usr/share/java/postgresql-jdbc.jar /usr/lib/hive/lib/postgresql-jdbc.jar
ln -s /usr/share/java/postgresql-jdbc.jar /usr/lib/sentry/lib/postgresql-jdbc.jar
su -c "cd ; /usr/bin/pg_ctl start -w -m fast -D /var/lib/pgsql/data" postgres
su -c "cd ; /usr/bin/psql --command \"create user sentry with password 'redhat'; \" " postgres
su -c "cd ; /usr/bin/psql --command \"CREATE DATABASE sentry owner=sentry;\" " postgres
su -c "cd ; /usr/bin/psql --command \"GRANT ALL privileges ON DATABASE sentry TO sentry;\" " postgres
su -c "cd ; /usr/bin/psql -U sentry -d sentry -f /usr/lib/sentry/scripts/sentrystore/upgrade/sentry-postgres-1.4.0-cdh5.sql" postgres
su -c "cd ; /usr/bin/pg_ctl restart -w -m fast -D /var/lib/pgsql/data" postgres
The contents of /var/lib/pgsql/data/pg_hba.conf are as follows:
# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
# "local" is for Unix domain socket connections only
local   all         all                               md5
# IPv4 local connections:
#host   all         all         0.0.0.0/0             trust
host    all         all         127.0.0.1/32          md5
# IPv6 local connections:
#host   all         all         ::1/128               md5
If Kerberos is enabled on the cluster, generate the principal for the Sentry service on this node and export it to a keytab:
$ cd /etc/sentry/conf
kadmin.local -q "addprinc -randkey sentry/cdh1@JAVACHEN.COM"
kadmin.local -q "xst -k sentry.keytab sentry/cdh1@JAVACHEN.COM"
chown sentry:hadoop sentry.keytab ; chmod 400 *.keytab
Then add the following to /etc/sentry/conf/sentry-store-site.xml:
<property>
<name>sentry.service.security.mode</name>
<value>kerberos</value>
</property>
<property>
<name>sentry.service.server.principal</name>
<value>sentry/cdh1@JAVACHEN.COM</value>
</property>
<property>
<name>sentry.service.server.keytab</name>
<value>/etc/sentry/conf/sentry.keytab</value>
</property>
Prepare test data, following "Securing Impala for analysts":
$ cat /tmp/events.csv
10.1.2.3,US,android,createNote
10.200.88.99,FR,windows,updateNote
10.1.2.3,US,android,updateNote
10.200.88.77,FR,ios,createNote
10.1.4.5,US,windows,updateTag
$ hive -S
hive> create database sensitive;
hive> create table sensitive.events (
ip STRING, country STRING, client STRING, action STRING
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
hive> load data local inpath '/tmp/events.csv' overwrite into table sensitive.events;
hive> create database filtered;
hive> create view filtered.events as select country, client, action from sensitive.events;
hive> create view filtered.events_usonly as
select * from filtered.events where country = 'US';
When using Sentry, the following requirements apply:
1. Change the permissions of /user/hive/warehouse:
hdfs dfs -chmod -R 770 /user/hive/warehouse
hdfs dfs -chown -R hive:hive /user/hive/warehouse
2. Edit hive-site.xml and turn off HiveServer2 impersonation.
3. Make sure min.user.id=0 in taskcontroller.cfg.
Edit hive-site.xml and add the following:
<property>
<name>hive.security.authorization.task.factory</name>
<value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
</property>
<property>
<name>hive.server2.session.hook</name>
<value>org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook</value>
</property>
<property>
<name>hive.sentry.conf.url</name>
<value>file:///etc/hive/conf/sentry-site.xml</value>
</property>
Create sentry-site.xml in the /etc/hive/conf/ directory:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
<name>sentry.service.client.server.rpc-port</name>
<value>8038</value>
</property>
<property>
<name>sentry.service.client.server.rpc-address</name>
<value>cdh1</value>
</property>
<property>
<name>sentry.service.client.server.rpc-connection-timeout</name>
<value>200000</value>
</property>
<property>
<name>sentry.service.security.mode</name>
<value>kerberos</value>
</property>
<property>
<name>sentry.service.server.principal</name>
<value>sentry/_HOST@JAVACHEN.COM</value>
</property>
<property>
<name>sentry.service.server.keytab</name>
<value>/etc/sentry/conf/sentry.keytab</value>
</property>
<property>
<name>sentry.hive.provider</name>
<value>org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider</value>
</property>
<property>
<name>sentry.hive.provider.backend</name>
<value>org.apache.sentry.provider.db.SimpleDBProviderBackend</value>
</property>
<property>
<name>sentry.hive.server</name>
<value>server1</value>
</property>
<property>
<name>sentry.metastore.service.users</name>
<value>hive</value>
</property>
<property>
<name>sentry.hive.testing.mode</name>
<value>false</value>
</property>
</configuration>
In beeline, connect to hive-server2 using the hive user's ticket (note that in Sentry, hive is an administrator user) and run the following statements to create roles and grants:
create role admin_role;
GRANT ALL ON SERVER server1 TO ROLE admin_role;
GRANT ROLE admin_role TO GROUP admin;
GRANT ROLE admin_role TO GROUP hive;
create role test_role;
GRANT ALL ON DATABASE filtered TO ROLE test_role;
GRANT ALL ON DATABASE sensitive TO ROLE test_role;
GRANT ROLE test_role TO GROUP test;
The statements above create two roles. admin_role has administrator privileges, can read and write all databases, and is granted to the admin and hive groups (matching groups on the operating system); test_role can only read and write the filtered and sensitive databases and is granted to the test group.
On the LDAP server, create the system user yy_test, import it into LDAP with the migrationtools utilities, and finally set the user's password in LDAP.
# Create the yy_test user
useradd yy_test
grep -E "yy_test" /etc/passwd >/opt/passwd.txt
/usr/share/migrationtools/migrate_passwd.pl /opt/passwd.txt /opt/passwd.ldif
ldapadd -x -D "uid=ldapadmin,ou=people,dc=lashou,dc=com" -w secret -f /opt/passwd.ldif
# Change the password with the following command; enter the password twice:
ldappasswd -x -D 'uid=ldapadmin,ou=people,dc=lashou,dc=com' -w secret "uid=yy_test,ou=people,dc=lashou,dc=com" -S
On every DataNode machine, create the test group and add the yy_test user to it:
groupadd test ; useradd yy_test; usermod -G test,yy_test yy_test
Connect to hive-server2 via beeline and test:
# Switch to the test user for testing
$ su test
$ kinit -k -t test.keytab test/cdh1@JAVACHEN.COM
$ beeline -u "jdbc:hive2://cdh1:10000/default;principal=test/cdh1@JAVACHEN.COM"
Edit the IMPALA_SERVER_ARGS parameter in /etc/default/impala and add:
-server_name=server1
-sentry_config=/etc/impala/conf/sentry-site.xml
Add the following to IMPALA_CATALOG_ARGS:
-sentry_config=/etc/impala/conf/sentry-site.xml
Note: server1 must match the name used in the sentry-provider.ini file.
The final IMPALA_SERVER_ARGS parameter looks like this:
hostname=`hostname -f |tr "[:upper:]" "[:lower:]"`
IMPALA_SERVER_ARGS=" \
-log_dir=${IMPALA_LOG_DIR} \
-catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} \
-state_store_port=${IMPALA_STATE_STORE_PORT} \
-use_statestore \
-state_store_host=${IMPALA_STATE_STORE_HOST} \
-kerberos_reinit_interval=60 \
-principal=impala/${hostname}@JAVACHEN.COM \
-keytab_file=/etc/impala/conf/impala.keytab \
-enable_ldap_auth=true -ldap_uri=ldaps://cdh1 -ldap_baseDN=ou=people,dc=javachen,dc=com \
-server_name=server1 \
-sentry_config=/etc/impala/conf/sentry-site.xml \
-be_port=${IMPALA_BACKEND_PORT} -default_pool_max_requests=-1 -mem_limit=60%"
Create /etc/impala/conf/sentry-site.xml with the following contents:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
<name>sentry.service.client.server.rpc-port</name>
<value>8038</value>
</property>
<property>
<name>sentry.service.client.server.rpc-address</name>
<value>cdh1</value>
</property>
<property>
<name>sentry.service.client.server.rpc-connection-timeout</name>
<value>200000</value>
</property>
<property>
<name>sentry.service.security.mode</name>
<value>kerberos</value>
</property>
<property>
<name>sentry.service.server.principal</name>
<value>sentry/_HOST@JAVACHEN.COM</value>
</property>
<property>
<name>sentry.service.server.keytab</name>
<value>/etc/sentry/conf/sentry.keytab</value>
</property>
</configuration>
See the Impala tests in the file-based storage section below.
Create a sentry-site.xml file in Hive's /etc/hive/conf directory with the following contents:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<property>
<name>hive.sentry.server</name>
<value>server1</value>
</property>
<property>
<name>sentry.hive.provider.backend</name>
<value>org.apache.sentry.provider.file.SimpleFileProviderBackend</value>
</property>
<property>
<name>hive.sentry.provider</name>
<value>org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider</value>
</property>
<property>
<name>hive.sentry.provider.resource</name>
<value>/user/hive/sentry/sentry-provider.ini</value>
</property>
</configuration>
Create the sentry-provider.ini file and upload it to the /user/hive/sentry/ directory in HDFS:
$ cat /tmp/sentry-provider.ini
[databases]
# Defines the location of the per DB policy file for the customers DB/schema
#db1 = hdfs://cdh1:8020/user/hive/sentry/db1.ini
[groups]
admin = any_operation
hive = any_operation
test = select_filtered
[roles]
any_operation = server=server1->db=*->table=*->action=*
select_filtered = server=server1->db=filtered->table=*->action=SELECT
select_us = server=server1->db=filtered->table=events_usonly->action=SELECT
[users]
test = test
hive= hive
$ hdfs dfs -rm -r /user/hive/sentry/sentry-provider.ini
$ hdfs dfs -put /tmp/sentry-provider.ini /user/hive/sentry/
$ hdfs dfs -chown hive:hive /user/hive/sentry/sentry-provider.ini
$ hdfs dfs -chmod 640 /user/hive/sentry/sentry-provider.ini
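A typo in the policy file silently drops privileges, so it helps to sanity-check the group-to-role mapping before uploading. A rough parser sketch (the roles_for_group name is ours, not Sentry tooling; it assumes one "group = role1, role2" entry per line in the [groups] section):

```shell
# roles_for_group FILE GROUP: print the roles mapped to GROUP in the
# [groups] section of a sentry policy file, one role per line.
roles_for_group() {
  awk -v g="$2" '
    /^\[/ { in_groups = ($0 == "[groups]") }     # track which section we are in
    in_groups && $1 == g && $2 == "=" {
      for (i = 3; i <= NF; i++) { r = $i; gsub(/,/, "", r); print r }
    }' "$1"
}
```

Usage: `roles_for_group /tmp/sentry-provider.ini test` should list select_filtered for the file above.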
For the syntax of the sentry-provider.ini file, see the official documentation. Here the hive group has full privileges and the hive user is mapped to the hive group, while the other two groups have only partial privileges.
Then add the following to hive-site.xml:
<property>
<name>hive.security.authorization.task.factory</name>
<value>org.apache.sentry.binding.hive.SentryHiveAuthorizationTaskFactoryImpl</value>
</property>
<property>
<name>hive.server2.session.hook</name>
<value>org.apache.sentry.binding.hive.HiveAuthzBindingSessionHook</value>
</property>
<property>
<name>hive.sentry.conf.url</name>
<value>file:///etc/hive/conf/sentry-site.xml</value>
</property>
Sync the configuration files to the other nodes and restart the hive-server2 service.
Since hive-server2 in my cluster has Kerberos authentication enabled, connect to hive-server2 as the hive user:
$ kinit -k -t /etc/hive/conf/hive.keytab hive/cdh1@JAVACHEN.COM
$ beeline -u "jdbc:hive2://cdh1:10000/default;principal=hive/cdh1@JAVACHEN.COM"
scan complete in 10ms
Connecting to jdbc:hive2://cdh1:10000/default;principal=hive/cdh1@JAVACHEN.COM
Connected to: Apache Hive (version 0.13.1-cdh5.2.0)
Driver: Hive JDBC (version 0.13.1-cdh5.2.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.13.1-cdh5.2.0 by Apache Hive
5 rows selected (0.339 seconds)
0: jdbc:hive2://cdh1:10000/default> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
| filtered |
| sensitive |
+----------------+--+
10 rows selected (0.145 seconds)
0: jdbc:hive2://cdh1:10000/default> use filtered
No rows affected (0.132 seconds)
0: jdbc:hive2://cdh1:10000/default> show tables;
+----------------+--+
| tab_name |
+----------------+--+
| events |
| events_usonly |
+----------------+--+
2 rows selected (0.158 seconds)
0: jdbc:hive2://cdh1:10000/default> use sensitive;
No rows affected (0.115 seconds)
0: jdbc:hive2://cdh1:10000/default> show tables;
+-----------+--+
| tab_name |
+-----------+--+
| events |
+-----------+--+
1 row selected (0.148 seconds)
Edit the IMPALA_SERVER_ARGS parameter in /etc/default/impala and add:
-server_name=server1
-authorization_policy_file=/user/hive/sentry/sentry-provider.ini
-authorization_policy_provider_class=org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider
Note: server1 must match the name used in the sentry-provider.ini file.
The final IMPALA_SERVER_ARGS parameter looks like this:
hostname=`hostname -f |tr "[:upper:]" "[:lower:]"`
IMPALA_SERVER_ARGS=" \
-log_dir=${IMPALA_LOG_DIR} \
-catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} \
-state_store_port=${IMPALA_STATE_STORE_PORT} \
-use_statestore \
-state_store_host=${IMPALA_STATE_STORE_HOST} \
-be_port=${IMPALA_BACKEND_PORT} \
-server_name=server1 \
-authorization_policy_file=/user/hive/sentry/sentry-provider.ini \
-authorization_policy_provider_class=org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider \
-enable_ldap_auth=true -ldap_uri=ldaps://cdh1 -ldap_baseDN=ou=people,dc=javachen,dc=com \
-kerberos_reinit_interval=60 \
-principal=impala/${hostname}@JAVACHEN.COM \
-keytab_file=/etc/impala/conf/impala.keytab \
"
Restart the impala-server service and then test. Since impala-server here integrates both Kerberos and LDAP, the tests below use LDAP.
First test with the LDAP user test:
impala-shell -l -u test
Starting Impala Shell using LDAP-based authentication
LDAP password for test:
Connected to cdh1:21000
Server version: impalad version 2.0.0-cdh5 RELEASE (build ecf30af0b4d6e56ea80297df2189367ada6b7da7)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.
Copyright (c) 2012 Cloudera, Inc. All rights reserved.
(Shell build version: Impala Shell v2.0.0-cdh5 (ecf30af) built on Sat Oct 11 13:56:06 PDT 2014)
[cdh1:21000] > show databases;
Query: show databases
+---------+
| name |
+---------+
| default |
+---------+
Fetched 1 row(s) in 0.11s
[cdh1:21000] > show tables;
Query: show tables
ERROR: AuthorizationException: User 'test' does not have privileges to access: default.*
[cdh1:21000] >
You can see that the test user cannot view any databases, because sentry-provider.ini does not grant the test user any privileges on the default database.
Next, test with the hive user. Use the following commands to create the hive user and group in LDAP and set the hive user's password.
$ grep hive /etc/passwd >/opt/passwd.txt
$ /usr/share/migrationtools/migrate_passwd.pl /opt/passwd.txt /opt/passwd.ldif
$ ldapadd -x -D "uid=ldapadmin,ou=people,dc=javachen,dc=com" -w secret -f /opt/passwd.ldif
$ grep hive /etc/group >/opt/group.txt
$ /usr/share/migrationtools/migrate_group.pl /opt/group.txt /opt/group.ldif
$ ldapadd -x -D "uid=ldapadmin,ou=people,dc=javachen,dc=com" -w secret -f /opt/group.ldif
# Change the hive user's password in LDAP
$ ldappasswd -x -D 'uid=ldapadmin,ou=people,dc=javachen,dc=com' -w secret "uid=hive,ou=people,dc=javachen,dc=com" -S
Then test with the hive user:
$ impala-shell -l -u hive
Starting Impala Shell using LDAP-based authentication
LDAP password for hive:
Connected to cdh1:21000
Server version: impalad version 2.0.0-cdh5 RELEASE (build ecf30af0b4d6e56ea80297df2189367ada6b7da7)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.
Copyright (c) 2012 Cloudera, Inc. All rights reserved.
(Shell build version: Impala Shell v2.0.0-cdh5 (ecf30af) built on Sat Oct 11 13:56:06 PDT 2014)
[cdh1:21000] > show databases;
Query: show databases
+------------------+
| name             |
+------------------+
| _impala_builtins |
| default          |
| filtered         |
| sensitive        |
+------------------+
Fetched 11 row(s) in 0.11s
[cdh1:21000] > use sensitive;
Query: use sensitive
[cdh1:21000] > show tables;
Query: show tables
+--------+
| name   |
+--------+
| events |
+--------+
Fetched 1 row(s) in 0.11s
[cdh1:21000] > select * from events;
Query: select * from events
+--------------+---------+---------+------------+
| ip           | country | client  | action     |
+--------------+---------+---------+------------+
| 10.1.2.3     | US      | android | createNote |
| 10.200.88.99 | FR      | windows | updateNote |
| 10.1.2.3     | US      | android | updateNote |
| 10.200.88.77 | FR      | ios     | createNote |
| 10.1.4.5     | US      | windows | updateTag  |
+--------------+---------+---------+------------+
Fetched 5 row(s) in 0.76s
You can test with other users in the same way.
You can also connect to impala-server with beeline for testing:
$ beeline -u "jdbc:hive2://cdh1:21050/default;" -n test -p test
scan complete in 2ms
Connecting to jdbc:hive2://cdh1:21050/default;
Connected to: Impala (version 2.0.0-cdh5)
Driver: Hive JDBC (version 0.13.1-cdh5.2.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.13.1-cdh5.2.0 by Apache Hive
0: jdbc:hive2://cdh1:21050/default>
When defining a URI for HDFS, you must also specify the NameNode. For example:
GRANT ALL ON URI file:///path/to/dir TO <role>
GRANT ALL ON URI hdfs://namenode:port/path/to/dir TO <role>
GRANT ALL ON URI hdfs://ha-nn-uri/path/to/dir TO <role>
Example: privileges for administrative users
In this example, the SQL statements grant the entire_server role all privileges on both the databases and the URIs within the server.
CREATE ROLE entire_server;
GRANT ROLE entire_server TO GROUP admin_group;
GRANT ALL ON SERVER server1 TO ROLE entire_server;
具備特定數據庫和表的權限的用戶
If a user has privileges for specific tables in specific databases, the user can access those objects but nothing else. They can see the tables and their parent databases in the output of SHOW TABLES and SHOW DATABASES, switch to the relevant database with USE, and perform the appropriate actions (SELECT and/or INSERT) based on the table privileges. Actually creating a table requires the ALL privilege at the database level, so you might define separate roles for the user who sets up a schema and for the other users or applications that perform day-to-day operations on the tables.
CREATE ROLE one_database;
GRANT ROLE one_database TO GROUP admin_group;
GRANT ALL ON DATABASE db1 TO ROLE one_database;
CREATE ROLE instructor;
GRANT ROLE instructor TO GROUP trainers;
GRANT ALL ON TABLE db1.lesson TO ROLE instructor;
# This particular course is all about queries, so the students can SELECT but not INSERT or CREATE/DROP.
CREATE ROLE student;
GRANT ROLE student TO GROUP visitors;
GRANT SELECT ON TABLE db1.training TO ROLE student;
Privileges for working with external data files
When data is inserted through the LOAD DATA statement, or referenced from an HDFS location outside the normal Impala database directories, the user also needs the appropriate privileges on the URIs corresponding to those HDFS locations.
In this example:
The external_table role can insert into and query the Impala table external_table.sample.
The staging_dir role can specify the HDFS path /user/cloudera/external_data in a LOAD DATA statement. When Impala queries or loads data files, it operates on all the files in that directory, not just a single file, so any Impala LOCATION parameter refers to a directory rather than an individual file.
CREATE ROLE external_table;
GRANT ROLE external_table TO GROUP cloudera;
GRANT ALL ON TABLE external_table.sample TO ROLE external_table;
CREATE ROLE staging_dir;
GRANT ROLE staging_dir TO GROUP cloudera;
GRANT ALL ON URI 'hdfs://127.0.0.1:8020/user/cloudera/external_data' TO ROLE staging_dir;
Separating administrator responsibility from read and write privileges
To create a database, you need full privileges on that database, while day-to-day operations on the tables inside it can be performed with lower-level privileges on specific tables. You can therefore set up separate roles for each database or application: an administrative role that can create or drop the database, and user-level roles that can only access the relevant tables.
In this example, responsibilities are divided among users in three different groups:
CREATE ROLE training_sysadmin;
GRANT ROLE training_sysadmin TO GROUP supergroup;
GRANT ALL ON DATABASE training1 TO ROLE training_sysadmin;
CREATE ROLE instructor;
GRANT ROLE instructor TO GROUP cloudera;
GRANT ALL ON TABLE training1.course1 TO ROLE instructor;
CREATE ROLE student;
GRANT ROLE student TO GROUP visitor;
GRANT SELECT ON TABLE training1.course1 TO ROLE student;
server=server_name->db=database_name->table=table_name->action=SELECT
server=server_name->db=database_name->table=table_name->action=ALL
server=impala-host.example.com->db=default->table=t1->action=SELECT
server=impala-host.example.com->db=*->table=audit_log->action=SELECT
server=impala-host.example.com->db=default->table=t1->action=*
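Each privilege rule is a chain of key=value pairs joined by ->. A small extractor makes the structure explicit when debugging a policy file (a sketch; parse_priv is our name, not part of Sentry):

```shell
# parse_priv RULE FIELD: extract one field (server, db, table, or action)
# from a policy-file privilege rule like
#   server=server1->db=filtered->table=*->action=SELECT
parse_priv() {
  rule=$1; want=$2
  printf '%s\n' "$rule" | awk -F'->' -v want="$want" '{
    for (i = 1; i <= NF; i++) {
      n = index($i, "=")                          # split each pair at the first "="
      if (n > 0 && substr($i, 1, n - 1) == want) {
        print substr($i, n + 1)
      }
    }
  }'
}
```

Usage: `parse_priv 'server=server1->db=filtered->table=*->action=SELECT' db` prints the database the rule applies to.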