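Background:
A recent security scan flagged our Hadoop cluster with the finding "Hadoop unauthorized access [principle scan]". Following the official documentation and a few other references, I enabled service-level authorization in a test environment. I ran into quite a few pitfalls along the way, or rather questions I had not thought through, so I am recording them here. The whole thing took two days.
Environment:
Hadoop version: 2.6.2
Steps:
1. To enable service-level authorization, set the parameter hadoop.security.authorization to true in core-site.xml: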
<property> <name>hadoop.security.authorization</name> <value>true</value> <description>Is service-level authorization enabled?</description> </property>
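Note: this parameter turns on service-level authorization. With hadoop.security.authentication left at its default of simple, the caller's identity is the OS user name of the client process, and that name is what the ACLs are checked against. Service-level authorization is now enabled.
After adding this parameter, restart the NameNode: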
sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemon.sh start namenode
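To confirm that the setting is really in effect, check the Hadoop security audit log SecurityAuth-aiprd.audit. If new entries carrying authentication information keep appearing, the setting is active.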
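A minimal sketch for checking the audit log from the shell. The helper name recent_auth_entries and the log directory in the usage comment are assumptions; only the file name SecurityAuth-aiprd.audit comes from the setup above.

```shell
# Hypothetical helper: print the last N lines of a log file that mention a user.
# -F treats the user name as a literal string, -w matches whole words only,
# so searching for "aiprd" does not also match "aiprd1".
recent_auth_entries() {
  logfile="$1"; user="$2"; count="${3:-5}"
  grep -F -w -- "$user" "$logfile" | tail -n "$count"
}

# Usage (the log directory is an assumption, adjust it to your installation):
# recent_auth_entries "$HADOOP_HOME/logs/SecurityAuth-aiprd.audit" aiprd
```

Running it before and after a client request shows whether new entries are being written.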
2. The ACLs for the individual services are configured in hadoop-policy.xml:
<configuration> <property> <name>security.client.protocol.acl</name> <value>*</value> <description>ACL for ClientProtocol, which is used by user code via the DistributedFileSystem. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.client.datanode.protocol.acl</name> <value>*</value> <description>ACL for ClientDatanodeProtocol, the client-to-datanode protocol for block recovery. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.datanode.protocol.acl</name> <value>*</value> <description>ACL for DatanodeProtocol, which is used by datanodes to communicate with the namenode. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.inter.datanode.protocol.acl</name> <value>*</value> <description>ACL for InterDatanodeProtocol, the inter-datanode protocol for updating generation timestamp. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.namenode.protocol.acl</name> <value>*</value> <description>ACL for NamenodeProtocol, the protocol used by the secondary namenode to communicate with the namenode. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". 
A special value of "*" means all users are allowed.</description> </property> <property> <name>security.admin.operations.protocol.acl</name> <value>*</value> <description>ACL for AdminOperationsProtocol. Used for admin commands. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.refresh.user.mappings.protocol.acl</name> <value>*</value> <description>ACL for RefreshUserMappingsProtocol. Used to refresh users mappings. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.refresh.policy.protocol.acl</name> <value>*</value> <description>ACL for RefreshAuthorizationPolicyProtocol, used by the dfsadmin and mradmin commands to refresh the security policy in-effect. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". 
A special value of "*" means all users are allowed.</description> </property> <property> <name>security.ha.service.protocol.acl</name> <value>*</value> <description>ACL for HAService protocol used by HAAdmin to manage the active and stand-by states of namenode.</description> </property> <property> <name>security.zkfc.protocol.acl</name> <value>*</value> <description>ACL for access to the ZK Failover Controller </description> </property> <property> <name>security.qjournal.service.protocol.acl</name> <value>*</value> <description>ACL for QJournalProtocol, used by the NN to communicate with JNs when using the QuorumJournalManager for edit logs.</description> </property> <property> <name>security.mrhs.client.protocol.acl</name> <value>*</value> <description>ACL for HSClientProtocol, used by job clients to communciate with the MR History Server job status etc. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <!-- YARN Protocols --> <property> <name>security.resourcetracker.protocol.acl</name> <value>*</value> <description>ACL for ResourceTrackerProtocol, used by the ResourceManager and NodeManager to communicate with each other. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.resourcemanager-administration.protocol.acl</name> <value>*</value> <description>ACL for ResourceManagerAdministrationProtocol, for admin commands. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". 
A special value of "*" means all users are allowed.</description> </property> <property> <name>security.applicationclient.protocol.acl</name> <value>*</value> <description>ACL for ApplicationClientProtocol, used by the ResourceManager and applications submission clients to communicate with each other. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.applicationmaster.protocol.acl</name> <value>*</value> <description>ACL for ApplicationMasterProtocol, used by the ResourceManager and ApplicationMasters to communicate with each other. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.containermanagement.protocol.acl</name> <value>*</value> <description>ACL for ContainerManagementProtocol protocol, used by the NodeManager and ApplicationMasters to communicate with each other. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.resourcelocalizer.protocol.acl</name> <value>*</value> <description>ACL for ResourceLocalizer protocol, used by the NodeManager and ResourceLocalizer to communicate with each other. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". 
A special value of "*" means all users are allowed.</description> </property> <property> <name>security.job.task.protocol.acl</name> <value>*</value> <description>ACL for TaskUmbilicalProtocol, used by the map and reduce tasks to communicate with the parent tasktracker. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.job.client.protocol.acl</name> <value>*</value> <description>ACL for MRClientProtocol, used by job clients to communciate with the MR ApplicationMaster to query job status etc. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> <property> <name>security.applicationhistory.protocol.acl</name> <value>*</value> <description>ACL for ApplicationHistoryProtocol, used by the timeline server and the generic history service client to communicate with each other. The ACL is a comma-separated list of user and group names. The user and group list is separated by a blank. For e.g. "alice,bob users,wheel". A special value of "*" means all users are allowed.</description> </property> </configuration>
Note: the file above lists 21 service ACLs, and each one defaults to *, which means any user may access that service.
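3. For now we only need to control which client users may access the NameNode, i.e. change the value of security.client.protocol.acl:
<property> <name>security.client.protocol.acl</name> <value>aiprd</value> <description>ACL for ClientProtocol, which is used by user code via the DistributedFileSystem.</description> </property>
Note: with this value, only a client process running as user aiprd may access the NameNode.
Refresh the ACL configuration: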
bin/hdfs dfsadmin -refreshServiceAcl
The value format is:
<property> <name>security.job.submission.protocol.acl</name> <value>user1,user2 group1,group2</value> </property>
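Note: users are separated by commas, groups are separated by commas, and the user list and the group list are separated by a single space. If there are no users, start the value with a space followed by the group list.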
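The format can be illustrated with a small sketch that splits an ACL value into its two halves. The function names acl_users and acl_groups are hypothetical and this is not how Hadoop parses the value internally; it only mirrors the rule stated in the note.

```shell
# Print the user half of an ACL value ("user1,user2 group1,group2").
acl_users() {
  case "$1" in
    "*") echo "*" ;;                     # wildcard: everyone is allowed
    *)   printf '%s\n' "${1%% *}" ;;     # text before the first blank
  esac
}

# Print the group half of an ACL value.
acl_groups() {
  case "$1" in
    "*")   echo "*" ;;
    *" "*) printf '%s\n' "${1#* }" ;;    # text after the first blank
    *)     echo "" ;;                    # no blank: the value holds users only
  esac
}

acl_users  'user1,user2 group1,group2'   # prints: user1,user2
acl_groups 'user1,user2 group1,group2'   # prints: group1,group2
acl_groups ' hadoop'                     # prints: hadoop (leading blank, no users)
```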
4. Verify by accessing files in HDFS from a remote client
[aiprd@localhost ~]$ hdfs dfs -ls hdfs://hadoop1:9000/
Found 10 items
drwxr-xr-x - aiprd supergroup 0 2019-08-14 04:31 hdfs://hadoop1:9000/hbase
drwxr-xr-x - aiprd hadoop 0 2019-08-14 06:40 hdfs://hadoop1:9000/test01
drwxr-xr-x - aiprd supergroup 0 2019-08-14 06:22 hdfs://hadoop1:9000/test02
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:39 hdfs://hadoop1:9000/test03
drwxr-xr-x - aiprd supergroup 0 2019-08-14 06:30 hdfs://hadoop1:9000/test07
drwxr-xr-x - aiprd supergroup 0 2019-08-14 06:31 hdfs://hadoop1:9000/test08
drwxr-xr-x - aiprd supergroup 0 2019-08-14 06:32 hdfs://hadoop1:9000/test09
drwxr-xr-x - aiprd supergroup 0 2019-08-14 06:41 hdfs://hadoop1:9000/test10
drwxrwx--- - aiprd supergroup 0 2019-08-14 07:06 hdfs://hadoop1:9000/test11
drwxr-xr-x - aiprd1 supergroup 0 2019-08-15 00:10 hdfs://hadoop1:9000/test12
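Note: on the client, the Hadoop programs are deployed under user aiprd, and the command lists the files and directories as expected. aiprd is also the user that started the NameNode, i.e. the HDFS superuser, which is why aiprd appears as the owner of most entries.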
5. Test whether a different user can be used
<property> <name>security.client.protocol.acl</name> <value>aiprd1</value> <description>ACL for ClientProtocol, which is used by user code via the DistributedFileSystem.</description> </property>
Refresh the ACL configuration:
bin/hdfs dfsadmin -refreshServiceAcl
The user is now aiprd1, so only a client whose process runs as aiprd1 may access the NameNode.
6. On the client, keep using the Hadoop client deployed under user aiprd
[aiprd@localhost ~]$ hdfs dfs -ls hdfs://hadoop1:9000/
ls: User aiprd (auth:SIMPLE) is not authorized for protocol interface org.apache.hadoop.hdfs.protocol.ClientProtocol, expected client Kerberos principal is null
Note: user aiprd can no longer access HDFS.
7. On the client, deploy a Hadoop client under user aiprd1 and access HDFS again
[aiprd1@localhost ~]$ hdfs dfs -ls hdfs://hadoop1:9000/test12
Found 6 items
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/01
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/02
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/03
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:44 hdfs://hadoop1:9000/test12/04
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:49 hdfs://hadoop1:9000/test12/05
drwxr-xr-x - aiprd1 supergroup 0 2019-08-15 00:10 hdfs://hadoop1:9000/test12/10
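Note: access works. So when the ACL names users, the OS user that runs the client program must match a user configured in hadoop-policy.xml, otherwise access is denied.
A service ACL value can hold users as well as groups. With users verified, the next step was to verify groups, and this is where I hit most of the pitfalls.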
1. Use the same parameter as before, security.client.protocol.acl, this time with a group
<property> <name>security.client.protocol.acl</name> <value>aiprd hadoop</value> <description>ACL for ClientProtocol, which is used by user code via the DistributedFileSystem.</description> </property>
Refresh the ACL configuration:
bin/hdfs dfsadmin -refreshServiceAcl
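This raises a question: the user check was based on the OS user, so presumably the group check is OS-based too, i.e. it checks whether the user belongs to this group.
2. The client program under user aiprd can access HDFS, as verified earlier
3. On the client, the Hadoop client deployed under aiprd1 normally cannot access HDFS. After adding aiprd1 to the hadoop group, access should work in theory
[aiprd1@localhost ~]$ id aiprd1
uid=1001(aiprd1) gid=1001(aiprd1) groups=1001(aiprd1),1002(hadoop)
[aiprd1@localhost ~]$ hdfs dfs -ls hdfs://hadoop1:9000/test12
ls: User aiprd1 (auth:SIMPLE) is not authorized for protocol interface org.apache.hadoop.hdfs.protocol.ClientProtocol, expected client Kerberos principal is null
The test shows that it does not work: the hadoop group has no effect.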
I tried several things without success. Out of options, I turned on DEBUG logging on the NameNode and got the following:
2019-08-15 15:12:27,188 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user aiprd1: id: aiprd1: No such user
2019-08-15 15:12:27,188 WARN org.apache.hadoop.security.UserGroupInformation: No groups available for user aiprd1
adoop.hdfs.protocol.ClientProtocol, expected client Kerberos principal is null :SIMPLE)
2019-08-15 15:12:27,188 DEBUG org.apache.hadoop.ipc.Server: Socket Reader #1 for port 9000: responding to null from 192.168.30.1:61985 Call#-3 Retry#-1
2019-08-15 15:12:27,188 DEBUG org.apache.hadoop.ipc.Server: Socket Reader #1 for port 9000: responding to null from 192.168.30.1:61985 Call#-3 Retry#-1 Wrote 243 bytes.
izationException: User aiprd1 (auth:SIMPLE) is not authorized for protocol interface
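A DEBUG log gets large quickly, so it can help to filter for the one message that matters here. The sketch below reads log text on standard input and prints each user for whom the group lookup failed. The function name users_without_groups is hypothetical, and the pattern relies only on the WARN message text shown in the log above.

```shell
# Read NameNode log text on stdin and print each user whose group lookup failed.
# Matches the "No groups available for user <name>" WARN message.
users_without_groups() {
  sed -n 's/.*No groups available for user \([^ ]*\).*/\1/p' | sort -u
}

# Usage (replace the placeholder with the path of your NameNode log):
# users_without_groups < /path/to/namenode.log
```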
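The message says that the lookup of groups for this user failed because the user does not exist. That was puzzling, since the user clearly exists on the client. I searched around based on this error and found a hint in the following article: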
https://www.e-learn.cn/content/wangluowenzhang/1136832
To accomplish your goal you'd need to add your user account (clott) on the NameNode machine and add it to hadoop group there. If you are going to run MapReduce with your user, you'd need your user account to be configured on NodeManager hosts as well.
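The log line "id: aiprd1: No such user" suggests the lookup was run with id on the NameNode host, not on the client. A minimal sketch of the same check, to be run on the NameNode host before creating the account; the function name groups_for_user is hypothetical.

```shell
# Print the groups the local OS reports for a user.
# Returns non-zero (and prints nothing) when the user does not exist on this host.
groups_for_user() {
  id -Gn "$1" 2>/dev/null
}

# Usage on the NameNode host:
# groups_for_user aiprd1 || echo "aiprd1 is unknown on this host"
```

If this fails on the NameNode host while succeeding on the client, the two hosts disagree about the account, which matches the error in the log.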
4. Following that advice, create user aiprd1 on the NameNode host and add it to the hadoop group.
[root@hadoop1 ~]# useradd -G hadoop aiprd1
[root@hadoop1 ~]# id aiprd1
uid=503(aiprd1) gid=503(aiprd1) groups=503(aiprd1),502(hadoop)
[root@hadoop1 ~]# su - aiprd
[aiprd@hadoop1 ~]$ jps
15289 NameNode
15644 Jps
Note: this node runs the NameNode.
5. Run the query again from the Hadoop client as user aiprd1
[aiprd1@localhost ~]$ hdfs dfs -ls hdfs://hadoop1:9000/test12
Found 6 items
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/01
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/02
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/03
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:44 hdfs://hadoop1:9000/test12/04
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:49 hdfs://hadoop1:9000/test12/05
drwxr-xr-x - aiprd1 supergroup 0 2019-08-15 00:10 hdfs://hadoop1:9000/test12/10
The query now works.
On the client, remove aiprd1 from the hadoop group.
[aiprd1@localhost ~]$ id
uid=1001(aiprd1) gid=1001(aiprd1) groups=1001(aiprd1)
Run the query again:
[aiprd1@localhost ~]$ hdfs dfs -ls hdfs://hadoop1:9000/test12
Found 6 items
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/01
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/02
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:43 hdfs://hadoop1:9000/test12/03
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:44 hdfs://hadoop1:9000/test12/04
drwxr-xr-x - aiprd supergroup 0 2019-08-14 23:49 hdfs://hadoop1:9000/test12/05
drwxr-xr-x - aiprd1 supergroup 0 2019-08-15 00:10 hdfs://hadoop1:9000/test12/10
The query still works. So group membership on the client host is irrelevant; the membership has to be set up on the NameNode host.
The official documentation explains it as follows:
Once a username has been determined as described above, the list of groups is determined by a group mapping service, configured by the hadoop.security.group.mapping property. The default implementation, org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback, will determine if the Java Native Interface (JNI) is available. If JNI is available, the implementation will use the API within hadoop to resolve a list of groups for a user. If JNI is not available then the shell implementation, org.apache.hadoop.security.ShellBasedUnixGroupsMapping, is used. This implementation shells out with the bash -c groups command (for a Linux/Unix environment) or the net group command (for a Windows environment) to resolve a list of groups for a user.
An alternate implementation, which connects directly to an LDAP server to resolve the list of groups, is available via org.apache.hadoop.security.LdapGroupsMapping. However, this provider should only be used if the required groups reside exclusively in LDAP, and are not materialized on the Unix servers. More information on configuring the group mapping service is available in the Javadocs.
For HDFS, the mapping of users to groups is performed on the NameNode. Thus, the host system configuration of the NameNode determines the group mappings for the users.
Note that HDFS stores the user and group of a file or directory as strings; there is no conversion from user and group identity numbers as is conventional in Unix.
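The key sentence is the one about HDFS: the user-to-group mapping is performed on the NameNode, so the NameNode host's system configuration decides which groups a user belongs to.
I only understood this after running the experiment. Before that I had assumed the client looks up the user's groups locally and sends them to the NameNode for the check.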
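One way to see the NameNode-side mapping directly is the hdfs groups command, which asks the NameNode for the groups it resolves for a user. A small sketch, assuming an hdfs client on PATH that is configured for this cluster; the wrapper name nn_groups is hypothetical.

```shell
# Ask the NameNode which groups it resolves for a user.
# Returns 127 with a message when no hdfs client is available on PATH.
nn_groups() {
  command -v hdfs >/dev/null 2>&1 || {
    echo "hdfs client not found on PATH" >&2
    return 127
  }
  hdfs groups "$1"
}

# Usage from any host with a configured client:
# nn_groups aiprd1
```

Because the answer comes from the NameNode, it can differ from what id reports on the client host, which is exactly the situation described above.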
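At this point both users and groups can be configured in the service-level ACL. The other services can be configured in the same way as needed. All this ACL decides is which users and groups are allowed to connect.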
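Summary:
1. Setting hadoop.security.authorization to true enables service-level authorization. With the default simple authentication, the identity that gets checked is the client's OS user. Restart the NameNode after changing the setting.
2. When the ACL names users, access works as long as the OS user of the client process matches a user in the service ACL.
3. When the ACL names groups and the client connects as user A, user A must exist on the NameNode host and belong to a group in the ACL. The check runs like this: the NameNode takes the client's user name A and looks up A's groups on its own host. If A does not exist on the NameNode host, access is denied. If A exists but is in none of the ACL groups, access is denied. If A exists and is in one of the ACL groups, access is granted.
4. The groups in the ACL have nothing to do with the groups the user belongs to on the client host.
5. After every change to hadoop-policy.xml, remember to run the refresh command.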
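The decision described in points 2 to 4 can be sketched as a shell function. This is a simplified model under stated assumptions, not Hadoop's implementation: the names in_list and is_authorized are hypothetical, and the optional third argument stands in for the group list as resolved on the NameNode host.

```shell
# Return 0 if the comma-separated list $1 contains the item $2.
in_list() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# is_authorized USER ACL [USER_GROUPS]
# USER_GROUPS: space-separated groups of USER as seen on the NameNode host.
# When omitted, the groups are looked up on the local host with id -Gn.
is_authorized() {
  user="$1"; acl="$2"
  [ "$acl" = "*" ] && return 0                  # wildcard allows everyone
  acl_user_list="${acl%% *}"                    # part before the first blank
  case "$acl" in
    *" "*) acl_group_list="${acl#* }" ;;        # part after the first blank
    *)     acl_group_list="" ;;
  esac
  in_list "$acl_user_list" "$user" && return 0  # user is listed explicitly
  if [ $# -ge 3 ]; then
    user_groups="$3"
  else
    user_groups="$(id -Gn "$user" 2>/dev/null)" || return 1  # unknown user
  fi
  for g in $user_groups; do
    in_list "$acl_group_list" "$g" && return 0  # one matching group is enough
  done
  return 1
}

is_authorized aiprd1 'aiprd hadoop' 'aiprd1 hadoop' && echo "allowed"
is_authorized aiprd1 'aiprd hadoop' 'aiprd1'        || echo "denied"
```

The two calls at the end mirror the experiment: with the hadoop group present on the NameNode side the request is allowed, without it the request is denied.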
Also note that parameters and configuration can differ between versions, so read the documentation that matches your own Hadoop version.
https://hadoop.apache.org/docs/r2.6.2/hadoop-project-dist/hadoop-common/ServiceLevelAuth.html
Document created: 2019-08-15 17:30:24