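After a little more than a week of work, I finally got the CDH distribution of Hadoop deployed in our test environment (some components failed to install). This post summarizes the deployment process.
1. Choosing a Hadoop version
Hadoop broadly comes in two flavors: Apache Hadoop and third-party distributions. For the sake of efficient cluster deployment, cluster stability, and centralized configuration management later on, the industry mostly uses Cloudera's distribution, known as CDH.
Below is a reposted comparison between the Hadoop community version and third-party distributions: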
Apache community version
Advantages:
Disadvantages:
Third-party distributions (e.g. CDH, HDP, MapR)
Advantages:
Disadvantages:
Source: http://itindex.net/detail/51484-%E8%87%AA%E5%AD%A6-%E5%A4%A7%E6%95%B0%E6%8D%AE-%E7%94%9F%E4%BA%A7
See the original author's blog for more details.
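2. Preparing the installation media
The media preparation and installation sections mainly follow http://blog.csdn.net/shawnhu007/article/details/52579204, with a few additions so that the steps can be followed without any guesswork.
We use the offline installation method, which requires downloading the CDH offline packages and related components:
For downloading the media and the installation steps, the main reference is: http://blog.csdn.net/shawnhu007/article/details/52579204
For online installation (which needs a fast network connection), see: http://www.cnblogs.com/ee900222/p/hadoop_3.html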
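3. Preparing the operating system
Prepare three identically configured CentOS 7 machines as local VMware virtual machines. The Cloudera distribution needs more hardware than the Apache community version: plan for at least 10 GB of memory, otherwise you will run into all kinds of problems later, some of which you may never find an answer to.
My first two installation attempts failed because the nodes had too little memory. I recommend at least 4 GB for cloudera-scm-server and at least 1.5 GB for each cloudera-scm-agent.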
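Before going further, it may be worth checking each VM against these thresholds. The following is a minimal sketch, not part of the referenced guide: the function names `mem_total_kb` and `has_min_mem_mb` are hypothetical, and it assumes a Linux guest where `/proc/meminfo` is readable.

```shell
# Hypothetical helpers for a pre-install memory check (names are illustrative).

# Print total physical memory in kB as reported by the kernel.
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' /proc/meminfo
}

# has_min_mem_mb TOTAL_KB MIN_MB: succeed if TOTAL_KB is at least MIN_MB megabytes.
has_min_mem_mb() {
    total_kb=$1
    min_mb=$2
    [ "$total_kb" -ge $((min_mb * 1024)) ]
}

# Example (run on the node that will host cloudera-scm-server):
# has_min_mem_mb "$(mem_total_kb)" 3800 || echo "probably less than 4 GB of RAM" >&2
```

MemTotal is usually a little lower than the RAM assigned to the VM, because the kernel reserves some memory for itself, which is why the example compares against 3800 MB rather than 4096 MB.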
The three virtual machines are set up as follows:
IP address | Hostname | Role |
192.168.42.128 | CDH1 | master node, datanode |
192.168.42.129 | CDH2 | datanode |
192.168.42.130 | CDH3 | datanode |
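4. Pre-installation configuration and prerequisite software
You can install a single machine in VMware first, finish the configuration below, and then clone it into the other two machines, which avoids repeating the same configuration three times.
Because this is the CentOS 7 minimal install, the first thing to fix is networking on first boot: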
[root@cdh1~]# vi /etc/sysconfig/network-scripts/ifcfg-enp0s3
Change ONBOOT=no to ONBOOT=yes
[root@cdh1~]# systemctl restart network
[root@cdh1~]# yum install net-tools    // provides ifconfig for inspecting the network
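If you would rather script this edit than open vi on every clone, something along these lines could work. It is only a sketch: `enable_onboot` is a made-up helper name, it relies on GNU sed's `-i` option (which CentOS provides), and the interface file name `ifcfg-enp0s3` comes from the command above and may differ on your machine.

```shell
# enable_onboot FILE: set ONBOOT=yes in an ifcfg-style file, adding the key if it is absent.
enable_onboot() {
    cfg=$1
    if grep -q '^ONBOOT=' "$cfg"; then
        # Replace whatever value is currently assigned.
        sed -i 's/^ONBOOT=.*/ONBOOT=yes/' "$cfg"
    else
        echo 'ONBOOT=yes' >> "$cfg"
    fi
}

# Example (as root):
# enable_onboot /etc/sysconfig/network-scripts/ifcfg-enp0s3 && systemctl restart network
```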
Check for a preinstalled OpenJDK, remove it, and install the Oracle JDK 8 RPM:
[root@cdh1~]# java -version
[root@cdh1~]# rpm -qa | grep jdk
java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64
java-1.7.0-openjdk-headless-1.7.0.75-2.5.4.2.el7_0.x86_64
[root@cdh1~]# yum -y remove java-1.7.0-openjdk-1.7.0.75-2.5.4.2.el7_0.x86_64
[root@cdh1~]# yum -y remove java-1.7.0-openjdk-headless-1.7.0.75-2.5.4.2.el7_0.x86_64
[root@cdh1~]# java -version
bash: /usr/bin/java: No such file or directory
[root@cdh1~]# rpm -ivh jdk-8u101-linux-x64.rpm
[root@cdh1~]# java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
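Since every node needs the same JDK, a small check can confirm that the swap from OpenJDK 1.7 to JDK 8 really happened on each machine. This is a sketch under assumptions: `installed_java_version` and `java_major` are hypothetical names, and the parsing assumes the `java version "1.8.0_101"` output format shown above.

```shell
# installed_java_version: extract the quoted version from `java -version`,
# which writes its output to stderr.
installed_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# java_major VERSION: map "1.8.0_101" to 8 and "11.0.2" to 11.
java_major() {
    echo "$1" | awk -F'[._]' '{ if ($1 == 1) print $2; else print $1 }'
}

# Example:
# [ "$(java_major "$(installed_java_version)")" = "8" ] || echo "JDK 8 not active on this node" >&2
```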
Set the hostname and map all three nodes in /etc/hosts:
[root@cdh1~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=cdh1
[root@cdh1~]# vi /etc/hosts
127.0.0.1 localhost.cdh1
192.168.42.128 cdh1
192.168.42.129 cdh2
192.168.42.130 cdh3
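The host mappings have to be identical on all three machines, so it can help to add them with a small idempotent helper instead of editing by hand. The sketch below is an illustration only; `add_host_entry` is a hypothetical name, and the addresses are the ones from the hosts file above.

```shell
# add_host_entry FILE IP NAME: append "IP NAME" unless NAME is already mapped
# on a non-comment line of FILE.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    if ! awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Example (as root, on every node):
# add_host_entry /etc/hosts 192.168.42.128 cdh1
# add_host_entry /etc/hosts 192.168.42.129 cdh2
# add_host_entry /etc/hosts 192.168.42.130 cdh3
```

On CentOS 7 the persistent hostname is normally kept in /etc/hostname, so running `hostnamectl set-hostname cdh1` (with the right name on each node) may be needed in addition to the /etc/sysconfig/network edit.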
Disable SELinux (the configuration change takes effect after a reboot):
[root@cdh1~]# vi /etc/sysconfig/selinux
SELINUX=disabled
[root@cdh1~]# sestatus -v
SELinux status: disabled    // SELinux is now off
[root@cdh1~]# systemctl stop firewalld
[root@cdh1~]# systemctl disable firewalld
rm '/etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service'
rm '/etc/systemd/system/basic.target.wants/firewalld.service'
[root@cdh1~]# systemctl status firewalld
firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled)
   Active: inactive (dead)
[root@cdh1~]# yum -y install ntp
On the master node:
[root@cdh1~]# vi /etc/ntp.conf
Comment out all existing "server *.*.*" entries and add one NTP server you can reach (I used my company's test NTP server):
server 172.30.0.19 iburst
On the other nodes, point NTP at the master's address (also in /etc/ntp.conf):
server 192.168.42.128 iburst
[root@cdh1~]# systemctl start ntpd     // start the NTP service
[root@cdh1~]# systemctl status ntpd    // check the NTP service status
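The ntp.conf edit described above (comment out the default servers, add one of your own) can also be scripted. This is a rough sketch, assuming GNU sed and a stock ntp.conf whose upstream entries start with `server`; `point_ntp_at` is a hypothetical helper name.

```shell
# point_ntp_at FILE SERVER: comment out every active "server ..." line in FILE,
# then append a single entry for SERVER.
point_ntp_at() {
    conf=$1
    server=$2
    sed -i 's/^[[:space:]]*server[[:space:]]/#&/' "$conf"
    printf 'server %s iburst\n' "$server" >> "$conf"
}

# Example on cdh2 and cdh3 (as root), using the master's address from the text:
# point_ntp_at /etc/ntp.conf 192.168.42.128 && systemctl restart ntpd
```

The transcript above only starts ntpd; `systemctl enable ntpd` would additionally start it at boot, and `ntpq -p` is one way to see whether the node is actually talking to its configured server.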
The following sets up passwordless SSH login from 192.168.42.128 to 192.168.42.129 as an example:
[root@cdh1 /]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
/root/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
1d:e9:b4:ed:1d:e5:c6:a7:f3:23:ac:02:2b:8c:fc:ca root@cdh1
The key's randomart image is:
+--[ RSA 2048]----+
| |
| . |
| + .|
| + + + |
| S + . . =|
| . . . +.|
| . o o o + |
| .o o . . o + |
| Eo.. ... . o|
+-----------------+
[root@cdh1 /]# ssh-copy-id 192.168.42.129
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@192.168.42.129's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh '192.168.42.129'"
and check to make sure that only the key(s) you wanted were added.
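The example covers one pair of hosts; the same `ssh-copy-id` call has to be repeated for every target node. A loop such as the one below could do that. It is a sketch only: `NODES` and `push_key_to_all` are hypothetical names, the addresses are taken from the /etc/hosts entries above, and the `DRY_RUN` switch is an addition of mine for previewing the commands.

```shell
# Addresses of all cluster nodes (from the /etc/hosts entries in this post).
NODES="192.168.42.128 192.168.42.129 192.168.42.130"

# push_key_to_all: copy the current user's public key to root on every node.
# With DRY_RUN=1 the commands are printed instead of executed.
push_key_to_all() {
    for host in $NODES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id root@$host"
        else
            ssh-copy-id "root@$host"
        fi
    done
}

# Example: preview first, then run for real (you will be asked for each root password).
# DRY_RUN=1 push_key_to_all
# push_key_to_all
```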
Installing MySQL
CentOS 7 ships with MariaDB, which has to be removed first:
[root@cdh1 /]# rpm -qa | grep mariadb
mariadb-libs-5.5.41-2.el7_0.x86_64
[root@cdh1 /]# rpm -e --nodeps mariadb-libs-5.5.41-2.el7_0.x86_64
[root@cdh1 /]# tar -xvf MySQL-5.6.24-1.linux_glibc2.5.x86_64.rpm-bundle.tar    // copy the MySQL RPM bundle to the server, then unpack it
[root@cdh1 /]# rpm -ivh MySQL-*.rpm    // install all of the extracted RPMs
[root@cdh1 /]# cp /usr/share/mysql/my-default.cnf /etc/my.cnf
[root@cdh1 /]# vi /etc/my.cnf    // add the following settings and save
[mysqld]
default-storage-engine = innodb
innodb_file_per_table
collation-server = utf8_general_ci
init-connect = 'SET NAMES utf8'
character-set-server = utf8
[root@cdh1 /]# yum install -y perl-Module-Install.noarch
[root@cdh1 /]# /usr/bin/mysql_install_db    // initialize MySQL
[root@cdh1 /]# service mysql restart    // start MySQL
ERROR! MySQL server PID file could not be found!
Starting MySQL... SUCCESS!
[root@cdh1 /]# cat /root/.mysql_secret    // show the initial MySQL root password
# The random password set for the root user at Fri Sep 22 11:13:25 2017 (local time): 9mp7uYFmgt6drdq3
[root@cdh1 /]# mysql -u root -p    // log in and change the password
mysql> SET PASSWORD=PASSWORD('123456');
mysql> use mysql;
mysql> update user set host='%' where user='root' and host='localhost';    // allow remote access to MySQL
Query OK, 1 row affected (0.05 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
[root@cdh1 /]# chkconfig mysql on    // start MySQL at boot
[root@cdh1 /]# tar -zxvf mysql-connector-java-5.1.44.tar.gz    // extract mysql-connector-java-5.1.44.tar.gz to get mysql-connector-java-5.1.44-bin.jar
[root@cdh1 /]# mkdir /usr/share/java    // create the java directory on every node
[root@cdh1 /]# cp mysql-connector-java-5.1.44-bin.jar /usr/share/java/mysql-connector-java.jar    // copy the jar into /usr/share/java and rename it to mysql-connector-java.jar
Create the databases for the services (in the mysql client):
create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
create database amon DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
create database hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
create database monitor DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
create database oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.00 sec)
grant all on *.* to root@"%" Identified by "123456";
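The five CREATE DATABASE statements differ only in the database name, so they can be generated rather than typed. A minimal sketch follows; `create_db_sql` is a made-up helper, and the `IF NOT EXISTS` clause is my addition so that the script can be rerun safely.

```shell
# create_db_sql NAME...: print one CREATE DATABASE statement per name,
# using the same charset and collation as the statements above.
create_db_sql() {
    for db in "$@"; do
        printf 'CREATE DATABASE IF NOT EXISTS %s DEFAULT CHARSET utf8 COLLATE utf8_general_ci;\n' "$db"
    done
}

# Example: pipe the generated SQL into the mysql client.
# create_db_sql hive amon hue monitor oozie | mysql -u root -p
```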
5. Installing Cloudera Manager
// Unpack the CM tarball into the target directory on every server (or unpack it on the master and scp it to the same directory on each node)
[root@cdh1 ~]# mkdir /opt/cloudera-manager
[root@cdh1 ~]# tar -axvf cloudera-manager-centos7-cm5.7.2_x86_64.tar.gz -C /opt/cloudera-manager
// Create the cloudera-scm user (all nodes)
[root@cdh1 ~]# useradd --system --home=/opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
// On the master, create the local metadata directory for cloudera-manager-server
[root@cdh1 ~]# mkdir /var/cloudera-scm-server
[root@cdh1 ~]# chown cloudera-scm:cloudera-scm /var/cloudera-scm-server
[root@cdh1 ~]# chown cloudera-scm:cloudera-scm /opt/cloudera-manager
// Point cloudera-manager-agent on the worker nodes at the master
[root@cdh1 ~]# vi /opt/cloudera-manager/cm-5.7.2/etc/cloudera-scm-agent/config.ini
Change server_host to the hostname of the machine running CMS, i.e. cdh1
// On the master, create the parcel-repo repository directory
[root@cdh1 ~]# mkdir -p /opt/cloudera/parcel-repo
[root@cdh1 ~]# chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
[root@cdh1 ~]# cp CDH-5.7.2-1.cdh5.7.2.p0.18-el7.parcel CDH-5.7.2-1.cdh5.7.2.p0.18-el7.parcel.sha manifest.json /opt/cloudera/parcel-repo
Note: the downloaded checksum file is named CDH-5.7.2-1.cdh5.7.2.p0.18-el7.parcel.sha1; the trailing 1 must be removed from its extension.
// Create the parcels directory on all nodes
[root@cdh1 ~]# mkdir -p /opt/cloudera/parcels
[root@cdh1 ~]# chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
Explanation: Cloudera Manager takes the CDH parcels from /opt/cloudera/parcel-repo on the master, then distributes, unpacks, and activates them under /opt/cloudera/parcels on each node.
// Initialize the database with scm_prepare_database.sh (on the master)
[root@cdh1 ~]# /opt/cloudera-manager/cm-5.7.2/share/cmf/schema/scm_prepare_database.sh mysql -hcdh1 -uroot -p123456 --scm-host cdh1 scmdbn scmdbu scmdbp
Explanation: this script creates and configures the database that CMS needs. The arguments are:
mysql: the database type is MySQL; if you used Oracle during installation, change this argument to oracle.
-hcdh1: the database lives on host cdh1, i.e. the master node.
-uroot: connect to MySQL as root.
-p123456: the MySQL root password is 123456.
--scm-host cdh1: the CMS host, usually the same machine that MySQL is installed on.
The last three arguments are the database name, the database user, and the database password.
If you get this error:
ERROR com.cloudera.enterprise.dbutil.DbProvisioner - Exception when creating/dropping database with user 'root' and jdbc url 'jdbc:mysql://localhost/?useUnicode=true&characterEncoding=UTF-8'
java.sql.SQLException: Access denied for user 'root'@'cdh1' (using password: YES)
then, following http://forum.spring.io/forum/spring-projects/web/57254-java-sql-sqlexception-access-denied-for-user-root-localhost-using-password-yes, run these statements in the mysql client:
use mysql;
update user set PASSWORD=PASSWORD('123456') where user='root';
GRANT ALL PRIVILEGES ON *.* TO 'root'@'cdh1' IDENTIFIED BY '123456' WITH GRANT OPTION;
FLUSH PRIVILEGES;
// Start the server on the master
[root@cdh1 ~]# cp /opt/cloudera-manager/cm-5.7.2/etc/init.d/cloudera-scm-server /etc/init.d/cloudera-scm-server
[root@cdh1 ~]# chkconfig cloudera-scm-server on
[root@cdh1 ~]# vi /etc/init.d/cloudera-scm-server
Change CMF_DEFAULTS=${CMF_DEFAULTS:-/etc/default} to CMF_DEFAULTS=/opt/cloudera-manager/cm-5.7.2/etc/default
[root@cdh1 ~]# service cloudera-scm-server start
// To make sure cloudera-scm-server comes up after every reboot, also add this command to the boot script /etc/rc.local: service cloudera-scm-server restart
// Start cloudera-scm-agent on all nodes
[root@cdhX ~]# mkdir /opt/cloudera-manager/cm-5.7.2/run/cloudera-scm-agent
[root@cdhX ~]# cp /opt/cloudera-manager/cm-5.7.2/etc/init.d/cloudera-scm-agent /etc/init.d/cloudera-scm-agent
[root@cdhX ~]# chkconfig cloudera-scm-agent on
[root@cdhX ~]# vi /etc/init.d/cloudera-scm-agent
Change CMF_DEFAULTS=${CMF_DEFAULTS:-/etc/default} to CMF_DEFAULTS=/opt/cloudera-manager/cm-5.7.2/etc/default
[root@cdhX ~]# service cloudera-scm-agent start
// To make sure cloudera-scm-agent comes up after every reboot, also add this command to the boot script /etc/rc.local: service cloudera-scm-agent restart
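Because the parcel is copied around by hand in an offline install, it may be worth confirming that it was not truncated or corrupted before Cloudera Manager tries to use it. The sketch below assumes that the `.sha` file holds the parcel's SHA-1 hex digest as its first token, and `verify_parcel` is a hypothetical helper name, not a Cloudera tool.

```shell
# verify_parcel PARCEL: compare the SHA-1 of PARCEL with the digest stored in PARCEL.sha.
# Assumption: the first whitespace-delimited token of PARCEL.sha is the hex digest.
verify_parcel() {
    parcel=$1
    expected=$(awk 'NR == 1 {print $1}' "$parcel.sha")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    [ "$expected" = "$actual" ]
}

# Example (after renaming the .sha1 file to .sha as noted above):
# cd /opt/cloudera/parcel-repo
# verify_parcel CDH-5.7.2-1.cdh5.7.2.p0.18-el7.parcel || echo "checksum mismatch, download again" >&2
```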
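6. Installing CDH from the browser
Once the master node has finished starting up, the rest is done in the browser.
Open 192.168.42.128:7180 and log in with the default credentials admin / admin.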
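The server can take a while before port 7180 answers, so instead of reloading the browser you could poll for it. The helper below is a generic retry loop, offered as a sketch; `wait_until` is a hypothetical name, and the curl invocation in the example assumes curl is installed on the machine you run it from.

```shell
# wait_until ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS times,
# sleeping DELAY seconds after each failure. Returns 1 if CMD never succeeded.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example: check every 5 seconds for up to 5 minutes.
# wait_until 60 5 curl -sf -o /dev/null http://192.168.42.128:7180/ && echo "Cloudera Manager is answering"
```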
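The following installation steps are carried out in the browser.
Configuring hosts: because we installed and started the agent on every node, and each node's configuration file points to cdh1 as the server, all three hosts show up under "Currently Managed Hosts". Select all of them and continue.
Note: if cloudera-scm-agent was not set to start at boot and a machine was rebooted along the way, the other servers may not be detected here.
Then select CDH.
Note that two items fail the host check at this point.
According to the post at http://www.cnblogs.com/itboys/p/5955545.html, you can run the following commands on the cluster nodes and then click "Run Again" above, after which all checks should pass.
It did not work for me, though; if anyone knows why, please tell me.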
echo 0 > /proc/sys/vm/swappiness
echo never > /sys/kernel/mm/transparent_hugepage/defrag
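One thing worth ruling out when the check keeps failing: writes to /proc/sys and /sys only affect the node they are run on and are lost at reboot, so the values need to be confirmed on every host just before rerunning the check. The sketch below only reads the current values; `thp_active_mode` and `show_kernel_tuning` are hypothetical names, and it is not a confirmed explanation for the failure described above.

```shell
# thp_active_mode TEXT: print the bracketed (currently active) value from a
# transparent_hugepage sysfs line such as "always madvise [never]".
thp_active_mode() {
    echo "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# show_kernel_tuning: report the two settings changed by the commands above.
show_kernel_tuning() {
    echo "swappiness: $(cat /proc/sys/vm/swappiness)"
    echo "thp defrag: $(thp_active_mode "$(cat /sys/kernel/mm/transparent_hugepage/defrag)")"
}

# Example: run on each node (cdh1, cdh2, cdh3).
# show_kernel_tuning
```

To keep the swappiness value across reboots, the usual approach is a `vm.swappiness` entry in /etc/sysctl.conf; the transparent hugepage setting is commonly reapplied from /etc/rc.local.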
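Choose the services to install according to your needs; selecting all services places higher demands on the system.
Database settings:
Database setting | Database type | Database name | Username | Password |
Hive | mysql | hive | root | 123456 |
Oozie Server | mysql | oozie | root | 123456 |
After that, just keep clicking Next to start the installation.
When the installation finishes, open 192.168.42.128:7180 in the browser to view the cluster status:
I have quite a few alerts here, probably caused by errors in some components during installation. I am not yet able to resolve them, so let's look at the basic functionality first.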
7. Testing
Run the following example program, which estimates Pi, on one machine in the cluster:
sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100
The execution status of the MapReduce job can also be seen in the YARN web UI:
The terminal output during the MapReduce run looks like this:
Number of Maps = 10
Samples per Map = 100
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
17/09/22 17:17:50 INFO client.RMProxy: Connecting to ResourceManager at cdh1/192.168.42.128:8032
17/09/22 17:17:52 INFO input.FileInputFormat: Total input paths to process : 10
17/09/22 17:17:52 INFO mapreduce.JobSubmitter: number of splits:10
17/09/22 17:17:53 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1505892176617_0002
17/09/22 17:17:53 INFO impl.YarnClientImpl: Submitted application application_1505892176617_0002
17/09/22 17:17:54 INFO mapreduce.Job: The url to track the job: http://cdh1:8088/proxy/application_1505892176617_0002/
17/09/22 17:17:54 INFO mapreduce.Job: Running job: job_1505892176617_0002
17/09/22 17:18:07 INFO mapreduce.Job: Job job_1505892176617_0002 running in uber mode : false
17/09/22 17:18:07 INFO mapreduce.Job: map 0% reduce 0%
17/09/22 17:18:22 INFO mapreduce.Job: map 10% reduce 0%
17/09/22 17:18:29 INFO mapreduce.Job: map 20% reduce 0%
17/09/22 17:18:37 INFO mapreduce.Job: map 30% reduce 0%
17/09/22 17:18:43 INFO mapreduce.Job: map 40% reduce 0%
17/09/22 17:18:49 INFO mapreduce.Job: map 50% reduce 0%
17/09/22 17:18:56 INFO mapreduce.Job: map 60% reduce 0%
17/09/22 17:19:02 INFO mapreduce.Job: map 70% reduce 0%
17/09/22 17:19:10 INFO mapreduce.Job: map 80% reduce 0%
17/09/22 17:19:16 INFO mapreduce.Job: map 90% reduce 0%
17/09/22 17:19:24 INFO mapreduce.Job: map 100% reduce 0%
17/09/22 17:19:30 INFO mapreduce.Job: map 100% reduce 100%
17/09/22 17:19:32 INFO mapreduce.Job: Job job_1505892176617_0002 completed successfully
17/09/22 17:19:32 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=91
        FILE: Number of bytes written=1308980
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2590
        HDFS: Number of bytes written=215
        HDFS: Number of read operations=43
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=3
    Job Counters
        Launched map tasks=10
        Launched reduce tasks=1
        Data-local map tasks=10
        Total time spent by all maps in occupied slots (ms)=58972
        Total time spent by all reduces in occupied slots (ms)=5766
        Total time spent by all map tasks (ms)=58972
        Total time spent by all reduce tasks (ms)=5766
        Total vcore-seconds taken by all map tasks=58972
        Total vcore-seconds taken by all reduce tasks=5766
        Total megabyte-seconds taken by all map tasks=60387328
        Total megabyte-seconds taken by all reduce tasks=5904384
    Map-Reduce Framework
        Map input records=10
        Map output records=20
        Map output bytes=180
        Map output materialized bytes=340
        Input split bytes=1410
        Combine input records=0
        Combine output records=0
        Reduce input groups=2
        Reduce shuffle bytes=340
        Reduce input records=20
        Reduce output records=0
        Spilled Records=40
        Shuffled Maps =10
        Failed Shuffles=0
        Merged Map outputs=10
        GC time elapsed (ms)=1509
        CPU time spent (ms)=10760
        Physical memory (bytes) snapshot=4541886464
        Virtual memory (bytes) snapshot=30556168192
        Total committed heap usage (bytes)=3937402880
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1180
    File Output Format Counters
        Bytes Written=97
Job Finished in 102.286 seconds
Estimated value of Pi is 3.14800000000000000000
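The two arguments `pi 10 100` request 10 map tasks with 100 samples each, which matches the "Number of Maps" and "Samples per Map" lines and gives only 1000 sample points in total, so a rough estimate such as 3.148 is expected. If you want to use this job as a scripted smoke test, the result line can be pulled out of the output. The helpers below are a sketch with hypothetical names (`pi_estimate`, `pi_abs_error`).

```shell
# pi_estimate: read the job output on stdin and print the value from the
# "Estimated value of Pi is ..." line.
pi_estimate() {
    awk '/Estimated value of Pi is/ {print $NF}'
}

# pi_abs_error VALUE: absolute difference between VALUE and pi, to four decimals.
pi_abs_error() {
    awk -v v="$1" 'BEGIN { d = v - 3.141592653589793; if (d < 0) d = -d; printf "%.4f\n", d }'
}

# Example: capture the full log and report the estimate.
# sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 100 2>&1 | tee pi.log | pi_estimate
```

Raising the second argument (samples per map) should bring the estimate closer to pi at the cost of a longer run.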
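Problems encountered:
(1) When installing CentOS 7 with VMware on a Windows Server 2008 R2 server, this error was reported:
This host does not support 64-bit guest operating systems; this system cannot run.
The fix is to enable the VT-x virtualization feature in the virtual machine's settings in VMware, and to apply the same setting in the virtual machine server's management web interface on the Windows Server host.
(2) During cluster setup, several components failed to install.
On the first attempt:
After retrying:
The problems above are still unresolved; advice from anyone who knows the cause is welcome.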
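鑄劍 team signature:
[Director] 十二春秋之, 3483099@qq.com;
[Master] 戈稻不蒼, han169@126.com;
[Java development] 雨鷥, 343691194@qq.com; 思齊駿惠, qiangzhang1227@163.com; 小王子, 545106057@qq.com; 巡山小鑽風, 840260821@qq.com;
[VS development] 豆點, 2268800211@qq.com;
[System testing] 土鏡問道, 847071279@qq.com; 塵子與自由, 695187655@qq.com;
[Big data] 沙漠綠洲, caozhipan@126.com; 張三省, 570417591@qq.com;
[Networking] 夜孤星, 11297761@qq.com;
[System operations] 三石頭, 261453882@qq.com; 平凡怪咖, 591169003@qq.com;
[Disaster recovery and backup] 秋天的雨, 18568921@qq.com;
[Security] Confidential, you know how it is.
Original author: 張三省
Copyright belongs to the author. For commercial reposting, please contact the author for authorization; for non-commercial reposting, please credit the source.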