I. Experimental Environment
1. IP addresses, hostnames, and domain names; all hosts can reach the Internet
10.0.70.242 hadoop1 hadoop1.com
10.0.70.243 hadoop2 hadoop2.com
10.0.70.230 hadoop3 hadoop3.com
10.0.70.231 hadoop4 hadoop4.com
2. Operating system
CentOS release 6.5 (Final), 64-bit
II. Configuration Steps
1. Pre-installation preparation (perform as root on every host in the cluster)
(1) Download the required installation files from the following URLs:
http://archive.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.7.0_x86_64.tar.gz
http://archive.cloudera.com/cdh5/parcels/5.7/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel
http://archive.cloudera.com/cdh5/parcels/5.7/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha1
http://archive.cloudera.com/cdh5/parcels/5.7/manifest.json
(2) Check the OS dependency packages with the command below, replacing xxxx with each package name:
# rpm -qa | grep xxxx
The following packages must be installed:
chkconfig
python (2.6 required for CDH 5)
bind-utils
psmisc
libxslt
zlib
sqlite
cyrus-sasl-plain
cyrus-sasl-gssapi
fuse
portmap (rpcbind)
fuse-libs
redhat-lsb
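The per-package check in step (2) can be scripted for the whole list at once. A minimal sketch, assuming the package names above (portmap is queried under its package name rpcbind here):

```shell
#!/bin/bash
# Print the required packages that "rpm -q" reports as missing.
check_missing() {
    missing=""
    for pkg in "$@"; do
        # rpm -q exits non-zero when the package is not installed
        rpm -q "$pkg" >/dev/null 2>&1 || missing="$missing $pkg"
    done
    echo "${missing# }"
}

required="chkconfig python bind-utils psmisc libxslt zlib sqlite \
cyrus-sasl-plain cyrus-sasl-gssapi fuse rpcbind fuse-libs redhat-lsb"
check_missing $required
```

Empty output means every dependency is present; otherwise install the printed packages with yum.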
(3) Configure name resolution
# vi /etc/hosts
# Add the following entries:
10.0.70.242 hadoop1
10.0.70.243 hadoop2
10.0.70.230 hadoop3
10.0.70.231 hadoop4
Alternatively, set up proper DNS resolution.
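Whichever approach is used, every hostname should resolve on every host. A quick check, sketched with getent (the host list is an assumption matching the table above):

```shell
#!/bin/bash
# Succeeds when the given name resolves via /etc/hosts or DNS.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

for h in hadoop1 hadoop2 hadoop3 hadoop4; do
    resolves "$h" && echo "$h: ok" || echo "$h: NOT resolvable"
done
```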
(4) Install the JDK
The JDK versions recommended for CDH5 are 1.7.0_67, 1.7.0_75, and 1.7.0_80; here we install jdk1.8.0_51.
Note:
. All hosts must run the same JDK version.
. The installation directory is /app/zpy/jdk1.8.0_51/.
# mkdir -p /app/zpy
# cd /app/zpy/3rd
# tar zxvf jdk-8u51-linux-x64.tar.gz -C /app/zpy
# chown -R root.root jdk1.8.0_51/
# vi /etc/profile
# Append the following lines:
JAVA_HOME=/app/zpy/jdk1.8.0_51
JAVA_BIN=/app/zpy/jdk1.8.0_51/bin
PATH=$PATH:$JAVA_BIN
CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME JAVA_BIN PATH CLASSPATH
# . /etc/profile
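Since all hosts must run the same JDK, it is worth comparing versions across the cluster. A sketch: the helper simply checks that every input line is identical, and the commented-out loop (a hypothetical usage, requiring the passwordless ssh configured in step (9)) feeds it one version line per host:

```shell
#!/bin/bash
# all_same succeeds when every line on stdin is identical.
all_same() {
    [ "$(sort -u | wc -l)" -eq 1 ]
}

# Hypothetical cluster-wide check:
# for h in hadoop1 hadoop2 hadoop3 hadoop4; do
#     ssh "$h" '/app/zpy/jdk1.8.0_51/bin/java -version 2>&1 | head -1'
# done | all_same && echo "JDK versions match" || echo "JDK versions differ"
```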
(5) NTP time synchronization
# echo "0 * * * * root ntpdate 10.0.70.2" >> /etc/crontab
# /etc/init.d/crond restart
(6) Create the CM user
# useradd --system --home=/app/zpy/cm-5.7.0/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
# sed -i "s/Defaults requiretty/#Defaults requiretty/g" /etc/sudoers
(7) Install and configure the MySQL database
# yum install -y mysql-server
# Change the root password (substitute your own)
mysqladmin -u root password 'new-password'
# Edit the configuration file
vi /etc/my.cnf
# with the following content
[mysqld]
transaction-isolation = READ-COMMITTED
# Disabling symbolic links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links = 0
#
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
#
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#
# log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log'
# with an appropriate path for your system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#
# For MySQL version 5.1.8 or later. Comment out binlog_format for older versions.
binlog_format = mixed
#
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
# join_buffer_size = 8M
#
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
sql_mode=STRICT_ALL_TABLES

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
# Enable start at boot
chkconfig mysqld on
# Start MySQL
service mysqld restart
If the InnoDB engine is unavailable:
> show engines;
Delete the ib* files under /var/lib/mysql/ and restart the service.
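Whether InnoDB actually came back can be checked from the output of show engines. A sketch; the mysql invocation in the comment assumes the root password set earlier:

```shell
#!/bin/bash
# Succeeds when "show engines;" output on stdin lists InnoDB as YES or DEFAULT.
innodb_enabled() {
    awk 'tolower($1) == "innodb" && (tolower($2) == "yes" || tolower($2) == "default") { found = 1 }
         END { exit !found }'
}

# Typical usage:
# mysql -u root -p -e 'show engines;' | innodb_enabled && echo "InnoDB OK" || echo "InnoDB disabled"
```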
# Create the metastore databases as needed
>create database hive;
>grant all on hive.* to 'hive'@'%' identified by '1qaz@WSX?';
>create database man;
>grant all on man.* to 'man'@'%' identified by '1qaz@WSX?';
>create database oozie;
>grant all on oozie.* to 'oozie'@'%' identified by '1qaz@WSX?';
(8) Install the MySQL JDBC driver
# cd /app/zpy/3rd
# cp mysql-connector-java-5.1.38-bin.jar /app/zpy/cm-5.7.0/share/cmf/lib/
(9) Configure passwordless ssh (here any two hosts can reach each other without a password)
# Generate a key pair on each of the four hosts:
# cd ~
# ssh-keygen -t rsa
# Press Enter at every prompt
# On hadoop1:
# cd ~/.ssh/
# ssh-copy-id hadoop1
# scp /root/.ssh/authorized_keys hadoop2:/root/.ssh/
# On hadoop2:
# cd ~/.ssh/
# ssh-copy-id hadoop2
# scp /root/.ssh/authorized_keys hadoop3:/root/.ssh/
# On hadoop3:
# cd ~/.ssh/
# ssh-copy-id hadoop3
# scp /root/.ssh/authorized_keys hadoop4:/root/.ssh/
# On hadoop4:
# cd ~/.ssh/
# ssh-copy-id hadoop4
# scp /root/.ssh/authorized_keys hadoop1:/root/.ssh/
# scp /root/.ssh/authorized_keys hadoop2:/root/.ssh/
# scp /root/.ssh/authorized_keys hadoop3:/root/.ssh/
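The key exchange above can be verified without typing any passwords: with BatchMode, ssh fails instead of prompting. A sketch to run on each host in turn:

```shell
#!/bin/bash
# Succeeds when passwordless ssh to the given host works (no password prompt).
check_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true 2>/dev/null
}

for dst in hadoop1 hadoop2 hadoop3 hadoop4; do
    check_ssh "$dst" && echo "-> $dst: ok" || echo "-> $dst: FAILED"
done
# Repeat on every host so all pairs are covered.
```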
2. Install Cloudera Manager on hadoop1
# tar zxvf cloudera-manager-el6-cm5.7.0_x86_64.tar.gz -C /app/zpy/
# Create the cm database
# /app/zpy/cm-5.7.0/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -p1qaz@WSX? --scm-host localhost scm scm scm
# Configure the cm agent
# vim /app/zpy/cm-5.7.0/etc/cloudera-scm-agent/config.ini
# Set the CM server hostname to hadoop1 (or to the domain name hadoop1.com)
server_host=hadoop1
# Copy the three parcel files to /opt/cloudera/parcel-repo to serve as a local repository
# mkdir -p /opt/cloudera/parcel-repo
# cp CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel /opt/cloudera/parcel-repo/
# cp CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha1 /opt/cloudera/parcel-repo/
# cp manifest.json /opt/cloudera/parcel-repo/
# Rename the checksum file (the agent expects a .sha extension)
# mv /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel.sha
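The renamed .sha file holds the parcel's expected SHA-1 digest, so a truncated or corrupted download can be caught before CM distributes the parcel. A minimal sketch:

```shell
#!/bin/bash
# Succeeds when the parcel's SHA-1 matches the digest stored in "<parcel>.sha".
parcel_ok() {
    actual=$(sha1sum "$1" | awk '{print $1}')
    expected=$(awk '{print $1}' "$1.sha")
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage:
# parcel_ok /opt/cloudera/parcel-repo/CDH-5.7.0-1.cdh5.7.0.p0.45-el6.parcel \
#     && echo "parcel checksum OK" || echo "parcel corrupt: re-download it"
```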
# Change the owner
# chown -R cloudera-scm:cloudera-scm /opt/cloudera/
# Copy the /app/zpy/cm-5.7.0 directory to the other three hosts
# scp -r -p /app/zpy/cm-5.7.0 hadoop2:/app/zpy/
# scp -r -p /app/zpy/cm-5.7.0 hadoop3:/app/zpy/
# scp -r -p /app/zpy/cm-5.7.0 hadoop4:/app/zpy/
3. Create the /opt/cloudera/parcels directory on every host and change its owner
# mkdir -p /opt/cloudera/parcels
# chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
4. Start the cm server on hadoop1
# /app/zpy/cm-5.7.0/etc/init.d/cloudera-scm-server start
# This step takes some time; watch the startup with the command below
# tail -f /app/zpy/cm-5.7.0/log/cloudera-scm-server/cloudera-scm-server.log
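Rather than watching the log, the server can be considered up once the console port accepts connections. A sketch using bash's /dev/tcp redirection (port 7180 is the console port used in step 6):

```shell
#!/bin/bash
# Poll until host:port accepts TCP connections, one attempt per second.
wait_for_port() {
    host=$1; port=$2; attempts=$3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # bash opens /dev/tcp/<host>/<port> as a TCP connection
        if (exec 3<> "/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Usage on hadoop1:
# wait_for_port localhost 7180 300 && echo "CM console is up"
```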
5. Start the cm agent on all hosts
# mkdir /app/zpy/cm-5.7.0/run/cloudera-scm-agent
# chown cloudera-scm:cloudera-scm /app/zpy/cm-5.7.0/run/cloudera-scm-agent
# /app/zpy/cm-5.7.0/etc/init.d/cloudera-scm-agent start
6. Log in to the cm console and install CDH5
Open the console:
http://10.0.70.242:7180/
The page is shown in Figure 1.
Figure 1
The default username and password are both admin. After logging in you reach the welcome page. Accept the license agreement, as shown in Figure 2, and click Continue.
Figure 2
The version notes page appears, as shown in Figure 3; click Continue.
Figure 3
The service description page appears, as shown in Figure 4; click Continue.
Figure 4
The host selection page appears, listing the currently managed hosts. As shown in Figure 5, select all four hosts and click Continue.
Figure 5
The repository selection page appears, as shown in Figure 6; click Continue.
Figure 6
The cluster installation page appears, as shown in Figure 7; click Continue.
Figure 7
The validation page appears, as shown in Figure 8; click Finish.
Figure 8
The cluster setup page appears, as shown in Figure 9. Choose services as needed; here we pick Custom and select only the services we require. Services can also be added later. Click Continue.
Figure 9
The custom role assignment page appears, as shown in Figure 10. Leave it unchanged and click Continue.
Figure 10
The database setup page appears. Fill in the details and click Test Connection, as shown in Figure 11, then click Continue.
Figure 11
The review changes page appears; leave it unchanged and click Continue.
The first run page appears; wait for it to finish, as shown in Figure 12, then click Continue.
Figure 12
The success page appears, as shown in Figure 13; click Finish.
Figure 13
The installation is complete, as shown in Figure 14.
Notes:
1)
Error found before invoking supervisord: dictionary update sequence element #78 has length 1; 2 is required
This error is a CM bug. To fix it, edit /app/zpy/cm-5.7.0/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.7.0-py2.6.egg/cmf/util.py and change the code:
pipe = subprocess.Popen(['/bin/bash', '-c', ". %s; %s; env" % (path, command)],
stdout=subprocess.PIPE, env=caller_env)
to:
pipe = subprocess.Popen(['/bin/bash', '-c', ". %s; %s; env | grep -v { | grep -v }" % (path, command)],
stdout=subprocess.PIPE, env=caller_env)
Then restart all agents.
2)
If the Hive installation fails because its database cannot be created:
# cp /app/zpy/3rd/mysql-connector-java-5.1.38-bin.jar /opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45/lib/hive/lib/
3)
Add the service manually.
4) If Spark reports that JAVA_HOME cannot be found, fix it as follows:
echo "export JAVA_HOME=/app/zpy/jdk1.8.0_51" >> /opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45/meta/cdh_env.sh