8. As the root user, create the folders, grant 775 permissions to all folders and files under /opt/, and change the ownership to the current user
mkdir -p /opt/modules
mkdir -p /opt/software
mkdir -p /opt/datas
mkdir -p /opt/tools
chmod 775 /opt/*
chown beifeng:beifeng /opt/*
The final result looks like this:
[beifeng@beifeng-hadoop-02 opt]$ pwd
/opt
[beifeng@beifeng-hadoop-02 opt]$ ll
total 20
drwxrwxr-x.  5 beifeng beifeng 4096 Jul 30 00:13 clusterapps
drwxr-xr-x. 11 beifeng beifeng 4096 Jul 21 23:30 datas
drwxr-xr-x.  6 beifeng beifeng 4096 Jul 31 22:03 modules
drwxr-xr-x.  2 beifeng beifeng 4096 Jul 30 18:17 software
drwxr-xr-x.  2 beifeng beifeng 4096 Jul 10 20:26 tools
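The mkdir/chmod sequence above can be dry-run against a scratch directory before touching /opt; in this sketch a temp dir stands in for /opt, so nothing on the real system is modified:

```shell
# Dry-run of the layout commands against a scratch directory
# ($BASE stands in for /opt; beifeng's dirs are recreated under it)
BASE=$(mktemp -d)
mkdir -p "$BASE/modules" "$BASE/software" "$BASE/datas" "$BASE/tools"
chmod 775 "$BASE"/*
stat -c '%a %n' "$BASE"/*   # each line should start with 775
```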
jdk-7u67-linux-x64.tar.gz
tar -zxvf jdk-7u67-linux-x64.tar.gz -C /opt/modules
1) Using sudo, edit /etc/profile and append the following configuration:
#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.7.0_67
export PATH=$PATH:$JAVA_HOME/bin
2) After editing, switch to the root user with su - root and run source to apply the configuration.
source /etc/profile
3) Verify that the JDK installed successfully
[root@beifeng-hadoop-02 ~]# java -version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
[root@beifeng-hadoop-02 ~]# javac -version
javac 1.7.0_67
Download address: http://archive.cloudera.com/cdh5/cdh/5/
Download: hadoop-2.5.0-cdh5.3.6.tar.gz
tar -zxvf hadoop-2.5.0-cdh5.3.6.tar.gz -C /opt/modules/cdh/
Reference documentation: http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-common/ClusterSetup.html
cd /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop
Edit /etc/profile and append the following configuration:
#HADOOP_HOME
export HADOOP_HOME=/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
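After sourcing the profile, the new entries should show up on PATH. A minimal check (using the same HADOOP_HOME path as above; no Hadoop install is needed just to verify the wiring):

```shell
# Sketch: verify the two PATH additions took effect
export HADOOP_HOME=/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
echo "$PATH" | tr ':' '\n' | grep -c "$HADOOP_HOME"   # prints 2 (bin and sbin)
```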
It is recommended to edit these files with a remote SFTP editor: Notepad++ on Windows, or skEdit on Mac.
1) Edit hadoop-env.sh
export JAVA_HOME=/opt/modules/jdk1.7.0_67
2) Edit yarn-env.sh
export JAVA_HOME=/opt/modules/jdk1.7.0_67
3) Edit mapred-env.sh
export JAVA_HOME=/opt/modules/jdk1.7.0_67
4) Edit core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://beifeng-hadoop-02:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/data/tmp</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>beifeng</value>
    </property>
</configuration>
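A malformed XML edit is a common cause of daemon startup failures at this stage. A quick well-formedness check can be scripted with python3 (assumed to be on PATH; the here-doc sample stands in for the real core-site.xml under etc/hadoop/):

```shell
# Sanity-check that a *-site.xml file parses, and print its properties
cat > /tmp/core-site-sample.xml <<'XML'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://beifeng-hadoop-02:9000</value>
    </property>
</configuration>
XML
python3 - <<'PY'
import xml.etree.ElementTree as ET
root = ET.parse('/tmp/core-site-sample.xml').getroot()  # raises on malformed XML
for p in root.findall('property'):
    print(p.findtext('name'), '=', p.findtext('value'))
PY
```

Pointing the parse at each edited file catches a missing close tag before the namenode logs do.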
5) Edit hdfs-site.xml
<configuration>
    <!-- Replication factor; at most the total number of datanodes -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>beifeng-hadoop-02:50090</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>
6) Edit slaves
beifeng-hadoop-02
7) Edit yarn-site.xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>beifeng-hadoop-02</value>
    </property>
    <!-- Whether to enable log aggregation -->
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <!-- Log retention time (in seconds) -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>106800</value>
    </property>
</configuration>
8) Edit mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
9) Start the services
(1) Format HDFS
bin/hdfs namenode -format
(2) Start the namenode and datanode
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
Use the jps command or the web UI to check whether the namenode started successfully.
[beifeng@beifeng-hadoop-02 hadoop-2.5.0-cdh5.3.6]$ jps
82334 DataNode
82383 Jps
82248 NameNode
HDFS web UI: http://beifeng-hadoop-02:50070/dfshealth.html#tab-overview
(3) Start the resourcemanager and nodemanager
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
Use the jps command or the web UI to check whether the resourcemanager and nodemanager started successfully.
[beifeng@beifeng-hadoop-02 hadoop-2.5.0-cdh5.3.6]$ jps
82334 DataNode
82757 NodeManager
82874 Jps
82248 NameNode
82507 ResourceManager
YARN web UI: http://beifeng-hadoop-02:8088/cluster
(4) Start the job history server
sbin/mr-jobhistory-daemon.sh start historyserver
Check that it started successfully:
History server web UI: http://beifeng-hadoop-02:19888/
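The history server's RPC and web ports are configurable. If they ever need changing, the standard Hadoop 2.x properties go in mapred-site.xml; the fragment below is a hedged sketch showing the default ports paired with this tutorial's hostname:

```xml
<!-- Optional: only needed if the defaults (ports 10020 / 19888) must change -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>beifeng-hadoop-02:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>beifeng-hadoop-02:19888</value>
</property>
```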
(5) Start the secondarynamenode
sbin/hadoop-daemon.sh start secondarynamenode
Check that it started successfully:
Secondarynamenode web UI: http://beifeng-hadoop-02:50090/status.html
(6) Commands to stop all related services
sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemon.sh stop datanode
sbin/yarn-daemon.sh stop resourcemanager
sbin/yarn-daemon.sh stop nodemanager
sbin/mr-jobhistory-daemon.sh stop historyserver
sbin/hadoop-daemon.sh stop secondarynamenode
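Since the stop commands mirror the start commands, a small loop can generate all six (a convenience sketch, not part of the Hadoop distribution; run from $HADOOP_HOME):

```shell
# Convenience sketch: generate the six stop commands in loops.
# echo prints each command; remove the echo to actually execute them.
for d in namenode datanode secondarynamenode; do
  echo "sbin/hadoop-daemon.sh stop $d"
done
for d in resourcemanager nodemanager; do
  echo "sbin/yarn-daemon.sh stop $d"
done
echo "sbin/mr-jobhistory-daemon.sh stop historyserver"
```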
10) Run a wordcount job to verify the environment
File system shell reference: http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.6/hadoop-project-dist/hadoop-common/FileSystemShell.html
hdfs dfs -mkdir -p /user/beifeng/input
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar wordcount /user/beifeng/input /user/beifeng/output
hdfs dfs -cat /user/beifeng/output/part-r-00000
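Note the job needs at least one text file under /user/beifeng/input (uploaded with hdfs dfs -put) before it will produce output. Each line of part-r-00000 is a word, a tab, and its count; this local stand-in (sample.txt is a made-up input file) reproduces the same shape without a cluster:

```shell
# Local sketch of wordcount's output format: one "word<TAB>count" line per word
printf 'hadoop yarn\nhadoop mapreduce\n' > /tmp/sample.txt
tr ' ' '\n' < /tmp/sample.txt | sort | uniq -c | awk '{printf "%s\t%s\n", $2, $1}'
# hadoop    2
# mapreduce 1
# yarn      1
```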
1) Edit core-site.xml
<!-- SNAPPY compress -->
<property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,
        org.apache.hadoop.io.compress.DefaultCodec,
        org.apache.hadoop.io.compress.BZip2Codec,
        org.apache.hadoop.io.compress.SnappyCodec
    </value>
    <description>A comma-separated list of the compression codec classes
        that can be used for compression/decompression. In addition to any
        classes specified with this property (which take precedence), codec
        classes on the classpath are discovered using a Java ServiceLoader.
    </description>
</property>
2) Edit mapred-site.xml
<!-- Enable compression of MapReduce map output -->
<property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
</property>
<property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
1) Extract
tar -zxvf snappy-1.1.2.tar.gz -C /opt/modules/cdh/
cd /opt/modules/cdh/snappy-1.1.2
2) Run configure
./configure
3) Build and install
sudo make && sudo make install
4) After a successful build, check the install directory
cd /usr/local/lib && ls
1) Extract
tar -zxvf hadoop-snappy.tar.gz -C /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/
2) Package and build
cd /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/hadoop-snappy
mvn package -Dsnappy.prefix=/usr/local
If the Maven build cannot find libjvm.so, link it into /usr/local/lib:
sudo ln -s /opt/modules/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so /usr/local/lib
3) Copy the built jar into Hadoop's lib directory
cp /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/hadoop-snappy/target/hadoop-snappy-0.0.1-SNAPSHOT.jar /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/lib
4) Edit hadoop-env.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/native/Linux-amd64-64/
5) Copy the generated native libraries into the $HADOOP_HOME/lib/native/ directory
cd /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/hadoop-snappy/target/hadoop-snappy-0.0.1-SNAPSHOT-tar/hadoop-snappy-0.0.1-SNAPSHOT/lib
cp -r native/Linux-amd64-64 /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/lib/native/
6) Copy the files under Linux-amd64-64 into /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/lib/native/
cd Linux-amd64-64/
cp -r ./* ../
Note: use Maven's stock .m2/settings.xml configuration; otherwise the POM cannot be loaded.
mvn package -Pdist,native -DskipTests -Dtar -Drequire.snappy
The build stopped partway through: not enough disk space. References on expanding disk space:
http://os.51cto.com/art/201012/240726_all.htm
http://www.cnblogs.com/chenmh/p/5096592.html
http://www.linuxfly.org/post/243/
1) Replace the native library files under lib/native in the Hadoop install directory
cd /opt/modules/hadoop-2.5.0-src/hadoop-dist/target/hadoop-2.5.0/lib/native
cp ./* /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/lib/native/
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar pi 2 100
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar wordcount /user/beifeng/input /user/beifeng/output03
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar wordcount -Dmapreduce.map.output.compress=true -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec /user/beifeng/input /user/beifeng/output02