Be sure to follow the steps in order. If you run the start-all.sh script before everything is configured, you may need to delete the tmp directory and re-format the NameNode and DataNode.
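If that happens, a recovery along these lines usually works (hadoop.tmp.dir is set to /opt/hadoop/tmp in step 4; adjust the path if yours differs):

stop-all.sh
## wipe the half-initialized storage directories
rm -rf /opt/hadoop/tmp
## then re-format (see step 5)
hadoop namenode -format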
0. Configure the hostname and hosts file
vi /etc/hostname
hadoop-01

vi /etc/hosts
192.168.240.138 hadoop-01
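The new hostname takes effect after a re-login or reboot; a quick way to confirm both settings took hold:

hostname               ## should print hadoop-01
ping -c 1 hadoop-01    ## should resolve to 192.168.240.138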
1. Install Java and configure the environment variables
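A minimal sketch of that setup, assuming the JDK was unpacked to /usr/java (the same path hadoop-env.sh uses in step 4):

## Append to /etc/profile
export JAVA_HOME=/usr/java
export JRE_HOME=$JAVA_HOME/jre
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

## Apply and verify
source /etc/profile
java -version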
2. Install SSH and set up passwordless login
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
ssh localhost
## Login succeeded
## Last login: Tue Mar  7 21:29:45 2017 from 192.168.240.1
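If ssh localhost still prompts for a password, overly loose file permissions are the usual culprit, since sshd refuses keys otherwise:

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys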
3. Unpack Hadoop and configure the environment variables
## Unpack and move Hadoop
tar -zxvf hadoop-2.7.3.tar.gz
mv hadoop-2.7.3 /opt/hadoop

## Configure environment variables
vi /etc/profile

## Append at the end of the file
# hadoop env
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

## Apply the changes
source /etc/profile
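A quick check that the PATH change took effect:

hadoop version    ## should report Hadoop 2.7.3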
4. Edit the Hadoop configuration files
## Enter the configuration directory
cd /opt/hadoop/etc/hadoop/

## Edit hadoop-env.sh
vi hadoop-env.sh
## Set JAVA_HOME and add HADOOP_HOME
# The java implementation to use.
export JAVA_HOME=/usr/java
export HADOOP_HOME=/opt/hadoop

## vi core-site.xml, configured as follows
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-01</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop/tmp</value>
    </property>
</configuration>

## vi hdfs-site.xml, configured as follows
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/hadoop/tmp/dfs/data</value>
    </property>
</configuration>

## Create mapred-site.xml from the template and edit it
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

## vi yarn-site.xml, configured as follows
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop-01</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
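To confirm the settings were picked up, the getconf utility that ships with Hadoop can read them back:

hdfs getconf -confKey fs.defaultFS      ## expect hdfs://hadoop-01
hdfs getconf -confKey hadoop.tmp.dir    ## expect /opt/hadoop/tmp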
5. Format the NameNode and DataNode
hadoop namenode -format
hadoop datanode -format
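Strictly speaking only the NameNode needs formatting, and in Hadoop 2.x the hdfs script is the preferred entry point (the hadoop form above still works but prints a deprecation notice):

hdfs namenode -format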
6. Start Hadoop
start-all.sh
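In Hadoop 2.x start-all.sh itself is deprecated; it simply calls the two scripts below, which can also be run separately to start HDFS and YARN independently:

start-dfs.sh
start-yarn.sh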
7. A quick test
[root@hadoop-01 ~]# jps
4401 Jps
4196 NodeManager
3943 SecondaryNameNode
3784 DataNode
3661 NameNode
4094 ResourceManager
[root@hadoop-01 ~]# hadoop fs -ls
ls: `.': No such file or directory
If jps lists the five daemons besides Jps itself (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager), startup succeeded; then use the hadoop script to look at the HDFS directory. The `ls: '.': No such file or directory` message is expected here: `hadoop fs -ls` defaults to the current user's HDFS home directory, which has not been created yet (see the snippet below).
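Creating the home directory makes the plain listing work; for root that directory is /user/root:

hadoop fs -mkdir -p /user/root
hadoop fs -ls    ## no longer errors (empty listing)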
8. Run the wordcount example
## Create the input directory
[root@hadoop-01 ~]# hadoop fs -mkdir -p /data/input

## Create a test file
echo "hello world, I am jungle. bye world" > file.txt

## Put the file into the input directory
hadoop fs -put file.txt /data/input

## Run wordcount, saving the result to output
hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /data/input /data/output

## Log output:
17/03/07 23:01:10 INFO client.RMProxy: Connecting to ResourceManager at hadoop-01/192.168.240.138:8032
17/03/07 23:01:13 INFO input.FileInputFormat: Total input paths to process : 1
17/03/07 23:01:13 INFO mapreduce.JobSubmitter: number of splits:1
17/03/07 23:01:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1488898172879_0001
17/03/07 23:01:15 INFO impl.YarnClientImpl: Submitted application application_1488898172879_0001
17/03/07 23:01:15 INFO mapreduce.Job: The url to track the job: http://hadoop-01:8088/proxy/application_1488898172879_0001/
17/03/07 23:01:15 INFO mapreduce.Job: Running job: job_1488898172879_0001
17/03/07 23:01:42 INFO mapreduce.Job: Job job_1488898172879_0001 running in uber mode : false
17/03/07 23:01:42 INFO mapreduce.Job: map 0% reduce 0%
17/03/07 23:02:01 INFO mapreduce.Job: map 100% reduce 0%
17/03/07 23:02:13 INFO mapreduce.Job: map 100% reduce 100%
17/03/07 23:02:15 INFO mapreduce.Job: Job job_1488898172879_0001 completed successfully
17/03/07 23:02:16 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=84
        FILE: Number of bytes written=237309
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=137
        HDFS: Number of bytes written=50
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=15087
        Total time spent by all reduces in occupied slots (ms)=9565
        Total time spent by all map tasks (ms)=15087
        Total time spent by all reduce tasks (ms)=9565
        Total vcore-milliseconds taken by all map tasks=15087
        Total vcore-milliseconds taken by all reduce tasks=9565
        Total megabyte-milliseconds taken by all map tasks=15449088
        Total megabyte-milliseconds taken by all reduce tasks=9794560
    Map-Reduce Framework
        Map input records=1
        Map output records=7
        Map output bytes=64
        Map output materialized bytes=84
        Input split bytes=101
        Combine input records=7
        Combine output records=7
        Reduce input groups=7
        Reduce shuffle bytes=84
        Reduce input records=7
        Reduce output records=7
        Spilled Records=14
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=290
        CPU time spent (ms)=2880
        Physical memory (bytes) snapshot=283889664
        Virtual memory (bytes) snapshot=4159328256
        Total committed heap usage (bytes)=138174464
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=36
    File Output Format Counters
        Bytes Written=50
## List the output files
hadoop fs -ls /data/output

## View the output file
hadoop fs -cat /data/output/part-r-00000
I	1
am	1
bye	1
hello	1
jungle.	1
world	1
world,	1
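As a final check from a browser, the ResourceManager UI (port 8088, visible in the job-tracking URL in the log above) and the NameNode UI (port 50070 by default in Hadoop 2.x) should both respond; a headless equivalent:

curl -s -o /dev/null -w '%{http_code}\n' http://hadoop-01:8088     ## expect 200 or a redirect
curl -s -o /dev/null -w '%{http_code}\n' http://hadoop-01:50070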