Plan for three machines (one master and two slaves, per the hosts file below).
Edit the hosts file. To be safe I commented out the 127.0.0.1 line as well; normally commenting out only the ipv6 line (::1) is enough:

```
vi /etc/hosts
#127.0.0.1   localhost.localdomain localhost
#::1         localhost6.localdomain6 localhost6
192.168.79.135 master
192.168.79.131 slave1
192.168.79.132 slave2
```
You can configure each machine individually, or use scp to copy the file between servers. Note that passwordless login has not been set up yet (ssh passwordless login is covered below), so scp will prompt for a password at this point.
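For example, a minimal sketch of that copy, assuming root access on the slaves:

```bash
# copy the hosts file from master to both slaves; scp prompts for a password here
scp /etc/hosts root@slave1:/etc/hosts
scp /etc/hosts root@slave2:/etc/hosts
```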
```
# vi /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=master
NTPSERVERARGS=iburst
```
將HOSTNAME修改成預先規劃的名稱,三個機器都要修改,固然、不要重複java
Run `ssh-keygen` on every machine. Then, on the master node, run:

```
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh slave1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh slave2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
```
After that, copy the authorized_keys file into the ~/.ssh/ directory on every machine; a sketch follows below.
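A sketch of that distribution step, plus a quick check that passwordless login now works (scp still prompts for passwords this one last time):

```bash
# push the aggregated key file back out to the slaves
scp ~/.ssh/authorized_keys root@slave1:~/.ssh/
scp ~/.ssh/authorized_keys root@slave2:~/.ssh/
# sshd is picky about permissions on this file; run on every machine
chmod 600 ~/.ssh/authorized_keys
# these should now return without asking for a password
ssh slave1 date
ssh slave2 date
```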
First check which Java packages are already installed:

```
rpm -qa | grep java
```

This shows:

```
java-1.4.2-gcj-compat-1.4.2.0-40jpp.115
java-1.6.0-openjdk-1.6.0.0-1.7.b09.el5
```

Uninstall them:

```
rpm -e --nodeps java-1.4.2-gcj-compat-1.4.2.0-40jpp.115
rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.7.b09.el5
```
Download the matching JDK tarball and extract it to a directory of your choice (I used /usr/java/; the commands are sketched below). After extracting, configure the JDK environment variables in /etc/profile:

```
export JAVA_HOME=/usr/java/jdk1.7.0_67
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JRE_HOME=/usr/java/jdk1.7.0_67/jre
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
```

When done, run `source /etc/profile` so the variables take effect.
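For reference, a minimal sketch of the extraction step; the tarball name is an assumption based on the version used here:

```bash
mkdir -p /usr/java
# tarball name is an assumption; substitute whatever you downloaded
tar -zxf jdk-7u67-linux-x64.tar.gz -C /usr/java/
```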
Next, verify that the JDK is installed correctly:

```
[root@master ~]# java -version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
[root@master ~]# javac -version
javac 1.7.0_67
[root@master ~]# $JAVA_HOME
-bash: /usr/java/jdk1.7.0_67: is a directory
```

If the correct version and JDK path show up as above, the installation succeeded. You can then copy the Java installation files and /etc/profile to the other nodes and run `source /etc/profile` on each.
(2)、core-site.xml

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
    <description>The name of the default file system. A URI whose scheme
    and authority determine the FileSystem implementation. The uri's scheme
    determines the config property (fs.SCHEME.impl) naming the FileSystem
    implementation class. The uri's authority is used to determine the
    host, port, etc. for a filesystem.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
  <!-- i/o properties -->
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
    <description>The size of buffer for use in sequence files. The size of
    this buffer should probably be a multiple of hardware page size (4096
    on Intel x86), and it determines how much data is buffered during read
    and write operations.</description>
  </property>
</configuration>
```
(3)、hdfs-site.xml
```xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/hadoop/hadoop-2.5.2/hdfs/name</value>
    <description>Determines where on the local filesystem the DFS name node
    should store the name table(fsimage). If this is a comma-delimited list
    of directories then the name table is replicated in all of the
    directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hadoop/hadoop-2.5.2/hdfs/data</value>
    <description>Determines where on the local filesystem an DFS data node
    should store its blocks. If this is a comma-delimited list of
    directories, then data will be stored in all named directories,
    typically on different devices. Directories that do not exist are
    ignored.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Default block replication. The actual number of
    replications can be specified when the file is created. The default is
    used if replication is not specified in create time.</description>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
    <description>The default block size for new files, in bytes. You can
    use the following suffix (case insensitive): k(kilo), m(mega), g(giga),
    t(tera), p(peta), e(exa) to specify the size (such as 128k, 512m, 1g,
    etc.), Or provide complete size in bytes (such as 134217728 for 128
    MB).</description>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>10</value>
    <description>The number of server threads for the namenode.</description>
  </property>
</configuration>
```
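The name/data/tmp directories referenced in core-site.xml and hdfs-site.xml are best created up front; a minimal sketch, with paths taken from the configs above:

```bash
# name dir is used on master, data dir on the slaves, tmp everywhere;
# creating all three on every node is harmless
mkdir -p /data/hadoop/hadoop-2.5.2/hdfs/name
mkdir -p /data/hadoop/hadoop-2.5.2/hdfs/data
mkdir -p /data/hadoop/tmp
```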
(4)、mapred-site.xml
```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>The runtime framework for executing MapReduce jobs.
    Can be one of local, classic or yarn.</description>
  </property>
  <!-- jobhistory properties -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
    <description>MapReduce JobHistory Server IPC host:port</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
    <description>MapReduce JobHistory Server Web UI host:port</description>
  </property>
</configuration>
```
(5)、yarn-site.xml
```xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <description>The hostname of the RM.</description>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
    <description>The address of the applications manager interface in the
    RM.</description>
  </property>
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8031</value>
  </property>
  <property>
    <description>The address of the RM admin interface.</description>
    <name>yarn.resourcemanager.admin.address</name>
    <value>${yarn.resourcemanager.hostname}:8033</value>
  </property>
  <property>
    <description>The http address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
  </property>
  <property>
    <description>The minimum allocation for every container request at the
    RM, in MBs. Memory requests lower than this won't take effect, and the
    specified value will get allocated at minimum. default is 1024</description>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
  </property>
  <property>
    <description>The maximum allocation for every container request at the
    RM, in MBs. Memory requests higher than this won't take effect, and
    will get capped to this value. default value is 8192</description>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
  </property>
  <property>
    <description>Amount of physical memory, in MB, that can be allocated
    for containers. default value is 8192</description>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
  <property>
    <description>Whether to enable log aggregation. Log aggregation
    collects each container's logs and moves these logs onto a file-system,
    for e.g. HDFS, after the application completes. Users can configure the
    "yarn.nodemanager.remote-app-log-dir" and
    "yarn.nodemanager.remote-app-log-dir-suffix" properties to determine
    where these logs are moved to. Users can access the logs via the
    Application Timeline Server.</description>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
</configuration>
```
(6)、slaves

```
slave1
slave2
```
Many of the values in the configuration files above are Hadoop defaults, included here only to make things explicit; when configuring, any value identical to the default can be omitted.
That completes the Hadoop configuration files. Next we need to configure Hadoop's environment variables; the Java variables from earlier live in the same file, as follows:
/etc/profile
```
#set java_env
export JAVA_HOME=/usr/java/jdk1.7.0_67
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JRE_HOME=/usr/java/jdk1.7.0_67/jre
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

###set hadoop_env
export HADOOP_HOME=/data/hadoop/hadoop-2.5.2
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
```
Again, run `source /etc/profile`.
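A quick sanity check that the new variables took effect:

```bash
which hadoop        # should resolve under /data/hadoop/hadoop-2.5.2/bin
hadoop version      # should report Hadoop 2.5.2
```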
One related hdfs-site.xml property worth knowing about:

```xml
<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value></value>
  <description>
    The actual address the RPC server will bind to. If this optional
    address is set, it overrides only the hostname portion of
    dfs.namenode.rpc-address. It can also be specified per name node or
    name service for HA/Federation. This is useful for making the name node
    listen on all interfaces by setting it to 0.0.0.0.
  </description>
</property>
```
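To bring dfs up, the usual sequence on the master is the standard pair of commands (a sketch, assuming a first-time start):

```bash
# one-time format of the namenode, before the very first start only
hdfs namenode -format
# start NameNode/SecondaryNameNode on master and DataNodes on the slaves
# (the slaves file drives which hosts get a DataNode)
start-dfs.sh
```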
```
[root@master hadoop]# jps
2630 Jps
1955 SecondaryNameNode
1785 NameNode

[root@slave1 ~]# jps
1942 Jps
1596 DataNode
```
If jps on the master shows the two processes above, and jps on a slave shows a DataNode, then dfs has started successfully (you should still check the logs on the slave nodes; any errors there will need investigating).
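Then start YARN from the master; a standard sketch:

```bash
# starts the ResourceManager here and a NodeManager on each host in the slaves file
start-yarn.sh
```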
```
[root@master hadoop]# jps
2630 Jps
1955 SecondaryNameNode
1785 NameNode
2316 ResourceManager

[root@slave1 ~]# jps
1942 Jps
1596 DataNode
1774 NodeManager
```
After a successful start, the master has gained a ResourceManager process and each slave a NodeManager.
The following command starts the MapReduce JobHistory Server (no need to start it if you don't want to view job history); jps will then show one more process, JobHistoryServer:

```
./mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
```
| Daemon | Web Interface | Notes |
|---|---|---|
| NameNode | http://nn_host:port/ | Default HTTP port is 50070. |
| ResourceManager | http://rm_host:port/ | Default HTTP port is 8088. |
| MapReduce JobHistory Server | http://jhs_host:port/ | Default HTTP port is 19888. |
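A quick shell-side check that the web UIs are reachable (hostnames follow the planning above):

```bash
curl -sI http://master:50070/ | head -n 1   # NameNode UI
curl -sI http://master:8088/  | head -n 1   # ResourceManager UI
curl -sI http://master:19888/ | head -n 1   # JobHistory UI (only if started)
```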
If a NodeManager on a slave fails to come up with an error like this:

```
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From slave1/192.168.79.131 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
```

it usually means that yarn-site.xml on that slave does not set yarn.resourcemanager.hostname, so the NodeManager falls back to the default 0.0.0.0 when contacting the resource tracker on port 8031.
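A sketch of the usual fix, assuming the master's yarn-site.xml is correct: push it to the slaves and restart YARN (the scp paths expand locally, which works here because the layout is identical on every node):

```bash
# distribute the config from master, then bounce yarn
scp $HADOOP_CONF_DIR/yarn-site.xml root@slave1:$HADOOP_CONF_DIR/
scp $HADOOP_CONF_DIR/yarn-site.xml root@slave2:$HADOOP_CONF_DIR/
stop-yarn.sh && start-yarn.sh
```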