Hadoop can run on a single machine in pseudo-distributed mode, where each Hadoop daemon runs in its own Java process.
Environment:
CentOS: 5.8
Hadoop: 2.2.0
Create a dedicated user. This is not strictly required, but isolating Hadoop under its own user is recommended for security and operational reasons:
- sudo groupadd hadoop
- sudo useradd -g hadoop hadoop
- sudo passwd hadoop
Switch to the hadoop user:
su hadoop
- ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
- cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Test it:
ssh localhost
If no password prompt appears, the setup succeeded.
If a password is still required, see the separate post on this site, "Passwordless SSH Login", and fix the permissions of the relevant directories.
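The usual culprit is permissions: sshd refuses to use keys kept in group- or world-writable locations. A minimal fix, assuming the default key layout produced by the commands above:

```shell
# sshd ignores authorized_keys when the home directory, ~/.ssh, or
# the key files are writable by others, so tighten them all.
mkdir -p ~/.ssh
chmod go-w ~
chmod 700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# Also restrict the private key, if it exists.
if [ -f ~/.ssh/id_dsa ]; then chmod 600 ~/.ssh/id_dsa; fi
```

After this, `ssh localhost` should log in without a password prompt.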
Download hadoop-2.2.0.tar.gz from the official Apache site, then extract it:
tar -xvzf hadoop-2.2.0.tar.gz -C /var/
cd /var/hadoop-2.2.0/
vim ~/.bashrc
Add:
export HADOOP_PREFIX=/var/hadoop-2.2.0
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
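The exports do not take effect until the file is re-read. A quick sanity check in the current shell (the expected paths assume the /var/hadoop-2.2.0 install location used above):

```shell
# Re-read ~/.bashrc so the new exports take effect, then check
# that the hadoop launcher resolves and reports its version.
source ~/.bashrc
which hadoop     # expect /var/hadoop-2.2.0/bin/hadoop
hadoop version   # first line should read: Hadoop 2.2.0
```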
By default the Hadoop configuration files live under etc/hadoop in the installation directory.
vim hadoop-env.sh
The main setting here is JAVA_HOME; point it at the correct JDK location.
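For example (the JDK path below is illustrative; substitute wherever your JDK is actually installed):

```shell
# In etc/hadoop/hadoop-env.sh -- hypothetical JDK path, adjust to
# match your installation.
export JAVA_HOME=/usr/java/jdk1.7.0_45
```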
vim core-site.xml
- <configuration>
- <property>
- <name>fs.defaultFS</name>
- <value>hdfs://localhost</value>
- </property>
- </configuration>
vim hdfs-site.xml
- <configuration>
- <property>
- <name>dfs.datanode.data.dir</name>
- <value>file:///home/hadoop/hdfs/datanode</value>
- </property>
- <property>
- <name>dfs.namenode.name.dir</name>
- <value>file:///home/hadoop/hdfs/namenode</value>
- </property>
- <property>
- <name>dfs.namenode.checkpoint.dir</name>
- <value>file:///home/hadoop/hdfs/namesecondary</value>
- </property>
- <property>
- <name>dfs.replication</name>
- <value>1</value>
- </property>
- </configuration>
Hadoop creates these directories automatically if they do not exist.
vim yarn-site.xml
Add:
- <property>
- <name>yarn.nodemanager.aux-services</name>
- <value>mapreduce_shuffle</value>
- </property>
mv mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
Add:
- <property>
- <name>mapreduce.framework.name</name>
- <value>yarn</value>
- </property>
hdfs namenode -format
start-dfs.sh
start-yarn.sh
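A quick way to confirm the daemons came up is jps from the JDK; a healthy pseudo-distributed setup shows one JVM per daemon:

```shell
# List running JVMs; in pseudo-distributed mode you should see
# NameNode, DataNode, SecondaryNameNode, ResourceManager and
# NodeManager (plus Jps itself).
jps
```

The NameNode web UI is also available at http://localhost:50070 and the ResourceManager UI at http://localhost:8088.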
If you reference the Hadoop jars through Maven, pay attention to whether the cluster runs Hadoop 1.x or 2.x;
a mismatch produces errors like "Server IPC version 7 cannot communicate with client version 4".
For a Hadoop 1 cluster, add a dependency along these lines to pom.xml:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>1.2.0</version>
</dependency>
Note that this artifact is not complete on its own; writing MapReduce jobs or HDFS client code requires additional dependencies.
For Hadoop 2, add a dependency along these lines:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.4.0</version>
</dependency>
This artifact pulls in essentially all of the related jars.
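One way to confirm what actually lands on the classpath is Maven's dependency tree (run from the project root; the -Dincludes filter narrows the output to Hadoop artifacts):

```shell
# Show the resolved Hadoop artifacts pulled in by hadoop-client.
mvn dependency:tree -Dincludes=org.apache.hadoop
```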