Big data has been getting a lot of attention lately, so I wanted to learn it as well. I installed Ubuntu Server in a virtual machine and then installed Hadoop on top of it.
The installation steps are as follows:
On a fresh machine, Java is not installed by default. Run the java -version command to check whether a Java version is reported; if Java is not installed, run the following commands:
# Update the source list
$ sudo apt-get update
# The OpenJDK project is the default version of Java
# that is provided from a supported Ubuntu repository.
$ sudo apt-get install default-jdk
$ java -version
Next, create a dedicated group and user for Hadoop:
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
Then install SSH and generate a key pair for hduser:
$ sudo apt-get install ssh
$ su hduser
$ ssh-keygen -t rsa -P ""
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Next, run the ssh command to test that the login works:
$ ssh localhost
First we need to download and extract the Hadoop release. Run:
$ wget http://apache.spinellicreations.com/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
This URL points to Hadoop 2.6.0, the latest release at the time of writing. Before installing, check the official website for the version you need and substitute the URL accordingly.
Once the download finishes, extract it:
$ tar xvzf hadoop-2.6.0.tar.gz
Then move the Hadoop folder to its new location and give the hduser account ownership of it:
$ sudo mv hadoop-2.6.0 /usr/local/hadoop
$ cd /usr/local
$ sudo chown -R hduser:hadoop hadoop
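To confirm the ownership change took effect (a quick sanity check, not part of the original steps):
$ ls -ld /usr/local/hadoop
The owner and group in the output should now read hduser and hadoop.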
At this point we can connect to Ubuntu over SSH using PuTTY. Switch the current user to hduser and proceed as follows.
First, run this command to find the Java path:
$ update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
Nothing to configure.
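Equivalently, a one-liner can derive the same path from the java symlink (a convenience sketch; it assumes the OpenJDK layout shown above):
$ readlink -f /usr/bin/java | sed 's:/jre/bin/java::'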
The JAVA_HOME we need here is /usr/lib/jvm/java-7-openjdk-amd64 (note: without the trailing /jre/bin/java part). Then edit ~/.bashrc with vi and append the following:
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
#HADOOP VARIABLES END
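After saving ~/.bashrc, the new variables can be loaded into the current shell without logging out (assuming bash):
$ source ~/.bashrc
$ echo $HADOOP_INSTALL
The echo should print /usr/local/hadoop; if it prints nothing, recheck the lines added above.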
Next, edit /usr/local/hadoop/etc/hadoop/hadoop-env.sh: find the JAVA_HOME line and change it to:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
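If you prefer not to open vi, a sed one-liner (an alternative sketch, not from the original steps) makes the same change in place:
$ sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64|' /usr/local/hadoop/etc/hadoop/hadoop-env.sh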
Before editing the next configuration file, we need to create a directory as the superuser and grant hduser ownership of it:
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp
Now switch back to the hduser account and edit the configuration file at /usr/local/hadoop/etc/hadoop/core-site.xml with vi, changing the configuration to:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose
    scheme and authority determine the FileSystem implementation. The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class. The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>
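A side note: in Hadoop 2.x, fs.default.name still works but is deprecated in favor of fs.defaultFS; the equivalent modern property would be:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:54310</value>
</property>
Either spelling points clients at the same NameNode address.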
By default only /usr/local/hadoop/etc/hadoop/mapred-site.xml.template exists; we need to copy it to a new file first and then edit that copy.
$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
Open it with vi and edit the configuration as follows:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
  </property>
</configuration>
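Worth knowing: mapred.job.tracker is a Hadoop 1.x (JobTracker) property. On Hadoop 2.x one would normally run MapReduce on YARN instead, which (for reference only; it is not required for this walkthrough) looks like:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
Running MapReduce on YARN also requires yarn.nodemanager.aux-services to be set to mapreduce_shuffle in yarn-site.xml.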
Before the next edit, again switch to the superuser account and create the directories that will be needed:
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
$ sudo chown -R hduser:hadoop /usr/local/hadoop_store
Then switch back to hduser and change the configuration file /usr/local/hadoop/etc/hadoop/hdfs-site.xml to:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
With the configuration in place, format the HDFS filesystem:
$ hadoop namenode -format
If the hadoop command is not recognized, the environment variables have not been loaded yet; the simplest fix is to log out and log back in as hduser. Be aware that this command deletes all existing data in HDFS, so use it with caution if you already have data stored.
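Incidentally, the hadoop namenode form is deprecated in Hadoop 2.x; the preferred equivalent is:
$ hdfs namenode -format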
Before starting Hadoop, make sure passwordless SSH is enabled (this repeats the key setup from earlier; skip it if it is already done). Otherwise the startup scripts will keep prompting for a password, which is very annoying.
$ ssh-keygen -t rsa
Accept the default file location and generate the key without a passphrase.
$ chmod 755 ~/.ssh
$ cd ~/.ssh
$ cat id_rsa.pub >> authorized_keys
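If ssh localhost still asks for a password after this, it is usually a permissions problem, since sshd refuses keys whose files are too open. A common fix (an extra safeguard, not in the original steps) is:
$ chmod 600 ~/.ssh/authorized_keys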
Now try an SSH connection to the local machine and check that it succeeds without a password prompt:
$ ssh localhost
Next, start the Hadoop services.
Running start-all.sh starts all of Hadoop. To check whether the startup succeeded, run the jps command; output like the following means everything is up:
$ jps
2149 SecondaryNameNode
1805 NameNode
2283 ResourceManager
1930 DataNode
2410 NodeManager
2707 Jps
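A note on start-all.sh: it still works in Hadoop 2.x but is deprecated; the equivalent (using scripts that ship in $HADOOP_INSTALL/sbin, which we already added to PATH above) is to start HDFS and YARN separately:
$ start-dfs.sh
$ start-yarn.sh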
In addition, Hadoop exposes a web UI we can visit.
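For reference (standard defaults, assuming you have not changed the ports), on Hadoop 2.6.0 the NameNode web UI listens at http://localhost:50070 and the YARN ResourceManager at http://localhost:8088.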
To stop the services, run:
$ stop-all.sh
And with that, Hadoop is finally set up successfully in the virtual machine. The whole procedure follows another blog post:
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
I have merely restated the points that need special attention, passing along the favor.