Hadoop 2.6 + Hive 1.2.1 + Spark 1.4.1 (1)

Notes:

hadoop-2.6.0-src.tar.gz is the source archive.
You can import it into Eclipse to study the source, or build and package it with Maven. hadoop-2.6.0.tar.gz is the official release archive and can be used directly.
Note, however, that the release downloaded from the official site ships only x86 (32-bit) native libraries; for x64 you need to rebuild with Maven.
*.mds files are descriptors recording the archive's MD5, SHA1 and other checksum information.


1) Configure the server's hostname / hosts mapping

[jiangzl@master hadoop]$ cat /etc/hostname

master

[jiangzl@master hadoop]$ cat /etc/hosts

192.168.1.114 master

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

[jiangzl@master hadoop]$
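
On CentOS 7 the hostname can also be set without editing the file by hand. A minimal sketch, assuming the desired hostname is master as above:

hostnamectl set-hostname master   # write the static hostname (no reboot needed)
hostnamectl status                # confirm the change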

 

2) Configure the hosts file on the local machine you connect from

     (Mac)

bogon:~ jiangzl$ vim /etc/hosts

bogon:~ jiangzl$ cat /etc/hosts

##

# Host Database

#

# localhost is used to configure the loopback interface

# when the system is booting.  Do not change this entry.

##

127.0.0.1   localhost   bogon

255.255.255.255   broadcasthost

::1             localhost

192.168.1.114 master

    (Windows 7)

On Windows, add the hostname-to-IP mapping in:

C:\Windows\System32\drivers\etc\hosts

192.168.1.114 master

3) Disable the CentOS firewall:

CentOS 7.0 uses firewalld as its firewall by default; here we turn firewalld off and switch to iptables.

firewalld:

systemctl status firewalld.service  # check firewalld status

 

[jiangzl@localhost ~]$ systemctl status firewalld.service

firewalld.service - firewalld - dynamic firewall daemon

   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled)

   Active: active (running) since Thu 2015-09-24 13:30:25 CST; 2 weeks 6 days ago

 Main PID: 879 (firewalld)

   CGroup: /system.slice/firewalld.service

           └─879 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid

 

systemctl start firewalld.service  # start firewalld

 

systemctl stop firewalld.service   # stop firewalld

[jiangzl@localhost ~]$ systemctl status firewalld.service

firewalld.service - firewalld - dynamic firewall daemon

   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled)

   Active: inactive (dead) since Wed 2015-10-14 22:05:20 CST; 4min 12s ago

  Process: 879 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)

 Main PID: 879 (code=exited, status=0/SUCCESS)

 

systemctl disable firewalld.service  # prevent firewalld from starting at boot

[jiangzl@localhost ~]$ systemctl status firewalld.service     

firewalld.service - firewalld - dynamic firewall daemon

   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled)

   Active: inactive (dead)
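
If a firewall is still wanted after firewalld is disabled, the classic iptables service can take its place. A minimal sketch for CentOS 7, assuming the iptables-services package is available from your yum repositories:

yum install -y iptables-services   # the traditional iptables service scripts
systemctl start iptables           # start iptables now
systemctl enable iptables          # and at every boot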

4) Passwordless SSH login


(1) Run ssh-keygen -t rsa (then press Enter through every prompt); the key pair is written to ~/.ssh/

(2) Run cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys to create the authorization file

(3) Verify: ssh master (i.e. ssh <hostname>) should log in without asking for a password
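
If the login still prompts for a password, the usual culprit is file permissions: sshd ignores authorized_keys unless ~/.ssh and the file itself are private. A minimal sketch of the whole setup, assuming an RSA key with an empty passphrase:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa        # generate the key pair non-interactively
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys # authorize the public key
chmod 700 ~/.ssh                                # tighten permissions so sshd accepts the key
chmod 600 ~/.ssh/authorized_keys
ssh master                                      # should now log in without a password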

5) Configure the Hadoop/Java environment variables

[jiangzl@localhost ~]$ vim .bash_profile

 

#hadoop

export HADOOP_PREFIX=/home/jiangzl/work/hadoop

export HADOOP_HOME=/home/jiangzl/work/hadoop

export PATH=$PATH:$HADOOP_PREFIX/bin

 

# others

export JAVA_HOME=/home/jiangzl/work/jdk

 

export PATH=$PATH:$JAVA_HOME/bin

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

 

[jiangzl@localhost ~]$ source .bash_profile

 

[jiangzl@master work]$ java -version

java version "1.7.0_79"

Java(TM) SE Runtime Environment (build 1.7.0_79-b15)

Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)

[jiangzl@master work]$
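
The Hadoop side of the PATH can be checked the same way; hadoop version prints the release number and build details when the variables are set correctly:

[jiangzl@master work]$ hadoop version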

 

If, after installing the JDK on Linux, you find the system also ships with OpenJDK, simply uninstall OpenJDK.

Steps:

http://jingyan.baidu.com/article/73c3ce28f0f68fe50343d9e1.html
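
A minimal sketch of the removal on an RPM-based system such as CentOS (the package names below assume OpenJDK 7; adjust them to whatever the query actually lists):

rpm -qa | grep -i openjdk            # list the installed OpenJDK packages
yum -y remove java-1.7.0-openjdk\*   # remove them by name pattern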

6) Edit the Hadoop configuration files

Official documentation:

http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/SingleCluster.html
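
A minimal sketch of the pseudo-distributed configuration from that document, using the master hostname mapped in step 1 (the official page uses localhost; hadoop.tmp.dir is covered in section 8 below):

etc/hadoop/core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Then format the NameNode and start HDFS before running the test below:

bin/hdfs namenode -format   # one-time format of a new filesystem
sbin/start-dfs.sh           # start the NameNode, DataNode and SecondaryNameNode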

Smoke test:

bin/hdfs dfs -mkdir -p input          # create the input directory in HDFS

bin/hdfs dfs -put LICENSE.txt input   # upload a sample file into it

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount input output

bin/hadoop fs -cat output/p*          # print the word-count results

7) Error: JAVA_HOME is not set

[jiangzl@master hadoop]$ sbin/start-dfs.sh

Starting namenodes on [master]

master: Error: JAVA_HOME is not set and could not be found.

localhost: Error: JAVA_HOME is not set and could not be found.

Starting secondary namenodes [0.0.0.0]

0.0.0.0: Error: JAVA_HOME is not set and could not be found.

[jiangzl@master hadoop]$

 

After installing Hadoop, startup fails with Error: JAVA_HOME is not set and could not be found.

Solution:

    Set JAVA_HOME in etc/hadoop/hadoop-env.sh.

    Use an absolute path:

    export JAVA_HOME=$JAVA_HOME     # wrong: do not set it this way
    export JAVA_HOME=/usr/java/jdk  # correct: point it at the JDK's absolute path
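
A minimal sketch of applying that fix non-interactively, assuming the JDK from step 5 lives at /home/jiangzl/work/jdk (adjust the path to your install):

sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/home/jiangzl/work/jdk|' etc/hadoop/hadoop-env.sh
grep '^export JAVA_HOME' etc/hadoop/hadoop-env.sh   # confirm the change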

8) Fixing a NameNode that will not start

After a shutdown in Hadoop pseudo-distributed mode, connections to the HDFS port fail; here is the fix.

The client keeps reporting exceptions like the following:

13/07/24 09:14:24 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

13/07/24 09:14:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

Solution:

1. First delete everything under /tmp/hadoop-username/dfs/, then reformat the NameNode.

2. The hadoop.tmp.dir property is the base setting the Hadoop filesystem depends on; many other paths are derived from it. It defaults to /tmp/hadoop-${user}, and the same directory is created both locally and in HDFS. Storing it under /tmp is unsafe: a single Linux reboot can delete the files, leaving the NameNode unable to start. Therefore add the following to core-site.xml:

<property>

  <name>hadoop.tmp.dir</name>

  <value>/home/jiangzl/work/hadoop/tmp</value>

</property>

3. Remember to run stop-all.sh before shutting the machine down.
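
A minimal sketch of the full recovery under the assumptions above (pseudo-distributed mode, data still under the default /tmp location; note that reformatting erases everything stored in HDFS):

sbin/stop-dfs.sh                     # make sure nothing is still running
rm -rf /tmp/hadoop-$(whoami)/dfs/*   # clear the stale NameNode/DataNode data
bin/hdfs namenode -format            # reformat the NameNode
sbin/start-dfs.sh                    # bring HDFS back up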


Free Hive video materials:

Link: http://pan.baidu.com/s/1jGKZKSe  Password: 1zty

From the Hive course, the video hive視頻-HIVE(1)安裝mysql部分.avi also walks through installing Hadoop from the official site.
