Installing single-node Hadoop, Hive & HUE on Ubuntu 16.04

Hardware Preparation

  1. Install VirtualBox on Windows 10, using bridged networking on the wireless adapter

    The VM's address is 192.168.1.188

    cat /etc/issue
           Ubuntu 16.04.2 LTS \n \l
  2. Other software already installed (apart from the JDK, none of it is required)

    Redis: /usr/bin/redis-server, redis-cli
    Java: /usr/bin/java
    Sonar: /usr/local/sonar/sonarqube-5.6.6/bin/linux-x86-64/sonar.sh start, listens on port 9000 by default
    MySQL server && client, username/password: root
    PHP 7.0
    Apache 2: apachectl -v reports 2.4.10
    sz/rz file-transfer utilities
    Jenkins: service jenkins start, listens on port 8080 by default, username/password: tongbo

Software Preparation

JDK

root@ubuntu:/usr/bin#  /usr/local/java/jdk1.8.0_121/bin/java -version
    java version "1.8.0_121"
    Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
    root@ubuntu:/usr/bin#
    root@ubuntu:/home/tb# /usr/local/java/jdk1.8.0_121/bin/jps
    2050 Jps
    1533 jenkins.war

Download Hadoop from the official Apache site.

You can upload the tarball with the sz/rz utilities; after extracting, my Hadoop install directory is:

/home/tb/tbdown/hadoop-2.8.2

tar zxf  hadoop-2.8.2.tar.gz

root@ubuntu:/home/tb/tbdown# ls
dump.rdb      hadoop-2.8.2-src         hadoop-2.8.2.tar.gz  nginx-1.8.1.tar.gz
hadoop-2.8.2  hadoop-2.8.2-src.tar.gz  nginx-1.8.1          spider111

Verify that Hadoop is installed correctly

root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# ./bin/hadoop version
Hadoop 2.8.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 66c47f2a01ad9637879e95f80c41f798373828fb
Compiled by jdu on 2017-10-19T20:39Z
Compiled with protoc 2.5.0
From source with checksum dce55e5afe30c210816b39b631a53b1d
This command was run using /home/tb/tbdown/hadoop-2.8.2/share/hadoop/common/hadoop-common-2.8.2.jar

Modify the Configuration

Note: all of the following operations are performed in the Hadoop install directory (/home/tb/tbdown/hadoop-2.8.2/).

If you need to change the IP configuration, restart networking with /etc/init.d/networking restart

vim /etc/hosts
Add one line:
127.0.0.1 tb001

Next, edit the configuration files under ./etc/hadoop in the Hadoop install directory:

root@ubuntu:/home/tb/tbdown/hadoop-2.8.2/etc/hadoop# ls
capacity-scheduler.xml      hadoop-policy.xml        kms-log4j.properties        ssl-client.xml.example
configuration.xsl           hdfs-site.xml            kms-site.xml                ssl-server.xml.example
container-executor.cfg      httpfs-env.sh            log4j.properties            yarn-env.cmd
core-site.xml               httpfs-log4j.properties  mapred-env.cmd              yarn-env.sh
hadoop-env.cmd              httpfs-signature.secret  mapred-env.sh               yarn-site.xml
hadoop-env.sh               httpfs-site.xml          mapred-queues.xml.template
hadoop-metrics2.properties  kms-acls.xml             mapred-site.xml.template
hadoop-metrics.properties   kms-env.sh               slaves

Before editing, it is best to back up the original configuration files.

vim hadoop-env.sh

Change the export JAVA_HOME line near the top. If a global Java is already configured, no change is needed; otherwise, point JAVA_HOME at your JDK install path.
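For example, with the JDK path used earlier in this article, the line becomes:

export JAVA_HOME=/usr/local/java/jdk1.8.0_121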

Copy the sample (template) config file to create your own; here we set the MapReduce framework to YARN, Hadoop's resource-management system.

root@ubuntu:/home/tb/tbdown/hadoop-2.8.2/etc/hadoop# cp mapred-site.xml.template mapred-site.xml



vim mapred-site.xml

<configuration>
 <property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
 </property>
</configuration>


vim core-site.xml

If you did not add the hosts entry above, the tb001 below can simply be localhost. fs.defaultFS sets Hadoop's default filesystem, which is HDFS; clients use this host and port (8020) to connect to the NameNode service, and the HDFS daemons also use this property to determine their host and port. (Note: the value as typed below contains a doubled colon, hdfs:://, which comes back to bite us in the debugging section.)

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs:://tb001:8020</value>
    </property>
</configuration>
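To double-check which value Hadoop actually picks up, hdfs getconf can read a single key back (a quick sanity check, not part of the original walkthrough):

./bin/hdfs getconf -confKey fs.defaultFS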

vim hdfs-site.xml

The first property is the replication factor. Since this is a single machine, set it to 1 for now. The default is 3, but in a pseudo-distributed setup that causes errors when blocks are replicated, because a block cannot be copied to three DataNodes.
You can read each file's replication count back with the hadoop fs -ls command, as shown after the snippet below.
The second and third properties configure two directories, which Hadoop creates automatically at startup. They default to /tmp/; if this is a virtual machine, be sure to set a path outside /tmp.

<configuration>
 <property>
  <name>dfs.replication</name>
  <value>1</value>
 </property>
 <property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/tb/hadoop/dfs/name</value>
 </property>
 <property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/tb/hadoop/dfs/data</value>
 </property>
</configuration>
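Once HDFS is running, the second column of an ls listing shows each file's replication count, and -stat can print it directly (illustrative path, not from the original session):

./bin/hadoop fs -ls /
./bin/hadoop fs -stat %r /README.txt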


vim yarn-site.xml

The aux-services value must be mapreduce_shuffle so NodeManagers can serve shuffle data to MapReduce jobs:

<configuration>
 <property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
 </property>
</configuration>



vim slaves

For a single machine, either localhost or the hostname you mapped in /etc/hosts will work.
The default is localhost.
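For reference, the file as shipped contains exactly one line:

cat etc/hadoop/slaves
localhost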

Start the Services

Start the NameNode. Before the first start, format the filesystem. Run the format only the first time; running it again later will reformat and wipe all data.

root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# ./bin/hadoop namenode -format

After formatting, the directory you configured in hdfs-site.xml (the dfs.namenode.name.dir path) changes as follows: a dfs tree now exists under it.
(Starting the DataNode creates the directory from the second setting; see below.)

root@ubuntu:/home/tb/hadoop# pwd
/home/tb/hadoop
root@ubuntu:/home/tb/hadoop# tree ./
./
└── dfs
    └── name
        └── current
            ├── fsimage_0000000000000000000
            ├── fsimage_0000000000000000000.md5
            ├── seen_txid
            └── VERSION

3 directories, 4 files

Start the NameNode:

root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# ./sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/tb/tbdown/hadoop-2.8.2/logs/hadoop-root-namenode-ubuntu.out
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2#

How do we know whether the NameNode started successfully? In the jps output below there is no NameNode process, so it did not. Let's debug the error.

/usr/local/java/jdk1.8.0_121/bin/jps
1533 jenkins.war
3215 Jps

Debugging

When a start fails, check the logs:


root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# cd logs/
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2/logs# ls
hadoop-root-namenode-ubuntu.log  hadoop-root-namenode-ubuntu.out  SecurityAuth-root.audit
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2/logs# tail -f hadoop-root-namenode-ubuntu.log
  at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:682)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:905)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:884)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1610)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1678)
2017-11-04 16:42:22,937 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-11-04 16:42:22,939 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/

The root cause:

java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): hdfs:://tb001:8020 has no authority.

At first glance that reads like a permissions problem, but "has no authority" actually refers to the URI's authority component, i.e. the host:port part: the value hdfs:://tb001:8020 has a doubled colon after the scheme, so the authority parses as empty. The fix is to change fs.defaultFS in core-site.xml to hdfs://tb001:8020. While we are at it, also set up passwordless SSH [see the official docs at the end]; the start scripts rely on SSH so the master node can log in to and manage the worker nodes without logging in to each one by hand:

apt-get install ssh
sudo apt-get install pdsh
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
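Verify that passwordless login now works (the same check the official docs suggest):

ssh localhost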

With the config fixed, starting the NameNode again works:

root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# ./sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/tb/tbdown/hadoop-2.8.2/logs/hadoop-root-namenode-ubuntu.out
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# /usr/local/java/jdk1.8.0_121/bin/jps
6658 NameNode
6731 Jps
1550 jenkins.war

The NameNode is up; let's start the rest.

Of course, you can also start the HDFS NameNode and DataNode together; just run sbin/start-dfs.sh:

root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# sbin/start-dfs.sh
Starting namenodes on [tb001]
tb001: namenode running as process 6658. Stop it first.
localhost: starting datanode, logging to /home/tb/tbdown/hadoop-2.8.2/logs/hadoop-root-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/tb/tbdown/hadoop-2.8.2/logs/hadoop-root-secondarynamenode-ubuntu.out
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# /usr/local/java/jdk1.8.0_121/bin/jps
6658 NameNode
7235 Jps
7124 SecondaryNameNode
6942 DataNode
1550 jenkins.war
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2#

Stopping


root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# sbin/stop-dfs.sh
Stopping namenodes on [tb001]
tb001: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2#

Start Hive

root@ubuntu:/usr/local/apache-hive-2.2.0-bin/bin# ./hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.2.0-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/tb/tbdown/hadoop-2.8.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/usr/local/apache-hive-2.2.0-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
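As a quick smoke test at the prompt (my addition, not from the original session), list the databases; a fresh install contains only the built-in default database:

hive> show databases;
OK
default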

Start HiveServer2 (for HUE to connect to)

root@ubuntu:/home/tb/tbdown/hue# hive --service hiveserver2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.2.0-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/tb/tbdown/hadoop-2.8.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Check that it is listening:
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2/etc/hadoop# netstat -anp | grep 10000
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 5030/java
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2/etc/hadoop#
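With HiveServer2 listening on port 10000, you can also connect with Beeline, the JDBC client that ships with Hive (a sketch, not part of the original article):

/usr/local/apache-hive-2.2.0-bin/bin/beeline -u jdbc:hive2://localhost:10000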

Start YARN

root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/tb/tbdown/hadoop-2.8.2/logs/yarn-root-resourcemanager-ubuntu.out
localhost: starting nodemanager, logging to /home/tb/tbdown/hadoop-2.8.2/logs/yarn-root-nodemanager-ubuntu.out
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2# /usr/local/java/jdk1.8.0_121/bin/jps
9252 SecondaryNameNode
8918 NameNode
10039 Jps
9066 DataNode
1550 jenkins.war
9631 ResourceManager
root@ubuntu:/home/tb/tbdown/hadoop-2.8.2#
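With HDFS and YARN both up, the bundled example job makes a handy end-to-end smoke test (my addition, not part of the original session); it should appear in the YARN UI linked below while it runs:

./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.2.jar pi 2 5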

Verify in the Browser

YARN: http://192.168.1.188:8088/cluster

[screenshot: YARN cluster overview]

HDFS: http://192.168.1.188:50070/dfshealth.html#tab-overview

[screenshot: HDFS NameNode overview]

Install HUE

HUE on GitHub
HUE is written in Python, so the Python dev packages need to be installed:

apt-get update
apt-get install python-dev
git clone https://github.com/cloudera/hue.git
cd hue

Run make apps:

cd /home/tb/tbdown/hue/maven && mvn install
/bin/bash: mvn: command not found
Makefile:122: recipe for target 'parent-pom' failed
make: *** [parent-pom] Error 127
root@ubuntu:/home/tb/tbdown/hue#

Looks like mvn is not installed:

mvn -v
The program 'mvn' is currently not installed. You can install it by typing:
apt install maven

Install it:

apt install maven

Run mvn -v again:

Apache Maven 3.3.9
Maven home: /usr/share/maven
Java version: 1.8.0_121, vendor: Oracle Corporation
Java home: /usr/local/java/jdk1.8.0_121/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "4.4.0-62-generic", arch: "amd64", family: "unix"

Run make apps again:

fatal error: libxml/xmlversion.h: No such file or directory

The fix: install the missing development packages: apt-get install libxml2-dev libxslt1-dev python-dev

Then run make apps once more:

Damn, another error. The relevant part: ...7/src/kerberos.o sh: 1: krb5-conf..

Why is this so hard? Back to the documentation, which spells out perfectly clearly what must be installed before building HUE:

You'll need these library development packages and tools installed on your system:

That's what I get for not reading the docs properly!

https://github.com/cloudera/h...

Then run make apps yet again:

Ten minutes later, it succeeds:

...
426 static files copied to '/home/tb/tbdown/hue/build/static', 1426 post-processed.
make[1]: Leaving directory '/home/tb/tbdown/hue/apps'
...

Then run:
build/env/bin/hue runserver
By default the server is reachable only from the local machine; for external access:
build/env/bin/hue runserver 0.0.0.0:8000

Startup succeeds:

[19/Jan/2018 00:27:26 +0000] __init__     INFO     Couldn't import snappy. Support for snappy compression disabled.
0 errors found
January 19, 2018 - 00:27:26
Django version 1.6.10, using settings 'desktop.settings'
Starting development server at http://0.0.0.0:8000/
Quit the server with CONTROL-C.

HUE in action:

[screenshot: HUE web UI]

Further Reading and References

Official installation guide

Automated deployment tools

Ambari
Minos (Xiaomi's open-source Hadoop deployment tool)
Cloudera Manager (paid)

Pre-packaged, compatibility-tested distributions

HDP
CDH4 or CDH5
