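Commonly used Hadoop distributions:
Distribution | Advantages | Disadvantages |
---|---|---|
Apache | Fully open source | JAR conflicts when integrating different versions or different frameworks |
CDH | Mature management client (Cloudera Manager, CM); one-click installation and upgrades | CM is not open source; differs slightly from the community release |
Hortonworks | Stock Hadoop; fully open source; supports Tez | Enterprise-grade security is not open source |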
Of these, CDH reportedly accounts for 60%-70% of market usage, so this walkthrough uses the CDH release.
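Hadoop-2.6.0-cdh5.11.1 download link
CDH official documentation
CentOS 7 download link
JDK 8 download link (Baidu Netdisk, extraction code dg3v)
When installing CentOS 7, set the hostname to hadoop000 and create a user named hadoop.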
Directory layout

    [hadoop@hadoop000 ~]$ pwd
    /home/hadoop
    [hadoop@hadoop000 ~]$ ll
    total 0
    drwxrwxr-x. 5 hadoop hadoop 67 Feb 10 04:07 app       // installation directory for Java, Hadoop, and other software
    drwxrwxr-x. 2 hadoop hadoop 77 Feb 10 00:27 software  // directory for installation packages
Extract the JDK and configure environment variables
Copy the local file to the Linux machine with scp

    scp jdk-8u241-linux-x64.tar.gz hadoop@192.168.7.83:~/software/

Extract the JDK

    tar -zxvf ~/software/jdk-8u241-linux-x64.tar.gz -C ~/app/
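Configure the environment variables

    vi ~/.bash_profile

    # add these two lines to the file
    export JAVA_HOME=/home/hadoop/app/jdk1.8.0_241
    export PATH=$JAVA_HOME/bin:$PATH

    # reload the profile in the current shell
    source ~/.bash_profile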
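Editing the profile by hand works fine, but if the steps get scripted it helps to make them repeatable. Below is a minimal sketch of one way to do that; `append_once` and the `PROFILE` variable are names made up for this example, and by default the script writes to a scratch file rather than the real `~/.bash_profile`.

```shell
#!/usr/bin/env bash
# append_once is a hypothetical helper (not part of any tool):
# it appends a line to a file only if that exact line is not already there.
append_once() {
  local file=$1 line=$2
  touch "$file"
  # -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Target file. Defaults to a scratch file so the sketch is safe to try;
# set PROFILE=~/.bash_profile to apply it for real.
PROFILE=${PROFILE:-$(mktemp)}

append_once "$PROFILE" 'export JAVA_HOME=/home/hadoop/app/jdk1.8.0_241'
append_once "$PROFILE" 'export PATH=$JAVA_HOME/bin:$PATH'
```

Running the script a second time leaves the file unchanged, so the `export` lines are never duplicated.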
Verify

    [hadoop@hadoop000 ~]$ java -version
    java version "1.8.0_241"
    Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
    Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)
Configure passwordless SSH login

    ssh-keygen -t rsa    # press Enter at every prompt

    [hadoop@hadoop000 ~]$ cd ~/.ssh
    [hadoop@hadoop000 .ssh]$ ll
    total 16
    -rw-------. 1 hadoop hadoop 1675 Feb 10 03:52 id_rsa       // private key
    -rw-r--r--. 1 hadoop hadoop  398 Feb 10 03:52 id_rsa.pub   // public key
    -rw-r--r--. 1 hadoop hadoop  376 Feb 18 01:28 known_hosts

    cat id_rsa.pub >> authorized_keys
    chmod 600 authorized_keys
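Running `cat ... >> authorized_keys` more than once adds the same key repeatedly, and sshd can reject keys when the file or directory permissions are too loose. The sketch below wraps both concerns in one function; `install_pubkey` is a hypothetical helper name, and both paths are passed in as arguments so it can be tried on a scratch directory first.

```shell
# install_pubkey is a hypothetical helper: it authorizes a public key once
# and applies the permissions sshd expects on the directory and the file.
install_pubkey() {
  local ssh_dir=$1 pub=$2
  local auth="$ssh_dir/authorized_keys"
  mkdir -p "$ssh_dir"
  touch "$auth"
  # Append the key only if the identical line is not already present.
  grep -qxF -- "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
  chmod 700 "$ssh_dir"
  chmod 600 "$auth"
}
```

On the real machine this would be called as `install_pubkey ~/.ssh ~/.ssh/id_rsa.pub`. Afterwards `ssh -o BatchMode=yes hadoop000 true` should return without asking for a password.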
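Extract Hadoop

    tar -zxvf ~/software/hadoop-2.6.0-cdh5.11.1.tar.gz -C ~/app/

Configure the Hadoop environment variables (add them below the JDK entries)

    vi ~/.bash_profile

    # add these two lines to the file
    export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.11.1
    export PATH=$HADOOP_HOME/bin:$PATH

    # reload the profile in the current shell
    source ~/.bash_profile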
Hadoop directory layout

    [hadoop@hadoop000 software]$ cd ~/app/hadoop-2.6.0-cdh5.11.1/
    [hadoop@hadoop000 hadoop-2.6.0-cdh5.11.1]$ ll
    total 116
    drwxr-xr-x.  2 hadoop hadoop   137 Jun  1  2017 bin                  // Hadoop client commands
    drwxr-xr-x.  2 hadoop hadoop   166 Jun  1  2017 bin-mapreduce1
    drwxr-xr-x.  3 hadoop hadoop  4096 Jun  1  2017 cloudera
    drwxr-xr-x.  6 hadoop hadoop   109 Jun  1  2017 etc                  // Hadoop configuration files
    drwxr-xr-x.  5 hadoop hadoop    43 Jun  1  2017 examples
    drwxr-xr-x.  3 hadoop hadoop    28 Jun  1  2017 examples-mapreduce1
    drwxr-xr-x.  2 hadoop hadoop   106 Jun  1  2017 include
    drwxr-xr-x.  3 hadoop hadoop    20 Jun  1  2017 lib
    drwxr-xr-x.  3 hadoop hadoop   261 Jun  1  2017 libexec
    -rw-r--r--.  1 hadoop hadoop 85063 Jun  1  2017 LICENSE.txt
    -rw-r--r--.  1 hadoop hadoop 14978 Jun  1  2017 NOTICE.txt
    -rw-r--r--.  1 hadoop hadoop  1366 Jun  1  2017 README.txt
    drwxr-xr-x.  3 hadoop hadoop  4096 Jun  1  2017 sbin                 // Hadoop start/stop scripts
    drwxr-xr-x.  4 hadoop hadoop    31 Jun  1  2017 share                // examples
    drwxr-xr-x. 18 hadoop hadoop  4096 Jun  1  2017 src
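etc/hadoop/hadoop-env.sh (can be skipped if JAVA_HOME is already set, although setting it here as well is the safer choice: the start scripts launch the daemons over SSH, and a non-login SSH shell may not read ~/.bash_profile)

    export JAVA_HOME=/home/hadoop/app/jdk1.8.0_241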
etc/hadoop/core-site.xml

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop000:8020</value>
    </property>
Create the HDFS storage directory

    mkdir /home/hadoop/app/tmp
etc/hadoop/hdfs-site.xml
On a single-node Hadoop setup, an HDFS replication factor (dfs.replication) of 1 is sufficient.

    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/app/tmp</value>
    </property>
etc/hadoop/slaves
hadoop000
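A typo in a property name is easy to miss, so it can be worth reading the values back before starting anything. The sketch below is a rough text-based check, not a real XML parser: `get_property` is a hypothetical helper that assumes simple `<name>`/`<value>` pairs without XML comments or nested markup, and an awk that accepts a multi-character record separator (gawk and mawk do).

```shell
# get_property FILE NAME
# Hypothetical helper: prints the <value> of the first <property> in FILE
# whose <name> equals NAME. Prints nothing if the property is absent.
get_property() {
  local file=$1 name=$2
  awk -v want="$name" '
    BEGIN { RS = "</property>" }   # one record per property element
    {
      # Pull out the text between <name>...</name>.
      if (match($0, /<name>[^<]*<\/name>/)) {
        n = substr($0, RSTART + 6, RLENGTH - 13)
        # On a name match, pull out the text between <value>...</value>.
        if (n == want && match($0, /<value>[^<]*<\/value>/)) {
          print substr($0, RSTART + 7, RLENGTH - 15)
          exit
        }
      }
    }' "$file"
}
```

For example, `get_property "$HADOOP_HOME/etc/hadoop/core-site.xml" fs.defaultFS` should print the NameNode address configured above. Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves and is the more reliable check.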
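Start HDFS
The file system must be formatted before the very first start. Do not run the command again afterwards, because re-formatting wipes the NameNode metadata.

    hdfs namenode -format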
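Since formatting twice is the one mistake to avoid here, a small guard can help. The sketch below only reports what it finds and never formats anything itself. `needs_format` and the `HADOOP_TMP` variable are names invented for this example, and the check assumes the default layout in which the NameNode directory is `${hadoop.tmp.dir}/dfs/name` and a successful format writes `current/VERSION` there.

```shell
# needs_format succeeds when the NameNode directory has never been formatted.
# Assumption: dfs.namenode.name.dir is left at its default,
# ${hadoop.tmp.dir}/dfs/name, and formatting creates current/VERSION in it.
needs_format() {
  local tmp_dir=$1
  [ ! -f "$tmp_dir/dfs/name/current/VERSION" ]
}

# HADOOP_TMP is a hypothetical variable for this sketch; it should match
# the hadoop.tmp.dir value configured in hdfs-site.xml.
HADOOP_TMP=${HADOOP_TMP:-/home/hadoop/app/tmp}

if needs_format "$HADOOP_TMP"; then
  echo "No NameNode metadata under $HADOOP_TMP; run: hdfs namenode -format"
else
  echo "NameNode already formatted; skipping format"
fi
```

To turn the report into an action, the first `echo` could be replaced with the actual `hdfs namenode -format` call.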
Start and stop the HDFS cluster

    $HADOOP_HOME/sbin/start-dfs.sh
    $HADOOP_HOME/sbin/stop-dfs.sh
Verify:

    [hadoop@hadoop000 bin]$ jps
    3345 DataNode
    3494 SecondaryNameNode
    3597 Jps
    3230 NameNode
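Eyeballing the `jps` output is fine for three daemons, but the same check comes back later for YARN, so it may be worth scripting. A minimal sketch follows; `missing_daemons` is a hypothetical helper that reads `jps` output on standard input and prints the expected daemon names it could not find.

```shell
# missing_daemons NAME...
# Hypothetical helper: reads `jps` output on stdin and prints every
# expected daemon name (given as arguments) that does not appear in it.
missing_daemons() {
  local out name
  out=$(cat)
  for name in "$@"; do
    # jps prints "<pid> <main class>"; compare the class as a whole field
    # so that NameNode does not match SecondaryNameNode.
    printf '%s\n' "$out" \
      | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
      || printf '%s\n' "$name"
  done
}
```

Usage would look like `jps | missing_daemons NameNode DataNode SecondaryNameNode`; no output means all three are running.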
Upload a file to HDFS

    [hadoop@hadoop000 software]$ hadoop fs -ls /
    20/02/18 02:46:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    [hadoop@hadoop000 software]$ hadoop fs -put jdk-8u241-linux-x64.tar.gz /
    20/02/18 02:47:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    [hadoop@hadoop000 software]$ hadoop fs -ls /
    20/02/18 02:47:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Found 1 items
    -rw-r--r--   1 hadoop supergroup  194545143 2020-02-18 02:47 /jdk-8u241-linux-x64.tar.gz
etc/hadoop/mapred-site.xml

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
etc/hadoop/yarn-site.xml

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
Start and stop the YARN cluster

    $HADOOP_HOME/sbin/start-yarn.sh
    $HADOOP_HOME/sbin/stop-yarn.sh
Verify

    [hadoop@hadoop000 hadoop-2.6.0-cdh5.11.1]$ jps
    21042 ResourceManager
    21493 Jps
    4070 NameNode
    4342 SecondaryNameNode
    4190 DataNode
    21198 NodeManager
The complete ~/.bash_profile

    # .bash_profile

    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
        . ~/.bashrc
    fi

    # User specific environment and startup programs
    PATH=$PATH:$HOME/.local/bin:$HOME/bin

    export JAVA_HOME=/home/hadoop/app/jdk1.8.0_241
    export PATH=$JAVA_HOME/bin:$PATH
    export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.11.1
    export PATH=$HADOOP_HOME/bin:$PATH

    export PATH
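Open port 50070 on this machine in a browser (HDFS web UI)
If the page cannot be reached, turn off the firewall

    sudo firewall-cmd --state                   # check the firewall status
    sudo systemctl stop firewalld.service       # stop the firewall
    sudo systemctl disable firewalld.service    # do not start it at boot

Open port 8088 in a browser (YARN web UI)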
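Before touching the firewall it helps to know whether the ports are actually unreachable. The following is a minimal sketch of such a probe, meant to be run from the machine where the browser lives. `port_open` and the `HOST` variable are names made up for this example; the probe relies on bash's `/dev/tcp` pseudo-device and on the coreutils `timeout` command being available.

```shell
# port_open HOST PORT
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT can be
# opened within 2 seconds. Uses bash's /dev/tcp and coreutils `timeout`.
port_open() {
  local host=$1 port=$2
  timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# HOST defaults to the hostname used throughout this post.
HOST=${HOST:-hadoop000}

# 50070 is the HDFS web UI, 8088 the YARN web UI.
for port in 50070 8088; do
  if port_open "$HOST" "$port"; then
    echo "$HOST:$port reachable"
  else
    echo "$HOST:$port not reachable"
  fi
done
```

If a port is reachable from the server itself but not from another machine, the firewall is the likely culprit.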
Common HDFS shell commands

    hadoop fs -ls /
    hadoop fs -put
    hadoop fs -copyFromLocal
    hadoop fs -moveFromLocal
    hadoop fs -cat
    hadoop fs -text
    hadoop fs -get
    hadoop fs -mkdir
    hadoop fs -mv          # move / rename
    hadoop fs -getmerge
    hadoop fs -rm
    hadoop fs -rmdir
    hadoop fs -rm -r
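To see how several of these commands fit together, here is a small round trip: create a directory, upload a file, rename it, read it, download it, and clean up. This is only a sketch. The `/demo` directory, the `renamed.txt` name, and the `run` and `hdfs_round_trip` helpers are all invented for the example, and by default the script runs in dry-run mode and only prints the commands.

```shell
# run prints each command and executes it only when DRY_RUN=0.
# DRY_RUN, run and hdfs_round_trip are illustrative names, not Hadoop features.
DRY_RUN=${DRY_RUN:-1}
run() {
  printf '+ %s\n' "$*"
  [ "$DRY_RUN" = "1" ] || "$@"
}

# hdfs_round_trip LOCAL_FILE
# Walks one file through a hypothetical /demo directory in HDFS.
hdfs_round_trip() {
  local dir=/demo local_file=$1
  local name
  name=$(basename "$local_file")
  run hadoop fs -mkdir -p "$dir"                             # create the directory
  run hadoop fs -put "$local_file" "$dir/"                   # upload
  run hadoop fs -ls "$dir"                                   # list
  run hadoop fs -mv "$dir/$name" "$dir/renamed.txt"          # rename
  run hadoop fs -cat "$dir/renamed.txt"                      # print the contents
  run hadoop fs -get "$dir/renamed.txt" /tmp/renamed.txt     # download
  run hadoop fs -rm -r "$dir"                                # remove recursively
}

hdfs_round_trip README.txt
```

With HDFS running, setting `DRY_RUN=0` and running the script from `$HADOOP_HOME` (where `README.txt` lives) executes the commands for real.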