Hadoop Study Notes 1

Hadoop:
Broad sense: the ecosystem built around the Hadoop software
Narrow sense: the Hadoop software itself

hadoop.apache.org
hive.apache.org
spark.apache.org
flink.apache.org

Hadoop software versions:
1.x
2.x  used in production (2.6)
3.x

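Hadoop 2.x components:
hdfs: storage; distributed file system; the underlying layer; used in production
     hive/hbase
mapreduce: distributed computing --> hard to develop for, slow (shuffle goes through disk)
     hive sql/spark
yarn: resource (memory + cores) and job scheduling/management system; used in production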


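However:
Apache Hadoop itself is usually not the version that gets deployed.
Companies typically deploy with CDH, Ambari, or HDP.
CDH:
Cloudera takes the Apache hadoop-2.6.0 source code,
fixes bugs, adds new features, and compiles it as its own release, cdh5.7.0.

Apache hadoop-2.6.0 --> hadoop-2.6.0-cdh5.7.0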
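
The naming convention above can be taken apart mechanically. This is a minimal sketch, assuming the release name always has the form hadoop-<apache version>-cdh<cdh version>; apache_version and cdh_version are hypothetical helper names, not part of Hadoop or CDH.

```shell
# Split a CDH release name such as hadoop-2.6.0-cdh5.7.0 into its two versions.
# Assumes the form hadoop-<apache version>-cdh<cdh version>.
apache_version() {
    v=${1#hadoop-}      # drop the leading "hadoop-"
    echo "${v%%-cdh*}"  # drop "-cdh" and everything after it
}

cdh_version() {
    echo "${1##*-cdh}"  # keep only what follows "-cdh"
}

apache_version hadoop-2.6.0-cdh5.7.0   # prints 2.6.0
cdh_version hadoop-2.6.0-cdh5.7.0      # prints 5.7.0
```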

Deployment:

1. Add a hadoop user with passwordless sudo privileges
[root@hadoop002 ~]# useradd hadoop
[root@hadoop002 ~]# cat /etc/sudoers |grep hadoop
hadoop  ALL=(ALL)       NOPASSWD: ALL
[root@hadoop002 ~]# 
[root@hadoop002 ~]# su - hadoop
[hadoop@hadoop002 ~]$ 
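
The grep in step 1 only shows that some line mentions hadoop. A slightly stricter check could look like the sketch below; has_nopasswd_all is a hypothetical helper, and it assumes the rule is written exactly in the "USER ALL=(ALL) NOPASSWD: ALL" shape shown above.

```shell
# has_nopasswd_all USER FILE
# Succeeds if FILE contains an uncommented "USER ALL=(ALL) NOPASSWD: ALL" rule.
has_nopasswd_all() {
    grep -Eq "^[[:space:]]*$1[[:space:]]+ALL=\(ALL\)[[:space:]]+NOPASSWD:[[:space:]]*ALL[[:space:]]*\$" "$2"
}

# Usage (reading /etc/sudoers itself needs root):
#   has_nopasswd_all hadoop /etc/sudoers && echo "hadoop has passwordless sudo"
```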

2. Download
[hadoop@hadoop002 ~]$ mkdir app 
[hadoop@hadoop002 ~]$ cd app
[hadoop@hadoop002 app]$ wget http://archive-primary.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz

[hadoop@hadoop002 app]$ tar -xzvf hadoop-2.6.0-cdh5.7.0.tar.gz
[hadoop@hadoop002 app]$ cd hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
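
When scripting the download step, the directory to cd into can be derived from the URL instead of typed by hand. This is a rough sketch, assuming the archive unpacks into a directory named after the file (true for the tarball above); tarball_dir is a hypothetical helper name.

```shell
# Name of the directory that a <name>.tar.gz archive unpacks into,
# assuming the archive's top-level directory matches its file name.
tarball_dir() {
    f=${1##*/}            # strip everything up to the last "/"
    echo "${f%.tar.gz}"   # strip the ".tar.gz" suffix
}

tarball_dir http://archive-primary.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz
# prints hadoop-2.6.0-cdh5.7.0
```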


Required software for Linux include:
Java™ must be installed. Recommended Java versions are described at HadoopJavaVersions.
ssh must be installed and sshd must be running to use the Hadoop scripts that manage remote Hadoop daemons.

3. Deploy Java 1.7
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ll /usr/java/
total 319160
drwxr-xr-x 8 root root      4096 Apr 11  2015 jdk1.7.0_80
drwxr-xr-x 8 root root      4096 Apr 11  2015 jdk1.8.0_45
-rw-r--r-- 1 root root 153530841 Jul  8  2015 jdk-7u80-linux-x64.tar.gz
-rw-r--r-- 1 root root 173271626 Sep 19 11:49 jdk-8u45-linux-x64.gz
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ echo $JAVA_HOME
/usr/java/jdk1.7.0_80
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 


[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ which java
/usr/java/jdk1.7.0_80/bin/java
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
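
Two JDKs are installed here, so it is worth confirming which one is actually picked up. The sketch below pulls the "1.7" part out of the version banner; java_major_minor is a hypothetical helper, and it assumes the pre-Java-9 "1.x.y_z" version layout shown in the output above.

```shell
# Extract "major.minor" from the first line of `java -version` output,
# e.g. 'java version "1.7.0_80"' -> 1.7.
# Assumes the pre-Java-9 "1.x.y_z" version layout.
java_major_minor() {
    echo "$1" | sed -n 's/.*version "\([0-9]*\.[0-9]*\)\..*/\1/p'
}

java_major_minor 'java version "1.7.0_80"'   # prints 1.7

# Against a live JDK (java -version writes to stderr, hence 2>&1):
#   java_major_minor "$(java -version 2>&1 | head -n 1)"
```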


4. Preparation
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ cd etc/hadoop
[hadoop@hadoop002 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hadoop
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:

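Three startup modes
Local (Standalone) Mode: single machine, no daemon processes; not used
Pseudo-Distributed Mode: pseudo-distributed, one machine, with daemon processes; for learning
Fully-Distributed Mode: distributed, with daemon processes; for production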


5. Configuration files
[hadoop@hadoop002 hadoop]$ vi core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop002:9000</value>
    </property>
</configuration>
"core-site.xml" 24L, 884C written                                  
[hadoop@hadoop002 hadoop]$ vi hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
"hdfs-site.xml" 23L, 866C written                                  
[hadoop@hadoop002 hadoop]$ cd
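
To double-check what was just written to the two files, a property can be read back without opening vi. This is a minimal sketch, not a real XML parser: it assumes each <name> and <value> element sits on its own line, as in the files above, and get_hadoop_property is a hypothetical helper name.

```shell
# get_hadoop_property FILE NAME
# Print the <value> that follows <name>NAME</name> in a Hadoop *-site.xml file.
# Assumes <name> and <value> each occupy a single line of their own.
get_hadoop_property() {
    awk -v key="$2" '
        /<name>/  { gsub(/.*<name>|<\/name>.*/, ""); name = $0; next }
        /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); if (name == key) { print; exit } }
    ' "$1"
}

# Usage, from the etc/hadoop directory:
#   get_hadoop_property core-site.xml fs.defaultFS
#   get_hadoop_property hdfs-site.xml dfs.replication
```

Once the environment variables are set up, running hdfs getconf -confKey fs.defaultFS should report the value Hadoop itself resolves, which is the more reliable check.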

6. Passwordless SSH
[hadoop@hadoop002 hadoop]$ cd
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ rm -rf .ssh
[hadoop@hadoop002 ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Created directory '/home/hadoop/.ssh'.
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
a3:c7:ba:e9:2e:77:ff:6f:50:bd:bc:f7:1b:1d:a6:e1 hadoop@hadoop002
The key's randomart image is:
+--[ DSA 1024]----+
|                 |
|                 |
|               . |
|              . .|
|        S    o.o.|
|       o .  o +oo|
|      . o    E .o|
|    . .+.     ..o|
|     =*o ....o..=|
+-----------------+
[hadoop@hadoop002 ~]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop002 ~]$ cd .ssh
[hadoop@hadoop002 .ssh]$ ll
total 12
-rw-rw-r-- 1 hadoop hadoop 606 Sep 19 23:16 authorized_keys
-rw------- 1 hadoop hadoop 668 Sep 19 23:16 id_dsa
-rw-r--r-- 1 hadoop hadoop 606 Sep 19 23:16 id_dsa.pub

[hadoop@hadoop002 .ssh]$ chmod 600 authorized_keys
[hadoop@hadoop002 .ssh]$ 

[hadoop@hadoop002 .ssh]$ ssh hadoop002
The authenticity of host 'hadoop002 (172.31.236.240)' can't be established.
RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop002,172.31.236.240' (RSA) to the list of known hosts.
Last login: Wed Sep 19 18:21:09 2018 from 172.31.236.240

Welcome to Alibaba Cloud Elastic Compute Service !

[hadoop@hadoop002 ~]$ 
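
The chmod 600 above matters because authorized_keys was created group-writable (rw-rw-r--), and sshd in its default strict mode refuses keys from files or directories that other users can write to. The permission fix can be wrapped up as in this sketch; secure_ssh_dir is a hypothetical helper name.

```shell
# Tighten permissions on an SSH directory so sshd accepts the keys in it.
# $1 = SSH directory (defaults to ~/.ssh).
secure_ssh_dir() {
    dir=${1:-$HOME/.ssh}
    chmod 700 "$dir" || return 1
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys" || return 1
    fi
}

# Usage:
#   secure_ssh_dir            # fixes ~/.ssh
#   secure_ssh_dir /some/dir  # fixes another directory
```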


7. Environment variables
[hadoop@hadoop002 ~]$ vi .bash_profile 
export MVN_HOME=/home/hadoop/app/apache-maven-3.3.9
export PROTOC_HOME=/home/hadoop/app/protobuf
export FINDBUGS_HOME=/home/hadoop/app/findbugs-1.3.9
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
export JAVA_HOME=/usr/java/jdk1.7.0_80
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0

export PATH=$HADOOP_PREFIX/bin:$JAVA_HOME/bin:$PATH
~
~
".bash_profile" 12L, 293C written                                  
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ 
[hadoop@hadoop002 ~]$ ssh hadoop002
Last login: Wed Sep 19 23:18:35 2018 from 172.31.236.240

Welcome to Alibaba Cloud Elastic Compute Service !

[hadoop@hadoop002 ~]$ which hdfs
~/app/hadoop-2.6.0-cdh5.7.0/bin/hdfs
[hadoop@hadoop002 ~]$ 
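
If which hdfs had printed nothing, the first thing to check would be whether $HADOOP_PREFIX/bin really made it into PATH after the new login. A small sketch of that check; path_contains is a hypothetical helper name.

```shell
# Succeeds if directory $1 is one of the entries in the PATH-style list $2
# (defaults to the current $PATH). Entries are compared whole, so /a/b does
# not match /a/b/bin.
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage:
#   path_contains "$HADOOP_PREFIX/bin" || echo "HADOOP_PREFIX/bin is not on PATH"
```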
[hadoop@hadoop002 ~]$ cd ~/app/hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 

[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format

[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps
27707 SecondaryNameNode
27820 Jps
27432 NameNode


The DataNode (DN) process did not come up, so redeploy
[root@hadoop002 tmp]# rm -rf /tmp/hadoop-hadoop
[root@hadoop002 tmp]# 
[hadoop@hadoop002 hadoop]$ vi slaves 
hadoop002


[hadoop@hadoop002 hadoop]$ cd ../../
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format

[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
18/09/19 23:29:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop002]
hadoop002: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
hadoop002: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out
18/09/19 23:29:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps
28288 NameNode
28686 Jps
28410 DataNode
28575 SecondaryNameNode
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ 
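
The missing DataNode in the first jps listing was easy to overlook by eye. The check can be automated as in the sketch below, which assumes the three daemons of this pseudo-distributed setup are the ones expected; missing_hdfs_daemons is a hypothetical helper name.

```shell
# Read `jps` output on stdin and print each expected HDFS daemon that is absent.
# The second column is compared exactly, so SecondaryNameNode does not
# count as NameNode.
missing_hdfs_daemons() {
    out=$(cat)
    for d in NameNode DataNode SecondaryNameNode; do
        echo "$out" | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$d"
    done
}

# Usage:
#   jps | missing_hdfs_daemons
```

Fed the first jps output from step 7, this prints DataNode; fed the output after the redeploy, it prints nothing.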

On a cloud host, open the port in the firewall
http://47.75.249.8:50070
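
Port 50070 is the default NameNode web UI port in Hadoop 2.x. A quick way to test reachability from another machine is sketched below; namenode_ui_url is a hypothetical helper name, and the curl line assumes curl is installed on the client.

```shell
# Build the NameNode web UI address for a host.
# $1 = host name or IP, $2 = port (defaults to 50070, the Hadoop 2.x default).
namenode_ui_url() {
    echo "http://$1:${2:-50070}"
}

namenode_ui_url 47.75.249.8   # prints http://47.75.249.8:50070

# Reachability check (fails if the firewall still blocks the port):
#   curl -sf -o /dev/null "$(namenode_ui_url 47.75.249.8)" && echo "UI reachable"
```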

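Mid-Autumn Festival homework: 1. Practice join syntax 2. Deploy HDFS 3. Write an original blog post, covering everything up to the HDFS deployment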
