Hadoop

Day 1: HDFS and the Java API

The Hadoop ecosystem

Hadoop: HDFS  |  MapReduce      Origin: Google's GFS paper  --Nutch (NDFS)--  HDFS

        HDFS: solves the storage problem with distributed file storage, raising the system's data access throughput

        Disk size   Access speed   Time to read it all

        1 GB        4.4 MB/s       about 4 minutes

        1 TB        100 MB/s       about 2.9 hours

      

        The distributed solution:

        1 TB   --  100 machines   --   10.24 GB per machine

       

        10.24 GB at 100 MB/s  =  about 2 minutes
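The arithmetic behind these numbers can be checked directly (a quick sketch; the 4.4 MB/s and 100 MB/s throughput figures are the ones quoted above):

```python
# Sequential scan times for one disk versus 100 disks in parallel.
GB = 1024          # MB per GB
TB = 1024 * GB     # MB per TB

# 1 GB at 4.4 MB/s -> about 4 minutes
print(round(GB / 4.4 / 60, 1), "minutes")          # 3.9 minutes

# 1 TB at 100 MB/s -> about 2.9 hours
print(round(TB / 100 / 3600, 1), "hours")          # 2.9 hours

# Split 1 TB evenly across 100 machines: 10.24 GB each
per_machine_mb = TB / 100
print(round(per_machine_mb / GB, 2), "GB per machine")   # 10.24 GB

# Each machine reads its share at 100 MB/s, all in parallel
print(round(per_machine_mb / 100 / 60, 1), "minutes")    # 1.7 minutes, i.e. about 2
```

This is the whole idea of HDFS: read time drops roughly by the number of machines, because every disk scans its own shard at the same time.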

        MapReduce: solves the computation problem       Origin: Google's MapReduce paper

        A computation task is split into many small tasks, which are assigned to the storage nodes; the results are then aggregated

HBase: a database built on top of HDFS   Origin: Google's Bigtable. Random access over tables of hundreds of millions of rows by millions of columns

Hive: HQL -- translated into MapReduce programs

ZooKeeper: a distributed coordination service

 

 

HDFS architecture diagram (image omitted)

1. Setting up the Hadoop environment  (jiangzz_wy)

OS: CentOS 6.5, 32-bit. Install JDK 1.7+ (with the JAVA_HOME environment variable already configured).

1. Install the JDK

JDK setup steps:

① Copy jdk-7u71-linux-i586.rpm into /usr/local with WinSCP

② rpm -ivh jdk-7u71-linux-i586.rpm   installs the JDK

③ ls -a   shows the .bashrc file; edit it:

         vi .bashrc

Add the environment variables:

CLASSPATH=.

JAVA_HOME=/usr/java/latest

PATH=$PATH:$JAVA_HOME/bin

export CLASSPATH

export JAVA_HOME

export PATH

④ Open a new shell session (for example, clone the session in your SSH client) so the settings take effect; verify with jps or java -version

 

2. Configure the hostname

 [root@CentOS ~]# cat /etc/sysconfig/network

NETWORKING=yes

HOSTNAME=CentOS

3. Map the hostname to its IP address

  Note: on the Windows side, first open C:\Windows\System32\drivers\etc\hosts

 and add the line  192.168.0.8 CentOS

[root@CentOS ~]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.0.8 CentOS

4. Configure passwordless SSH login for the machine (details in step 6)

 

5. Disable the firewall

[root@CentOS ~]# service iptables stop

iptables: Setting chains to policy ACCEPT: filter          [  OK  ]

iptables: Flushing firewall rules:                         [  OK  ]

iptables: Unloading modules:                               [  OK  ]

[root@CentOS ~]# chkconfig --del iptables   -- keep the firewall from starting at boot

6. Configure passwordless SSH login

(1) Generate a public/private key pair

[root@CentOS ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

Generating public/private dsa key pair.

Your identification has been saved in /root/.ssh/id_dsa.

Your public key has been saved in /root/.ssh/id_dsa.pub.

The key fingerprint is:

06:76:81:51:1f:94:7c:02:6b:49:c5:e8:cc:80:df:8b root@CentOS

The key's randomart image is:

+--[ DSA 1024]----+

|     ..+=B+.     |

|    . o..==..    |

|     .o*= .o     |

|     ..+=        |

|       .S.       |

|      E..        |

|                 |

|                 |

|                 |

+-----------------+

(2) Upload the public key to the machine you need to log in to

 (omitted here; use WinSCP)

(3) On the target machine, append the uploaded public key to its own trusted list

 [root@CentOS ~]#  cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Diagram: how passwordless SSH login works (image omitted)

7. Upload the hadoop-2.6.0.tar.gz file and extract it into the /usr directory

[root@CentOS ~]# tar -zxf hadoop-2.6.0.tar.gz -C /usr/

8. Edit the Hadoop configuration files

(1) etc/hadoop/core-site.xml

(2) etc/hadoop/hdfs-site.xml

(3) etc/hadoop/slaves
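For a single-node setup, the three files above can look like the following (a minimal sketch, not the only valid configuration; the hostname CentOS and HDFS port 9000 match the environment used throughout this document, and dfs.replication is set to 1 because there is only one DataNode):

```xml
<!-- etc/hadoop/core-site.xml -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://CentOS:9000</value>
  </property>
</configuration>

<!-- etc/hadoop/hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

etc/hadoop/slaves then contains a single line with the hostname of the one DataNode: CentOS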

9. Format the NameNode (this creates the fsimage file)

[root@CentOS hadoop-2.6.0]# ./bin/hdfs namenode -format

16/07/27 23:25:12 INFO namenode.NameNode: STARTUP_MSG:

/************************************************************

.....

16/07/27 23:25:13 INFO namenode.NNConf: XAttrs enabled? true

16/07/27 23:25:13 INFO namenode.NNConf: Maximum size of an xattr: 16384

16/07/27 23:25:13 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1909604994-192.168.0.8-1469633113883

16/07/27 23:25:13 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.

16/07/27 23:25:14 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0

16/07/27 23:25:14 INFO util.ExitUtil: Exiting with status 0

16/07/27 23:25:14 INFO namenode.NameNode: SHUTDOWN_MSG:

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at CentOS/192.168.0.8

************************************************************/

10. Start Hadoop

[root@CentOS hadoop-2.6.0]# ./sbin/start-dfs.sh

Starting namenodes on [CentOS]

CentOS: starting namenode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-namenode-CentOS.out

CentOS: starting datanode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-datanode-CentOS.out

Starting secondary namenodes [0.0.0.0]

0.0.0.0: starting secondarynamenode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-CentOS.out

11. Verify that the daemons started

[root@CentOS hadoop-2.6.0]# jps

3459 NameNode

3713 SecondaryNameNode

3571 DataNode

12. Stop Hadoop

[root@CentOS hadoop-2.6.0]# ./sbin/stop-dfs.sh

Stopping namenodes on [CentOS]

CentOS: stopping namenode

CentOS: stopping datanode

Stopping secondary namenodes [0.0.0.0]

0.0.0.0: stopping secondarynamenode

 

Task 2: learn the HDFS shell commands on your own

[root@CentOS hadoop-2.6.0]# ./bin/hdfs dfs -help
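A few commonly used commands to start from (a sketch; /demo and the file names are example paths, run from the hadoop-2.6.0 directory):

```shell
./bin/hdfs dfs -mkdir /demo                        # create a directory (example path)
./bin/hdfs dfs -put hadoop-2.6.0.tar.gz /demo      # upload a local file
./bin/hdfs dfs -ls /demo                           # list directory contents
./bin/hdfs dfs -get /demo/hadoop-2.6.0.tar.gz .    # download to the local disk
./bin/hdfs dfs -rm /demo/hadoop-2.6.0.tar.gz       # delete (moved to trash if enabled)
```

Each subcommand mirrors the familiar Linux tool of the same name; -help prints the full list.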

Task 3: operating HDFS from the Java API

1. Set up the Windows development environment

 (1) Extract hadoop-2.6.0.tar.gz to C:/

 (2) Set the HADOOP_HOME environment variable

(the hadoop-2.6.0 path must not contain Chinese characters)

(3) Add Hadoop's Windows development support

a)         Copy hadoop.dll and winutils.exe into hadoop-2.6.0's bin directory

2. Copy core-site.xml and hdfs-site.xml into the project's src directory

3. Add -DHADOOP_USER_NAME=root to the JVM startup arguments
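Once the three steps above are done, the client can be exercised with a short program (a minimal sketch, assuming the hadoop-client 2.6.0 jars are on the classpath and core-site.xml/hdfs-site.xml sit on the src path as in step 2; the class name and the /demo and C:/data/a.txt paths are examples, not from the original notes):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsDemo {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS from core-site.xml on the classpath
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Create a directory and upload a local file (example paths)
        fs.mkdirs(new Path("/demo"));
        fs.copyFromLocalFile(new Path("C:/data/a.txt"), new Path("/demo/a.txt"));

        System.out.println(fs.exists(new Path("/demo/a.txt")));
        fs.close();
    }
}
```

Without -DHADOOP_USER_NAME=root, the write calls run as the local Windows user and are rejected by HDFS permission checks.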

A second way to solve the permissions problem

 

 

Supplementary notes:

1. Base path used by the configuration

a)         hadoop.tmp.dir

hadoop.tmp.dir is the base directory that the Hadoop filesystem depends on; many other paths derive from it. If hdfs-site.xml does not configure storage locations for the namenode and datanode data, they are placed under this path by default.

 

 

Configure it in core-site.xml:

 

<property>

        <name>hadoop.tmp.dir</name>

        <value>/usr/local/hadoop-2.6.0/hadoop-${user.name}</value>

</property>

 

 

 

2. Configure the trash (recycle bin)

a)         fs.trash.interval (in minutes)

Configure it in core-site.xml:

 

<property>

        <name>fs.trash.interval</name>

        <value>2</value>

</property>

Trash path:

hdfs://CentOS:9000/user/root/.Trash/Current

View the trash:

./bin/hdfs dfs -ls /user/root/.Trash/Current

 

Restore a file:

./bin/hdfs dfs -mv /user/root/.Trash/Current/hadoop-2.6.0.tar.gz /

 

3. Explore on your own:

a)         ./bin/hdfs -help

b)        ./bin/hdfs dfsadmin -help   -- learn on your own

View cluster capacity and DataNode status:

./bin/hdfs dfsadmin -report

Enter safe mode:

./bin/hdfs dfsadmin -safemode enter   (use  -safemode get  to check the current state)

View the rack topology:

./bin/hdfs dfsadmin -printTopology
