Task 1: HDFS
The Hadoop ecosystem
Hadoop: HDFS | MapReduce. Origin: Google's GFS -- Nutch (NDFS) -- HDFS.
HDFS: solves the storage problem -- distributed file storage that raises the system's aggregate access speed.

Disk size   Disk throughput   Time to read it all
1 GB        4.4 MB/s          ~4 minutes
1 TB        100 MB/s          ~2.9 hours

The distributed solution:
1 TB -- 100 machines -- 10.24 GB per machine
10.24 GB at 100 MB/s -- ~2 minutes
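The read-time figures above can be checked with simple arithmetic (assuming 1 TB = 1024 GB, 1 GB = 1024 MB, and a sustained disk throughput of 100 MB/s):

```java
// Sanity-check of the read-time table above.
public class ReadTimeEstimate {
    public static void main(String[] args) {
        double totalMb = 1024.0 * 1024.0;   // 1 TB expressed in MB
        double mbPerSec = 100.0;            // sustained throughput of one disk

        // One machine reads the whole terabyte serially.
        double serialMin = totalMb / mbPerSec / 60.0;
        // 100 machines each read their 10.24 GB slice in parallel.
        double parallelMin = (totalMb / 100) / mbPerSec / 60.0;

        System.out.printf("one disk:     %.1f min (~%.1f h)%n", serialMin, serialMin / 60);
        System.out.printf("100 machines: %.1f min%n", parallelMin);
        // Prints roughly 174.8 min (~2.9 h) serial vs 1.7 min in parallel,
        // matching the "2.9 hours" vs "2 minutes" figures above.
    }
}
```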
MapReduce: computation. Origin: Google's MapReduce.
A computing job is split into many small tasks, assigned to the storage nodes, and the results are aggregated.
HBase: a database built on top of HDFS. Origin: Google's Bigtable. Random access over data with hundreds of millions of rows × millions of columns.
Hive: HQL -- translated into MapReduce programs.
ZooKeeper: a distributed coordination service.
HDFS architecture diagram (figure omitted)
1. Hadoop environment setup (jiangzz_wy)
System: CentOS 6.5, 32-bit, with JDK 1.7+ installed (and JAVA_HOME already configured).
1. Install the JDK
JDK setup steps:
① Copy jdk-7u71-linux-i586.rpm into /usr/local (e.g. with WinSCP).
② rpm -ivh jdk-7u71-linux-i586.rpm  -- installs the JDK
③ ls -a shows .bashrc; open it with vi .bashrc
Configure the environment variables:
CLASSPATH=.
JAVA_HOME=/usr/java/latest
PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH
export JAVA_HOME
export PATH
④ Log out and back in (or clone the SSH session) so the settings take effect; verify with jps or java -version.
2. Configure the hostname:
[root@CentOS ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=CentOS
3. Configure the hostname-to-IP mapping
Note: on the Windows development machine, also edit C:\Windows\System32\drivers\etc\hosts
and add: 192.168.0.8 CentOS
[root@CentOS ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.8 CentOS
4. Set up passwordless SSH login for the machine (procedure in step 6 below)
5. Disable the firewall
[root@CentOS ~]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]
[root@CentOS ~]# chkconfig --del iptables  -- stop the firewall from starting at boot
6. Configure passwordless SSH login
(1) Generate a key pair
[root@CentOS ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
06:76:81:51:1f:94:7c:02:6b:49:c5:e8:cc:80:df:8b root@CentOS
The key's randomart image is:
+--[ DSA 1024]----+
| ..+=B+. |
| . o..==.. |
| .o*= .o |
| ..+= |
| .S. |
| E.. |
| |
| |
| |
+-----------------+
(2) Upload the public key to the target machine
(omitted; e.g. with WinSCP)
(3) On the target machine, append the uploaded public key to its own trusted list
[root@CentOS ~]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
If login still prompts for a password, check that ~/.ssh has mode 700 and authorized_keys mode 600; sshd ignores keys with looser permissions.
Diagram: how passwordless SSH login works (figure omitted)
7. Upload hadoop-2.6.0.tar.gz and extract it to /usr
[root@CentOS ~]# tar -zxf hadoop-2.6.0.tar.gz -C /usr/
8. Edit the relevant Hadoop configuration files
(1) etc/hadoop/core-site.xml
(2) etc/hadoop/hdfs-site.xml
(3) etc/hadoop/slaves
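The notes list the three files without their contents. A minimal single-node configuration might look like the sketch below -- the host name CentOS and port 9000 are taken from the trash path shown later in these notes, and replication 1 is an assumption appropriate for a single DataNode:

```xml
<!-- etc/hadoop/core-site.xml : where the NameNode listens -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://CentOS:9000</value>
    </property>
</configuration>

<!-- etc/hadoop/hdfs-site.xml : one DataNode, so keep a single replica -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
```

etc/hadoop/slaves simply lists the DataNode host names, one per line -- here the single line CentOS.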
9. Format the NameNode (creates the fsimage file)
[root@CentOS hadoop-2.6.0]# ./bin/hdfs namenode -format
16/07/27 23:25:12 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
.....
16/07/27 23:25:13 INFO namenode.NNConf: XAttrs enabled? true
16/07/27 23:25:13 INFO namenode.NNConf: Maximum size of an xattr: 16384
16/07/27 23:25:13 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1909604994-192.168.0.8-1469633113883
16/07/27 23:25:13 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
16/07/27 23:25:14 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/07/27 23:25:14 INFO util.ExitUtil: Exiting with status 0
16/07/27 23:25:14 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at CentOS/192.168.0.8
************************************************************/
10. Start Hadoop
[root@CentOS hadoop-2.6.0]# ./sbin/start-dfs.sh
Starting namenodes on [CentOS]
CentOS: starting namenode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-namenode-CentOS.out
CentOS: starting datanode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-datanode-CentOS.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-CentOS.out
11. Verify the start-up succeeded
[root@CentOS hadoop-2.6.0]# jps
3459 NameNode
3713 SecondaryNameNode
3571 DataNode
12. Stop Hadoop
[root@CentOS hadoop-2.6.0]# ./sbin/stop-dfs.sh
Stopping namenodes on [CentOS]
CentOS: stopping namenode
CentOS: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
Task 2: Learn the HDFS shell commands on your own
[root@CentOS hadoop-2.6.0]# ./bin/hdfs dfs -help
Task 3: Operating HDFS from the Java API
1. Set up the Windows development environment
(1) Extract hadoop-2.6.0.tar.gz to C:/
(2) Set the HADOOP_HOME environment variable
(the hadoop-2.6.0 path must not contain non-ASCII characters)
(3) Add Windows development support for Hadoop
a) Copy hadoop.dll and winutils.exe into hadoop-2.6.0's bin directory
2. Copy core-site.xml and hdfs-site.xml into the project's src directory
3. Add -DHADOOP_USER_NAME=root to the JVM startup arguments
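A minimal sketch of the Java API task, assuming hadoop-client 2.6.0 is on the classpath and the two *-site.xml files sit on the src path as described above; the /demo path and file name are purely illustrative. Run with -DHADOOP_USER_NAME=root (step 3 above).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsDemo {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath (the src dir).
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Write a small file (true = overwrite if it already exists).
        Path file = new Path("/demo/hello.txt");
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write("hello hdfs\n".getBytes("UTF-8"));
        }

        // List the directory we just wrote into.
        for (FileStatus st : fs.listStatus(new Path("/demo"))) {
            System.out.println(st.getPath() + "  " + st.getLen() + " bytes");
        }

        // Note: the Java API deletes immediately; only the shell's -rm honours
        // fs.trash.interval (see the trash section below).
        fs.delete(file, true);
        fs.close();
    }
}
```

This cannot run without a reachable HDFS cluster, so treat it as a template rather than a standalone program.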
A second way to deal with the permissions problem
1. Base path for the configuration
a) hadoop.tmp.dir
hadoop.tmp.dir is the base directory the Hadoop file system depends on; many other paths derive from it. If hdfs-site.xml does not set the namenode and datanode storage locations, they default to subdirectories of this path.
Configure in core-site.xml:
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-2.6.0/hadoop-${user.name}</value>
</property>
2. Configure the trash
a) fs.trash.interval (minutes a deleted file is kept before being purged; 0 disables the trash)
Configure in core-site.xml:
<property>
    <name>fs.trash.interval</name>
    <value>2</value>
</property>
Trash path:
hdfs://Centos:9000/user/root/.Trash/Current
View the trash:
./bin/hdfs dfs -ls /user/root/.Trash/Current
Restore a file:
./bin/hdfs dfs -mv /user/root/.Trash/Current/hadoop-2.6.0.tar.gz /
3. Explore on your own
a) ./bin/hdfs -help
b) ./bin/hdfs dfsadmin -help  -- self-study
View cluster storage capacity and DataNode status:
./bin/hdfs dfsadmin -report
Enter safe mode:
./bin/hdfs dfsadmin -safemode enter  (-safemode get shows the current state)
View the rack topology:
./bin/hdfs dfsadmin -printTopology