Today, needing it for a cloud-computing lab (and being quite interested in cloud computing anyway), I installed Hadoop on my Mac.
===
Let me briefly introduce Hadoop:
Hadoop is a distributed-systems infrastructure developed by the Apache Foundation.
It lets users develop distributed programs without knowing the low-level details of distribution, harnessing the power of a cluster for high-speed computation and storage.
Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and designed to run on low-cost hardware; it provides high-throughput access to application data and suits applications with very large data sets. HDFS relaxes some POSIX requirements to permit streaming access to file-system data.
The core of Hadoop's design is HDFS plus MapReduce: HDFS provides storage for massive amounts of data, and MapReduce provides computation over it.
This is my Mac's system version:
macOS Sierra, version 10.12.3
Since we will be running jar files later, a Java environment is definitely required. Open Terminal and type the command:
java -version
My terminal prints the following:
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
This shows the JDK is already installed on my machine. If the command reports that Java is missing, download and install a JDK yourself (Google will turn up instructions); this article won't walk through that.
macOS ships with ssh, so there is nothing to install. You can verify with the following commands:
➜  hadoop-1.2.1 which ssh
/usr/bin/ssh
➜  hadoop-1.2.1 which sshd
/usr/sbin/sshd
➜  hadoop-1.2.1 which ssh-keygen
/usr/bin/ssh-keygen
Run ssh localhost; you may hit the following problem:
ssh: connect to host localhost port 22: Connection refused
The cause is that Remote Login is disabled. Go to System Preferences -> Sharing -> enable Remote Login, then try ssh localhost again:
➜  hadoop-1.2.1 ssh localhost
Password:
Last login: Tue Apr 18 09:45:33 2017 from ::1
You will be asked for your Mac account password along the way.
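If you prefer the command line, Remote Login can also be toggled with macOS's systemsetup tool (this needs sudo; it is an alternative to the Sharing pane, not the route taken above):

sudo systemsetup -setremotelogin on    # enable Remote Login (sshd)
sudo systemsetup -getremotelogin       # should print: Remote Login: On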
There is also a way to set up passwordless ssh login. I haven't studied it closely, but following the usual recipe should work; a sketch is given below if you are interested.
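A minimal sketch of the standard key-based approach (assuming there is no existing key you care about at ~/.ssh/id_rsa):

# generate a key pair with an empty passphrase
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# authorize the public key for logins to this machine
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# this should now log in without prompting for a password
ssh localhost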
Download Hadoop, choosing the following version:
hadoop-1.2.1.tar.gz    06-Nov-2014 21:22    61M
After the download finished, I extracted it into my Documents folder.
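For reference, a download-and-extract sketch (assuming the tarball is fetched from the Apache release archive; adjust the target directory to taste):

# fetch Hadoop 1.2.1 from the Apache archive and unpack it into ~/Documents
curl -O https://archive.apache.org/dist/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
tar -xzf hadoop-1.2.1.tar.gz -C ~/Documents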
In the terminal, run vim ~/.bash_profile
vim may ask whether you really want to edit the file (a safety prompt); press E to edit.
Add the following environment variables:
# Hadoop
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home
export JRE_HOME=$JAVA_HOME/jre
export HADOOP_HOME=/Users/Apple/Documents/hadoop-1.2.1
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export HADOOP_HOME_WARN_SUPPRESS=1
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$PATH
Here,

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home
export JRE_HOME=$JAVA_HOME/jre

are the Java environment variables,

export HADOOP_HOME=/Users/Apple/Documents/hadoop-1.2.1

points at the Hadoop installation, and

export HADOOP_HOME_WARN_SUPPRESS=1

suppresses the warning: Warning: $HADOOP_HOME is deprecated.
After adding the environment variables above, return to the terminal and run:
source ~/.bash_profile
to make the settings take effect.
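To confirm the variables took effect, a quick check (hadoop version only works once $HADOOP_HOME/bin is on the PATH):

echo $HADOOP_HOME     # should print /Users/Apple/Documents/hadoop-1.2.1
hadoop version        # should report Hadoop 1.2.1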
Go into the hadoop-1.2.1 directory you just extracted under Documents, then enter the conf folder and run vim hadoop-env.sh, configuring it as follows:
# The java implementation to use.  Required.
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home
#export JAVA_HOME=/usr/lib/j2sdk1.5-sun

# Extra Java CLASSPATH elements.  Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=2000

# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
# export HADOOP_OPTS=-server

The HADOOP_OPTS line with the dummy Kerberos realm settings is a widely used workaround for the "Unable to load realm info from SCDynamicStore" error that Hadoop 1.x otherwise throws on macOS.
core-site.xml specifies the NameNode's hostname and port. (One caveat: hadoop.tmp.dir normally points to a local directory such as /tmp/hadoop-${user.name}; the hdfs:// value used below is what makes the storage paths in the format log later show up as hdfs:/localhost:9000/dfs/name.)
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>hdfs://localhost:9000</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
hdfs-site.xml sets the default HDFS replication factor; since everything runs on a single node, it is set to 1:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
mapred-site.xml specifies the JobTracker's hostname and port (conventionally mapred.job.tracker is given as a plain host:port pair, e.g. localhost:9001, without the hdfs:// scheme):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://localhost:9001</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
With that done, typing hadoop in the terminal produces the following:
➜  ~ hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  mradmin              run a Map-Reduce admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  oiv                  apply the offline fsimage viewer to an fsimage
  fetchdt              fetch a delegation token from the NameNode
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  historyserver        run job history servers as a standalone daemon
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl> copy file or directories recursively
  distcp2 <srcurl> <desturl> DistCp version 2
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
which shows that the Hadoop executables can now be found.
Before running anything, format the NameNode with hadoop namenode -format; the result looks like this:
➜  ~ hadoop namenode -format
17/04/18 11:16:33 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = AppledeMacBook-Air-2.local/172.19.167.21
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG:   java = 1.8.0_101
************************************************************/
17/04/18 11:16:33 INFO util.GSet: Computing capacity for map BlocksMap
17/04/18 11:16:33 INFO util.GSet: VM type       = 64-bit
17/04/18 11:16:33 INFO util.GSet: 2.0% max memory = 1864368128
17/04/18 11:16:33 INFO util.GSet: capacity      = 2^22 = 4194304 entries
17/04/18 11:16:33 INFO util.GSet: recommended=4194304, actual=4194304
17/04/18 11:16:33 INFO namenode.FSNamesystem: fsOwner=Apple
17/04/18 11:16:33 INFO namenode.FSNamesystem: supergroup=supergroup
17/04/18 11:16:33 INFO namenode.FSNamesystem: isPermissionEnabled=true
17/04/18 11:16:33 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
17/04/18 11:16:33 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
17/04/18 11:16:33 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
17/04/18 11:16:33 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/04/18 11:16:34 INFO common.Storage: Image file hdfs:/localhost:9000/dfs/name/current/fsimage of size 111 bytes saved in 0 seconds.
17/04/18 11:16:34 INFO namenode.FSEditLog: closing edit log: position=4, editlog=hdfs:/localhost:9000/dfs/name/current/edits
17/04/18 11:16:34 INFO namenode.FSEditLog: close success: truncate to 4, editlog=hdfs:/localhost:9000/dfs/name/current/edits
17/04/18 11:16:34 INFO common.Storage: Storage directory hdfs:/localhost:9000/dfs/name has been successfully formatted.
17/04/18 11:16:34 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at AppledeMacBook-Air-2.local/172.19.167.21
************************************************************/
This indicates HDFS has been set up successfully.
Run start-all.sh to start the daemons; I had to enter my password three times along the way:
➜  ~ start-all.sh
namenode running as process 61005. Stop it first.
Password:
localhost: starting datanode, logging to /Users/Apple/Documents/hadoop-1.2.1/libexec/../logs/hadoop-Apple-datanode-AppledeMacBook-Air-2.local.out
Password:
localhost: secondarynamenode running as process 61265. Stop it first.
starting jobtracker, logging to /Users/Apple/Documents/hadoop-1.2.1/libexec/../logs/hadoop-Apple-jobtracker-AppledeMacBook-Air-2.local.out
Password:
localhost: starting tasktracker, logging to /Users/Apple/Documents/hadoop-1.2.1/libexec/../logs/hadoop-Apple-tasktracker-AppledeMacBook-Air-2.local.out
Output like the following from jps indicates the daemons started:
➜  ~ jps
61265 SecondaryNameNode
94723 Jps
61005 NameNode
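One caveat: on a fully healthy single-node setup, jps should also list DataNode, JobTracker and TaskTracker, which are absent above. If a daemon fails to come up, its log file is the place to look (filename pattern inferred from the start-all.sh output above; adjust the user and host parts to yours):

# inspect the tail of the DataNode log for startup errors
tail -n 50 $HADOOP_HOME/logs/hadoop-Apple-datanode-AppledeMacBook-Air-2.local.log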
Point a browser at http://localhost:50070 and you can see Hadoop's web interface:
NameNode 'localhost:8020'
Started:   Tue Apr 18 10:24:19 CST 2017
Version:   1.2.1, r1503152
Compiled:  Mon Jul 22 15:23:09 PDT 2013 by mattf
Upgrades:  There are no upgrades in progress.

Browse the filesystem
Namenode Logs

Cluster Summary
1 files and directories, 0 blocks = 1 total. Heap Size is 77.5 MB / 1.74 GB (4%)
Configured Capacity               : 0 KB
DFS Used                          : 0 KB
Non DFS Used                      : 0 KB
DFS Remaining                     : 0 KB
DFS Used%                         : 100 %
DFS Remaining%                    : 0 %
Live Nodes                        : 0
Dead Nodes                        : 0
Decommissioning Nodes             : 0
Number of Under-Replicated Blocks : 0

There are no datanodes in the cluster

NameNode Storage:
Storage Directory                Type             State
hdfs:/localhost:9000/dfs/name    IMAGE_AND_EDITS  Active

This is Apache Hadoop release 1.2.1
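As an optional smoke test once a DataNode is actually live (the cluster summary above still reports none), you can run the wordcount example that ships with Hadoop 1.2.1; the /input and /output paths here are arbitrary choices:

cd $HADOOP_HOME
hadoop fs -mkdir /input                  # create an input directory in HDFS
hadoop fs -put conf/*.xml /input         # upload some sample files
hadoop jar hadoop-examples-1.2.1.jar wordcount /input /output
hadoop fs -cat '/output/part-*'          # print the word counts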
That completes the installation and configuration of Hadoop. If you have any questions, or spot any mistakes in this article, feel free to get in touch; comments and corrections are welcome. This article is original work; if you reproduce it, please credit the source. Thanks!
Contact: 370555337@qq.com
MY BLOG