First, download hadoop-2.7.7 from the official Apache mirror; the link is:
https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/
Then get hadooponwindows-master.zip from the network drive; the link is:
https://pan.baidu.com/s/1VdG6PBnYKM91ia0hlhIeHg
After extracting hadoop-2.7.7.tar.gz, replace the bin and etc directories of hadoop-2.7.7 with the bin and etc from hadooponwindows-master.
Note on installing Hadoop 2.7.7:
Download Hadoop 2.7.7 from the official site. When installing, it is best not to place it under a path containing spaces, such as Program Files; otherwise the JDK will not be found when Hadoop processes its configuration files. (Reportedly, quoting the path in the configuration file works around this, but I was unable to get that to work.)
Configure the HADOOP_HOME environment variable.
Add %HADOOP_HOME%\bin to Path (no semicolon is needed on Windows 10, or wherever the list-style editor appears; on other versions append a ;).
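If you prefer the command line over the system-properties dialog, the same variables can be set with setx (a minimal sketch, assuming Hadoop was unpacked to E:\Hadoop2.7.7\hadoop-2.7.7; setx writes user-level variables that only take effect in newly opened consoles, and it truncates values longer than 1024 characters):

rem persist HADOOP_HOME for the current user
setx HADOOP_HOME "E:\Hadoop2.7.7\hadoop-2.7.7"
rem append the Hadoop bin directory to the user Path
setx PATH "%PATH%;%HADOOP_HOME%\bin"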
----------------------------------------------------------- Configuration files -----------------------------
Open E:\Hadoop2.7.7\hadoop-2.7.7\etc\hadoop\hadoop-env.cmd in a text editor and change the JAVA_HOME path so that the set JAVA_HOME line points at your JDK. Note that PROGRA~1 is the 8.3 short name for Program Files:
set JAVA_HOME=E:\PROGRA~1\Java\jdk1.8.0_171
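If you are unsure of a directory's 8.3 short name, dir /x lists it; for example (assuming the JDK sits under E:\Program Files as in the line above):

rem the short name, e.g. PROGRA~1, is shown next to "Program Files"
dir /x E:\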
Open hadoop-2.7.7/etc/hadoop/hdfs-site.xml and set the replication factor and the namenode/datanode paths to directories under the Hadoop installation:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/E:/Hadoop2.7.7/hadoop-2.7.7/data/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/E:/Hadoop2.7.7/hadoop-2.7.7/data/datanode</value>
</property>
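The etc folder copied from hadooponwindows-master normally already sets the default filesystem in core-site.xml; for reference, a minimal single-node entry looks like this (assuming the usual hdfs://localhost:9000 address — check the file you actually copied):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>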
Create a tmp folder under the Hadoop installation directory (E:\Hadoop2.7.7\hadoop-2.7.7), and create a data folder with namenode and datanode subfolders under E:/Hadoop2.7.7/hadoop-2.7.7/, as shown below.
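From a Command Prompt this is simply (paths taken from the hdfs-site.xml configuration above; mkdir creates intermediate directories automatically):

mkdir E:\Hadoop2.7.7\hadoop-2.7.7\tmp
mkdir E:\Hadoop2.7.7\hadoop-2.7.7\data\namenode
mkdir E:\Hadoop2.7.7\hadoop-2.7.7\data\datanode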
You also need to copy hadoop.dll (from the bin directory obtained from hadooponwindows-master) to C:\Windows\System32;
otherwise MapReduce tests on Windows will fail.
Open a Command Prompt as administrator.
Run hdfs namenode -format; if the output contains "successfully formatted", the format succeeded.
Change to the hadoop-2.7.7\sbin directory and run start-all to start the Hadoop cluster; stop-all shuts it down.
Run jps to list all running daemons.
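If startup succeeded, jps should report roughly the following daemons (process IDs omitted here; the exact list is illustrative and varies with the services started):

NameNode
DataNode
ResourceManager
NodeManager
Jps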
Visit http://localhost:50070 to open the Hadoop web UI.
---------------------------------------------------------------------
After Hadoop starts, create the following HDFS directories:
D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hdfs dfs -mkdir /user
D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hdfs dfs -mkdir /user/hive
D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hdfs dfs -mkdir /user/hive/warehouse
D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hdfs dfs -mkdir /tmp
D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hdfs dfs -mkdir /tmp/hive
D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hadoop fs -chmod -R 777 /tmp
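To verify, list the directories afterwards (an optional check):

D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hdfs dfs -ls /user/hive
D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hdfs dfs -ls /tmp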
1. Install Hadoop (as above).
2. Download mysql-connector-java-5.1.26-bin.jar (or another version) from Maven and put it in the lib folder under the Hive directory (the Maven coordinates are shown after this list).
3. Configure the HIVE_HOME environment variable: HIVE_HOME=F:\hadoop\apache-hive-2.1.1-bin
4. Configure Hive (details below).
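For step 2, the driver's Maven coordinates are (version 5.1.26 as named above; any compatible 5.1.x version works):

<dependency>
  <groupId>mysql</groupId>
  <artifactId>mysql-connector-java</artifactId>
  <version>5.1.26</version>
</dependency>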
Hive's configuration files live in $HIVE_HOME/conf, which contains four default configuration templates:
hive-default.xml.template            default template
hive-env.sh.template                 default hive-env.sh configuration
hive-exec-log4j.properties.template  default exec configuration
hive-log4j.properties.template       default log configuration
Hive can run without any changes: by default, metadata is stored in an embedded Derby database. Since most people are not familiar with Derby, we switch to MySQL for the metastore, and we also change the data and log locations, so we have to configure the environment ourselves. The steps are described below.
(1) Create the configuration files (copy each template):
$HIVE_HOME/conf/hive-default.xml.template -> $HIVE_HOME/conf/hive-site.xml
$HIVE_HOME/conf/hive-env.sh.template -> $HIVE_HOME/conf/hive-env.sh
$HIVE_HOME/conf/hive-exec-log4j.properties.template -> $HIVE_HOME/conf/hive-exec-log4j.properties
$HIVE_HOME/conf/hive-log4j.properties.template -> $HIVE_HOME/conf/hive-log4j.properties
(2) Edit hive-env.sh:
export HADOOP_HOME=F:\hadoop\hadoop-2.7.2
export HIVE_CONF_DIR=F:\hadoop\apache-hive-2.1.1-bin\conf
export HIVE_AUX_JARS_PATH=F:\hadoop\apache-hive-2.1.1-bin\lib
(3) Edit hive-site.xml:
<!-- Settings to modify -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <!-- Hive's data warehouse directory; this path is on HDFS -->
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <!-- Hive's temporary data directory; this path is on HDFS -->
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <!-- local directory -->
  <value>F:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <!-- local directory -->
  <value>F:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <!-- local directory -->
  <value>F:/hadoop/apache-hive-2.1.1-bin/hive/iotmp</value>
  <description>Location of Hive run time structured log file</description>
</property>
<property>
  <name>hive.server2.logging.operation.log.location</name>
  <value>F:/hadoop/apache-hive-2.1.1-bin/hive/iotmp/operation_logs</value>
  <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>

<!-- New settings -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>root</value>
</property>

<!-- Fixes: Required table missing : "`VERSION`" in Catalog "" Schema "". DataNucleus requires this table to perform its persistence operations. Either your MetaData is incorrect, or you need to enable "datanucleus.autoCreateTables" -->
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>true</value>
</property>
<property>
  <name>datanucleus.autoCreateTables</name>
  <value>true</value>
</property>
<property>
  <name>datanucleus.autoCreateColumns</name>
  <value>true</value>
</property>

<!-- Fixes: Caused by: MetaException(message:Version information not found in metastore.) -->
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
  <description>
    Enforce metastore schema version consistency.
    True: Verify that version information stored in metastore matches with one from Hive jars. Also disable automatic
          schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
          proper metastore schema migration. (Default)
    False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
  </description>
</property>
Note: the HDFS directories must be created on Hadoop beforehand (see the hdfs dfs -mkdir commands above).
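Before starting the metastore, also create the database that the JDBC URL above points at (a minimal sketch, assuming a local MySQL reachable with the root/root credentials configured in hive-site.xml):

CREATE DATABASE hive CHARACTER SET utf8;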
Start the metastore service: hive --service metastore
This generates the corresponding metastore tables in the hive database.
Start Hive: hive
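On Hive 2.x, an alternative to the datanucleus.autoCreate* settings is to initialize the metastore schema explicitly with the schematool script shipped in $HIVE_HOME/bin (a sketch; on Windows this script may require a Unix-like shell such as Git Bash or Cygwin):

schematool -dbType mysql -initSchema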
-------------------------------------------------------------- Creating a table and a query example
Create the table in Hive:
CREATE TABLE testB (
id INT,
name string,
area string
) PARTITIONED BY (create_time string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
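A matching data file (the bbb.txt loaded below) needs one row per line with the three columns separated by tab characters; its contents are not shown in the original, so the following lines are purely illustrative:

1	zhangsan	beijing
2	lisi	shanghai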
Upload the local file to HDFS by running:
D:\Code\hadoop-2.7.7\hadoop-2.7.7\sbin>hdfs dfs -put D:\Code\hadoop-2.7.7\gxy\bbb.txt /user/hive/warehouse
In Hive, load the HDFS data into the table:
LOAD DATA INPATH '/user/hive/warehouse/bbb.txt' INTO TABLE testb PARTITION(create_time='2015-07-08');
Run a select query:
select * from testb;
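Since the table is partitioned by create_time, you can also filter on the partition column, which restricts the scan to the partition loaded above:

select * from testb where create_time = '2015-07-08';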