Hive Installation and Configuration in Detail


http://www.javashuo.com/article/p-eksivokj-bd.html (reprinted)

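Since this is meant to be a detailed guide, we should not stop at how to install Hive. We start with the basics of Hive below; if you already know them, skip ahead to the installation and configuration section.

What Hive is

Installing and configuring Hive

Testing Hive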


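Hive

  Here is a brief explanation that should help when you configure Hive. Hive is built on top of Hadoop, though there is nothing wrong with setting up only Hive if that is all you want. Put simply, once Hadoop's MapReduce has to be used by DBAs, a problem shows up: not every DBA understands how MapReduce works. If managing data means learning a whole new technology, then in real life the company ends up paying to hire more specialized people.

  Joking aside, Hadoop is promoted as a technology for storing data and computing over it, and anything tied to data falls within the database field, so it is only natural for Hadoop to involve DBAs. On that basis, we need a technology that suits DBAs.

  Hive does exactly this: it uses SQL-like statements (HiveQL) to manage the data stored in Hadoop. Hive belongs to the data warehouse category. So what is the difference between a database and a data warehouse? Briefly: a database focuses on OLTP (online transaction processing), while a data warehouse focuses on OLAP (online analytical processing). In other words, a database such as MySQL is geared toward short, quick operations on data, whereas a data warehouse is geared toward longer-running analysis over large volumes of data.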

Without Hive: user ... -> MapReduce ... -> Hadoop data (you may need to know MapReduce)

With Hive: user ... -> HQL (SQL) -> Hive ... -> MapReduce ... -> Hadoop data (you only need to know SQL)
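To make the "you only need to know SQL" path a bit more concrete, here is a minimal sketch of submitting a query from the shell. The helper name hql, the RUN switch, and the table access_log are made up for illustration; the only real piece is the Hive CLI's -e option, which runs a quoted query string. By default the helper only prints the command, so it is safe to try before Hive is installed.

```shell
# Hypothetical helper: show the hive command for a query; run it only when RUN=1.
hql() {
  if [ "${RUN:-0}" = "1" ]; then
    # Real execution: needs a working Hive installation on the PATH.
    hive -e "$1"
  else
    # Dry run: only print the command that would be executed.
    printf 'hive -e "%s"\n' "$1"
  fi
}
```

For example, hql 'SELECT COUNT(*) FROM access_log;' prints the command, and RUN=1 hql 'SELECT COUNT(*) FROM access_log;' would hand the query to Hive, which turns it into MapReduce work for you.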


Installing and configuring Hive

Installation

Step 1: download Hive from http://mirror.bit.edu.cn/apache/hive/

This guide uses hive-2.1.1 as the example.

Extract Hive into /usr/local:

[root@s100 local]# tar -zxvf apache-hive-2.1.1-bin.tar.gz -C /usr/local/

Rename the extracted directory to hive:

[root@s100 local]# mv apache-hive-2.1.1-bin hive

 

Edit the environment variables in /etc/profile:

[root@s100 local]# vim /etc/profile

 

#hive
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
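If you would rather not edit the file by hand, the same three lines can be appended from a script. This is only a sketch: the function name add_hive_env is made up, and it assumes the variables should be added exactly once, so it skips the append when an export HIVE_HOME= line is already there. Pass a scratch file as the first argument to try it without touching /etc/profile.

```shell
# Append the Hive environment variables to a profile file, but only once.
# $1: profile file (defaults to /etc/profile)
# $2: Hive install directory (defaults to /usr/local/hive)
add_hive_env() {
  profile="${1:-/etc/profile}"
  hive_home="${2:-/usr/local/hive}"
  # Skip the append if HIVE_HOME is already exported in this file.
  if grep -q '^export HIVE_HOME=' "$profile" 2>/dev/null; then
    echo "HIVE_HOME already set in $profile"
    return 0
  fi
  {
    echo '#hive'
    echo "export HIVE_HOME=$hive_home"
    # Single quotes keep $PATH and $HIVE_HOME literal in the profile.
    echo 'export PATH=$PATH:$HIVE_HOME/bin'
  } >> "$profile"
  echo "added Hive variables to $profile"
}
```

Calling add_hive_env twice on the same file leaves a single set of entries, which avoids PATH growing with duplicates every time the setup is rerun.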

Run source /etc/profile to apply the change, then run hive --version:

[root@s100 local]# hive --version

 

 If the Hive version is printed, the installation succeeded!

Configuration

[root@s100 conf]# cd /usr/local/hive/conf/

Edit hive-site.xml:

The file does not exist yet, so copy it from the template:

[root@s100 conf]# cp hive-default.xml.template hive-site.xml
[root@s100 conf]# vim hive-site.xml 

 

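1. Configure hive-site.xml (a standalone hive-site.xml is given after step 5; if anything here is unclear, use that file instead, as it is easier to follow)

The main thing is the MySQL connection information (placed at the very top of the file):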

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
--><configuration>
  <!-- WARNING!!! This file is auto generated for documentation purposes ONLY! -->
  <!-- WARNING!!! Any changes you make to this file will be ignored by Hive.   -->
  <!-- WARNING!!! You must make your changes in hive-site.xml instead.         -->
  <!-- Hive Execution Parameters -->

        <!-- Insert the following properties -->
    <property>
        <!-- User name. These four properties are newly added; remember to delete the original entries for them in the config file! -->
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <!-- Password -->
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
   <property>
        <!-- MySQL connection URL -->
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.1.68:3306/hive</value>
    </property>
    <property>
        <!-- MySQL driver class -->
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
        <!-- End of inserted properties -->


  <property>
    <name>hive.exec.script.wrapper</name>
    <value/>
    <description/>
  </property>
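Because the copied template already defines these four properties further down, it is easy to end up with each of them twice. A quick check like the following can catch that. It is a line-based sketch rather than a real XML parser: it assumes every <name>...</name> element sits on a single line, and the function names are made up.

```shell
# Count how many times a property name appears in a Hive config file.
# $1: config file, $2: full property name
count_hive_prop() {
  # grep -c exits non-zero on a zero count, so "|| true" keeps callers safe.
  grep -c -F "<name>$2</name>" "$1" || true
}

# Report each metastore connection property that is not defined exactly once.
# Returns 0 when all four are defined once, 1 otherwise.
check_metastore_props() {
  status=0
  for prop in ConnectionURL ConnectionDriverName ConnectionUserName ConnectionPassword; do
    n=$(count_hive_prop "$1" "javax.jdo.option.$prop")
    n=${n:-0}
    if [ "$n" -ne 1 ]; then
      echo "javax.jdo.option.$prop is defined $n times"
      status=1
    fi
  done
  return $status
}
```

Running check_metastore_props /usr/local/hive/conf/hive-site.xml after editing should print nothing; any line it does print points at a property whose original template entry still needs deleting (or that was never added).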

2. Copy the MySQL driver jar into hive/lib (it has already been copied here):

[root@s100 lib]# ll mysql-connector-java-5.1.18-bin.jar 
-rw-r--r-- 1 root root 789885 Jan  4 01:43 mysql-connector-java-5.1.18-bin.jar

 

3. Initialize the Hive schema in MySQL (the hive database must be created in MySQL beforehand):

[root@s100 bin]# pwd
/usr/local/hive/bin
[root@s100 bin]# schematool -dbType mysql -initSchema
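Since schematool expects the database to exist already, that step is worth spelling out. The sketch below only builds the SQL statement; the function name create_db_sql is made up, and the database name should match the one at the end of javax.jdo.option.ConnectionURL (hive in the configuration above).

```shell
# Print the SQL that creates the metastore database if it does not exist yet.
# $1: database name, e.g. the last path segment of the JDBC URL
create_db_sql() {
  printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$1"
}
```

One way to use it is to pipe the statement into the mysql client before running schematool, for example create_db_sql hive | mysql -uroot -p, and then run schematool -dbType mysql -initSchema as shown above.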

4. Run the hive command:

[root@localhost hive]# hive

If you land at the Hive prompt, the Hive configuration is complete.

5. Query MySQL (the hive database was created before running schematool -dbType mysql -initSchema!)

[root@localhost ~]# mysql -uroot -p123456
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.1.73 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use hive
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------------------+
| Tables_in_hive            |
+---------------------------+
| AUX_TABLE                 |
| BUCKETING_COLS            |
| CDS                       |
| COLUMNS_V2                |
| COMPACTION_QUEUE          |
| COMPLETED_COMPACTIONS     |

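Note (this is not part of the main procedure; do not configure hive-site.xml a second time)

The hive-site.xml configuration file

One thing has to be said here: what if your schematool -dbType mysql -initSchema does not run successfully? I was stuck on this for a whole day yesterday, and in the end, with help from the mighty Baidu and the official Hive documentation, I simply wrote a hive-site.xml from scratch: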
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <!-- MySQL address (localhost) -->
                <name>javax.jdo.option.ConnectionURL</name>
                <value>jdbc:mysql://localhost:3306/hahive</value>
        </property>

        <property>
                <!-- MySQL driver class -->
                <name>javax.jdo.option.ConnectionDriverName</name>
                <value>com.mysql.jdbc.Driver</value>
        </property>

        <property>
                <!-- User name -->
                <name>javax.jdo.option.ConnectionUserName</name>
                <value>root</value>
        </property>

        <property>
                <!-- Password -->
                <name>javax.jdo.option.ConnectionPassword</name>
                <value>123456</value>
        </property>

        <property>
                <name>hive.metastore.schema.verification</name>
                <value>false</value>
        </property>
</configuration>
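If you set this up on more than one machine, a file like this can also be generated instead of typed. The following is a rough sketch under a few assumptions: the function name write_hive_site is made up, MySQL is assumed to listen on localhost:3306 as in the file above, and the values are written as-is, so a password containing XML special characters such as & or < would need escaping by hand.

```shell
# Write a minimal hive-site.xml with the same five properties as the file above.
# $1: output path, $2: metastore database name, $3: MySQL user, $4: MySQL password
write_hive_site() {
  cat > "$1" <<EOF
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/$2</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>$3</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>$4</value>
  </property>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
</configuration>
EOF
}
```

For the setup in this post the call would be write_hive_site /usr/local/hive/conf/hive-site.xml hahive root 123456. Writing to a scratch path first and comparing it with the hand-written file is a cheap way to check the result before replacing anything.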

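So what was all this setup for? The short test below should give you a feel for it.

Testing Hive:

Note: this demonstration uses the second configuration file, so the metastore database is hahive!

1. First, check what is currently stored in Hadoop's HDFS: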

[root@localhost conf]# hadoop fs -lsr /

2. Enter Hive and create a test database and a test table:

[root@localhost conf]# hive

 Create the database:

hive> create database hive_1;
OK
Time taken: 1.432 seconds

 List the databases:

hive> show databases;
OK
default
hive_1
Time taken: 1.25 seconds, Fetched: 2 row(s)

 The database was created successfully!

3. Check what has changed in HDFS:

A new database, hive_1, has appeared.

So what has changed in our hahive database in MySQL?

mysql> use hahive;
mysql> select * from DBS;
+-------+-----------------------+------------------------------------------------+---------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                                | NAME    | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+------------------------------------------------+---------+------------+------------+
|     1 | Default Hive database | hdfs://localhost/user/hive/warehouse           | default | public     | ROLE       |
|     6 | NULL                  | hdfs://localhost/user/hive/warehouse/hive_1.db | hive_1  | root       | USER       |
+-------+-----------------------+------------------------------------------------+---------+------------+------------+
2 rows in set (0.00 sec)
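The DB_LOCATION_URI column shows the pattern Hive used here: the default database sits at the warehouse root, and hive_1 got a hive_1.db directory beneath it. A small sketch of that mapping follows. The function name hive_db_dir is made up, and the warehouse root /user/hive/warehouse is taken from the output above, so pass a second argument if yours differs.

```shell
# Print the HDFS directory expected for a Hive database.
# $1: database name, $2: warehouse root (defaults to /user/hive/warehouse)
hive_db_dir() {
  warehouse="${2:-/user/hive/warehouse}"
  if [ "$1" = "default" ]; then
    # The default database lives directly at the warehouse root.
    echo "$warehouse"
  else
    # Any other database gets a <name>.db directory under the root.
    echo "$warehouse/$1.db"
  fi
}
```

This makes it easy to look at one database in HDFS rather than listing everything, for example hadoop fs -ls "$(hive_db_dir hive_1)".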

4. Create a table hive_01 in hive_1:

hive> use hive_1;
OK
Time taken: 0.754 seconds
hive> create table hive_01 (id int, name string);
OK
Time taken: 2.447 seconds
hive> show tables;
OK
hive_01 (table created successfully)
Time taken: 0.31 seconds, Fetched: 2 row(s)
hive>

What HDFS looks like now:

And in MySQL:

mysql> select * from TBLS;
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE      | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+
|      6 |  1514286051 |     6 |                0 | root  |         0 |     6 | hive_01  | MANAGED_TABLE | NULL               | NULL               |
+--------+-------------+-------+------------------+-------+-----------+-------+----------+---------------+--------------------+--------------------+
2 rows in set (0.00 sec)

So what does it look like in the web UI?

All in all, Hive really is a lot like MySQL! So I will stop here.
