Big Data Tutorial (11.6): Setting Up the Data Warehouse Tool Hive 2.3.4 on Hadoop 2.9.1

     The previous article covered Hive's principles and implementation mechanism. Starting with this post, I will share the full process of setting up the Hive data warehouse tool.

1. Installing Hive
      (1) Download Hive and prepare the environment:

      Hive official site: http://hive.apache.org/index.html

      Hive downloads: http://www.apache.org/dyn/closer.cgi/hive/

      Note: before installing Hive, make sure your Hadoop cluster is already up and running. Hive only needs to be installed on the cluster's NameNode; there is no need to install it on the DataNodes.
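
      A quick way to confirm the cluster really is healthy before proceeding (a minimal sketch; the daemon names below are what a typical Hadoop 2.x deployment shows):

# On the NameNode, list the running Java daemons
jps
# Expect NameNode, SecondaryNameNode and ResourceManager here
# (DataNode/NodeManager run on the worker nodes)
# Check overall HDFS health and the number of live DataNodes
hdfs dfsadmin -report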

      This article installs apache-hive-2.3.4-bin.tar.gz, downloaded from: http://mirrors.shu.edu.cn/apache/hive/hive-2.3.4/

      (2) Run the installation

# Upload the package (Alt+p opens SecureCRT's SFTP session)
Alt+p
cd ~
put apache-hive-2.3.4-bin.tar.gz
# Extract the downloaded Hive tarball into the user's home directory
tar zxvf apache-hive-2.3.4-bin.tar.gz

      (3) Configure Hive

# a. Configure environment variables: edit /etc/profile
   #set hive env
   export HIVE_HOME=/home/hadoop/apps/apache-hive-2.3.4-bin
   export PATH=${HIVE_HOME}/bin:$PATH
   # Make the environment variables take effect
   source /etc/profile

# Create the hive-site.xml configuration file
   # Before configuring Hive, run the following to switch to the account that operates Hive; mine is hadoop
   su - hadoop
   cd /home/hadoop/apps/apache-hive-2.3.4-bin/conf
   # Create hive-site.xml using hive-default.xml.template as the template
   cp hive-default.xml.template hive-site.xml
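
# Since /etc/profile now puts ${HIVE_HOME}/bin on the PATH, a quick sanity
# check can be run before any further configuration (a minimal sketch;
# printing the version does not need the metastore yet):
hive --version
# Should report Hive 2.3.4 along with build details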

      (4) Create the directories Hive needs in HDFS

             Because hive-site.xml contains the following configuration:

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>

             the corresponding directories need to be created in HDFS ahead of time. The commands are as follows:

[hadoop@centos-aaron-h1 ~]$ hdfs dfs -mkdir -p /user/hive/warehouse
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -chmod -R 777 /user/hive/warehouse
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -mkdir -p /tmp/hive
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -chmod -R 777 /tmp/hive
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -ls /
Found 16 items
drwxr-xr-x   - hadoop supergroup          0 2018-12-30 07:45 /combinefile
drwxr-xr-x   - hadoop supergroup          0 2018-12-24 00:51 /crw
drwxr-xr-x   - hadoop supergroup          0 2018-12-24 00:51 /en
drwxr-xr-x   - hadoop supergroup          0 2018-12-19 07:11 /index
drwxr-xr-x   - hadoop supergroup          0 2018-12-09 06:57 /localwccombineroutput
drwxr-xr-x   - hadoop supergroup          0 2018-12-24 00:51 /loge
drwxr-xr-x   - hadoop supergroup          0 2018-12-23 08:12 /ordergp
drwxr-xr-x   - hadoop supergroup          0 2018-12-19 05:48 /rjoin
drwxr-xr-x   - hadoop supergroup          0 2018-12-23 05:12 /shared
drwx------   - hadoop supergroup          0 2019-01-20 23:34 /tmp
drwxr-xr-x   - hadoop supergroup          0 2019-01-20 23:33 /user
drwxr-xr-x   - hadoop supergroup          0 2018-12-05 08:32 /wccombineroutput
drwxr-xr-x   - hadoop supergroup          0 2018-12-05 08:39 /wccombineroutputs
drwxr-xr-x   - hadoop supergroup          0 2019-01-19 22:38 /webloginput
drwxr-xr-x   - hadoop supergroup          0 2019-01-20 02:13 /weblogout
drwxr-xr-x   - hadoop supergroup          0 2018-12-23 07:46 /weblogwash
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -ls /tmp/
Found 2 items
drwx------   - hadoop supergroup          0 2018-12-05 08:30 /tmp/hadoop-yarn
drwxrwxrwx   - hadoop supergroup          0 2019-01-20 23:34 /tmp/hive
[hadoop@centos-aaron-h1 ~]$  hdfs dfs -ls /user/hive
Found 1 items
drwxrwxrwx   - hadoop supergroup          0 2019-01-20 23:33 /user/hive/warehouse

      (5) Configure hive-site.xml

             a. Configure Hive's local temporary directory

# Replace every ${system:java.io.tmpdir} in hive-site.xml with Hive's local temporary directory;
# I use /home/hadoop/apps/apache-hive-2.3.4-bin/tmp. If the directory does not exist, create it
# first and grant read/write permissions:
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ cd /home/hadoop/apps/apache-hive-2.3.4-bin
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ mkdir tmp/
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$  chmod -R 777 tmp/
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ cd conf

# In vim command mode, run the following command to perform the replacement:
 %s#${system:java.io.tmpdir}#/home/hadoop/apps/apache-hive-2.3.4-bin/tmp#g
# For example, this changes
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>${system:java.io.tmpdir}/${system:user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>
# into
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/hadoop/apps/apache-hive-2.3.4-bin/tmp/${system:user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>

# Configure the Hive user name
# Replace every ${system:user.name} in hive-site.xml with the user name of the account that
# operates Hive; mine is hadoop. In vim command mode, run the following command:
 %s#${system:user.name}#hadoop#g
# For example, this changes
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/hadoop/apps/apache-hive-2.3.4-bin/tmp/${system:user.name}</value>
    <description>Local scratch space for Hive jobs</description>
</property>
# into
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/hadoop/apps/apache-hive-2.3.4-bin/tmp/hadoop</value>
    <description>Local scratch space for Hive jobs</description>
</property>
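
If you prefer not to do the replacements in vim, the same two substitutions can be done with sed. A minimal sketch, assuming GNU sed (-i edits the file in place, so back it up first):

# Back up hive-site.xml, then replace both placeholders in place
cd /home/hadoop/apps/apache-hive-2.3.4-bin/conf
cp hive-site.xml hive-site.xml.bak
sed -i 's#${system:java.io.tmpdir}#/home/hadoop/apps/apache-hive-2.3.4-bin/tmp#g' hive-site.xml
sed -i 's#${system:user.name}#hadoop#g' hive-site.xml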

             b. Modify Hive's metastore database configuration

Property name                            Description
javax.jdo.option.ConnectionDriverName    Driver class name for the metastore database
javax.jdo.option.ConnectionURL           JDBC connection URL for the metastore database
javax.jdo.option.ConnectionUserName      User name for connecting to the database
javax.jdo.option.ConnectionPassword      Password for connecting to the database

By default, Hive stores its metadata in an embedded Derby database, configured as follows:

<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>APP</value>
    <description>Username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>mine</value>
    <description>password to use against metastore database</description>
  </property>

To switch the metastore from Derby to MySQL, only these four properties need to change, for example:

<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database</description>
</property>

When configuring javax.jdo.option.ConnectionURL, add useSSL=false to suppress the MySQL SSL connection warning; leaving it out may also cause Hive's initialization of the MySQL metastore to fail.

In addition, the MySQL driver jar needs to be copied into Hive's lib directory:

# The driver class name configured above is com.mysql.jdbc.Driver

cp ~/mysql-connector-java-5.1.28.jar  $HIVE_HOME/lib/
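
A quick check that the driver jar actually landed on Hive's classpath (a minimal sketch):

# The connector jar should now appear in Hive's lib directory
ls $HIVE_HOME/lib | grep mysql-connector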

             c. Configure hive-env.sh

[hadoop@centos-aaron-h1 conf]$ cd ~/apps/apache-hive-2.3.4-bin/conf
[hadoop@centos-aaron-h1 conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@centos-aaron-h1 conf]$ vi hive-env.sh

# Append the following lines
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.9.1
export HIVE_CONF_DIR=/home/hadoop/apps/apache-hive-2.3.4-bin/conf
export HIVE_AUX_JARS_PATH=/home/hadoop/apps/apache-hive-2.3.4-bin/lib

      (6) Initialize, start, and test Hive

[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ cd bin
[hadoop@centos-aaron-h1 bin]$ schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apps/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://192.168.29.131:3306/hive?createDatabaseIfNotExist=true&useSSL=false
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
[hadoop@centos-aaron-h1 bin]$
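
To double-check the result, schematool can also report the schema version it finds in the metastore (a minimal sketch; the exact output depends on your environment):

# Ask schematool which schema version the metastore now carries
schematool -dbType mysql -info
# On success this should report metastore schema version 2.3.0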

After the database initialization completes, a set of metadata tables for storing Hive's metadata is generated in the MySQL database.

      (7) Start Hive

[hadoop@centos-aaron-h1 bin]$ ./hive
which: no hbase in (/home/hadoop/apps/apache-hive-2.3.4-bin/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/jdk1.7.0_45/bin:/home/hadoop/apps/hadoop-2.9.1/bin:/home/hadoop/apps/hadoop-2.9.1/sbin:/home/hadoop/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apps/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/home/hadoop/apps/apache-hive-2.3.4-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive>

After entering a command, I found the backspace key did not work in the Hive shell. (A web search showed that the following SecureCRT change fixes it.)

Terminal > Emulation
Change Terminal to Linux

List the Hive databases:

hive> show databases;
OK
default
Time taken: 3.88 seconds, Fetched: 1 row(s)
hive>

hive操做建庫建表:

hive> create database wcc_log;
OK
Time taken: 0.234 seconds
hive> use wcc_log;
OK
Time taken: 0.019 seconds
hive> create table test_log(id int,name string);
OK
Time taken: 0.715 seconds
hive> show tables;
OK
test_log
Time taken: 0.026 seconds, Fetched: 1 row(s)
hive> select * from test_log;
OK
Time taken: 1.814 seconds
hive>

We can view the newly created database and table in HDFS.
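
For example (a minimal sketch; the paths follow the hive.metastore.warehouse.dir configured earlier):

# Each database appears as a .db directory under the warehouse root
hdfs dfs -ls /user/hive/warehouse
# Each table is a subdirectory inside its database directory
hdfs dfs -ls /user/hive/warehouse/wcc_log.db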

View Hive's metadata in MySQL. Three of the metastore tables store, respectively, the databases, tables, and columns of the Hive warehouse.

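A hedged look at those tables from the MySQL side (a minimal sketch; DBS, TBLS and COLUMNS_V2 are the standard metastore table names, assuming the metastore database is named hive as configured earlier):

-- Run in the MySQL client against the metastore database
USE hive;
SHOW TABLES;                                          -- all metastore tables
SELECT DB_ID, NAME, DB_LOCATION_URI FROM DBS;         -- Hive databases
SELECT TBL_ID, DB_ID, TBL_NAME, TBL_TYPE FROM TBLS;   -- Hive tables
SELECT * FROM COLUMNS_V2 LIMIT 10;                    -- Hive columns
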
hive操做狀況表數據:

hive> use wcc_log;
OK
Time taken: 0.179 seconds
hive> truncate table test_log;
OK
Time taken: 0.399 seconds
hive> drop table test_log;
OK
Time taken: 1.195 seconds
hive> show tables;
OK
Time taken: 0.045 seconds
hive>

Create a proper table in Hive with a delimited row format:

hive> create table t_web_log01(id int,name string)
    > row format delimited
    > fields terminated by ',';
OK
Time taken: 0.602 seconds

Create a file bbb_hive.txt on Linux and upload it to the hdfs:/user/hive/warehouse/wcc_log.db/t_web_log01 directory:

[hadoop@centos-aaron-h1 ~]$ cat bbb_hive.txt 
1,張三
2,李四
3,王二
4,麻子
5,隔壁老王
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -put bbb_hive.txt /user/hive/warehouse/wcc_log.db/t_web_log01
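
Equivalently, the file can be loaded from inside the Hive shell instead of with hdfs dfs -put (a minimal sketch using Hive's standard LOAD DATA syntax; the local path is the hadoop user's home directory from above):

-- Copies the local file into the table's warehouse directory
LOAD DATA LOCAL INPATH '/home/hadoop/bbb_hive.txt' INTO TABLE wcc_log.t_web_log01;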

View the table's data in Hive:

[hadoop@centos-aaron-h1 bin]$ ./hive
which: no hbase in (/home/hadoop/apps/apache-hive-2.3.4-bin/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/jdk1.7.0_45/bin:/home/hadoop/apps/hadoop-2.9.1/bin:/home/hadoop/apps/hadoop-2.9.1/sbin:/home/hadoop/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apps/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in file:/home/hadoop/apps/apache-hive-2.3.4-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive> show databases;
OK
default
wcc_log
Time taken: 3.859 seconds, Fetched: 2 row(s)
hive> use wcc_log
    > ;
OK
Time taken: 0.024 seconds
hive> show tables;
OK
t_web_log01
Time taken: 0.027 seconds, Fetched: 1 row(s)
hive> select * from t_web_log01;
OK
1       張三
2       李四
3       王二
4       麻子
5       隔壁老王
Time taken: 1.353 seconds, Fetched: 5 row(s)
hive>

[Unresolved issue] Error log recorded when running an aggregate query: select count(*) from t_web_log01;

hive> select count(id) from t_web_log01;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
Query ID = hadoop_20190121041134_962cf495-4474-4c91-98ea-8a96bc548b20
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:81)
        at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getProxy(HadoopYarnProtoRPC.java:48)
        at org.apache.hadoop.mapred.ClientCache$1.run(ClientCache.java:95)
        at org.apache.hadoop.mapred.ClientCache$1.run(ClientCache.java:92)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:356)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.mapred.ClientCache.instantiateHistoryProxy(ClientCache.java:92)
        at org.apache.hadoop.mapred.ClientCache.getInitializedHSProxy(ClientCache.java:77)
        at org.apache.hadoop.mapred.YARNRunner.addHistoryToken(YARNRunner.java:219)
        at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:253)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
        at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:411)
        at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:151)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:78)
        ... 45 more
Caused by: java.lang.OutOfMemoryError: PermGen space
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
        at java.lang.Class.privateGetPublicMethods(Class.java:2651)
        at java.lang.Class.privateGetPublicMethods(Class.java:2661)
        at java.lang.Class.privateGetPublicMethods(Class.java:2661)
        at java.lang.Class.privateGetPublicMethods(Class.java:2661)
        at java.lang.Class.getMethods(Class.java:1467)
        at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:426)
        at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:323)
        at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:636)
        at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:722)
        at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:101)
        at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:583)
        at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:549)
        at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:496)
        at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:461)
        at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:647)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:604)
        at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.HSClientProtocolPBClientImpl.<init>(HSClientProtocolPBClientImpl.java:38)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
Job Submission failed with exception 'org.apache.hadoop.yarn.exceptions.YarnRuntimeException(java.lang.reflect.InvocationTargetException)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.reflect.InvocationTargetException
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at org.apache.hadoop.hive.common.FileUtils.deleteDirectory(FileUtils.java:778)
        at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1560)
        at org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:762)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Exception in thread "Thread-1" java.lang.OutOfMemoryError: PermGen space




hive> set mapred.reduce.tasks = 1; 
hive> select count(1) from t_web_log01;
OK
0
Time taken: 1.962 seconds, Fetched: 1 row(s)
hive> select count(id) from t_web_log01;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
Query ID = hadoop_20190121041942_3f55a0ac-c478-43f5-abf3-0a7aace5c334
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. PermGen space
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at org.apache.hadoop.hive.common.FileUtils.deleteDirectory(FileUtils.java:778)
        at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1560)
        at org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:762)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:153)

       Cause: Hive 2.3.4 already recommends switching to another execution engine (Apache Tez, Apache Spark) to run jobs. In the next article, I will rebuild the environment with Hive 1.2.2 for you instead!
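
       For reference, the immediate failure in the log above is java.lang.OutOfMemoryError: PermGen space, which on JDK 7 can sometimes be worked around by enlarging the permanent generation of the client JVM. A minimal, untested sketch (the 512m value is an assumption), added to hive-env.sh:

# JDK 7 only: raise PermGen for the Hive client JVM (the flag was removed in JDK 8)
export HADOOP_CLIENT_OPTS="-XX:MaxPermSize=512m $HADOOP_CLIENT_OPTS"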

       Final words: that is all for this article. If you think it was worthwhile, please give it a like; if you are interested in my other big data articles or in the author, please follow this blog, and feel free to reach out and exchange ideas anytime.
