The previous article covered Hive's principles and implementation mechanism. Starting with this post, I will walk through the complete setup of the Hive data warehouse tool.
1. Installing Hive
(1) Download Hive and prepare the environment:
Hive official site: http://hive.apache.org/index.html
Hive download page: http://www.apache.org/dyn/closer.cgi/hive/
Note: before installing Hive, make sure your Hadoop cluster is already up and running. Hive only needs to be installed on the cluster's NameNode; there is no need to install it on the DataNodes.
This article installs apache-hive-2.3.4-bin.tar.gz, downloaded from: http://mirrors.shu.edu.cn/apache/hive/hive-2.3.4/
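If you would rather fetch the tarball on the server itself, a one-line download against the same mirror works (assuming the mirror path above is still live):

# Download directly on the server; URL taken from the mirror link above
wget http://mirrors.shu.edu.cn/apache/hive/hive-2.3.4/apache-hive-2.3.4-bin.tar.gz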
(2) Run the installation
# Upload the package (Alt+p opens the SFTP panel in SecureCRT)
cd ~
put apache-hive-2.3.4-bin.tar.gz
# Extract the downloaded Hive tarball into the user's home directory
tar zxvf apache-hive-2.3.4-bin.tar.gz
(3) Configure Hive
# a. Configure environment variables: edit /etc/profile
# set hive env
export HIVE_HOME=/home/hadoop/apps/apache-hive-2.3.4-bin
export PATH=${HIVE_HOME}/bin:$PATH
# Make the environment variables take effect
source /etc/profile

# Create the hive-site.xml configuration file.
# Before configuring Hive, switch to the account that will operate Hive; mine is hadoop
su - hadoop
cd /home/hadoop/apps/apache-hive-2.3.4-bin/conf
# Create hive-site.xml from the hive-default.xml.template template
cp hive-default.xml.template hive-site.xml
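A quick sanity check that the environment variables took effect; hive --version only prints the release, so it is safe to run before the metastore exists:

# The shell should now resolve the hive launcher from $HIVE_HOME/bin
which hive
hive --version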
(4) Create the HDFS directories Hive needs
Because hive-site.xml contains the following configuration:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/hive</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
the corresponding directories need to be created in HDFS first. The commands are as follows:
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -mkdir -p /user/hive/warehouse
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -chmod -R 777 /user/hive/warehouse
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -mkdir -p /tmp/hive
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -chmod -R 777 /tmp/hive
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -ls /
Found 16 items
drwxr-xr-x   - hadoop supergroup          0 2018-12-30 07:45 /combinefile
drwxr-xr-x   - hadoop supergroup          0 2018-12-24 00:51 /crw
drwxr-xr-x   - hadoop supergroup          0 2018-12-24 00:51 /en
drwxr-xr-x   - hadoop supergroup          0 2018-12-19 07:11 /index
drwxr-xr-x   - hadoop supergroup          0 2018-12-09 06:57 /localwccombineroutput
drwxr-xr-x   - hadoop supergroup          0 2018-12-24 00:51 /loge
drwxr-xr-x   - hadoop supergroup          0 2018-12-23 08:12 /ordergp
drwxr-xr-x   - hadoop supergroup          0 2018-12-19 05:48 /rjoin
drwxr-xr-x   - hadoop supergroup          0 2018-12-23 05:12 /shared
drwx------   - hadoop supergroup          0 2019-01-20 23:34 /tmp
drwxr-xr-x   - hadoop supergroup          0 2019-01-20 23:33 /user
drwxr-xr-x   - hadoop supergroup          0 2018-12-05 08:32 /wccombineroutput
drwxr-xr-x   - hadoop supergroup          0 2018-12-05 08:39 /wccombineroutputs
drwxr-xr-x   - hadoop supergroup          0 2019-01-19 22:38 /webloginput
drwxr-xr-x   - hadoop supergroup          0 2019-01-20 02:13 /weblogout
drwxr-xr-x   - hadoop supergroup          0 2018-12-23 07:46 /weblogwash
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -ls /tmp/
Found 2 items
drwx------   - hadoop supergroup          0 2018-12-05 08:30 /tmp/hadoop-yarn
drwxrwxrwx   - hadoop supergroup          0 2019-01-20 23:34 /tmp/hive
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -ls /user/hive
Found 1 items
drwxrwxrwx   - hadoop supergroup          0 2019-01-20 23:33 /user/hive/warehouse
(5) Configure hive-site.xml
a. Configure Hive's local temporary directory
# Replace ${system:java.io.tmpdir} in hive-site.xml with Hive's local temporary directory; I use
# /home/hadoop/apps/apache-hive-2.3.4-bin/tmp. If the directory does not exist, create it first and grant read/write permission:
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ cd /home/hadoop/apps/apache-hive-2.3.4-bin
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ mkdir tmp/
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ chmod -R 777 tmp/
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ cd conf

# Run the following in vim command mode to perform the substitution:
%s#${system:java.io.tmpdir}#/home/hadoop/apps/apache-hive-2.3.4-bin/tmp#g

# For example, this:
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>${system:java.io.tmpdir}/${system:user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>
# becomes:
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/home/hadoop/apps/apache-hive-2.3.4-bin/tmp/${system:user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>

# Configure the Hive user name:
# Replace ${system:user.name} in hive-site.xml with the account that operates Hive; mine is hadoop.
# Run the following in vim command mode to perform the substitution:
%s#${system:user.name}#hadoop#g

# For example, this:
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/home/hadoop/apps/apache-hive-2.3.4-bin/tmp/${system:user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>
# becomes:
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/home/hadoop/apps/apache-hive-2.3.4-bin/tmp/hadoop</value>
  <description>Local scratch space for Hive jobs</description>
</property>
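After the two substitutions, a quick grep can confirm that no unresolved placeholders remain (a small sketch; the echo only fires when grep finds nothing):

# No output from grep means both placeholders were fully replaced
grep -n 'system:java.io.tmpdir\|system:user.name' hive-site.xml || echo 'all placeholders replaced'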
b. Configure the Hive metastore database
| Property | Description |
| --- | --- |
| javax.jdo.option.ConnectionDriverName | Driver class name for the metastore database |
| javax.jdo.option.ConnectionURL | JDBC connection URL for the metastore database |
| javax.jdo.option.ConnectionUserName | Username used to connect to the database |
| javax.jdo.option.ConnectionPassword | Password used to connect to the database |
By default Hive stores its metadata in an embedded Derby database; the stock configuration looks like this:
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  <description>
    JDBC connect string for a JDBC metastore.
    To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
    For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
  </description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>APP</value>
  <description>Username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>mine</value>
  <description>password to use against metastore database</description>
</property>
To switch the metastore from Derby to MySQL, only these four properties need to change, for example:
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
  <description>
    JDBC connect string for a JDBC metastore.
    To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
    For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
  </description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>Username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
When setting javax.jdo.option.ConnectionURL, keep useSSL=false: it suppresses MySQL's SSL connection warnings, and without it Hive's metastore initialization against MySQL may fail. Note that inside the XML value the ampersand must be escaped as &amp;.
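If schematool later fails to connect, first make sure the account from the config can actually reach MySQL and owns the hive database; a minimal sketch, assuming MySQL 5.x and the root/123456 credentials configured above (adjust to your setup):

-- Run inside the mysql client as an administrator (MySQL 5.x syntax)
GRANT ALL PRIVILEGES ON hive.* TO 'root'@'localhost' IDENTIFIED BY '123456';
FLUSH PRIVILEGES;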
In addition, the MySQL JDBC driver jar must be copied into Hive's lib directory:
# The driver class configured above is com.mysql.jdbc.Driver
cp ~/mysql-connector-java-5.1.28.jar $HIVE_HOME/lib/
c. Configure hive-env.sh
[hadoop@centos-aaron-h1 conf]$ cd ~/apps/apache-hive-2.3.4-bin/conf
[hadoop@centos-aaron-h1 conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@centos-aaron-h1 conf]$ vi hive-env.sh
# Append the following lines:
export HADOOP_HOME=/home/hadoop/apps/hadoop-2.9.1
export HIVE_CONF_DIR=/home/hadoop/apps/apache-hive-2.3.4-bin/conf
export HIVE_AUX_JARS_PATH=/home/hadoop/apps/apache-hive-2.3.4-bin/lib
(6) Initialize and test Hive
[hadoop@centos-aaron-h1 apache-hive-2.3.4-bin]$ cd bin
[hadoop@centos-aaron-h1 bin]$ schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apps/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://192.168.29.131:3306/hive?createDatabaseIfNotExist=true&useSSL=false
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
[hadoop@centos-aaron-h1 bin]$
Once the initialization completes, the MySQL database contains the metadata tables Hive uses to store its metadata.
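You can double-check from the mysql client; the database name hive comes from the JDBC URL configured above:

mysql> use hive;
mysql> show tables;
-- expect several dozen tables, including DBS, TBLS, COLUMNS_V2 and VERSION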
(7) Start Hive
[hadoop@centos-aaron-h1 bin]$ ./hive
which: no hbase in (/home/hadoop/apps/apache-hive-2.3.4-bin/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/jdk1.7.0_45/bin:/home/hadoop/apps/hadoop-2.9.1/bin:/home/hadoop/apps/hadoop-2.9.1/sbin:/home/hadoop/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apps/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in file:/home/hadoop/apps/apache-hive-2.3.4-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive>
After entering commands I found that the backspace key did not work in the hive shell. A web search turned up the following SecureCRT fix:
Terminal > Emulation: change Terminal(T) to Linux
List the Hive databases:
hive> show databases;
OK
default
Time taken: 3.88 seconds, Fetched: 1 row(s)
hive>
hive操做建庫建表:
hive> create database wcc_log;
OK
Time taken: 0.234 seconds
hive> use wcc_log;
OK
Time taken: 0.019 seconds
hive> create table test_log(id int,name string);
OK
Time taken: 0.715 seconds
hive> show tables;
OK
test_log
Time taken: 0.026 seconds, Fetched: 1 row(s)
hive> select * from test_log;
OK
Time taken: 1.814 seconds
hive>
We can inspect the database and table we just created in HDFS, as shown below.
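For example, a recursive listing of the warehouse directory (path as configured earlier) should show the new database and table directories:

# wcc_log.db and its test_log subdirectory are created by the DDL above
hdfs dfs -ls -R /user/hive/warehouse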
We can also look at Hive's metadata in MySQL (three metastore tables store the warehouse's databases, tables, and columns, respectively).
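In Hive's metastore schema those are the DBS, TBLS and COLUMNS_V2 tables; a quick way to peek at them (column lists trimmed for readability):

mysql> use hive;
mysql> select DB_ID, NAME, DB_LOCATION_URI from DBS;
mysql> select TBL_ID, DB_ID, TBL_NAME, TBL_TYPE from TBLS;
mysql> select CD_ID, COLUMN_NAME, TYPE_NAME from COLUMNS_V2;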
hive操做狀況表數據:
hive> use wcc_log;
OK
Time taken: 0.179 seconds
hive> truncate table test_log;
OK
Time taken: 0.399 seconds
hive> drop table test_log;
OK
Time taken: 1.195 seconds
hive> show tables;
OK
Time taken: 0.045 seconds
hive>
Create a proper table in Hive, with an explicit row format:
hive> create table t_web_log01(id int,name string)
    > row format delimited
    > fields terminated by ',';
OK
Time taken: 0.602 seconds
Create a file bbb_hive.txt on Linux and upload it to the HDFS directory /user/hive/warehouse/wcc_log.db/t_web_log01:
[hadoop@centos-aaron-h1 ~]$ cat bbb_hive.txt
1,張三
2,李四
3,王二
4,麻子
5,隔壁老王
[hadoop@centos-aaron-h1 ~]$ hdfs dfs -put bbb_hive.txt /user/hive/warehouse/wcc_log.db/t_web_log01
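Equivalently, you can let Hive move the file itself instead of calling hdfs dfs -put; a sketch assuming bbb_hive.txt sits in hadoop's home directory:

-- Run in the hive shell; LOAD DATA LOCAL copies the local file into the table's warehouse directory
LOAD DATA LOCAL INPATH '/home/hadoop/bbb_hive.txt' INTO TABLE wcc_log.t_web_log01;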
Query the table's data in Hive:
[hadoop@centos-aaron-h1 bin]$ ./hive
which: no hbase in (/home/hadoop/apps/apache-hive-2.3.4-bin/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/jdk1.7.0_45/bin:/home/hadoop/apps/hadoop-2.9.1/bin:/home/hadoop/apps/hadoop-2.9.1/sbin:/home/hadoop/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apps/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/apps/hadoop-2.9.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in file:/home/hadoop/apps/apache-hive-2.3.4-bin/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
hive> show databases;
OK
default
wcc_log
Time taken: 3.859 seconds, Fetched: 2 row(s)
hive> use wcc_log
    > ;
OK
Time taken: 0.024 seconds
hive> show tables;
OK
t_web_log01
Time taken: 0.027 seconds, Fetched: 1 row(s)
hive> select * from t_web_log01;
OK
1	張三
2	李四
3	王二
4	麻子
5	隔壁老王
Time taken: 1.353 seconds, Fetched: 5 row(s)
hive>
[Unresolved issue] Error log from running an aggregation query: select count(*) from t_web_log01;
hive> select count(id) from t_web_log01;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
Query ID = hadoop_20190121041134_962cf495-4474-4c91-98ea-8a96bc548b20
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:81)
    at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getProxy(HadoopYarnProtoRPC.java:48)
    at org.apache.hadoop.mapred.ClientCache$1.run(ClientCache.java:95)
    at org.apache.hadoop.mapred.ClientCache$1.run(ClientCache.java:92)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:356)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hadoop.mapred.ClientCache.instantiateHistoryProxy(ClientCache.java:92)
    at org.apache.hadoop.mapred.ClientCache.getInitializedHSProxy(ClientCache.java:77)
    at org.apache.hadoop.mapred.YARNRunner.addHistoryToken(YARNRunner.java:219)
    at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:253)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:411)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:151)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2183)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1839)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1526)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:78)
    ... 45 more
Caused by: java.lang.OutOfMemoryError: PermGen space
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2531)
    at java.lang.Class.privateGetPublicMethods(Class.java:2651)
    at java.lang.Class.privateGetPublicMethods(Class.java:2661)
    at java.lang.Class.privateGetPublicMethods(Class.java:2661)
    at java.lang.Class.privateGetPublicMethods(Class.java:2661)
    at java.lang.Class.getMethods(Class.java:1467)
    at sun.misc.ProxyGenerator.generateClassFile(ProxyGenerator.java:426)
    at sun.misc.ProxyGenerator.generateProxyClass(ProxyGenerator.java:323)
    at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:636)
    at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:722)
    at org.apache.hadoop.ipc.ProtobufRpcEngine.getProxy(ProtobufRpcEngine.java:101)
    at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:583)
    at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:549)
    at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:496)
    at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:461)
    at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:647)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:604)
    at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.HSClientProtocolPBClientImpl.<init>(HSClientProtocolPBClientImpl.java:38)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
Job Submission failed with exception 'org.apache.hadoop.yarn.exceptions.YarnRuntimeException(java.lang.reflect.InvocationTargetException)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.reflect.InvocationTargetException
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at org.apache.hadoop.hive.common.FileUtils.deleteDirectory(FileUtils.java:778)
    at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1560)
    at org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:762)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Exception in thread "Thread-1" java.lang.OutOfMemoryError: PermGen space
hive> set mapred.reduce.tasks = 1;
hive> select count(1) from t_web_log01;
OK
0
Time taken: 1.962 seconds, Fetched: 1 row(s)
hive> select count(id) from t_web_log01;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases.
Query ID = hadoop_20190121041942_3f55a0ac-c478-43f5-abf3-0a7aace5c334
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. PermGen space
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at org.apache.hadoop.hive.common.FileUtils.deleteDirectory(FileUtils.java:778)
    at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1560)
    at org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:762)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Cause: the traces all end in java.lang.OutOfMemoryError: PermGen space, and Hive 2.3.4 itself recommends moving MapReduce jobs to another engine (Apache Tez or Apache Spark). In the next article I will rebuild the environment with Hive 1.2.2 for you to follow along!
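Incidentally, since this box runs jdk1.7.0_45, the PermGen errors point at the permanent generation being too small for the client JVM. Before downgrading Hive, it may be worth trying a larger PermGen; an untested sketch:

# Untested workaround sketch: enlarge PermGen for the Hive CLI JVM (JDK 7 only; the flag was removed in JDK 8)
export HADOOP_CLIENT_OPTS="-XX:MaxPermSize=256m $HADOOP_CLIENT_OPTS"
hive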
Final words: that is all for this post. If you found it helpful, please give it a thumbs-up; if you are interested in my other big-data articles or in me, please follow this blog, and feel free to reach out to me any time.