Hive depends on Hadoop, and mismatched versions cause many problems. The general rule is that Hive 2.x goes with Hadoop 2.x and Hive 3.x goes with Hadoop 3.x.
Even so, various compatibility issues still show up in practice. For the installation itself, see the separate hive installation post; here we look at how to resolve the problems you are likely to hit.
A java.lang.NoClassDefFoundError in the log usually means a jar is missing. Find the corresponding jar (the class name is right there in the log, and the exact version can be checked in the pom dependencies of the matching source release) and drop it into the corresponding lib directory.
A NoSuchMethodError in the log usually means a jar conflict, i.e. multiple versions of the same library on the classpath. Check the jars in Hive's lib directory and in the subdirectories under Hadoop's share directory, and replace the lower version with the higher one, since higher versions are generally backward compatible with the lower ones. If HBase is involved, check HBase's lib directory as well.
When Hive 3.1.1 is used together with Hadoop 3.0.2, the disruptor and guava jars are present in multiple versions; switching to the higher version fixes it.
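When it is not obvious which copy of a class is actually being picked up, a small throwaway program can print the jar it was loaded from. This is only a diagnostic sketch (the class name and the output are examples, not part of any Hive API); run it with the same classpath as the failing service, for example java -cp "$(hadoop classpath)" WhichJar.

public class WhichJar {
    public static void main(String[] args) throws ClassNotFoundException {
        // Pass the class named in the error, e.g. com.google.common.base.Preconditions for guava
        String className = args.length > 0 ? args[0] : "com.google.common.base.Preconditions";
        Class<?> clazz = Class.forName(className);
        java.security.CodeSource source = clazz.getProtectionDomain().getCodeSource();
        System.out.println(source == null ? "loaded from the bootstrap classpath" : source.getLocation());
    }
}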
hive is the command-line interface (CLI, Command Line Interface) for working with Hive.
beeline is the newer command-line client; it connects to HiveServer2 over JDBC.
beeline -u jdbc:hive2://localhost:10000
beeline -u jdbc:hive2://localhost:10000 -n user -p password
beeline
!connect jdbc:hive2://localhost:10000
HiveServer is a server interface that lets remote clients submit queries to Hive and get the results back. The current Thrift RPC based implementation, HiveServer2, is an improved version that supports concurrent clients and authentication.
hive-site.xml
<property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
</property>
<property>
    <name>hive.server2.thrift.bind.host</name>
    <value>127.0.0.1</value>
</property>
<property>
    <name>hive.server2.webui.host</name>
    <value>127.0.0.1</value>
</property>
<property>
    <name>hive.server2.webui.port</name>
    <value>10002</value>
</property>
<property>
    <name>hive.server2.enable.doAs</name>
    <value>false</value>
</property>
hive.server2.enable.doAs guards against HDFS permission problems: when it is true, HiveServer2 executes statements as the user who submitted them; when it is false, statements run as the admin user that started the HiveServer2 process.
hive --service hiveserver
hive --service hiveserver2
hiveserver2 --hiveconf hive.server2.thrift.port=10000
MetaStoreServer exposes the metadata through a Thrift service, so remote clients can read metadata without connecting to the backing database directly; Spark, for example, talks to this service.
hive --service metastore
<property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.10.7:9083,thrift://192.168.10.8:9083</value>
    <description></description>
</property>
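Besides Spark, any client can reach this service programmatically. Below is a minimal sketch (assuming the metastore service above is running and hive-exec/hive-metastore is on the classpath; the address is just the example from the config) that lists databases through HiveMetaStoreClient:

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class MetaStoreDemo {
    public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        // Point the client at the service started with "hive --service metastore"
        conf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://192.168.10.7:9083");
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        try {
            // Read metadata without touching the backing RDBMS
            for (String db : client.getAllDatabases()) {
                System.out.println(db);
            }
        } finally {
            client.close();
        }
    }
}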
TINYINT,SMALLINT,INT,BIGINT,BOOLEAN,FLOAT,DOUBLE,STRING,BINARY,TIMESTAMP,DECIMAL,CHAR,VARCHAR,DATE
Of these:
TINYINT: 1 byte
SMALLINT: 2 bytes
INT: 4 bytes
BIGINT: 8 bytes
FLOAT: 4 bytes
DOUBLE: 8 bytes
ARRAY,MAP,STRUCT,UNION
ARRAY is an array type, MAP holds key/value pairs, and STRUCT is a composite of other types.
The delimiter between ARRAY and STRUCT elements is ^B (Ctrl+B, octal \002 in CREATE TABLE).
The delimiter between MAP keys and values is ^C (Ctrl+C, octal \003 in CREATE TABLE).
Hive's default column delimiter is ^A (Ctrl+A, octal \001 in CREATE TABLE).
Hive's default row delimiter is \n.
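To make the delimiters concrete, the sketch below builds one raw text row the way Hive would expect it for a hypothetical table user(name string, tags array<string>, scores map<string,int>) that uses the default delimiters:

public class DefaultDelimiters {
    public static void main(String[] args) {
        String fieldSep = "\001"; // ^A between columns
        String itemSep  = "\002"; // ^B between array/struct elements
        String kvSep    = "\003"; // ^C between map keys and values
        // name | tags = [vip, new] | scores = {math: 90, english: 85}
        String row = "allen" + fieldSep
                + "vip" + itemSep + "new" + fieldSep
                + "math" + kvSep + "90" + itemSep + "english" + kvSep + "85"
                + "\n"; // default row delimiter
        System.out.print(row);
    }
}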
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
  [(col_name data_type [COMMENT col_comment], ...)]
  [COMMENT table_comment]
  [PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
  [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
  [ROW FORMAT row_format]
  [STORED AS file_format]
  [LOCATION hdfs_path]
create table goods(id int, name string, amount int)
partitioned by (ctime date)
row format delimited
fields terminated by '\001'
collection items terminated by '\002'
map keys terminated by '\003'
lines terminated by '\n';
Multiple partition columns may be specified; partition columns must not be repeated in the main column list of the table.
Difference between managed (internal) and external tables:
If all processing of the data happens inside Hive, use a managed table; if Hive and other tools need to work on the same dataset, an external table is the better fit.
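For example, here is a minimal sketch of an external table over an existing HDFS directory (the table name and path are made up, and the JDBC URL/user follow the test class shown later in this post); dropping the table afterwards removes only the metadata, the files stay on HDFS:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateExternalTable {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://127.0.0.1:10000/test", "hive", "");
             Statement st = conn.createStatement()) {
            // EXTERNAL + LOCATION: Hive only registers metadata over files that other tools may also read or write
            st.execute("create external table if not exists ext_goods(id int, name string, amount int) "
                    + "row format delimited fields terminated by '\\001' "
                    + "stored as textfile "
                    + "location '/data/ext_goods'");
        }
    }
}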
Difference between partitioned and bucketed tables:
A Hive table can be partitioned on certain columns to make data management finer-grained and speed up some queries. Tables and partitions can in turn be subdivided into buckets, as in the sketch below.
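A minimal sketch of a table that is both partitioned and bucketed (the table name, bucket count and file format are illustrative, and the connection details again follow the test class below): each ctime partition is further hashed into 4 buckets by id.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateBucketedTable {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://127.0.0.1:10000/test", "hive", "");
             Statement st = conn.createStatement()) {
            // PARTITIONED BY splits data into directories; CLUSTERED BY ... INTO n BUCKETS splits each partition into files
            st.execute("create table if not exists goods_bucketed(id int, name string, amount int) "
                    + "partitioned by (ctime string) "
                    + "clustered by (id) into 4 buckets "
                    + "stored as orc");
        }
    }
}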
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

import java.sql.*;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Random;

public class HiveTest {

//    public static final String HIVE_URL = "jdbc:hive://127.0.0.1:10000/default";//hiveserver1
    private static final String HIVE_URL = "jdbc:hive2://127.0.0.1:10000/test";//hiveserver2
    private static String DRIVER_NAME = "org.apache.hive.jdbc.HiveDriver";

    private Connection connection;

    @Before
    public void setUp() throws ClassNotFoundException, SQLException {
        Class.forName(DRIVER_NAME);
        connection = DriverManager.getConnection(HIVE_URL, "hive", "");
    }

    @Test
    public void showDbs() throws SQLException {
        Statement statement = connection.createStatement();
        String sql = "show databases";
        ResultSet rs = statement.executeQuery(sql);
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
    }

    @Test
    public void createDB() throws SQLException {
        //create database dbName
        //create schema dbName
        Statement statement = connection.createStatement();
        String sql = "create schema mytable";
        statement.execute(sql);
    }

    @Test
    public void createTable() throws SQLException {
        String sql = "create table goods(id int,name string,amount int) " +
                "partitioned by (ctime date) " +
                "row format delimited " +
                "fields terminated by '\\001' " +
                "collection items terminated by '\\002' " +
                "map keys terminated by '\\003' " +
                "lines terminated by '\\n'";
        Statement statement = connection.createStatement();
        statement.execute(sql);
    }

    @Test
    public void createNewGoodsTable() throws SQLException {
        String sql = "create table new_goods(id int,name string,amount int) " +
                "partitioned by (ctime string) " +
                "row format delimited " +
                "fields terminated by '\\001' " +
                "collection items terminated by '\\002' " +
                "map keys terminated by '\\003' " +
                "lines terminated by '\\n'";
        Statement statement = connection.createStatement();
        statement.execute(sql);
    }

    @Test
    public void insertGoods() throws SQLException {
        String sql = "insert into goods(id,name,amount,ctime) values(?,?,?,?)";
        PreparedStatement ps = connection.prepareStatement(sql);
        Random random = new Random();
        String[] names = {"allen", "alice", "bob", "tony", "ribon"};
        Calendar instance = Calendar.getInstance();
        for (int i = 0; i < 10; i++) {
            ps.setInt(1, random.nextInt(1000));
            ps.setString(2, names[random.nextInt(names.length)]);
            ps.setInt(3, random.nextInt(100000));
            instance.add(Calendar.DAY_OF_MONTH, random.nextInt(3));
            ps.setDate(4, new Date(instance.getTimeInMillis()));
            ps.executeUpdate();
        }
    }

    @Test
    public void insertDood() throws SQLException {
        String sql = "insert into new_goods(id,name,amount,ctime) values (1,'allen',100,'2019-07-04')";
        Statement statement = connection.createStatement();
        statement.executeUpdate(sql);
        System.out.println("insert done");
        statement.close();
    }

    @Test
    public void selectGood() throws SQLException {
        String sql = "select id,name,amount,ctime from new_goods";
        Statement statement = connection.createStatement();
        ResultSet rs = statement.executeQuery(sql);
        while (rs.next()) {
            System.out.println("id:" + rs.getInt(1));
            System.out.println("name:" + rs.getString(2));
            System.out.println("amount:" + rs.getInt(3));
            System.out.println("ctime:" + rs.getString(4));
        }
    }

    @Test
    public void selectGoods() throws SQLException {
        String sql = "select id,name,amount,ctime from goods";
        Statement statement = connection.createStatement();
        ResultSet rs = statement.executeQuery(sql);
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
        while (rs.next()) {
            System.out.println("id:" + rs.getInt(1));
            System.out.println("name:" + rs.getString(2));
            System.out.println("amount:" + rs.getInt(3));
            System.out.println("ctime:" + sdf.format(rs.getDate(4)));
        }
    }

    @Test
    public void descTable() throws SQLException {
        //desc tableName
        //describe tableName
        String sql = "desc goods";
        Statement statement = connection.createStatement();
        statement.execute(sql);
    }

//    @Test
    public void dropTable() throws SQLException {
        //drop table
        String sql = "drop table goods";
        Statement statement = connection.createStatement();
        statement.execute(sql);
    }

    @Test
    public void showTable() throws SQLException {
        //show tables;
        String sql = "show tables";
        Statement statement = connection.createStatement();
        statement.execute(sql);
    }

    @After
    public void tearDown() throws SQLException {
        connection.close();
    }
}
With hiveserver, the URL is jdbc:hive://127.0.0.1:10000/default (see the commented-out HIVE_URL in the test class above).
With hiveserver2, the URL is jdbc:hive2://127.0.0.1:10000/test.
If you run into an error like the following:
User: xxxx is not allowed to impersonate hive
you can add the following configuration to Hadoop's core-site.xml, where curitis is the user name reported in the error (on Windows this is simply the Windows login name).
<property>
    <name>hadoop.proxyuser.curitis.groups</name>
    <value>*</value>
    <description></description>
</property>
<property>
    <name>hadoop.proxyuser.curitis.hosts</name>
    <value>*</value>
    <description></description>
</property>
If the Windows user name contains a special character such as a dot (.), you will have to rename the user, which is a bit more involved: rename the account first, then enable the Administrator account, log in as Administrator to rename the old user's profile directory, and finally update the registry under:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Profilelist
Find the entry for the old user name there and change it to the new one.
If you run into an error like the following:
AccessControlException Permission denied: user=hive, access=WRITE, inode="/user/hive/warehouse/test.db":admin:supergroup:drwxr-xr-x
you can run:
hadoop fs -chmod -R 777 /user
If you run into an error like the following:
ipc.Client: Retrying connect to server: account.jetbrains.com/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
you can add the following configuration to Hadoop's yarn-site.xml:
<property>
    <name>yarn.resourcemanager.address</name>
    <value>127.0.0.1:8032</value>
</property>
<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>127.0.0.1:8030</value>
</property>
<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>127.0.0.1:8031</value>
</property>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.curitis</groupId>
    <artifactId>hive-learn</artifactId>
    <version>1.0.0</version>

    <properties>
        <spring.version>5.1.3.RELEASE</spring.version>
        <junit.version>4.11</junit.version>
        <hive.version>3.1.1</hive.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-exec</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-common</artifactId>
            <version>${hive.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive</groupId>
            <artifactId>hive-jdbc</artifactId>
            <version>${hive.version}</version>
        </dependency>

        <!--test-->
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-test</artifactId>
            <version>${spring.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>${junit.version}</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>8</source>
                    <target>8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
<?xml version="1.0" encoding="UTF-8"?>
<configuration status="INFO" monitorInterval="600">
    <Properties>
<!--        <property name="LOG_HOME">${sys:user.home}/hive/test</property>-->
        <property name="LOG_HOME">F:/logs/hive/test</property>
        <!--
        Log pattern placeholders:
        %d{yyyy-MM-dd HH:mm:ss, SSS} : time the log event was produced
        %p : log level
        %c : logger name
        %m : log message, i.e. logger.info("message")
        %n : line break
        %C : Java class name
        %L : line number of the logging call
        %M : method name of the logging call
        hostName : local host name
        hostAddress : local IP address
        -->
        <Property name="PATTERN_ONE">%5p [%t] %d{yyyy-MM-dd HH:mm:ss} (%F:%L) %m%n</Property>
        <Property name="PATTERN_TWO">%d{HH:mm:ss.SSS} %-5level %class{36} %L %M - %msg%xEx%n</Property>
    </Properties>
    <appenders>
        <console name="Console" target="SYSTEM_OUT">
            <ThresholdFilter level="INFO" onMatch="ACCEPT" onMismatch="DENY" />
            <PatternLayout pattern="${PATTERN_ONE}" />
        </console>
        <File name="FileLog" fileName="${LOG_HOME}/hive.log" append="false">
            <ThresholdFilter level="DEBUG" onMatch="ACCEPT" onMismatch="DENY"/>
            <PatternLayout pattern="${PATTERN_TWO}"/>
        </File>
        <RollingFile name="RollingFileInfo" fileName="${sys:user.home}/logs/info.log"
                     filePattern="${sys:user.home}/logs/$${date:yyyy-MM}/info-%d{yyyy-MM-dd}-%i.log">
            <ThresholdFilter level="info" onMatch="ACCEPT" onMismatch="DENY"/>
            <PatternLayout pattern="[%d{HH:mm:ss:SSS}] [%p] - %l - %m%n"/>
            <Policies>
                <TimeBasedTriggeringPolicy/>
                <SizeBasedTriggeringPolicy size="100 MB"/>
            </Policies>
        </RollingFile>
        <RollingFile name="RollingFileWarn" fileName="${sys:user.home}/logs/warn.log"
                     filePattern="${sys:user.home}/logs/$${date:yyyy-MM}/warn-%d{yyyy-MM-dd}-%i.log">
            <ThresholdFilter level="warn" onMatch="ACCEPT" onMismatch="DENY"/>
            <PatternLayout pattern="[%d{HH:mm:ss:SSS}] [%p] - %l - %m%n"/>
            <Policies>
                <TimeBasedTriggeringPolicy/>
                <SizeBasedTriggeringPolicy size="100 MB"/>
            </Policies>
            <DefaultRolloverStrategy max="20"/>
        </RollingFile>
        <RollingFile name="RollingFileError" fileName="${sys:user.home}/logs/error.log"
                     filePattern="${sys:user.home}/logs/$${date:yyyy-MM}/error-%d{yyyy-MM-dd}-%i.log">
            <ThresholdFilter level="error" onMatch="ACCEPT" onMismatch="DENY"/>
            <PatternLayout pattern="[%d{HH:mm:ss:SSS}] [%p] - %l - %m%n"/>
            <Policies>
                <TimeBasedTriggeringPolicy/>
                <SizeBasedTriggeringPolicy size="100 MB"/>
            </Policies>
        </RollingFile>
    </appenders>
    <loggers>
        <logger name="org.springframework" level="INFO"></logger>
        <root level="all">
            <appender-ref ref="Console"/>
            <appender-ref ref="FileLog"/>
<!--            <appender-ref ref="RollingFileInfo"/>-->
<!--            <appender-ref ref="RollingFileWarn"/>-->
<!--            <appender-ref ref="RollingFileError"/>-->
        </root>
    </loggers>
</configuration>
The Hive web interface (HWI) is available only before Hive 2.2.0.
hive-site.xml
<property>
    <name>hive.hwi.listen.host</name>
    <value>0.0.0.0</value>
    <description>listen address</description>
</property>
<property>
    <name>hive.hwi.listen.port</name>
    <value>9999</value>
    <description>listen port</description>
</property>
<property>
    <name>hive.hwi.war.file</name>
    <value>${HIVE_HOME}/lib/hive-hwi-2.1.0.war</value>
    <description>path to the war file</description>
</property>
hive --service hwi
localhost:9999/hwi