Connecting to Hive via JDBC

Date: 2019-12-04


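Hive is the foundational data warehouse component of the big data technology stack and the baseline against which similar data warehouse tools are compared. Basic data operations can be scripted through the hive client. To develop an application, however, you need to connect through Hive's JDBC driver. Based on the examples on the Hive wiki, this article explains in detail how to connect to a Hive database over JDBC. The original Hive wiki pages are: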

https://cwiki.apache.org/confluence/display/Hive/HiveClient

https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC

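Hive must first be started as a service. We use the HDP platform, and HDP 2.2 starts Hive in HiveServer2 mode by default. HiveServer2 is a more advanced service mode than HiveServer and provides features that HiveServer lacks, such as concurrency control and security mechanisms. The client code differs slightly depending on which mode the server was started in; see the code for details.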
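In the client code, the difference between the two server modes comes down to the driver class name and the JDBC URL scheme. The following is a minimal sketch that keeps that choice in one place. The class name HiveJdbcSettings and its methods are hypothetical helpers, not part of Hive, and the host, port, and database values are placeholders.

```java
// Hypothetical helper (not part of Hive): maps the server mode to the
// driver class name and JDBC URL scheme used in this article.
final class HiveJdbcSettings {

    private HiveJdbcSettings() {}

    // Driver class name for the given server mode.
    static String driverClass(boolean hiveServer2) {
        return hiveServer2
                ? "org.apache.hive.jdbc.HiveDriver"          // HiveServer2
                : "org.apache.hadoop.hive.jdbc.HiveDriver";  // original HiveServer
    }

    // JDBC URL for the given server mode; host, port and database are placeholders.
    static String url(boolean hiveServer2, String host, int port, String database) {
        String scheme = hiveServer2 ? "jdbc:hive2://" : "jdbc:hive://";
        return scheme + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        System.out.println(driverClass(true));
        System.out.println(url(true, "hostip", 10000, "default"));
    }
}
```

Keeping both values behind one flag makes it harder to pair the HiveServer driver with a HiveServer2 URL by accident.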

Once the service is running, write the code in Eclipse. The code is as follows:

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager; 

public class HiveJdbcClient {

  /* Driver for the original HiveServer */
  //private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
  /* Driver for HiveServer2 */
  private static String driverName = "org.apache.hive.jdbc.HiveDriver";

  public static void main(String[] args) throws SQLException {

    try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      e.printStackTrace();
      System.exit(1);
    }

    /* JDBC URL format for the original HiveServer */
    //Connection con = DriverManager.getConnection("jdbc:hive://hostip:10000/default", "", "");

    /* JDBC URL format for HiveServer2 */
    Connection con = DriverManager.getConnection("jdbc:hive2://hostip:10000/default", "hive", "hive");
    Statement stmt = con.createStatement();
    // Test setting a Hive parameter
    //boolean resHivePropertyTest = stmt
    //        .execute("SET tez.runtime.io.sort.mb = 128");
    
    boolean resHivePropertyTest = stmt
            .execute("set hive.execution.engine=tez");
    System.out.println(resHivePropertyTest);

    String tableName = "testHiveDriverTable";
    // DDL statements produce no result set, so call execute() rather than
    // executeQuery(), as the Hive wiki example does
    stmt.execute("drop table if exists " + tableName);
    stmt.execute("create table " + tableName + " (key int, value string)");

    // show tables
    String sql = "show tables '" + tableName + "'";
    System.out.println("Running: " + sql);
    ResultSet res = stmt.executeQuery(sql);
    if (res.next()) {
      System.out.println(res.getString(1));
    }

    //describe table
    sql = "describe " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1) + "\t" + res.getString(2));
    } 

    // load data into table
    // NOTE: filepath has to be local to the hive server
    // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
    String filepath = "/tmp/a.txt";
    sql = "load data local inpath '" + filepath + "' into table " + tableName;
    System.out.println("Running: " + sql);
    stmt.execute(sql); // LOAD DATA produces no result set

    // select * query
    sql = "select * from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
    }
    
    // regular hive query
    sql = "select count(1) from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1));
    }

    // release JDBC resources
    res.close();
    stmt.close();
    con.close();
  }

}

The JARs described below can either be added to the Eclipse build path or placed on the classpath at launch.

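For the JDBC driver you can use hive-jdbc.jar, in which case the JARs it depends on must also be included, or the jdbc-standalone JAR, in which case those other JARs are not needed. The hadoop-common JAR must be included in either case.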

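Run the program and wait for it to finish with correct results. If an exception occurs, resolve it according to the message. Solutions for several exceptions whose messages are unclear follow: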

1. If hadoop-common-0.23.9.jar is missing from the classpath or build path, the following error occurs:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
    at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:393)
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:187)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:247)
    at HiveJdbcClient.main(HiveJdbcClient.java:28)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
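A missing JAR of this kind can be caught before the connection is attempted. The following is a minimal sketch of such a pre-flight check. ClasspathCheck and its methods are hypothetical names introduced for illustration; the two class names checked in main are the ones that appear in the client code and the stack trace above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight check: reports which required classes cannot be
// loaded from the current classpath.
final class ClasspathCheck {

    private ClasspathCheck() {}

    // Returns true if the named class can be loaded (without initializing it).
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the subset of the given class names that cannot be loaded.
    static List<String> missing(String... classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isPresent(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(
                "org.apache.hive.jdbc.HiveDriver",        // from the hive-jdbc JAR
                "org.apache.hadoop.conf.Configuration");  // from the hadoop-common JAR
        if (missing.isEmpty()) {
            System.out.println("All required classes found.");
        } else {
            System.out.println("Missing from classpath: " + missing);
        }
    }
}
```

Running this with the same classpath as the client shows up front which JAR still has to be added.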

2. The Hive JDBC connection to the server hangs:

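This can happen when the HiveServer JDBC driver is used to connect to HiveServer2. After connecting, the driver requests data from HiveServer2 as its protocol requires, but HiveServer2 returns nothing, so the JDBC driver blocks and never returns.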
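One way to keep such a hang from blocking the application indefinitely is to run the connection attempt on a worker thread and give up after a deadline. The following is a minimal sketch using only java.util.concurrent; it is a generic technique, not something provided by the Hive driver. TimedCall and callWithTimeout are hypothetical names, and the lambda in main is a stand-in for DriverManager.getConnection(...), which would need a live server.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a blocking call on a daemon thread and stops
// waiting for it after the given timeout.
final class TimedCall {

    private TimedCall() {}

    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timed-call");
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; a thread blocked in socket I/O may ignore it.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real connection attempt.
        String result = callWithTimeout(() -> "connected", 1000);
        System.out.println(result);
    }
}
```

A TimeoutException here is a hint to check that the driver class and URL scheme match the server mode.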

3. TezTask fails with return code 1.

Exception in thread "main" java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:296)
    at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392)
    at HiveJdbcClient.main(HiveJdbcClient.java:40)

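Return code 1 indicates that user authentication failed. A user name and password must be supplied when connecting; it may be possible to configure the server so that statements run without user authentication. In a default HDP installation the user name and password are hive and hive.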

4. TezTask fails with return code 2.

TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.IllegalArgumentException: tez.runtime.io.sort.mb 256 should be larger than 0 and should be less than the available task memory (MB):133
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.getInitialMemoryRequirement(ExternalSorter.java:291)
    at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.initialize(OrderedPartitionedKVOutput.java:95)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.call(LogicalIOProcessorRuntimeTask.java:430)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.call(LogicalIOProcessorRuntimeTask.java:409)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1441168955561_1508_2_00 [Map 1] killed/failed due to:null]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1441168955561_1508_2_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1441168955561_1508_2_01 [Reducer 2] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
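The log states the constraint directly: tez.runtime.io.sort.mb (256 here) must be larger than 0 and less than the available task memory (133 MB here). The commented-out SET tez.runtime.io.sort.mb = 128 in the client code is one value that satisfies it. The following is a minimal sketch of that check. TezSortMemory and its methods are hypothetical names, and the available task memory has to be read from the error message or the cluster configuration.

```java
// Hypothetical helper: validates a tez.runtime.io.sort.mb value against the
// constraint quoted in the error message and builds the SET statement.
final class TezSortMemory {

    private TezSortMemory() {}

    // True if 0 < sortMb < availableTaskMemoryMb, as the error message requires.
    static boolean fits(int sortMb, int availableTaskMemoryMb) {
        return sortMb > 0 && sortMb < availableTaskMemoryMb;
    }

    // Builds the SET statement, rejecting values the Tez task would refuse.
    static String setStatement(int sortMb, int availableTaskMemoryMb) {
        if (!fits(sortMb, availableTaskMemoryMb)) {
            throw new IllegalArgumentException(
                    "tez.runtime.io.sort.mb " + sortMb
                    + " must be > 0 and < " + availableTaskMemoryMb);
        }
        return "SET tez.runtime.io.sort.mb = " + sortMb;
    }

    public static void main(String[] args) {
        System.out.println(fits(256, 133));         // the failing value from the log
        System.out.println(setStatement(128, 133)); // the value from the client code
    }
}
```

The resulting string can be passed to stmt.execute(...) before the query runs, in the same way the client code sets hive.execution.engine.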