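Impala is a query system whose development is led by Cloudera. It provides SQL semantics and can query petabyte-scale data stored in Hadoop's HDFS and HBase. Impala offers faster queries, with claimed performance 3 to 10 times that of Hive. Impala is open source, but it is usually installed through Cloudera Manager or on a CDH distribution. This post covers installing it on an HDP distribution.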
Versions
Impala is strict about the Hadoop version it runs against, so here are the versions used in this installation:
Impala 2.5
HDP 2.2.8.0 (based on Hadoop 2.6)
Installation steps
1. Create impala.repo in /etc/yum.repos.d
[cloudera-cdh5]
# Packages for Cloudera's Distribution for Hadoop, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera's Distribution for Hadoop, Version 5
baseurl=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5/
gpgkey=https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
gpgcheck=1
2. Run the yum command
yum install impala-server impala-catalog impala-state-store impala-shell
3. Repoint Impala's library links at the HDP jars and native libraries
ln -sf /usr/hdp/2.2.8.0-3150/hadoop/hadoop-common.jar /usr/lib/impala/lib/hadoop-common.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop/hadoop-annotations.jar /usr/lib/impala/lib/hadoop-annotations.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop/hadoop-auth.jar /usr/lib/impala/lib/hadoop-auth.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop/hadoop-aws.jar /usr/lib/impala/lib/hadoop-aws.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-hdfs/hadoop-hdfs.jar /usr/lib/impala/lib/hadoop-hdfs.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-mapreduce/hadoop-mapreduce-client-common.jar /usr/lib/impala/lib/hadoop-mapreduce-client-common.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-mapreduce/hadoop-mapreduce-client-core.jar /usr/lib/impala/lib/hadoop-mapreduce-client-core.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-mapreduce/hadoop-mapreduce-client-jobclient.jar /usr/lib/impala/lib/hadoop-mapreduce-client-jobclient.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-mapreduce/hadoop-mapreduce-client-shuffle.jar /usr/lib/impala/lib/hadoop-mapreduce-client-shuffle.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-yarn/hadoop-yarn-api.jar /usr/lib/impala/lib/hadoop-yarn-api.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-yarn/hadoop-yarn-client.jar /usr/lib/impala/lib/hadoop-yarn-client.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-yarn/hadoop-yarn-common.jar /usr/lib/impala/lib/hadoop-yarn-common.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-yarn/hadoop-yarn-server-applicationhistoryservice.jar /usr/lib/impala/lib/hadoop-yarn-server-applicationhistoryservice.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-yarn/hadoop-yarn-server-common.jar /usr/lib/impala/lib/hadoop-yarn-server-common.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-yarn/hadoop-yarn-server-nodemanager.jar /usr/lib/impala/lib/hadoop-yarn-server-nodemanager.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-yarn/hadoop-yarn-server-resourcemanager.jar /usr/lib/impala/lib/hadoop-yarn-server-resourcemanager.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop-yarn/hadoop-yarn-server-web-proxy.jar /usr/lib/impala/lib/hadoop-yarn-server-web-proxy.jar
ln -sf /usr/hdp/2.2.8.0-3150/hadoop/lib/native/libhadoop.so /usr/lib/impala/lib/libhadoop.so
ln -sf /usr/hdp/2.2.8.0-3150/hadoop/lib/native/libhadoop.so.1.0.0 /usr/lib/impala/lib/libhadoop.so.1.0.0
ln -sf /usr/hdp/2.2.8.0-3150/usr/lib/libhdfs.so /usr/lib/impala/lib/libhdfs.so
ln -sf /usr/hdp/2.2.8.0-3150/usr/lib/libhdfs.so.0.0.0 /usr/lib/impala/lib/libhdfs.so.0.0.0
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-ant.jar /usr/lib/impala/lib/hive-ant.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-beeline.jar /usr/lib/impala/lib/hive-beeline.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-common.jar /usr/lib/impala/lib/hive-common.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-exec.jar /usr/lib/impala/lib/hive-exec.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-hbase-handler.jar /usr/lib/impala/lib/hive-hbase-handler.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-metastore.jar /usr/lib/impala/lib/hive-metastore.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-serde.jar /usr/lib/impala/lib/hive-serde.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-service.jar /usr/lib/impala/lib/hive-service.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-shims-common.jar /usr/lib/impala/lib/hive-shims-common.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-shims-scheduler.jar /usr/lib/impala/lib/hive-shims-scheduler.jar
ln -sf /usr/hdp/2.2.8.0-3150/hive/lib/hive-shims.jar /usr/lib/impala/lib/hive-shims.jar
ln -sf /usr/hdp/2.2.8.0-3150/zookeeper/zookeeper.jar /usr/lib/impala/lib/zookeeper.jar
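The long run of ln -sf commands could also be driven from a list, which makes it easier to notice a source file that does not exist on a given HDP build. The following is only a sketch: the function names link_into_impala and link_all are made up for this example, and the list of relative paths is whatever you feed in on standard input.

```shell
# Sketch only: link_into_impala and link_all are hypothetical helper names.

# Link one jar or shared library into the destination directory,
# refusing to create a dangling link when the source is missing.
link_into_impala() {
  local src="$1" dest_dir="$2"
  if [ ! -e "$src" ]; then
    echo "missing: $src" >&2
    return 1
  fi
  ln -sf "$src" "$dest_dir/$(basename "$src")"
}

# Read paths relative to the HDP root from stdin, one per line,
# and link each of them. Returns non-zero if any source was missing.
link_all() {
  local hdp_root="$1" dest_dir="$2" rel failed=0
  while IFS= read -r rel; do
    [ -z "$rel" ] && continue
    link_into_impala "$hdp_root/$rel" "$dest_dir" || failed=1
  done
  return "$failed"
}

# Example call (paths taken from the commands in this step):
#   link_all /usr/hdp/2.2.8.0-3150 /usr/lib/impala/lib <<'EOF'
#   hadoop/hadoop-common.jar
#   hadoop-hdfs/hadoop-hdfs.jar
#   zookeeper/zookeeper.jar
#   EOF
```

Because a missing source is reported instead of silently producing a broken link, a non-zero return is a hint that the HDP build number in the path differs from 2.2.8.0-3150.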
4. Add JAVA_HOME
vi /etc/default/bigtop-utils
Add the line:
export JAVA_HOME=/usr/jdk64/jdk1.7.0_67
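If you would rather script this than open an editor, something like the following could work. It is a sketch under assumptions: set_java_home is a hypothetical helper name, and it only recognizes an uncommented line that starts with export JAVA_HOME=.

```shell
# Sketch only: set_java_home is a hypothetical helper.
# Usage: set_java_home FILE JAVA_HOME_PATH
set_java_home() {
  local file="$1" java_home="$2" tmp rc
  if grep -q '^export JAVA_HOME=' "$file" 2>/dev/null; then
    # An active export line exists: rewrite it in place via a temp file.
    tmp=$(mktemp) || return 1
    sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$file" > "$tmp" \
      && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return "$rc"
  fi
  # No active export line yet: append one.
  printf 'export JAVA_HOME=%s\n' "$java_home" >> "$file"
}

# Example call:
#   set_java_home /etc/default/bigtop-utils /usr/jdk64/jdk1.7.0_67
```

Running it twice leaves a single export line, so it is safe to include in a provisioning script that may be re-run.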
5. Edit the configuration files
Add to /etc/hadoop/conf/core-site.xml:
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.read.shortcircuit.skip.checksum</name>
  <value>false</value>
</property>
<property>
  <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
  <value>true</value>
</property>
Add to /etc/hadoop/conf/hdfs-site.xml:
<property>
  <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>impala</value>
</property>
<property>
  <name>dfs.client.file-block-storage-locations.timeout.millis</name>
  <value>60000</value>
</property>
Copy the configuration files into Impala's configuration directory:
cp /etc/hadoop/conf/*.xml /etc/impala/conf
cp /etc/hive/conf/hive-site.xml /etc/impala/conf
Otherwise impalad fails at startup with errors like these:
E0721 15:52:54.265014 11246 impala-server.cc:247] Unsupported default filesystem. The default filesystem must be a DistributedFileSystem but the configured default filesystem is LocalFileSystem. fs.defaultFS (file:///) might be set incorrectly.
ERROR: block location tracking is not properly enabled because
 - dfs.datanode.hdfs-blocks-metadata.enabled is not enabled.
 - dfs.client.file-block-storage-locations.timeout.millis is too low. It should be at least 10 seconds.
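The three conditions named in these errors can be checked before starting impalad. The sketch below is an assumption-laden pre-flight check, not part of Impala: get_prop and check_impala_conf are hypothetical helpers, the XML is parsed with tr and sed rather than a real XML parser, and it assumes each value element directly follows its name element and that the default filesystem should be an hdfs:// URI.

```shell
# Sketch only: get_prop and check_impala_conf are hypothetical helpers.

# Print the value of a Hadoop-style property from an XML config file.
# Assumes <value> directly follows <name> (whitespace in between is fine).
get_prop() {
  local file="$1" name_re
  # Escape dots so the property name is matched literally.
  name_re=$(printf '%s' "$2" | sed 's/\./\\./g')
  tr -d '\n' < "$file" |
    sed -n "s|.*<name>${name_re}</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Check the settings that the startup errors complain about.
# Returns non-zero if any check fails.
check_impala_conf() {
  local dir="$1" bad=0 fs meta timeout
  fs=$(get_prop "$dir/core-site.xml" fs.defaultFS)
  case "$fs" in
    hdfs://*) ;;
    *) echo "fs.defaultFS is '$fs', expected an hdfs:// URI" >&2; bad=1 ;;
  esac
  meta=$(get_prop "$dir/hdfs-site.xml" dfs.datanode.hdfs-blocks-metadata.enabled)
  [ "$meta" = "true" ] || {
    echo "dfs.datanode.hdfs-blocks-metadata.enabled is not true" >&2; bad=1; }
  timeout=$(get_prop "$dir/hdfs-site.xml" dfs.client.file-block-storage-locations.timeout.millis)
  # The error message asks for at least 10 seconds, i.e. 10000 ms.
  [ "${timeout:-0}" -ge 10000 ] 2>/dev/null || {
    echo "block-storage-locations timeout is '${timeout}', need >= 10000" >&2; bad=1; }
  return "$bad"
}

# Example call:
#   check_impala_conf /etc/impala/conf
```

If the check fails on /etc/impala/conf but passes on /etc/hadoop/conf, the copy step was most likely skipped or ran before the properties were added.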
6. Edit /etc/default/impala
Set the two variables IMPALA_CATALOG_SERVICE_HOST and IMPALA_STATE_STORE_HOST to the host on which you start the catalog and state_store services.
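Since this file has to be edited identically on every node, a small helper can be pushed out with whatever tool you use to run commands across the cluster. This is a sketch: set_impala_hosts is a hypothetical name, and it assumes both variables already appear in the file as plain NAME=value lines.

```shell
# Sketch only: set_impala_hosts is a hypothetical helper.
# Usage: set_impala_hosts FILE HOST
# Assumes HOST is a plain hostname or IP (no '|' or '&' characters).
set_impala_hosts() {
  local file="$1" host="$2" tmp rc
  tmp=$(mktemp) || return 1
  # Rewrite both assignments; every other line passes through unchanged.
  sed -e "s|^IMPALA_CATALOG_SERVICE_HOST=.*|IMPALA_CATALOG_SERVICE_HOST=$host|" \
      -e "s|^IMPALA_STATE_STORE_HOST=.*|IMPALA_STATE_STORE_HOST=$host|" \
      "$file" > "$tmp" && cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Example call (master01.example.com is a placeholder hostname):
#   set_impala_hosts /etc/default/impala master01.example.com
```

Writing through a temporary file and then back with cat keeps the original file's owner and permissions, which a plain mv would not.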
7. Download htrace-core-3.0.4.jar, rename it to htrace-core-2.00.jar, and use it to replace /usr/lib/impala/lib/htrace-core-2.00.jar
Otherwise startup fails with:
E0711 13:55:00.451584 6925 impala-server.cc:247] NoClassDefFoundError: org/htrace/Trace CAUSED BY: ClassNotFoundException: org.htrace.Trace
E0711 13:55:00.451653 6925 impala-server.cc:249] Aborting Impala Server startup due to improper configuration
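The swap itself can be scripted so that the jar shipped with the package is kept around in case you need to roll back. A minimal sketch, assuming the 3.0.4 jar has already been downloaded to a local path; replace_htrace_jar and the .orig backup suffix are choices made for this example only.

```shell
# Sketch only: replace_htrace_jar is a hypothetical helper.
# Usage: replace_htrace_jar PATH_TO_htrace-core-3.0.4.jar IMPALA_LIB_DIR
replace_htrace_jar() {
  local new_jar="$1" lib_dir="$2" target
  target="$lib_dir/htrace-core-2.00.jar"
  if [ ! -f "$new_jar" ]; then
    echo "not found: $new_jar" >&2
    return 1
  fi
  # Keep the packaged jar (or symlink) once, under a name that does not end in .jar.
  if [ -e "$target" ] && [ ! -e "$target.orig" ]; then
    mv "$target" "$target.orig" || return 1
  fi
  # Remove any leftover link first so cp never writes through a symlink.
  rm -f "$target"
  cp "$new_jar" "$target"
}

# Example call (the download location is a placeholder):
#   replace_htrace_jar /tmp/htrace-core-3.0.4.jar /usr/lib/impala/lib
```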
8. Start catalog and state_store on one host
Start impalad on all hosts
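Which services to start therefore depends on the host. The sketch below encodes that rule; services_for_host is a hypothetical helper, and it assumes the init scripts carry the same names as the packages installed in step 2 (impala-state-store, impala-catalog, impala-server). The state store is listed first and impalad last, which is the order I would start them in.

```shell
# Sketch only: services_for_host is a hypothetical helper.
# Prints, one per line and in start order, the services a host should run.
# Usage: services_for_host THIS_HOST MASTER_HOST
services_for_host() {
  local host="$1" master="$2"
  if [ "$host" = "$master" ]; then
    echo impala-state-store
    echo impala-catalog
  fi
  # Every host, including the master, runs impalad.
  echo impala-server
}

# Example loop (master01.example.com is a placeholder hostname):
#   for svc in $(services_for_host "$(hostname -f)" master01.example.com); do
#     service "$svc" start
#   done
```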
Tips:
During installation, get used to checking the logs under /var/log/impala; they show the current error messages.
We have written a working Ambari component that installs Impala; it is available at the link below.
https://github.com/cas-bigdatalab/ambari-impala-service
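As a companion to the log tip above, error lines like the ones quoted in this post can be pulled out of the logs with grep. This is a sketch under assumptions: impala_errors is a hypothetical helper, it relies on the log line layout seen in the quoted errors (a severity letter followed by the month and day), and it assumes the log directory contains files ending in .INFO that hold messages of every severity.

```shell
# Sketch only: impala_errors is a hypothetical helper.
# Prints ERROR (E) and FATAL (F) lines from the *.INFO logs in a directory.
# Usage: impala_errors [LOG_DIR]   (defaults to /var/log/impala)
impala_errors() {
  local log_dir="${1:-/var/log/impala}"
  # Lines look like: E0721 15:52:54.265014 11246 impala-server.cc:247] ...
  grep -h -E '^[EF][0-9]{4} ' "$log_dir"/*.INFO 2>/dev/null
}

# Example call:
#   impala_errors | tail -n 20
```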