Kylin depends on the Hadoop big data platform. Before installing and deploying it, confirm that Hadoop, HBase, and Hive are already installed on the platform.
Note: the special binary package is a Kylin snapshot binary built against an HBase 1.1+ environment. Installing it requires HBase 1.1.3 or later, because earlier versions have a known defect in the fuzzy key filter that causes Kylin query results to miss records: HBASE-14269. Also note that this is not an official release (it is rebased onto the latest changes of the KYLIN 1.3.x branch every few weeks) and has not gone through full testing.
Choose the version you need and download it; here we download apache-kylin-1.6.0-bin.tar.gz.
$ tar -zxvf apache-kylin-1.6.0-bin.tar.gz
$ mv apache-kylin-1.6.0 /home/hadoop/cloud/
$ ln -s /home/hadoop/cloud/apache-kylin-1.6.0 /home/hadoop/cloud/kylin
Configure the KYLIN environment variables and a variable named hive_dependency in /etc/profile:
vim /etc/profile
# append
export KYLIN_HOME=/home/hadoop/kylin
export PATH=$PATH:$KYLIN_HOME/bin
export hive_dependency=/home/hadoop/hive/conf:/home/hadoop/hive/lib/*:/home/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-core-2.0.0.jar
Make the configuration take effect:
# source /etc/profile
# su hadoop
$ source /etc/profile
This configuration must also be applied on the nodes master2, slave1, and slave2, because after Kylin hands a job off to MR and the Hadoop cluster dispatches tasks to the slave nodes, those nodes need the Hive dependency information; without it the MR tasks fail with an error like: hcatalogXXX not found.
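A minimal way to push the same /etc/profile settings to the other nodes (a sketch only, assuming passwordless SSH as root and the hostnames listed above; remember to source the file on each node afterwards):

for host in master2 slave1 slave2; do
  # copy the profile with the KYLIN_HOME and hive_dependency entries to each node
  scp /etc/profile root@${host}:/etc/profile
done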
$ vim ~/cloud/kylin/bin/kylin.sh
# explicitly declare KYLIN_HOME
export KYLIN_HOME=/home/hadoop/kylin
# explicitly add the $hive_dependency dependency to HBASE_CLASSPATH_PREFIX
export HBASE_CLASSPATH_PREFIX=${tomcat_root}/bin/bootstrap.jar:${tomcat_root}/bin/tomcat-juli.jar:${tomcat_root}/lib/*:$hive_dependency:$HBASE_CLASSPATH_PREFIX
$ check-env.sh
KYLIN_HOME is set to /home/hadoop/kylin
Go into the conf folder and modify Kylin's configuration file kylin.properties as follows:
$ vim ~/cloud/kylin/conf/kylin.properties
kylin.rest.servers=master:7070
# define the job.jar that Kylin uses for MR jobs and the HBase coprocessor jar, used to improve performance
kylin.job.jar=/home/hadoop/kylin/lib/kylin-job-1.6.0-SNAPSHOT.jar
kylin.coprocessor.local.jar=/home/hadoop/kylin/lib/kylin-coprocessor-1.6.0-SNAPSHOT.jar
Set the replication factor in kylin_hive_conf.xml and kylin_job_conf.xml to 2:
<property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Block replication</description>
</property>
Note: before starting Kylin, first confirm that the following services have been started:
start-all.sh
mr-jobhistory-daemon.sh start historyserver
hive --service metastore &
zkServer.sh start
This needs to be executed on every node, starting the ZooKeeper service on each of them.
start-hbase.sh
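A quick sanity check (illustrative; the process names below are the typical ones for this stack: RunJar is the Hive metastore, QuorumPeerMain is ZooKeeper, HMaster is the HBase master):

for proc in NameNode ResourceManager JobHistoryServer RunJar QuorumPeerMain HMaster; do
  jps | grep -q "$proc" && echo "$proc: OK" || echo "$proc: NOT running"
done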
$ find-hive-dependency.sh
$ find-hbase-dependency.sh
$ kylin.sh start
$ kylin.sh stop
Web access URL: http://192.168.1.10:7070/kylin/login
The default login username/password is ADMIN/KYLIN.
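An optional command-line check (illustrative; the login page should return HTTP 200 once Kylin is up):

$ curl -s -o /dev/null -w "%{http_code}\n" http://192.168.1.10:7070/kylin/login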
Kylin provides an automated script to create a test cube; the script also creates the corresponding Hive tables automatically. Steps to run the sample:
S1: run the ${KYLIN_HOME}/bin/sample.sh script
$ sample.sh
Key output:
KYLIN_HOME is set to /home/hadoop/kylin
Going to create sample tables in hive...
Sample hive tables are created successfully;
Going to create sample cube...
Sample cube is created successfully in project 'learn_kylin';
Restart Kylin server or reload the metadata from web UI to see the change.
S2: check in MySQL (the Hive metastore database) which tables this sample created
select DB_ID,OWNER,SD_ID,TBL_NAME from TBLS;
S3: in the Hive client, check the created tables and the data volume (10,000 rows)
hive> show tables;
OK
kylin_cal_dt
kylin_category_groupings
kylin_sales
Time taken: 1.835 seconds, Fetched: 3 row(s)
hive> select count(*) from kylin_sales;
OK
Time taken: 65.351 seconds, Fetched: 1 row(s)
S4: restart the Kylin server to refresh the cache
$ kylin.sh stop
$ kylin.sh start
S5: visit 192.168.200.165:7070/kylin with the default username/password ADMIN/KYLIN
After entering the console, select the project named learn_kylin.
S6: select the test cube "kylin_sales_cube", click "Action" - "Build", and choose a date after 2014-01-01; this ensures all 10,000 test records are selected.
Choose a build date.
Click Submit and a message appears indicating that the build job was submitted successfully.
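The same build can also be triggered without the web UI through Kylin's REST API (a hedged sketch following the Kylin 1.x documentation; the end time 1420070400000 ms, i.e. 2015-01-01, is only an example chosen to cover all sample records):

$ curl -s -u ADMIN:KYLIN -X PUT -H "Content-Type: application/json" \
    -d '{"startTime": 0, "endTime": 1420070400000, "buildType": "BUILD"}' \
    http://192.168.1.10:7070/kylin/api/cubes/kylin_sales_cube/rebuild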
S7: watch the job's progress in the Monitor console until it reaches 100%.
Job complete.
Switch to the Model console and you will see that the cube's status has become READY, which means it can now serve SQL queries.
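For example, a query can be sent through Kylin's query REST API (a sketch; the column names part_dt and price come from the standard learn_kylin sample and may differ in other setups):

$ curl -s -u ADMIN:KYLIN -X POST -H "Content-Type: application/json" \
    -d '{"sql": "select part_dt, sum(price) as total_sold from kylin_sales group by part_dt", "project": "learn_kylin"}' \
    http://192.168.1.10:7070/kylin/api/query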
During execution, a temporary table is generated in Hive; once the job reaches 100%, this table is deleted automatically.
check-env.sh reports: please make sure user has the privilege to run hbase shell. Check whether the hbase environment variables are configured correctly; the problem was solved after reconfiguring them.
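One way to verify (illustrative, assuming HBASE_HOME is exported in /etc/profile and Kylin runs as the hadoop user):

$ su hadoop
$ echo $HBASE_HOME
$ echo "status" | hbase shell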
Reference: http://www.jianshu.com/p/632b61f73fe8
hadoop-env.sh script problem: Kylin installation error -- /home/hadoop-2.5.1/contrib/capacity-scheduler/.jar (No such file or directory)
WARNING: Failed to process JAR [jar:file:/home/hadoop-2.5.1/contrib/capacity-scheduler/.jar!/] for TLD files
java.io.FileNotFoundException: /home/hadoop-2.5.1/contrib/capacity-scheduler/.jar (No such file or directory)
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:215)
    at java.util.zip.ZipFile.<init>(ZipFile.java:145)
    at java.util.jar.JarFile.<init>(JarFile.java:153)
    at java.util.jar.JarFile.<init>(JarFile.java:90)
    at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
    at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:99)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
    at sun.net.www.protocol.jar.JarURLConnection.getJarFile(JarURLConnection.java:89)
    at org.apache.tomcat.util.scan.FileUrlJar.<init>(FileUrlJar.java:41)
    at org.apache.tomcat.util.scan.JarFactory.newInstance(JarFactory.java:34)
    at org.apache.catalina.startup.TldConfig.tldScanJar(TldConfig.java:485)
    at org.apache.catalina.startup.TldConfig.access$100(TldConfig.java:61)
    at org.apache.catalina.startup.TldConfig$TldJarScannerCallback.scan(TldConfig.java:296)
    at org.apache.tomcat.util.scan.StandardJarScanner.process(StandardJarScanner.java:258)
    at org.apache.tomcat.util.scan.StandardJarScanner.scan(StandardJarScanner.java:220)
    at org.apache.catalina.startup.TldConfig.execute(TldConfig.java:269)
    at org.apache.catalina.startup.TldConfig.lifecycleEvent(TldConfig.java:565)
    at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:117)
    at org.apache.catalina.util.LifecycleBase.fireLifecycleEvent(LifecycleBase.java:90)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5412)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:649)
    at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1081)
    at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1877)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
This is really just a minor bug; a small change to ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh is enough: comment out the following loop.
#for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
#  if [ "$HADOOP_CLASSPATH" ]; then
#    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
#  else
#    export HADOOP_CLASSPATH=$f
#  fi
#done
Kylin's leftover intermediate storage (HDFS files, intermediate Hive tables, and HBase tables from old builds) can be cleaned up with:
kylin.sh org.apache.kylin.storage.hbase.util.StorageCleanupJob --delete true
When testing a Kylin cube, the following error was reported: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
Solution:
1. Configure hdfs-site.xml:
<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
2. Grant 777 permissions to the /user directory on HDFS:
$ hadoop fs -chmod -R 777 /user
2017-02-17 19:51:39 Friday
update1: 2017-05-04 20:10:05 Thursday