Tags (space-separated): big-data-platform-construction html
- 1. Installing and configuring Phoenix
- 2. Basic Phoenix operations
- 3. Bulk-loading data into HBase with Phoenix
- 4. Exporting data from HBase to HDFS with Phoenix
Phoenix began as an open-source project at Salesforce. Salesforce's background is enterprise business software, where heavy database work is routine, so producing a database middleware layer was a natural step. Phoenix later became a top-level project of the Apache Foundation. What exactly is Phoenix? In essence, it is an open-source SQL engine, written in Java, that operates on HBase through the JDBC API.
Download address: http://archive.cloudera.com/cloudera-labs/phoenix/parcels/latest/
CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel
CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel.sha1
manifest.json
Serve the parcel from a local httpd repo. Note the rename at the end: Cloudera Manager expects the checksum file to end in .sha.
yum install -y httpd*
service httpd start
chkconfig httpd on
mkdir -p /var/www/html/phoenix
mv CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel* /var/www/html/phoenix/
mv manifest.json /var/www/html/phoenix/
cd /var/www/html/phoenix/
mv CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel.sha1 CLABS_PHOENIX-4.7.0-1.clabs_phoenix1.3.0.p0.000-el7.parcel.sha
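Before pointing Cloudera Manager at the repo, it is worth checking that the .sha file really matches the parcel, since a mismatch makes the parcel download fail. A minimal local sketch of that check (a dummy file stands in for the real parcel):

```shell
# Dummy file standing in for the real CLABS_PHOENIX parcel
printf 'parcel-bytes' > demo.parcel

# Write its SHA-1 into a .sha file (bare hex digest, the form CM reads)
sha1sum demo.parcel | awk '{print $1}' > demo.parcel.sha

# Verify: recompute the digest and compare with the stored one
stored=$(cat demo.parcel.sha)
actual=$(sha1sum demo.parcel | awk '{print $1}')
if [ "$stored" = "$actual" ]; then
  echo "checksum OK"
else
  echo "checksum MISMATCH" >&2
fi
```

Against the real parcel, replace `demo.parcel` with the CLABS_PHOENIX parcel file name.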
cd /opt/cloudera/parcels/CLABS_PHOENIX/bin
Log in to HBase with Phoenix:
./phoenix-sqlline.py
The ZooKeeper quorum must be specified:
./phoenix-sqlline.py node-01.flyfish:2181:/hbase
!table
create table hbase_test (
  s1 varchar not null primary key,
  s2 varchar,
  s3 varchar,
  s4 varchar
);
Log in through the HBase shell interface to inspect the same table:
hbase shell
upsert into hbase_test values('1','testname','testname1','testname2');
upsert into hbase_test values('2','tom','jack','harry');
Delete (deletion is by rowkey):
delete from hbase_test where s1='1';
upsert into hbase_test values('1','hadoop','hive','zookeeper');
upsert into hbase_test values('2','oozie','hue','spark');
Update test. Note that Phoenix has no UPDATE statement; upsert is used instead. Inserting multiple rows likewise requires one upsert statement per row; there is no way to put all the rows after a single "values".
upsert into hbase_test values('1','zhangyy','hive','zookeeper');
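The insert-or-update behavior of upsert can be sketched outside Phoenix. The snippet below models the table as a flat file keyed by rowkey (this is only an illustration of the semantics, not anything Phoenix does internally):

```shell
# Sketch of UPSERT semantics: insert-or-update keyed by the rowkey.
# A flat file stands in for the hbase_test table.
table=hbase_test.txt
: > "$table"

upsert() {  # upsert <rowkey> <comma-separated values>
  grep -v "^$1," "$table" > "$table.tmp" || true  # drop any existing row with this key
  mv "$table.tmp" "$table"
  echo "$1,$2" >> "$table"                        # write the new version of the row
}

upsert 1 "testname,testname1,testname2"   # insert
upsert 1 "zhangyy,hive,zookeeper"         # same rowkey: behaves as an update
upsert 2 "oozie,hue,spark"                # new rowkey: behaves as an insert

grep '^1,' "$table"   # 1,zhangyy,hive,zookeeper
wc -l < "$table"      # 2 rows, not 3: rowkey 1 was overwritten
```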
Prepare the test file to import:
ls -ld ithbase.csv
head -n 1 ithbase.csv
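If no test file is at hand, a small CSV in the same four-column shape can be generated like this (the sample values here are invented, not from the original data set):

```shell
# Generate a tiny 4-column CSV: rowkey, item id, rec start, rec end
# (values are made up for illustration)
cat > ithbase.csv <<'EOF'
1,AAAA,2001-01-01,2001-12-31
2,BBBB,2002-01-01,2002-12-31
3,CCCC,2003-01-01,2003-12-31
EOF

head -n 1 ithbase.csv   # inspect the first row
wc -l < ithbase.csv     # 3 rows
```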
Upload it to HDFS:
su - hdfs
hdfs dfs -mkdir /flyfish
hdfs dfs -put ithbase.csv /flyfish
hdfs dfs -ls /flyfish
Create the target table in Phoenix:
create table ithbase (
  i_item_sk varchar not null primary key,
  i_item_id varchar,
  i_rec_start_varchar varchar,
  i_rec_end_date varchar
);
Run the bulkload command to import the data:
HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol-1.2.0-cdh5.12.1.jar:/opt/cloudera/parcels/CDH/lib/hbase/conf \
hadoop jar /opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-4.7.0-clabs-phoenix1.3.0-client.jar \
  org.apache.phoenix.mapreduce.CsvBulkLoadTool -t ithbase -i /flyfish/ithbase.csv
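Rows whose field count does not match the table tend to cause import errors, so a quick pre-flight check on the CSV can save a failed MapReduce run. A sketch with awk (the malformed sample line here is fabricated to show the check firing):

```shell
# Sample input with one deliberately malformed row (only 3 fields)
printf '1,AAAA,2001-01-01,2001-12-31\n2,BBBB,2002-01-01\n' > sample.csv

# Count rows whose comma-separated field count differs from 4
# (4 = number of columns in the ithbase table)
bad=$(awk -F',' 'NF != 4 {n++} END {print n+0}' sample.csv)
echo "malformed rows: $bad"   # malformed rows: 1
```

For the real import, run the same awk line against /flyfish/ithbase.csv before invoking CsvBulkLoadTool.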
select * from ithbase;
cat export.pig
----
REGISTER /opt/cloudera/parcels/CLABS_PHOENIX/lib/phoenix/phoenix-4.7.0-clabs-phoenix1.3.0-client.jar;
rows = load 'hbase://query/SELECT * FROM ITHBASE' USING org.apache.phoenix.pig.PhoenixHBaseLoader('node-01.flyfish:2181');
STORE rows INTO 'flyfish1' USING PigStorage(',');
----
Run the Pig script:
pig -x mapreduce export.pig
Check the output files on HDFS:
hdfs dfs -ls /user/hdfs/flyfish1
hdfs dfs -cat /user/hdfs/flyfish1/part-m-00000
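Each mapper writes its own part-m-NNNNN file, so a multi-mapper export yields several files. They can be merged into one CSV; the sketch below simulates the output directory locally with dummy part files (on a real cluster, `hdfs dfs -getmerge /user/hdfs/flyfish1 ithbase_export.csv` achieves the same):

```shell
# Simulate the Pig export directory with two mapper outputs (dummy data)
mkdir -p flyfish1
printf '1,AAAA\n2,BBBB\n' > flyfish1/part-m-00000
printf '3,CCCC\n'         > flyfish1/part-m-00001

# Merge all part files into a single CSV
cat flyfish1/part-m-* > ithbase_export.csv
wc -l < ithbase_export.csv   # 3
```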