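1. Install PXF 3.3.0.0
All of the PXF package files installed here are included in apache-hawq-rpm-2.3.0.0-incubating.tar.gz.
All of the steps below are run as root.
Note that the PXF plugin relies on the Tomcat service and must use the 7.0.62 version shipped in the installation package. Do not install or upgrade to Tomcat 8: the catalina.jar that PXF depends on would then be a mismatched version, and PXF fails to start.
The PXF packages are all el6 builds, but I am running CentOS 7, so rpm is run with the "--nodeps" option to skip RPM's dependency checks.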
cd /opt/gpadmin/hawq_rpm_packages
rpm -ivh apache-tomcat-7.0.62-el6.noarch.rpm
rpm -ivh --nodeps pxf-service-3.3.0.0-1.el6.noarch.rpm
rpm -ivh --nodeps pxf-hdfs-3.3.0.0-1.el6.noarch.rpm
rpm -ivh --nodeps pxf-hive-3.3.0.0-1.el6.noarch.rpm
rpm -ivh --nodeps pxf-hbase-3.3.0.0-1.el6.noarch.rpm
rpm -ivh --nodeps pxf-jdbc-3.3.0.0-1.el6.noarch.rpm
rpm -ivh --nodeps pxf-json-3.3.0.0-1.el6.noarch.rpm
rpm -ivh --nodeps pxf-3.3.0.0-1.el6.noarch.rpm
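The install order above can also be scripted. The following is only a sketch: the helper names run and install_pxf_rpms and the DRY_RUN switch are my own inventions, not part of the HAWQ packages. By default it only prints the rpm commands so they can be reviewed first.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints each command instead of
# running it; set DRY_RUN=0 to really execute. Hypothetical switch.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# install_pxf_rpms DIR: install Tomcat first, then the PXF packages,
# in the same order as the manual steps.
install_pxf_rpms() {
  pkg_dir="$1"
  # Tomcat 7.0.62 is installed without --nodeps, as in the manual steps.
  run rpm -ivh "$pkg_dir/apache-tomcat-7.0.62-el6.noarch.rpm" || return 1
  # The PXF packages are el6 builds, so dependency checks are skipped.
  for name in pxf-service pxf-hdfs pxf-hive pxf-hbase pxf-jdbc pxf-json pxf; do
    run rpm -ivh --nodeps "$pkg_dir/${name}-3.3.0.0-1.el6.noarch.rpm" || return 1
  done
}
```

Calling install_pxf_rpms /opt/gpadmin/hawq_rpm_packages prints the eight commands; running it again with DRY_RUN=0 executes them.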
2. Configure PXF
(1) Because the Hadoop distribution is HDP, use the HDP-specific jar configuration:
[root@ep-bd01 ~]# cp /etc/pxf/conf/pxf-privatehdp.classpath /etc/pxf/conf/pxf-private.classpath
[root@ep-bd01 ~]# cp /etc/pxf/conf/pxf-profiles.xml /etc/pxf/conf/pxf-profiles.default.xml
(2) Change the owner of the PXF directories:
[root@ep-bd01 ~]# chown -R pxf:pxf /opt/pxf-3.3.0.0
[root@ep-bd01 ~]# chown -R pxf:pxf /tmp/logs
(3) Create the symbolic links:
[root@ep-bd01 ~]# ln -s /etc/pxf-3.3.0.0/conf /opt/pxf-3.3.0.0/conf
[root@ep-bd01 ~]# ln -s /usr/lib/pxf /opt/pxf-3.3.0.0/lib
(4) Create the directory and the template file that the init process needs:
[root@ep-bd01 ~]# mkdir /opt/pxf-3.3.0.0/conf-templates
[root@ep-bd01 ~]# cp /opt/pxf/conf/pxf-privatehdp.classpath /opt/pxf/conf-templates/pxf-private-hdp.classpath.template
(5) Rename the pxf-service file to pxf, because the init process needs to create a directory named pxf-service. The link in the init.d directory must be updated as well:
[root@ep-bd01 ~]# mv /opt/pxf/pxf-service /opt/pxf/pxf
[root@ep-bd01 ~]# unlink /etc/init.d/pxf-service
[root@ep-bd01 ~]# ln -s /opt/pxf/pxf /etc/init.d/pxf-service
(6) Change the permissions on the tomcat/conf directory and create a link to Tomcat inside the PXF directory:
[root@ep-bd01 ~]# chmod 755 -R /opt/apache-tomcat/conf/
[root@ep-bd01 ~]# ln -s /opt/apache-tomcat /opt/pxf-3.3.0.0/apache-tomcat
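Before moving on, it can help to confirm that the links, the template directory and the renamed script from steps (3) to (6) are all in place. Here is a minimal sketch; check_pxf_layout is a hypothetical helper name, and the list of entries is simply what the steps above create under the PXF home directory.

```shell
#!/bin/sh
# check_pxf_layout PXF_HOME: report entries that steps (3) to (6) should
# have created. Prints one "missing: ..." line per absent entry and
# returns non-zero if anything is missing.
check_pxf_layout() {
  pxf_home="$1"
  missing=0
  for entry in conf lib conf-templates pxf apache-tomcat; do
    if [ ! -e "$pxf_home/$entry" ]; then
      echo "missing: $pxf_home/$entry"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_pxf_layout /opt/pxf-3.3.0.0 prints nothing when everything exists. Note that -e follows symbolic links, so a link whose target is gone is also reported as missing.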
(7) Edit /etc/pxf/conf/pxf-env.sh and set PARENT_SCRIPT_DIR, PXF_HOME and LD_LIBRARY_PATH:
[root@ep-bd01 ~]# vim /etc/pxf/conf/pxf-env.sh
export PARENT_SCRIPT_DIR=/opt/pxf-3.3.0.0
export PXF_HOME=/opt/pxf-3.3.0.0
export LD_LIBRARY_PATH=/usr/hdp/current/hadoop-client/lib/native:${LD_LIBRARY_PATH}
(8) Edit the pxf script file and set PARENT_SCRIPT_DIR and PXF_HOME:
[root@ep-bd01 ~]# vim /opt/pxf-3.3.0.0/pxf
export PARENT_SCRIPT_DIR=/opt/pxf-3.3.0.0
export PXF_HOME=/opt/pxf-3.3.0.0
(9) Edit /etc/pxf/conf/pxf-public.classpath and add the following jars:
[root@ep-bd01 pxf-3.3.0.0]# vim /etc/pxf-3.3.0.0/conf/pxf-public.classpath
/usr/hdp/current/hadoop-client/lib/commons-beanutils-1.9.3.jar
/usr/hdp/current/hadoop-client/lib/commons-cli-1.2.jar
/usr/hdp/current/hadoop-client/lib/commons-codec-1.11.jar
/usr/hdp/current/hadoop-client/lib/commons-collections-3.2.2.jar
/usr/hdp/current/hadoop-client/lib/commons-compress-1.4.1.jar
/usr/hdp/current/hadoop-client/lib/commons-configuration2-2.1.1.jar
/usr/hdp/current/hadoop-client/lib/commons-io-2.5.jar
/usr/hdp/current/hadoop-client/lib/commons-lang-2.6.jar
/usr/hdp/current/hadoop-client/lib/commons-lang3-3.4.jar
/usr/hdp/current/hadoop-client/lib/commons-logging-1.1.3.jar
/usr/hdp/current/hadoop-client/lib/commons-math3-3.1.1.jar
/usr/hdp/current/hadoop-client/lib/commons-net-3.6.jar
/usr/hdp/current/hadoop-client/lib/jersey-core-1.19.jar
/usr/hdp/current/hadoop-client/lib/jersey-json-1.19.jar
/usr/hdp/current/hadoop-client/lib/jersey-server-1.19.jar
/usr/hdp/current/hadoop-client/lib/jersey-servlet-1.19.jar
/usr/hdp/current/hadoop-client/lib/jsr311-api-1.1.1.jar
/usr/hdp/current/hadoop-client/lib/woodstox-core-5.0.3.jar
/usr/hdp/current/hadoop-client/lib/stax2-api-3.1.4.jar
/usr/hdp/current/hadoop-client/lib/htrace-core4-4.1.0-incubating.jar
/usr/hdp/current/hadoop-client/lib/re2j-1.1.jar
/usr/hdp/3.0.0.0-1634/hbase/lib/atlas-hbase-plugin-impl/commons-configuration-1.10.jar
/usr/hdp/current/hadoop-hdfs-datanode/hadoop-hdfs-rbf.jar
/usr/hdp/current/hadoop-hdfs-datanode/hadoop-hdfs-nfs.jar
/usr/hdp/current/hadoop-hdfs-datanode/hadoop-hdfs-native-client.jar
/usr/hdp/current/hadoop-hdfs-datanode/hadoop-hdfs-httpfs.jar
/usr/hdp/current/hadoop-hdfs-datanode/hadoop-hdfs-client-3.1.0.3.0.0.0-1634.jar
/opt/pxf-3.3.0.0/lib/pxf-service-3.3.0.0.jar
/opt/pxf-3.3.0.0/lib/pxf-api-3.3.0.0.jar
:wq
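The jar versions in this list are specific to my HDP build, so on another cluster some of the paths will probably not exist. The sketch below checks each entry of a classpath file. check_classpath_file is a hypothetical helper, and it assumes every entry is a literal path, one per line, with no wildcards.

```shell
#!/bin/sh
# check_classpath_file FILE: print "not found: PATH" for every listed
# path that does not exist; blank lines and lines starting with '#'
# are skipped. Returns non-zero if any entry is missing.
check_classpath_file() {
  file="$1"
  rc=0
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r entry || [ -n "$entry" ]; do
    case "$entry" in
      ''|'#'*) continue ;;
    esac
    if [ ! -e "$entry" ]; then
      echo "not found: $entry"
      rc=1
    fi
  done < "$file"
  return "$rc"
}
```

Usage: check_classpath_file /etc/pxf-3.3.0.0/conf/pxf-public.classpath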
(10) Copy pxf-profiles.xml to pxf-profiles-default.xml, then edit the copy and add the profile configuration:
[root@ep-bd05 pxf-3.3.0.0]# cp /etc/pxf/conf/pxf-profiles.xml /etc/pxf/conf/pxf-profiles-default.xml
[root@ep-bd05 pxf-3.3.0.0]# vim /etc/pxf/conf/pxf-profiles-default.xml
<profiles>
    <profile>
        <name>HdfsTextSimple</name>
        <description>This profile is suitable for using when reading delimited single line records from plain text files on HDFS </description>
        <plugins>
            <fragmenter>org.apache.hawq.pxf.plugins.hdfs.HdfsDataFragmenter</fragmenter>
            <accessor>org.apache.hawq.pxf.plugins.hdfs.LineBreakAccessor</accessor>
            <resolver>org.apache.hawq.pxf.plugins.hdfs.StringPassResolver</resolver>
        </plugins>
    </profile>
    <profile>
        <name>HdfsTextMulti</name>
        <description>This profile is suitable for using when reading delimited single or multi line records (with quoted linefeeds) from plain text files on HDFS. It is not splittable (non parallel) and slower than HdfsTextSimple. </description>
        <plugins>
            <fragmenter>org.apache.hawq.pxf.plugins.hdfs.HdfsDataFragmenter</fragmenter>
            <accessor>org.apache.hawq.pxf.plugins.hdfs.QuotedLineBreakAccessor</accessor>
            <resolver>org.apache.hawq.pxf.plugins.hdfs.StringPassResolver</resolver>
        </plugins>
    </profile>
</profiles>
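To see which profile names a profiles file actually defines, something like the following can be used. It is a rough sketch, not a real XML parser: list_profiles and has_profile are hypothetical helper names, and the pattern assumes that only profile names appear inside name elements, as in the file above. The match in has_profile ignores case because the test query later in this post refers to the profile as hdfstextsimple.

```shell
#!/bin/sh
# list_profiles FILE: print the text of every <name>...</name> element.
list_profiles() {
  grep -o '<name>[^<]*</name>' "$1" | sed 's/<[^>]*>//g'
}

# has_profile FILE NAME: succeed if NAME is defined in FILE,
# compared as a whole line, as a fixed string, ignoring case.
has_profile() {
  list_profiles "$1" | grep -q -i -x -F -- "$2"
}
```

Usage: list_profiles /etc/pxf/conf/pxf-profiles-default.xml, or has_profile /etc/pxf/conf/pxf-profiles-default.xml hdfstextsimple && echo defined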
3. Initialize PXF (this must be done as the pxf user)
(1) Set a password for the pxf user:
passwd pxf
(2) Run the initialization as the pxf user:
[root@ep-bd03 pxf]# source /etc/pxf/conf/pxf-env.sh
[root@ep-bd03 pxf]# sudo -u pxf service pxf-service init
Generating /opt/pxf-3.3.0.0/conf/pxf-private.classpath file from /opt/pxf-3.3.0.0/conf-templates/pxf-private-hdp.classpath.template ...
cp /opt/pxf/pxf-service/webapps/pxf/WEB-INF/lib/*.jar /opt/pxf/lib/
4. Start the PXF service
(1) Start:
sudo -u pxf service pxf-service start
Checking if tomcat is up and running...
tomcat not responding, re-trying after 1 second (attempt number 1)
Checking if PXF webapp is up and running...
PXF webapp is listening on port 51200
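A quick way to confirm that the webapp really answers is to request an API version it does not support and read the version it reports back. On my cluster a request for /pxf/v0 returns "Wrong version v0, supported version is v15". The sketch below pulls the version out of such a reply. pxf_supported_version is a hypothetical helper, and it assumes the reply has exactly that wording.

```shell
#!/bin/sh
# pxf_supported_version: read a PXF reply on stdin and print the API
# version it advertises (for example v15). Prints nothing if the reply
# does not contain the expected wording.
pxf_supported_version() {
  sed -n 's/.*supported version is \(v[0-9][0-9]*\).*/\1/p'
}

# Usage (assumes curl is installed and PXF listens on port 51200):
#   curl -s http://192.168.58.15:51200/pxf/v0 | pxf_supported_version
```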
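(2) Test:
Use the PXF plugin to read data that was previously imported from Oracle into HDFS. The import used sqoop's --compress option, so the files are gzip-compressed, but HdfsTextSimple can read them directly. Below is the command that creates the HAWQ external table; note the asterisk in the path.
drop external table ext.yx_bw;
create external table ext.yx_bw (occur_time date, ...... )
location ('pxf://192.168.58.15:51200/var/data/ext/yx_bw/*?profile=hdfstextsimple') format 'text'(delimiter ',' null '');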
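**Note**: the host in the location above is the machine's IP address. If the host name is used instead, HAWQ fails to access the data. As far as I can tell the name is not being resolved correctly, and I have not been able to solve this. If anyone knows the cause, please let me know. Thanks in advance! With location ('pxf://bd05:51200/var/data/ext/yx_bw/*?profile=hdfstextsimple') the external table can still be created, but querying it produces the error below with no further detail, and the PXF service log contains no record of the failed access.
epbd=> select * from ext.yx_bw;
ERROR: remote component error (0): (libchurl.c:897)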
Below are the system's /etc/hosts and /etc/host.conf files. Because this cluster can reach the external network, nslookup returns a wrong address, whereas ping and curl both resolve the name correctly.
[root@ep-bd05 pg_log]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.58.11 ep-bd01 bd01
192.168.58.12 ep-bd02 bd02
192.168.58.13 ep-bd03 bd03
192.168.58.14 ep-bd04 bd04
192.168.58.15 ep-bd05 bd05
[root@ep-bd05 pg_log]# cat /etc/host.conf
multi off
[root@ep-bd05 pg_log]# nslookup bd01
Server: 211.137.160.5
Address: 211.137.160.5#53

Non-authoritative answer:
Name: bd01
Address: 211.137.170.246

[root@ep-bd05 pg_log]# ping bd01
PING ep-bd01 (192.168.58.11) 56(84) bytes of data.
64 bytes from ep-bd01 (192.168.58.11): icmp_seq=1 ttl=64 time=0.156 ms
64 bytes from ep-bd01 (192.168.58.11): icmp_seq=2 ttl=64 time=0.160 ms
^C
--- ep-bd01 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.156/0.158/0.160/0.002 ms
[root@ep-bd05 pg_log]# curl http://bd01:51200/pxf/v0
Wrong version v0, supported version is v15
[root@ep-bd05 pg_log]#
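Since using the IP address is the only workaround I have, a small helper that reads the address for a name straight out of a hosts file makes it easier to compare against what nslookup says and to fill in the location string. This is only a sketch: hosts_lookup is a hypothetical name, and it does not explain why HAWQ fails with the host name.

```shell
#!/bin/sh
# hosts_lookup NAME [FILE]: print the first address that FILE (default
# /etc/hosts) maps to NAME, matching the canonical name or any alias.
# Prints nothing if NAME is not listed.
hosts_lookup() {
  name="$1"
  file="${2:-/etc/hosts}"
  awk -v n="$name" '
    { sub(/#.*/, "") }   # drop comments before comparing fields
    { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
  ' "$file"
}
```

Usage: hosts_lookup bd05 prints the address recorded in /etc/hosts, which can then be compared with the answer from nslookup bd05.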
5. Optional steps
(1) Edit /opt/pxf/pxf-service/conf/catalina.properties and change base.shutdown.port:
#base.shutdown.port=-1
base.shutdown.port=8005
(2) Edit /opt/pxf/pxf-service/conf/tomcat-users.xml and give the tomcat user the manager-gui role, so that the webapps can be managed from a browser:
[root@ep-bd01 ~]# vim /opt/pxf/pxf-service/conf/tomcat-users.xml
<role rolename="tomcat"/>
<role rolename="manager-gui"/>
<user username="tomcat" password="tomcat" roles="tomcat,manager-gui"/>
<user username="both" password="tomcat" roles="tomcat,role1"/>
<user username="role1" password="tomcat" roles="role1"/>