Installing PDI (Kettle) on CentOS 7

About Kettle

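PDI (Kettle) is an open-source ETL solution. The book describes how to use PDI for common ETL tasks such as data profiling, cleansing, validation, extraction, transformation, and loading.
Beyond larger ODS/DW applications, Kettle can also give small and medium-sized businesses flexible data extraction and processing.
Besides relational databases and NoSQL sources such as HBase and MongoDB, Kettle supports small data sources such as Excel and Access,
and plugins can extend it to further data sources. The book covers in detail the data sources Kettle can handle
and explains how to extract incremental data with Kettle. Kettle's data processing is also powerful: in addition to common operations such as select, filter, group, join, and sort,
its Java expressions, regular expressions, JavaScript, and Java classes are flexible and well suited to a wide range of data processing tasks.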

Downloading Kettle

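Installing Kettle

  • Kettle depends on Java, so Java must be installed first
  • On CentOS 7, webkitgtk must be installed, along with a desktop environment (install one yourself)
    • yum install epel-release
    • yum install webkitgtk
  • Kettle itself needs no installation; unpack the archive and it is ready to use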
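The prerequisites in the list above can be checked before unpacking Kettle. The following is a minimal sketch, not an official check: `have_cmd` and `check_prereqs` are hypothetical helper names, and the package name `webkitgtk` is taken from the yum command above. The rpm query is skipped on systems without rpm.

```shell
#!/bin/sh
# have_cmd NAME: succeed when NAME is an executable on PATH.
# (Hypothetical helper, not part of Kettle.)
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# check_prereqs: print a space-separated list of missing prerequisites,
# or an empty line when nothing appears to be missing.
check_prereqs() {
    missing=""
    have_cmd java || missing="$missing java"
    # rpm -q exits non-zero when the package is not installed.
    if have_cmd rpm; then
        rpm -q webkitgtk >/dev/null 2>&1 || missing="$missing webkitgtk"
    fi
    # Drop the leading space left by the first append.
    echo "${missing# }"
}
```

Calling `check_prereqs` after sourcing the script prints whatever still needs to be installed with yum.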

The official site recommends installing the following dependencies:

How to get PDI up and running
 
Linux
 
Ubuntu 12.04 and later:
The libwebkitgtk package needs to be installed. This can be done by running
apt-get install libwebkitgtk-1.0.0
Unzip the downloaded file. Run spoon.sh file, it should be under /data-integration.
On some installations of Ubuntu 14.04, Unity doesn't display the menu bar. In order to fix that, spoon.sh has a setting to disable this integration, export
UBUNTU_MENUPROXY=0
You can try to remove that setting if you wish to see if it works properly on your machine
 
CentOS 6 Desktop:
The libwebkitgtk package needs to be installed. This can be done by running
yum install libwebkitgtk
Unzip the downloaded file and run spoon.sh, it should be under /data-integration.

Starting Kettle

  • Windows startup script
    • Spoon.bat
  • CentOS 7 startup script (must be run from a desktop environment, otherwise it fails with an error)
    • spoon.sh
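Because spoon.sh fails without a desktop session, a small guard can refuse to launch it when no X display is set. This is a sketch under assumptions: `SPOON_DIR` is a hypothetical variable pointing at the unpacked data-integration directory, and checking `DISPLAY` is only a rough indicator of a graphical session.

```shell
#!/bin/sh
# has_display: succeed when the DISPLAY variable is set and non-empty.
has_display() {
    [ -n "${DISPLAY:-}" ]
}

# start_spoon: launch spoon.sh only when a display seems available.
# SPOON_DIR is a hypothetical variable; set it to your data-integration path.
start_spoon() {
    if ! has_display; then
        echo "No DISPLAY set; start Spoon from a desktop environment." >&2
        return 1
    fi
    "${SPOON_DIR:?set SPOON_DIR to the data-integration directory}/spoon.sh"
}
```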

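Troubleshooting Kettle errors (on CentOS, start Kettle from a desktop environment)

  • CentOS 7 needs webkitgtk installed
    • WARNING: no libwebkitgtk-1.0 detected, some features will be unavailable
  • Java 8 does not support the MaxPermSize option; remove it from the startup script
    • Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
  • The full error output is as follows: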
#######################################################################
WARNING:  no libwebkitgtk-1.0 detected, some features will be unavailable
    Consider installing the package with apt-get or yum.
    e.g. 'sudo apt-get install libwebkitgtk-1.0-0'
#######################################################################
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
org.eclipse.swt.SWTError: No more handles [gtk_init_check() failed]
    at org.eclipse.swt.SWT.error(Unknown Source)
    at org.eclipse.swt.widgets.Display.createDisplay(Unknown Source)
    at org.eclipse.swt.widgets.Display.create(Unknown Source)
    at org.eclipse.swt.graphics.Device.<init>(Unknown Source)
    at org.eclipse.swt.widgets.Display.<init>(Unknown Source)
    at org.eclipse.swt.widgets.Display.<init>(Unknown Source)
    at org.pentaho.di.ui.spoon.Spoon.main(Spoon.java:649)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.pentaho.commons.launcher.Launcher.main(Launcher.java:92)
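  • Solution

    • yum install webkitgtk
    • Start spoon.sh from within a desktop session; gtk_init_check() fails when no display is available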
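The MaxPermSize warning shown in the log can be silenced by deleting the option from the startup script, as noted earlier. A minimal sketch, assuming the option appears in the script as a literal `-XX:MaxPermSize=<size>` token; `strip_maxpermsize` is a hypothetical helper name, and it keeps a `.bak` copy of the original file.

```shell
#!/bin/sh
# strip_maxpermsize FILE: delete every -XX:MaxPermSize=<size> token
# (and the whitespace before it) from FILE, saving the original as FILE.bak.
strip_maxpermsize() {
    sed -i.bak 's/[[:space:]]*-XX:MaxPermSize=[0-9][0-9]*[kKmMgG]\{0,1\}//g' "$1"
}
```

For example, `strip_maxpermsize /path/to/data-integration/spoon.sh`, where the path is a placeholder for wherever the archive was unpacked.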

Running Kettle by double-clicking on the desktop

  • Create a launcher file named kettle.desktop on the desktop
[Desktop Entry]
# Version is the Desktop Entry Specification version, not the Kettle version
Version=1.0
Name=kettle
Exec=path to start script xxx/spoon.sh
Icon=path to ico /spoon.ico
Terminal=false
Type=Application
Categories=Application;
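The launcher file can also be generated from a script so the paths are filled in consistently. The sketch below makes several assumptions: `write_launcher` is a hypothetical helper, the icon is assumed to sit next to spoon.sh, and the directory path is assumed to contain no spaces. The optional Version key is left out.

```shell
#!/bin/sh
# write_launcher SPOON_DIR OUTPUT_FILE
# Write a desktop entry pointing at SPOON_DIR/spoon.sh and mark it executable.
write_launcher() {
    spoon_dir=$1
    out=$2
    cat > "$out" <<EOF
[Desktop Entry]
Name=kettle
Exec=$spoon_dir/spoon.sh
Icon=$spoon_dir/spoon.ico
Terminal=false
Type=Application
Categories=Application;
EOF
    # Many desktops only launch entries that carry the executable bit.
    chmod +x "$out"
}
```

A call such as `write_launcher /opt/data-integration "$HOME/Desktop/kettle.desktop"` would create the file; both paths here are placeholders.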

Other errors

  • Kettle fails to start after installing the KDE desktop (the problem did not occur with the GNOME desktop)
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f4ab4f35164, pid=4011, tid=0x00007f4b09bd7700
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libglib-2.0.so.0+0x5e164]  g_match_info_unref+0x4
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
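  • Solution
    • Change the system theme so that the GTK theme is no longer oxygen-gtk; the comment quoted below gives the steps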
As I already mentioned on #1245468 I could not verify that changing GTK_MODULES, UBUNTU_MENUPROXY, or GTK_IM_MODULE helps in any way.

However, I could verify that the problem GOES AWAY IN KUBUNTU/KDE when doing:

System Settings -> Application Themes -> GTK -> Choose GTK2 Theme

Choose 'Radiance' instead of 'oxygen-gtk'

Error: ERROR (version 7.1.0.0-12, build 1 from 2017-05-16 17.18.02 by buildguy) : java.io.IOException: Cannot run program "lsb_release": error=2, No such file or directory

  • Solution
    • yum -y install redhat-lsb

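Garbled characters in inserted data

In Kettle's startup file spoon.sh, add the following option to the JVM startup parameters:
-Dfile.encoding=utf8 (substitute whichever character set you need)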
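One way to add the option without duplicating it is to build the JVM options string with a small helper. This is a sketch only: `with_file_encoding` is a hypothetical function, and the variable name `PENTAHO_DI_JAVA_OPTIONS` in the usage comment is an assumption about how your copy of spoon.sh holds its JVM options, so check the script before relying on it.

```shell
#!/bin/sh
# with_file_encoding "EXISTING_OPTS" [CHARSET]
# Print EXISTING_OPTS with -Dfile.encoding=CHARSET appended,
# unless a -Dfile.encoding option is already present.
with_file_encoding() {
    opts=$1
    charset=${2:-UTF-8}
    case " $opts " in
        *" -Dfile.encoding="*) echo "$opts" ;;
        *) echo "${opts:+$opts }-Dfile.encoding=$charset" ;;
    esac
}

# Usage (assumed variable name, verify against your spoon.sh):
#   PENTAHO_DI_JAVA_OPTIONS=$(with_file_encoding "-Xms1024m -Xmx2048m")
```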