Author: Arthur_Qin (禾衆)

The Greenplum core, together with ORCA (the new-generation query optimizer), can be downloaded from GitHub. If you do not plan to read the code and just want a prebuilt binary, you can download one from the website of Greenplum's parent company, Pivotal; for the concrete configuration and installation steps, see 《Greenplum 安裝》.

The main text begins here:
Greenplum is built on PostgreSQL and operates as a data warehouse and uses a shared-nothing, massively parallel processing (MPP) architecture, available under the terms of the GNU General Public License, owned by Pivotal. ——Greenplum Wikipedia article
This post mainly documents the process of building and installing Greenplum from source. For Greenplum's architecture and history, I recommend 《聊聊Greenplum的那些事》, as well as my other post, 《Greenplum 的分佈式框架結構》.
| Parameter | Value |
|---|---|
| Number of hosts | 3 VMs, single-core, 1.5 GB RAM each (1 master, 2 slaves), with Internet access |
| Host OS | CentOS 6.7 |
| Greenplum version | GitHub build, September 2016 (4.3.x) |
If you are installing on remote servers that cannot reach the Internet directly, you can route through a proxy on another host in the same subnet; search for the keyword "ccproxy" for details.
Here we use 1 master and 2 slaves, with the following IPs:
10.77.100.121
10.77.100.122
10.77.100.123
10.77.100.121 is the master; the other two are segments.
You can use internal or external IPs; as long as the hosts can ping each other, it will work.
(A VM can actually share the host machine's network: share the host's connection with the VMware virtual network, as described in 《VMware虛擬機NAT模式的具體配置》 or 《虛擬機沒法共享主機網絡沒法上網怎麼辦?》, then use the resulting 192.x.x.x IPs in the Greenplum configuration files.)
This step lays the groundwork for the Greenplum nodes to communicate with each other later.
[root@dw-greenplum-1 conf]# vi /etc/hosts
127.0.0.1 localhost localhost.localdomain
10.77.100.121 dw-greenplum-1 mdw
10.77.100.122 dw-greenplum-2 sdw1
10.77.100.123 dw-greenplum-3 sdw2
Be sure to use your own IPs here; do not copy and paste the ones above verbatim.
After configuring this file, you must also modify /etc/sysconfig/network as follows (on all machines):
[root@dw-greenplum-1 conf]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=dw-greenplum-1 # on the other machines, change -1 to -2, -3, ...
The HOSTNAME here must match the hostname in /etc/hosts; when done, you can verify the configuration with ping <hostname>.
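A quick way to check that every host resolves and responds (a minimal sketch, assuming the mdw/sdw1/sdw2 aliases from /etc/hosts above):

# Run from any node: ping each alias once and report reachability
for h in mdw sdw1 sdw2; do
    ping -c 1 "$h" > /dev/null 2>&1 && echo "$h: OK" || echo "$h: UNREACHABLE"
done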
[root@dw-greenplum-1 conf]# vi /etc/sysctl.conf
Add the following:
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
Run the following command to make the settings take effect:
[root@dw-greenplum-1 conf]# sysctl -p
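To double-check that the values were applied, sysctl can read keys back; these are the ones most likely to cause trouble during initialization:

# Read back a few of the settings written above
sysctl kernel.shmmax kernel.shmall vm.overcommit_memory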
Raise the file-descriptor and process limits by adding the following to /etc/security/limits.conf:
[root@dw-greenplum-1 conf]# vi /etc/security/limits.conf
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
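These limits apply only to new sessions, so log out and back in before checking. A quick verification:

# Expect 65536 and 131072 respectively after a fresh login
ulimit -n
ulimit -u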
Disable SELinux:
[root@dw-greenplum-1 conf]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
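SELINUX=disabled only takes effect after a reboot; to stop enforcement in the current session as well, setenforce can be used:

# Stop enforcing immediately (lasts until reboot); getenforce confirms the state
setenforce 0
getenforce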
(Optional) Edit /boot/grub/grub.conf and append the following to the kernel boot line:
elevator=deadline
CentOS 6:
1) Persistent, survives a reboot
Enable: chkconfig iptables on
Disable: chkconfig iptables off
2) Takes effect immediately
Start: service iptables start
Stop: service iptables stop
CentOS 7:
systemctl start firewalld.service # start the firewall
systemctl stop firewalld.service # stop the firewall
systemctl disable firewalld.service # keep the firewall from starting at boot
My machines run CentOS 6.7, so:
[root@dw-greenplum-1 conf]# service iptables stop
[root@dw-greenplum-1 conf]# chkconfig iptables off
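To confirm the firewall really is off and stays off across reboots:

# Expect "iptables: Firewall is not running." and every runlevel set to "off"
service iptables status
chkconfig --list iptables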
[root@dw-greenplum-1 ~]# groupadd -g 530 gpadmin
[root@dw-greenplum-1 ~]# useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
[root@dw-greenplum-1 ~]# passwd gpadmin
Changing password for user gpadmin.
New password:
BAD PASSWORD: it is too simplistic/systematic
BAD PASSWORD: is too simple
Retype new password:
passwd: all authentication tokens updated successfully.
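The same gpadmin user, with identical UID and GID, must exist on every host. A sketch of repeating the setup from the master, assuming root can ssh to each slave ("yourpassword" is a placeholder):

# Create the group and user on each segment host (root's password will be prompted)
for h in sdw1 sdw2; do
    ssh root@$h 'groupadd -g 530 gpadmin && useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin && echo "gpadmin:yourpassword" | chpasswd'
done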
On all machines, run the following commands as root in a terminal (Internet access is required, since yum downloads and installs the packages):
[root@dw-greenplum-1 ~]# yum -y install rsync coreutils glib2 lrzsz sysstat e4fsprogs xfsprogs ntp readline-devel zlib zlib-devel openssl openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl perl-devel perl-ExtUtils* OpenIPMI-tools openldap openldap-devel logrotate gcc-c++ python-py
[root@dw-greenplum-1 ~]# yum -y install bzip2-devel libevent-devel apr-devel curl-devel ed python-paramiko python-devel
[root@dw-greenplum-1 ~]# wget https://bootstrap.pypa.io/get-pip.py
[root@dw-greenplum-1 ~]# python get-pip.py
[root@dw-greenplum-1 ~]# pip install lockfile paramiko setuptools epydoc psutil
[root@dw-greenplum-1 ~]# pip install --upgrade setuptools
Log in as gpadmin and unzip the downloaded source archive; this produces a gpdb-master directory. To keep the commands consistent with this post, move the gpdb-master source tree into /home/gpadmin ($ mv gpdb-master ~).
Create the installation directory with mkdir (I recommend simply mkdir ~/gpdb, keeping the install directory under home as well). Confirm with ls -l that the owner is gpadmin; if the directory was created by root, chown it afterwards.
As gpadmin, run configure. The path after --prefix is the installation directory; if you need to debug, also add --enable-debug --enable-testutils --enable-debugbreak and similar flags.
$ ./configure --prefix=/home/gpadmin/gpdb
Then:
$ make
$ make install
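If the build host has more than one core, make can run jobs in parallel; this is optional but speeds up the build considerably:

# Same build, parallelized across the available cores
make -j"$(nproc)"
make install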
If you got this far without problems, congratulations: the installation succeeded.
If anything failed along the way, Google the error message; there is usually an answer. The most common cause is a missing package or tool, for example bison or flex not being installed, or the Python.h header not being found. Log back in as root and install it with yum (e.g. yum install bison). If the package appears to be installed but is still reported missing, try appending -devel to the package name (e.g. yum install python-devel). Then re-run the three commands of this section (configure, make, make install) in order.
So far Greenplum is installed only on the master, so next we distribute the build to every slave machine to complete the cluster installation. Alternatively, you could simply repeat configure and make on every machine.
First, on the master node, create a tar file of the GP installation, where gpdb is the install directory:
$ cd /home/gpadmin
$ gtar -cvf /home/gpadmin/gp.tar gpdb
The following steps connect all the nodes and send the installation archive to each of them.
On the master, as gpadmin, create the following files; you can create a conf directory under ~ to hold this bootstrap information:
$ vi ./conf/hostlist
mdw
sdw1
sdw2
$ vi ./conf/seg_hosts
sdw1
sdw2
greenplum_path.sh in the install directory holds the environment settings needed to run Greenplum, including GPHOME, PYTHONHOME, and others. As gpadmin, source it to bring them into effect, then run gpssh-exkeys to exchange SSH keys:
[gpadmin@dw-greenplum-1 ~]$ source /home/gpadmin/gpdb/greenplum_path.sh
[gpadmin@dw-greenplum-1 ~]$ gpssh-exkeys -f /home/gpadmin/conf/hostlist
[STEP 1 of 5] create local ID and authorize on local host
  ... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
  ... send to sdw1
  ... send to sdw2
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with sdw1
  ... finished key exchange with sdw2
[INFO] completed successfully
Now we send the tar file created earlier to the other machines:
$ gpscp -f /home/gpadmin/conf/seg_hosts /home/gpadmin/gp.tar =:/home/gpadmin
The master node then connects to the slave nodes; from here on, every command run in the gpssh session should produce one line of output per host.
[gpadmin@mdw ~]$ gpssh -f /home/gpadmin/conf/hostlist
Note: command history unsupported on this machine ...
=> pwd
[sdw1] /home/gpadmin
[sdw2] /home/gpadmin
[ mdw] /home/gpadmin
=>
Inside the gpssh session, extract the tar file:
=> gtar -xvf gp.tar
Finally, create the database working directories:
=> pwd
[sdw1] /home/gpadmin
[sdw2] /home/gpadmin
[ mdw] /home/gpadmin
=> mkdir gpdata
=> cd gpdata
=> mkdir gpdatap1 gpdatap2 gpdatam1 gpdatam2 gpmaster
=> ll
[sdw1] total 20
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatam1
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatam2
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatap1
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatap2
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpmaster
[ mdw] total 20
[ mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatam1
[ mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatam2
[ mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatap1
[ mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatap2
[ mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpmaster
[sdw2] total 20
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatam1
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatam2
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatap1
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpdatap2
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 Aug 18 19:46 gpmaster
=> exit
[gpadmin@dw-greenplum-1 ~]$ cd
[gpadmin@dw-greenplum-1 ~]$ vi .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
source /home/gpadmin/gpdb/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpadmin/gpdata/gpmaster/gpseg-1
export PGPORT=2345
export PGDATABASE=testDB
[gpadmin@dw-greenplum-1 ~]$ . ~/.bash_profile  # make the environment variables take effect
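A quick way to confirm the profile is active in the current shell (GPHOME is exported by greenplum_path.sh):

# All three should print non-empty values, and psql should resolve inside ~/gpdb
echo "$GPHOME $MASTER_DATA_DIRECTORY $PGPORT"
which psql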
Copy the file /gpdb/docs/cli_help/gpconfigs/gpinitsystem_config from the install directory into /home/gpadmin/conf, then edit it; keeping just the following parameters is sufficient:
ARRAY_NAME="Greenplum"
SEG_PREFIX=gpseg
PORT_BASE=40000
declare -a DATA_DIRECTORY=(/home/gpadmin/gpdata/gpdatap1 /home/gpadmin/gpdata/gpdatap2)
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/home/gpadmin/gpdata/gpmaster
##### Port number for the master instance.
MASTER_PORT=2345
##### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=/usr/bin/ssh
##### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
MIRROR_PORT_BASE=50000
REPLICATION_PORT_BASE=41000
MIRROR_REPLICATION_PORT_BASE=51000
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/gpdata/gpdatam1 /home/gpadmin/gpdata/gpdatam2)
MACHINE_LIST_FILE=/home/gpadmin/conf/seg_hosts
Then run the following command to initialize the cluster. Adding -s sdw2 enables a standby master, but the standby host must have been given exactly the same configuration:
[gpadmin@dw-greenplum-1 ~]$ gpinitsystem -c /home/gpadmin/conf/gpinitsystem_config -a
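For example, an initialization that also creates the standby master on sdw2 might look like this (a sketch; sdw2 must be prepared identically to the master):

$ gpinitsystem -c /home/gpadmin/conf/gpinitsystem_config -s sdw2 -a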
If it succeeds, you will see output like the following. You can then log in to the default database with psql (\q to quit), and try creating a testDB database.
...
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-Greenplum Database instance successfully created
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-------------------------------------------------------
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-To complete the environment configuration, please
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/home/gpadmin/gpdata/gpmaster/gpseg-1"
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:- to access the Greenplum scripts for this instance:
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:- or, use -d /home/gpadmin/gpdata/gpmaster/gpseg-1 option for the Greenplum scripts
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:- Example gpstate -d /home/gpadmin/gpdata/gpmaster/gpseg-1
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20160906.log
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-Review options for gpinitstandby
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-------------------------------------------------------
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-The Master /home/gpadmin/gpdata/gpmaster/gpseg-1/pg_hba.conf post gpinitsystem
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-new array must be explicitly added to this file
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-located in the /home/gpadmin/gpdb/docs directory
20160906:22:24:09:028822 gpinitsystem:dw-greenplum-1:gpadmin-[INFO]:-------------------------------------------------------
[gpadmin@dw-greenplum-1 conf]$ psql -d postgres
psql (8.3.23)
Type "help" for help.
postgres-# \q
[gpadmin@dw-greenplum-1 gpft.bitbucket.org]$ createdb -E utf-8 testDB
[gpadmin@dw-greenplum-1 gpft.bitbucket.org]$
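A short smoke test to confirm the cluster is really spread across the segments (gp_segment_configuration is the catalog table listing all instances; the table name t below is just an example):

# List master, standby, and segment instances
psql -d testDB -c "SELECT dbid, content, hostname, port FROM gp_segment_configuration;"
# Create a small distributed table and count its rows
psql -d testDB -c "CREATE TABLE t (a int) DISTRIBUTED BY (a);"
psql -d testDB -c "INSERT INTO t SELECT generate_series(1, 100);"
psql -d testDB -c "SELECT COUNT(*) FROM t;"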
If it fails, you may instead see something like this:
20160921:06:14:35:052982 gpinitsystem:jyh-greenplum-1:gpadmin-[FATAL]:-Errors generated from parallel processes
20160921:06:14:35:052982 gpinitsystem:jyh-greenplum-1:gpadmin-[INFO]:-Dumped contents of status file to the log file
20160921:06:14:35:052982 gpinitsystem:jyh-greenplum-1:gpadmin-[INFO]:-Building composite backout file
20160921:06:14:35:052982 gpinitsystem:jyh-greenplum-1:gpadmin-[INFO]:-Start Function ERROR_EXIT
20160921:06:14:35:gpinitsystem:jyh-greenplum-1:gpadmin-[FATAL]:-Failures detected, see log file /home/gpadmin/gpAdminLogs/gpinitsystem_20160921.log for more detail
Script Exiting!
20160921:06:14:35:052982 gpinitsystem:jyh-greenplum-1:gpadmin-[WARN]:-Script has left Greenplum Database in an incomplete state
20160921:06:14:35:052982 gpinitsystem:jyh-greenplum-1:gpadmin-[WARN]:-Run command /bin/bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_20160921_061055 to remove these changes
20160921:06:14:35:052982 gpinitsystem:jyh-greenplum-1:gpadmin-[INFO]:-Start Function BACKOUT_COMMAND
20160921:06:14:35:052982 gpinitsystem:jyh-greenplum-1:gpadmin-[INFO]:-End Function BACKOUT_COMMAND
In that case, run /bin/bash /home/gpadmin/gpAdminLogs/backout_gpinitsystem_gpadmin_××× as instructed to clear the data under gpdata (you can also manually delete everything in the subdirectories of gpdata on every machine), then fix whatever caused the failure and re-run the initialization.
Also check the cached hostnames with vi /home/gpadmin/.gphostcache and correct them if they are wrong. If gpssh-exkeys was run before the network file was modified, the gphostcache file may have recorded a mapping between the old hostnames and the names in the hostlist configuration, and Greenplum never updates this file afterwards; gpinitsystem would then initialize the node data under gpdata on the wrong hosts. This is a major pitfall.

For an introduction to TPC-H, see 《TPC-H 使用》.
The Transaction Processing Performance Council (TPC) is a non-profit organization founded by dozens of member companies and headquartered in the United States. Membership is open worldwide, but to date most members are large American, Japanese, and Western European companies. TPC members are mainly hardware and software vendors rather than end users; the organization's role is to define standard benchmark specifications with performance and price metrics for business applications, and to manage the publication of test results.
The main purpose of TPC-H is to evaluate decision-support capability for specific queries, emphasizing a server's ability to handle data mining and analytical processing. Queries are among the most important decision-support workloads, and complex queries in a data warehouse fall into two types: queries known in advance, such as periodic business reports, and queries not known in advance, called ad-hoc queries.
Put simply, when a database vendor develops a new database system, TPC-H serves as the benchmark for measuring the system's decision-support query capability. —— 《TPC-H 使用》
First, download the TPC-H benchmark from http://tpc.org/tpch/default.asp and extract it to a directory
Then, to test Greenplum with TPC-H, proceed as follows.
Note that when you finally launch the test with ./tpch.sh ./results testDB gpadmin, you should use the gpadmin account; otherwise you will need to adjust the user permissions in pg_hba.conf.
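If you do need to run as a different user, the change is an extra entry in pg_hba.conf on the master plus a configuration reload. A sketch, using a hypothetical user "tester" and the subnet from this post:

# In /home/gpadmin/gpdata/gpmaster/gpseg-1/pg_hba.conf, add a line such as:
#   host  testDB  tester  10.77.100.0/24  trust
# Then reload the configuration without restarting the cluster:
gpstop -u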
References:
- 《Greenplum 源碼編譯安裝教程》, 學徒Grayson
- 《Greenplum 安裝》, renlipeng
When reposting, please credit the author, Arthur_Qin (禾衆), and link to the original article: http://www.cnblogs.com/arthurqin/p/5849354.html