Part1: Preface
MariaDB ColumnStore is the future of data warehousing. ColumnStore allows us to store more data and analyze it faster. Everyday, Pinger’s mobile applications process millions of text messages and phone calls. We also process more than 1.5 billion rows of logs per day. Analytic scalability and performance is critical to our business. MariaDB’s ColumnStore manages massive amounts of data and will scale with Pinger as we grow.
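----from mariadb.com
Part2: What the experts say
MariaDB ColumnStore is a massively parallel, high-performance, compressed, distributed, open-source columnar storage engine built by porting InfiniDB 4.6.2 onto MariaDB 10.1. It is similar to the commercial product Infobright. It is designed for offline big-data analytics and is positioned as an alternative to Hadoop. You query it with standard SQL, and popular client tools such as SQLyog and Navicat can connect to it, so nothing changes for the business side. You do not need to create any indexes or rewrite the business side's complex SQL (complex joins, aggregation, stored procedures and user-defined functions are supported natively). The only thing you have to do is load the data into ColumnStore. For a company without Hadoop engineers, MariaDB ColumnStore can be a better alternative.
-----from 賀春暘
Part3: Environment overview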
192.168.1.248 HE1 um1
192.168.1.249 HE2 um2
192.168.1.250 HE3 pm1
192.168.1.251 HE4 pm2
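Part1: Preface
MariaDB ColumnStore is a columnar storage engine designed for distributed massively parallel processing (MPP). It consists of three components that work together.
The official architecture diagram shows these three components: UM, PM, and the data storage layer.
User Module (UM):
The User Module manages and controls end-user queries. It maintains the state of each query, issues requests to one or more Performance Modules to execute the SQL work on its behalf, and finally aggregates the results from all participating Performance Modules into the complete result set returned to the user.
Performance Module (PM):
The Performance Module is responsible for storing, retrieving and managing data. It processes the block requests of query operations and passes the results back to the User Module to complete the query. It caches the data it fetches in memory for computation. MPP is achieved by letting users configure as many Performance Modules as they need to obtain more processing power.
Storage:
MariaDB ColumnStore is extremely flexible about storage. When running on-premises, it can store data on local storage or on shared storage (for example, a SAN). In an Amazon EC2 environment, it can use ephemeral or Elastic Block Store (EBS) volumes. When a shared-nothing deployment needs data redundancy, it is built to integrate with GlusterFS and the Apache Hadoop Distributed File System (HDFS).
In one sentence: the User Module (UM) distributes the SQL requests issued by clients to the back-end Performance Modules (PM); the PMs query and analyze the data and return their results to the UM; the UM aggregates the PM results and returns the final query result to the client.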
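The UM/PM division of labor described above can be pictured with a toy scatter-gather sketch. This is only an illustration under stated assumptions: the function names pm_partial_sum and um_sum are hypothetical, rows are split round-robin across two "PMs", and nothing here reflects how ColumnStore actually places or scans data.

```shell
# "PM" (hypothetical): compute a partial sum over the rows received on stdin.
pm_partial_sum() {
    awk '{ s += $1 } END { print s + 0 }'
}

# "UM" (hypothetical): split the rows across two PMs, then merge the partial results.
um_sum() {
    rows=$(cat)
    # Odd-numbered rows go to the first PM, even-numbered rows to the second.
    p1=$(printf '%s\n' "$rows" | awk 'NR % 2 == 1' | pm_partial_sum)
    p2=$(printf '%s\n' "$rows" | awk 'NR % 2 == 0' | pm_partial_sum)
    echo $((p1 + p2))
}

# Equivalent of "select sum(c) from t" over five rows.
printf '%s\n' 10 20 30 40 50 | um_sum   # prints 150
```

Each PM only ever sees its own share of the rows, and the UM only ever sees two small partial results, which is the property that lets the work scale with the number of PMs.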
In performance tests by Percona, a professional MySQL services provider, InfiniDB showed a clear advantage over other OLAP engines.
Part1: Set up SSH trust
[root@HE1 ~]# ssh-keygen
[root@HE1 ~]# ssh-copy-id '-p 22 root@192.168.1.248'
[root@HE1 ~]# ssh-copy-id '-p 22 root@192.168.1.249'
[root@HE1 ~]# ssh-copy-id '-p 22 root@192.168.1.250'
[root@HE1 ~]# ssh-copy-id '-p 22 root@192.168.1.251'
[root@HE1 ~]# ssh HE1
[root@HE1 ~]# ssh HE2
[root@HE1 ~]# ssh HE3
[root@HE1 ~]# ssh HE4
[root@HE1 ~]# cat /etc/hosts
192.168.1.248 HE1
192.168.1.249 HE2
192.168.1.250 HE3
192.168.1.251 HE4
Run the commands above on each of HE1, HE2, HE3 and HE4 so that every host trusts every other host over SSH.
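Because the same ssh-copy-id call is repeated for every host, it can be generated in a loop. The sketch below only prints the commands (a dry run); the HOSTS variable and the trust_commands function are hypothetical names, and the addresses are the four hosts from the environment table.

```shell
# Cluster hosts from the environment overview; adjust to your own cluster.
HOSTS="192.168.1.248 192.168.1.249 192.168.1.250 192.168.1.251"

# Print (do not run) one ssh-copy-id command per host, in the same form
# used in this article.
trust_commands() {
    for host in $HOSTS; do
        printf "ssh-copy-id '-p 22 root@%s'\n" "$host"
    done
}

trust_commands
# To actually distribute the key after reviewing the output:
#   trust_commands | sh
```

Printing first and piping to sh afterwards keeps the step reviewable, which is handy when it has to be repeated on all four machines.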
Part2: Disable the firewall
[root@HE1 ~]# /etc/init.d/iptables status
iptables: Firewall is not running.
[root@HE1 ~]# chkconfig iptables off
[root@HE1 ~]# chkconfig --list | grep iptables
iptables        0:off   1:off   2:off   3:off   4:off   5:off   6:off
Part3: Disable filesystem access-time updates and change the disk I/O scheduler
[root@HE1 ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Sat Mar 5 09:35:40 2016
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=397d50ba-22b0-4d50-9e29-89e3b92d2d07 / ext4 defaults,noatime,barrier=0 1 1
[root@HE1 ~]# echo "deadline" > /sys/block/sda/queue/scheduler
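Both settings are easy to verify afterwards. The following is a minimal sketch with two hypothetical helpers, active_scheduler and has_noatime, that work on plain text so they can be tried without touching the system. Note that a value written to /sys with echo does not survive a reboot, so the check is worth repeating after restarting.

```shell
# Extract the active scheduler from text such as "noop [deadline] cfq"
# (the kernel marks the active one with square brackets).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Succeed if the mount options (4th field) of an fstab line include noatime.
has_noatime() {
    printf '%s\n' "$1" | awk '
        { n = split($4, o, ","); for (i = 1; i <= n; i++) if (o[i] == "noatime") found = 1 }
        END { exit !found }'
}

active_scheduler "noop [deadline] cfq"   # prints deadline

# Check the live value only if the device from this article exists.
dev=/sys/block/sda/queue/scheduler
if [ -r "$dev" ]; then
    active_scheduler "$(cat "$dev")"
fi
```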
Part4: Disable NUMA
[root@HE1 ~]# cat /etc/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
#         all kernel and initrd paths are relative to /boot/, eg.
#         root (hd0,0)
#         kernel /vmlinuz-version ro root=/dev/sda2
#         initrd /initrd-[generic-]version.img
#boot=/dev/sda
default=1
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.32-573.18.1.el6.x86_64.debug)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-573.18.1.el6.x86_64.debug ro root=UUID=397d50ba-22b0-4d50-9e29-89e3b92d2d07 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet numa=off
        initrd /initramfs-2.6.32-573.18.1.el6.x86_64.debug.img
title CentOS (2.6.32-431.el6.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-431.el6.x86_64 ro root=UUID=397d50ba-22b0-4d50-9e29-89e3b92d2d07 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet numa=off
        initrd /initramfs-2.6.32-431.el6.x86_64.img
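The numa=off parameter only takes effect after the machine has booted with the edited kernel line. A minimal sketch for confirming that on the running kernel follows; numa_disabled is a hypothetical helper name, and /proc/cmdline is the standard Linux file holding the boot parameters.

```shell
# Succeed if the given kernel command line contains the numa=off parameter.
numa_disabled() {
    case " $1 " in
        *" numa=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Inspect the command line the running kernel was actually booted with.
if [ -r /proc/cmdline ]; then
    if numa_disabled "$(cat /proc/cmdline)"; then
        echo "numa=off is active"
    else
        echo "numa=off is NOT active on the running kernel"
    fi
fi
```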
Part5: Install the jemalloc memory allocator
[root@HE1 ~]# yum install jemalloc-*
[root@HE1 ~]# reboot
Run the commands above on each of HE1, HE2, HE3 and HE4.
Part6: Install the boost packages
[root@HE1 ~]# yum -y install boost*
[root@HE1 ~]# yum -y groupinstall "Development Tools"
[root@HE1 ~]# yum -y install cmake
[root@HE1 ~]# tar xvf boost_1_55_0.tar.gz
[root@HE1 ~]# cd boost_1_55_0
[root@HE1 boost_1_55_0]# ./bootstrap.sh --with-libraries=atomic,date_time,exception,filesystem,iostreams,locale,program_options,regex,signals,system,test,thread,timer,log --prefix=/usr
[root@HE1 boost_1_55_0]# ./b2 install
Part7: Install the Perl dependencies
[root@HE1 ~]# yum -y install expect perl perl-DBI openssl zlib perl-DBD-MySQL
Part8: Install MariaDB ColumnStore
[root@HE1 ~]# tar xvf mariadb-columnstore-1.0.6-1-centos6.x86_64.bin.tar.gz -C /usr/local
Part9: Configure MariaDB ColumnStore
[root@HE1 ~]# /usr/local/mariadb/columnstore/bin/postConfigure
At this point, MariaDB ColumnStore is installed and configured.
Part1: Shut down HE2
HE2 is currently the primary UM.
mcsadmin> getSystemStatus
getsystemstatus   Mon Dec 26 15:56:47 2016

System columnstore-1

System and Module statuses

Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        BUSY_INIT                    Mon Dec 26 15:56:38 2016

Module um1    AUTO_DISABLED/DEGRADED       Mon Dec 26 15:56:40 2016
Module um2    ACTIVE                       Mon Dec 26 15:54:21 2016
Module pm1    ACTIVE                       Mon Dec 26 02:03:27 2016
Module pm2    ACTIVE                       Mon Dec 26 02:03:41 2016

Active Parent OAM Performance Module is 'pm1'
Primary Front-End MariaDB Columnstore Module is 'um2'
MariaDB Columnstore Replication Feature is enabled

[root@HE2 ~]# reboot
After the current primary UM is rebooted, the primary UM role has switched to um1 automatically:
mcsadmin> getSystemStatus
getsystemstatus   Mon Dec 26 15:58:19 2016

System columnstore-1

System and Module statuses

Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        BUSY_INIT                    Mon Dec 26 15:58:10 2016

Module um1    ACTIVE                       Mon Dec 26 15:57:17 2016
Module um2    AUTO_DISABLED/DEGRADED       Mon Dec 26 15:58:11 2016
Module pm1    ACTIVE                       Mon Dec 26 02:03:27 2016
Module pm2    ACTIVE                       Mon Dec 26 02:03:41 2016

Active Parent OAM Performance Module is 'pm1'
Primary Front-End MariaDB Columnstore Module is 'um1'
MariaDB Columnstore Replication Feature is enabled
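When repeating this failover test, it helps to pull the two interesting facts out of the status text instead of reading it by eye. The sketch below assumes the output format shown in this article; primary_um and module_status are hypothetical helper names, and feeding them from a non-interactive mcsadmin call is an assumption to verify on your installation.

```shell
# Read getSystemStatus output on stdin and print the primary front-end module.
primary_um() {
    sed -n "s/.*Primary Front-End MariaDB Columnstore Module is '\([^']*\)'.*/\1/p"
}

# Read getSystemStatus output on stdin and print "<module> <status>" pairs.
module_status() {
    awk '$1 == "Module" { print $2, $3 }'
}

# Sample taken from the status output after the reboot of HE2.
status="Module um1 ACTIVE Mon Dec 26 15:57:17 2016
Module um2 AUTO_DISABLED/DEGRADED Mon Dec 26 15:58:11 2016
Primary Front-End MariaDB Columnstore Module is 'um1'"

printf '%s\n' "$status" | primary_um      # prints um1
printf '%s\n' "$status" | module_status
# Assumed live usage (check that your mcsadmin accepts a command argument):
#   mcsadmin getSystemStatus | primary_um
```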
Part2: Check the status
Log in to the database on the former primary UM (HE2). It is now a replica (slave) of um1:
[root@HE2 ~]# mcsmysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 10
Server version: 10.1.19-MariaDB Columnstore 1.0.6-1

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: 192.168.1.248
                   Master_User: idbrep
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: mysql-bin.000013
           Read_Master_Log_Pos: 1879
                Relay_Log_File: relay-bin.000002
                 Relay_Log_Pos: 537
         Relay_Master_Log_File: mysql-bin.000013
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 1879
               Relay_Log_Space: 829
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: No
                   Gtid_IO_Pos:
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: conservative
1 row in set (0.00 sec)

MariaDB [(none)]>
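Of all the fields in that output, Slave_IO_Running and Slave_SQL_Running are the two that show whether replication from the new primary is actually working. A minimal sketch for checking them in a script follows; replication_ok is a hypothetical helper name, and it expects the "show slave status\G" text on stdin.

```shell
# Read "show slave status\G" output on stdin; succeed only if both the
# I/O thread and the SQL thread report "Yes".
replication_ok() {
    awk -F': *' '
        /Slave_IO_Running:/  { io = $2 }
        /Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Sample lines taken from the output above.
sample="              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes"

if printf '%s\n' "$sample" | replication_ok; then
    echo "replication threads are running"
else
    echo "replication is broken"
fi
# Assumed live usage on a UM node:
#   mcsmysql -e 'show slave status\G' | replication_ok
```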
Part1: Primary keys and indexes
MariaDB [helei]> create table helei_innodb(
    -> id int(10) unsigned NOT NULL AUTO_INCREMENT,
    -> c1 int(10) NOT NULL DEFAULT '0',
    -> c2 int(10) unsigned DEFAULT NULL,
    -> c5 int(10) unsigned NOT NULL DEFAULT '0',
    -> c3 timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    -> c4 varchar(200) NOT NULL DEFAULT '',
    -> PRIMARY KEY(id),
    -> KEY idx_c1(c1),
    -> KEY idx_c2(c2)
    -> )ENGINE=InnoDB ;
Query OK, 0 rows affected (0.03 sec)

MariaDB [helei]> create table helei_cs(
    -> id int(10) unsigned NOT NULL AUTO_INCREMENT,
    -> c1 int(10) NOT NULL DEFAULT '0',
    -> c2 int(10) unsigned DEFAULT NULL,
    -> c5 int(10) unsigned NOT NULL DEFAULT '0',
    -> c3 timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    -> c4 varchar(200) NOT NULL DEFAULT '',
    -> PRIMARY KEY(id),
    -> KEY idx_c1(c1),
    -> KEY idx_c2(c2)
    -> )ENGINE=Columnstore;
ERROR 1069 (42000): Too many keys specified; max 0 keys allowed
The error ("max 0 keys allowed") shows that the ColumnStore storage engine accepts neither a primary key nor secondary indexes, and it does not need them.
MariaDB [helei]> create table helei_cs(
    -> id int(10) unsigned NOT NULL ,
    -> c1 varchar(200) NOT NULL DEFAULT ''
    -> )ENGINE=Columnstore;
Query OK, 0 rows affected (0.34 sec)

MariaDB [helei]> insert into helei_cs values(1,'1');
Query OK, 1 row affected (0.60 sec)

MariaDB [helei]> insert into helei_cs values(2,'2');
Query OK, 1 row affected (0.08 sec)

MariaDB [helei]> insert into helei_cs values(3,'3');
Query OK, 1 row affected (0.17 sec)
The timings (0.08s to 0.60s per row) show that single-row inserts into ColumnStore are relatively slow.
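Since row-by-row inserts are slow, data is usually loaded in bulk instead. The sketch below only prepares a load file for the two-column helei_cs table; make_rows is a hypothetical helper, and the cpimport call in the comments (pipe-delimited input, database, table and file as positional arguments, binary under the install prefix used in this article) is an assumption that should be checked against the documentation for your version.

```shell
# Print rows 1..N as "id|c1" lines, matching the helei_cs (id, c1) table.
make_rows() {
    i=1
    while [ "$i" -le "$1" ]; do
        printf '%s|%s\n' "$i" "$i"
        i=$((i + 1))
    done
}

make_rows 3
# Assumed bulk-load invocation (verify the path and arguments first):
#   make_rows 100000 > /tmp/helei_cs.tbl
#   /usr/local/mariadb/columnstore/bin/cpimport helei helei_cs /tmp/helei_cs.tbl
```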
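Warning:
ColumnStore does not support primary keys, indexes, the timestamp type, the collate clause, or sum/average on char/varchar columns.
Part2: Performance test
On a virtual machine with 1 GB of memory, MariaDB ColumnStore ran the slow query in 2.82s. On the production database, with an 8 GB innodb_buffer_pool_size, the same slow query took 17.894s.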
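For a rough sense of scale, the ratio of the two timings can be computed directly. This is only arithmetic on the two numbers quoted above (speedup is a hypothetical helper name); a single query on different hardware is not a benchmark.

```shell
# Print how many times faster the second timing is than the first.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

speedup 17.894 2.82   # prints 6.3
```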
If the pm2 machine goes down, one would expect pm1 to keep working. Instead, queries fail with this error:
ERROR 1815 (HY000): Internal error: st: 10000 TupleBPS::sendPrimitiveMessages() caught an exception: IDB-2034: At least one DBRoot required for that query is offline.
This appears to be a bug: the PMs are responsible for pulling data into memory for computation, and in this article's experiment the data was expected to reside on the UM machines. Let's look at the data in the tables now:
MariaDB [erp_test]> show tables;
+--------------------+
| Tables_in_erp_test |
+--------------------+
| erp_bjlikp         |
| erp_bjlips         |
| erp_likp           |
| erp_lips           |
| erp_mara           |
+--------------------+
5 rows in set (0.00 sec)

MariaDB [erp_test]> select count(*) from erp_bjlikp;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (1.15 sec)

MariaDB [erp_test]> select count(*) from erp_bjlips;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (1.15 sec)

MariaDB [erp_test]> select count(*) from erp_lips;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (1.14 sec)

MariaDB [erp_test]> select count(*) from erp_mara;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (1.15 sec)
Every table now reports a row count of 0, and the process list shows all pm2 processes as AUTO_OFFLINE:
ProcessMonitor      pm2  AUTO_OFFLINE  Tue Dec 27 22:23:50 2016
ProcessManager      pm2  AUTO_OFFLINE  Tue Dec 27 22:23:50 2016
DBRMControllerNode  pm2  AUTO_OFFLINE  Tue Dec 27 22:23:50 2016
ServerMonitor       pm2  AUTO_OFFLINE  Tue Dec 27 22:23:50 2016
DBRMWorkerNode      pm2  AUTO_OFFLINE  Tue Dec 27 22:23:50 2016
DecomSvr            pm2  AUTO_OFFLINE  Tue Dec 27 22:23:50 2016
PrimProc            pm2  AUTO_OFFLINE  Tue Dec 27 22:23:50 2016
WriteEngineServer   pm2  AUTO_OFFLINE  Tue Dec 27 22:23:50 2016
MariaDB [erp_test]> select count(*) from erp_lips;
+----------+
| count(*) |
+----------+
|  3147299 |
+----------+
1 row in set (0.37 sec)

MariaDB [erp_test]> select count(*) from erp_mara;
+----------+
| count(*) |
+----------+
|     4361 |
+----------+
1 row in set (0.08 sec)

MariaDB [erp_test]> select count(*) from erp_bjlips;
+----------+
| count(*) |
+----------+
|  2762244 |
+----------+
1 row in set (0.13 sec)

MariaDB [erp_test]> select count(*) from erp_bjlikp;
+----------+
| count(*) |
+----------+
|    19032 |
+----------+
1 row in set (0.09 sec)

MariaDB [erp_test]> select count(*) from erp_likp;
+----------+
| count(*) |
+----------+
|   169002 |
+----------+
1 row in set (0.08 sec)
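Once the pm2 machine was started again, the row counts returned to normal, as shown above.
This appears to be a bug in the software and is expected to be fixed in the 1.0.7 GA release.
Thanks to 賀春暘 for his guidance, which allowed me to confirm this bug in my test environment.
----Summary----
Try running your own complex production SQL against ColumnStore and see how it performs. My expertise is limited and this article was written in a hurry, so it is bound to contain mistakes or inaccuracies. Corrections from readers are welcome.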