Dedicated DBA: MySQL Master-Slave Delayed Replication

This lab reuses the environment built for MySQL master-slave asynchronous replication.

MySQL cluster enterprise architecture options

1. Split reads and writes according to the type of database request (reads go to the slaves, writes go to the master).
2. Split out multiple slaves by business function. With one master and five slaves:
   3 slaves serve read requests from external users (read/write splitting, LVS load balancing)
   1 slave serves internal read requests (back office, data analysis, search, financial statistics, scheduled jobs, developer queries, and so on)
   1 slave is used for scheduled full backups and incremental backups (binlog enabled)
3. High availability for the master
(1). heartbeat + DRBD + MySQL
DRBD replicates the master database server to another machine at the block level, similar to a network RAID 1.
Pros: very fast.
Cons: the standby node cannot be accessed; it only serves requests after the primary node goes down.
(2). MySQL-MMM (Master-Master replication Manager)
Data is synchronized between the two masters through MySQL replication.
Pros: the slaves can be load balanced.
Cons: MMM cannot fully guarantee data consistency.
(3). MySQL-MHA (Master High Availability) + keepalived
Data is synchronized between the database servers through MySQL replication.
Pros: the slaves can be load balanced, and after the master goes down the most up-to-date slave is selected automatically and promoted to master.
MHA then does its best to fill in the missing data on all servers up to the latest position,
and re-points the other slaves at the new master. keepalived is added to provide VIP failover.
(4). PXC
(5). Shared storage
(6). Distributed database deployment
(7). MGR

MySQL enterprise backup strategy

1. Use a slave of the master-slave replication setup for backups
(1). Pick a slave that does not serve external traffic and dedicate it to backups.
(2). Enable binlog on that slave.
(3). For less than 30 GB of data, use mysqldump logical backups; for more than 30 GB, use the Xtrabackup physical hot backup tool.

Common causes of replication delay in production

Causes that easily lead to replication delay:
1. Too many slaves attached to one master
2. Slave hardware worse than the master's
3. Too many slow SQL statements
4. Design problems in the replication topology
5. Network latency between master and slave
6. Too much read/write pressure on the master

Enterprise approaches to master-slave data consistency

1. Use semi-synchronous replication
2. When replication lags, have the application read from the master instead

Reducing replication delay with multi-threaded replication

[root@db01 ~]# mysqld --defaults-file=/data/mysql/3306/my.cnf &
[root@db02 ~]# mysqld --defaults-file=/data/mysql/3306/my.cnf &

(1). Check the current SQL thread state on the slave

[root@db02 ~]# mysql -S /data/mysql/3306/mysql.sock -p
Enter password:
Slave [(none)]> show processlist;
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
| Id | User        | Host      | db   | Command | Time | State                                                  | Info             |
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
|  1 | system user |           | NULL | Connect |   49 | Waiting for master to send event                       | NULL             |
|  2 | system user |           | NULL | Connect |   48 | Slave has read all relay log; waiting for more updates | NULL             |
|  4 | root        | localhost | NULL | Query   |    0 | starting                                               | show processlist |
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
3 rows in set (0.00 sec)

(2). Check the multi-threading parameters; the default of 0 means single-threaded replication

Slave [(none)]> show variables like "%parallel%";
+------------------------+----------+
| Variable_name          | Value    |
+------------------------+----------+
| slave_parallel_type    | DATABASE |
| slave_parallel_workers | 0        |
+------------------------+----------+
2 rows in set (0.01 sec)

(3). Stop replication and change the number of worker threads online

Slave [(none)]> stop slave;
Query OK, 0 rows affected (0.00 sec)
Slave [(none)]> set global slave_parallel_workers = 4;
Query OK, 0 rows affected (0.00 sec)
Slave [(none)]> show variables like "%parallel%";
+------------------------+----------+
| Variable_name          | Value    |
+------------------------+----------+
| slave_parallel_type    | DATABASE |
| slave_parallel_workers | 4        |
+------------------------+----------+
2 rows in set (0.00 sec)

(4). Start replication and check the number of SQL threads

Slave [(none)]> start slave;
Query OK, 0 rows affected (0.04 sec)
Slave [(none)]> show processlist;
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
| Id | User        | Host      | db   | Command | Time | State                                                  | Info             |
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
|  4 | root        | localhost | NULL | Query   |    0 | starting                                               | show processlist |
|  5 | system user |           | NULL | Connect |   28 | Waiting for master to send event                       | NULL             |
|  6 | system user |           | NULL | Connect |   28 | Slave has read all relay log; waiting for more updates | NULL             |
|  7 | system user |           | NULL | Connect |   28 | Waiting for an event from Coordinator                  | NULL             |
|  8 | system user |           | NULL | Connect |   28 | Waiting for an event from Coordinator                  | NULL             |
|  9 | system user |           | NULL | Connect |   28 | Waiting for an event from Coordinator                  | NULL             |
| 10 | system user |           | NULL | Connect |   28 | Waiting for an event from Coordinator                  | NULL             |
+----+-------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
7 rows in set (0.00 sec)

(5). To make the setting permanent, add it to my.cnf

[root@db02 ~]# vim /data/mysql/3306/my.cnf
[mysqld]
slave_parallel_workers = 4

Making the slave read-only

1. When updates are still allowed with read-only set
(1). Users with the SUPER privilege (for example root) can still update data and are not affected by read-only.
(2). Replication threads on the slave (for example those of the replication user rep) can still apply updates and are not affected by read-only.
2. How to set read-only
(1). Start the database with the --read-only option.
mysqld_safe --read-only --user=mysql &
(2). Set it in my.cnf
[root@db02 ~]# vim /data/mysql/3306/my.cnf
[mysqld]
read-only
Then restart the database:
mysqladmin -S /data/mysql/3306/mysql.sock -p shutdown
mysqld --defaults-file=/data/mysql/3306/my.cnf &

Production setup of web users for read/write splitting

Once replication is configured and read/write splitting is in place, the accounts used by the application can be set up in one of two ways:

1. Use different users on the master and the slave, with different privileges.
Grant for the web_w user on the master:
grant select,insert,update,delete on `web`.* to 'web_w'@'192.168.10.%' identified by '123';
Grant for the web_r user on the slave:
grant select on `web`.* to 'web_r'@'192.168.10.%' identified by '123';

2. Have the website use a single user name and password for both the master and the slave.
(1). Use the same user on the master and the slave, but grant different privileges.
Exclude the master's mysql grant database from replication:
[root@db01 ~]# vim /data/mysql/3306/my.cnf
binlog-ignore-db = mysql       # do not write the mysql database to the binlog
replicate-ignore-db = mysql    # do not replicate the mysql database
After creating the web user and its privileges on the master, revoke the corresponding write privileges on the slave:
Master: grant select,insert,update,delete on `web`.* to 'web'@'192.168.10.%' identified by '123';
Slave:  grant select on `web`.* to 'web'@'192.168.10.%' identified by '123';
Set read-only on the slave so that it only accepts reads:
[root@db02 ~]# vim /data/mysql/3306/my.cnf
[mysqld]
read-only
Then restart the database:
mysqladmin -S /data/mysql/3306/mysql.sock -p shutdown
mysqld --defaults-file=/data/mysql/3306/my.cnf &

Delayed replication: setup and recovery practice

[root@db02 ~]# mysql -S /data/mysql/3306/mysql.sock -p
Enter password:
Slave [(none)]> stop slave;
Query OK, 0 rows affected (0.01 sec)
Slave [(none)]> change master to master_delay = 60;
Query OK, 0 rows affected (0.01 sec)
Slave [(none)]> start slave;
Query OK, 0 rows affected (0.02 sec)
Slave [(none)]> show slave status\G
*************************** 1. row ***************************
SQL_Delay: 60                    # replication is delayed by 60 seconds
SQL_Remaining_Delay: NULL        # seconds left before the next event is applied
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates    # SQL thread state

[root@db01 ~]# mysql -S /data/mysql/3306/mysql.sock -p
Enter password:
Master [(none)]> create database app;
Query OK, 1 row affected (0.00 sec)
Master [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| app                |
| mysql              |
| performance_schema |
| shenzhen           |
| sys                |
+--------------------+
6 rows in set (0.00 sec)

Slave [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| shenzhen           |
| sys                |
+--------------------+
5 rows in set (0.01 sec)

The new database is not visible on the slave yet, but the relay log already contains the CREATE statement, which shows that the IO thread is still working in real time.

[root@db02 ~]# cd /data/mysql/3306/data/
[root@db02 /data/mysql/3306/data]# ls -l
total 122952
-rw-r----- 1 mysql mysql       56 Jul 15 05:52 auto.cnf
-rw-r----- 1 mysql mysql      206 Jul 16 01:43 db02-relay-bin.000001
-rw-r----- 1 mysql mysql      476 Jul 16 01:47 db02-relay-bin.000002
-rw-r----- 1 mysql mysql       48 Jul 16 01:43 db02-relay-bin.index
-rw-r----- 1 mysql mysql      599 Jul 15 06:54 ib_buffer_pool
-rw-r----- 1 mysql mysql 12582912 Jul 16 01:23 ibdata1
-rw-r----- 1 mysql mysql 50331648 Jul 16 01:23 ib_logfile0
-rw-r----- 1 mysql mysql 50331648 Jul 15 05:52 ib_logfile1
-rw-r----- 1 mysql mysql 12582912 Jul 16 01:23 ibtmp1
-rw-r----- 1 mysql mysql      122 Jul 16 01:48 master.info
drwxr-x--- 2 mysql mysql     4096 Jul 15 06:35 mysql
drwxr-x--- 2 mysql mysql     8192 Jul 15 05:52 performance_schema
-rw-r----- 1 mysql mysql       59 Jul 16 01:43 relay-log.info
drwxr-x--- 2 mysql mysql       48 Jul 15 06:50 shenzhen
drwxr-x--- 2 mysql mysql     8192 Jul 15 05:52 sys
-rw-r----- 1 mysql mysql       84 Jul 16 01:43 worker-relay-log.info.1
-rw-r----- 1 mysql mysql       84 Jul 16 01:43 worker-relay-log.info.2
-rw-r----- 1 mysql mysql       84 Jul 16 01:43 worker-relay-log.info.3
-rw-r----- 1 mysql mysql       84 Jul 16 01:43 worker-relay-log.info.4
[root@db02 /data/mysql/3306/data]# mysqlbinlog db02-relay-bin.000002
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
create database app
/*!*/;

After one minute:

Slave [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| app                |
| mysql              |
| performance_schema |
| shenzhen           |
| sys                |
+--------------------+
6 rows in set (0.00 sec)

MySQL delayed replication only affects when the SQL thread applies changes to the slave.
The IO thread has already written the master's changes into the slave's relay log.
So even if the master goes down during the delay window, once the delay has elapsed the slave still brings its data up to the state the master was in when it failed.

Recovering data with a delayed slave

1. Set up the scenario: change the slave's delay to 3600 seconds

[root@db02 ~]# mysql -S /data/mysql/3306/mysql.sock -p
Enter password:
Slave [(none)]> stop slave;
Query OK, 0 rows affected (0.01 sec)
Slave [(none)]> change master to master_delay = 3600;
Query OK, 0 rows affected (0.02 sec)
Slave [(none)]> start slave;
Query OK, 0 rows affected (0.03 sec)
Slave [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.10.11
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000005
Read_Master_Log_Pos: 350
Relay_Log_File: db02-relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000005
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 350
Relay_Log_Space: 526
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 113306
Master_UUID: 7c145945-a680-11e9-baea-000c29a14cf7
Master_Info_File: /data/mysql/3306/data/master.info
SQL_Delay: 3600
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set: 7c145945-a680-11e9-baea-000c29a14cf7:1-4
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)

2. Simulate writes on the master: create one database every 5 seconds to stand in for user writes

[root@db01 ~]# for n in {01..05}
> do
> mysql -S /data/mysql/3306/mysql.sock -p123 -e "create database app$n;"
> sleep 5
> done
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@db01 ~]#

3. Simulate someone damaging the data; this could just as well be an UPDATE statement without a WHERE clause.

Master [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| app                |
| app01              |
| app02              |
| app03              |
| app04              |
| app05              |
| mysql              |
| performance_schema |
| shenzhen           |
| sys                |
+--------------------+
11 rows in set (0.00 sec)

Drop the app05 database. The goal of the remaining steps is to get this database back while keeping all other data.

Master [(none)]> drop database app05;
Query OK, 0 rows affected (0.01 sec)
Master [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| app                |
| app01              |
| app02              |
| app03              |
| app04              |
| mysql              |
| performance_schema |
| shenzhen           |
| sys                |
+--------------------+
10 rows in set (0.00 sec)
Master [(none)]> show databases like "%app%";
+------------------+
| Database (%app%) |
+------------------+
| app              |
| app01            |
| app02            |
| app03            |
| app04            |
+------------------+
5 rows in set (0.00 sec)

At this point every ordinary slave already has the bad data. Only the delayed slave is intact, but its data is an hour old.

4. When data has been deleted by mistake, and especially when an UPDATE without a condition has damaged data, the best way to recover completely is to stop external access. User experience has to be sacrificed at this point.

[root@db01 ~]# iptables -I INPUT -p tcp --dport 3306 ! -s 192.168.10.13 -j DROP

Every host other than 192.168.10.13 is blocked from port 3306. 192.168.10.11 is the master's IP; 192.168.10.13 is the IP of the remote SSH client.

5. Log in to the master and the slave and check the binlog send/receive state to confirm.

Master [(none)]> show processlist;
+----+------+---------------------+------+-------------+------+---------------------------------------------------------------+------------------+
| Id | User | Host                | db   | Command     | Time | State                                                         | Info             |
+----+------+---------------------+------+-------------+------+---------------------------------------------------------------+------------------+
|  5 | root | localhost           | NULL | Query       |    0 | starting                                                      | show processlist |
|  6 | rep  | 192.168.10.12:39828 | NULL | Binlog Dump |  754 | Master has sent all binlog to slave; waiting for more updates | NULL             |
+----+------+---------------------+------+-------------+------+---------------------------------------------------------------+------------------+
2 rows in set (0.00 sec)

On the slave:

Slave [(none)]> show processlist;
+----+-------------+-----------+------+---------+------+----------------------------------------------------------------+------------------+
| Id | User        | Host      | db   | Command | Time | State                                                          | Info             |
+----+-------------+-----------+------+---------+------+----------------------------------------------------------------+------------------+
| 20 | root        | localhost | NULL | Query   |    0 | starting                                                       | show processlist |
| 21 | system user |           | NULL | Connect |  789 | Waiting for master to send event                               | NULL             |
| 22 | system user |           | NULL | Connect |  228 | Waiting until MASTER_DELAY seconds after master executed event | NULL             |
| 23 | system user |           | NULL | Connect |  789 | Waiting for an event from Coordinator                          | NULL             |
| 24 | system user |           | NULL | Connect |  789 | Waiting for an event from Coordinator                          | NULL             |
| 25 | system user |           | NULL | Connect |  789 | Waiting for an event from Coordinator                          | NULL             |
| 26 | system user |           | NULL | Connect |  789 | Waiting for an event from Coordinator                          | NULL             |
+----+-------------+-----------+------+---------+------+----------------------------------------------------------------+------------------+
7 rows in set (0.00 sec)
Slave [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| app                |
| mysql              |
| performance_schema |
| shenzhen           |
| sys                |
+--------------------+
6 rows in set (0.00 sec)

6. Stop replication on the slave and check whether the new databases have been synchronized.

Slave [(none)]> stop slave;
Query OK, 0 rows affected (0.01 sec)
Slave [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| app                |
| mysql              |
| performance_schema |
| shenzhen           |
| sys                |
+--------------------+
6 rows in set (0.00 sec)

Because the delay has not elapsed yet, the changes have not been applied to the delayed slave.

7. Using the SQL thread's relay log position recorded in relay-log.info, work out which relay log events have not been applied to the slave yet.

Change to the directory that holds the relay logs:

[root@db02 ~]# cd /data/mysql/3306/data/
[root@db02 /data/mysql/3306/data]# ls -l
total 122952
drwxr-x--- 2 mysql mysql       20 Jul 16 01:48 app
-rw-r----- 1 mysql mysql       56 Jul 15 05:52 auto.cnf
-rw-r----- 1 mysql mysql      206 Jul 16 01:55 db02-relay-bin.000001
-rw-r----- 1 mysql mysql     1282 Jul 16 02:05 db02-relay-bin.000002
-rw-r----- 1 mysql mysql       48 Jul 16 01:55 db02-relay-bin.index
-rw-r----- 1 mysql mysql      599 Jul 15 06:54 ib_buffer_pool
-rw-r----- 1 mysql mysql 12582912 Jul 16 01:49 ibdata1
-rw-r----- 1 mysql mysql 50331648 Jul 16 01:49 ib_logfile0
-rw-r----- 1 mysql mysql 50331648 Jul 15 05:52 ib_logfile1
-rw-r----- 1 mysql mysql 12582912 Jul 16 01:23 ibtmp1
-rw-r----- 1 mysql mysql      123 Jul 16 02:09 master.info
drwxr-x--- 2 mysql mysql     4096 Jul 15 06:35 mysql
drwxr-x--- 2 mysql mysql     8192 Jul 15 05:52 performance_schema
-rw-r----- 1 mysql mysql       61 Jul 16 02:09 relay-log.info
drwxr-x--- 2 mysql mysql       48 Jul 15 06:50 shenzhen
drwxr-x--- 2 mysql mysql     8192 Jul 15 05:52 sys
-rw-r----- 1 mysql mysql       84 Jul 16 01:55 worker-relay-log.info.1
-rw-r----- 1 mysql mysql       84 Jul 16 01:55 worker-relay-log.info.2
-rw-r----- 1 mysql mysql       84 Jul 16 01:55 worker-relay-log.info.3
-rw-r----- 1 mysql mysql       84 Jul 16 01:55 worker-relay-log.info.4
[root@db02 /data/mysql/3306/data]# ls -l *relay*
-rw-r----- 1 mysql mysql  206 Jul 16 01:55 db02-relay-bin.000001      relay log
-rw-r----- 1 mysql mysql 1282 Jul 16 02:05 db02-relay-bin.000002      relay log
-rw-r----- 1 mysql mysql   48 Jul 16 01:55 db02-relay-bin.index       relay log index
-rw-r----- 1 mysql mysql   61 Jul 16 02:09 relay-log.info             position the SQL thread has reached in the relay log
-rw-r----- 1 mysql mysql   84 Jul 16 01:55 worker-relay-log.info.1
-rw-r----- 1 mysql mysql   84 Jul 16 01:55 worker-relay-log.info.2
-rw-r----- 1 mysql mysql   84 Jul 16 01:55 worker-relay-log.info.3
-rw-r----- 1 mysql mysql   84 Jul 16 01:55 worker-relay-log.info.4
[root@db02 /data/mysql/3306/data]# cat relay-log.info
7
./db02-relay-bin.000002    # relay log file the SQL thread is reading
320                        # position the SQL thread has reached in that file
mysql-bin.000005
350
3600
4
1

8. Extract all remaining relay log events that the SQL thread has not applied yet.

Without GTID statements filtered out:
[root@db02 /data/mysql/3306/data]# mysqlbinlog --start-position=320 db02-relay-bin.000002 > /backup/sql/relay.sql
With --skip-gtids, so that the output contains no GTID statements (this run overwrites the file written above):
[root@db02 /data/mysql/3306/data]# mysqlbinlog --skip-gtids --start-position=320 db02-relay-bin.000002 > /backup/sql/relay.sql
[root@db02 ~]# ls -l /backup/sql/relay.sql
-rw-r--r-- 1 root root 3688 Jul 16 02:15 /backup/sql/relay.sql

9. Find the SQL statement that damaged the database and delete it from the extracted SQL. Here it is "drop database app05".

[root@db02 ~]# egrep "drop database app05" /backup/sql/relay.sql
drop database app05
[root@db02 ~]# sed -i '/drop database app05/d' /backup/sql/relay.sql
[root@db02 ~]# egrep "drop database app05" /backup/sql/relay.sql
[root@db02 ~]# egrep "^drop database app05" /backup/sql/relay.sql

10. Load the cleaned-up relay.sql into the delayed slave.

[root@db02 ~]# mysql -S /data/mysql/3306/mysql.sock -p < /backup/sql/relay.sql
Enter password:
Slave [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| app                |
| app01              |
| app02              |
| app03              |
| app04              |
| app05              |
| mysql              |
| performance_schema |
| shenzhen           |
| sys                |
+--------------------+
11 rows in set (0.00 sec)

The app05 database that was dropped earlier has been recovered.
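
Steps 7 to 9 involve copying a file name and a position by hand and editing the dump in place, which is easy to get wrong under pressure. The following is a minimal sketch of how those steps might be scripted. The function names relay_coords and drop_statement and the intermediate file relay.raw.sql are hypothetical, and the sketch assumes a file-based relay-log.info with the layout shown in step 7 (line count first, then the relay log file name, then the position).

```shell
#!/usr/bin/env bash
# Sketch only: helper functions for steps 7 to 9. Function names are hypothetical.

# Print "<relay log file> <position>" from a file-based relay-log.info.
# Assumes the layout shown in step 7: line 1 is a line count, line 2 the
# relay log file name, line 3 the position the SQL thread has reached.
relay_coords() {
  awk 'NR == 2 { file = $1 } NR == 3 { print file, $1; exit }' "$1"
}

# Copy a parsed dump to a new file, leaving out every line that contains
# the given text (fixed string, case-insensitive). The input is not modified.
drop_statement() {
  grep -v -i -F -- "$1" "$2" > "$3"
}

# Example use on the delayed slave (run from the data directory, because
# relay-log.info stores the relay log name as a relative path):
#   read -r relay_file relay_pos < <(relay_coords relay-log.info)
#   mysqlbinlog --skip-gtids --start-position="$relay_pos" "$relay_file" > /backup/sql/relay.raw.sql
#   drop_statement 'drop database app05' /backup/sql/relay.raw.sql /backup/sql/relay.sql
```

Writing the filtered dump to a separate file, rather than using sed -i, keeps the unmodified extract available for comparison if the wrong lines turn out to have been removed.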
Recovery from the delayed slave is now complete. The slave still has to be promoted to master so that it can serve users as the new master, and the other damaged master and slave databases then have to be repaired.

Afterwards, replication is restarted on the slave, a row is written on the master, and the delay is shortened to 20 seconds to confirm that changes are replicated again:

Slave [(none)]> start slave;
Query OK, 0 rows affected (0.01 sec)

Master [shenzhen]> use shenzhen;
Database changed
Master [shenzhen]> insert into t1(id) values(2);
Query OK, 1 row affected (0.01 sec)
Master [shenzhen]> select * from t1;
+------+
| id   |
+------+
|    1 |
|    2 |
+------+
2 rows in set (0.00 sec)

Slave [(none)]> stop slave;
Query OK, 0 rows affected (0.02 sec)
Slave [(none)]> change master to master_delay = 20;
Query OK, 0 rows affected (0.01 sec)
Slave [(none)]> start slave;
Query OK, 0 rows affected (0.03 sec)
Slave [(none)]> select * from shenzhen.t1;
+------+
| id   |
+------+
|    1 |
|    2 |
+------+
2 rows in set (0.00 sec)
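
After changing master_delay it can be useful to confirm the value the slave is actually using without reading through the full status output. The sketch below filters the delay-related fields out of show slave status\G. The function name delay_fields is hypothetical; the field names are the ones that appear in the status output earlier in this article.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical.

# Read `show slave status\G` output on stdin and print the delay-related
# fields as name=value, one per line.
delay_fields() {
  awk -F': ' '
    /^ *(SQL_Delay|SQL_Remaining_Delay|Seconds_Behind_Master):/ {
      sub(/^ +/, "", $1)
      print $1 "=" $2
    }'
}

# Example use on the slave:
#   mysql -S /data/mysql/3306/mysql.sock -p -e 'show slave status\G' | delay_fields
```

With the 20-second delay set above, the SQL_Delay line in the filtered output should read SQL_Delay=20.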