Learning gh-ost

Date: 2019-12-19

gh-ost study notes

References:

https://m.aliyun.com/yunqi/articles/62928

https://m.aliyun.com/yunqi/articles/115565?spm=5176.100239.0.0.bIL3Yr

https://yq.aliyun.com/articles/62928 [recommended]

https://github.com/github/gh-ost/tree/master/doc [recommended]

https://github.com/wing324/helloworld_zh/blob/master/MySQL/gh-ost/GitHub%E5%BC%80%E6%BA%90MySQL%20Online%20DDL%E5%B7%A5%E5%85%B7gh-ost%E5%8F%82%E6%95%B0%E8%A7%A3%E6%9E%90.md

http://blog.csdn.net/qq_36281740/article/details/52711284

Below is a general_log screenshot taken while adding a column:

grant ALL on *.* to ptosc@'10.0.20.%' identified by 'ptosc'; -- grant the privileges gh-ost needs

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 -allow-on-master -database="hellodb" -debug -table="coc" -alter "add column cl111 char(20) not null" -assume-rbr -execute --initially-drop-old-table --ok-to-drop-table

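Notes on the tables involved:

coc: the original table

_coc_del: the former original table. When the alter completes, the original table is automatically renamed to _coc_del [for safety, gh-ost does not drop this table by default]

_coc_gho: the shadow (ghost) table. When the alter completes, it is automatically renamed to coc

_coc_ghc: records progress and other changelog information; its contents look like the figure below: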

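Process (see https://github.com/github/gh-ost):

When gh-ost connects to the master, the rough sequence of operations is:

1. gh-ost connects to the master, checks the account privileges and the mysqld version, and estimates the number of rows in the table to be altered.

2. It initializes the stream connection, starts listening to the binlog, and records the binlog position from before gh-ost began. It also initializes the Applier connection.

3. It drops the _xxx_ghc table (if one exists from before) and recreates it; this table records gh-ost's changelog.

4. It creates the _xxx_gho table (the final table), runs the ALTER on it, and then copies the rows of the original table into it. (While this is going on, gh-ost poses as a slave, receives the binlog events produced by DML during the operation, and applies them to the gho table.)

5. [cut-over: swapping the old and new tables] It renames the xxx table to _xxx_del and renames _xxx_gho to xxx. [The table is locked at the instant of the rename.]

6. It drops the _xxx_ghc table.

7. It drops the _xxx_del table (by default the original table is not dropped; this requires the -ok-to-drop-table flag).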

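Implementation details of migration and cut-over:
Of gh-ost's implementation, only two topics are analyzed in detail here: the ordering of rowcopy versus binlog apply, and the rename process.

Data migration process

During data migration there are three sources of change, labelled here as A: rowcopy from the original table, B: binlog apply, and C: DML on the original table.

C is written to the binlog and thereby triggers B, so B always comes after C. In the general case there are therefore two orderings, ACB and CBA. In special cases, such as a delayed binlog apply, the ordering CAB can also occur.

Before analyzing the three orderings, it helps to know how gh-ost maps SQL statements: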

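RowCopy

Original-table operation | New-table operation
select | insert ignore into

BinlogApply

Original-table operation | New-table operation
insert | replace into
update | update on the new table (full-row update)
delete | delete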

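With these rules in place, let's analyze the effect of each ordering in turn:

1. insert operations

The binlog is the authority, and gh-ost's principle is that the binlog wins. Under any ordering the data therefore ends up consistent with the binlog: if rowcopy comes later, its insert ignore leaves the existing row alone; if binlog apply comes later, its replace into overwrites the row.

2. update/delete operations

In the general case:

ACB: an update/delete hits the original table for rows that rowcopy has already copied. The change is then carried out entirely through binlog apply. Note that the binlog apply update overwrites every column of the row, so there is no accumulation problem.

CBA: an update/delete hits the original table for rows that have not been migrated yet. Binlog apply on the new table is then a no-op, and the actual data is brought over later by rowcopy.

In the special case:

CAB: the original table is updated first, rowcopy migrates the row before binlog apply runs, and the binlog event is applied again when it arrives. Is that a problem? Given gh-ost's SQL mapping for binlog apply, an insert is replaced again by replace into, an update rewrites all columns of the matching row, and a delete is a no-op. The final data is still consistent.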
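The three orderings can be checked with a toy model. The sketch below is only an illustration of the reasoning above, not gh-ost code: tables are plain Python dicts keyed by primary key, and the function names (`rowcopy`, `binlog_apply`, `dml`, `run`) are made up for this example.

```python
# Toy model: each table is a dict mapping primary key -> row (a dict).

def rowcopy(original, ghost, pk):
    """A: select from the original table, insert ignore into the ghost table."""
    if pk in original and pk not in ghost:
        ghost[pk] = dict(original[pk])

def binlog_apply(ghost, event):
    """B: replay one binlog event on the ghost table using the mapping above."""
    kind, pk, row = event
    if kind == "insert":
        ghost[pk] = dict(row)        # replace into
    elif kind == "update":
        if pk in ghost:              # no-op when the row has not been copied yet
            ghost[pk] = dict(row)    # full-row overwrite
    elif kind == "delete":
        ghost.pop(pk, None)          # no-op when the row is absent

def dml(original, binlog, kind, pk, row=None):
    """C: change the original table and record the row image in the binlog."""
    if kind == "delete":
        original.pop(pk, None)
    else:
        original[pk] = dict(row)
    binlog.append((kind, pk, row))

def run(order):
    """Run steps A, B and C in the given order for a single updated row."""
    original = {1: {"id": 1, "n": 10}}
    ghost, binlog = {}, []
    for step in order:
        if step == "A":
            rowcopy(original, ghost, 1)
        elif step == "C":
            dml(original, binlog, "update", 1, {"id": 1, "n": 11})
        elif step == "B":
            binlog_apply(ghost, binlog.pop(0))
    return original, ghost

results = {order: run(order) for order in ("ACB", "CBA", "CAB")}
for order, (original, ghost) in results.items():
    print(order, ghost == original)
```

In this model all three orderings leave the ghost table equal to the original table, which matches the conclusion that the data ends up consistent.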

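Cut-over process:
In pt-osc or online DDL, the final rename usually takes very little time. But if a large query arrives during the schema change, the rename has to wait on the MDL lock, which is a serious problem at peak hours. So how does gh-ost handle this?

gh-ost exploits a MySQL behavior: among all blocked requests, an atomic rename always has the highest priority. gh-ost's scheme builds on this. One connection locks the original table, and a second connection attempts the rename, which is blocked. When the lock is released, the rename executes first, and the other blocked requests then proceed against the new table.

The migrator.go:iterateChunks() function determines when cut-over starts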

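The cut-over proceeds as follows:

START

Session A: CREATE table tbl_old (prevents the rename from executing too early)

Session A: LOCK TABLES tbl WRITE, tbl_old WRITE (lock_wait_timeout is set to 2s to bound the wait; on timeout the step is retried up to default-retries times, 60 by default)

New requests arrive; those touching the original table are blocked

Second connection: RENAME TABLE tbl TO tbl_old, ghost TO tbl (likewise blocked)

New requests arrive; those touching the original table are blocked

gh-ost checks through show processlist that the blocked RENAME request is present

Session A: DROP TABLE tbl_old

Session A: UNLOCK TABLES

RENAME SUCCESS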

session1: begin;

session2: begin;

session1: lock tables t1 write, _t1_del write;

session2: rename table t1 to _t1_del, _t1_gho to t1; -- blocked

session1: drop table if exists _t1_del;

session1: unlock tables;

After session1 releases the lock, session2's rename succeeds.

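How failures at different stages are handled:

If the first step fails, the program exits

If session A creates the table but fails to take the lock, the program exits with no lock held

If session A dies when the rename request arrives, the lock is released automatically; because tbl_old exists, the rename fails as well, and all requests return to normal

If session A dies while the rename is blocked, the lock is released automatically; again because tbl_old exists, the rename fails, and all requests return to normal

If the rename dies, gh-ost no longer sees the rename; session A keeps running and releases the lock, and all requests return to normal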
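Why the tbl_old sentinel makes a crash of session A harmless can be shown with a small model. This is a sketch under simplifying assumptions, not MySQL or gh-ost code: the set of table names stands in for the schema, and `rename_tables` is a hypothetical helper that imitates an all-or-nothing multi-table RENAME.

```python
def rename_tables(tables, pairs):
    """Model of an atomic multi-table RENAME: every pair applies or none does."""
    staged = set(tables)
    for source, target in pairs:
        if source not in staged or target in staged:
            raise RuntimeError(f"cannot rename {source} to {target}")
        staged.remove(source)
        staged.add(target)
    tables.clear()
    tables.update(staged)

CUT_OVER = [("tbl", "tbl_old"), ("ghost", "tbl")]

# Session A died before dropping the sentinel: tbl_old still exists,
# so the unblocked rename fails and nothing changes.
crashed = {"tbl", "ghost", "tbl_old"}
try:
    rename_tables(crashed, CUT_OVER)
    crashed_outcome = "renamed"
except RuntimeError:
    crashed_outcome = "rename failed"

# Normal path: session A dropped tbl_old just before unlocking.
normal = {"tbl", "ghost"}
rename_tables(normal, CUT_OVER)

print(crashed_outcome, sorted(crashed))
print(sorted(normal))
```

In the crashed case the original table is still in place under its own name, which is why all requests simply return to normal.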

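gh-ost needs fairly broad privileges, so strictly control which IPs may connect with this account

grant ALL on *.* to ptosc@'10.0.20.%' identified by 'ptosc'; # If gh-ost is only ever run on the master from an SSH terminal, granting access to the local address is enough.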

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 -allow-on-master -database="hellodb" -debug -table="coc" -alter "add column cl1 varchar(20) not null default \"test col1\" " -assume-rbr -ok-to-drop-table -serve-socket-file="/tmp/gh-ost.hellodb.coc.sock" -execute

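gh-ost has three working modes:

a: connect to a replica, migrate on the master.

b: connect to the master; every operation of the migration, including reading the binlog, happens on the master. [Recommended]

c: run a test migration on a replica.

Comparison of the three modes:

All three have pros and cons, but I only care about the cons. Starting with a: it reads the binlog from the replica, yet one important reason master and replica data diverge is that the master's binlog has not been fully applied on the replica. So I personally feel mode a carries a risk of losing data.

With b, everything runs on the master, which affects the master's load to some degree. That impact can be reduced by tuning some parameters and can be monitored continuously, so I personally recommend mode b.

Mode c is geared toward testing and is not covered in depth here. One detail is worth noting: the cut-over phase includes a stop slave, which is quite risky. stop slave can sometimes take a long time and will then affect use of the production database, so even tests with mode c should be run against an offline database.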

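gh-ost parameters in detail:

-allow-master-master Allows running in a master-master (dual-master) setup. Dual-master seems to be used less often in production these days; MHA is the more common choice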

explicitly allow running in a master-master setup

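-allow-nullable-unique-key Allows gh-ost to migrate based on a unique key whose columns are nullable; by default such keys are not accepted. If the chosen key actually contains NULL values, the migrated data may be incorrect, so use with caution.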

allow gh-ost to migrate based on a unique key with nullable columns. As long as no NULL values exist, this should be OK. If NULL values exist in chosen key, data may be corrupted. Use at your own risk!

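-allow-on-master Allows the migration to run directly on the master [by default gh-ost expects to run against a replica; this option forces it to run on the master, which is the way I recommend using gh-ost]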

allow this migration to run directly on master. Preferably it would run on a replica

-alter string The ALTER statement to apply [it must be wrapped in double or single quotes]

alter statement (mandatory)

-approve-renamed-columns Required when the ALTER statement renames a column

in case your ALTER statement renames columns, gh-ost will note that and offer its interpretation of the rename. By default gh-ost does not proceed to execute. This flag approves that gh-ost's interpretation is correct

For example:

    ./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 -allow-on-master -database="hellodb" -table="coc" -approve-renamed-columns -alter "change column c1 newc1  varchar(100) not null default \"測試字段1\" COMMENT \"測試註釋1\""  -assume-rbr -execute -ok-to-drop-table    # the old table is dropped once the operation completes

-assume-master-host string

(optional) explicitly tell gh-ost the identity of the master. Format:some.host.com[:port] This is useful in master-master setups where you wish to pick an explicit master, or in a tungsten-replicator where gh-ost is unable to determine the master

-assume-rbr When you have confirmed that the instance gh-ost connects to uses binlog_format=ROW, you can pass -assume-rbr. gh-ost then does not run stop slave / start slave on the replica, and the user running gh-ost does not need the SUPER privilege.

set to 'true' when you know for certain your server uses 'ROW' binlog_format. gh-ost is unable to tell, even after reading binlog_format, whether the replication process does indeed use 'ROW', and restarts replication to be certain RBR setting is applied. Such operation requires SUPER privileges which you might not have. Setting this flag avoids restarting replication and you can proceed to use gh-ost without SUPER privileges

-check-flag Checks whether another flag exists and is supported

Check if another flag exists/supported. This allows for cross-version scripting. Exits with 0 when all additional provided flags exist, nonzero otherwise. You must provide (dummy) values for flags that require a value. Example: gh-ost --check-flag --cut-over-lock-timeout-seconds --nice-ratio 0

-chunk-size int Number of rows to process in each iteration (allowed range: 100-100000); the default is 1000

amount of rows to handle in each iteration (allowed range: 100-100,000) (default 1000)
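As a rough feel for what the chunk size means for a large table, the sketch below estimates how many row-copy iterations a migration needs. The helper name `chunk_iterations` is made up for this example, and rejecting out-of-range values is only a choice made for the sketch; these notes do not say how gh-ost itself treats such values.

```python
import math

MIN_CHUNK_SIZE, MAX_CHUNK_SIZE = 100, 100_000  # allowed range quoted above

def chunk_iterations(estimated_rows, chunk_size=1000):
    """Estimate the number of row-copy iterations for a table."""
    if not MIN_CHUNK_SIZE <= chunk_size <= MAX_CHUNK_SIZE:
        raise ValueError("chunk size must be between 100 and 100000")
    return math.ceil(estimated_rows / chunk_size)

print(chunk_iterations(2_500_000))                   # default chunk size
print(chunk_iterations(2_500_000, chunk_size=5000))  # larger chunks, fewer iterations
```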

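-concurrent-rowcount When true (the default), gh-ost starts row-copy first and counts the rows afterwards (using explain select count(*)), adjusting the ETA later; when false, gh-ost counts the rows first and only then starts row-copy.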

(with --exact-rowcount), when true (default): count rows after row-copy begins, concurrently, and adjust row estimate later on; when false: first count rows, then start row copy (default true)

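-conf string Configuration file for gh-ost to read [usually it is enough to pass the DDL options directly on the command line]

-critical-load string A comma-separated list of status-name=threshold pairs. When a MySQL status value exceeds its threshold, gh-ost quits. [Rarely used]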

Comma delimited status-name=threshold, same format as --max-load. When status exceeds threshold, app panics and quits

    For example: -critical-load "Threads_running=1000,threads_connected=5000" means that when the MySQL status values reach Threads_running>1000, threads_connected>5000, gh-ost stops and exits because the database is critically loaded.

-critical-load-hibernate-seconds int [Rarely used]

When nonzero, critical-load does not panic and bail out; instead, gh-ost goes into hibernate for the specified duration. It will not read/write anything to from/to any server

-critical-load-interval-millis int [Rarely used]

When 0, migration immediately bails out upon meeting critical-load. When non-zero, a second check is done after given interval, and migration only bails out if 2nd check still meets critical load. The interval is given in milliseconds.
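The two-check behaviour described for -critical-load-interval-millis can be sketched as follows. This only illustrates the rule as stated above, not gh-ost's implementation; `should_bail_out` and its `meets_critical_load` callback are hypothetical names, and the sleep function is injectable so the example runs instantly.

```python
import time

def should_bail_out(meets_critical_load, interval_millis, sleep=time.sleep):
    """Return True when the migration should give up because of critical load."""
    if not meets_critical_load():
        return False                     # first check is fine, keep going
    if interval_millis == 0:
        return True                      # no grace period configured
    sleep(interval_millis / 1000.0)      # the flag is expressed in milliseconds
    return meets_critical_load()         # bail out only if still overloaded

# A short spike: overloaded on the first check, recovered on the second.
spike = iter([True, False])
print(should_bail_out(lambda: next(spike), 500, sleep=lambda seconds: None))
```

With a non-zero interval a short spike is tolerated, while a sustained overload still ends the migration.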

-cut-over string Chooses the cut-over type: atomic or two-step. atomic (the default) is GitHub's algorithm; two-step uses the facebook-OSC algorithm.

choose cut-over type (default|atomic, two-step) (default "atomic")

-cut-over-lock-timeout-seconds int Maximum time gh-ost waits on locks during the cut-over phase; when the lock times out, the cut-over is retried. (default: 3)

Max number of seconds to hold locks on tables while attempting to cut-over (retry attempted when lock exceeds timeout) (default 3)

-database string Name of the database to operate on

-debug Sets the log level to debug

-default-retries int Number of times various operations are retried before panicking. (default: 60)

Default number of retries for various operations before panicking (default 60)

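-discard-foreign-keys [Dangerous] When gh-ost runs with this flag on a table that has foreign keys, the foreign keys are lost. It is well suited to dropping foreign keys on purpose; otherwise use with caution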

DANGER! This flag will migrate a table that has foreign keys and will NOT create foreign keys on the ghost table, thus your altered table will have NO foreign keys. This is useful for intentional dropping of foreign keys

-dml-batch-size int Number of DML events applied in a single transaction

batch size for DML events to apply in a single transaction (range 1-100) (default 10)

-exact-rowcount Counts the table rows with count(*) instead of estimating them from information_schema (for a more accurate progress estimate)

actually count table rows as opposed to estimate them (results in more accurate progress estimation)

-execute Actually performs the ALTER; without it gh-ost only does a dry run

-force-named-cut-over

When true, the 'unpostpone|cut-over' interactive command must name the migrated table

-force-table-names string

table name prefix to be used on the temporary tables

-heartbeat-interval-millis int How often gh-ost injects a heartbeat value, in milliseconds; the default is 100

how frequently would gh-ost inject a heartbeat value (default 100)

-hooks-hint string

arbitrary message to be injected to hooks via GH_OST_HOOKS_HINT, for your convenience

-hooks-path string

directory where hook files are found (default: empty, ie. hooks disabled). Hook files found on this path, and conforming to hook naming conventions will be executed

-host string MySQL hostname (preferably a replica, not the master) (default "127.0.0.1")

-initially-drop-ghost-table Before gh-ost starts, drops an existing gho table of the same name, if there is one [dangerous, use with care]

Drop a possibly existing Ghost table (remains from a previous run?) before beginning operation. Default is to panic and abort if such table exists

-initially-drop-old-table Before gh-ost starts, checks for and drops an existing old table. [Dangerous, use with care]

Drop a possibly existing OLD table (remains from a previous run?) before beginning operation. Default is to panic and abort if such table exists

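-initially-drop-socket-file Makes gh-ost forcibly delete an existing socket file. Not recommended: it may delete the socket file of a gh-ost process that is still running and cause that DDL to fail. [Dangerous, use with care]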

Should gh-ost forcibly delete an existing socket file. Be careful: this might drop the socket file of a running migration!

-master-password string

MySQL password on master, if different from that on replica. Requires --assume-master-host

-master-user string

MySQL user on master, if different from that on replica. Requires --assume-master-host

-user string MySQL user name

-max-lag-millis int Maximum replication lag. When replication lag exceeds this value, gh-ost throttles. Default: 1500 ms

replication lag at which to throttle operation (default 1500)

-max-load string A comma-separated list of status-name=threshold pairs. When a MySQL status value exceeds its threshold, gh-ost throttles.

Comma delimited status-name=threshold. e.g: 'Threads_running=100,Threads_connected=500'. When status exceeds threshold, app throttles writes

          For example: -max-load "Threads_running=20,Connections=1500" means that gh-ost throttles when the MySQL status values reach Threads_running>20, Connections>1500.
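The status-name=threshold format shared by -max-load and -critical-load is easy to reason about with a small parser. The sketch below is an illustration only: `parse_load` and `exceeded` are names invented here, the status values are made-up sample data, and it assumes that exceeding any single threshold is enough to trigger the reaction.

```python
def parse_load(spec):
    """Turn 'Threads_running=20,Connections=1500' into a dict of thresholds."""
    thresholds = {}
    for item in spec.split(","):
        if not item.strip():
            continue                      # tolerate empty segments
        name, _, value = item.strip().partition("=")
        thresholds[name] = int(value)
    return thresholds

def exceeded(thresholds, status):
    """Return the sorted names of status values above their threshold."""
    return sorted(name for name, limit in thresholds.items()
                  if status.get(name, 0) > limit)

max_load = parse_load("Threads_running=20,Connections=1500")
status = {"Threads_running": 35, "Connections": 900}   # sample values
print(exceeded(max_load, status))
```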

-migrate-on-replica Runs the gh-ost migration on a replica

Have the migration run on a replica, not on the master. This will do the full migration on the replica including cut-over (as opposed to --test-on-replica)

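-nice-ratio float Sleep time relative to the time spent on each chunk. [0: no sleep after a chunk; 1: for every 1 ms of row-copy, sleep an additional 1 ms; 0.7: for every 10 ms of row-copy, sleep an additional 7 ms]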

force being 'nice', imply sleep time per chunk time; range: [0.0..100.0]. Example values: 0 is aggressive. 1: for every 1ms spent copying rows, sleep additional 1ms (effectively doubling runtime); 0.7: for every 10ms spend in a rowcopy chunk, spend 7ms sleeping immediately after
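The arithmetic behind -nice-ratio is a simple proportion. The function below is a sketch of that proportion only; `nice_sleep_ms` is a name invented for this example and is not part of gh-ost.

```python
def nice_sleep_ms(chunk_copy_ms, nice_ratio):
    """Sleep time after a chunk: the copy time multiplied by the nice ratio."""
    if not 0.0 <= nice_ratio <= 100.0:
        raise ValueError("nice ratio must be within [0.0, 100.0]")
    return chunk_copy_ms * nice_ratio

for ratio in (0, 0.7, 1):
    # A chunk that took 10 ms to copy, under three different ratios.
    print(ratio, nice_sleep_ms(10, ratio))
```

A ratio of 1 doubles the total runtime, since every millisecond of copying is followed by a millisecond of sleep.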

-ok-to-drop-table Drops the old table when gh-ost finishes. By default the old table is kept and remains as _tablename_del

Shall the tool drop the old table at end of operation. DROPping tables can be a long locking operation, which is why I'm not doing it by default. I'm an online tool, yes?

-panic-flag-file string When this file is created, gh-ost exits immediately without cleaning up the temporary files and socket file it created [use with care]

when this file is created, gh-ost will immediately terminate, without cleanup

-password string MySQL password

-port int MySQL port

-postpone-cut-over-flag-file string While this file exists, gh-ost postpones the cut-over phase until the file is deleted.

while this file exists, migration will postpone the final stage of swapping tables, and will keep on syncing the ghost table. Cut-over/swapping would be ready to perform the moment the file is deleted.
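The flag-file mechanism boils down to an existence check on a path. The sketch below illustrates that idea with a temporary directory; `cut_over_postponed` and the file name are hypothetical, and gh-ost's own check may differ in detail.

```python
import os
import tempfile

def cut_over_postponed(flag_file):
    """Cut-over waits for as long as the flag file exists."""
    return bool(flag_file) and os.path.exists(flag_file)

with tempfile.TemporaryDirectory() as workdir:
    flag = os.path.join(workdir, "ghost.postpone.flag")  # hypothetical file name
    open(flag, "w").close()              # operator creates the file
    while_present = cut_over_postponed(flag)
    os.remove(flag)                      # operator deletes it when ready
    after_delete = cut_over_postponed(flag)

print(while_present, after_delete)
```

This lets an operator let the row copy finish at any time and trigger the table swap later, for example outside peak hours, simply by deleting the file.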

-quiet Quiet mode

-replica-server-id uint

server id used by gh-ost process. Default: 99999 (default 99999)

-replication-lag-query string

Deprecated. gh-ost uses an internal, subsecond resolution query

-serve-socket-file string

Unix socket file to serve on. Default: auto-determined and advertised upon startup

-serve-tcp-port int

TCP port to serve on. Default: disabled

-skip-foreign-key-checks Skips the foreign key check. Use it only when you are certain the table has no foreign keys [dangerous, use with care]

set to 'true' when you know for certain there are no foreign keys on your table, and wish to skip the time it takes for gh-ost to verify that

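-skip-renamed-columns ALTER Tells gh-ost to skip migrating the data of the renamed column and to treat it as an unrelated column. This is dangerous: you will lose all values in that column. [Dangerous, use with care]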

in case your ALTER statement renames columns, gh-ost will note that and offer its interpretation of the rename. By default gh-ost does not proceed to execute. This flag tells gh-ost to skip the renamed columns, i.e. to treat what gh-ost thinks are renamed columns as unrelated columns. NOTE: you may lose column data

    For example: gh-ost -user="wing" -host="127.0.0.1" -port=3306 -database="wing" -table="t" -password="wing" -alter="change column c1 col1 int not null default 0" -assume-rbr -execute -skip-renamed-columns ALTER    # this ALTER discards all of the original data in column c1

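-stack Adds a stack trace to errors

-switch-to-rbr Lets gh-ost automatically switch the replica's binlog_format to ROW [production environments already use the row format, so this flag is rarely needed]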

let this tool automatically switch binary log format to 'ROW' on the replica, if needed. The format will NOT be switched back. I'm too scared to do that, and wish to protect you if you happen to execute another migration while this one is running

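-table string Name of the table to ALTER

-test-on-replica Tests gh-ost on a replica: the migration runs on the replica, replication is stopped (stop slave) once the migration finishes, and the original and ghost tables are swapped and then immediately swapped back. Replication stays stopped so that you can compare the two tables.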

Have the migration run on a replica, not on the master. At the end of migration replication is stopped, and tables are swapped and immediately swap-revert. Replication remains stopped and you can compare the two tables for building trust

-test-on-replica-skip-replica-stop When -test-on-replica is used, tells gh-ost not to stop replication (stop slave) during the process.

When --test-on-replica is enabled, do not issue commands stop replication (requires --test-on-replica)

-throttle-additional-flag-file string

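Once this file is created, gh-ost pauses its work immediately. This is useful when several gh-ost operations run at the same time: create one file to pause all of them, or delete the file to let them all resume.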

-throttle-control-replicas string

A list of all replicas whose replication lag should be checked, e.g. -throttle-control-replicas=192.16.12.22:3306,192.16.12.23:3307,192.16.13.12:3308

-throttle-flag-file string

Once this file is created, gh-ost pauses its work immediately. This flag is suited to controlling a single gh-ost operation.

-throttle-http string

when given, gh-ost checks given URL via HEAD request; any response code other than 200 (OK) causes throttling; make sure it has low latency response

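-throttle-query string Throttle query, executed once per second. A return value of 0 means no throttling is needed; a value greater than 0 means gh-ost should throttle. The query runs on the migrated server, so make sure it is lightweight.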

when given, issued (every second) to check if operation should throttle. Expecting to return zero for no-throttle, >0 for throttle. Query is issued on the migrated server. Make sure this query is lightweight
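The zero / greater-than-zero contract of the throttle query can be tried out locally. In the sketch below an in-memory sqlite3 database stands in for MySQL purely so the example is runnable; the `maintenance` table, its `active` column and the `should_throttle` helper are all assumptions made for illustration.

```python
import sqlite3

def should_throttle(conn, throttle_query):
    """Throttle when the first column of the first row is greater than zero."""
    row = conn.execute(throttle_query).fetchone()
    value = row[0] if row else 0
    return (value or 0) > 0

# sqlite3 is only a stand-in for the migrated MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("create table maintenance (active integer)")
conn.execute("insert into maintenance values (0)")

QUERY = "select active from maintenance"    # hypothetical lightweight query
before = should_throttle(conn, QUERY)
conn.execute("update maintenance set active = 1")
after = should_throttle(conn, QUERY)
print(before, after)
```

A single-row lookup like this keeps the query cheap, which matters because it is issued every second.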

-timestamp-old-table

Use a timestamp in old table name. This makes old table names unique and non conflicting cross migrations

-tungsten Tells gh-ost that you are running on a tungsten-replication topology.

explicitly let gh-ost know that you are running on a tungsten-replication based topology (you are likely to also provide --assume-master-host)

-verbose Prints detailed output while gh-ost runs

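Examples:

Master: 10.0.20.25

Replica: 10.0.20.26

On the master, create the account used for alter operations: grant ALL on *.* to ptosc@'10.0.20.25' identified by 'ptosc' ;

Note: all of the commands below are run on the master. In the parameters, - and -- work the same way; there is no difference. Also, the table being operated on must have a primary key.

./gh-ost -version reports 1.0.42, the version used here.

Add a column: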

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 \
-max-load=Threads_running=25 \
-critical-load=Threads_running=1000 \
-chunk-size=1000 \
-allow-on-master --verbose -database="hellodb" -table="coc" \
-alter " add column c1 varchar(100) not null default \"測試字段1\" COMMENT \"測試註釋1\"" \
-assume-rbr -ok-to-drop-table -execute

Change the character set of a column:

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 \
-allow-on-master --verbose -database="hellodb" -table="coc" \
-alter "change c1 c1 varchar(128) CHARACTER SET utf8mb4" \
-assume-rbr -ok-to-drop-table -execute

Drop a column:

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 \
-allow-on-master --verbose -database="hellodb" -table="coc" \
-alter "drop column c1" \
-assume-rbr -ok-to-drop-table -execute

Rename a column:

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 \
-allow-on-master --verbose -database="hellodb" -table="coc" \
-approve-renamed-columns \
-alter "change column c1 newc1 varchar(100) not null default \"測試字段1\" COMMENT \"測試註釋1\"" \
-assume-rbr -ok-to-drop-table -execute

Add an index:

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 \
-allow-on-master --verbose -database="hellodb" -table="coc" \
-alter "add index idx_newc1_classid(newc1,classid)" \
-assume-rbr -ok-to-drop-table -execute

Drop an index:

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 \
-allow-on-master --verbose -database="hellodb" -table="coc" \
-alter "drop index idx_newc1_classid" \
-assume-rbr -ok-to-drop-table -execute

Change the table engine:

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 \
-allow-on-master --verbose -database="hellodb" -table="coc" \
-alter "engine=InnoDB" \
-assume-rbr -ok-to-drop-table -execute

Drop a column without dropping the old table:

./gh-ost -user="ptosc" -password="ptosc" -host="10.0.20.25" -port=3306 \
-allow-on-master --verbose -database="hellodb" -table="coc" \
-alter "drop column newc1" \
-assume-rbr -execute
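Since the commands above differ only in the -alter statement, they can be generated from one template. The sketch below only builds and prints the command line, it does not run gh-ost; `gh_ost_command` is a hypothetical helper, the connection values are the ones used in these notes, and -execute is left off by default so that the result is a dry run.

```python
import shlex

def gh_ost_command(database, table, alter, execute=False):
    """Build a gh-ost argument list from the flags used in the examples above."""
    args = [
        "./gh-ost",
        "-user=ptosc", "-password=ptosc",
        "-host=10.0.20.25", "-port=3306",
        "-allow-on-master", "-assume-rbr",
        f"-database={database}", f"-table={table}",
        "-alter", alter,
    ]
    if execute:
        args.append("-execute")          # without this flag gh-ost only does a dry run
    return args

dry_run = gh_ost_command("hellodb", "coc", "drop column newc1")
print(shlex.join(dry_run))
```

Building the arguments as a list and quoting them with shlex.join avoids the nested-quote escaping that the hand-written commands need around the ALTER statement.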