技術分享 | event_ scheduler致使複製中斷的故障分析

原創做者: 洪斌php


 

問題背景

在5.6.29和5.7.11版本以前,當binlog格式設置成mixed時,建立event事件中包含sysdate函數時,會致使複製中斷。mysql

與此bug相關:linux

https://bugs.mysql.com/bug.php?id=71859git

場景重現

測試版本:mysql 5.6.23重現方法:分別設置statement、mixed、row格式執行如下語句,mixed格式時會致使複製中斷,其餘兩種格式不會形成複製中斷。sql

DELIMITER $$
DROP EVENT IF EXISTS `MIS_CUMULATIVE_REV_EVENT`
$$
CREATE EVENT IF NOT EXISTS `MIS_CUMULATIVE_REV_EVENT`
ON SCHEDULE EVERY 3 second STARTS sysdate()
ON COMPLETION PRESERVE
DO BEGIN
DECLARE EXIT HANDLER FOR SQLEXCEPTION
SELECT CONCAT('SQL EXCEPTION - TERMINATING EVENT FOR STORED PROCEDURE CALL');
END
$$

 

  • statement 模式

set global binlog_format=statement;安全

設置5s延遲觀察下starts字段時間,複製正常,主從有差別ide

slave2 [localhost] {msandbox} (test) > change master to MASTER_DELAY=5;函數

  • Row模式

set global binlog_format=row;測試

複製正常,主從有差別ui

start字段一樣有差別,由於sysdate函數是sql實際執行時間;若是是now函數,start字段時間會是同樣的。

  • mixed模式

set global binlog_format=mixed;

關閉延遲

mixed模式下產生的binlog日誌

正常的statement 和row 模式

從這些截圖可知,在mixed模式下建立event時,在同一個事務內即產生了DDL statement,也產生了額外寫入mysql.event表的row事件。

對於DDL語句在任何binlog格式下,都會以statement格式記錄。

mixed模式下遇到非複製安全函數會轉換成row模式。例如:sysdate()返回實際執行時間而非調用時間與now函數有差別。

mysql> SELECT NOW(), SLEEP(2), NOW();
+---------------------+----------+---------------------+
| NOW() | SLEEP(2) | NOW() |
+---------------------+----------+---------------------+
| 2006-04-12 13:47:36 | 0 | 2006-04-12 13:47:36 |
+---------------------+----------+---------------------+
mysql> SELECT SYSDATE(), SLEEP(2), SYSDATE();
+---------------------+----------+---------------------+
| SYSDATE() | SLEEP(2) | SYSDATE() |
+---------------------+----------+---------------------+
| 2006-04-12 13:47:44 | 0 | 2006-04-12 13:47:46 |
+---------------------+----------+---------------------+

 

問題分析

利用systemtap觀察下,create event時函數調用,定位爲什麼產生了row事件。查看源碼中有哪些create_event函數

[root@10-186-21-66 ~]# stap -L 'process("/usr/local/mysql/bin/mysqld").function("create_event")'
process("/usr/local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld").function("create_event@/export/home/pb2/build/sb_0-14249461-1422537824.58/mysqlcom-pro-5.6.23/sql/event_db_repository.cc:660") $this:class Event_db_repository* const $thd:struct THD* $parse_data:struct Event_parse_data* $create_if_not:bool $event_already_exists:bool* $table:struct TABLE* $sp:struct sp_head* $saved_mode:sql_mode_t $mdl_savepoint:class MDL_savepoint
process("/usr/local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld").function("create_event@/export/home/pb2/build/sb_0-14249461-1422537824.58/mysqlcom-pro-5.6.23/sql/event_queue.cc:200") $this:class Event_queue* const $thd:struct THD* $new_element:struct Event_queue_element* $created:bool* $__FUNCTION__:char[] const
process("/usr/local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld").function("create_event@/export/home/pb2/build/sb_0-14249461-1422537824.58/mysqlcom-pro-5.6.23/sql/events.cc:307") $thd:struct THD* $parse_data:struct Event_parse_data* $if_not_exists:bool

 

共有如下3處

第1處:

/**
Create a new event.
@param[in,out] thd THD
@param[in] parse_data Event's data from parsing stage
@param[in] if_not_exists Whether IF NOT EXISTS was
specified
In case there is an event with the same name (db) and
IF NOT EXISTS is specified, an warning is put into the stack.
@sa Events::drop_event for the notes about locking, pre-locking
and Events DDL.
@retval FALSE OK
@retval TRUE Error (reported)
*/
bool
Events::create_event(THD *thd, Event_parse_data *parse_data,
bool if_not_exists)

 

第2處:

/**
Creates an event record in mysql.event table.
Creates an event. Relies on mysql_event_fill_row which is shared with
::update_event.
@pre All semantic checks must be performed outside. This function
only creates a record on disk.
@pre The thread handle has no open tables.
@param[in,out] thd THD
@param[in] parse_data Parsed event definition
@param[in] create_if_not TRUE if IF NOT EXISTS clause was provided
to CREATE EVENT statement
@param[out] event_already_exists When method is completed successfully
set to true if event already exists else
set to false
@retval FALSE success
@retval TRUE error
*/
bool
Event_db_repository::create_event(THD *thd, Event_parse_data *parse_data,
bool create_if_not,
bool *event_already_exists)

 

第3處:

/**
Adds an event to the queue.
Compute the next execution time for an event, and if it is still
active, add it to the queue. Otherwise delete it.
The object is left intact in case of an error. Otherwise
the queue container assumes ownership of it.
@param[in] thd thread handle
@param[in] new_element a new element to add to the queue
@param[out] created set to TRUE if no error and the element is
added to the queue, FALSE otherwise
@retval TRUE an error occured. The value of created is undefined,
the element was not deleted.
@retval FALSE success
*/
bool
Event_queue::create_event(THD *thd, Event_queue_element *new_element,
bool *created)

 

利用systemtap觀察如下兩個函數調用狀況

probe process("/usr/local/mysql/bin/mysqld").function("create_event") {
printf(">>>>>>create_event\n");
print_ubacktrace();
}
probe process("/usr/local/mysql/bin/mysqld").function("set_current_stmt_binlog_format_row") {
printf("+++++set_current_stmt_binlog_format_row\n");
print_ubacktrace();
}

運行stap -v event.stp,在另外一個mysql終端執行建立event語句

>>>>>>create_event
0x7d36b1 : _ZN6Events12create_eventEP3THDP16Event_parse_datab+0x21/0x340 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x727984 : _Z21mysql_execute_commandP3THD+0x5804/0x6ce0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x729178 : _Z11mysql_parseP3THDPcjP12Parser_state+0x318/0x420 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x72a4cb : _Z16dispatch_command19enum_server_commandP3THDPcj+0xbcb/0x28f0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x72c2c7 : _Z10do_commandP3THD+0xd7/0x1c0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x6f3ee6 : _Z24do_handle_one_connectionP3THD+0x116/0x1b0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x6f3fc5 : handle_one_connection+0x45/0x60 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x9b22f6 : pfs_spawn_thread+0x126/0x140 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x7ff7b42859d1 [/lib64/libpthread-2.12.so+0x79d1/0x219000]
>>>>>>create_event
0x8bae27 : _ZN19Event_db_repository12create_eventEP3THDP16Event_parse_databPb+0x17/0x2b0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x7d377f : _ZN6Events12create_eventEP3THDP16Event_parse_datab+0xef/0x340 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x727984 : _Z21mysql_execute_commandP3THD+0x5804/0x6ce0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x729178 : _Z11mysql_parseP3THDPcjP12Parser_state+0x318/0x420 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x72a4cb : _Z16dispatch_command19enum_server_commandP3THDPcj+0xbcb/0x28f0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x72c2c7 : _Z10do_commandP3THD+0xd7/0x1c0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x6f3ee6 : _Z24do_handle_one_connectionP3THD+0x116/0x1b0 [...local/mysql-advanced-5.6.23-linux-glibc2.5-x86_64/bin/mysqld]
0x6f3fc5 : handle_one_connection+0x45/0x60 [...local/mysql-

 

基本可知函數調用順序 Event::create_event -> Event_db_repository::create_event -> decide_logging_format -> set_current_stmt_binlog_format_row_if_mixed -> Event_queue::create_event

binlog.cc decide_logging_format()

if (lex->is_stmt_unsafe() || lex->is_stmt_row_injection()
|| (flags_write_all_set & HA_BINLOG_STMT_CAPABLE) == 0)
{
/* log in row format! */
set_current_stmt_binlog_format_row_if_mixed();
}

 

由於sysdate()是非安全函數,因此調用了set_current_stmt_binlog_format_row_if_mixed()。

問題應該出在set_current_stmt_binlog_format_row_if_mixed

[root@10-186-21-66 ~]# addr2line -e /usr/local/mysql/bin/mysqld 0x8e6556
/export/home/pb2/build/sb_0-14249461-1422537824.58/mysqlcom-pro-5.6.23/sql/sql_class.h:3669

 

sql_class.h

inline void set_current_stmt_binlog_format_row_if_mixed()
{
DBUG_ENTER("set_current_stmt_binlog_format_row_if_mixed");
/*
This should only be called from decide_logging_format.
@todo Once we have ensured this, uncomment the following
statement, remove the big comment below that, and remove the
in_sub_stmt==0 condition from the following 'if'.
*/
/* DBUG_ASSERT(in_sub_stmt == 0); */
/*
If in a stored/function trigger, the caller should already have done the
change. We test in_sub_stmt to prevent introducing bugs where people
wouldn't ensure that, and would switch to row-based mode in the middle
of executing a stored function/trigger (which is too late, see also
reset_current_stmt_binlog_format_row()); this condition will make their
tests fail and so force them to propagate the
lex->binlog_row_based_if_mixed upwards to the caller.
*/
if ((variables.binlog_format == BINLOG_FORMAT_MIXED) &&
(in_sub_stmt == 0))
set_current_stmt_binlog_format_row();
DBUG_VOID_RETURN;
}
省略部分代碼
inline void set_current_stmt_binlog_format_row()
{
DBUG_ENTER("set_current_stmt_binlog_format_row");
current_stmt_binlog_format= BINLOG_FORMAT_ROW;
DBUG_VOID_RETURN;
}

從上述代碼能夠看出,此函數判斷binlog格式是mixed時則調用set_current_stmt_binlog_format_row,

修復補丁中,在create_event 和update_event 函數中,在調用Event_db_repository::create_event前設置binlog格式爲statement,再執行Event_db_repository::create_event時就不會產生row事件了。

@@ -308,6 +308,7 @@ Events::create_event(THD *thd, Event_parse_data *parse_data,
{
bool ret;
bool save_binlog_row_based, event_already_exists;
+ ulong save_binlog_format= thd->variables.binlog_format;
DBUG_ENTER("Events::create_event");
if (check_if_system_tables_error())
@@ -341,10 +342,15 @@ Events::create_event(THD *thd, Event_parse_data *parse_data,
*/
if ((save_binlog_row_based= thd->is_current_stmt_binlog_format_row()))
thd->clear_current_stmt_binlog_format_row();
+ thd->variables.binlog_format= BINLOG_FORMAT_STMT;
+
if (lock_object_name(thd, MDL_key::EVENT,
parse_data->dbname.str, parse_data->name.str))
- DBUG_RETURN(TRUE);
+ {
+ ret= true;
+ goto err;
+ }
/* On error conditions my_error() is called so no need to handle here */
if (!(ret= db_repository->create_event(thd, parse_data, if_not_exists,
@@ -399,10 +405,12 @@ Events::create_event(THD *thd, Event_parse_data *parse_data,
}
}
}
+err:
/* Restore the state of binlog format */
DBUG_ASSERT(!thd->is_current_stmt_binlog_format_row());
if (save_binlog_row_based)
thd->set_current_stmt_binlog_format_row();
+ thd->variables.binlog_format= save_binlog_format;

完整補丁見:git log -p 112899f406d2f1f838a180669781a8973ef3e343

 

結論

1. 升級到5.6.29版本或改用row模式

2. 若是event 的dml語句中出現相似sysdate(),在mixed模式下並不會形成數據不一致,會自動轉成row模式

3. 在主從環境下建立event_scheduler,slave默認是禁用狀態SLAVESIDE_DISABLED。應考慮主從切換後,原主從的event_scheduler狀態,避免產生意外行爲,好比 new slave 寫入數據。

相關文章
相關標籤/搜索