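For work I needed to study database proxies. While studying how MySQL Proxy is implemented, I analyzed and summarized several of its features. This article explains how MySQL Proxy implements its daemon and keepalive features.
MySQL Proxy is one implementation of a database proxy. It sits between MySQL servers and MySQL clients and relays the communication between them. Because MySQL Proxy speaks the MySQL network protocol, it can be used without modification with any MySQL-compatible client that follows that protocol. In its most basic configuration, MySQL Proxy simply places itself between the server and the clients, passing queries from the clients to the server and returning the server's responses to the corresponding clients. In more advanced configurations, MySQL Proxy can monitor and alter the communication between clients and server. Query interception lets you add profiling commands as needed, and the injected commands can be scripted in Lua.
This article does not discuss the merits of MySQL Proxy as a database proxy, either in features or in practice. It focuses on two features in its source code: daemon and keepalive.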
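When MySQL Proxy is started from the command line, two options are used very often: --daemon and --keepalive. In short, --daemon starts the proxy in daemon mode, and --keepalive restarts the proxy if it crashes.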
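APUE (Advanced Programming in the UNIX Environment) defines a daemon as follows: daemons are processes that live for a long time. They are often started when the system is bootstrapped and terminate only when the system is shut down. Because they have no controlling terminal, we say that they run in the background.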
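First, the basic principles of implementing a daemon. There are some basic rules for coding a daemon, intended to prevent unwanted interactions (for example, with a terminal). As given in APUE, the rules are:
1. Call umask to set the file mode creation mask to a known value, usually 0.
2. Call fork and have the parent exit.
3. Call setsid to create a new session.
4. Change the current working directory to the root directory.
5. Close file descriptors that are not needed.
6. Some daemons open file descriptors 0, 1 and 2 to /dev/null so that library routines that read standard input or write standard output or standard error have no effect.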
With these rules in mind, compare them against the code in MySQL Proxy:
    /**
     * start the app in the background
     *
     * UNIX-version
     */
    void chassis_unix_daemonize(void) {
    #ifdef _WIN32
        g_assert_not_reached(); /* shouldn't be tried to be called on win32 */
    #else
    #ifdef SIGTTOU
        signal(SIGTTOU, SIG_IGN);
    #endif
    #ifdef SIGTTIN
        signal(SIGTTIN, SIG_IGN);
    #endif
    #ifdef SIGTSTP
        signal(SIGTSTP, SIG_IGN);
    #endif
        if (fork() != 0) exit(0);

        if (setsid() == -1) exit(0);

        signal(SIGHUP, SIG_IGN);

        if (fork() != 0) exit(0);

        chdir("/");

        umask(0);
    #endif
    }
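Comparing the code with these rules: it covers fork, setsid, chdir("/") and umask(0), additionally ignores the terminal job-control signals (SIGTTOU, SIGTTIN, SIGTSTP) and SIGHUP, and forks a second time after setsid so that the surviving process is not a session leader. It does not close inherited file descriptors or redirect descriptors 0, 1 and 2 to /dev/null.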
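The six daemon coding rules above say nothing about signal handling, so what is the SIGHUP handling for? Again, consult APUE:
If the terminal interface detects that a connection has been disconnected, this signal is sent to the controlling process (the session leader) associated with that terminal. The signal is generated for this condition only if the terminal's CLOCAL flag is not set.
Unlike the signals normally generated by the terminal (interrupt, quit and suspend), which are always delivered to the foreground process group, SIGHUP can be sent to a session leader that is running in the background. The default action for SIGHUP is to terminate the process. The signal is also commonly used to tell daemons to reread their configuration files; SIGHUP was chosen for that purpose because a daemon should have no controlling terminal and would normally never receive it.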
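From this passage it follows that SIGHUP is handled here because, between setsid() and the second fork(), the current child is still a session leader and could be terminated if it received SIGHUP, so the signal is set to SIG_IGN. The ignored disposition is also inherited by the second child across fork(). Stevens's UNIX Network Programming gives a related reason for the same call: the second child could otherwise be sent SIGHUP when the session leader (the first child) exits.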
At this point, a process running in daemon mode has been started.
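The effect of calling setsid() and then fork() a second time, as chassis_unix_daemonize does, can be observed directly. The sketch below uses a hypothetical helper, reports_session_leader, that runs the sequence in a throwaway child and reports through a pipe whether the final process is a session leader.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that calls setsid(). If second_fork is
 * nonzero the child forks again and exits, and the grandchild reports.
 * Returns 1 if the reporting process is a session leader, 0 if it is not,
 * and -1 on error. */
int reports_session_leader(int second_fork)
{
    int fds[2];
    char answer = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        if (setsid() == -1)
            _exit(1);
        if (second_fork) {
            pid_t pid2 = fork();
            if (pid2 == -1)
                _exit(1);
            if (pid2 != 0)
                _exit(0);           /* first child (the session leader) exits */
        }
        /* a session leader's session ID equals its own PID */
        answer = (getsid(0) == getpid()) ? 'L' : 'N';
        if (write(fds[1], &answer, 1) != 1)
            _exit(1);
        _exit(0);
    }
    close(fds[1]);
    n = read(fds[0], &answer, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n != 1)
        return -1;
    return answer == 'L';
}
```

With second_fork set to 0 the reporting process is the session leader; with 1 it is not, which is the property the second fork() is there to provide (a process that is not a session leader cannot acquire a controlling terminal by opening a terminal device).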
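Next, the keepalive feature. Put simply, MySQL Proxy's server model is one daemon parent process plus one worker child process (which can in turn start n worker threads). Keepalive means that when the daemon process detects that the worker child has terminated abnormally, it starts the child again.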
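Before reading the real code, the idea can be reduced to a few lines. The following is a minimal sketch, not MySQL Proxy's implementation: the names worker_fn, keepalive_run and flaky_worker are hypothetical, and unlike the real code the sketch caps the number of restarts and does not sleep between them.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical worker type: gets the attempt number, returns an exit code. */
typedef int (*worker_fn)(int attempt);

/* Minimal keepalive loop. Restarts the worker while it dies on a signal,
 * at most max_restarts times. Returns the worker's exit code when it exits
 * normally, or -1 on error or when the restarts are used up. *restarts
 * receives the number of restarts performed. */
int keepalive_run(worker_fn worker, int max_restarts, int *restarts)
{
    int attempt = 0;

    *restarts = 0;
    for (;;) {
        int status = 0;
        pid_t pid = fork();

        if (pid == -1)
            return -1;
        if (pid == 0)
            _exit(worker(attempt));          /* child: run the worker */

        /* parent: block until the child terminates */
        while (waitpid(pid, &status, 0) == -1) {
            if (errno != EINTR)
                return -1;
        }
        if (WIFEXITED(status))
            return WEXITSTATUS(status);      /* normal exit: stop supervising */
        if (!WIFSIGNALED(status) || *restarts >= max_restarts)
            return -1;
        (*restarts)++;                       /* died on a signal: start it again */
        attempt++;
    }
}

/* Example worker: killed by SIGKILL on the first attempt, exits with 7 later. */
int flaky_worker(int attempt)
{
    if (attempt == 0)
        raise(SIGKILL);
    return 7;
}
```

Calling keepalive_run(flaky_worker, 3, &restarts) restarts the worker once after the SIGKILL and then hands back the second run's exit code.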
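First, the code in the daemon process (called the "angel" in the source). It forks the worker child, forwards the signals it receives to the child, waits for the child, and starts a new child if the old one died on a signal: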
    /**
     * forward the signal to the process group, but not us
     */
    static void chassis_unix_signal_forward(int sig) {
    #ifdef _WIN32
        g_assert_not_reached(); /* shouldn't be tried to be called on win32 */
    #else
        signal(sig, SIG_IGN); /* we don't want to create a loop here */

        kill(0, sig);
    #endif
    }

    /**
     * keep the ourself alive
     *
     * if we or the child gets a SIGTERM, we quit too
     * on everything else we restart it
     */
    int chassis_unix_proc_keepalive(int *child_exit_status) {
    #ifdef _WIN32
        g_assert_not_reached(); /* shouldn't be tried to be called on win32 */
        return 0; /* for VC++, to silence a warning */
    #else
        int nprocs = 0;
        pid_t child_pid = -1;

        /* we ignore SIGINT and SIGTERM and just let it be forwarded to the child instead
         * as we want to collect its PID before we shutdown too
         *
         * the child will have to set its own signal handlers for this
         */

        for (;;) {
            /* try to start the children */
            while (nprocs < 1) {
                pid_t pid = fork();

                if (pid == 0) {
                    /* child */
                    g_debug("%s: we are the child: %d", G_STRLOC, getpid());
                    return 0;
                } else if (pid < 0) {
                    /* fork() failed */
                    g_critical("%s: fork() failed: %s (%d)", G_STRLOC, g_strerror(errno), errno);
                    return -1;
                } else {
                    /* we are the angel, let's see what the child did */
                    g_message("%s: [angel] we try to keep PID=%d alive", G_STRLOC, pid);

                    /* forward a few signals that are sent to us to the child instead */
                    signal(SIGINT, chassis_unix_signal_forward);
                    signal(SIGTERM, chassis_unix_signal_forward);
                    signal(SIGHUP, chassis_unix_signal_forward);
                    signal(SIGUSR1, chassis_unix_signal_forward);
                    signal(SIGUSR2, chassis_unix_signal_forward);

                    child_pid = pid;
                    nprocs++;
                }
            }

            if (child_pid != -1) {
                struct rusage rusage;
                int exit_status;
                pid_t exit_pid;

                g_debug("%s: waiting for %d", G_STRLOC, child_pid);
    #ifdef HAVE_WAIT4
                exit_pid = wait4(child_pid, &exit_status, 0, &rusage);
    #else
                memset(&rusage, 0, sizeof(rusage)); /* make sure everything is zero'ed out */
                exit_pid = waitpid(child_pid, &exit_status, 0);
    #endif
                g_debug("%s: %d returned: %d", G_STRLOC, child_pid, exit_pid);

                if (exit_pid == child_pid) {
                    /* our child returned, let's see how it went */
                    if (WIFEXITED(exit_status)) {
                        g_message("%s: [angel] PID=%d exited normally with exit-code = %d (it used %ld kBytes max)", G_STRLOC, child_pid, WEXITSTATUS(exit_status), rusage.ru_maxrss / 1024);
                        if (child_exit_status) *child_exit_status = WEXITSTATUS(exit_status);
                        return 1;
                    } else if (WIFSIGNALED(exit_status)) {
                        int time_towait = 2;
                        /* our child died on a signal
                         *
                         * log it and restart */
                        g_critical("%s: [angel] PID=%d died on signal=%d (it used %ld kBytes max) ... waiting 3min before restart", G_STRLOC, child_pid, WTERMSIG(exit_status), rusage.ru_maxrss / 1024);

                        /**
                         * to make sure we don't loop as fast as we can, sleep a bit between
                         * restarts
                         */
                        signal(SIGINT, SIG_DFL);
                        signal(SIGTERM, SIG_DFL);
                        signal(SIGHUP, SIG_DFL);
                        while (time_towait > 0) time_towait = sleep(time_towait);

                        nprocs--;
                        child_pid = -1;
                    } else if (WIFSTOPPED(exit_status)) {
                    } else {
                        g_assert_not_reached();
                    }
                } else if (-1 == exit_pid) {
                    /* EINTR is ok, all others bad */
                    if (EINTR != errno) {
                        /* how can this happen ? */
                        g_critical("%s: wait4(%d, ...) failed: %s (%d)", G_STRLOC, child_pid, g_strerror(errno), errno);
                        return -1;
                    }
                } else {
                    g_assert_not_reached();
                }
            }
        }
    #endif
    }

Next, the code in the worker child process. Its main job is the following:
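The worker uses libevent's interface to register handlers for three signals: SIGTERM, SIGINT and SIGHUP. Handling signals through libevent means that I/O events, timer events and signal events are all processed in the same event-driven way. Once the worker child detects SIGTERM or SIGINT, it sets the control variable signal_shutdown to 1, which ends the loop. SIGHUP only triggers log rotation.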
    void chassis_set_shutdown_location(const gchar* location) {
        if (signal_shutdown == 0) g_message("Initiating shutdown, requested from %s", (location != NULL ? location : "signal handler"));
        signal_shutdown = 1;
    }

    gboolean chassis_is_shutdown() {
        return signal_shutdown == 1;
    }

    static void sigterm_handler(int G_GNUC_UNUSED fd, short G_GNUC_UNUSED event_type, void G_GNUC_UNUSED *_data) {
        chassis_set_shutdown_location(NULL);
    }

    static void sighup_handler(int G_GNUC_UNUSED fd, short G_GNUC_UNUSED event_type, void *_data) {
        chassis *chas = _data;

        g_message("received a SIGHUP, closing log file"); /* this should go into the old logfile */

        chassis_log_set_logrotate(chas->log);

        g_message("re-opened log file after SIGHUP"); /* ... and this into the new one */
    }

    int chassis_mainloop(void *_chas) {
        chassis *chas = _chas;
        guint i;
        struct event ev_sigterm, ev_sigint;
    #ifdef SIGHUP
        struct event ev_sighup;
    #endif
        chassis_event_thread_t *mainloop_thread;

        /* redirect logging from libevent to glib */
        event_set_log_callback(event_log_use_glib);

        /* add a event-handler for the "main" events */
        mainloop_thread = chassis_event_thread_new();
        chassis_event_threads_init_thread(chas->threads, mainloop_thread, chas);
        chassis_event_threads_add(chas->threads, mainloop_thread);

        chas->event_base = mainloop_thread->event_base; /* all global events go to the 1st thread */

        g_assert(chas->event_base);

        /* setup all plugins all plugins */
        for (i = 0; i < chas->modules->len; i++) {
            chassis_plugin *p = chas->modules->pdata[i];

            g_assert(p->apply_config);
            if (0 != p->apply_config(chas, p->config)) {
                g_critical("%s: applying config of plugin %s failed", G_STRLOC, p->name);
                return -1;
            }
        }

        /*
         * drop root privileges if requested
         */
    #ifndef _WIN32
        if (chas->user) {
            struct passwd *user_info;
            uid_t user_id= geteuid();

            /* Don't bother if we aren't superuser */
            if (user_id) {
                g_critical("can only use the --user switch if running as root");
                return -1;
            }

            if (NULL == (user_info = getpwnam(chas->user))) {
                g_critical("unknown user: %s", chas->user);
                return -1;
            }

            if (chas->log->log_filename) {
                /* chown logfile */
                if (-1 == chown(chas->log->log_filename, user_info->pw_uid, user_info->pw_gid)) {
                    g_critical("%s.%d: chown(%s) failed: %s", __FILE__, __LINE__, chas->log->log_filename, g_strerror(errno) );
                    return -1;
                }
            }

            setgid(user_info->pw_gid);
            setuid(user_info->pw_uid);
            g_debug("now running as user: %s (%d/%d)", chas->user, user_info->pw_uid, user_info->pw_gid );
        }
    #endif

        signal_set(&ev_sigterm, SIGTERM, sigterm_handler, NULL);
        event_base_set(chas->event_base, &ev_sigterm);
        signal_add(&ev_sigterm, NULL);

        signal_set(&ev_sigint, SIGINT, sigterm_handler, NULL);
        event_base_set(chas->event_base, &ev_sigint);
        signal_add(&ev_sigint, NULL);

    #ifdef SIGHUP
        signal_set(&ev_sighup, SIGHUP, sighup_handler, chas);
        event_base_set(chas->event_base, &ev_sighup);
        if (signal_add(&ev_sighup, NULL)) {
            g_critical("%s: signal_add(SIGHUP) failed", G_STRLOC);
        }
    #endif

        if (chas->event_thread_count < 1) chas->event_thread_count = 1;

        /* create the event-threads
         *
         * - dup the async-queue-ping-fds
         * - setup the events notification
         * */
        for (i = 1; i < (guint)chas->event_thread_count; i++) { /* we already have 1 event-thread running, the main-thread */
            chassis_event_thread_t *event_thread;

            event_thread = chassis_event_thread_new();
            chassis_event_threads_init_thread(chas->threads, event_thread, chas);
            chassis_event_threads_add(chas->threads, event_thread);
        }

        /* start the event threads */
        if (chas->event_thread_count > 1) {
            chassis_event_threads_start(chas->threads);
        }

        /**
         * handle signals and all basic events into the main-thread
         *
         * block until we are asked to shutdown
         */
        chassis_event_thread_loop(mainloop_thread);

        signal_del(&ev_sigterm);
        signal_del(&ev_sigint);
    #ifdef SIGHUP
        signal_del(&ev_sighup);
    #endif
        return 0;
    }
With the source analysis done, here are some experiments to verify it.
1. Start mysql-proxy with keepalive enabled.
    [root@Betty data]# mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16766 16765 16765 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16766 16767 16765 16765 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
2. Send the INT signal to the daemon process.
    [root@Betty ~]# kill -INT 16766
3. The MySQL Proxy log shows:
    2013-03-19 19:31:38: (message) Initiating shutdown, requested from signal handler
    2013-03-19 19:31:39: (message) shutting down normally, exit code is: 0
    2013-03-19 19:31:39: (debug) chassis-unix-daemon.c:167: 16767 returned: 16767
    2013-03-19 19:31:39: (message) chassis-unix-daemon.c:176: [angel] PID=16767 exited normally with exit-code = 0 (it used 1 kBytes max)
    2013-03-19 19:31:39: (message) Initiating shutdown, requested from mysql-proxy-cli.c:606
    2013-03-19 19:31:39: (message) shutting down normally, exit code is: 0
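Both parent and child exit. The daemon's handler, chassis_unix_signal_forward, ignores the signal in the parent and re-sends it to the whole process group with kill(0, sig), so it reaches the child. The child's handler sets the global variable signal_shutdown to 1, which makes the child leave its event loop and exit. The parent, blocked in wait4()/waitpid(), sees the child exit normally with child_exit_status = 0 and then exits normally as well.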
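The forwarding step, signal(sig, SIG_IGN) followed by kill(0, sig), can be tried in isolation. The sketch below is an assumption-laden miniature, not the proxy's code: forwarded_signal_seen_by_worker and forward_to_group are hypothetical names, and the "angel" is moved into its own process group with setpgid() so that the forwarded signal stays inside the experiment.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same idea as chassis_unix_signal_forward: ignore, then signal our group. */
static void forward_to_group(int sig)
{
    signal(sig, SIG_IGN);   /* ignore it ourselves so the forward does not loop */
    kill(0, sig);           /* pid 0 means every process in our process group */
}

/* Hypothetical experiment: returns the signal number that terminated the
 * worker after `sig` was sent to the angel only, or -1 on error. */
int forwarded_signal_seen_by_worker(int sig)
{
    int ready[2], status = 0;
    char c;
    pid_t angel;

    if (pipe(ready) == -1)
        return -1;
    angel = fork();
    if (angel == -1)
        return -1;
    if (angel == 0) {
        pid_t worker;

        close(ready[0]);
        setpgid(0, 0);                      /* own group: contain the experiment */
        worker = fork();
        if (worker == -1)
            _exit(0);
        if (worker == 0) {                  /* worker keeps the default disposition */
            close(ready[1]);
            for (;;)
                pause();
        }
        signal(sig, forward_to_group);
        if (write(ready[1], "x", 1) != 1)   /* tell the caller we are ready */
            _exit(0);
        while (waitpid(worker, &status, 0) == -1 && errno == EINTR)
            ;
        _exit(WIFSIGNALED(status) ? WTERMSIG(status) : 0);
    }
    close(ready[1]);
    if (read(ready[0], &c, 1) == 1)
        kill(angel, sig);                   /* signal the angel, not the worker */
    close(ready[0]);
    if (waitpid(angel, &status, 0) != angel || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) != 0 ? WEXITSTATUS(status) : -1;
}
```

In this miniature the worker has no handler, so it dies on the forwarded signal; in MySQL Proxy the worker catches SIGINT and SIGTERM and shuts down cleanly instead.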
4. Repeat the steps above, but send the INT signal to the child process instead.
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16872 16871 16871 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16872 16873 16871 16871 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -INT 16873
The log is as follows; apart from timestamps and PIDs it is identical to the previous run.
    2013-03-19 20:03:49: (message) Initiating shutdown, requested from signal handler
    2013-03-19 20:03:50: (message) shutting down normally, exit code is: 0
    2013-03-19 20:03:50: (debug) chassis-unix-daemon.c:167: 16873 returned: 16873
    2013-03-19 20:03:50: (message) chassis-unix-daemon.c:176: [angel] PID=16873 exited normally with exit-code = 0 (it used 1 kBytes max)
    2013-03-19 20:03:50: (message) Initiating shutdown, requested from mysql-proxy-cli.c:606
    2013-03-19 20:03:50: (message) shutting down normally, exit code is: 0
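5. The same experiment (once for the child and once for the parent), but with -TERM. The results are identical to those above, because the code handles the two signals in exactly the same way.
6. The same experiment (once for the child and once for the parent), but with -HUP. The result: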
    2013-03-19 20:10:03: (message) received a SIGHUP, closing log file
    2013-03-19 20:10:03: (message) re-opened log file after SIGHUP
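These messages are printed by the child's HUP handler. The handler only sets the log's rotate_logs = true flag and does not set signal_shutdown = 1, so the child does not exit, and neither does the parent.
7. The same experiment with -KILL, sent to the child: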
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16902 16901 16901 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16902 16903 16901 16901 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -KILL 16903
The log output:
    2013-03-19 20:09:38: (debug) chassis-unix-daemon.c:121: we are the child: 16903
    2013-03-19 20:09:38: (critical) plugin proxy 0.8.3 started
    2013-03-19 20:09:38: (debug) max open file-descriptors = 1024
    2013-03-19 20:09:38: (message) proxy listening on port 172.16.40.60:4040
    2013-03-19 20:09:38: (message) added read/write backend: 172.16.40.60:12345
    2013-03-19 20:09:38: (message) chassis-unix-daemon.c:136: [angel] we try to keep PID=16903 alive
    2013-03-19 20:09:38: (debug) chassis-unix-daemon.c:157: waiting for 16903
    ...
    ...
    2013-03-19 20:31:36: (debug) chassis-unix-daemon.c:167: 16903 returned: 16903
    2013-03-19 20:31:36: (critical) chassis-unix-daemon.c:189: [angel] PID=16903 died on signal=9 (it used 1 kBytes max) ... waiting 3min before restart
    2013-03-19 20:31:38: (debug) chassis-unix-daemon.c:121: we are the child: 16947
    2013-03-19 20:31:38: (critical) plugin proxy 0.8.3 started
    2013-03-19 20:31:38: (debug) max open file-descriptors = 1024
    2013-03-19 20:31:38: (message) proxy listening on port 172.16.40.60:4040
    2013-03-19 20:31:38: (message) added read/write backend: 172.16.40.60:12345
    2013-03-19 20:31:38: (message) chassis-unix-daemon.c:136: [angel] we try to keep PID=16947 alive
    2013-03-19 20:31:38: (debug) chassis-unix-daemon.c:157: waiting for 16947
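Both the log and the code explain this. -KILL cannot be caught or ignored, so when it is sent to the child, the child is killed and its termination status is "died on signal=9"; the parent then restarts the child. Note that although the log message says "waiting 3min before restart", the code sets time_towait = 2, and the timestamps confirm that the new child starts two seconds later.
Check the process list again: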
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16902 16901 16901 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16902 16947 16901 16901 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
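If -KILL is sent to the parent instead, the parent is killed outright and the child is adopted by init. init knows nothing about keeping the child alive, so if -KILL is then sent to the child, the child is killed and is not restarted.
8. The same experiment with -STOP, sent first to the child and then to the parent (each followed by -CONT):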
    [root@Betty ~]# ps ajx
     PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
        1 16977 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -STOP 16978
        1 16977 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 T        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -CONT 16978
        1 16977 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -STOP 16977
        1 16977 16976 16976 ?           -1 T        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    [root@Betty ~]# kill -CONT 16977
        1 16977 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
    16977 16978 16976 16976 ?           -1 S        0   0:00 mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
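The reason for this result is that -STOP, like -KILL, cannot be caught or ignored, and its default action is to stop the process (visible as the T process state). On the parent's side, wait4()/waitpid() is called with options set to 0, that is, without WUNTRACED, so the call does not return when the child is merely stopped; the parent simply stays blocked until the child terminates. The empty WIFSTOPPED branch in the code would do nothing in any case, and the loop would just call the wait function again.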
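[Summary]
Daemon and keepalive features are problems that come up again and again in server development, and this article has shown one way of implementing them. Reading open-source code is a chance to meet some classic solutions to such problems, and understanding them in depth helps round out and reinforce what you already know. To close with a well-known saying: in front of the source code, there are no secrets. Have fun!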
====================================
Two more daemonize implementations for comparison (the first is taken from memcached-1.4.14):
    int daemonize(int nochdir, int noclose)
    {
        int fd;

        switch (fork()) {
        case -1:
            return (-1);
        case 0:
            break;
        default:
            _exit(EXIT_SUCCESS);
        }

        if (setsid() == -1)
            return (-1);

        if (nochdir == 0) {
            if(chdir("/") != 0) {
                perror("chdir");
                return (-1);
            }
        }

        if (noclose == 0 && (fd = open("/dev/null", O_RDWR, 0)) != -1) {
            if(dup2(fd, STDIN_FILENO) < 0) {
                perror("dup2 stdin");
                return (-1);
            }
            if(dup2(fd, STDOUT_FILENO) < 0) {
                perror("dup2 stdout");
                return (-1);
            }
            if(dup2(fd, STDERR_FILENO) < 0) {
                perror("dup2 stderr");
                return (-1);
            }

            if (fd > STDERR_FILENO) {
                if(close(fd) < 0) {
                    perror("close");
                    return (-1);
                }
            }
        }
        return (0);
    }

(The following code is taken from Twemproxy.)
    static rstatus_t
    nc_daemonize(int dump_core)
    {
        rstatus_t status;
        pid_t pid, sid;
        int fd;

        pid = fork();
        switch (pid) {
        case -1:
            log_error("fork() failed: %s", strerror(errno));
            return NC_ERROR;

        case 0:
            break;

        default:
            /* parent terminates */
            _exit(0);
        }

        /* 1st child continues and becomes the session leader */

        sid = setsid();
        if (sid < 0) {
            log_error("setsid() failed: %s", strerror(errno));
            return NC_ERROR;
        }

        if (signal(SIGHUP, SIG_IGN) == SIG_ERR) {
            log_error("signal(SIGHUP, SIG_IGN) failed: %s", strerror(errno));
            return NC_ERROR;
        }

        pid = fork();
        switch (pid) {
        case -1:
            log_error("fork() failed: %s", strerror(errno));
            return NC_ERROR;

        case 0:
            break;

        default:
            /* 1st child terminates */
            _exit(0);
        }

        /* 2nd child continues */

        /* change working directory */
        if (dump_core == 0) {
            status = chdir("/");
            if (status < 0) {
                log_error("chdir(\"/\") failed: %s", strerror(errno));
                return NC_ERROR;
            }
        }

        /* clear file mode creation mask */
        umask(0);

        /* redirect stdin, stdout and stderr to "/dev/null" */

        fd = open("/dev/null", O_RDWR);
        if (fd < 0) {
            log_error("open(\"/dev/null\") failed: %s", strerror(errno));
            return NC_ERROR;
        }

        status = dup2(fd, STDIN_FILENO);
        if (status < 0) {
            log_error("dup2(%d, STDIN) failed: %s", fd, strerror(errno));
            close(fd);
            return NC_ERROR;
        }

        status = dup2(fd, STDOUT_FILENO);
        if (status < 0) {
            log_error("dup2(%d, STDOUT) failed: %s", fd, strerror(errno));
            close(fd);
            return NC_ERROR;
        }

        status = dup2(fd, STDERR_FILENO);
        if (status < 0) {
            log_error("dup2(%d, STDERR) failed: %s", fd, strerror(errno));
            close(fd);
            return NC_ERROR;
        }

        if (fd > STDERR_FILENO) {
            status = close(fd);
            if (status < 0) {
                log_error("close(%d) failed: %s", fd, strerror(errno));
                return NC_ERROR;
            }
        }

        return NC_OK;
    }