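1. Problem description
I wanted to run a background task from svn's post-commit hook, but found that the svn commit does not return until that background task has exited. By bash's definition of a background job, once the child has been forked the parent shell neither passes terminal input to it (that goes to the terminal's foreground job) nor waits synchronously for it to finish. When the parent shell exits, the kernel reparents the orphaned child to the system's ancestor process, process 1. svn is then only the parent of post-commit, not of the background task, so it has no way to wait() for that task. Whatever keeps the commit blocked must therefore be some other synchronization mechanism.
2. How the kernel handles process exit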
do_exit--->>exit_notify
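static void
forget_original_parent(struct task_struct *father, struct list_head *to_release)
{
    struct task_struct *p, *reaper = father;
    struct list_head *_p, *_n;

    /* Prefer a live sibling thread in the same thread group as the new parent. */
    do {
        reaper = next_thread(reaper);
        if (reaper == father) {
            /* Wrapped around: no other thread, fall back to the child reaper. */
            reaper = child_reaper(father);
            break;
        }
    } while (reaper->exit_state);
    /* ... remainder of the function omitted in this excerpt ... */
}

static inline struct task_struct *child_reaper(struct task_struct *tsk)
{
    return init_pid_ns.child_reaper;
}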
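The gist is this: if the exiting task is one thread of a multi-threaded process, its children are handed over to another thread in the same thread group, that is, to one of its sibling threads. That case does not apply to the tools we normally use here: each process has a single thread, so there is no sibling to take over. The kernel then falls back to step two, plan B: it reparents the children to the system's init process, process 1.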
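The reparenting described above can be observed from user space. The following is a minimal sketch, not part of the original investigation: the function name orphan_parent_pid is hypothetical, the intermediate process stands in for post-commit and the grandchild for the background task. On the kernel quoted above the new parent is process 1; on newer systems a "child subreaper" (for example a per-user service manager) may adopt the orphan instead, so the sketch only reports whatever parent the orphan ends up with.

```python
import os
import time


def orphan_parent_pid(timeout=5.0):
    """Fork a child that forks a grandchild and exits at once.

    Returns (child_pid, parent pid the grandchild sees once it is orphaned).
    """
    read_end, write_end = os.pipe()
    child = os.fork()
    if child == 0:
        # Intermediate process: plays the role of post-commit.
        os.close(read_end)
        intermediate = os.getpid()
        if os.fork() == 0:
            # Grandchild: plays the role of the background task.
            deadline = time.monotonic() + timeout
            while os.getppid() == intermediate and time.monotonic() < deadline:
                time.sleep(0.01)  # wait until the kernel has reparented us
            os.write(write_end, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)  # exit immediately, orphaning the grandchild
    os.close(write_end)
    os.waitpid(child, 0)  # reap the intermediate process
    reported = os.read(read_end, 64)
    os.close(read_end)
    return child, int(reported)


if __name__ == "__main__":
    child_pid, new_parent = orphan_parent_pid()
    print("intermediate pid:", child_pid, "orphan's new parent:", new_parent)
```

The grandchild polls getppid() because reparenting happens when the intermediate process exits, not when it is reaped.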
3. A quick verification
Add the following line to post-commit:
sleep 1234 &
That makes the task easy to find with ps:
[root@Harry hooks]# ps aux | grep sleep
daemon 19691 0.0 0.0 1892 400 pts/0 S 13:20 0:00 sleep 1234
root 22089 0.0 0.0 4220 696 pts/1 S+ 13:34 0:00 grep sleep
[root@Harry hooks]# cat /proc/19691/status
Name: sleep
State: S (sleeping)
Tgid: 19691
Pid: 19691
PPid: 1
TracerPid: 0
Uid: 2 2 2 2
Gid: 2 2 2 2
Utrace: 0
FDSize: 256
Groups: 1 2 4 7
As you can see, the parent of sleep has indeed become process 1, the system's ancestor process.
[root@Harry hooks]# ps aux | grep post-commit
daemon 19690 0.0 0.0 0 0 pts/0 Z 13:20 0:00 [post-commit] <defunct>
root 24274 0.0 0.0 4220 696 pts/1 S+ 13:47 0:00 grep post-commit
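Its original parent, post-commit, has already exited and is now a zombie, because Apache has not yet reaped it.
4. The state of Apache at this point
Again attach gdb to the process and print the backtrace of each thread: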
(gdb) bt
#0 0x00cad424 in __kernel_vsyscall ()
#1 0x002e78f6 in epoll_wait () from /lib/libc.so.6
#2 0x009b8bd9 in impl_pollset_poll (pollset=0x99bce70, timeout=100,
num=0xa71e6284, descriptors=0xa71e6288) at poll/unix/epoll.c:256
#3 0x009ba574 in apr_pollset_poll (pollset=0x99bce70, timeout=100000,
num=0xa71e6284, descriptors=0xa71e6288) at poll/unix/pollset.c:343
#4 0x080aca7c in listener_thread (thd=0x99bd590, dummy=0xb6c014c0)
at event.c:1392
#5 0x009bfe0d in dummy_worker (opaque=0x99bd590) at threadproc/unix/thread.c:142
#6 0x00391925 in start_thread () from /lib/libpthread.so.0
#7 0x002e707e in clone () from /lib/libc.so.6
(gdb) info thread
18 Thread 0xb11f6b70 (LWP 6221) 0x00cad424 in __kernel_vsyscall ()
* 2 Thread 0xa71e6b70 (LWP 6237) 0x00cad424 in __kernel_vsyscall ()
1 Thread 0xb77bd9c0 (LWP 6210) 0x00cad424 in __kernel_vsyscall ()
(gdb) thread 1
[Switching to thread 1 (Thread 0xb77bd9c0 (LWP 6210))]#0 0x00cad424 in __kernel_vsyscall ()
(gdb) bt
#0 0x00cad424 in __kernel_vsyscall ()
#1 0x002df451 in select () from /lib/libc.so.6
#2 0x009c0baf in apr_sleep (t=500000) at time/unix/time.c:246
#3 0x080ae2fb in join_workers (listener=0x99bd590, threads=0x9a954c0)
at event.c:1959
#4 0x080ae895 in child_main (child_num_arg=0) at event.c:2109
#5 0x080ae9b8 in make_child (s=0x997cea8, slot=0) at event.c:2169
#6 0x080aeb02 in startup_children (number_to_start=3) at event.c:2233
#7 0x080af475 in event_run (_pconf=0x99580a8, plog=0x99b8aa0, s=0x997cea8)
at event.c:2561
#8 0x0806ef8f in ap_run_mpm (pconf=0x99580a8, plog=0x99b8aa0, s=0x997cea8)
at mpm_common.c:98
#9 0x08068eef in main (argc=2, argv=0xbfab1d24) at main.c:777
(gdb) thread 18
[Switching to thread 18 (Thread 0xb11f6b70 (LWP 6221))]#0 0x00cad424 in __kernel_vsyscall ()
(gdb) bt
#0 0x00cad424 in __kernel_vsyscall ()
#1 0x00398eeb in read () from /lib/libpthread.so.0
#2 0x009ad41e in apr_file_read (thefile=0x9acff20, buf=0xb6d50020,
nbytes=0xb11f5e18) at file_io/unix/readwrite.c:116
#3 0x0043da48 in svn_io_file_read (file=0x9acff20, buf=0xb6d50020,
nbytes=0xb11f5e18, pool=0x9aca898) at subversion/libsvn_subr/io.c:3132
#4 0x0043b324 in stringbuf_from_aprfile (result=0xb11f5e9c, filename=0x0,
file=0x9acff20, check_size=1, pool=0x9aca898)
at subversion/libsvn_subr/io.c:2049
#5 0x0043b633 in svn_stringbuf_from_aprfile (result=0xb11f5e9c, file=0x9acff20,
pool=0x9aca898) at subversion/libsvn_subr/io.c:2106
#6 0x00bf72a9 in check_hook_result (name=0xc10126 "post-commit",
cmd=0x9acfd60 "/svnrepo/tsecer/hooks/post-commit", cmd_proc=0xb11f5f10,
read_errhandle=0x9acff20, pool=0x9aca898)
at subversion/libsvn_repos/hooks.c:71
#7 0x00bf77ef in run_hook_cmd (result=0x0, name=0xc10126 "post-commit",
cmd=0x9acfd60 "/svnrepo/tsecer/hooks/post-commit", args=0xb11f5f64,
stdin_handle=0x0, pool=0x9aca898) at subversion/libsvn_repos/hooks.c:211
#8 0x00bf81df in svn_repos__hooks_post_commit (repos=0xb6c032c8, rev=1,
pool=0x9aca898) at subversion/libsvn_repos/hooks.c:469
#9 0x00bf577f in svn_repos_fs_commit_txn (conflict_p=0xb11f6000,
repos=0xb6c032c8, new_rev=0xb11f5ffc, txn=0x9acc520, pool=0x9aca898)
at subversion/libsvn_repos/fs-wrap.c:64
#10 0x003c899e in merge (target=0x9aca338, source=0x9aa4948, no_auto_merge=1,
no_checkout=1, prop_elem=0x9aca0a0, output=0x9acb470)
at subversion/mod_dav_svn/version.c:1426
#11 0x00175f82 in dav_method_merge (r=0x9aca8d8) at mod_dav.c:4399
#12 0x00176a36 in dav_handler (r=0x9aca8d8) at mod_dav.c:4778
#13 0x0808aeda in ap_run_handler (r=0x9aca8d8) at config.c:169
#14 0x0808b5f2 in ap_invoke_handler (r=0x9aca8d8) at config.c:432
#15 0x080a2c68 in ap_process_async_request (r=0x9aca8d8) at http_request.c:317
#16 0x0809f3cf in ap_process_http_async_connection (c=0xb6c01c00)
at http_core.c:143
#17 0x0809f5a2 in ap_process_http_connection (c=0xb6c01c00) at http_core.c:228
#18 0x08095ee3 in ap_run_process_connection (c=0xb6c01c00) at connection.c:41
#19 0x080ab9ef in process_socket (thd=0x99bd390, p=0xb6c019c8, sock=0xb6c01a10,
cs=0xb6c01bb8, my_child_num=0, my_thread_num=9) at event.c:917
#20 0x080adb14 in worker_thread (thd=0x99bd390, dummy=0xb6c00a40) at event.c:1744
#21 0x009bfe0d in dummy_worker (opaque=0x99bd390) at threadproc/unix/thread.c:142
#22 0x00391925 in start_thread () from /lib/libpthread.so.0
#23 0x002e707e in clone () from /lib/libc.so.6
(gdb) frame 2
#2 0x009ad41e in apr_file_read (thefile=0x9acff20, buf=0xb6d50020,
nbytes=0xb11f5e18) at file_io/unix/readwrite.c:116
116 rv = read(thefile->filedes, buf, *nbytes);
(gdb) p *thefile
$2 = {pool = 0x9aca898, filedes = 12, fname = 0x0, flags = 0, eof_hit = 0,
is_pipe = 1, timeout = -1, buffered = 0, blocking = BLK_ON, ungetchar = -1,
buffer = 0x0, bufpos = 0, bufsize = 0, dataRead = 0, direction = 0,
filePtr = 0, thlock = 0x0}
(gdb) shell ls /proc/6210/fd/12
/proc/6210/fd/12
(gdb) shell ls /proc/6210/fd/12 -l
lr-x------. 1 root root 64 2012-10-14 13:34 /proc/6210/fd/12 -> pipe:[4017708]
(gdb)
[root@Harry hooks]# ll /proc/19691/fd
total 0
lr-x------. 1 daemon daemon 64 2012-10-14 13:35 0 -> /dev/null
l-wx------. 1 daemon daemon 64 2012-10-14 13:35 1 -> /dev/null
l-wx------. 1 daemon daemon 64 2012-10-14 13:20 2 -> pipe:[4017708]
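As shown, Apache, the parent process, is blocked reading the child's standard error until it reaches end-of-file: fd 12 in Apache and fd 2 of sleep refer to the same pipe, pipe:[4017708]. As long as the child's standard error is not closed, the parent keeps waiting.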
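The manual comparison of /proc/<pid>/fd listings can be automated. The following is a rough sketch that assumes a Linux /proc filesystem; pipe_holders is a hypothetical helper name. It scans every process it is allowed to inspect and reports which ones hold a descriptor on a given pipe inode.

```python
import os


def pipe_holders(inode):
    """Map pid -> sorted fds whose /proc link reads pipe:[inode] (Linux only)."""
    target = "pipe:[%d]" % inode
    holders = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = "/proc/%s/fd" % entry
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process vanished or belongs to another user
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    holders.setdefault(int(entry), []).append(int(fd))
            except OSError:
                continue  # descriptor was closed while we were scanning
    return {pid: sorted(fds) for pid, fds in holders.items()}


if __name__ == "__main__":
    # Self-check on a pipe we create ourselves; both ends show up under our pid.
    r, w = os.pipe()
    print(pipe_holders(os.fstat(r).st_ino))
```

For the session above, calling pipe_holders(4017708) as root would be expected to list the Apache process and the sleep process, since both links point at that inode.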
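5. File inheritance and close in the kernel
When a new process is forked, the child takes an extra reference on every file the parent has open.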
copy_process--->>>copy_files--->>>dup_fd
for (i = open_files; i != 0; i--) {
    struct file *f = *old_fds++;
    if (f) {
        get_file(f);
    }
    /* ... rest of the loop body omitted in this excerpt ... */
}
where
#define get_file(x) atomic_inc(&(x)->f_count)
simply increments the file's reference count.
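The effect of that extra reference is visible from user space. In the minimal sketch below (the function name is hypothetical), the parent closes its own copy of a pipe's write end right after fork, yet its read() does not see EOF until the forked child, which still holds the inherited copy, goes away.

```python
import os
import time


def eof_delay_with_child_holding(hold_seconds=0.5):
    """Return (data, seconds) for a read() on a pipe whose write end is
    still held by a forked child for roughly `hold_seconds`."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: inherited both ends, so each file gained one more reference.
        os.close(read_end)
        time.sleep(hold_seconds)
        os._exit(0)  # exiting drops the child's reference to the write end
    os.close(write_end)  # the parent's own reference to the write end is gone
    start = time.monotonic()
    data = os.read(read_end, 1)  # returns b"" only once no writer remains
    elapsed = time.monotonic() - start
    os.close(read_end)
    os.waitpid(pid, 0)
    return data, elapsed


if __name__ == "__main__":
    data, elapsed = eof_delay_with_child_holding()
    print("read returned %r after %.2f seconds" % (data, elapsed))
```

The parent writes nothing and closes its write end immediately, so the delay comes entirely from the child's inherited descriptor.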
When a process exits or closes a file, the operation performed is:
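void fastcall fput(struct file *file)
{
    if (atomic_dec_and_test(&file->f_count))
        __fput(file);
}
Because the sleep process still has the pipe open, the pipe is not released even after every other process has closed it, and even though sleep's parent is now init. This is why a background task started from svn's post-commit must also have its standard error redirected to some other file. Only once the write end of the pipe has been closed by every holder does the read system call return EOF.
From svn's point of view this is reasonable: when post-commit fails, svn needs the content of the error output so that it can report it to the svn client, so it reads from the child's standard error.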
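The difference the redirection makes can be reproduced without svn. The sketch below is an approximation under stated assumptions: run_hook is a hypothetical stand-in for a hook runner that, like the backtrace above, reads the hook's standard error through a pipe until EOF; it is not svn's actual code. The hook body is passed to sh as a string.

```python
import subprocess
import time


def run_hook(script):
    """Run `script` under sh roughly the way a hook runner might:
    stderr is a pipe that the caller reads to EOF before returning."""
    start = time.monotonic()
    proc = subprocess.Popen(
        ["sh", "-c", script],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
    )
    # Blocks until every copy of the pipe's write end has been closed.
    errors = proc.stderr.read()
    proc.wait()
    proc.stderr.close()
    return errors, time.monotonic() - start


if __name__ == "__main__":
    # Background task inherits the hook's stderr: the caller waits for it.
    print("inherited stderr:  %.1f s" % run_hook("sleep 2 &")[1])
    # Background task with its output redirected: the caller is not held up.
    print("redirected stderr: %.1f s" % run_hook("sleep 2 >/dev/null 2>&1 &")[1])
```

In a real post-commit script the equivalent change is to start the task as `sleep 1234 >/dev/null 2>&1 &`, or with standard error pointed at a log file, so that the task no longer holds the pipe the server is reading.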