First, the socket manager is a single global object, regardless of how many network interfaces exist. The global variable is defined in /bin/named/include/named/globals.h:
EXTERN isc_socketmgr_t * ns_g_socketmgr INIT(NULL);
#0 isc__socketmgr_create2 (mctx=0x8742d0, managerp=0x8701f8, maxsocks=0) at socket.c:4143
#1 0x000000000041919e in create_managers () at ./main.c:604
#2 0x0000000000419727 in setup () at ./main.c:850
#3 0x0000000000419a2b in main (argc=4, argv=0x7fffffffe5c8) at ./main.c:1058
When multithreading is enabled, isc__socketmgr_create2 creates a pipe and a select/epoll thread; worker threads control the select/epoll thread through that pipe.
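The pipe-based control channel described above can be sketched roughly as follows. This is a minimal illustration, not bind9's code: `struct ctl_pipe`, `ctl_poke`, `ctl_wait` and the `POKE_*` constants are hypothetical names, and the message layout (a file descriptor plus a message code, written as two ints) is an assumption chosen to match the `fd` and `msg` arguments visible in the stack traces. select() is used here for portability.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical message codes for this sketch (not taken from bind9 headers). */
enum { POKE_SHUTDOWN = -1, POKE_READ = -3 };

/* Hypothetical control channel: the watcher reads pipe_fds[0],
 * worker threads write pipe_fds[1]. */
struct ctl_pipe {
    int pipe_fds[2];
};

/* Create the control pipe. Returns 0 on success, -1 on failure. */
static int ctl_pipe_init(struct ctl_pipe *ctl) {
    return pipe(ctl->pipe_fds);
}

/* Worker side: ask the watcher to act on `fd`. The message is far smaller
 * than PIPE_BUF, so POSIX guarantees the write is atomic and messages from
 * different threads cannot interleave. */
static int ctl_poke(struct ctl_pipe *ctl, int fd, int msg) {
    int buf[2] = { fd, msg };
    ssize_t n = write(ctl->pipe_fds[1], buf, sizeof(buf));
    return n == (ssize_t)sizeof(buf) ? 0 : -1;
}

/* Watcher side: wait up to timeout_ms for one message.
 * Returns 1 if a message was read, 0 on timeout, -1 on error. */
static int ctl_wait(struct ctl_pipe *ctl, int timeout_ms, int *fd, int *msg) {
    fd_set readset;
    struct timeval tv = {
        .tv_sec = timeout_ms / 1000,
        .tv_usec = (timeout_ms % 1000) * 1000,
    };
    int buf[2];

    FD_ZERO(&readset);
    FD_SET(ctl->pipe_fds[0], &readset);
    int r = select(ctl->pipe_fds[0] + 1, &readset, NULL, NULL, &tv);
    if (r <= 0)
        return r;
    if (read(ctl->pipe_fds[0], buf, sizeof(buf)) != (ssize_t)sizeof(buf))
        return -1;
    *fd = buf[0];
    *msg = buf[1];
    return 1;
}
```

In a real server the watcher would wait on the pipe and on all watched sockets in the same select/epoll call; the sketch waits on the pipe alone to keep the message path visible.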
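bind9 scans the network interfaces at startup and rescans them periodically while running. The scan interval is controlled by a configurable timer, so bind9 notices changes in the network environment promptly. bind9 creates two listening sockets for each network interface, and a control socket for the lo interface. A machine with a single physical NIC therefore creates 3 sockets at startup.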
UDP listening socket:
#0 isc__socket_create (manager0=0x7ffff7fa9010, pf=2, type=isc_sockettype_udp, socketp=0x7fffec7870c8) at socket.c:2580
#1 0x00000000004861cc in open_socket (mgr=0x7ffff7fa9010, local=0x7ffff7fbe290, options=1, sockp=0x7fffec7872f8) at dispatch.c:1797
#2 0x0000000000489b27 in get_udpsocket (mgr=0x7ffff7fae270, sockmgr=0x7ffff7fa9010, taskmgr=0x7ffff7fa5010, localaddr=0x7ffff7fbe290, maxrequests=<value optimized out>, attributes=44, dispp=0x7fffec787418) at dispatch.c:2792
#3 dispatch_createudp (mgr=0x7ffff7fae270, sockmgr=0x7ffff7fa9010, taskmgr=0x7ffff7fa5010, localaddr=0x7ffff7fbe290, maxrequests=<value optimized out>, attributes=44, dispp=0x7fffec787418) at dispatch.c:2860
#4 0x000000000048a042 in dns_dispatch_getudp (mgr=0x7ffff7fae270, sockmgr=0x7ffff7fa9010, taskmgr=0x7ffff7fa5010, localaddr=0x7ffff7fbe290, buffersize=<value optimized out>, maxbuffers=<value optimized out>, maxrequests=32768, buckets=8219, increment=8237, attributes=44, mask=30, dispp=0x7ffff7fbe340) at dispatch.c:2714
#5 0x000000000041520b in ns_interface_listenudp (ifp=0x7ffff7fbe250) at interfacemgr.c:261
#6 0x00000000004155e5 in ns_interface_setup (mgr=0x7ffff7fb6f70, addr=0x7fffec787700, name=0x7fffec787570 "eth0", ifpret=0x7fffec787878, accept_tcp=isc_boolean_true) at interfacemgr.c:365
#7 0x0000000000416a16 in do_scan (mgr=0x7ffff7fb6f70, ext_listen=0x0, verbose=isc_boolean_true) at interfacemgr.c:844
#8 0x0000000000416bf2 in ns_interfacemgr_scan0 (mgr=0x7ffff7fb6f70, ext_listen=0x0, verbose=isc_boolean_true) at interfacemgr.c:897
#9 0x0000000000416c92 in ns_interfacemgr_scan (mgr=0x7ffff7fb6f70, verbose=isc_boolean_true) at interfacemgr.c:923
#10 0x0000000000435107 in scan_interfaces (server=0x7ffff7fae010, verbose=isc_boolean_true) at server.c:3604
#11 0x0000000000437d60 in load_configuration (filename=0x7fffffffe850 "/var/named/named.conf", server=0x7ffff7fae010, first_time=isc_boolean_true) at server.c:4638
#12 0x0000000000439fdf in run_server (task=0x7ffff7fba010, event=0x0) at server.c:5268
#13 0x00000000005b3b15 in dispatch (manager=0x7ffff7fa5010) at task.c:1012
#14 0x00000000005b3da1 in run (uap=0x7ffff7fa5010) at task.c:1157
#15 0x0000003817a07a51 in start_thread () from /lib64/libpthread.so.0
#16 0x00000038176e896d in clone () from /lib64/libc.so.6
TCP socket:
#0 isc__socket_create (manager0=0x7ffff7fa9010, pf=2, type=isc_sockettype_tcp, socketp=0x7ffff7fbe348) at socket.c:2580
#1 0x0000000000415344 in ns_interface_accepttcp (ifp=0x7ffff7fbe250) at interfacemgr.c:297
#2 0x0000000000415600 in ns_interface_setup (mgr=0x7ffff7fb6f70, addr=0x7fffec787700, name=0x7fffec787570 "eth0", ifpret=0x7fffec787878, accept_tcp=isc_boolean_true) at interfacemgr.c:370
#3 0x0000000000416a16 in do_scan (mgr=0x7ffff7fb6f70, ext_listen=0x0, verbose=isc_boolean_true) at interfacemgr.c:844
#4 0x0000000000416bf2 in ns_interfacemgr_scan0 (mgr=0x7ffff7fb6f70, ext_listen=0x0, verbose=isc_boolean_true) at interfacemgr.c:897
#5 0x0000000000416c92 in ns_interfacemgr_scan (mgr=0x7ffff7fb6f70, verbose=isc_boolean_true) at interfacemgr.c:923
#6 0x0000000000435107 in scan_interfaces (server=0x7ffff7fae010, verbose=isc_boolean_true) at server.c:3604
#7 0x0000000000437d60 in load_configuration (filename=0x7fffffffe850 "/var/named/named.conf", server=0x7ffff7fae010, first_time=isc_boolean_true) at server.c:4638
#8 0x0000000000439fdf in run_server (task=0x7ffff7fba010, event=0x0) at server.c:5268
#9 0x00000000005b3b15 in dispatch (manager=0x7ffff7fa5010) at task.c:1012
#10 0x00000000005b3da1 in run (uap=0x7ffff7fa5010) at task.c:1157
#11 0x0000003817a07a51 in start_thread () from /lib64/libpthread.so.0
#12 0x00000038176e896d in clone () from /lib64/libc.so.6
rndc control socket:
#0 isc__socket_create (manager0=0x7ffff7fa9010, pf=2, type=isc_sockettype_tcp, socketp=0x7fffea8431a8) at socket.c:2580
#1 0x0000000000413a62 in add_listener (cp=0x7ffff7faf038, listenerp=0x7fffec787aa0, control=0x7ffff7fcfb38, config=0x7ffff7fcf550, addr=0x7fffec787990, aclconfctx=0x7ffff7fa3070, socktext=0x7fffec787a40 "0.0.0.0#953", type=isc_sockettype_tcp) at controlconf.c:1145
#2 0x0000000000413fd2 in ns_controls_configure (cp=0x7ffff7faf038, config=0x7ffff7fcf550, aclconfctx=0x7ffff7fa3070) at controlconf.c:1281
#3 0x0000000000438916 in load_configuration (filename=0x7fffffffe850 "/var/named/named.conf", server=0x7ffff7fae010, first_time=isc_boolean_true) at server.c:4862
#4 0x0000000000439fdf in run_server (task=0x7ffff7fba010, event=0x0) at server.c:5268
#5 0x00000000005b3b15 in dispatch (manager=0x7ffff7fa5010) at task.c:1012
#6 0x00000000005b3da1 in run (uap=0x7ffff7fa5010) at task.c:1157
#7 0x0000003817a07a51 in start_thread () from /lib64/libpthread.so.0
#8 0x00000038176e896d in clone () from /lib64/libc.so.6
How a socket is stored in the socketmgr:
sock->manager->fds[sock->fd] = sock;
sock->manager->fdstate[sock->fd] = MANAGED;
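The two assignments above index per-manager arrays directly by file descriptor. A minimal sketch of such an fd-indexed table might look like the following. All names here (`struct sockmgr_sketch`, `mgr_register`, `mgr_lookup`, the `FD_STATE_*` values) are hypothetical and only mirror the `fds[]` / `fdstate[]` idea; bind9's real structures carry more fields and are protected by locks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical slot states; FD_STATE_MANAGED plays the role of MANAGED above. */
enum fd_state { FD_STATE_CLOSED = 0, FD_STATE_MANAGED = 1 };

struct managed_sock {
    int fd;
};

struct sockmgr_sketch {
    size_t maxsocks;               /* size of both arrays */
    struct managed_sock **fds;     /* fd -> socket object */
    enum fd_state *fdstate;        /* fd -> slot state */
};

/* Allocate zeroed tables, so every slot starts as FD_STATE_CLOSED. */
static int mgr_init(struct sockmgr_sketch *m, size_t maxsocks) {
    m->maxsocks = maxsocks;
    m->fds = calloc(maxsocks, sizeof(*m->fds));
    m->fdstate = calloc(maxsocks, sizeof(*m->fdstate));
    if (m->fds == NULL || m->fdstate == NULL) {
        free(m->fds);
        free(m->fdstate);
        return -1;
    }
    return 0;
}

/* Record a socket under its descriptor; rejects out-of-range or busy slots. */
static int mgr_register(struct sockmgr_sketch *m, struct managed_sock *s) {
    if (s->fd < 0 || (size_t)s->fd >= m->maxsocks)
        return -1;
    if (m->fdstate[s->fd] != FD_STATE_CLOSED)
        return -1;
    m->fds[s->fd] = s;
    m->fdstate[s->fd] = FD_STATE_MANAGED;
    return 0;
}

/* O(1) lookup from a descriptor reported by select/epoll to its socket. */
static struct managed_sock *mgr_lookup(const struct sockmgr_sketch *m, int fd) {
    if (fd < 0 || (size_t)fd >= m->maxsocks)
        return NULL;
    if (m->fdstate[fd] != FD_STATE_MANAGED)
        return NULL;
    return m->fds[fd];
}

static void mgr_destroy(struct sockmgr_sketch *m) {
    free(m->fds);
    free(m->fdstate);
}
```

Indexing by descriptor makes the lookup constant-time when the watcher is handed a ready fd, at the cost of sizing the arrays for the largest descriptor the manager will accept.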
Several key functions:
epoll can watch pipes as well as sockets. bind9 adds its control pipe to the watch list as soon as the pipe is created. The call stack is as follows:
#0 watch_fd (manager=0x7ffff7fa9010, fd=9, msg=-3) at socket.c:814
#1 0x00000000005c5947 in setup_watcher (mctx=0x8742d0, manager=0x7ffff7fa9010) at socket.c:3973
#2 0x00000000005c5f2a in isc__socketmgr_create2 (mctx=0x8742d0, managerp=0x8701f8, maxsocks=4096) at socket.c:4246
#3 0x000000000041919e in create_managers () at ./main.c:604
#4 0x0000000000419727 in setup () at ./main.c:850
#5 0x0000000000419a2b in main (argc=4, argv=0x7fffffffe5c8) at ./main.c:1058
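When a socket needs to be watched, a worker thread requests it by writing to the pipe above. Going through the pipe avoids the trouble of synchronizing the threads directly.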
#0 select_poke (mgr=0x7ffff7fa9010, fd=512, msg=-3) at socket.c:1026
#1 0x00000000005c68dd in socket_recv (sock=0x7ffff7fd6010, dev=0x7fffeaed8148, task=0x7ffff7fba9b0, flags=0) at socket.c:4486
#2 0x00000000005c6f61 in isc__socket_recv2 (sock0=0x7ffff7fd6010, region=0x7ffff0f90d10, minimum=1, task=0x7ffff7fba9b0, event=0x7fffeaed8148, flags=0) at socket.c:4619
#3 0x000000000040cf58 in client_udprecv (client=0x7fffe4004c40) at client.c:2359
#4 0x000000000040877a in client_start (task=0x7ffff7fba9b0, event=0x7fffe40050e8) at client.c:583
#5 0x00000000005b3b15 in dispatch (manager=0x7ffff7fa5010) at task.c:1012
#6 0x00000000005b3da1 in run (uap=0x7ffff7fa5010) at task.c:1157
#7 0x0000003817a07a51 in start_thread () from /lib64/libpthread.so.0
#8 0x00000038176e896d in clone () from /lib64/libc.so.6
The socket_recv function contains this code:
/*
 * Enqueue the request. If the socket was previously not being
 * watched, poke the watcher to start paying attention to it.
 */
if (ISC_LIST_EMPTY(sock->recv_list) && !sock->pending_recv)
	select_poke(sock->manager, sock->fd, SELECT_POKE_READ);
ISC_LIST_ENQUEUE(sock->recv_list, dev, ev_link);
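This converts the client's scheduling, which is per read event, into epoll's scheduling, which is per file descriptor (a single socket can have many pending read events).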
The internal_recv() function (covered later) contains the following code:
 poke:
	if (!ISC_LIST_EMPTY(sock->recv_list))
		select_poke(sock->manager, sock->fd, SELECT_POKE_READ);
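With these two pipe writes working together, the worker threads and the watcher thread stay in step even though no lock is used for this hand-off.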
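The interplay of the two poke sites can be illustrated with a small single-threaded model. It is only a sketch of the queueing rule, under stated assumptions: `struct rx_sock`, `queue_recv`, `complete_one` and `poke_watcher` are hypothetical stand-ins, the poke is reduced to a counter instead of a pipe write, and the locking a real multithreaded implementation would need around the queue is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One queued read request (hypothetical stand-in for a socket event). */
struct recv_req {
    struct recv_req *next;
};

struct rx_sock {
    struct recv_req *head, *tail;  /* FIFO of pending read requests */
    bool pending_recv;             /* a read event is already in flight */
    int pokes;                     /* how many wake-ups were sent */
};

/* Stand-in for writing (fd, READ) to the watcher's control pipe. */
static void poke_watcher(struct rx_sock *s) {
    s->pokes++;
}

/* Analogue of the socket_recv snippet: poke only if the socket was idle,
 * then append the request. Later requests ride on the first poke. */
static void queue_recv(struct rx_sock *s, struct recv_req *r) {
    if (s->head == NULL && !s->pending_recv)
        poke_watcher(s);
    r->next = NULL;
    if (s->tail != NULL)
        s->tail->next = r;
    else
        s->head = r;
    s->tail = r;
}

/* Analogue of the internal_recv snippet: finish the oldest request and
 * poke again only if more requests are still waiting. */
static struct recv_req *complete_one(struct rx_sock *s) {
    struct recv_req *r = s->head;

    s->pending_recv = false;
    if (r != NULL) {
        s->head = r->next;
        if (s->head == NULL)
            s->tail = NULL;
    }
    if (s->head != NULL)
        poke_watcher(s);
    return r;
}
```

Queueing two requests sends one poke; completing the first sends a second poke because one request remains; completing the last sends none. The watcher is therefore asked to watch the descriptor exactly as long as some read request is outstanding.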
After the pipe has been written, the watcher thread reads it. The call stack is as follows:
#0 watch_fd (manager=0x7ffff7fa9010, fd=512, msg=-3) at socket.c:795
#1 0x00000000005bf14b in wakeup_socket (manager=0x7ffff7fa9010, fd=512, msg=-3) at socket.c:996
#2 0x00000000005c554e in process_ctlfd (manager=0x7ffff7fa9010) at socket.c:3761
#3 0x00000000005c549f in process_fds (manager=0x7ffff7fa9010, events=0x7ffff7faa010, nevents=1) at socket.c:3665
#4 0x00000000005c5696 in watcher (uap=0x7ffff7fa9010) at socket.c:3872
#5 0x0000003817a07a51 in start_thread () from /lib64/libpthread.so.0
#6 0x00000038176e896d in clone () from /lib64/libc.so.6
select_readmsg is called before watch_fd to read the message from the pipe.
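The select/epoll watcher thread (the watcher function) is meant to be short and fast: once it detects that a socket is readable or writable, the actual read or write is not performed in this function. For that reason the socket must be removed from the epoll watch list promptly after epoll reports it (by calling unwatch_fd), and it is added back only when the actual receive function has finished or the pending read events have been exhausted. At the same time as it removes the socket from the list, the watcher sends a read event, and the task scheduling thread then notifies the actual read function to perform the read.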
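The remove-then-re-add pattern described above can be sketched with the Linux epoll API. This is an illustrative approximation, not bind9's watcher: `watch_fd_sketch`, `unwatch_fd_sketch` and `watcher_step` are hypothetical names, only readability is handled, and the step simply returns the ready descriptor where a real watcher would post a read event to a task.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Start watching fd for readability. Returns 0 on success. */
static int watch_fd_sketch(int epfd, int fd) {
    struct epoll_event ev = { .events = EPOLLIN, .data = { .fd = fd } };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Stop watching fd. A non-NULL event pointer is passed because very old
 * kernels required one even for EPOLL_CTL_DEL. */
static int unwatch_fd_sketch(int epfd, int fd) {
    struct epoll_event ev = { 0 };
    return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, &ev);
}

/* One watcher iteration: wait for a ready descriptor, remove it from the
 * watch list right away, and hand it back without reading from it.
 * Returns the ready fd, or -1 on timeout or error. */
static int watcher_step(int epfd, int timeout_ms) {
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);

    if (n <= 0)
        return -1;
    if (unwatch_fd_sketch(epfd, ev.data.fd) != 0)
        return -1;
    return ev.data.fd;  /* a worker would now do the actual read */
}
```

Because the descriptor is removed before the data is consumed, a second call to `watcher_step` does not report it again even though the data is still unread; the worker calls `watch_fd_sketch` again once it has finished reading. epoll's `EPOLLONESHOT` flag offers a related built-in behavior, though whether bind9 uses it is not stated in the text above.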
The relationship among the three functions socket_recv, dispatch_recv, and internal_recv: