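This article describes how, on a Linux system, a packet travels step by step from the network card (NIC) into the hands of a process.
For a far more detailed treatment, the two articles listed in the references at the end are strongly recommended.
This article only covers physical Ethernet NICs, not virtual devices, and uses the reception of a UDP packet as the running example.
The call chains shown here come from kernel 3.13.0. If your kernel is a different version, function names and paths may differ, but the underlying principles should be the same (or differ only in minor details).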
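A NIC needs a driver to work. The driver is a module loaded into the kernel that connects the NIC to the kernel's network module. When the driver is loaded it registers itself with the network module, and when the corresponding NIC receives a packet, the network module calls that driver to process the data.
The figure below shows how a packet enters memory and starts being processed by the kernel's network module: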
                   +-----+
                   |     |                            Memory
+--------+   1     |     |  2  DMA     +--------+--------+--------+--------+
| Packet |-------->| NIC |------------>| Packet | Packet | Packet | ...... |
+--------+         |     |             +--------+--------+--------+--------+
                   |     |<--------+
                   +-----+         |
                      |            +---------------+
                      |                            |
                    3 | Raise IRQ                  | Disable IRQ
                      |                          5 |
                      |                            |
                      ↓                            |
                   +-----+                   +------------+
                   |     |  Run IRQ handler  |            |
                   | CPU |------------------>| NIC Driver |
                   |     |       4           |            |
                   +-----+                   +------------+
                                                   |
                                                6  | Raise soft IRQ
                                                   |
                                                   ↓
The soft IRQ triggers the soft IRQ handler in the kernel's network module. The subsequent flow is as follows:
                                                     +-----+
                                             17      |     |
                                        +----------->| NIC |
                                        |            |     |
                                        |Enable IRQ  +-----+
                                        |
                                        |
                                  +------------+                                      Memory
                                  |            |        Read           +--------+--------+--------+--------+
                 +--------------->| NIC Driver |<--------------------- | Packet | Packet | Packet | ...... |
                 |                |            |          9            +--------+--------+--------+--------+
                 |                +------------+
                 |                      |    |        skb
            Poll | 8      Raise softIRQ | 6  +------------------+
                 |                      |             10        |
                 |                      ↓                       ↓
         +---------------+  Call  +-----------+        +------------------+        +--------------------+  12  +---------------------+
         | net_rx_action |<-------| ksoftirqd |        | napi_gro_receive |------->| enqueue_to_backlog |----->| CPU input_pkt_queue |
         +---------------+   7    +-----------+        +------------------+   11   +--------------------+      +---------------------+
                                                                |                                                         | 13
                                                             14 |       + - - - - - - - - - - - - - - - - - - - - - - - - +
                                                                ↓       ↓
                                                    +--------------------------+    15      +------------------------+
                                                    | __netif_receive_skb_core |----------->| packet taps(AF_PACKET) |
                                                    +--------------------------+            +------------------------+
                                                                |
                                                                | 16
                                                                ↓
                                                       +-----------------+
                                                       | protocol layers |
                                                       +-----------------+
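The handshake between the hardware IRQ and the polling loop (steps 3 to 10 and 17 in the two figures) can be hard to picture from the diagrams alone. The following is a toy model in Python, not driver code: the class `ToyNic`, its attribute names, and the budget of 4 are all made up for illustration, and the rule "re-enable the IRQ once the ring is empty" is a simplification of what real drivers do.

```python
from collections import deque


class ToyNic:
    """Toy model of the IRQ/poll handshake. Hypothetical, not real driver code."""

    def __init__(self, budget=4):
        self.ring = deque()          # packets already DMA'd into memory
        self.irq_enabled = True      # may the NIC raise a hardware IRQ?
        self.poll_scheduled = False  # has a soft IRQ been raised for this NIC?
        self.budget = budget         # max packets per poll call (illustrative value)
        self.hard_irqs = 0           # how many hardware IRQs were raised
        self.delivered = []          # packets handed on to the upper layers

    def packet_arrives(self, packet):
        self.ring.append(packet)         # steps 1-2: packet is DMA'd into memory
        if self.irq_enabled:             # step 3: raise an IRQ only if allowed
            self.hard_irqs += 1
            self.irq_enabled = False     # step 5: the handler disables the IRQ
            self.poll_scheduled = True   # step 6: the handler raises a soft IRQ

    def poll(self):
        """Steps 8-10: handle up to `budget` packets; step 17: re-enable the IRQ."""
        if not self.poll_scheduled:
            return 0
        handled = 0
        while self.ring and handled < self.budget:
            self.delivered.append(self.ring.popleft())
            handled += 1
        if not self.ring:                # nothing left: stop polling, re-enable IRQ
            self.poll_scheduled = False
            self.irq_enabled = True
        return handled
```

In this model a burst of packets costs a single hardware IRQ; the remaining packets are picked up by polling until the ring is empty, which is the point of disabling the IRQ in step 5.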
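enqueue_to_backlog is also called by netif_rx, and netif_rx is the function called when the lo device sends a packet.
Because this is a UDP packet, it first enters the IP layer and is then passed down through the following chain of function calls: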
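One way to observe the per-CPU backlog from user space is /proc/net/softnet_stat, which prints one row of hexadecimal counters per CPU. The sketch below assumes the usual layout in which the first three columns are the packets processed, the packets dropped because the backlog queue was full, and the number of times net_rx_action stopped with work still pending; the function names are made up for illustration, and the remaining columns are ignored.

```python
from typing import Dict, List


def parse_softnet_stat(text: str) -> List[Dict[str, int]]:
    """Parse the contents of /proc/net/softnet_stat (one hex row per CPU)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or unexpected lines
        processed, dropped, time_squeeze = (int(f, 16) for f in fields[:3])
        rows.append({
            "row": len(rows),              # row index, in the order the kernel prints them
            "processed": processed,        # packets processed on this CPU
            "dropped": dropped,            # packets dropped because the backlog was full
            "time_squeeze": time_squeeze,  # polls that ended with work still pending
        })
    return rows


def read_softnet_stat(path: str = "/proc/net/softnet_stat") -> List[Dict[str, int]]:
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_softnet_stat(f.read())
```

On a Linux machine, calling `read_softnet_stat()` twice a few seconds apart and comparing the `dropped` values shows whether the backlog is overflowing.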
          |
          |
          ↓         promiscuous mode &&
      +--------+    PACKET_OTHERHOST (set by driver)   +------------------+
      | ip_rcv |-------------------------------------->| drop this packet |
      +--------+                                       +------------------+
          |
          |
          ↓
+---------------------+
| NF_INET_PRE_ROUTING |
+---------------------+
          |
          |
          ↓
      +---------+
      |         | IP forwarding enabled +------------+        +-----------------+
      | routing |---------------------->| ip_forward |------->| NF_INET_FORWARD |
      |         |                       +------------+        +-----------------+
      +---------+                                                      |
          |                                                            |
          | destination IP is local                                    ↓
          ↓                                                    +---------------+
 +------------------+                                          | dst_output_sk |
 | ip_local_deliver |                                          +---------------+
 +------------------+
          |
          |
          ↓
 +------------------+
 | NF_INET_LOCAL_IN |
 +------------------+
          |
          |
          ↓
    +-----------+
    | UDP layer |
    +-----------+
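The branches in the IP-layer figure can be written down as a small decision function. This is only a model of the diagram, not kernel code: the function name and parameters are invented for illustration, the local-delivery check is assumed to come before the forwarding check, and the fact that a netfilter hook may itself drop the packet is not modeled.

```python
from typing import List

PACKET_HOST = "PACKET_HOST"            # frame addressed to this host
PACKET_OTHERHOST = "PACKET_OTHERHOST"  # frame addressed to another host


def ipv4_receive_path(pkt_type: str, dst_is_local: bool,
                      forwarding_enabled: bool) -> List[str]:
    """Return the stages a packet passes through, following the figure."""
    path = ["ip_rcv"]
    if pkt_type == PACKET_OTHERHOST:
        # Seen only because the NIC is in promiscuous mode: dropped in ip_rcv.
        return path + ["drop"]
    path += ["NF_INET_PRE_ROUTING", "routing"]
    if dst_is_local:
        return path + ["ip_local_deliver", "NF_INET_LOCAL_IN", "UDP layer"]
    if forwarding_enabled:
        return path + ["ip_forward", "NF_INET_FORWARD", "dst_output_sk"]
    # Not for this host and forwarding is off: assumed to be discarded.
    return path + ["drop"]
```

For the UDP example in this article, `ipv4_receive_path(PACKET_HOST, True, False)` walks through both the PRE_ROUTING and LOCAL_IN hooks before reaching the UDP layer.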
          |
          |
          ↓
      +---------+            +-----------------------+
      | udp_rcv |----------->| __udp4_lib_lookup_skb |
      +---------+            +-----------------------+
          |
          |
          ↓
 +--------------------+      +-----------+
 | sock_queue_rcv_skb |----->| sk_filter |
 +--------------------+      +-----------+
          |
          |
          ↓
 +------------------+
 | __skb_queue_tail |
 +------------------+
          |
          |
          ↓
  +---------------+
  | sk_data_ready |
  +---------------+
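Once sk_data_ready has been called, processing of the packet is complete and it waits for the application to read it. All of the functions above run in soft IRQ context.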
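A packet only waits for the application if there is room for it: when the socket's receive buffer is already full, the kernel discards the datagram instead of queueing it. The sketch below tries to make this visible on the loopback interface by shrinking SO_RCVBUF and sending without reading. The function name and the default sizes are arbitrary choices for illustration, and the exact number of datagrams that survive depends on the kernel and its buffer accounting, so no particular count is promised.

```python
import socket


def overflow_receive_queue(datagrams=500, payload_size=1024, rcvbuf=4096):
    """Send UDP datagrams to an unread socket; return (sent, received)."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Ask for a small receive buffer (the kernel may round the value up).
        receiver.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
        receiver.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a port
        address = receiver.getsockname()

        payload = b"x" * payload_size
        for _ in range(datagrams):
            sender.sendto(payload, address)  # nobody is reading yet

        # Drain whatever actually made it into the receive queue.
        receiver.setblocking(False)
        received = 0
        while True:
            try:
                receiver.recvfrom(payload_size)
            except BlockingIOError:
                break  # queue is empty
            received += 1
        return datagrams, received
    finally:
        receiver.close()
        sender.close()
```

On a typical Linux system `received` comes out far smaller than `sent`, even though every `sendto` call succeeded.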
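Applications generally receive data in one of two ways. In the first, a call to recvfrom blocks until data arrives; when the socket is notified, recvfrom is woken up and reads the data from the receive queue. In the second, the application monitors the socket with epoll or select and, once notified, calls recvfrom to read the data from the receive queue. Both approaches receive the packets correctly.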
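The two receive styles described above can be sketched with Python's standard socket and select modules. The helper names are invented for this example, the sockets are bound to the loopback address so the sketch is self-contained, and select is used for the second style; epoll would follow the same pattern.

```python
import select
import socket


def make_udp_pair():
    """Return (receiver, sender, receiver_address) on the loopback interface."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    return receiver, sender, receiver.getsockname()


def receive_blocking(receiver, bufsize=2048):
    """Style 1: recvfrom() sleeps until the socket has a datagram queued."""
    data, _peer = receiver.recvfrom(bufsize)
    return data


def receive_after_select(receiver, timeout=1.0, bufsize=2048):
    """Style 2: wait for readiness with select(), then call recvfrom()."""
    readable, _, _ = select.select([receiver], [], [], timeout)
    if not readable:
        return None  # nothing arrived within the timeout
    data, _peer = receiver.recvfrom(bufsize)
    return data
```

In both styles the data is read from the same socket receive queue; the difference is only in where the process sleeps while it waits for the notification.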
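Understanding the receive path shows where packets can be monitored and modified and in which situations they may be dropped, which is a useful reference when troubleshooting network problems. Knowing where the netfilter hooks sit also helps in understanding how iptables is used, and it lays the groundwork for understanding Linux virtual network devices.
The next few articles will cover Linux virtual network devices and iptables.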
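As one concrete monitoring point, Linux exposes UDP counters in /proc/net/snmp as a pair of "Udp:" lines, the first holding the counter names and the second the values. The following sketch parses that pair; the function names are hypothetical, and which counters appear (for example InDatagrams, NoPorts, InErrors, RcvbufErrors) depends on the kernel version.

```python
from typing import Dict


def parse_udp_counters(snmp_text: str) -> Dict[str, int]:
    """Extract the UDP counters from the contents of /proc/net/snmp."""
    # Keep only lines whose first token is exactly "Udp:" (this skips "UdpLite:").
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        return {}  # unexpected format
    names, values = udp_lines[0][1:], udp_lines[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def read_udp_counters(path: str = "/proc/net/snmp") -> Dict[str, int]:
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_udp_counters(f.read())
```

A growing RcvbufErrors value suggests that datagrams are being discarded because an application is not draining its socket receive queue fast enough, while NoPorts counts datagrams for which no listening socket was found.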
Monitoring and Tuning the Linux Networking Stack: Receiving Data
Illustrated Guide to Monitoring and Tuning the Linux Networking Stack: Receiving Data
NAPI