NIO Socket Implementation Explained in Simple Terms

Date: 2019-11-09


Preface

Java NIO is built from the following core components:

Buffer
Channel
Selector

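Previously, when doing socket programming with the java.net package, accept() blocked until a client request arrived and then returned a socket to be handled. The whole process was sequential: only after one request was finished could the next one be fetched and handled. Of course, accepting and handling can be separated, with one thread calling accept() and a thread pool handling the requests.

NIO offers a better solution: a Selector finds the sockets that are ready for reading or writing so they can be handled in turn, while Channels and Buffers transfer and hold the data.

Buffer and Channel were covered in an earlier article, "NIO Channel and Buffer Explained in Simple Terms". This article focuses on Selector and Socket in NIO: how to use them and how they are implemented.

What is a Selector?

On a chicken farm there is a worker whose daily job is to keep checking a few particular coops. Whenever a chicken enters, leaves, lays an egg, falls ill and so on, he records it. If the manager wants to know the state of the farm, he only has to ask this worker, provided of course that the manager has told him which events to record.

A Selector plays the role of this worker, and each coop corresponds to a SocketChannel. A single thread can manage many SocketChannels through one Selector.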

[Figure: a thread uses a Selector to handle 3 Channels]

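For a Selector to manage multiple SocketChannels, each SocketChannel object must be registered with the Selector, declaring the events to watch. There are currently four event types:

connect: the client has connected to the server; value SelectionKey.OP_CONNECT (8)
accept: the server is ready to accept a client connection; value SelectionKey.OP_ACCEPT (16)
read: read event; value SelectionKey.OP_READ (1)
write: write event; value SelectionKey.OP_WRITE (4)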
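
Because each of these values is a distinct bit, several events can be combined into one interest set with bitwise OR and tested with bitwise AND. The following is a minimal sketch of that idea; the class and method names (InterestOpsDemo, combine, contains) are invented for illustration and are not part of the JDK.

```java
import java.nio.channels.SelectionKey;

public class InterestOpsDemo {
    // Combine several event bits into a single interest set.
    public static int combine(int... ops) {
        int set = 0;
        for (int op : ops) {
            set |= op;
        }
        return set;
    }

    // True if the interest set contains the given event bit.
    public static boolean contains(int set, int op) {
        return (set & op) != 0;
    }

    public static void main(String[] args) {
        int set = combine(SelectionKey.OP_READ, SelectionKey.OP_WRITE);
        System.out.println("interest set = " + set);
        System.out.println("has OP_READ   = " + contains(set, SelectionKey.OP_READ));
        System.out.println("has OP_ACCEPT = " + contains(set, SelectionKey.OP_ACCEPT));
    }
}
```

The same OR/AND pattern is what key.interestOps(...) and key.isReadable() rely on internally.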

When a registered event occurs on a SocketChannel, the Selector notices it so that it can be handled.

To understand NIO sockets better, first look at a piece of server-side sample code:

ServerSocketChannel serverChannel = ServerSocketChannel.open();
serverChannel.configureBlocking(false);
serverChannel.socket().bind(new InetSocketAddress(port));
Selector selector = Selector.open();
serverChannel.register(selector, SelectionKey.OP_ACCEPT);
while(true){
    int n = selector.select();
    if (n == 0) continue;
    Iterator<SelectionKey> ite = selector.selectedKeys().iterator();
    while(ite.hasNext()){
        SelectionKey key = ite.next();
        if (key.isAcceptable()){
            SocketChannel clntChan = ((ServerSocketChannel) key.channel()).accept();
            clntChan.configureBlocking(false);
            // Register the selector with the accepted client channel,
            // set the key's interest to OP_READ,
            // and attach a buffer to the channel
            clntChan.register(key.selector(), SelectionKey.OP_READ, ByteBuffer.allocate(bufSize));
        }
        if (key.isReadable()){
            handleRead(key);
        }
        if (key.isValid() && key.isWritable()){
            handleWrite(key);
        }
        if (key.isConnectable()){
            System.out.println("isConnectable = true");
        }
      ite.remove();
    }
}

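Server-side flow
1. Create a ServerSocketChannel instance, serverSocketChannel (serverChannel in the code), and bind it to the given port.
2. Create a Selector instance, selector.
3. Register serverSocketChannel with selector for the OP_ACCEPT event.
4. Loop:
4.1. Call select(), which blocks until one or more channels are ready for I/O or the wait times out.
4.2. Get the selected-key set.
4.3. For each key in the set:
4.3.a. Get the channel and, if one was attached, the attachment from the key.
4.3.b. Determine which operations are ready and perform them; for an accept, put the accepted channel into non-blocking mode and register it with the selector.
4.3.c. If needed, modify the key's interest set.
4.3.d. Remove the key from the selected-key set.

In step 3, only the OP_ACCEPT event of serverSocketChannel is registered with the selector.

If client A connects, select() returns and A's socketChannel can be obtained from serverSocketChannel; its OP_READ event is then registered with the selector.
If client A sends data, a read event is triggered, so the next select() call lets the server read the data through that socketChannel. The server can then also register OP_WRITE for the same socketChannel in order to write data back to the client.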
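
The sample code calls handleRead and handleWrite without showing them. The sketch below is one possible way to write them, assuming a simple echo server in which the ByteBuffer attached at registration holds the bytes waiting to be written back. The class EchoSketch and the helper echoOnce (a small loopback demo that drives the handlers once) are hypothetical and only serve to make the example runnable.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class EchoSketch {
    // Read into the attached buffer; once data is buffered, also watch OP_WRITE.
    static void handleRead(SelectionKey key) throws IOException {
        SocketChannel ch = (SocketChannel) key.channel();
        ByteBuffer buf = (ByteBuffer) key.attachment();
        if (ch.read(buf) == -1) {
            ch.close();                       // peer closed; closing also cancels the key
        } else if (buf.position() > 0) {
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
        }
    }

    // Write buffered bytes back; stop watching OP_WRITE once the buffer is empty.
    static void handleWrite(SelectionKey key) throws IOException {
        SocketChannel ch = (SocketChannel) key.channel();
        ByteBuffer buf = (ByteBuffer) key.attachment();
        buf.flip();
        ch.write(buf);
        buf.compact();
        if (buf.position() == 0) {
            key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
        }
    }

    // Demo: loopback server plus one client; returns what the server echoed back.
    public static String echoOnce(String msg) {
        byte[] out = msg.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.configureBlocking(false);
                client.write(ByteBuffer.wrap(out));
                ByteBuffer in = ByteBuffer.allocate(out.length);
                // Bounded loop so the demo cannot hang if something goes wrong.
                for (int i = 0; i < 100 && in.hasRemaining(); i++) {
                    selector.select(100);
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isAcceptable()) {
                            SocketChannel peer = server.accept();
                            if (peer != null) {
                                peer.configureBlocking(false);
                                peer.register(selector, SelectionKey.OP_READ,
                                              ByteBuffer.allocate(256));
                            }
                        }
                        if (key.isValid() && key.isReadable()) handleRead(key);
                        if (key.isValid() && key.isWritable()) handleWrite(key);
                    }
                    client.read(in);          // non-blocking: 0 until the echo arrives
                }
                for (SelectionKey k : selector.keys()) {
                    k.channel().close();      // release accepted connections
                }
                return new String(in.array(), 0, in.position(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("ping"));
    }
}
```

OP_WRITE is only enabled while there is something to send. A socket is writable almost all the time, so leaving OP_WRITE permanently in the interest set would make select() return continuously.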

How NIO Socket is implemented

Instances of SocketChannel, ServerSocketChannel and Selector are all created through the SelectorProvider class; Selector is the core of the whole NIO socket implementation.

public static SelectorProvider provider() {
    synchronized (lock) {
        if (provider != null)
            return provider;
        return AccessController.doPrivileged(
            new PrivilegedAction<SelectorProvider>() {
                public SelectorProvider run() {
                        if (loadProviderFromProperty())
                            return provider;
                        if (loadProviderAsService())
                            return provider;
                        provider = sun.nio.ch.DefaultSelectorProvider.create();
                        return provider;
                    }
                });
    }
}

SelectorProvider has different implementations on Windows and Linux; the provider() method returns the one for the current platform.
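
To see which implementation a given JVM actually picks, the provider and the selector it opens can simply be asked for their class names. This is a small sketch; the class ProviderDemo and its method are illustrative names, and the printed class names depend on the platform and JDK version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.nio.channels.spi.SelectorProvider;

public class ProviderDemo {
    // Returns the class name of the Selector implementation on this platform.
    public static String selectorImplName() {
        SelectorProvider provider = SelectorProvider.provider();
        try (Selector selector = provider.openSelector()) {
            return selector.getClass().getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("provider: " + SelectorProvider.provider().getClass().getName());
        System.out.println("selector: " + selectorImplName());
    }
}
```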

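Selector analysis

How does a Selector manage many sockets at once?

When a Selector is initialized, it instantiates a PollArrayWrapper, a SelectionKeyImpl array and a Pipe (the Windows implementation is shown here).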

WindowsSelectorImpl(SelectorProvider sp) throws IOException {
    super(sp);
    pollWrapper = new PollArrayWrapper(INIT_CAP);
    wakeupPipe = Pipe.open();
    wakeupSourceFd = ((SelChImpl)wakeupPipe.source()).getFDVal();

    // Disable the Nagle algorithm so that the wakeup is more immediate
    SinkChannelImpl sink = (SinkChannelImpl)wakeupPipe.sink();
    (sink.sc).socket().setTcpNoDelay(true);
    wakeupSinkFd = ((SelChImpl)sink).getFDVal();
    pollWrapper.addWakeupSocket(wakeupSourceFd, 0);
}

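pollWrapper uses the Unsafe class to allocate a block of native (off-heap) memory that stores, for each registration, the socket handle fdVal and its events in a pollfd structure. Each pollfd occupies 8 bytes: bytes 0-3 hold the socket handle and bytes 4-7 hold the event field, which is written as a 16-bit value at offset 4.

[Figure: pollfd]

[Figure: pollWrapper]

pollWrapper provides the operations on the fdVal and event data; adding an entry, for example, goes through Unsafe's putInt and putShort.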

void putDescriptor(int i, int fd) {
    pollArray.putInt(SIZE_POLLFD * i + FD_OFFSET, fd);
}
void putEventOps(int i, int event) {
    pollArray.putShort(SIZE_POLLFD * i + EVENT_OFFSET, (short)event);
}
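
The same layout can be tried out without Unsafe. The sketch below uses a direct ByteBuffer as a stand-in for the native block and reuses the offsets described above (8 bytes per entry, handle at offset 0, events at offset 4). PollArraySketch is a hypothetical class for illustration, not the JDK's PollArrayWrapper.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PollArraySketch {
    static final int SIZE_POLLFD = 8;   // bytes per entry
    static final int FD_OFFSET = 0;     // 4-byte socket handle
    static final int EVENT_OFFSET = 4;  // 2-byte event mask

    // Off-heap memory, standing in for the block obtained through Unsafe.
    private final ByteBuffer pollArray;

    public PollArraySketch(int capacity) {
        pollArray = ByteBuffer.allocateDirect(capacity * SIZE_POLLFD)
                              .order(ByteOrder.nativeOrder());
    }

    public void putDescriptor(int i, int fd) {
        pollArray.putInt(SIZE_POLLFD * i + FD_OFFSET, fd);
    }

    public void putEventOps(int i, int event) {
        pollArray.putShort(SIZE_POLLFD * i + EVENT_OFFSET, (short) event);
    }

    public int getDescriptor(int i) {
        return pollArray.getInt(SIZE_POLLFD * i + FD_OFFSET);
    }

    public int getEventOps(int i) {
        return pollArray.getShort(SIZE_POLLFD * i + EVENT_OFFSET);
    }

    public static void main(String[] args) {
        PollArraySketch array = new PollArraySketch(4);
        array.putDescriptor(2, 1234);   // entry 2 starts at byte 16
        array.putEventOps(2, 5);
        System.out.println(array.getDescriptor(2) + " / " + array.getEventOps(2));
    }
}
```

Because every entry has a fixed size, the entry for index i is found by plain arithmetic, which is why SelectionKeyImpl only needs to remember its index.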

SelectionKeyImpl holds the channel, selector and events given at registration, plus index, the position of its entry in pollWrapper.

First, see how serverChannel.register(selector, SelectionKey.OP_ACCEPT) is implemented:

public final SelectionKey register(Selector sel, int ops, Object att)
    throws ClosedChannelException {
    synchronized (regLock) {
        SelectionKey k = findKey(sel);
        if (k != null) {
            k.interestOps(ops);
            k.attach(att);
        }
        if (k == null) {
            // New registration
            synchronized (keyLock) {
                if (!isOpen())
                    throw new ClosedChannelException();
                k = ((AbstractSelector)sel).register(this, ops, att);
                addKey(k);
            }
        }
        return k;
    }
}

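If the channel is already registered with this selector, the existing key is reused and its interest set and attachment are replaced with the new values.
Otherwise the registration is carried out by the selector.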
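
The reuse of an existing key can be observed from user code. In this sketch an unconnected SocketChannel is registered twice with the same selector; ReRegisterDemo is an illustrative name, and the attachments "first" and "second" are arbitrary markers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

public class ReRegisterDemo {
    // Registers one channel twice and reports: same key?, interest set, attachment.
    public static String run() {
        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);
            SelectionKey first =
                channel.register(selector, SelectionKey.OP_CONNECT, "first");
            SelectionKey second =
                channel.register(selector, SelectionKey.OP_READ, "second");
            return (first == second) + " " + second.interestOps() + " " + second.attachment();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The second call returns the key created by the first one, now carrying only OP_READ and the new attachment. To add an event while keeping the old ones, call key.interestOps(key.interestOps() | newOp) instead.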

protected final SelectionKey register(AbstractSelectableChannel ch,
      int ops,  Object attachment) {
    if (!(ch instanceof SelChImpl))
        throw new IllegalSelectorException();
    SelectionKeyImpl k = new SelectionKeyImpl((SelChImpl)ch, this);
    k.attach(attachment);
    synchronized (publicKeys) {
        implRegister(k);
    }
    k.interestOps(ops);
    return k;
}

protected void implRegister(SelectionKeyImpl ski) {
    synchronized (closeLock) {
        if (pollWrapper == null)
            throw new ClosedSelectorException();
        growIfNeeded();
        channelArray[totalChannels] = ski;
        ski.setIndex(totalChannels);
        fdMap.put(ski);
        keys.add(ski);
        pollWrapper.addEntry(totalChannels, ski);
        totalChannels++;
    }
}

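A SelectionKeyImpl object, selectionKeyImpl, is created from the current channel and selector, and the attachment is added to it.
If the current number of channels, totalChannels, equals the size of the SelectionKeyImpl array, both that array and pollWrapper are grown.
If totalChannels % MAX_SELECTABLE_FDS == 0, one more thread is started to handle the selector.
pollWrapper.addEntry adds the socket handle held by selectionKeyImpl to the corresponding pollfd.
k.interestOps(ops) ultimately also writes the events into the corresponding pollfd.

So whether it is a serverSocketChannel or a socketChannel, once its events are registered with the selector they all end up in pollArray.

Next, see how select manages to fetch many channels with pending events in a single call.
Underneath it is implemented by the doSelect method of the selector implementation class, as follows: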

 protected int doSelect(long timeout) throws IOException {
        if (channelArray == null)
            throw new ClosedSelectorException();
        this.timeout = timeout; // set selector timeout
        processDeregisterQueue();
        if (interruptTriggered) {
            resetWakeupSocket();
            return 0;
        }
        // Calculate number of helper threads needed for poll. If necessary
        // threads are created here and start waiting on startLock
        adjustThreadsCount();
        finishLock.reset(); // reset finishLock
        // Wakeup helper threads, waiting on startLock, so they start polling.
        // Redundant threads will exit here after wakeup.
        startLock.startThreads();
        // do polling in the main thread. Main thread is responsible for
        // first MAX_SELECTABLE_FDS entries in pollArray.
        try {
            begin();
            try {
                subSelector.poll();
            } catch (IOException e) {
                finishLock.setException(e); // Save this exception
            }
            // Main thread is out of poll(). Wakeup others and wait for them
            if (threads.size() > 0)
                finishLock.waitForHelperThreads();
          } finally {
              end();
          }
        // Done with poll(). Set wakeupSocket to nonsignaled  for the next run.
        finishLock.checkForException();
        processDeregisterQueue();
        int updated = updateSelectedKeys();
        // Done with poll(). Set wakeupSocket to nonsignaled  for the next run.
        resetWakeupSocket();
        return updated;
    }

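subSelector.poll() is the core of select and is implemented by the native function poll0. The readFds, writeFds and exceptFds arrays receive the results of the underlying select call: the first slot of each array holds the number of sockets with events, and the remaining slots hold the handles (fd) of those sockets.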

private final int[] readFds = new int [MAX_SELECTABLE_FDS + 1];
private final int[] writeFds = new int [MAX_SELECTABLE_FDS + 1];
private final int[] exceptFds = new int [MAX_SELECTABLE_FDS + 1];

private int poll() throws IOException { // poll for the main thread
    return poll0(pollWrapper.pollArrayAddress,
                 Math.min(totalChannels, MAX_SELECTABLE_FDS),
                 readFds, writeFds, exceptFds, timeout);
}

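When selector.select() runs, poll0 passes the memory address of the socket handles and events to the underlying function.

If no event has occurred yet, the program blocks in select. When a timeout was supplied, as in select(long timeout), it does not block forever: the underlying call (select on Windows, epoll on Linux) returns once the timeout elapses even if nothing happened. The no-argument select() waits until an event arrives, wakeup() is called, or the thread is interrupted.
As soon as a matching event occurs, poll0 returns.
processDeregisterQueue cleans up the SelectionKeys that have been cancelled.
updateSelectedKeys counts the SelectionKeys with pending events and adds those that qualify to the selectedKeys hash set, making them available for later use.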
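
The timeout behaviour is easy to check in isolation. The sketch below opens a selector with no registered channels and calls select(timeout), so the only way the call can return is by timing out; SelectTimeoutDemo is an illustrative name. Note that a timeout of 0 means "wait indefinitely", so the argument must be positive here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

public class SelectTimeoutDemo {
    // Blocks in select(timeoutMillis) on a selector that has nothing registered.
    // timeoutMillis must be > 0; a value of 0 would block indefinitely.
    public static int selectWithTimeout(long timeoutMillis) {
        try (Selector selector = Selector.open()) {
            return selector.select(timeoutMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int ready = selectWithTimeout(200);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("ready keys = " + ready + ", waited about " + elapsedMs + " ms");
    }
}
```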

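In early JDK releases, 1.4 and 1.5 before update 10, Selector was implemented on the select/poll model. This is non-blocking I/O based on I/O multiplexing, not asynchronous I/O. From JDK 1.5 update 10 on Linux kernel 2.6 and later, Sun optimized the Selector implementation and replaced select/poll with epoll underneath.

How epoll works

epoll is an I/O multiplexing mechanism on Linux that can very efficiently handle socket handles numbering in the millions.

First, the three epoll system calls as exposed in C: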

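int epoll_create(int size)
epoll_create creates an epoll object. The size argument was originally a hint telling the kernel how many handles the caller expected to monitor; since Linux 2.6.8 it is ignored but must still be greater than zero.
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event)
epoll_ctl operates on the epoll object created by epoll_create, for example adding a socket handle so that epoll monitors it, or removing a socket handle that epoll is currently monitoring.
int epoll_wait(int epfd, struct epoll_event *events,int maxevents, int timeout)
epoll_wait returns to the user-space process when an event occurs on one of the monitored handles within the given timeout.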

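A rough look at how epoll works internally:

When epoll is initialized it registers a file system with the kernel to hold the monitored handles; a call to epoll_create creates a file node in that file system. epoll also sets up its own kernel cache, keeping the handles in a red-black tree for fast lookup, insertion and deletion, and it creates a linked list that holds the events that are ready.
When epoll_ctl runs, besides placing the socket handle in the red-black tree belonging to the file object in the epoll file system, it registers a callback with the kernel's interrupt handling path, telling the kernel: when an interrupt arrives for this handle, put it on the ready list. So when data arrives on a socket, the kernel copies it from the network card into kernel memory and then inserts the socket into the ready list.
When epoll_wait is called, it only checks whether the ready list contains anything. If so it returns; otherwise it sleeps, and it returns once the timeout expires.

epoll has two working modes:

LT: level-triggered mode. As long as a socket is readable/writable, every call to epoll_wait returns that socket.
ET: edge-triggered mode. epoll_wait returns a socket only when it changes from unreadable to readable or from unwritable to writable.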
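
From Java, the difference between the two modes can be probed with a Pipe. The sketch below writes one byte, then selects twice without ever reading it. Under level-triggered behaviour the channel is reported as ready both times, which is what JDK selectors are generally understood to do; the sketch checks this on the running JVM rather than proving it. LevelTriggerDemo is an illustrative name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class LevelTriggerDemo {
    // Writes one byte into a pipe, then selects twice WITHOUT reading it.
    // Returns the number of ready keys reported by each select call.
    public static int[] selectTwiceWithoutReading() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);
                sink.write(ByteBuffer.wrap(new byte[] {1}));

                int first = selector.select(1000);
                // Forget the result but leave the byte unread in the pipe.
                selector.selectedKeys().clear();
                int second = selector.select(1000);
                return new int[] {first, second};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = selectTwiceWithoutReading();
        System.out.println("first select: " + counts[0] + ", second select: " + counts[1]);
    }
}
```

With edge-triggered semantics the second select would time out and report nothing, because the channel's state did not change between the two calls.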

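[Figure: reading data from a socket]

[Figure: writing data to a socket]

read implementation

The socketChannel objects with pending events are found by iterating over the selector's SelectionKeyImpl array; each of them holds its socket handle. The read implementation is as follows.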

public int read(ByteBuffer buf) throws IOException {
    if (buf == null)
        throw new NullPointerException();
    synchronized (readLock) {
        if (!ensureReadOpen())
            return -1;
        int n = 0;
        try {
            begin();
            synchronized (stateLock) {
                if (!isOpen()) {         
                    return 0;
                }
                readerThread = NativeThread.current();
            }
            for (;;) {
                n = IOUtil.read(fd, buf, -1, nd);
                if ((n == IOStatus.INTERRUPTED) && isOpen()) {
                    // The system call was interrupted but the channel
                    // is still open, so retry
                    continue;
                }
                return IOStatus.normalize(n);
            }
        } finally {
            readerCleanup();        // Clear reader thread
            // The end method, which 
            end(n > 0 || (n == IOStatus.UNAVAILABLE));

            // Extra case for socket channels: Asynchronous shutdown
            //
            synchronized (stateLock) {
                if ((n <= 0) && (!isInputOpen))
                    return IOStatus.EOF;
            }
            assert IOStatus.check(n);
        }
    }
}

The socket's data is thus read through a Buffer.
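
For callers, the value that read returns after normalization is what matters: a positive number is the count of bytes read, 0 means nothing is available right now on a non-blocking channel, and -1 means end of stream. The sketch below shows the three cases on a Pipe, which avoids needing a network connection; ReadStatusDemo is an illustrative name, and the sketch assumes a platform where data written to the pipe is visible to the reader immediately, as with Linux pipes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class ReadStatusDemo {
    // Performs three non-blocking reads: before any data was written,
    // after three bytes were written, and after the writing end was closed.
    public static int[] readThreeTimes() {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                ByteBuffer buf = ByteBuffer.allocate(16);

                int beforeData = source.read(buf);   // nothing available yet
                sink.write(ByteBuffer.wrap(new byte[] {'a', 'b', 'c'}));
                int withData = source.read(buf);     // bytes are available
                sink.close();
                int afterClose = source.read(buf);   // writer is gone

                return new int[] {beforeData, withData, afterClose};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = readThreeTimes();
        System.out.println(r[0] + ", " + r[1] + ", " + r[2]);
    }
}
```

A handler driven by a selector should therefore treat 0 as "try again after the next select" and -1 as the signal to close the channel.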

wakeup implementation

public Selector wakeup() {
    synchronized (interruptLock) {
        if (!interruptTriggered) {
            setWakeupSocket();
            interruptTriggered = true;
        }
    }
    return this;
}

// Sets Windows wakeup socket to a signaled state.
private void setWakeupSocket() {
   setWakeupSocket0(wakeupSinkFd);
}
private native void setWakeupSocket0(int wakeupSinkFd);
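
The effect of wakeup() can be seen without a second thread. According to the Selector documentation, if wakeup() is called while no selection is in progress, the next select call returns immediately. The sketch below relies on exactly that; WakeupDemo is an illustrative name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

public class WakeupDemo {
    // Calls wakeup() first, then the blocking select().
    // The pending wakeup makes select() return at once with no ready keys.
    public static int wakeupThenSelect() {
        try (Selector selector = Selector.open()) {
            selector.wakeup();
            // Without the wakeup above this call would block indefinitely,
            // because no channel is registered and no timeout is given.
            return selector.select();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("select returned " + wakeupThenSelect());
    }
}
```

In a real server, wakeup() is typically called from another thread, for example after queueing a new registration or a shutdown request that the selector thread should notice promptly.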