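Okio: the IO framework that redefines "small but mighty"

Preface

I first came across this IO framework from Square through OkHttp, the networking library, whose network IO is done by Okio. That said, I had long been unhappy with Java's native IO system. I basically avoid writing the IO parts of a Java program because they are tedious and clunky to write. How much do I avoid them?

I remember that back in university, when I set out to write a compiler, I did the IO part that reads the source code in Python and then received the characters on the Java side.

That is how much I dislike Java's native IO system.

I had always wanted to wrap the Java IO API thoroughly myself and be done with the native IO interfaces, but for various reasons I never got around to it. After learning about Okio, I had even less motivation to wrap the native interfaces.

Today I'd like to take this chance to introduce this small but mighty IO framework and, along the way, discuss some questions about encapsulation. I hope that after this article you will be willing to drop the native IO interfaces and use this framework as an everyday development tool.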


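Native IO: not that simple

Before talking about Okio, let's first review Java's native IO system.

Below is the architecture of the input side of Java IO:

Note that the above is not the whole Java IO framework; it only lists some classes you may remember and leaves out many of their subclasses. One look at the diagram above tells you what complexity means. From it, we can complain about at least the following:

  • There are too many classes implementing the IO interfaces
  • Each class roughly corresponds to one IO need, which makes the whole system enormous

Of course, there are historical reasons for such a sprawling IO system in Java: it is the inevitable result of building and extending Java IO with the decorator pattern. So we need not be too harsh on it.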
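
To make the decorator point concrete, here is a minimal sketch of reading text line by line with the native classes. The class name NativeIoDemo and the sample text are made up for illustration, and a ByteArrayInputStream stands in for a FileInputStream so the sketch runs without a file on disk.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class NativeIoDemo {
    // Reads all lines from a byte stream using the usual three-layer decorator chain.
    static List<String> readLines(InputStream in) {
        List<String> lines = new ArrayList<>();
        // InputStream (bytes) -> InputStreamReader (chars) -> BufferedReader (lines)
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] bytes = "first\nsecond\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(bytes)));
    }
}
```

Each layer adds exactly one capability, which is why the number of classes grows so quickly.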


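Okio: it's just this simple

Having gone through the problems of Java's native IO interfaces, we can start talking about Okio. So what kind of framework is it, exactly?

As the saying goes, a definition in words only goes so far; to really get it, you need a diagram.

As the diagram above shows, Okio is in fact a wrapper around Java's native IO interfaces. A successful one.

So what does a read or write look like with Okio's help?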

// Write data
String fileName = "test.txt";
String path = Environment.getExternalStorageDirectory().getPath();
File file = null;
BufferedSink bufferSink = null;
try {
    file = new File(path, fileName);
    if (!file.exists()) {
        file.createNewFile();
    }
    bufferSink = Okio.buffer(Okio.sink(file));
    bufferSink.writeString("this is something important \n", Charset.forName("utf-8"));
    bufferSink.writeString("this is also something important \n", Charset.forName("utf-8"));
    bufferSink.close();
} catch (Exception e) {
    // Don't swallow the failure silently
    e.printStackTrace();
}


// Read data
try {
    BufferedSource bufferedSource = Okio.buffer(Okio.source(file));
    String str = bufferedSource.readByteString().string(Charset.forName("utf-8"));
    Log.e("TAG", "--->" + str);
    bufferedSource.close();
} catch (Exception e) {
    e.printStackTrace();
}

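The above is a file write and read I put together offhand. As you can see, the whole process is very simple. But that is not the point; the point is that the ways of writing and reading, and the data types involved, are all very flexible.

Yes, very flexible.

For example, reading data line by line is easy: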

// Read the data line by line
        try {
            BufferedSource bufferedSource=Okio.buffer(Okio.source(file));
            Log.e("TAG-string","--->"+bufferedSource.readUtf8Line());
            Log.e("TAG-string","--->"+bufferedSource.readUtf8Line());
            Log.e("TAG-string","--->"+bufferedSource.readUtf8Line());
            bufferedSource.close();
        } catch (Exception e) {
            e.printStackTrace();
        }

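As another example, you can read and write Java data types directly, and so on. It is fair to say that Okio elegantly covers the vast majority of Java IO needs without the tedium of native Java IO.


Okio in detail

The sample code above came with little explanation. Once you understand Okio properly, you will see what each line of it means.

OK, let's start from this diagram again.

As shown above, Sink and Source are the most basic interfaces in Okio, holding roughly the position that OutputStream and InputStream hold among the native interfaces.

Let's take the output-side Sink interface as an example: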

public interface Sink extends Closeable, Flushable {
  // Write data taken from a buffer
  void write(Buffer source, long byteCount) throws IOException;
  // Flush (the buffer)
  @Override void flush() throws IOException;
  // Timeout mechanism
  Timeout timeout();
  // Close the write operation
  @Override void close() throws IOException;
}

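This is the most basic interface for writes. You will have noticed Buffer and the flush() method, which suggests that writes probably revolve around a buffer. That is indeed the case, as we will see below.

One layer below Sink is the BufferedSink interface: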

public interface BufferedSink extends Sink {
  Buffer buffer();
  BufferedSink write(ByteString byteString) throws IOException;
  BufferedSink write(byte[] source) throws IOException;
  BufferedSink write(byte[] source, int offset, int byteCount) throws IOException;
  long writeAll(Source source) throws IOException;
  BufferedSink write(Source source, long byteCount) throws IOException;
  BufferedSink writeUtf8(String string) throws IOException;
  BufferedSink writeUtf8(String string, int beginIndex, int endIndex) throws IOException;
  BufferedSink writeUtf8CodePoint(int codePoint) throws IOException;
  BufferedSink writeString(String string, Charset charset) throws IOException;
  BufferedSink writeString(String string, int beginIndex, int endIndex, Charset charset)
      throws IOException;
  BufferedSink writeByte(int b) throws IOException;
  BufferedSink writeShort(int s) throws IOException;
  BufferedSink writeShortLe(int s) throws IOException;
  BufferedSink writeInt(int i) throws IOException;
  BufferedSink writeIntLe(int i) throws IOException;
  BufferedSink writeLong(long v) throws IOException;
  BufferedSink writeLongLe(long v) throws IOException;
  BufferedSink writeDecimalLong(long v) throws IOException;
  BufferedSink writeHexadecimalUnsignedLong(long v) throws IOException;
  BufferedSink emitCompleteSegments() throws IOException;
  BufferedSink emit() throws IOException;
  OutputStream outputStream();
}

The interface above is also quite clear: on top of the basic interface, it defines all sorts of ways to write.
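
To illustrate what pairs such as writeInt and writeIntLe amount to, here is a standalone sketch that encodes an int in big-endian and little-endian byte order. The class IntEncodingDemo and its two methods are hypothetical; they mirror the idea only and are not Okio code.

```java
import java.util.Arrays;

// Hypothetical demo class: shows the two byte orders behind "writeInt"-style methods.
final class IntEncodingDemo {
    // Big-endian: most significant byte first.
    static byte[] intToBytes(int i) {
        return new byte[] {
            (byte) (i >>> 24), (byte) (i >>> 16), (byte) (i >>> 8), (byte) i
        };
    }

    // Little-endian: least significant byte first (the "Le" variants).
    static byte[] intToBytesLe(int i) {
        return new byte[] {
            (byte) i, (byte) (i >>> 8), (byte) (i >>> 16), (byte) (i >>> 24)
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(intToBytes(0x01020304)));   // [1, 2, 3, 4]
        System.out.println(Arrays.toString(intToBytesLe(0x01020304))); // [4, 3, 2, 1]
    }
}
```

Once a value has been turned into bytes like this, writing it is just another byte-array write.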

The class that actually implements these interfaces is RealBufferedSink. Here is part of its code for illustration:

final class RealBufferedSink implements BufferedSink {
  // Create a buffer that holds the data waiting to be written.
  public final Buffer buffer = new Buffer();
  public final Sink sink;
  boolean closed;
  RealBufferedSink(Sink sink) {
    if (sink == null) throw new NullPointerException("sink == null");
    this.sink = sink;
  }

  @Override public Buffer buffer() {
    return buffer;
  }
  // Write ByteString data via the buffer
  @Override public BufferedSink write(ByteString byteString) throws IOException {
    if (closed) throw new IllegalStateException("closed");
    buffer.write(byteString);
    // Pass any completely filled segments on to the underlying sink
    return emitCompleteSegments();
  }

  // Write String data via the buffer
  @Override public BufferedSink writeString(String string, Charset charset) throws IOException {
    if (closed) throw new IllegalStateException("closed");
    buffer.writeString(string, charset);
    return emitCompleteSegments();
  }
...
...

  // Write data from a byte array via the buffer
  @Override public BufferedSink write(byte[] source, int offset, int byteCount) throws IOException {
    if (closed) throw new IllegalStateException("closed");
    buffer.write(source, offset, byteCount);
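    // Pass any completely filled segments on to the underlying sink
    return emitCompleteSegments();
  }
  // Hand the completely filled segments in the buffer to the underlying sink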
  @Override public BufferedSink emitCompleteSegments() throws IOException {
    if (closed) throw new IllegalStateException("closed");
    long byteCount = buffer.completeSegmentByteCount();
    if (byteCount > 0) sink.write(buffer, byteCount);
    return this;
  }
...
...
...

}

ByteString holds byte data internally. As a utility class, it can turn those bytes into a String, which can be the UTF-8 text, the Base64 encoding, the MD5 digest, and so on.
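
As a rough picture of those conversions, the sketch below produces the same three views of a byte array using only the JDK. BytesViewDemo and its method names are made up for this illustration; Okio's ByteString has its own API, which is not reproduced here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical demo class: three String views of the same bytes.
final class BytesViewDemo {
    // Decodes the bytes as UTF-8 text.
    static String utf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Encodes the bytes as standard Base64.
    static String base64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Hashes the bytes with MD5 and renders the digest as lowercase hex.
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8(data));
        System.out.println(base64(data));
        System.out.println(md5Hex(data));
    }
}
```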

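That is only part of the code, but you can see that the buffer variable shows up again and again and is deeply involved in writing the data. Let's take a look at it together, focusing on the methods touched on above: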

public final class Buffer implements BufferedSource, BufferedSink, Cloneable {
  private static final byte[] DIGITS =
      { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f' };
  static final int REPLACEMENT_CHARACTER = '\ufffd';

  Segment head;
  long size;

  public Buffer() {
  }

  /** Returns the number of bytes currently in this buffer. */
  public long size() {
    return size;
  }


  // Write String data
  @Override
  public Buffer writeString(String string, Charset charset) {
    // Delegate to the overload below
    return writeString(string, 0, string.length(), charset);
  }

  // Validate the arguments, then write the String data
  @Override
  public Buffer writeString(String string, int beginIndex, int endIndex, Charset charset) {
    if (string == null) throw new IllegalArgumentException("string == null");
    if (beginIndex < 0) throw new IllegalAccessError("beginIndex < 0: " + beginIndex);
    if (endIndex < beginIndex) {
      throw new IllegalArgumentException("endIndex < beginIndex: " + endIndex + " < " + beginIndex);
    }
    if (endIndex > string.length()) {
      throw new IllegalArgumentException(
          "endIndex > string.length: " + endIndex + " > " + string.length());
    }
    if (charset == null) throw new IllegalArgumentException("charset == null");
    // If the charset is UTF-8, call writeUtf8()
    if (charset.equals(Util.UTF_8)) return writeUtf8(string, beginIndex, endIndex);
    // Otherwise, convert the String into a byte array
    byte[] data = string.substring(beginIndex, endIndex).getBytes(charset);
    // Then call write() to write the byte array
    return write(data, 0, data.length);
  }


  // offset: index in the array where the data to write starts
  // byteCount: number of bytes to write
  @Override
  public Buffer write(byte[] source, int offset, int byteCount) {
    if (source == null) throw new IllegalArgumentException("source == null");
    // Do some sanity checks
    checkOffsetAndCount(source.length, offset, byteCount);

    int limit = offset + byteCount;
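    // Write the data in a loop
    while (offset < limit) {
      // Segment?? What on earth is that??
      // For now, think of a Segment as an array-like container
      // This call gets such a data container
      Segment tail = writableSegment(1);
      // limit - offset is the number of bytes still waiting to be written
      // Segment.SIZE - tail.limit is the space left in this container
      int toCopy = Math.min(limit - offset, Segment.SIZE - tail.limit);
      // Copy the data into the container with System.arraycopy()
      System.arraycopy(source, offset, tail.data, tail.limit, toCopy);
      // Update the offsets
      offset += toCopy;
      tail.limit += toCopy;
    }
    // Increase the buffer's size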
    size += byteCount;
    return this;
  }

  // Get a writable Segment
  Segment writableSegment(int minimumCapacity) {
    if (minimumCapacity < 1 || minimumCapacity > Segment.SIZE) throw new IllegalArgumentException();
    if (head == null) {
      // If the buffer has no Segment yet, take one from the Segment pool
      head = SegmentPool.take(); // Acquire a first segment.
      return head.next = head.prev = head;
    }
    // Get the Segment before head, which is the tail of the circular list
    // So this is a linked list, no doubt about it
    Segment tail = head.prev;
    // Check whether this Segment has room left to write into
    if (tail.limit + minimumCapacity > Segment.SIZE || !tail.owner) {
      // If not, take a new Segment and append it after this one (the next node in the list)
      tail = tail.push(SegmentPool.take()); // Append a new empty segment to fill up.
    }
    return tail;
  }

}

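OK, we have now more or less uncovered the most important thing hidden inside Okio: its data buffering mechanism, which mainly consists of Buffer, Segment, and SegmentPool.

The latter two are used mainly inside the Buffer class: data is written through Buffer into containers called Segments.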
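
The write loop shown above can be reduced to a small standalone sketch. TinySegmentedBuffer and Seg are hypothetical names, the segment size is shrunk to 4 bytes so the effect is easy to see, and an ArrayList stands in for the linked list; only the copy loop follows the code quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a buffer that stores bytes in fixed-size segments.
final class TinySegmentedBuffer {
    static final int SEGMENT_SIZE = 4; // tiny on purpose; the Segment.SIZE quoted above is 8192

    static final class Seg {
        final byte[] data = new byte[SEGMENT_SIZE];
        int limit; // index of the next free byte in data
    }

    final List<Seg> segments = new ArrayList<>();
    long size;

    // Returns the last segment if it still has room, otherwise appends a fresh one.
    private Seg writableSegment() {
        if (segments.isEmpty() || segments.get(segments.size() - 1).limit == SEGMENT_SIZE) {
            segments.add(new Seg());
        }
        return segments.get(segments.size() - 1);
    }

    // Copies byteCount bytes from source, spilling into new segments as needed.
    TinySegmentedBuffer write(byte[] source, int offset, int byteCount) {
        int limit = offset + byteCount;
        while (offset < limit) {
            Seg tail = writableSegment();
            int toCopy = Math.min(limit - offset, SEGMENT_SIZE - tail.limit);
            System.arraycopy(source, offset, tail.data, tail.limit, toCopy);
            offset += toCopy;
            tail.limit += toCopy;
        }
        size += byteCount;
        return this;
    }

    public static void main(String[] args) {
        TinySegmentedBuffer buffer = new TinySegmentedBuffer();
        buffer.write(new byte[10], 0, 10);
        // 10 bytes at 4 bytes per segment occupy 3 segments (4 + 4 + 2).
        System.out.println(buffer.segments.size() + " segments, " + buffer.size + " bytes");
    }
}
```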

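As for SegmentPool, its reason for existing is simple: it keeps data containers that are temporarily unused so as to avoid frequent GC. Basically every kind of pool serves this purpose: keeping already allocated resources from being reclaimed, increasing reuse, improving efficiency, reducing GC, and avoiding memory churn.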
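
The pooling idea on its own fits in a few lines. The sketch below is a generic bounded pool of byte arrays, assuming a made-up class ByteArrayPool and made-up limits; it is not how Okio implements its pool, only the same take-and-recycle pattern.

```java
import java.util.ArrayDeque;

// Hypothetical sketch of a bounded object pool for fixed-size byte arrays.
final class ByteArrayPool {
    static final int ARRAY_SIZE = 8192;
    static final int MAX_POOLED = 8; // assumption: cap the pool at 8 arrays (64 KiB in total)

    private final ArrayDeque<byte[]> idle = new ArrayDeque<>();

    // Hands out a pooled array if one is idle, otherwise allocates a new one.
    synchronized byte[] take() {
        byte[] pooled = idle.poll();
        return pooled != null ? pooled : new byte[ARRAY_SIZE];
    }

    // Returns an array to the pool; when the pool is full the array is left to the GC.
    synchronized void recycle(byte[] array) {
        if (array.length != ARRAY_SIZE) throw new IllegalArgumentException("unexpected size");
        if (idle.size() < MAX_POOLED) idle.push(array);
    }

    synchronized int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        ByteArrayPool pool = new ByteArrayPool();
        byte[] first = pool.take();
        pool.recycle(first);
        // The recycled array is reused instead of allocating a new one.
        System.out.println(pool.take() == first);
    }
}
```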

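As for Segment, we already know it is a data container and a linked-list node. From its two references, prev and next, we can infer that it forms a doubly linked list. For readers who are a little weaker on data structures, I drew it out:

That is roughly what it looks like. Now let's dig into the details of the Segment code: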

final class Segment {
  /** The size of all segments in bytes. */
  static final int SIZE = 8192;

  /** Segments will be shared when doing so avoids {@code arraycopy()} of this many bytes. */
  static final int SHARE_MINIMUM = 1024;

  // The array that holds this segment's data
  final byte[] data;

  /** The next byte of application data byte to read in this segment. */
  int pos;

  /** The first byte of available data ready to be written to. */
  int limit;

  /** True if other segments or byte strings use the same byte array. */
  boolean shared;

  /** True if this segment owns the byte array and can append to it, extending {@code limit}. */
  boolean owner;

  /** Next segment in a linked or circularly-linked list. */
  Segment next;

  /** Previous segment in a circularly-linked list. */
  Segment prev;

  Segment() {
    this.data = new byte[SIZE];
    this.owner = true;
    this.shared = false;
  }

  Segment(Segment shareFrom) {
    this(shareFrom.data, shareFrom.pos, shareFrom.limit);
    shareFrom.shared = true;
  }

  // Create a Segment that shares an existing byte array
  Segment(byte[] data, int pos, int limit) {
    this.data = data;
    this.pos = pos;
    this.limit = limit;
    this.owner = false;
    this.shared = true;
  }

  // Remove this segment from the linked list
  /**
   * Removes this segment of a circularly-linked list and returns its successor.
   * Returns null if the list is now empty.
   */
  public Segment pop() {
    Segment result = next != this ? next : null;
    prev.next = next;
    next.prev = prev;
    next = null;
    prev = null;
    return result;
  }

  // Add a segment to the linked list
  /**
   * Appends {@code segment} after this segment in the circularly-linked list.
   * Returns the pushed segment.
   */
  public Segment push(Segment segment) {
    segment.prev = this;
    segment.next = next;
    next.prev = segment;
    next = segment;
    return segment;
  }


  // The methods below are mainly used for storage optimizations inside Segment
  /**
   * Splits this head of a circularly-linked list into two segments. The first
   * segment contains the data in {@code [pos..pos+byteCount)}. The second
   * segment contains the data in {@code [pos+byteCount..limit)}. This can be
   * useful when moving partial segments from one buffer to another.
   *
   * <p>Returns the new head of the circularly-linked list.
   */
  public Segment split(int byteCount) {
    ...
    ...
    ...
  }

  /**
   * Call this when the tail and its predecessor may both be less than half
   * full. This will copy data so that segments can be recycled.
   */
  public void compact() {
    ...
    ...
    ...
  }

  /** Moves {@code byteCount} bytes from this segment to {@code sink}. */
  public void writeTo(Segment sink, int byteCount) {
   ...
   ...
   ...
  }
}

On the whole, Segment's structure is quite simple. We may as well look at SegmentPool while we are here, since it is simple too:

final class SegmentPool {
  /** The maximum number of bytes to pool. */
  // TODO: Is 64 KiB a good maximum size? Do we ever have that many idle segments?
  static final long MAX_SIZE = 64 * 1024; // 64 KiB.

  /** Singly-linked list of segments. */
  static Segment next;

  /** Total bytes in this pool. */
  static long byteCount;

  private SegmentPool() {
  }
  // Take an idle Segment
  static Segment take() {
    synchronized (SegmentPool.class) {
      if (next != null) {
        Segment result = next;
        next = result.next;
        result.next = null;
        byteCount -= Segment.SIZE;
        return result;
      }
    }
    return new Segment(); // Pool is empty. Don't zero-fill while holding a lock.
  }
  // Recycle an idle Segment
  static void recycle(Segment segment) {
    if (segment.next != null || segment.prev != null) throw new IllegalArgumentException();
    if (segment.shared) return; // This segment cannot be recycled.
    synchronized (SegmentPool.class) {
      if (byteCount + Segment.SIZE > MAX_SIZE) return; // Pool is full.
      byteCount += Segment.SIZE;
      segment.next = next;
      segment.pos = segment.limit = 0;
      next = segment;
    }
  }
}

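SegmentPool is a pool built on a singly linked list. Why not a doubly linked list, you ask? Because there is no need: all idle objects in the Segment pool are interchangeable, and all that matters is being able to get one each time, so a doubly linked structure is unnecessary.

So why are the Segments inside a Buffer linked in both directions? Because with a doubly linked list, copying and moving data, as well as Segment's internal optimizations, are all convenient and efficient.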
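
A small sketch may help show why the circular doubly linked layout is convenient: the tail is always head.prev, so appending at the tail and removing the head are both constant-time pointer updates. RingNode is a hypothetical class whose push and pop follow the Segment methods quoted above, with the data array replaced by a label.

```java
// Hypothetical sketch of a circular doubly linked list node.
final class RingNode {
    final String label;
    RingNode next;
    RingNode prev;

    RingNode(String label) {
        this.label = label;
        // A single node forms a ring with itself.
        this.next = this;
        this.prev = this;
    }

    // Inserts node right after this one and returns it.
    RingNode push(RingNode node) {
        node.prev = this;
        node.next = next;
        next.prev = node;
        next = node;
        return node;
    }

    // Unlinks this node and returns its successor, or null if it was the only node.
    RingNode pop() {
        RingNode result = next != this ? next : null;
        prev.next = next;
        next.prev = prev;
        next = null;
        prev = null;
        return result;
    }

    // Walks the ring once, starting at head, and joins the labels with spaces.
    static String describe(RingNode head) {
        if (head == null) return "";
        StringBuilder sb = new StringBuilder(head.label);
        for (RingNode n = head.next; n != head; n = n.next) {
            sb.append(' ').append(n.label);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        RingNode head = new RingNode("a");
        head.prev.push(new RingNode("b")); // head.prev is the tail, so this appends
        head.prev.push(new RingNode("c"));
        System.out.println(describe(head)); // a b c
        head = head.pop();                  // drop the old head
        System.out.println(describe(head)); // b c
    }
}
```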

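OK, let's take stock. In the RealBufferedSink implementation, data is written in various forms into its Buffer, and Buffer manages that buffered data through Segment and SegmentPool. So far the data has not really been written to the file; it only sits in the buffer.

So when is the data actually written to the file?

For small writes like those in our example, the answer is the close() method (emitCompleteSegments(), shown earlier, already hands any completely filled segments to the underlying sink along the way). Let's look at RealBufferedSink's close() method: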

  // After each round of writing we call close(), which eventually ends up here
  @Override public void close() throws IOException {
    // If already closed, return immediately
    if (closed) return;

    // Emit buffered data to the underlying sink. If this fails, we still need
    // to close the sink; otherwise we risk leaking resources.
    Throwable thrown = null;
    try {
      // If the buffer holds any data, write it all out in one go
      if (buffer.size > 0) {
        sink.write(buffer, buffer.size);
      }
    } catch (Throwable e) {
      thrown = e;
    }

    try {
      sink.close();
    } catch (Throwable e) {
      if (thrown == null) thrown = e;
    }
    closed = true;

    if (thrown != null) Util.sneakyRethrow(thrown);
  }

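The call sink.write(buffer, buffer.size); is what actually writes the data to the file. But this sink is only an interface, so where is its implementation?

Let's look back at the sample code for writing data from the beginning: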

String fileName = "test.txt";
String path = Environment.getExternalStorageDirectory().getPath();
File file = null;
BufferedSink bufferSink = null;
try {
    file = new File(path, fileName);
    if (!file.exists()) {
        file.createNewFile();
    }
    // This is the key step: Okio.sink(file) creates the Sink implementation
    bufferSink = Okio.buffer(Okio.sink(file));

    bufferSink.writeString("this is something important \n", Charset.forName("utf-8"));
    bufferSink.writeString("this is also something important \n", Charset.forName("utf-8"));
    bufferSink.close();
} catch (Exception e) {
    // Don't swallow the failure silently
    e.printStackTrace();
}

Now let's go into the Okio class and look at this sink(file) method:

  // Delegates to the overload below
  /** Returns a sink that writes to {@code file}. */
  public static Sink sink(File file) throws FileNotFoundException {
    if (file == null) throw new IllegalArgumentException("file == null");
    // Build an output stream
    return sink(new FileOutputStream(file));
  }

  // Delegates to the overload below
    /** Returns a sink that writes to {@code out}. */
  public static Sink sink(OutputStream out) {
    return sink(out, new Timeout());
  }

  // This is where the Sink implementation is created
   private static Sink sink(final OutputStream out, final Timeout timeout) {
    if (out == null) throw new IllegalArgumentException("out == null");
    if (timeout == null) throw new IllegalArgumentException("timeout == null");

    return new Sink() {
      @Override public void write(Buffer source, long byteCount) throws IOException {
        checkOffsetAndCount(source.size, 0, byteCount);
        while (byteCount > 0) {
          timeout.throwIfReached();
          Segment head = source.head;
          int toCopy = (int) Math.min(byteCount, head.limit - head.pos);
          // In the end it is still the native Java API that actually writes the data
          out.write(head.data, head.pos, toCopy);

          head.pos += toCopy;
          byteCount -= toCopy;
          source.size -= toCopy;

          if (head.pos == head.limit) {
            source.head = head.pop();
            SegmentPool.recycle(head);
          }
        }
      }

      @Override public void flush() throws IOException {
        out.flush();
      }

      @Override public void close() throws IOException {
        out.close();
      }

      @Override public Timeout timeout() {
        return timeout;
      }

      @Override public String toString() {
        return "sink(" + out + ")";
      }
    };
  }

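OK, that more or less covers the whole story of writing data.

Reading works analogously: data is first read into the buffer and then read from the buffer, so there is no big difference.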
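
To give the read side a concrete shape, here is a rough sketch of "fill the buffer first, then read from the buffer". TinyBufferedSource, its chunk size, and its readUtf8Line method are all made up for this illustration; it only splits on '\n' and is not Okio's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of a source that reads lines out of an internal buffer.
final class TinyBufferedSource {
    static final int CHUNK_SIZE = 8; // tiny on purpose so several refills happen

    private final InputStream in;
    private byte[] buffer = new byte[0]; // bytes fetched from the stream but not yet consumed

    TinyBufferedSource(InputStream in) {
        this.in = in;
    }

    // Pulls up to CHUNK_SIZE more bytes into the buffer; returns false at end of stream.
    private boolean refill() {
        byte[] chunk = new byte[CHUNK_SIZE];
        int read;
        try {
            read = in.read(chunk);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (read == -1) return false;
        int oldLength = buffer.length;
        buffer = Arrays.copyOf(buffer, oldLength + read);
        System.arraycopy(chunk, 0, buffer, oldLength, read);
        return true;
    }

    // Returns the next line without its '\n', or null once everything has been consumed.
    String readUtf8Line() {
        int searchFrom = 0;
        while (true) {
            for (int i = searchFrom; i < buffer.length; i++) {
                if (buffer[i] == '\n') return take(i, 1);
            }
            searchFrom = buffer.length;
            if (!refill()) {
                return buffer.length == 0 ? null : take(buffer.length, 0);
            }
        }
    }

    // Consumes lineLength bytes as text and drops `skip` delimiter bytes after them.
    private String take(int lineLength, int skip) {
        String line = new String(buffer, 0, lineLength, StandardCharsets.UTF_8);
        buffer = Arrays.copyOfRange(buffer, lineLength + skip, buffer.length);
        return line;
    }

    public static void main(String[] args) {
        byte[] bytes = "alpha\nbeta gamma\nend".getBytes(StandardCharsets.UTF_8);
        TinyBufferedSource source = new TinyBufferedSource(new ByteArrayInputStream(bytes));
        String line;
        while ((line = source.readUtf8Line()) != null) {
            System.out.println(line);
        }
    }
}
```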

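Summary

Let's recap:

  • Build an output sink from a File, Socket, or OutputStream passed in from outside
  • Okio creates a buffer internally and returns a BufferedSink
  • Use this BufferedSink to write all kinds of data, which actually goes into the buffer
  • Finally call close(), which writes whatever remains in the buffer to the file in one go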
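
The steps above can be sketched end to end without Okio. TinyBufferedSink is a hypothetical class with an 8-byte "segment" size chosen for visibility: writes land in an in-memory buffer, only whole segments are passed on early, and close() pushes the remainder to the underlying stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a buffered sink in front of a plain OutputStream.
final class TinyBufferedSink implements AutoCloseable {
    static final int SEGMENT_SIZE = 8; // tiny on purpose so the behavior is easy to see

    private final OutputStream out;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean closed;

    TinyBufferedSink(OutputStream out) {
        this.out = out;
    }

    // Encodes the text and stores it in the buffer; nothing partial reaches `out` yet.
    TinyBufferedSink writeUtf8(String s) {
        if (closed) throw new IllegalStateException("closed");
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buffer.write(bytes, 0, bytes.length);
        return emitCompleteSegments();
    }

    // Passes on whole segments only; a partial tail stays buffered.
    private TinyBufferedSink emitCompleteSegments() {
        int complete = buffer.size() / SEGMENT_SIZE * SEGMENT_SIZE;
        if (complete > 0) drain(complete);
        return this;
    }

    // Writes the first byteCount buffered bytes to `out` and keeps the rest.
    private void drain(int byteCount) {
        byte[] all = buffer.toByteArray();
        try {
            out.write(all, 0, byteCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        buffer.reset();
        buffer.write(all, byteCount, all.length - byteCount);
    }

    // Flushes whatever is still buffered, then closes the underlying stream.
    @Override public void close() {
        if (closed) return;
        closed = true;
        try {
            if (buffer.size() > 0) drain(buffer.size());
        } finally {
            try {
                out.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        TinyBufferedSink sink = new TinyBufferedSink(target);
        sink.writeUtf8("hello");
        System.out.println(target.size()); // 0: five bytes do not fill a segment
        sink.writeUtf8(" world!");
        System.out.println(target.size()); // 8: one complete segment was passed on
        sink.close();
        System.out.println(new String(target.toByteArray(), StandardCharsets.UTF_8)); // hello world!
    }
}
```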

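I have not covered its internal data type conversions or its buffer optimizations, but that does no real harm: once you understand the design of its buffer, the strengths and efficiency of this IO framework are plain to see. Okio is also said to make efficient use of NIO for reads and writes, but client-side code rarely needs that, so I won't look into it here.

Looking back at Okio as a whole, some things become clear. Why can it read and write so many data types so simply? Because it implements a buffer, and all of its IO is built on that buffer. Our operations all target the buffer, which is why reads and writes of many data types can be implemented so flexibly. And as we saw, in the end the data is still written to the file through a byte stream.

I don't know whether you took anything about encapsulation away from this, but I have a few thoughts I'd like to share.

Encapsulation should not be limited to putting a few repeated steps into one method and calling it from everywhere. More often, we should think about the flaws of the original framework and wrap it with the aim of fixing them; if that is hard to do within the original architecture, we should, at the right moment, take a step forward and break out of the original framework's limits.

Sometimes a successful wrapper amounts to a perfect refactoring.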


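Afterword

I didn't join the crowd in posting a year-in-review at the end of last year, so I never got the chance to wish you all a happy new year. Let me wish you all smooth work in the new year here!!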


Errata

None yet
