Java IO Library: DataInputStream and DataOutputStream

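1. DataInputStream

1 - Introduction to DataInputStream

    DataInputStream is a data input stream that extends FilterInputStream. Following the decorator pattern, it wraps an underlying input stream and implements the DataInput interface, so a program can read Java primitive types from that stream in a machine-independent way. An application typically uses DataInputStream to read back data previously written by DataOutputStream.

2 - DataInputStream source code analysis

1) Member fields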

/**
     * working arrays initialized on demand by readUTF
     */
    private byte bytearr[] = new byte[80];
    private char chararr[] = new char[80];

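    As the comment says, the fields bytearr and chararr are used by the readUTF method. We skip them for now and explain their role when we get to that method.

2) Member methods

//Constructor: binds the underlying input stream to be decorated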
    public DataInputStream(InputStream in) {
        super(in);
    }

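    //Reads some bytes from the input stream into byte array b; returns the number of bytes actually read, or -1 at end of stream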
    public final int read(byte b[]) throws IOException {
        return in.read(b, 0, b.length);
    }
    //Same as read(byte b[]), except that off is the index in b at which storage starts and len is the maximum number of bytes to read
    public final int read(byte b[], int off, int len) throws IOException {
        return in.read(b, off, len);
    }

    //Reads bytes from the input stream into b, looping until b is completely filled. If the stream ends before
    //b.length bytes have been read, an EOFException is thrown
    public final void readFully(byte b[]) throws IOException {
        readFully(b, 0, b.length);
    }
    //Reads exactly len bytes into b starting at index off, looping until all of them are read. If fewer than len
    //bytes remain in the input stream in, an EOFException is thrown
    public final void readFully(byte b[], int off, int len) throws IOException {
        if (len < 0)
            throw new IndexOutOfBoundsException();
        int n = 0;
        while (n < len) {
            int count = in.read(b, off + n, len - n);
            if (count < 0)
                throw new EOFException();
            n += count;
        }
    }

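    //Tries to skip n bytes. The stream may have fewer than n bytes left, so the number actually skipped (the return value) may be less than n
    public final int skipBytes(int n) throws IOException {
        int total = 0;
        int cur = 0;
        while ((total < n) && ((cur = (int) in.skip(n - total)) > 0)) {
            total += cur;
        }
        return total;
    }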
    
    //Reads a boolean from the input stream: one byte, where any non-zero value means true
    public final boolean readBoolean() throws IOException {
        int ch = in.read();
        if (ch < 0)
            throw new EOFException();
        return (ch != 0);
    }
    
    //Reads a signed byte from the input stream
    public final byte readByte() throws IOException {
        int ch = in.read();
        if (ch < 0)
            throw new EOFException();
        return (byte)(ch);
    }

    //Reads an unsigned byte from the input stream, returned as an int in the range 0-255
    public final int readUnsignedByte() throws IOException {
        int ch = in.read();
        if (ch < 0)
            throw new EOFException();
        return ch;
    }

    //Reads a signed short (2 bytes) from the input stream. The data is big-endian (high byte first), so the first byte ch1 is shifted left by 8 bits
    public final short readShort() throws IOException {
        int ch1 = in.read();
        int ch2 = in.read();
        if ((ch1 | ch2) < 0)
            throw new EOFException();
        return (short)((ch1 << 8) + (ch2 << 0));
    }

    //Reads an unsigned short from the input stream, returned as an int in the range 0-65535
    public final int readUnsignedShort() throws IOException {
        int ch1 = in.read();
        int ch2 = in.read();
        if ((ch1 | ch2) < 0)
            throw new EOFException();
        return (ch1 << 8) + (ch2 << 0);
    }

    //Reads a char (2 bytes) from the input stream
    public final char readChar() throws IOException {
        int ch1 = in.read();
        int ch2 = in.read();
        if ((ch1 | ch2) < 0)
            throw new EOFException();
        return (char)((ch1 << 8) + (ch2 << 0));
    }

    //Reads an int (4 bytes) from the input stream
    public final int readInt() throws IOException {
        int ch1 = in.read();
        int ch2 = in.read();
        int ch3 = in.read();
        int ch4 = in.read();
        if ((ch1 | ch2 | ch3 | ch4) < 0)
            throw new EOFException();
        return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
    }

    The methods above are straightforward and need no further analysis. Next we focus on readLong, readDouble, readFloat and readUTF.
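
Before moving on, a small sketch may help make the big-endian byte order concrete. It is only an illustration: the class name BigEndianReadDemo and its helper methods are made up for this example and are not part of the JDK. It feeds hand-written bytes to readInt, readShort and readUnsignedShort.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; only DataInputStream and ByteArrayInputStream are JDK types.
class BigEndianReadDemo {
    // Reads one int from the given 4 bytes using DataInputStream.readInt.
    static int readIntFrom(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the same 2 bytes twice: once as a signed short, once as an unsigned short.
    static int[] readShortBothWays(byte[] bytes) {
        try (DataInputStream signed = new DataInputStream(new ByteArrayInputStream(bytes));
             DataInputStream unsigned = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return new int[] { signed.readShort(), unsigned.readUnsignedShort() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The first byte in the stream becomes the most significant byte of the int.
        byte[] four = { 0x12, 0x34, 0x56, 0x78 };
        System.out.println(Integer.toHexString(readIntFrom(four))); // 12345678

        // 0xFFFE is -2 as a signed short and 65534 as an unsigned short.
        byte[] two = { (byte) 0xFF, (byte) 0xFE };
        int[] r = readShortBothWays(two);
        System.out.println(r[0] + " " + r[1]); // -2 65534
    }
}
```

The signed and unsigned variants read the same two bytes; they differ only in whether the combined value is cast to short before being returned.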

1) Analysis of readLong

private byte readBuffer[] = new byte[8];

    /**
     * See the general contract of the <code>readLong</code>
     * method of <code>DataInput</code>.
     * <p>
     * Bytes
     * for this operation are read from the contained
     * input stream.
     *
     * @return     the next eight bytes of this input stream, interpreted as a
     *             <code>long</code>.
     * @exception  EOFException  if this input stream reaches the end before
     *               reading eight bytes.
     * @exception  IOException   the stream has been closed and the contained
     *             input stream does not support reading after close, or
     *             another I/O error occurs.
     * @see        java.io.FilterInputStream#in
     */
    public final long readLong() throws IOException {
        readFully(readBuffer, 0, 8);
        return (((long)readBuffer[0] << 56) +
                ((long)(readBuffer[1] & 255) << 48) +
                ((long)(readBuffer[2] & 255) << 40) +
                ((long)(readBuffer[3] & 255) << 32) +
                ((long)(readBuffer[4] & 255) << 24) +
                ((readBuffer[5] & 255) << 16) +
                ((readBuffer[6] & 255) <<  8) +
                ((readBuffer[7] & 255) <<  0));
    }

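readLong reads 8 bytes from the input stream and assembles them into a long. The method first calls readFully, which blocks until 8 bytes have been read into the readBuffer array; if the stream ends before 8 bytes are available, an EOFException is thrown. Deserialization mirrors serialization and is big-endian: the most significant byte of the value comes first in the stream, and the stream is read from that byte onward. The & 255 masks keep negative byte values from being sign-extended when they are widened.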
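
To see why the masking matters, here is a minimal sketch. The class ReadLongMaskDemo and its methods are hypothetical names for this illustration, and the loops are a compact equivalent of the shift-and-add expression in readLong, not the JDK code itself.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class comparing masked and unmasked byte combination.
class ReadLongMaskDemo {
    // Combines 8 big-endian bytes into a long, masking each byte to 0-255 first.
    static long combineMasked(byte[] b) {
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (b[i] & 0xFFL);
        }
        return v;
    }

    // Same loop without the mask: a negative byte is sign-extended when widened,
    // which sets all higher bits and corrupts the result.
    static long combineUnmasked(byte[] b) {
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | b[i];
        }
        return v;
    }

    // Reference result from the real DataInputStream.readLong.
    static long readLongFrom(byte[] b) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(b))) {
            return in.readLong();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = { 0, 0, 0, 0, 0, 0, 0, (byte) 0x80 };
        System.out.println(readLongFrom(bytes));    // 128
        System.out.println(combineMasked(bytes));   // 128
        System.out.println(combineUnmasked(bytes)); // -128
    }
}
```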

2) The readDouble method

public final double readDouble() throws IOException {
        return Double.longBitsToDouble(readLong());
    }

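As the source shows, readDouble first reads a full 8 bytes from the input stream as a long, then uses Double.longBitsToDouble to restore the double value that was originally written to the stream. readFloat follows the same pattern with readInt and Float.intBitsToFloat.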
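
The bit-pattern round trip can be sketched as follows. DoubleBitsDemo, encode and decode are hypothetical names chosen for this example; the JDK calls used are writeDouble, readDouble and Double.doubleToLongBits.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class: a double travels through the stream as its 8-byte bit pattern.
class DoubleBitsDemo {
    // Serializes one double with DataOutputStream.writeDouble.
    static byte[] encode(double value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeDouble(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Restores the double with DataInputStream.readDouble.
    static double decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readDouble();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = encode(1.0);
        System.out.println(bytes.length);                                   // 8
        System.out.println(Long.toHexString(Double.doubleToLongBits(1.0))); // 3ff0000000000000
        System.out.println(decode(bytes));                                  // 1.0
    }
}
```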

3) The readUTF method

This method is more involved and is the key method of this class.

public final static String readUTF(DataInput in) throws IOException {
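        //Read an unsigned short from the input stream: the first 2 bytes hold the length, in bytes, of the UTF-8 encoded data that follows
        int utflen = in.readUnsignedShort();//length of the encoded data in bytes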
        byte[] bytearr = null;
        char[] chararr = null;
        //Set up the byte array bytearr and the char array chararr
        if (in instanceof DataInputStream) {
            DataInputStream dis = (DataInputStream)in;
            if (dis.bytearr.length < utflen){
                dis.bytearr = new byte[utflen*2];
                dis.chararr = new char[utflen*2];
            }
            chararr = dis.chararr;
            bytearr = dis.bytearr;
        } else {
            bytearr = new byte[utflen];
            chararr = new char[utflen];
        }

        int c, char2, char3;
        int count = 0;
        int chararr_count=0;
        //Read exactly utflen bytes from the data input stream into bytearr
        in.readFully(bytearr, 0, utflen);
        //A character takes 1 to 3 bytes in this encoding; this first loop is a fast path for the leading run of single-byte characters
        while (count < utflen) {
            c = (int) bytearr[count] & 0xff;
            //A single-byte character has a value no greater than 127; stop at the first byte above 127
            if (c > 127) break;
            count++;
            chararr[chararr_count++]=(char)c;
        }
        //After the leading single-byte run, decode the rest according to the byte-length formats of UTF-8, identified by the first byte of each character
        while (count < utflen) {
            //Convert the byte to an unsigned int value
            c = (int) bytearr[count] & 0xff;
            //Shift c right by 4 bits and switch on the resulting high nibble
            switch (c >> 4) {
                //Single-byte character: bytearr[count] has the form "0xxxxxxx",
                //so c >> 4 is in the range 0-7 and c converts directly to a char
                case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
                    /* 0xxxxxxx*/
                    count++;
                    chararr[chararr_count++]=(char)c;
                    break;
                //Two-byte character: bytearr[count] is the first byte of the form "110xxxxx 10xxxxxx",
                //so c >> 4 is 12 or 13; the two bytes are masked and shifted to build the char
                case 12: case 13:
                    /* 110x xxxx   10xx xxxx*/
                    count += 2;
                    if (count > utflen)
                        throw new UTFDataFormatException(
                            "malformed input: partial character at end");
                    char2 = (int) bytearr[count-1];
                    if ((char2 & 0xC0) != 0x80)
                        throw new UTFDataFormatException(
                            "malformed input around byte " + count);
                    chararr[chararr_count++]=(char)(((c & 0x1F) << 6) |
                                                    (char2 & 0x3F));
                    break;
                //Three-byte character: bytearr[count] is the first byte of the form "1110xxxx 10xxxxxx 10xxxxxx",
                //so c >> 4 is 14
                case 14:
                    /* 1110 xxxx  10xx xxxx  10xx xxxx */
                    count += 3;
                    if (count > utflen)
                        throw new UTFDataFormatException(
                            "malformed input: partial character at end");
                    char2 = (int) bytearr[count-2];
                    char3 = (int) bytearr[count-1];
                    if (((char2 & 0xC0) != 0x80) || ((char3 & 0xC0) != 0x80))
                        throw new UTFDataFormatException(
                            "malformed input around byte " + (count-1));
                    chararr[chararr_count++]=(char)(((c     & 0x0F) << 12) |
                                                    ((char2 & 0x3F) << 6)  |
                                                    ((char3 & 0x3F) << 0));
                    break;
                default:
                    /* 10xx xxxx,  1111 xxxx */
                    throw new UTFDataFormatException(
                        "malformed input around byte " + count);
            }
        }
        // The number of chars produced may be less than utflen
        return new String(chararr, 0, chararr_count);
    }

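readUTF reads UTF-8 encoded data (strictly, Java's modified UTF-8) from the input stream and returns it as a String. Its flow is as follows:

1) Read the length, in bytes, of the UTF-8 data from the input stream.

2) Prepare two arrays, the byte array bytearr and the char array chararr, to hold the UTF-8 bytes from the input stream and the decoded characters respectively.

The method first checks whether the input in passed to it is a DataInputStream:

    If it is not, it allocates new bytearr and chararr arrays, both with a capacity equal to the UTF-8 byte length just read. Since the number of bytes each character occupies cannot be known in advance, chararr is sized for the worst case in which every character is a single byte.

    If it is, it checks whether the length of in's bytearr field is less than the UTF-8 byte length. If so, both fields bytearr and chararr are reallocated at twice that length. The local bytearr and chararr then point to the two fields of in. (I do not understand why they need to grow to twice utflen; isn't half of that space wasted? Presumably the slack lets later calls with somewhat longer strings reuse the arrays without reallocating, though the source does not say.)

3) Read all of the UTF-8 data into the byte array bytearr.

4) Pre-process the leading run of single-byte characters.

5) Continue with the data remaining after step 4). Characters decoded here occupy 1 to 3 bytes, so the first byte of each character is converted to an int and shifted right by 4 bits to tell how many bytes the character occupies, and each case is then converted to a char accordingly.

6) Convert the char array chararr to a String and return it.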
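
The steps above can be tried out on hand-written bytes. The following is a minimal sketch; the class ReadUtfDemo and the helper decodeOrNull are hypothetical names for this example. It decodes one well-formed input containing a 1-byte, a 2-byte and a 3-byte character, and two inputs that the switch statement rejects.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;

// Hypothetical demo class exercising DataInputStream.readUTF on raw bytes.
class ReadUtfDemo {
    // Decodes one length-prefixed string; returns null if readUTF rejects the data as malformed.
    static String decodeOrNull(byte[] framed) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(framed))) {
            return in.readUTF();
        } catch (UTFDataFormatException e) {
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Length 6, then 'A' (1 byte), U+00E9 (2 bytes), U+4E2D (3 bytes).
        byte[] ok = { 0x00, 0x06, 0x41, (byte) 0xC3, (byte) 0xA9,
                      (byte) 0xE4, (byte) 0xB8, (byte) 0xAD };
        System.out.println("A\u00e9\u4e2d".equals(decodeOrNull(ok))); // true

        // A first byte of the form 1111xxxx falls into the default branch.
        byte[] fourByteForm = { 0x00, 0x04, (byte) 0xF0, (byte) 0x9F, (byte) 0x98, (byte) 0x80 };
        System.out.println(decodeOrNull(fourByteForm)); // null

        // A 3-byte character cut off after 2 bytes: "partial character at end".
        byte[] truncated = { 0x00, 0x02, (byte) 0xE4, (byte) 0xB8 };
        System.out.println(decodeOrNull(truncated)); // null
    }
}
```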

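2. DataOutputStream

1 - Introduction to DataOutputStream

    DataOutputStream is a data output stream that extends FilterOutputStream and decorates another output stream. By implementing the DataOutput interface, it adds the ability to write Java primitive types to the bound output stream. An application can then use DataInputStream to read back the primitive values written by DataOutputStream.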
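
As a quick illustration of how the two classes pair up, here is a minimal round-trip sketch. The class RoundTripDemo and the record layout (id, active, score, name) are invented for this example; the point is only that values must be read back in the same order and with the same types as they were written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class: writes a small record and reads it back.
class RoundTripDemo {
    // Writes the four fields in a fixed order: int, boolean, double, UTF string.
    static byte[] write(int id, boolean active, double score, String name) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(id);
            out.writeBoolean(active);
            out.writeDouble(score);
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads the fields back in exactly the order they were written.
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = in.readInt();
            boolean active = in.readBoolean();
            double score = in.readDouble();
            String name = in.readUTF();
            return id + "," + active + "," + score + "," + name;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = write(7, true, 2.5, "Ada");
        System.out.println(data.length); // 18 = 4 + 1 + 8 + (2 + 3)
        System.out.println(read(data));  // 7,true,2.5,Ada
    }
}
```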

2 - DataOutputStream source code analysis

public
class DataOutputStream extends FilterOutputStream implements DataOutput {
    //Number of bytes written to the data output stream so far
    protected int written;

    //Byte array used by writeUTF as a buffer for the encoded data
    private byte[] bytearr = null;

    //Constructor: binds the underlying output stream
    public DataOutputStream(OutputStream out) {
        super(out);
    }

    //Increases the written byte counter, capping it at Integer.MAX_VALUE on overflow
    private void incCount(int value) {
        int temp = written + value;
        if (temp < 0) {
            temp = Integer.MAX_VALUE;
        }
        written = temp;
    }

    //Writes the low 8 bits of the int b to the data output stream as one byte
    public synchronized void write(int b) throws IOException {
        out.write(b);
        incCount(1);
    }

    //Writes len bytes of the byte array, starting at off, to the data output stream
    public synchronized void write(byte b[], int off, int len)
        throws IOException
    {
        out.write(b, off, len);
        incCount(len);
    }

    //Flushes the stream, forcing any buffered bytes to be written to the underlying output stream
    public void flush() throws IOException {
        out.flush();
    }

    //Writes a boolean to the data output stream as one byte (1 for true, 0 for false)
    public final void writeBoolean(boolean v) throws IOException {
        out.write(v ? 1 : 0);
        incCount(1);
    }

    //Writes a byte to the data output stream
    public final void writeByte(int v) throws IOException {
        out.write(v);
        incCount(1);
    }

    //Writes a short to the data output stream, high byte first
    public final void writeShort(int v) throws IOException {
        out.write((v >>> 8) & 0xFF);
        out.write((v >>> 0) & 0xFF);
        incCount(2);
    }

    //Writes a char to the data output stream; note that a char takes 2 bytes
    public final void writeChar(int v) throws IOException {
        out.write((v >>> 8) & 0xFF);
        out.write((v >>> 0) & 0xFF);
        incCount(2);
    }

    //Writes an int to the data output stream
    public final void writeInt(int v) throws IOException {
        out.write((v >>> 24) & 0xFF);
        out.write((v >>> 16) & 0xFF);
        out.write((v >>>  8) & 0xFF);
        out.write((v >>>  0) & 0xFF);
        incCount(4);
    }

    private byte writeBuffer[] = new byte[8];

    //Writes a long to the data output stream; a long takes 8 bytes
    public final void writeLong(long v) throws IOException {
        writeBuffer[0] = (byte)(v >>> 56);
        writeBuffer[1] = (byte)(v >>> 48);
        writeBuffer[2] = (byte)(v >>> 40);
        writeBuffer[3] = (byte)(v >>> 32);
        writeBuffer[4] = (byte)(v >>> 24);
        writeBuffer[5] = (byte)(v >>> 16);
        writeBuffer[6] = (byte)(v >>>  8);
        writeBuffer[7] = (byte)(v >>>  0);
        out.write(writeBuffer, 0, 8);
        incCount(8);
    }

    //Writes a float to the data output stream: the value is first converted to its int bit pattern, which is then written as an int
    public final void writeFloat(float v) throws IOException {
        writeInt(Float.floatToIntBits(v));
    }

    //Writes a double to the data output stream
    public final void writeDouble(double v) throws IOException {
        writeLong(Double.doubleToLongBits(v));
    }

    //Writes a String to the data output stream as bytes: only the low 8 bits of each character are written
    public final void writeBytes(String s) throws IOException {
        int len = s.length();
        for (int i = 0 ; i < len ; i++) {
            out.write((byte)s.charAt(i));
        }
        incCount(len);
    }

    //Writes a String to the data output stream as chars: each character is written as 2 bytes, high byte first
    public final void writeChars(String s) throws IOException {
        int len = s.length();
        for (int i = 0 ; i < len ; i++) {
            int v = s.charAt(i);
            out.write((v >>> 8) & 0xFF);
            out.write((v >>> 0) & 0xFF);
        }
        incCount(len * 2);
    }

    //Writes a String to the data output stream in UTF-8 encoding
    public final void writeUTF(String str) throws IOException {
        writeUTF(str, this);
    }

   
    static int writeUTF(String str, DataOutput out) throws IOException {
        //Length of the String in chars
        int strlen = str.length();
        //Number of UTF-8 bytes needed
        int utflen = 0;
        int c, count = 0;

        //Count the UTF-8 bytes: the value of each char determines whether it encodes to 1, 2 or 3 bytes
        for (int i = 0; i < strlen; i++) {
            c = str.charAt(i);
            if ((c >= 0x0001) && (c <= 0x007F)) {
                utflen++;
            } else if (c > 0x07FF) {
                utflen += 3;
            } else {
                utflen += 2;
            }
        }
        //If the encoded length exceeds 65535 bytes, throw a UTFDataFormatException: the encoded string is too long
        if (utflen > 65535)
            throw new UTFDataFormatException(
                "encoded string too long: " + utflen + " bytes");
        //Declare the byte array bytearr
        byte[] bytearr = null;
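        //If out is a DataOutputStream (or a subclass) and its bytearr field is null or shorter than utflen + 2,
        //reallocate the field to utflen * 2 + 2 bytes, then point the local bytearr at it. The 2 extra bytes
        //hold the length of the UTF-8 data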
        if (out instanceof DataOutputStream) {
            DataOutputStream dos = (DataOutputStream)out;
            if(dos.bytearr == null || (dos.bytearr.length < (utflen+2)))
                dos.bytearr = new byte[(utflen*2) + 2];
            bytearr = dos.bytearr;
        } else {
            bytearr = new byte[utflen+2];
        }

        //Write the UTF-8 byte length
        bytearr[count++] = (byte) ((utflen >>> 8) & 0xFF);
        bytearr[count++] = (byte) ((utflen >>> 0) & 0xFF);

        int i=0;
        //Fast path: handle the leading run of single-byte characters
        for (i=0; i<strlen; i++) {
           c = str.charAt(i);
           if (!((c >= 0x0001) && (c <= 0x007F))) break;
           bytearr[count++] = (byte) c;
        }
        
        //Continue with the characters remaining after the fast path
        for (;i < strlen; i++){
            c = str.charAt(i);
            //Single-byte character
            if ((c >= 0x0001) && (c <= 0x007F)) {
                bytearr[count++] = (byte) c;

            }
            //Three-byte character
            else if (c > 0x07FF) {
                bytearr[count++] = (byte) (0xE0 | ((c >> 12) & 0x0F));
                bytearr[count++] = (byte) (0x80 | ((c >>  6) & 0x3F));
                bytearr[count++] = (byte) (0x80 | ((c >>  0) & 0x3F));
            }
            //Two-byte character
            else {
                bytearr[count++] = (byte) (0xC0 | ((c >>  6) & 0x1F));
                bytearr[count++] = (byte) (0x80 | ((c >>  0) & 0x3F));
            }
        }
        //Write the byte array bytearr to the data output stream
        out.write(bytearr, 0, utflen+2);
        return utflen + 2;
    }

    //Returns the number of bytes written to the output stream
    public final int size() {
        return written;
    }
}

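    Overall, the member methods of DataOutputStream are simple. The one worth studying is the more complex writeUTF method, whose flow is as follows:

1) Get the length strlen of the string to encode, in chars;

2) Count the length utflen of the encoded byte data. If it exceeds what a 2-byte, 16-bit counter can hold, 2^16-1 (65535), throw an exception reporting that the encoded string is too long;

3) Declare the byte array bytearr that holds the encoded bytes. Its capacity follows these rules:

     1- If the output out is not an instance of DataOutputStream or a subclass, bytearr is allocated with length utflen+2;

     2- If the output out is an instance of DataOutputStream or a subclass, and out's bytearr field is null or shorter than utflen+2, the field is reallocated with length utflen*2+2 and the local bytearr points to it; otherwise the local bytearr points directly to out's existing field.

4) Write the 2 bytes that record the encoded length into the buffer bytearr;

5) Pre-process the leading single-byte characters of the string, stopping at the first character that is not a single-byte UTF-8 character;

6) Encode the remaining characters, handling the single-byte and multi-byte cases separately, and write the result into the buffer bytearr;

7) Write the buffer bytearr to the data output stream;

8) Return the number of bytes written, utflen+2 (including the 2 length bytes).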
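
To check these steps against real output, the sketch below dumps the bytes that writeUTF produces. WriteUtfDemo, utfHex and repeat are hypothetical helpers written for this example. Note two consequences of the branch conditions in the source: the char U+0000 is outside the 0x0001-0x007F range and therefore takes the two-byte form, and a string whose encoded form is longer than 65535 bytes is rejected.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class that shows the raw bytes produced by DataOutputStream.writeUTF.
class WriteUtfDemo {
    // Returns the bytes writeUTF produces for s as space-separated hex,
    // or null if writeUTF rejects the string as too long.
    static String utfHex(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (UTFDataFormatException e) {
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : buf.toByteArray()) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    // Builds a string of n copies of c.
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        // 2 length bytes (00 06), then 1-byte, 2-byte and 3-byte characters.
        System.out.println(utfHex("A\u00e9\u4e2d")); // 00 06 41 C3 A9 E4 B8 AD
        // U+0000 is written in the two-byte form.
        System.out.println(utfHex("\0"));            // 00 02 C0 80
        // 65535 single-byte chars still fit; one more does not.
        System.out.println(utfHex(repeat('a', 65535)) != null); // true
        System.out.println(utfHex(repeat('a', 65536)));         // null
    }
}
```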
