[Source Code Analysis] How Serialization Is Implemented Under the Hood

Date: 2021-02-17


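In an earlier post, I showed what a Java object turns into once it has been serialized.

Many readers were probably still puzzled by the format of that output.
So in this post I will walk through the source code to show what object serialization really does.

When we use serialization in everyday code, we usually follow these steps:

Calling code:

First, we need an object that will be serialized later:

POJO class: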

package edu.youzg.about_serialize.pojo;

import java.io.Serializable;

/**
 * @Author: Youzg
 * @CreateTime: 2021-2-16 16:48
 * @Description: Digging into the essence of Java with you!
 */
public class Fan implements Serializable {
    private static final long serialVersionUID = 3439052454193760044L;

    private String name;

    private Integer age;

    private int likeNum;

    public Fan() {
    }

    public Fan(String name, Integer age, int likeNum) {
        this.name = name;
        this.age = age;
        this.likeNum = likeNum;
    }

    public static long getSerialVersionUID() {
        return serialVersionUID;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public Integer getAge() {
        return age;
    }

    public void setAge(Integer age) {
        this.age = age;
    }

    public int getLikeNum() {
        return likeNum;
    }

    public void setLikeNum(int likeNum) {
        this.likeNum = likeNum;
    }

    @Override
    public String toString() {
        return "Fan{" +
                "name='" + name + '\'' +
                ", age=" + age +
                ", likeNum=" + likeNum +
                '}';
    }

}

The POJO to be serialized is now ready.
Next, here is a small demo that serializes an object:

Serializing the object:

package edu.youzg.about_serialize.demo;

import edu.youzg.about_serialize.pojo.Fan;

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class YouzgDemo {

    public static void main(String[] args) throws IOException {
        // try-with-resources flushes and closes the stream even if writeObject fails;
        // errors propagate instead of being silently swallowed
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("./prettyGril.txt"))) {
            Fan aFan = new Fan("prettyGril", 16, 666);
            oos.writeObject(aFan);
        }
    }

}

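I showed the output in the previous post, so I will not repeat it here.

Readers of that post probably still have questions about the content of the final output.
So next I will explain the output format and how object serialization works:

The essence of serialization:

From the code above we can see that
serializing an object takes only two calls: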

new ObjectOutputStream();

and

oos.writeObject();

Let us now look at the source of both.

First, the constructor of the ObjectOutputStream class:

ObjectOutputStream class ✨ constructor source:

public ObjectOutputStream(OutputStream out) throws IOException {
    verifySubclass();
    // bout can be thought of as a "container" that buffers part of the serialized data
    bout = new BlockDataOutputStream(out);
    handles = new HandleTable(10, (float) 3.00);
    subs = new ReplaceTable(10, (float) 3.00);
    enableOverride = false;
    writeStreamHeader();

    // switch bout into block data mode (blkmode = true)
    bout.setBlockDataMode(true);
    if (extendedDebugInfo) {
        debugInfoStack = new DebugTraceInfoStack();
    } else {
        debugInfoStack = null;
    }
}

Next, here is the constructor of the BlockDataOutputStream class used above:

BlockDataOutputStream class ✨ constructor source:

private static final int MAX_BLOCK_SIZE = 1024;
private static final int MAX_HEADER_SIZE = 5;
private static final int CHAR_BUF_SIZE = 256;

private final byte[] buf = new byte[MAX_BLOCK_SIZE];
private final byte[] hbuf = new byte[MAX_HEADER_SIZE];
private final char[] cbuf = new char[CHAR_BUF_SIZE];

private boolean blkmode = false;

private int pos = 0;

private final OutputStream out;

private final DataOutputStream dout;

BlockDataOutputStream(OutputStream out) {
    this.out = out;
    dout = new DataOutputStream(this);
}

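We can see that:

BlockDataOutputStream is essentially a wrapper built around a DataOutputStream,
with a few buffers and state fields of its own.

We can also see that the ObjectOutputStream constructor creates a BlockDataOutputStream object
and keeps using it in the code that follows.
So why, of all the lines in the constructor, did I single out

bout = new BlockDataOutputStream(out);

for a closer look?

Because the writeObject method discussed later does most of its work through the bout object.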
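To make the role of bout and its block data mode a little more concrete, here is a small sketch of my own (the class name BlockDataDemo and its helper methods are made up for illustration, they are not part of the JDK). It writes a single int through an ObjectOutputStream and dumps the bytes. Since the constructor leaves bout in block data mode, the int should come out wrapped in a block: the TC_BLOCKDATA tag (0x77), a one-byte length, then the four data bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class BlockDataDemo {

    /** Writes a single int through an ObjectOutputStream and returns the raw bytes. */
    public static byte[] writeOneInt(int value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            oos.writeInt(value);   // buffered by the internal block data stream
        }                          // close() flushes the pending block
        return sink.toByteArray();
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        // 4 header bytes, then 77 (TC_BLOCKDATA), 04 (length), and the int itself
        System.out.println(hex(writeOneInt(7)));  // prints "AC ED 00 05 77 04 00 00 00 07"
    }
}
```

The first four bytes are the stream header written by the constructor; everything after that is the block produced by bout.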

After creating the bout object, the ObjectOutputStream constructor calls the writeStreamHeader method.
Here is the source of writeStreamHeader:

ObjectOutputStream class ✨ writeStreamHeader source:

/**
 * Magic number that is written to the stream header.
 */
final static short STREAM_MAGIC = (short)0xaced;

/**
 * Version number that is written to the stream header.
 */
final static short STREAM_VERSION = 5;

protected void writeStreamHeader() throws IOException {
    bout.writeShort(STREAM_MAGIC);
    bout.writeShort(STREAM_VERSION);
}

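From the source comments we know that:

STREAM_MAGIC is the magic number that identifies a serialization stream
STREAM_VERSION is the stream version number

As you can see:

We do not need to trace this method any further.
It is enough to know what it does: it writes a header to the stream.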
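If you want to see the header for yourself, a quick check along these lines should do. This is only a sketch: StreamHeaderDemo and its methods are names I made up, and it serializes into memory instead of a file so the bytes are easy to inspect.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class StreamHeaderDemo {

    /** Serializes any object into a byte array. */
    public static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            oos.writeObject(obj);
        }
        return sink.toByteArray();
    }

    /** Reads the first short of the stream: the magic number. */
    public static short magic(byte[] stream) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(stream)).readShort();
    }

    /** Reads the second short of the stream: the version. */
    public static short version(byte[] stream) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream));
        in.readShort();            // skip the magic number
        return in.readShort();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize("hello");
        // prints "magic=0xACED version=5"
        System.out.printf("magic=0x%04X version=%d%n", magic(bytes) & 0xFFFF, version(bytes));
    }
}
```

Whatever object you serialize, these first four bytes stay the same, because they are written by the constructor before writeObject is ever called.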

One more detail in the ObjectOutputStream constructor is worth noting:

enableOverride = false;
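For contrast, here is a sketch of the opposite case. As far as I can tell, a subclass that calls the protected no-argument ObjectOutputStream constructor gets enableOverride set to true, so writeObject hands everything to writeObjectOverride instead. The class names OverrideDemo and RecordingStream are my own, chosen only for this illustration.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

public class OverrideDemo {

    /** A stream that never serializes anything; it only records what it was asked to write. */
    public static class RecordingStream extends ObjectOutputStream {
        public final List<Object> seen = new ArrayList<>();

        public RecordingStream() throws IOException {
            super();   // protected no-arg constructor: no underlying stream, override enabled
        }

        @Override
        protected void writeObjectOverride(Object obj) {
            seen.add(obj);   // called by the final writeObject method
        }
    }

    /** Writes two objects and reports how many reached writeObjectOverride. */
    public static int countWrites() throws IOException {
        // not closed on purpose: this stream has no underlying output to flush
        RecordingStream stream = new RecordingStream();
        stream.writeObject("first");
        stream.writeObject("second");
        return stream.seen.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countWrites());  // prints 2
    }
}
```

With the ordinary public constructor used in this post, enableOverride stays false and this hook is never reached.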

That covers the basic flow of the constructor.
Now let us look at the source of the writeObject method:

ObjectOutputStream class ✨ writeObject source:

public final void writeObject(Object obj) throws IOException {
    if (enableOverride) {
        writeObjectOverride(obj);
        return;
    }
    try {
        writeObject0(obj, false);
    } catch (IOException ex) {
        if (depth == 0) {
            writeFatalException(ex);
        }
        throw ex;
    }
}

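Recalling that the constructor sets enableOverride to false, we know that:

calling writeObject simply delegates to writeObject0

So next, let us look at what writeObject0 does step by step:

ObjectOutputStream class ✨ writeObject0 source: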

private void writeObject0(Object obj, boolean unshared)
        throws IOException
    {
    boolean oldMode = bout.setBlockDataMode(false);
    depth++;
    try {
    	// 1⃣️
        // handle previously written and non-replaceable objects
        int h;
        if ((obj = subs.lookup(obj)) == null) {
            writeNull();
            return;
        } else if (!unshared && (h = handles.lookup(obj)) != -1) {
            writeHandle(h);
            return;
        } else if (obj instanceof Class) {
            writeClass((Class) obj, unshared);
            return;
        } else if (obj instanceof ObjectStreamClass) {
            writeClassDesc((ObjectStreamClass) obj, unshared);
            return;
        }

		// 2⃣️
        // check for replacement object
        Object orig = obj;
        Class<?> cl = obj.getClass();
        ObjectStreamClass desc;
        for (;;) {
            // REMIND: skip this check for strings/arrays?
            Class<?> repCl;
            desc = ObjectStreamClass.lookup(cl, true);
            if (!desc.hasWriteReplaceMethod() ||
                (obj = desc.invokeWriteReplace(obj)) == null ||
                (repCl = obj.getClass()) == cl)
            {
                break;
            }
            cl = repCl;
        }
        if (enableReplace) {
            Object rep = replaceObject(obj);
            if (rep != obj && rep != null) {
                cl = rep.getClass();
                desc = ObjectStreamClass.lookup(cl, true);
            }
            obj = rep;
        }

		// 3⃣️
        // if object replaced, run through original checks a second time
        if (obj != orig) {
            subs.assign(orig, obj);
            if (obj == null) {
                writeNull();
                return;
            } else if (!unshared && (h = handles.lookup(obj)) != -1) {
                writeHandle(h);
                return;
            } else if (obj instanceof Class) {
                writeClass((Class) obj, unshared);
                return;
            } else if (obj instanceof ObjectStreamClass) {
                writeClassDesc((ObjectStreamClass) obj, unshared);
                return;
            }
        }

		// 4⃣️
        // remaining cases
        if (obj instanceof String) {
            writeString((String) obj, unshared);
        } else if (cl.isArray()) {
            writeArray(obj, desc, unshared);
        } else if (obj instanceof Enum) {
            writeEnum((Enum<?>) obj, desc, unshared);
        } else if (obj instanceof Serializable) {
            writeOrdinaryObject(obj, desc, unshared);
        } else {
            if (extendedDebugInfo) {
                throw new NotSerializableException(
                    cl.getName() + "\n" + debugInfoStack.toString());
            } else {
                throw new NotSerializableException(cl.getName());
            }
        }
    } finally {
        depth--;
        bout.setBlockDataMode(oldMode);
    }
}

That is a lot of source.
Long methods are hard to read,
so I used numbered comments to split this method into parts.

Let us go through them one by one:

Part one:

// 1⃣️
// handle previously written and non-replaceable objects
int h;
if ((obj = subs.lookup(obj)) == null) {
    writeNull();
    return;
} else if (!unshared && (h = handles.lookup(obj)) != -1) {
    writeHandle(h);
    return;
} else if (obj instanceof Class) {
    writeClass((Class) obj, unshared);
    return;
} else if (obj instanceof ObjectStreamClass) {
    writeClassDesc((ObjectStreamClass) obj, unshared);
    return;
}

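From the source comment we know that:

The main job of this block is
to handle objects that were already written to this stream, and objects that cannot be replaced (null, Class and ObjectStreamClass instances)

When an ordinary object is written for the first time, none of these branches match,
so for now it is enough to know what the block is for.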
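You can still observe this block from the outside. The sketch below (HandleDemo and its write helper are hypothetical names of mine) writes the same String once and then twice. The second write should hit the handles.lookup branch and cost only five bytes: the TC_REFERENCE tag plus a four-byte handle. Writing null should produce a single TC_NULL byte after the header.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class HandleDemo {

    /** Writes the given objects, in order, to one stream and returns the bytes. */
    public static byte[] write(Object... objs) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            for (Object o : objs) {
                oos.writeObject(o);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String s = "fan";
        int once = write(s).length;
        int twice = write(s, s).length;
        // the repeat is written as a back-reference, not as a second copy
        System.out.println(twice - once);                        // prints 5

        byte[] justNull = write((Object) null);
        // 4 header bytes + 1 tag byte
        System.out.println(justNull.length);                     // prints 5
        System.out.printf("0x%02X%n", justNull[4]);              // prints 0x70 (TC_NULL)
    }
}
```

This is also why an object graph with shared or cyclic references serializes without blowing up: every object after its first appearance is just a handle.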

Part two:

// 2⃣️
// check for replacement object
Object orig = obj;
Class<?> cl = obj.getClass();
ObjectStreamClass desc;
for (;;) {
    // REMIND: skip this check for strings/arrays?
    Class<?> repCl;
    desc = ObjectStreamClass.lookup(cl, true);
    if (!desc.hasWriteReplaceMethod() ||
        (obj = desc.invokeWriteReplace(obj)) == null ||
        (repCl = obj.getClass()) == cl)
    {
        break;
    }
    cl = repCl;
}
if (enableReplace) {
    Object rep = replaceObject(obj);
    if (rep != obj && rep != null) {
        cl = rep.getClass();
        desc = ObjectStreamClass.lookup(cl, true);
    }
    obj = rep;
}

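From the comment we know that:

This block checks whether the object should be replaced by another one.
Reading the if conditions, we find that for a class like Fan nothing is replaced:
the loop looks up the class descriptor desc and breaks at once because Fan declares no writeReplace method, and enableReplace is false by default.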
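To see the replacement path actually do something, here is a sketch with a class that does declare writeReplace. Token and WriteReplaceDemo are names I invented for the example; the only real hook is the private writeReplace method that serialization looks for.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class WriteReplaceDemo {

    public static class Token implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String value;

        public Token(String value) {
            this.value = value;
        }

        // Found through desc.hasWriteReplaceMethod(); the returned object
        // is serialized in place of this Token.
        private Object writeReplace() {
            return "token:" + value;
        }
    }

    /** Serializes an object and reads it straight back. */
    public static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object back = roundTrip(new Token("abc"));
        System.out.println(back.getClass().getSimpleName() + " " + back);  // prints "String token:abc"
    }
}
```

Here the loop runs twice: once for Token, whose writeReplace returns a String, and once for String, which has no writeReplace, so the loop breaks and the String is what gets written.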

Part three:

// 3⃣️
// if object replaced, run through original checks a second time
if (obj != orig) {
    subs.assign(orig, obj);
    if (obj == null) {
        writeNull();
        return;
    } else if (!unshared && (h = handles.lookup(obj)) != -1) {
        writeHandle(h);
        return;
    } else if (obj instanceof Class) {
        writeClass((Class) obj, unshared);
        return;
    } else if (obj instanceof ObjectStreamClass) {
        writeClassDesc((ObjectStreamClass) obj, unshared);
        return;
    }
}

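From the source comment and the explanation above we know that:

This block repeats the original checks if the object was replaced.
In our case, however, nothing was replaced in the previous block, so obj is still orig
and this block is skipped.

Part four: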

// 4⃣️
// remaining cases
if (obj instanceof String) {
    writeString((String) obj, unshared);
} else if (cl.isArray()) {
    writeArray(obj, desc, unshared);
} else if (obj instanceof Enum) {
    writeEnum((Enum<?>) obj, desc, unshared);
} else if (obj instanceof Serializable) {
    writeOrdinaryObject(obj, desc, unshared);
} else {
    if (extendedDebugInfo) {
        throw new NotSerializableException(
            cl.getName() + "\n" + debugInfoStack.toString());
    } else {
        throw new NotSerializableException(cl.getName());
    }
}

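From the comment we learn that:

This block handles all the remaining kinds of objects

The code itself makes this clear:

it deals with strings, arrays, enums and Serializable objects, and rejects everything else with a NotSerializableException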
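Each of these branches starts by writing a different tag byte, which makes the dispatch easy to observe. In the sketch below (TypeTagDemo, Color and Plain are throwaway names of mine), firstTag returns the byte that follows the four-byte header, and isRejected checks the final else branch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TypeTagDemo {

    public enum Color { RED }

    public static class Plain implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    /** Returns the first byte after the stream header, i.e. the tag chosen for obj. */
    public static byte firstTag(Object obj) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            oos.writeObject(obj);
        }
        return sink.toByteArray()[4];
    }

    /** True if serialization refuses the object with NotSerializableException. */
    public static boolean isRejected(Object obj) throws IOException {
        try {
            firstTag(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.printf("String -> 0x%02X%n", firstTag("hi"));            // 0x74 (TC_STRING)
        System.out.printf("array  -> 0x%02X%n", firstTag(new int[] {1}));   // 0x75 (TC_ARRAY)
        System.out.printf("enum   -> 0x%02X%n", firstTag(Color.RED));       // 0x7E (TC_ENUM)
        System.out.printf("object -> 0x%02X%n", firstTag(new Plain()));     // 0x73 (TC_OBJECT)
        System.out.println("new Object() rejected: " + isRejected(new Object()));  // true
    }
}
```

The tag values come from java.io.ObjectStreamConstants, so you can compare against the named constants instead of the hex literals if you prefer.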

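Since what we serialize for network transfer is almost always an ordinary Serializable object,
I will focus on how that case is handled:

Handling Serializable objects:

writeOrdinaryObject method ✨ source walkthrough: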

private void writeOrdinaryObject(Object obj,
                                 ObjectStreamClass desc,
                                 boolean unshared)
    throws IOException
{
    if (extendedDebugInfo) {
        debugInfoStack.push(
            (depth == 1 ? "root " : "") + "object (class \"" +
            obj.getClass().getName() + "\", " + obj.toString() + ")");
    }
    try {
        // check that the class can be serialized
        desc.checkSerialize();

        // first write the "object tag", marking what follows as an object
        bout.writeByte(TC_OBJECT);
        // write the class metadata (the class descriptor)
        writeClassDesc(desc, false);
        handles.assign(unshared ? null : obj);
        if (desc.isExternalizable() && !desc.isProxy()) {
            writeExternalData((Externalizable) obj);
        } else {
            // write the object's instance data
            writeSerialData(obj, desc);
        }
    } finally {
        if (extendedDebugInfo) {
            debugInfoStack.pop();
        }
    }
}

writeClassDesc method ✨ source walkthrough:

private void writeClassDesc(ObjectStreamClass desc, boolean unshared)
    throws IOException
{
    int handle;
    if (desc == null) {
        writeNull();
    } else if (!unshared && (handle = handles.lookup(desc)) != -1) {
        writeHandle(handle);
    } else if (desc.isProxy()) {
        writeProxyDesc(desc, unshared);
    } else {
        // for an ordinary class written for the first time, this is the branch that runs
        writeNonProxyDesc(desc, unshared);
    }
}

writeNonProxyDesc method ✨ source walkthrough:

/**
 * Writes class descriptor representing a standard (i.e., not a dynamic
 * proxy) class to stream.
 */
private void writeNonProxyDesc(ObjectStreamClass desc, boolean unshared)
    throws IOException
{
    // first write the "class descriptor tag"
    bout.writeByte(TC_CLASSDESC);
    handles.assign(unshared ? null : desc);

    if (protocol == PROTOCOL_VERSION_1) {
        // do not invoke class descriptor write hook with old protocol
        desc.writeNonProxy(this);
    } else {
        // with the default protocol (version 2), this else branch is the one that runs
        writeClassDescriptor(desc);
    }

    Class<?> cl = desc.forClass();
    bout.setBlockDataMode(true);
    if (cl != null && isCustomSubclass()) {
        ReflectUtil.checkPackageAccess(cl);
    }
    annotateClass(cl);
    bout.setBlockDataMode(false);
    // write the tag that ends the class descriptor's annotation block
    bout.writeByte(TC_ENDBLOCKDATA);

    writeClassDesc(desc.getSuperDesc(), false);
}

Next, here is the source of the writeClassDescriptor method called above:

writeClassDescriptor method ✨ source walkthrough:

protected void writeClassDescriptor(ObjectStreamClass desc)
        throws IOException
{
    desc.writeNonProxy(this);
}

We can see that:

this method is just a thin wrapper around writeNonProxy

writeNonProxy method ✨ source walkthrough:

/**
 * Writes non-proxy class descriptor information to given output stream.
 */
void writeNonProxy(ObjectOutputStream out) throws IOException {
    // first write the class name
    out.writeUTF(name);
    // then write the class's serialVersionUID
    out.writeLong(getSerialVersionUID());

    // compute the class flags
    byte flags = 0;
    if (externalizable) {
        flags |= ObjectStreamConstants.SC_EXTERNALIZABLE;
        int protocol = out.getProtocolVersion();
        if (protocol != ObjectStreamConstants.PROTOCOL_VERSION_1) {
            flags |= ObjectStreamConstants.SC_BLOCK_DATA;
        }
    } else if (serializable) {
        // this is the usual branch,
        // because the classes we serialize normally implement Serializable
        flags |= ObjectStreamConstants.SC_SERIALIZABLE;
    }
    if (hasWriteObjectData) {
        flags |= ObjectStreamConstants.SC_WRITE_METHOD;
    }
    if (isEnum) {
        flags |= ObjectStreamConstants.SC_ENUM;
    }
    // write the flags so that deserialization can interpret what follows
    out.writeByte(flags);

    // before the field descriptions, write the field count, again for deserialization
    out.writeShort(fields.length);
    for (int i = 0; i < fields.length; i++) {
        ObjectStreamField f = fields[i];
        // write the field's type code
        out.writeByte(f.getTypeCode());
        // write the field's name
        out.writeUTF(f.getName());

        // for a non-primitive field (object or array),
        // also write its type signature string
        if (!f.isPrimitive()) {
            out.writeTypeString(f.getTypeString());
        }
    }
}

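As the comments I added to the source show:

this method
writes the type information of the target object.

The write order is:

class name

serialVersionUID of the class

class flags

field count

field information, once per field (type code + field name)

type signature string (only for object and array fields)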
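The order listed above can be checked by reading a serialized object back with a plain DataInputStream. The sketch below is only an illustration under my own assumptions: ClassDescDemo and Point are made-up names, Point has two int fields and an explicit serialVersionUID of 42, and the reader stops after the field count rather than parsing the whole stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;

public class ClassDescDemo {

    public static class Point implements Serializable {
        private static final long serialVersionUID = 42L;
        int x = 1;
        int y = 2;
    }

    /** Returns "className|serialVersionUID|flags|fieldCount" read from the raw stream. */
    public static String describe(Object obj) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            oos.writeObject(obj);
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(sink.toByteArray()));
        in.readShort();                       // STREAM_MAGIC
        in.readShort();                       // STREAM_VERSION
        byte objectTag = in.readByte();       // written by writeOrdinaryObject
        byte descTag = in.readByte();         // written by writeNonProxyDesc
        if (objectTag != ObjectStreamConstants.TC_OBJECT
                || descTag != ObjectStreamConstants.TC_CLASSDESC) {
            throw new IOException("not an ordinary object with a fresh class descriptor");
        }
        String className = in.readUTF();      // out.writeUTF(name)
        long suid = in.readLong();            // out.writeLong(getSerialVersionUID())
        byte flags = in.readByte();           // out.writeByte(flags)
        short fieldCount = in.readShort();    // out.writeShort(fields.length)
        return className + "|" + suid + "|" + flags + "|" + fieldCount;
    }

    public static void main(String[] args) throws IOException {
        // the class name depends on where you put the demo; the rest should be 42, 2 and 2
        System.out.println(describe(new Point()));
    }
}
```

The flags value 2 is SC_SERIALIZABLE, which matches the "usual branch" in writeNonProxy for a class that implements Serializable and has no custom writeObject.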

So what are the field type codes mentioned in the code above?

Field type codes:

The constructor below shows how types map to type codes:

/**
 * Creates an ObjectStreamField representing a field with the given name,
 * signature and unshared setting.
 */
ObjectStreamField(String name, String signature, boolean unshared) {
    if (name == null) {
        throw new NullPointerException();
    }
    this.name = name;
    this.signature = signature.intern();
    this.unshared = unshared;
    field = null;

    switch (signature.charAt(0)) {
        case 'Z': type = Boolean.TYPE; break;
        case 'B': type = Byte.TYPE; break;
        case 'C': type = Character.TYPE; break;
        case 'S': type = Short.TYPE; break;
        case 'I': type = Integer.TYPE; break;
        case 'J': type = Long.TYPE; break;
        case 'F': type = Float.TYPE; break;
        case 'D': type = Double.TYPE; break;
        case 'L':
        case '[': type = Object.class; break;
        default: throw new IllegalArgumentException("illegal signature");
    }
}

To summarize:

Type	Type code
boolean	Z
byte	B
char	C
short	S
int	I
long	J
float	F
double	D
array	[
Object (class/interface)	L
anything else	IllegalArgumentException
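You do not have to read the bytes to see these codes. The public ObjectStreamClass and ObjectStreamField APIs expose them directly. In this sketch, TypeCodeDemo and Sample are hypothetical names; Sample just mixes a few field types so that several rows of the table show up.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.Map;
import java.util.TreeMap;

public class TypeCodeDemo {

    public static class Sample implements Serializable {
        private static final long serialVersionUID = 1L;
        int likeNum;
        Integer age;
        String name;
        long[] history;
        boolean active;
    }

    /** Maps each serializable field name of cl to its type code, sorted by field name. */
    public static Map<String, Character> typeCodes(Class<?> cl) {
        Map<String, Character> codes = new TreeMap<>();
        ObjectStreamClass desc = ObjectStreamClass.lookup(cl);
        for (ObjectStreamField field : desc.getFields()) {
            codes.put(field.getName(), field.getTypeCode());
        }
        return codes;
    }

    public static void main(String[] args) {
        // prints {active=Z, age=L, history=[, likeNum=I, name=L}
        System.out.println(typeCodes(Sample.class));
    }
}
```

Note that Integer and String both map to L: the type code only says "this is an object", and the exact class is carried by the separate type signature string.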

writeSerialData method ✨ source walkthrough:

/**
 * Writes instance data for each serializable class of given object, from
 * superclass to subclass.
 */
private void writeSerialData(Object obj, ObjectStreamClass desc)
    throws IOException
{
    // get the object's class data layout, ordered from superclass to subclass
    ObjectStreamClass.ClassDataSlot[] slots = desc.getClassDataLayout();
    for (int i = 0; i < slots.length; i++) {
        ObjectStreamClass slotDesc = slots[i].desc;
        if (slotDesc.hasWriteObjectMethod()) {	// the class declares its own writeObject method
            PutFieldImpl oldPut = curPut;
            curPut = null;
            SerialCallbackContext oldContext = curContext;

            if (extendedDebugInfo) {
                debugInfoStack.push(
                    "custom writeObject data (class \"" +
                    slotDesc.getName() + "\")");
            }
            try {
                curContext = new SerialCallbackContext(obj, slotDesc);
                bout.setBlockDataMode(true);
                slotDesc.invokeWriteObject(obj, this);
                bout.setBlockDataMode(false);
                bout.writeByte(TC_ENDBLOCKDATA);
            } finally {
                curContext.setUsed();
                curContext = oldContext;
                if (extendedDebugInfo) {
                    debugInfoStack.pop();
                }
            }

            curPut = oldPut;
        } else {
            // default path: write the instance fields
            defaultWriteFields(obj, slotDesc);
        }
    }
}
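The first branch above is taken when a class declares its own private writeObject method. Here is a small sketch of such a class; CustomWriteObjectDemo and Account are names I made up, and the XOR "masking" of the pin is only there to show extra data being written, it is not meant as real protection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CustomWriteObjectDemo {

    public static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        public String owner;
        public transient int pin;   // skipped by default field serialization

        public Account(String owner, int pin) {
            this.owner = owner;
            this.pin = pin;
        }

        // Invoked through slotDesc.invokeWriteObject(obj, this)
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();     // writes the non-transient fields
            out.writeInt(pin ^ 0x5A5A);   // extra data, emitted in block data mode
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            pin = in.readInt() ^ 0x5A5A;  // restore the transient field
        }
    }

    /** Serializes an Account and reads it straight back. */
    public static Account roundTrip(Account account) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            oos.writeObject(account);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            return (Account) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account back = roundTrip(new Account("youzg", 1234));
        System.out.println(back.owner + " " + back.pin);  // prints "youzg 1234"
    }
}
```

Without the custom writeObject and readObject pair, the transient pin would come back as 0, because the default path only writes non-transient fields.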

Next, here is the source that writes instance data the default way:

defaultWriteFields method ✨ source walkthrough:

private void defaultWriteFields(Object obj, ObjectStreamClass desc)
    throws IOException
{
    Class<?> cl = desc.forClass();
    if (cl != null && obj != null && !cl.isInstance(obj)) {
        throw new ClassCastException();
    }

    desc.checkDefaultSerialize();

    int primDataSize = desc.getPrimDataSize();
    if (primVals == null || primVals.length < primDataSize) {
        primVals = new byte[primDataSize];
    }
    // read the object's primitive field values
    // into the primVals byte array
    desc.getPrimFieldValues(obj, primVals);
    // write the primitive data in one go
    bout.write(primVals, 0, primDataSize, false);

    // collect the values of the object-typed fields in the objVals array
    ObjectStreamField[] fields = desc.getFields(false);
    Object[] objVals = new Object[desc.getNumObjFields()];
    int numPrimFields = fields.length - objVals.length;
    desc.getObjFieldValues(obj, objVals);
    for (int i = 0; i < objVals.length; i++) {
        if (extendedDebugInfo) {
            debugInfoStack.push(
                "field (class \"" + desc.getName() + "\", name: \"" +
                fields[numPrimFields + i].getName() + "\", type: \"" +
                fields[numPrimFields + i].getType() + "\")");
        }
        try {
            // recursively serialize each object-typed field value
            writeObject0(objVals[i],
                         fields[numPrimFields + i].isUnshared());
        } finally {
            if (extendedDebugInfo) {
                debugInfoStack.pop();
            }
        }
    }
}

From the code above and my comments, we can conclude: