Ethereum Smart Contract Virtual Machine (EVM): Principles and Implementation

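Ethereum supports contract execution and invocation through its EVM module. On a call, the code is looked up by contract address, an execution environment is built, and the code is loaded into the EVM to run. A typical smart contract workflow is to write the logic in Solidity, compile it into bytecode and metadata, and then deploy the result to Ethereum.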

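Code structure
.
├── analysis.go            // jump destination analysis
├── common.go
├── contract.go            // contract data structure
├── contracts.go           // precompiled contracts
├── errors.go
├── evm.go                 // executor; exposes the external interfaces
├── gas.go                 // gas calculation for calls; base gas cost tiers
├── gas_table.go           // gas cost functions per instruction
├── gen_structlog.go
├── instructions.go        // instruction implementations
├── interface.go
├── interpreter.go         // interpreter; core of the call path
├── intpool.go             // pool of big.Int values
├── int_pool_verifier_empty.go
├── int_pool_verifier.go
├── jump_table.go           // maps each opcode to its operation (execute, gas cost, validation)
├── logger.go               // state logger
├── memory.go               // EVM memory
├── memory_table.go         // memory size functions; compute how much memory an operation needs
├── noop.go
├── opcodes.go              // opcodes and related mappings
├── runtime
│   ├── env.go              // execution environment
│   ├── fuzz.go
│   └── runtime.go          // run interface, used for testing
├── stack.go                // stack
└── stack_table.go          // stack validation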

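Instructions
OpCode
The file opcodes.go defines every OpCode. An OpCode is a single byte, so in the bytecode that a compiled contract produces, each opcode occupies exactly one byte. Opcodes are grouped by function into nine groups (arithmetic, block operations, cryptography, and so on).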

// arithmetic
    const (
        // 0x0 range - arithmetic ops
        STOP OpCode = iota
        ADD
        MUL
        SUB
        DIV
        SDIV
        MOD
        SMOD
        ADDMOD
        MULMOD
        EXP
        SIGNEXTEND
    )
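
To make the "one opcode, one byte" point concrete, here is a minimal sketch of a disassembler for a hand-picked subset of opcodes. The byte values (STOP 0x00, ADD 0x01, MUL 0x02, PUSH1 0x60) are the real EVM encodings; the `names` table and the `disassemble` function are hypothetical helpers written for this illustration and are not part of go-ethereum.

```go
package main

import "fmt"

// OpCode is a single byte, as in opcodes.go.
type OpCode byte

// A small subset of opcodes with their real EVM encodings.
const (
	STOP  OpCode = 0x00
	ADD   OpCode = 0x01
	MUL   OpCode = 0x02
	PUSH1 OpCode = 0x60
)

// names is a hypothetical lookup table used only by this sketch.
var names = map[OpCode]string{STOP: "STOP", ADD: "ADD", MUL: "MUL", PUSH1: "PUSH1"}

// disassemble walks the bytecode one opcode at a time. PUSH1 is followed
// by one byte of immediate data, which is not itself an opcode.
func disassemble(code []byte) []string {
	var out []string
	for pc := 0; pc < len(code); pc++ {
		op := OpCode(code[pc])
		name, ok := names[op]
		if !ok {
			name = fmt.Sprintf("UNKNOWN(0x%02x)", byte(op))
		}
		if op == PUSH1 && pc+1 < len(code) {
			pc++ // consume the immediate byte
			name = fmt.Sprintf("%s 0x%02x", name, code[pc])
		}
		out = append(out, name)
	}
	return out
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, STOP
	for _, line := range disassemble([]byte{0x60, 0x02, 0x60, 0x03, 0x01, 0x00}) {
		fmt.Println(line)
	}
}
```

Six bytes of bytecode yield four instructions here, because each PUSH1 carries one byte of immediate data after its opcode byte.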

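Instruction
The file jump_table.go defines four instruction sets, each of which is essentially an array of length 256. They are named Frontier, Homestead, Byzantium, and Constantinople, after successive stages (hard forks) of the EVM. Each later set builds on the one before it, so the instruction sets stay backward compatible.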

frontierInstructionSet       = NewFrontierInstructionSet()
    homesteadInstructionSet      = NewHomesteadInstructionSet()
    byzantiumInstructionSet      = NewByzantiumInstructionSet()
    constantinopleInstructionSet = NewConstantinopleInstructionSet()
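
The layering of the instruction sets can be sketched as follows. This is a simplified illustration, not the go-ethereum code: the `operation` struct is cut down to two fields and the constructor names are made up. The opcodes shown are real ones, though: DELEGATECALL (0xf4) arrived with Homestead, and STATICCALL (0xfa) and REVERT (0xfd) with Byzantium.

```go
package main

import "fmt"

// operation is a trimmed-down stand-in for the real struct.
type operation struct {
	name  string
	valid bool
}

// Real opcode encodings for the instructions used below.
const (
	ADD          = 0x01
	DELEGATECALL = 0xf4
	STATICCALL   = 0xfa
	REVERT       = 0xfd
)

func newFrontierSet() [256]operation {
	var set [256]operation
	set[ADD] = operation{name: "ADD", valid: true}
	return set
}

// Arrays are values in Go, so the returned table is a copy of the
// earlier fork's table; the later fork only adds entries.
func newHomesteadSet() [256]operation {
	set := newFrontierSet()
	set[DELEGATECALL] = operation{name: "DELEGATECALL", valid: true}
	return set
}

func newByzantiumSet() [256]operation {
	set := newHomesteadSet()
	set[STATICCALL] = operation{name: "STATICCALL", valid: true}
	set[REVERT] = operation{name: "REVERT", valid: true}
	return set
}

func main() {
	frontier, byzantium := newFrontierSet(), newByzantiumSet()
	fmt.Println("REVERT valid:", frontier[REVERT].valid, byzantium[REVERT].valid)
	fmt.Println("ADD valid:   ", frontier[ADD].valid, byzantium[ADD].valid)
}
```

Unassigned slots keep the zero value, so `valid` is false for them; an undefined opcode is rejected by the same check that accepts a defined one.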

Each instruction has the following structure; the comments explain the fields.

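type operation struct {
    // function that executes the operation
    execute executionFunc
    // gas cost of the operation
    gasCost gasFunc
    // stack depth validation
    validateStack stackValidationFunc
    // memory size required by the operation
    memorySize memorySizeFunc

    halts   bool // the operation halts execution
    jumps   bool // the operation jumps (it sets the program counter itself)
    writes  bool // the operation modifies state
    valid   bool // the operation is valid
    reverts bool // the operation reverts state on exit
    returns bool // the operation sets the return data
}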

Take the ADD instruction as an example.

Definition
    ADD: {
        execute:       opAdd,
        gasCost:       constGasFunc(GasFastestStep),
        validateStack: makeStackFunc(2, 1),
        valid:         true,
    },

Operation
Each operation behaves differently. Depending on the instruction, it may affect the stack, memory, or the StateDB.

func opAdd(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        // pop x and peek y (y stays on the stack and is overwritten with the result)
        x, y := stack.pop(), stack.peek()
        // add, then truncate to 256 bits
        math.U256(y.Add(x, y))
        // return x to the integer pool for reuse
        evm.interpreter.intPool.put(x)
        return nil, nil
    }

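Gas cost
Different operations have different base costs and cost functions, all defined in gas_table.go. Taking addition as the example, one ADD has a fixed cost of 3 (GasFastestStep).

// fixed cost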
    func constGasFunc(gas uint64) gasFunc {
        return func(gt params.GasTable, evm *EVM, contract *Contract, stack *Stack, mem *Memory, memorySize uint64) (uint64, error) {
            return gas, nil
        }
    }
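
A small sketch of how a constant gas function and the gas deduction fit together. The `gasFunc` signature is simplified to take no arguments, and the `contract` type with its `useGas` method is a stand-in written for this example, assuming the real UseGas behaves the same way (refuse and leave the balance untouched when gas is insufficient).

```go
package main

import "fmt"

// gasFunc is simplified for this sketch; the real one takes the gas
// table, EVM, contract, stack, memory and memory size.
type gasFunc func() (uint64, error)

// constGas returns a closure that always reports the same cost.
func constGas(gas uint64) gasFunc {
	return func() (uint64, error) { return gas, nil }
}

// contract is a stand-in that only tracks remaining gas.
type contract struct{ Gas uint64 }

// useGas deducts gas and reports false when not enough is left.
func (c *contract) useGas(gas uint64) bool {
	if c.Gas < gas {
		return false
	}
	c.Gas -= gas
	return true
}

func main() {
	addCost := constGas(3) // GasFastestStep
	c := &contract{Gas: 7}
	for i := 0; i < 3; i++ {
		cost, _ := addCost()
		ok := c.useGas(cost)
		fmt.Println(ok, c.Gas)
	}
}
```

With 7 gas, two ADDs succeed and the third is refused with 1 gas left, which is the point where the interpreter would return an out-of-gas error.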

Two further sets of definitions also feed into the gas calculation; they generally serve as units of cost.

//file go-ethereum/core/vm/gas.go
    const (
        GasQuickStep   uint64 = 2
        GasFastestStep uint64 = 3
        GasFastStep    uint64 = 5
        GasMidStep     uint64 = 8
        GasSlowStep    uint64 = 10
        GasExtStep     uint64 = 20

        GasReturn       uint64 = 0
        GasStop         uint64 = 0
        GasContractByte uint64 = 200
    )

    //file go-ethereum/params/gas_table.go
    type GasTable struct {
        ExtcodeSize uint64
        ExtcodeCopy uint64
        Balance     uint64
        SLoad       uint64
        Calls       uint64
        Suicide     uint64

        ExpByte uint64

        // CreateBySuicide occurs when the
        // refunded account is one that does
        // not exist. This logic is similar
        // to call. May be left nil. Nil means
        // not charged.
        CreateBySuicide uint64
    }

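memorySize
ADD does not need to allocate memory, so its memorySize function is left unset and the required memory size stays at the default of 0.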

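Stack validation
First check that the stack holds enough operands, then check that the stack will not exceed its maximum size. For addition only the operand count matters in practice, because the stack shrinks by one after the operation.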

func makeStackFunc(pop, push int) stackValidationFunc {
        return func(stack *Stack) error {
            // depth check: enough operands on the stack
            if err := stack.require(pop); err != nil {
                return err
            }
            // limit check
            //StackLimit       uint64 = 1024 
            if stack.len()+push-pop > int(params.StackLimit) {
                return fmt.Errorf("stack limit reached %d (%d)", stack.len(), params.StackLimit)
            }
            return nil
        }
    }
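
The two checks reduce to simple arithmetic on the stack depth. The following sketch uses a bare depth counter instead of a real stack; `validate` and the two error values are names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
)

const stackLimit = 1024 // same value as params.StackLimit

var (
	errUnderflow = errors.New("stack underflow")
	errLimit     = errors.New("stack limit reached")
)

// validate checks an operation that pops `pop` items and pushes `push`
// items against a stack that currently holds `depth` items.
func validate(depth, pop, push int) error {
	if depth < pop {
		return errUnderflow
	}
	if depth+push-pop > stackLimit {
		return errLimit
	}
	return nil
}

func main() {
	fmt.Println(validate(1, 2, 1))    // ADD with only one operand
	fmt.Println(validate(2, 2, 1))    // ADD with two operands
	fmt.Println(validate(1024, 0, 1)) // PUSH onto a full stack
}
```

An operation with pop=2 and push=1, such as ADD, can never trip the limit check on a stack that was valid before it ran; only net-growing operations like PUSH and DUP can.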

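Smart contracts
A Contract is the EVM's unit of storage for a smart contract and the basic unit that the interpreter executes. It holds the code, the caller, the owner, and gas-related information.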

type Contract struct {
        // CallerAddress is the result of the caller which initialised this
        // contract. However when the "call method" is delegated this value
        // needs to be initialised to that of the caller's caller.
        CallerAddress common.Address
        caller        ContractRef
        self          ContractRef

        jumpdests destinations // result of JUMPDEST analysis.

        Code     []byte
        CodeHash common.Hash
        CodeAddr *common.Address
        Input    []byte

        Gas   uint64
        value *big.Int

        Args []byte

        DelegateCall bool
    }

The EVM natively ships a set of precompiled contracts, defined in contracts.go. They are used mainly for cryptographic operations.

// PrecompiledContractsByzantium contains the default set of pre-compiled Ethereum
// contracts used in the Byzantium release.
var PrecompiledContractsByzantium = map[common.Address]PrecompiledContract{
    common.BytesToAddress([]byte{1}): &ecrecover{},
    common.BytesToAddress([]byte{2}): &sha256hash{},
    common.BytesToAddress([]byte{3}): &ripemd160hash{},
    common.BytesToAddress([]byte{4}): &dataCopy{},
    common.BytesToAddress([]byte{5}): &bigModExp{},
    common.BytesToAddress([]byte{6}): &bn256Add{},
    common.BytesToAddress([]byte{7}): &bn256ScalarMul{},
    common.BytesToAddress([]byte{8}): &bn256Pairing{},
}
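
A rough sketch of how such a table is consulted: look up the address, charge the required gas, then run native code. Everything here is simplified for illustration. The map is keyed by a single byte rather than a common.Address, only SHA-256 (address 2) and the identity copy (address 4) are included, and the gas figures (60 + 12 per word, 15 + 3 per word) are the values I recall from go-ethereum's params package and should be treated as assumptions.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

var errOutOfGas = errors.New("out of gas")

// precompile mirrors the two-method shape of a native contract:
// a gas price for the input and the computation itself.
type precompile interface {
	RequiredGas(input []byte) uint64
	Run(input []byte) []byte
}

// words rounds a byte length up to 32-byte words.
func words(n int) uint64 { return uint64((n + 31) / 32) }

type sha256hash struct{}

// Assumed pricing: 60 base + 12 per word.
func (sha256hash) RequiredGas(in []byte) uint64 { return 60 + 12*words(len(in)) }
func (sha256hash) Run(in []byte) []byte {
	h := sha256.Sum256(in)
	return h[:]
}

type dataCopy struct{}

// Assumed pricing: 15 base + 3 per word.
func (dataCopy) RequiredGas(in []byte) uint64 { return 15 + 3*words(len(in)) }
func (dataCopy) Run(in []byte) []byte         { return append([]byte(nil), in...) }

// precompiles is keyed by the low byte of the address (a simplification).
var precompiles = map[byte]precompile{2: sha256hash{}, 4: dataCopy{}}

// runPrecompile charges the required gas first and only then runs the contract.
func runPrecompile(addr byte, input []byte, gas uint64) ([]byte, uint64, error) {
	p, ok := precompiles[addr]
	if !ok {
		return nil, gas, fmt.Errorf("no precompile at address %d", addr)
	}
	cost := p.RequiredGas(input)
	if gas < cost {
		return nil, 0, errOutOfGas
	}
	return p.Run(input), gas - cost, nil
}

func main() {
	out, left, err := runPrecompile(2, []byte("abc"), 100)
	fmt.Printf("%x %d %v\n", out, left, err)
}
```

Because the work is done by native Go code instead of interpreted bytecode, a precompile can price an expensive primitive with one formula rather than metering every instruction.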

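Execution machine

The EVM stack holds operands. Each operand is a big.Int used as a 256-bit word, which is why the EVM is commonly described as a 256-bit virtual machine. When an opcode executes, its operands are popped from the top of the stack and used as the operation's arguments.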

type Stack struct {
    data []*big.Int
}

func (st *Stack) push(d *big.Int) {
    // NOTE push limit (1024) is checked in baseCheck
    //stackItem := new(big.Int).Set(d)
    //st.data = append(st.data, stackItem)
    st.data = append(st.data, d)
}

func (st *Stack) peek() *big.Int {
    return st.data[st.len()-1]
}

func (st *Stack) pop() (ret *big.Int) {
    ret = st.data[len(st.data)-1]
    st.data = st.data[:len(st.data)-1]
    return
}

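Memory
Memory serves the memory instructions (MLOAD, MSTORE, MSTORE8) and the copying of arguments for contract calls (CALL, CALLCODE).

The memory data structure maintains a byte slice. Reads and writes through MLOAD and MSTORE must specify an offset and a length so that the data is addressed precisely.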

type Memory struct {
        store       []byte
        lastGasCost uint64
    }

    // Set sets offset + size to value
    func (m *Memory) Set(offset, size uint64, value []byte) {
        // length of store may never be less than offset + size.
        // The store should be resized PRIOR to setting the memory
        if size > uint64(len(m.store)) {
            panic("INVALID memory: store empty")
        }

        // It's possible the offset is greater than 0 and size equals 0. This is because
        // the calcMemSize (common.go) could potentially return 0 when size is zero (NO-OP)
        if size > 0 {
            copy(m.store[offset:offset+size], value)
        }
    }

    func (self *Memory) Get(offset, size int64) (cpy []byte) {
        if size == 0 {
            return nil
        }

        if len(self.store) > int(offset) {
            cpy = make([]byte, size)
            copy(cpy, self.store[offset:offset+size])

            return
        }

        return
    }
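
Note that Set assumes the store has already been grown. The sketch below shows that step: the requested size is rounded up to whole 32-byte words before resizing. The lowercase `memory` type is a stripped-down stand-in for this example, and `toWordSize` omits the overflow guard that a production version would need.

```go
package main

import "fmt"

// memory is a stripped-down stand-in for the Memory type above.
type memory struct{ store []byte }

// toWordSize rounds a byte size up to a number of 32-byte words.
// (No overflow guard here; this is only a sketch.)
func toWordSize(size uint64) uint64 { return (size + 31) / 32 }

// resize grows the store to at least size bytes; it never shrinks.
func (m *memory) resize(size uint64) {
	if uint64(len(m.store)) < size {
		m.store = append(m.store, make([]byte, size-uint64(len(m.store)))...)
	}
}

// set copies value into the store at offset; the caller must have resized first.
func (m *memory) set(offset uint64, value []byte) {
	copy(m.store[offset:offset+uint64(len(value))], value)
}

// get returns a copy of size bytes starting at offset.
func (m *memory) get(offset, size uint64) []byte {
	out := make([]byte, size)
	copy(out, m.store[offset:offset+size])
	return out
}

func main() {
	m := &memory{}
	// Writing one byte at offset 33 needs 34 bytes, rounded up to 2 words = 64 bytes.
	m.resize(toWordSize(33+1) * 32)
	m.set(33, []byte{0xab})
	fmt.Println(len(m.store), m.get(32, 2))
}
```

Growing in whole words keeps memory gas accounting simple, since the cost is charged per 32-byte word rather than per byte.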

Memory operations

func opMload(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        offset := stack.pop()
        val := evm.interpreter.intPool.get().SetBytes(memory.Get(offset.Int64(), 32))
        stack.push(val)

        evm.interpreter.intPool.put(offset)
        return nil, nil
    }

    func opMstore(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        // pop value of the stack
        mStart, val := stack.pop(), stack.pop()
        memory.Set(mStart.Uint64(), 32, math.PaddedBigBytes(val, 32))

        evm.interpreter.intPool.put(mStart, val)
        return nil, nil
    }

    func opMstore8(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        off, val := stack.pop().Int64(), stack.pop().Int64()
        memory.store[off] = byte(val & 0xff)

        return nil, nil
    }

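StateDB
A contract does not hold its data itself, so where is the data kept? A contract and the calls made to it resemble a database log: the log records the contract definition and every operation applied to it, and replaying those operations yields the current result. Replaying on every access would be far too slow, so this data is persisted in the StateDB. The opcode set defines two instructions, SSTORE and SLOAD, for writing and reading a contract's current state in that database.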

func opSload(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        loc := common.BigToHash(stack.pop())
        val := evm.StateDB.GetState(contract.Address(), loc).Big()
        stack.push(val)
        return nil, nil
    }

    func opSstore(pc *uint64, evm *EVM, contract *Contract, memory *Memory, stack *Stack) ([]byte, error) {
        loc := common.BigToHash(stack.pop())
        val := stack.pop()
        evm.StateDB.SetState(contract.Address(), loc, common.BigToHash(val))

        evm.interpreter.intPool.put(val)
        return nil, nil
    }
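
Conceptually, GetState and SetState address a key/value store that is namespaced per contract address. A toy model, assuming nothing about the real trie-backed implementation (the `address`, `hash` and `stateDB` types below are invented for this example):

```go
package main

import "fmt"

type address [20]byte
type hash [32]byte

// stateDB is a toy model: each account owns its own slot -> value map.
type stateDB struct {
	storage map[address]map[hash]hash
}

func newStateDB() *stateDB {
	return &stateDB{storage: make(map[address]map[hash]hash)}
}

// GetState returns the zero hash for slots that were never written.
func (s *stateDB) GetState(a address, key hash) hash {
	return s.storage[a][key]
}

func (s *stateDB) SetState(a address, key, value hash) {
	if s.storage[a] == nil {
		s.storage[a] = make(map[hash]hash)
	}
	s.storage[a][key] = value
}

func main() {
	db := newStateDB()
	contract := address{0x01}
	slot0 := hash{} // a contract's first state variable typically lives in slot 0

	var v hash
	v[31] = 123 // 123 as a big-endian 32-byte word
	db.SetState(contract, slot0, v)

	fmt.Println(db.GetState(contract, slot0)[31])      // slot written above
	fmt.Println(db.GetState(address{0x02}, slot0)[31]) // another account, same slot
}
```

The same slot number under a different address is a different cell, which is why the storage context chosen by each call variant matters so much.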

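Execution flow
The entry points are defined in evm.go. Their job is to assemble the execution environment (code, caller relationship, arguments, and so on).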

func (evm *EVM) Call(caller ContractRef, addr common.Address, input []byte, gas uint64, value *big.Int) (ret []byte, leftOverGas uint64, err error) {
        if evm.vmConfig.NoRecursion && evm.depth > 0 {
            return nil, gas, nil
        }

        // check the contract call depth
        if evm.depth > int(params.CallCreateDepth) {
            return nil, gas, ErrDepth
        }
        // check the caller's balance
        if !evm.Context.CanTransfer(evm.StateDB, caller.Address(), value) {
            return nil, gas, ErrInsufficientBalance
        }

        var (
            to       = AccountRef(addr)
            // snapshot the current state so that it can be restored on error
            snapshot = evm.StateDB.Snapshot()
        )
        if !evm.StateDB.Exist(addr) {
            // the callee does not exist yet: create its stateObject unless the call can be skipped
            precompiles := PrecompiledContractsHomestead
            if evm.ChainConfig().IsByzantium(evm.BlockNumber) {
                precompiles = PrecompiledContractsByzantium
            }
            if precompiles[addr] == nil && evm.ChainConfig().IsEIP158(evm.BlockNumber) && value.Sign() == 0 {
                return nil, gas, nil
            }
            evm.StateDB.CreateAccount(addr)
        }
        // a call may carry value: transfer it from the caller to the callee
        evm.Transfer(evm.StateDB, caller.Address(), to.Address(), value)

        // build the contract environment
        contract := NewContract(caller, to, value, gas)
        contract.SetCallCode(&addr, evm.StateDB.GetCodeHash(addr), evm.StateDB.GetCode(addr))

        start := time.Now()

        // Capture the tracer start/end events in debug mode
        if evm.vmConfig.Debug && evm.depth == 0 {
            evm.vmConfig.Tracer.CaptureStart(caller.Address(), addr, false, input, gas, value)

            defer func() { // Lazy evaluation of the parameters
                evm.vmConfig.Tracer.CaptureEnd(ret, gas-contract.Gas, time.Since(start), err)
            }()
        }
        // execute
        ret, err = run(evm, contract, input)

        // When an error was returned by the EVM or when setting the creation code
        // above we revert to the snapshot and consume any gas remaining. Additionally
        // when we're in homestead this also counts for code storage gas errors.
        if err != nil {
            // roll back on error
            evm.StateDB.RevertToSnapshot(snapshot)
            if err != errExecutionReverted {
                contract.UseGas(contract.Gas)
            }
        }
        return ret, contract.Gas, err
    }
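
The error handling at the end of Call follows a compact rule: any error rolls the state back to the snapshot, and every error except a revert also burns the remaining gas. A toy model of that rule follows. The `state` type copies the whole map for each snapshot, which is an assumption made for brevity and not how the real StateDB records changes; `call` and its callback signature are likewise invented for the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errExecutionReverted = errors.New("execution reverted")
	errOutOfGas          = errors.New("out of gas")
)

// state is a toy key/value state with copy-based snapshots.
type state struct {
	data      map[string]int
	snapshots []map[string]int
}

// snapshot stores a copy of the current data and returns its id.
func (s *state) snapshot() int {
	cp := make(map[string]int, len(s.data))
	for k, v := range s.data {
		cp[k] = v
	}
	s.snapshots = append(s.snapshots, cp)
	return len(s.snapshots) - 1
}

// revertTo restores the data saved under id and drops later snapshots.
func (s *state) revertTo(id int) {
	s.data = s.snapshots[id]
	s.snapshots = s.snapshots[:id]
}

// call runs body against the state. body returns the gas it has left.
func call(s *state, gas uint64, body func(*state, uint64) (uint64, error)) (uint64, error) {
	id := s.snapshot()
	left, err := body(s, gas)
	if err != nil {
		s.revertTo(id)
		if err != errExecutionReverted {
			left = 0 // every other error consumes all remaining gas
		}
	}
	return left, err
}

func main() {
	s := &state{data: map[string]int{"num": 123}}
	left, err := call(s, 100, func(s *state, gas uint64) (uint64, error) {
		s.data["num"] = 999
		return gas - 30, errExecutionReverted
	})
	fmt.Println(s.data["num"], left, err)
}
```

In the example the write to `num` is undone while the 70 unused gas is handed back, which is what distinguishes REVERT from a failure such as running out of gas.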

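There are four similar functions. See the references at the end for the detailed differences.

Call: A -> B; A and B each run in their own environment.

CallCode: like Call, except that the storage used is the caller's rather than the callee's.

DelegateCall: like CallCode, except that msg.sender differs (the original caller is preserved).

StaticCall: like Call, but it may not modify state.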
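
The differences between the four variants come down to which sender, storage owner and value the callee's code observes. The sketch below models only that bookkeeping; the `frame` type, `callKind` constants and `child` function are illustrative names and do not exist in go-ethereum.

```go
package main

import "fmt"

// frame describes what the running code observes.
type frame struct {
	Sender   string // msg.sender
	Self     string // account whose storage and balance are used
	Code     string // account whose code is executing
	Value    int    // msg.value
	ReadOnly bool   // state modification forbidden
}

type callKind int

const (
	Call callKind = iota
	CallCode
	DelegateCall
	StaticCall
)

// child derives the frame seen by target's code when the code running
// in cur performs the given kind of call.
func child(kind callKind, cur frame, target string, value int) frame {
	switch kind {
	case CallCode:
		// target's code, but the caller's own storage
		return frame{Sender: cur.Self, Self: cur.Self, Code: target, Value: value, ReadOnly: cur.ReadOnly}
	case DelegateCall:
		// like CallCode, but sender and value are inherited
		return frame{Sender: cur.Sender, Self: cur.Self, Code: target, Value: cur.Value, ReadOnly: cur.ReadOnly}
	case StaticCall:
		// like Call, but read-only and without value
		return frame{Sender: cur.Self, Self: target, Code: target, Value: 0, ReadOnly: true}
	default: // Call
		return frame{Sender: cur.Self, Self: target, Code: target, Value: value, ReadOnly: cur.ReadOnly}
	}
}

func main() {
	// A was called by "user" with value 5 and now calls B in each of the four ways.
	a := frame{Sender: "user", Self: "A", Code: "A", Value: 5}
	for _, k := range []callKind{Call, CallCode, DelegateCall, StaticCall} {
		fmt.Printf("%+v\n", child(k, a, "B", 1))
	}
}
```

DelegateCall is the variant that makes library-style contracts possible: B's code runs as if it were part of A, seeing A's storage and A's original caller.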

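Once the Contract and its arguments are built, the run function is called. It checks whether the target is one of the precompiled native contracts; if so it runs the precompile, otherwise it hands the contract to the interpreter.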

// run runs the given contract and takes care of running precompiles with a fallback to the byte code interpreter.
    func run(evm *EVM, contract *Contract, input []byte) ([]byte, error) {
        if contract.CodeAddr != nil {
            precompiles := PrecompiledContractsHomestead
            if evm.ChainConfig().IsByzantium(evm.BlockNumber) {
                precompiles = PrecompiledContractsByzantium
            }
            if p := precompiles[*contract.CodeAddr]; p != nil {
                return RunPrecompiledContract(p, input, contract)
            }
        }
        return evm.interpreter.Run(contract, input)
    }

Interpreter

func (in *Interpreter) Run(contract *Contract, input []byte) (ret []byte, err error) {

        // reset the return data left over from a previous call
        in.returnData = nil

        var (
            op    OpCode        // current opcode
            mem   = NewMemory() // memory
            stack = newstack()  // stack
            pc   = uint64(0)    // program counter
            cost uint64         // gas cost
            pcCopy  uint64      // used for debugging
            gasCopy uint64      // used for debugging
            logged  bool        // used for debugging
        )
        contract.Input = input  // input of the call

        // ***** omitted *****

        for atomic.LoadInt32(&in.evm.abort) == 0 {
            // fetch the next opcode and look up its operation
            op = contract.GetOp(pc)
            operation := in.cfg.JumpTable[op]
            // check that the opcode is valid
            if !operation.valid {
                return nil, fmt.Errorf("invalid opcode 0x%x", int(op))
            }
            // validate the stack
            if err := operation.validateStack(stack); err != nil {
                return nil, err
            }
            // enforce restrictions on state modification (read-only calls)
            if err := in.enforceRestrictions(op, operation, stack); err != nil {
                return nil, err
            }

            var memorySize uint64
            // compute the memory size the operation needs, based on its operands
            if operation.memorySize != nil {
                memSize, overflow := bigUint64(operation.memorySize(stack))
                if overflow {
                    return nil, errGasUintOverflow
                }
                // round the memory size up to a multiple of 32 bytes
                if memorySize, overflow = math.SafeMul(toWordSize(memSize), 32); overflow {
                    return nil, errGasUintOverflow
                }
            }
            // compute the cost of this operation with its gas function, then charge it
            cost, err = operation.gasCost(in.gasTable, in.evm, contract, stack, mem, memorySize)
            if err != nil || !contract.UseGas(cost) {
                return nil, ErrOutOfGas  // not enough gas: abort
            }
            if memorySize > 0 {
                // the operation needs memory: expand it
                mem.Resize(memorySize)  
            }

            // execute the operation
            res, err := operation.execute(&pc, in.evm, contract, mem, stack)

            if verifyPool {
                verifyIntegerPool(in.intPool)
            }
            // if the operation returns data, record it
            if operation.returns {
                in.returnData = res
            }

            switch {
            case err != nil:
                return nil, err       // error
            case operation.reverts:   // REVERT: roll back
                return res, errExecutionReverted
            case operation.halts:
                return res, nil       // halt
            case !operation.jumps:    // not a jump: advance to the next instruction
                pc++
            }
        }
        return nil, nil
    }
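
The loop above can be condensed into a runnable toy interpreter that keeps the same order of steps: fetch, validity check, stack check, gas charge, execute, then halt or advance. This is a sketch under loud simplifications: stack words are uint64 instead of 256-bit integers, there is no memory or state, and only four opcodes exist. The gas numbers match the tiers listed earlier (3 for PUSH1 and ADD, 5 for MUL).

```go
package main

import (
	"errors"
	"fmt"
)

const (
	STOP  = 0x00
	ADD   = 0x01
	MUL   = 0x02
	PUSH1 = 0x60
)

var (
	errOutOfGas       = errors.New("out of gas")
	errStackUnderflow = errors.New("stack underflow")
)

// operation is a reduced jump-table entry.
type operation struct {
	execute func(pc *int, code []byte, stack *[]uint64)
	gas     uint64
	pops    int
	halts   bool
	valid   bool
}

// arith builds an executor that pops two words and pushes f(x, y).
func arith(f func(x, y uint64) uint64) func(*int, []byte, *[]uint64) {
	return func(_ *int, _ []byte, st *[]uint64) {
		s := *st
		x, y := s[len(s)-1], s[len(s)-2]
		*st = append(s[:len(s)-2], f(x, y))
	}
}

var jumpTable = [256]operation{
	STOP: {execute: func(*int, []byte, *[]uint64) {}, halts: true, valid: true},
	ADD:  {execute: arith(func(x, y uint64) uint64 { return x + y }), gas: 3, pops: 2, valid: true},
	MUL:  {execute: arith(func(x, y uint64) uint64 { return x * y }), gas: 5, pops: 2, valid: true},
	PUSH1: {execute: func(pc *int, code []byte, st *[]uint64) {
		*pc++ // step onto the immediate byte
		var v uint64
		if *pc < len(code) {
			v = uint64(code[*pc])
		}
		*st = append(*st, v)
	}, gas: 3, valid: true},
}

// run executes code and returns the final stack and the remaining gas.
func run(code []byte, gas uint64) ([]uint64, uint64, error) {
	var stack []uint64
	for pc := 0; pc < len(code); pc++ {
		op := jumpTable[code[pc]]
		if !op.valid {
			return nil, gas, fmt.Errorf("invalid opcode 0x%x", code[pc])
		}
		if len(stack) < op.pops {
			return nil, gas, errStackUnderflow
		}
		if gas < op.gas {
			return nil, gas, errOutOfGas
		}
		gas -= op.gas
		op.execute(&pc, code, &stack)
		if op.halts {
			break
		}
	}
	return stack, gas, nil
}

func main() {
	// PUSH1 111, PUSH1 2, PUSH1 123, MUL, ADD, STOP  => 123*2 + 111
	code := []byte{0x60, 0x6f, 0x60, 0x02, 0x60, 0x7b, 0x02, 0x01, 0x00}
	stack, left, err := run(code, 100)
	fmt.Println(stack, left, err)
}
```

All checks run before `execute`, so an instruction that fails validation or cannot be paid for never touches the stack.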

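Solidity example
As with other languages, once a bytecode machine exists, higher-level languages can be built on top of it. Solidity provides such a language and compiler, which makes contracts easier to write and helps promote DApp development on Ethereum.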

pragma solidity ^0.4.17;

contract simple {
      uint num = 0;
    function simple(){
        num = 123;
    }
    
  
    function add(uint i) public returns(uint){
        uint m = 111;
        num =num * i+m;
        return num;
    } 

}

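Generated opcodes

JUMPDEST: marks a jump target, such as a function entry.

PUSH + JUMPI/JUMP: comparable to calling a function.

CALLDATASIZE + CALLDATALOAD: read the function selector and arguments from the call data.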

.code
  PUSH 80           contract simple {\n      uint ...
  PUSH 40           contract simple {\n      uint ...
  MSTORE            contract simple {\n      uint ...
  PUSH 0            0  // initial value of the member variable
  DUP1          uint num = 0
  // the next instruction shows that member variables are written to the StateDB during initialization
  SSTORE            uint num = 0
  CALLVALUE             function simple(){\n        nu...
  DUP1          olidity ^
  ISZERO            a 
  PUSH [tag] 1          a 
  JUMPI             a 
  PUSH 0            r
  DUP1          o
  REVERT            .17;\n\ncontra
tag 1           a 
  // the constructor body starts here
  JUMPDEST          a 
  POP           function simple(){\n        nu...
  PUSH 7B           123
  PUSH 0            num  
  DUP2          num = 123
  SWAP1             num = 123
  // every change to a member variable ends up written to the StateDB
  SSTORE            num = 123
  POP           num = 123
  PUSH #[$] 0000000000000000000000000000000000000000000000000000000000000000            contract simple {\n      uint ...
  DUP1          contract simple {\n      uint ...
  PUSH [$] 0000000000000000000000000000000000000000000000000000000000000000         contract simple {\n      uint ...
  PUSH 0            contract simple {\n      uint ...
  CODECOPY          contract simple {\n      uint ...
  PUSH 0            contract simple {\n      uint ...
  RETURN            contract simple {\n      uint ...
  // the creation code above finishes by returning the runtime code; it does not fall through into the runtime section
.data
  0:
    .code
      // the following code handles the call data (function dispatch)
      PUSH 80           contract simple {\n      uint ...
      PUSH 40           contract simple {\n      uint ...
      MSTORE            contract simple {\n      uint ...
      PUSH 4            contract simple {\n      uint ...
      CALLDATASIZE          contract simple {\n      uint ...
      LT            contract simple {\n      uint ...
      PUSH [tag] 1          contract simple {\n      uint ...
      JUMPI             contract simple {\n      uint ...
      PUSH 0            contract simple {\n      uint ...
      CALLDATALOAD          contract simple {\n      uint ...
      PUSH 100000000000000000000000000000000000000000000000000000000            contract simple {\n      uint ...
      SWAP1             contract simple {\n      uint ...
      DIV           contract simple {\n      uint ...
      PUSH FFFFFFFF         contract simple {\n      uint ...
      AND           contract simple {\n      uint ...
      DUP1          contract simple {\n      uint ...
      PUSH 1003E2D2         contract simple {\n      uint ...
      EQ            contract simple {\n      uint ...
      PUSH [tag] 2          contract simple {\n      uint ...
      JUMPI             contract simple {\n      uint ...
    tag 1           contract simple {\n      uint ...
      JUMPDEST          contract simple {\n      uint ...
      PUSH 0            contract simple {\n      uint ...
      DUP1          contract simple {\n      uint ...
      REVERT            contract simple {\n      uint ...
    tag 2           function add(uint i) public re...
      JUMPDEST          function add(uint i) public re...
      CALLVALUE             function add(uint i) public re...
      DUP1          olidity ^
      ISZERO            a 
      PUSH [tag] 3          a 
      JUMPI             a 
      PUSH 0            r
      DUP1          o
      REVERT            .17;\n\ncontra
    tag 3           a 
      JUMPDEST          a 
      POP           function add(uint i) public re...
      PUSH [tag] 4          function add(uint i) public re...
      PUSH 4            function add(uint i) public re...
      DUP1          function add(uint i) public re...
      CALLDATASIZE          function add(uint i) public re...
      SUB           function add(uint i) public re...
      DUP2          function add(uint i) public re...
      ADD           function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      DUP1          function add(uint i) public re...
      DUP1          function add(uint i) public re...
      CALLDATALOAD          function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      PUSH 20           function add(uint i) public re...
      ADD           function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      SWAP3             function add(uint i) public re...
      SWAP2             function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      POP           function add(uint i) public re...
      POP           function add(uint i) public re...
      POP           function add(uint i) public re...
      PUSH [tag] 5          function add(uint i) public re...
      JUMP          function add(uint i) public re...
    tag 4           function add(uint i) public re...
      JUMPDEST          function add(uint i) public re...
      PUSH 40           function add(uint i) public re...
      MLOAD             function add(uint i) public re...
      DUP1          function add(uint i) public re...
      DUP3          function add(uint i) public re...
      DUP2          function add(uint i) public re...
      MSTORE            function add(uint i) public re...
      PUSH 20           function add(uint i) public re...
      ADD           function add(uint i) public re...
      SWAP2             function add(uint i) public re...
      POP           function add(uint i) public re...
      POP           function add(uint i) public re...
      PUSH 40           function add(uint i) public re...
      MLOAD             function add(uint i) public re...
      DUP1          function add(uint i) public re...
      SWAP2             function add(uint i) public re...
      SUB           function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      RETURN            function add(uint i) public re...
    tag 5           function add(uint i) public re...
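      // function body
      JUMPDEST            function add(uint i) public re...
      // the function's own code starts below
      PUSH 0            uint // local variables live on the stack
      DUP1          uint m
      PUSH 6F           111
      SWAP1             uint m = 111
      POP           uint m = 111 // from PUSH 0 to here: declare the local variable and assign it
      DUP1          m
      DUP4          i            // load the argument
      PUSH 0            num
      SLOAD             num      // together with the previous instruction, reads the member variable
      MUL           num * i      // multiply
      ADD           num * i+m    // add
      PUSH 0            num
      DUP2          num =num * i+m
      SWAP1             num =num * i+m   // these three instructions set up the assignment
      SSTORE            num =num * i+m   // store the member variable
      POP           num =num * i+m
      // the next instructions implement return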
      PUSH 0            num
      SLOAD             num
      SWAP2             return num    
      POP           return num
      POP           function add(uint i) public re...
      SWAP2             function add(uint i) public re...
      SWAP1             function add(uint i) public re...
      POP           function add(uint i) public re...
      JUMP [out]            function add(uint i) public re...
    .data
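
The dispatch sequence at the start of the runtime code (CALLDATASIZE, LT, CALLDATALOAD, DIV, AND, EQ) can be paraphrased in Go. Dividing the first 32-byte word by 2^224 and masking with 0xFFFFFFFF is the same as reading the first four bytes of the call data as a big-endian integer. The selector 0x1003e2d2 is taken from the `PUSH 1003E2D2` line in the listing; the `dispatch` function and its return values are made up for this sketch, and the argument is narrowed to 64 bits for brevity.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// addSelector is the value compared against in the listing (PUSH 1003E2D2).
const addSelector uint32 = 0x1003e2d2

var errRevert = errors.New("revert")

// dispatch mirrors the runtime prologue: reject call data shorter than
// 4 bytes, extract the selector, and route to the matching function.
func dispatch(calldata []byte) (string, uint64, error) {
	if len(calldata) < 4 { // CALLDATASIZE < 4 -> jump to the REVERT tag
		return "", 0, errRevert
	}
	selector := binary.BigEndian.Uint32(calldata[:4])
	if selector != addSelector { // no function matched -> REVERT
		return "", 0, errRevert
	}
	// CALLDATALOAD at offset 4 reads one 32-byte word, zero-padded if short.
	var word [32]byte
	copy(word[:], calldata[4:])
	arg := binary.BigEndian.Uint64(word[24:]) // low 8 bytes only, for this sketch
	return "add", arg, nil
}

func main() {
	// Call data for add(2): 4-byte selector followed by one 32-byte argument.
	calldata := make([]byte, 36)
	binary.BigEndian.PutUint32(calldata, addSelector)
	calldata[35] = 2

	name, arg, err := dispatch(calldata)
	fmt.Println(name, arg, err)
}
```

A contract with more functions simply repeats the compare-and-jump pair once per selector, which is why the prologue grows linearly with the number of public functions.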

References

Call, CallCode, DelegateCall: https://ethereum.stackexchange.com/questions/3667/difference-between-call-callcode-and-delegatecall

Solidity contract structure: https://solidity.readthedocs.io/en/develop/structure-of-a-contract.html#

Runtime bytecode vs. bytecode: https://ethereum.stackexchange.com/questions/13086/solc-bin-vs-bin-runtime/13087#13087

remix: https://remix.ethereum.org/

Reposted from: (魂祭心) https://my.oschina.net/hunjixin/blog/1805306
