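The HotSpot interpreter module (hotspot\src\share\vm\interpreter) has two implementations: a C++-based interpreter and an assembly-based template interpreter. HotSpot uses the faster template interpreter by default.
The relevant files are:
bytecodeInterpreter* + cppInterpreter*
templateTable* + templateInterpreter*
In each pair, the former implements the interpretation of the bytecodes and the latter provides the interpreter runtime; together they do the interpreting. Here we only look at the template interpreter.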
The template interpreter in turn consists of three parts:
templateInterpreterGenerator: the interpreter generator
templateTable: the bytecode implementations
templateInterpreter: the interpreter

class TemplateInterpreter: public AbstractInterpreter {
  friend class VMStructs;
  friend class InterpreterMacroAssembler;
  friend class TemplateInterpreterGenerator;
  friend class TemplateTable;
  friend class CodeCacheExtensions;
  // friend class Interpreter;
 public:
  enum MoreConstants {
    number_of_return_entries = number_of_states,  // number of return entry points
    number_of_deopt_entries  = number_of_states,  // number of deoptimization entry points
    number_of_return_addrs   = number_of_states   // number of return addresses
  };

 protected:
  static address _throw_ArrayIndexOutOfBoundsException_entry;
  static address _throw_ArrayStoreException_entry;
  static address _throw_ArithmeticException_entry;
  static address _throw_ClassCastException_entry;
  static address _throw_NullPointerException_entry;
  static address _throw_exception_entry;
  static address _throw_StackOverflowError_entry;
  static address _remove_activation_entry;                  // continuation address if an exception is not handled by current frame
#ifdef HOTSWAP
  static address _remove_activation_preserving_args_entry;  // continuation address when current frame is being popped
#endif // HOTSWAP

#ifndef PRODUCT
  static EntryPoint _trace_code;
#endif // !PRODUCT
  static EntryPoint _return_entry[number_of_return_entries];  // entry points to return to from a call
  static EntryPoint _earlyret_entry;                          // entry point to return early from a call
  static EntryPoint _deopt_entry[number_of_deopt_entries];    // entry points to return to from a deoptimization
  static EntryPoint _continuation_entry;
  static EntryPoint _safept_entry;

  static address _invoke_return_entry[number_of_return_addrs];           // for invokestatic, invokespecial, invokevirtual return entries
  static address _invokeinterface_return_entry[number_of_return_addrs];  // for invokeinterface return entries
  static address _invokedynamic_return_entry[number_of_return_addrs];    // for invokedynamic return entries

  static DispatchTable _active_table;                  // the active dispatch table (used by the interpreter for dispatch)
  static DispatchTable _normal_table;                  // the normal dispatch table (used to set the active table in normal mode)
  static DispatchTable _safept_table;                  // the safepoint dispatch table (used to set the active table for safepoints)
  static address _wentry_point[DispatchTable::length]; // wide instructions only (vtos tosca always)

 public:
  ...
  static int InterpreterCodeSize;
};
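The class holds many address variables. EntryPoint is essentially an array of address, and so is DispatchTable.
The template interpreter is made up of a series of routines, and these address variables are exactly that: each one is the entry address of a routine, such as the exception-handling routines, the routines for the invoke instructions, the safepoint routines used by GC, and so on.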
To make this concrete, we all know what the bytecode of a method looks like:

public void f();
   0: aload_0
   1: invokespecial #5    // Method A.f:()V
   4: getstatic     #2    // Field java/lang/System.out:Ljava/io/PrintStream;
   7: ldc           #6    // String ff
   9: invokevirtual #4    // Method java/io/PrintStream.println:(Ljava/lang/String;)V
  12: return
If we had to write an interpreter ourselves, it would probably be a loop around a switch that dispatches each opcode to its own routine. The code of each routine is always the same template: handling aload_0 always means loading local variable slot 0 onto the top of the stack. So these code templates can be prepared ahead of time, before any bytecode is dispatched. That is exactly what templateInterpreterGenerator does; its generate_all() function initialises all the routines:

void TemplateInterpreterGenerator::generate_all() {
  // set up the slow_signature_handler routine
  { CodeletMark cm(_masm, "slow signature handler");
    AbstractInterpreter::_slow_signature_handler = generate_slow_signature_handler();
  }
  // set up the error_exit routines
  { CodeletMark cm(_masm, "error exits");
    _unimplemented_bytecode    = generate_error_exit("unimplemented bytecode");
    _illegal_bytecode_sequence = generate_error_exit("illegal bytecode sequence - method not verified");
  }
  ......
}
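For contrast, the naive "loop around a switch" interpreter described above might look roughly like the following. This is only an illustrative sketch, not HotSpot code: the function name run_switch_interpreter and the tiny opcode subset are made up for the example, and only the opcode values are taken from the JVM specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A handful of int-only opcodes; the numeric values follow the JVM specification.
enum Opcode : uint8_t {
  ICONST_1 = 0x04,
  ILOAD    = 0x15,
  ISTORE   = 0x36,
  IADD     = 0x60,
  IRETURN  = 0xac
};

// Hypothetical helper: interprets `code` and returns the value handed to ireturn,
// or -1 if an unknown opcode is hit or the code runs off the end.
int run_switch_interpreter(const std::vector<uint8_t>& code, int num_locals) {
  std::vector<int> locals(num_locals, 0);
  std::vector<int> stack;
  std::size_t bcp = 0;  // bytecode pointer, as an index into `code`
  while (bcp < code.size()) {
    switch (code[bcp]) {
      case ICONST_1:
        stack.push_back(1);
        bcp += 1;
        break;
      case ILOAD:   // operand: one-byte local index
        stack.push_back(locals[code[bcp + 1]]);
        bcp += 2;
        break;
      case ISTORE:  // operand: one-byte local index
        assert(!stack.empty());
        locals[code[bcp + 1]] = stack.back();
        stack.pop_back();
        bcp += 2;
        break;
      case IADD: {
        assert(stack.size() >= 2);
        int b = stack.back(); stack.pop_back();
        int a = stack.back(); stack.pop_back();
        stack.push_back(a + b);
        bcp += 1;
        break;
      }
      case IRETURN:
        assert(!stack.empty());
        return stack.back();
      default:
        return -1;  // unknown opcode
    }
  }
  return -1;
}
```

Every trip through the loop pays for the switch and the branch back to the top. The template interpreter avoids this by generating the machine code for each case up front and letting each routine jump straight to the next one.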
Also, since machine code is now involved, templateInterpreterGenerator clearly cannot do this on its own. It works together with
hotspot\src\cpu\x86\vm\templateInterpreterGenerator_x86.cpp
and hotspot\src\cpu\x86\vm\templateInterpreterGenerator_x86_64.cpp
(my machine is x86 + Windows).
Running with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInterpreter -XX:+LogCompilation -XX:LogFile=file.log
saves the output to a file, where the generated routines can be inspected.
To pick an example at random: the template interpreter special-cases many of the math functions in java.lang.Math. Calling them does not require setting up a regular Java stack frame, and using SSE instructions can make them considerably faster:

// hotspot\src\cpu\x86\vm\templateInterpreterGenerator_x86_64.cpp
address TemplateInterpreterGenerator::generate_math_entry(AbstractInterpreter::MethodKind kind) {
  // rbx,: Method*
  // rcx: scratch
  // r13: sender sp
  if (!InlineIntrinsics) return NULL; // Generate a vanilla entry

  address entry_point = __ pc();

  if (kind == Interpreter::java_lang_math_fmaD) {
    if (!UseFMA) {
      return NULL; // Generate a vanilla entry
    }
    __ movdbl(xmm0, Address(rsp, wordSize));
    __ movdbl(xmm1, Address(rsp, 3 * wordSize));
    __ movdbl(xmm2, Address(rsp, 5 * wordSize));
    __ fmad(xmm0, xmm1, xmm2, xmm0);
  } else if (kind == Interpreter::java_lang_math_fmaF) {
    if (!UseFMA) {
      return NULL; // Generate a vanilla entry
    }
    __ movflt(xmm0, Address(rsp, wordSize));
    __ movflt(xmm1, Address(rsp, 2 * wordSize));
    __ movflt(xmm2, Address(rsp, 3 * wordSize));
    __ fmaf(xmm0, xmm1, xmm2, xmm0);
  } else if (kind == Interpreter::java_lang_math_sqrt) {
    __ sqrtsd(xmm0, Address(rsp, wordSize));
  } else if (kind == Interpreter::java_lang_math_exp) {
    __ movdbl(xmm0, Address(rsp, wordSize));
    if (StubRoutines::dexp() != NULL) {
      __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::dexp())));
    } else {
      __ call_VM_leaf0(CAST_FROM_FN_PTR(address, SharedRuntime::dexp));
    }
  } else if (kind == Interpreter::java_lang_math_log) {
    __ movdbl(xmm0, Address(rsp, wordSize));
    if (StubRoutines::dlog() != NULL) {
      __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::dlog())));
    } else {
      __ call_VM_leaf0(CAST_FROM_FN_PTR(address, SharedRuntime::dlog));
    }
  } else if (kind == Interpreter::java_lang_math_log10) {
    __ movdbl(xmm0, Address(rsp, wordSize));
    if (StubRoutines::dlog10() != NULL) {
      __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::dlog10())));
    } else {
      __ call_VM_leaf0(CAST_FROM_FN_PTR(address, SharedRuntime::dlog10));
    }
  } else if (kind == Interpreter::java_lang_math_sin) {
    __ movdbl(xmm0, Address(rsp, wordSize));
    if (StubRoutines::dsin() != NULL) {
      __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::dsin())));
    } else {
      __ call_VM_leaf0(CAST_FROM_FN_PTR(address, SharedRuntime::dsin));
    }
  } else if (kind == Interpreter::java_lang_math_cos) {
    __ movdbl(xmm0, Address(rsp, wordSize));
    if (StubRoutines::dcos() != NULL) {
      __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::dcos())));
    } else {
      __ call_VM_leaf0(CAST_FROM_FN_PTR(address, SharedRuntime::dcos));
    }
  } else if (kind == Interpreter::java_lang_math_pow) {
    __ movdbl(xmm1, Address(rsp, wordSize));
    __ movdbl(xmm0, Address(rsp, 3 * wordSize));
    if (StubRoutines::dpow() != NULL) {
      __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::dpow())));
    } else {
      __ call_VM_leaf0(CAST_FROM_FN_PTR(address, SharedRuntime::dpow));
    }
  } else if (kind == Interpreter::java_lang_math_tan) {
    __ movdbl(xmm0, Address(rsp, wordSize));
    if (StubRoutines::dtan() != NULL) {
      __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::dtan())));
    } else {
      __ call_VM_leaf0(CAST_FROM_FN_PTR(address, SharedRuntime::dtan));
    }
  } else {
    __ fld_d(Address(rsp, wordSize));
    switch (kind) {
      case Interpreter::java_lang_math_abs:
        __ fabs();
        break;
      default:
        ShouldNotReachHere();
    }
    __ subptr(rsp, 2*wordSize);
    // Round to 64bit precision
    __ fstp_d(Address(rsp, 0));
    __ movdbl(xmm0, Address(rsp, 0));
    __ addptr(rsp, 2*wordSize);
  }

  __ pop(rax);
  __ mov(rsp, r13);
  __ jmp(rax);

  return entry_point;
}
Let's focus on java.lang.Math.pow() and use -XX:+PrintInterpreter to look at the routine generated for it:

else if (kind == Interpreter::java_lang_math_pow) {
  __ movdbl(xmm1, Address(rsp, wordSize));
  __ movdbl(xmm0, Address(rsp, 3 * wordSize));
  if (StubRoutines::dpow() != NULL) {
    __ call(RuntimeAddress(CAST_FROM_FN_PTR(address, StubRoutines::dpow())));
  } else {
    __ call_VM_leaf0(CAST_FROM_FN_PTR(address, SharedRuntime::dpow));
  }
}

----------------------------------------------------------------------
method entry point (kind = java_lang_math_pow)  [0x000001bcb62feaa0, 0x000001bcb62feac0]  32 bytes

  0x000001bcb62feaa0: vmovsd 0x8(%rsp),%xmm1
  0x000001bcb62feaa6: vmovsd 0x18(%rsp),%xmm0
  0x000001bcb62feaac: callq  0x000001bcb62f19d0
  0x000001bcb62feab1: pop    %rax
  0x000001bcb62feab2: mov    %r13,%rsp
  0x000001bcb62feab5: jmpq   *%rax
  0x000001bcb62feab7: nop
  0x000001bcb62feab8: add    %al,(%rax)
  0x000001bcb62feaba: add    %al,(%rax)
  0x000001bcb62feabc: add    %al,(%rax)
  0x000001bcb62feabe: add    %al,(%rax)
The callq targets the stub produced by address generate_libmPow() in hotspot\src\cpu\x86\vm\stubGenerator_x86_64.cpp. Have a look if you are interested; we won't go into it here.
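We now know that the template interpreter is really just a collection of routines. But where are the routines for the bytecodes themselves? Look back at the TemplateInterpreter class definition above: it has a static DispatchTable _active_table; and that is what we are after. Specifically, templateInterpreterGenerator calls TemplateInterpreterGenerator::set_entry_points() to set up a routine for each bytecode, and the routine is obtained through TemplateTable::template_for(). This code also depends on the CPU architecture, so the routine for each bytecode is produced jointly by hotspot\src\cpu\x86\vm\templateTable_x86.cpp and templateTable.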
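To picture what a dispatch table holds, here is a simplified model: one entry address per (top-of-stack state, opcode) pair. The names mirror HotSpot's, but the struct layout, the address typedef and the set_entry/entry helpers are assumptions made for this sketch rather than the real class.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Top-of-stack cache states, in the same order as HotSpot's TosState enum.
enum TosState { btos, ztos, ctos, stos, itos, ltos, ftos, dtos, atos, vtos, number_of_states };

// Stand-in for HotSpot's `address`; any pointer type is enough for the sketch.
using address = const void*;

// One slot per possible opcode byte.
constexpr int kTableLength = 256;

// Simplified model of a dispatch table (not the real HotSpot class).
struct DispatchTable {
  // table[state][opcode] is the entry address of the routine for `opcode`
  // when the top-of-stack cache is in `state`; unset slots stay null.
  std::array<std::array<address, kTableLength>, number_of_states> table{};

  void set_entry(TosState state, uint8_t opcode, address entry) {
    assert(state < number_of_states);
    table[state][opcode] = entry;
  }

  address entry(TosState state, uint8_t opcode) const {
    assert(state < number_of_states);
    return table[state][opcode];
  }
};
```

With a table like this, "dispatching" is a single indexed load followed by an indirect jump, which is the pattern that shows up at the end of every generated bytecode routine.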
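There are too many bytecodes to cover, so again we pick one at random. Consider istore, which pops the value on top of the stack and stores it into the local variable table of the current method. It is implemented like this:

void TemplateTable::istore() {
  transition(itos, vtos);
  locals_index(rbx);
  __ movl(iaddress(rbx), rax);
}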
A perfectly reasonable implementation.
But wait: when we use -XX:+PrintInterpreter to look at the routine for this perfectly reasonable istore, we get a whole pile of assembly:

----------------------------------------------------------------------
istore  54 istore  [0x00000192d1972ba0, 0x00000192d1972c00]  96 bytes

  0x00000192d1972ba0: mov    (%rsp),%eax
  0x00000192d1972ba3: add    $0x8,%rsp
  0x00000192d1972ba7: movzbl 0x1(%r13),%ebx
  0x00000192d1972bac: neg    %rbx
  0x00000192d1972baf: mov    %eax,(%r14,%rbx,8)
  0x00000192d1972bb3: movzbl 0x2(%r13),%ebx
  0x00000192d1972bb8: add    $0x2,%r13
  0x00000192d1972bbc: movabs $0x7fffd56e0fa0,%r10
  0x00000192d1972bc6: jmpq   *(%r10,%rbx,8)
  0x00000192d1972bca: mov    (%rsp),%eax
  0x00000192d1972bcd: add    $0x8,%rsp
  0x00000192d1972bd1: movzwl 0x2(%r13),%ebx
  0x00000192d1972bd6: bswap  %ebx
  0x00000192d1972bd8: shr    $0x10,%ebx
  0x00000192d1972bdb: neg    %rbx
  0x00000192d1972bde: mov    %eax,(%r14,%rbx,8)
  0x00000192d1972be2: movzbl 0x4(%r13),%ebx
  0x00000192d1972be7: add    $0x4,%r13
  0x00000192d1972beb: movabs $0x7fffd56e0fa0,%r10
  0x00000192d1972bf5: jmpq   *(%r10,%rbx,8)
  0x00000192d1972bf9: nopl   0x0(%rax)
You can just about see that mov %eax,(%r14,%rbx,8) corresponds to __ movl(iaddress(rbx), rax);, but where does all the extra code come from?
Answering that takes a bit more background.
Earlier we said that templateInterpreterGenerator calls TemplateInterpreterGenerator::set_entry_points() to set up a routine for each bytecode. So we can start from set_entry_points and see what it does for istore:

  ...
  // is the instruction defined?
  if (Bytecodes::is_defined(code)) {
    Template* t = TemplateTable::template_for(code);
    assert(t->is_valid(), "just checking");
    set_short_entry_points(t, bep, cep, sep, aep, iep, lep, fep, dep, vep);
  }
  // does the instruction have a wide form?
  if (Bytecodes::wide_is_defined(code)) {
    Template* t = TemplateTable::template_for_wide(code);
    assert(t->is_valid(), "just checking");
    set_wide_entry_point(t, wep);
  }
  ...
}
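One line in the middle is:
Template* t = TemplateTable::template_for(code);
Looking up a Bytecodes::Code constant in the template table yields a Template. A Template describes some properties of the code generated for a given bytecode: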
// A Template describes the properties of a code template for a given bytecode
// and provides a generator to generate the code template.
// hotspot\src\share\vm\utilities\globalDefinitions.hpp
// TosState describes the top-of-stack cache state before and after a bytecode or method executes.
enum TosState {         // describes the tos cache contents
  btos = 0,             // byte, bool tos cached
  ztos = 1,             // byte, bool tos cached
  ctos = 2,             // char tos cached
  stos = 3,             // short tos cached
  itos = 4,             // int tos cached
  ltos = 5,             // long tos cached
  ftos = 6,             // float tos cached
  dtos = 7,             // double tos cached
  atos = 8,             // object cached
  vtos = 9,             // tos not cached
  number_of_states,
  ilgl                  // illegal state: should not occur
};
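// hotspot\src\share\vm\interpreter\templateTable.hpp
class Template VALUE_OBJ_CLASS_SPEC {
 private:
  enum Flags {
    uses_bcp_bit,       // does the template need the bytecode pointer (bcp)?
    does_dispatch_bit,  // does the template dispatch to the next bytecode by itself?
    calls_vm_bit,       // does the template call into the VM?
    wide_bit            // does the template belong to a wide instruction?
  };

  typedef void (*generator)(int arg);  // code generator for the bytecode; just a function pointer

  int       _flags;    // the flags described above
  TosState  _tos_in;   // top-of-stack cache state before the bytecode executes
  TosState  _tos_out;  // top-of-stack cache state after the bytecode executes
  generator _gen;      // code generator for the bytecode
  int       _arg;      // argument passed to the code generator
  ...
};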
Next, find the template definition for istore:

// hotspot\src\share\vm\interpreter\templateTable.cpp
void TemplateTable::initialize() {
  ...
  //                                    interpr. templates
  // Java spec bytecodes                ubcp|disp|clvm|iswd  in    out   generator             argument
  def(Bytecodes::_istore              , ubcp|____|clvm|____, itos, vtos, istore              ,  _           );
  def(Bytecodes::_lstore              , ubcp|____|____|____, ltos, vtos, lstore              ,  _           );
  def(Bytecodes::_fstore              , ubcp|____|____|____, ftos, vtos, fstore              ,  _           );
  def(Bytecodes::_dstore              , ubcp|____|____|____, dtos, vtos, dstore              ,  _           );
  def(Bytecodes::_astore              , ubcp|____|clvm|____, vtos, vtos, astore              ,  _           );
  ...
  // wide Java spec bytecodes
  def(Bytecodes::_istore              , ubcp|____|____|iswd, vtos, vtos, wide_istore         ,  _           );
  def(Bytecodes::_lstore              , ubcp|____|____|iswd, vtos, vtos, wide_lstore         ,  _           );
  def(Bytecodes::_fstore              , ubcp|____|____|iswd, vtos, vtos, wide_fstore         ,  _           );
  def(Bytecodes::_dstore              , ubcp|____|____|iswd, vtos, vtos, wide_dstore         ,  _           );
  def(Bytecodes::_astore              , ubcp|____|____|iswd, vtos, vtos, wide_astore         ,  _           );
  def(Bytecodes::_iinc                , ubcp|____|____|iswd, vtos, vtos, wide_iinc           ,  _           );
  def(Bytecodes::_ret                 , ubcp|disp|____|iswd, vtos, vtos, wide_ret            ,  _           );
  def(Bytecodes::_breakpoint          , ubcp|disp|clvm|____, vtos, vtos, _breakpoint         ,  _           );
  ...
}
This definition says that istore generates its routine with the parameterless generator function istore, which is the very short function we saw earlier:

void TemplateTable::istore() {
  transition(itos, vtos);
  locals_index(rbx);
  __ movl(iaddress(rbx), rax);
}
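ubcp means the template uses the bytecode pointer (bcp). A template needs the bcp when the operand of the bytecode is embedded in the bytecode stream itself. For istore, the stream holds the opcode 0x36 followed directly by a one-byte index.
Because the index comes immediately after istore (0x36), istore has to read through the bytecode pointer to fetch it.
The definition also says that istore expects an int cached at the top of the stack before it runs (itos) and leaves nothing cached afterwards (vtos). In addition, istore has a wide version, which uses a two-byte index.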
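The flag column of the table is just a bit mask. As far as I can tell from the source, ubcp, disp, clvm and iswd are each 1 shifted by the matching Template::Flags bit and ____ is zero. The sketch below models that; the struct TemplateFlags and the constant name none (used here in place of ____, since identifiers containing a double underscore are reserved in C++) are assumptions of this example.

```cpp
// Bit positions, in the same order as Template::Flags.
enum Flags { uses_bcp_bit, does_dispatch_bit, calls_vm_bit, wide_bit };

// Masks used in the template table; `none` stands in for HotSpot's `____`.
constexpr int none = 0;
constexpr int ubcp = 1 << uses_bcp_bit;
constexpr int disp = 1 << does_dispatch_bit;
constexpr int clvm = 1 << calls_vm_bit;
constexpr int iswd = 1 << wide_bit;

// Hypothetical wrapper that answers the same questions as Template's accessors.
struct TemplateFlags {
  int flags;
  bool uses_bcp()      const { return (flags & ubcp) != 0; }
  bool does_dispatch() const { return (flags & disp) != 0; }
  bool calls_vm()      const { return (flags & clvm) != 0; }
  bool is_wide()       const { return (flags & iswd) != 0; }
};

// The two istore rows from the table above, expressed with these masks.
constexpr TemplateFlags istore_flags      = { ubcp | none | clvm | none };
constexpr TemplateFlags wide_istore_flags = { ubcp | none | none | iswd };
```

Reading the istore row this way: it uses the bcp, does not dispatch by itself (so the generator appends the dispatch code), is marked as calling the VM, and is not the wide form.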
With this information we can try to explain where the extra assembly comes from. set_entry_points() generates code for both istore and the wide version of istore.
We will walk through the ordinary istore; the wide version follows the same pattern. set_entry_points() goes on to call set_short_entry_points():

void TemplateInterpreterGenerator::set_entry_points(Bytecodes::Code code) {
  ...
  if (Bytecodes::is_defined(code)) {
    Template* t = TemplateTable::template_for(code);
    assert(t->is_valid(), "just checking");
    set_short_entry_points(t, bep, cep, sep, aep, iep, lep, fep, dep, vep);
  }
  if (Bytecodes::wide_is_defined(code)) {
    Template* t = TemplateTable::template_for_wide(code);
    assert(t->is_valid(), "just checking");
    set_wide_entry_point(t, wep);
  }
  ...
}

void TemplateInterpreterGenerator::set_short_entry_points(Template* t, address& bep, address& cep, address& sep, address& aep, address& iep, address& lep, address& fep, address& dep, address& vep) {
  assert(t->is_valid(), "template must exist");
  switch (t->tos_in()) {
    case btos:
    case ztos:
    case ctos:
    case stos:
      ShouldNotReachHere();  // btos/ctos/stos should use itos.
      break;
    case atos: vep = __ pc(); __ pop(atos); aep = __ pc(); generate_and_dispatch(t); break;
    case itos: vep = __ pc(); __ pop(itos); iep = __ pc(); generate_and_dispatch(t); break;
    case ltos: vep = __ pc(); __ pop(ltos); lep = __ pc(); generate_and_dispatch(t); break;
    case ftos: vep = __ pc(); __ pop(ftos); fep = __ pc(); generate_and_dispatch(t); break;
    case dtos: vep = __ pc(); __ pop(dtos); dep = __ pc(); generate_and_dispatch(t); break;
    case vtos: set_vtos_entry_points(t, bep, cep, sep, aep, iep, lep, fep, dep, vep); break;
    default  : ShouldNotReachHere(); break;
  }
}
set_short_entry_points emits a pop first if the instruction expects a value in the top-of-stack cache before it runs. istore expects itos, so a pop is needed:

// hotspot\src\cpu\x86\vm\interp_masm_x86.cpp
void InterpreterMacroAssembler::pop_i(Register r) {
  // XXX can't use pop currently, upper half non clean
  movl(r, Address(rsp, 0));
  addptr(rsp, wordSize);
}
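The itos case above produces two entry points into one routine: vep, for callers that have nothing cached, pops the stack into the cache register and then falls through into iep, which assumes the int is already cached. The following is a rough model of that idea only; Frame, istore_iep and istore_vep are made-up names, and ordinary function calls stand in for what is really a fall-through inside one block of machine code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical interpreter frame for the sketch.
struct Frame {
  std::vector<int> expr_stack;  // expression stack kept in memory
  std::vector<int> locals;      // local variable slots
  int tos_cache = 0;            // plays the role of rax (the top-of-stack cache)
};

// iep: entered when the previous bytecode left an int in the cache (itos).
void istore_iep(Frame& f, std::size_t index) {
  assert(index < f.locals.size());
  f.locals[index] = f.tos_cache;
}

// vep: entered when nothing is cached (vtos). It loads the value from the
// in-memory stack into the cache, then continues exactly as iep does.
void istore_vep(Frame& f, std::size_t index) {
  assert(!f.expr_stack.empty());
  f.tos_cache = f.expr_stack.back();  // mov (%rsp),%eax
  f.expr_stack.pop_back();            // add $0x8,%rsp
  istore_iep(f, index);
}
```

This is why the generated istore routine begins with a load from (%rsp) even though TemplateTable::istore() itself never touches the stack.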
Note that "pop" here is a conceptual pop; the code actually generated is a mov followed by an add. Let's start annotating that pile of assembly:

----------------------------------------------------------------------
istore  54 istore  [0x00000192d1972ba0, 0x00000192d1972c00]  96 bytes

  ; fetch the int at the top of the stack into the cache
  0x00000192d1972ba0: mov    (%rsp),%eax
  0x00000192d1972ba3: add    $0x8,%rsp
  0x00000192d1972ba7: movzbl 0x1(%r13),%ebx
  0x00000192d1972bac: neg    %rbx
  0x00000192d1972baf: mov    %eax,(%r14,%rbx,8)
  0x00000192d1972bb3: movzbl 0x2(%r13),%ebx
  0x00000192d1972bb8: add    $0x2,%r13
  0x00000192d1972bbc: movabs $0x7fffd56e0fa0,%r10
  0x00000192d1972bc6: jmpq   *(%r10,%rbx,8)
  0x00000192d1972bca: mov    (%rsp),%eax
  0x00000192d1972bcd: add    $0x8,%rsp
  0x00000192d1972bd1: movzwl 0x2(%r13),%ebx
  0x00000192d1972bd6: bswap  %ebx
  0x00000192d1972bd8: shr    $0x10,%ebx
  0x00000192d1972bdb: neg    %rbx
  0x00000192d1972bde: mov    %eax,(%r14,%rbx,8)
  0x00000192d1972be2: movzbl 0x4(%r13),%ebx
  0x00000192d1972be7: add    $0x4,%r13
  0x00000192d1972beb: movabs $0x7fffd56e0fa0,%r10
  0x00000192d1972bf5: jmpq   *(%r10,%rbx,8)
  0x00000192d1972bf9: nopl   0x0(%rax)
Next, generate_and_dispatch() is split into three parts: before execution (dispatch_prolog), the bytecode itself (t->generate()), and after execution (dispatch_epilog):

void TemplateInterpreterGenerator::generate_and_dispatch(Template* t, TosState tos_out) {
  ...
  int step = 0;
  if (!t->does_dispatch()) {
    step = t->is_wide() ? Bytecodes::wide_length_for(t->bytecode()) : Bytecodes::length_for(t->bytecode());
    if (tos_out == ilgl) tos_out = t->tos_out();
    // compute bytecode size
    assert(step > 0, "just checkin'");
    // setup stuff for dispatching next bytecode
    if (ProfileInterpreter && VerifyDataPointer && MethodData::bytecode_has_profile(t->bytecode())) {
      __ verify_method_data_pointer();
    }
    __ dispatch_prolog(tos_out, step);
  }
  // generate template
  t->generate(_masm);
  // advance
  if (t->does_dispatch()) {
#ifdef ASSERT
    // make sure execution doesn't go beyond this point if code is broken
    __ should_not_reach_here();
#endif // ASSERT
  } else {
    // dispatch to next bytecode
    __ dispatch_epilog(tos_out, step);
  }
}
On x86, dispatch_prolog does nothing, so no extra code is emitted before the bytecode body:

----------------------------------------------------------------------
istore  54 istore  [0x00000192d1972ba0, 0x00000192d1972c00]  96 bytes

  ; fetch the int at the top of the stack into the cache
  0x00000192d1972ba0: mov    (%rsp),%eax
  0x00000192d1972ba3: add    $0x8,%rsp
  ; execute istore: read the index through the bcp and store into the local variable slot
  0x00000192d1972ba7: movzbl 0x1(%r13),%ebx
  0x00000192d1972bac: neg    %rbx
  0x00000192d1972baf: mov    %eax,(%r14,%rbx,8)
  0x00000192d1972bb3: movzbl 0x2(%r13),%ebx
  0x00000192d1972bb8: add    $0x2,%r13
  0x00000192d1972bbc: movabs $0x7fffd56e0fa0,%r10
  0x00000192d1972bc6: jmpq   *(%r10,%rbx,8)
  0x00000192d1972bca: mov    (%rsp),%eax
  0x00000192d1972bcd: add    $0x8,%rsp
  0x00000192d1972bd1: movzwl 0x2(%r13),%ebx
  0x00000192d1972bd6: bswap  %ebx
  0x00000192d1972bd8: shr    $0x10,%ebx
  0x00000192d1972bdb: neg    %rbx
  0x00000192d1972bde: mov    %eax,(%r14,%rbx,8)
  0x00000192d1972be2: movzbl 0x4(%r13),%ebx
  0x00000192d1972be7: add    $0x4,%r13
  0x00000192d1972beb: movabs $0x7fffd56e0fa0,%r10
  0x00000192d1972bf5: jmpq   *(%r10,%rbx,8)
  0x00000192d1972bf9: nopl   0x0(%rax)
After the bytecode body, dispatch_epilog is called:

void InterpreterMacroAssembler::dispatch_epilog(TosState state, int step) {
  dispatch_next(state, step);
}

void InterpreterMacroAssembler::dispatch_next(TosState state, int step) {
  // load next bytecode (load before advancing _bcp_register to prevent AGI)
  load_unsigned_byte(rbx, Address(_bcp_register, step));
  // advance _bcp_register
  increment(_bcp_register, step);
  dispatch_base(state, Interpreter::dispatch_table(state));
}

void InterpreterMacroAssembler::dispatch_base(TosState state, address* table, bool verifyoop) {
  verify_FPU(1, state);
  if (VerifyActivationFrameSize) {
    Label L;
    mov(rcx, rbp);
    subptr(rcx, rsp);
    int32_t min_frame_size = (frame::link_offset - frame::interpreter_frame_initial_sp_offset) * wordSize;
    cmpptr(rcx, (int32_t)min_frame_size);
    jcc(Assembler::greaterEqual, L);
    stop("broken stack frame");
    bind(L);
  }
  if (verifyoop) {
    verify_oop(rax, state);
  }
#ifdef _LP64
  // load the dispatch table base and jump to the entry for the next bytecode (index in rbx)
  lea(rscratch1, ExternalAddress((address)table));
  jmp(Address(rscratch1, rbx, Address::times_8));
#else
  Address index(noreg, rbx, Address::times_ptr);
  ExternalAddress tbl((address)table);
  ArrayAddress dispatch(tbl, index);
  jump(dispatch);
#endif // _LP64
}
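What dispatch_next does to the interpreter state can be mimicked in a few lines. This is a sketch under assumptions: InterpState and this free function are invented for illustration, the bcp field plays the role of r13, and the returned opcode plays the role of rbx, which the real code then uses as an index into the dispatch table.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical slice of the interpreter state.
struct InterpState {
  const uint8_t* bcp;   // bytecode pointer (r13 in the listings)
  uint8_t next_opcode;  // opcode about to be dispatched (rbx in the listings)
};

// Mirrors the order of operations in dispatch_next: read the opcode that lies
// `step` bytes ahead first, then advance the bytecode pointer by `step`.
inline uint8_t dispatch_next(InterpState& s, int step) {
  assert(step > 0);
  s.next_opcode = s.bcp[step];  // movzbl step(%r13),%ebx
  s.bcp += step;                // add $step,%r13
  return s.next_opcode;         // index into the dispatch table
}
```

For a plain istore the step is 2 (opcode plus a one-byte index); for wide istore it is 4, which matches the two add instructions on %r13 in the listing.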
The fully annotated listing:

----------------------------------------------------------------------
istore  54 istore  [0x00000192d1972ba0, 0x00000192d1972c00]  96 bytes

  ; fetch the int at the top of the stack into the cache
  0x00000192d1972ba0: mov    (%rsp),%eax
  0x00000192d1972ba3: add    $0x8,%rsp
  ; execute istore: read the index through the bcp and store into the local variable slot
  0x00000192d1972ba7: movzbl 0x1(%r13),%ebx
  0x00000192d1972bac: neg    %rbx
  0x00000192d1972baf: mov    %eax,(%r14,%rbx,8)
  ; load the next opcode; istore is followed by a one-byte index, so it sits at r13+2
  0x00000192d1972bb3: movzbl 0x2(%r13),%ebx
  0x00000192d1972bb8: add    $0x2,%r13
  ; load the dispatch table base and jump to the routine for the next opcode
  0x00000192d1972bbc: movabs $0x7fffd56e0fa0,%r10
  0x00000192d1972bc6: jmpq   *(%r10,%rbx,8)
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  ; as mentioned, the wide version of istore is generated as well; its format is
  ; wide istore byte1, byte2 [four bytes in total]
  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  ; fetch the int at the top of the stack into the cache
  0x00000192d1972bca: mov    (%rsp),%eax
  0x00000192d1972bcd: add    $0x8,%rsp
  ; fetch the two-byte index
  0x00000192d1972bd1: movzwl 0x2(%r13),%ebx      ; zero-extended; e.g. with index bytes 2,2: ebx=0x00000202
  0x00000192d1972bd6: bswap  %ebx                ; reverse the 4 bytes: ebx=0x02020000
  0x00000192d1972bd8: shr    $0x10,%ebx          ; ebx=0x00000202
  0x00000192d1972bdb: neg    %rbx                ; negate
  0x00000192d1972bde: mov    %eax,(%r14,%rbx,8)  ; store to r14 - index*8
  ; load the next opcode; wide istore byte1,byte2 is four bytes, so it sits at r13+4
  0x00000192d1972be2: movzbl 0x4(%r13),%ebx
  0x00000192d1972be7: add    $0x4,%r13
  ; load the dispatch table base and jump to the routine for the next opcode
  0x00000192d1972beb: movabs $0x7fffd56e0fa0,%r10
  0x00000192d1972bf5: jmpq   *(%r10,%rbx,8)
  0x00000192d1972bf9: nopl   0x0(%rax)
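The movzwl / bswap / shr sequence in the wide path is easier to follow with distinct index bytes. Bytecode operands are stored big-endian while x86 loads are little-endian, so the two bytes arrive swapped and have to be put back in order. The helper names below are invented for this sketch; each function imitates one instruction from the listing.

```cpp
#include <cstdint>

// movzwl: little-endian 16-bit load, zero-extended to 32 bits.
inline uint32_t load_u16_le(const uint8_t* p) {
  return uint32_t(p[0]) | (uint32_t(p[1]) << 8);
}

// bswap: reverse the four bytes of a 32-bit value.
inline uint32_t bswap32(uint32_t v) {
  return ((v & 0x000000ffu) << 24) |
         ((v & 0x0000ff00u) << 8)  |
         ((v >> 8) & 0x0000ff00u)  |
         (v >> 24);
}

// Decodes the index of "wide istore byte1 byte2", with bcp pointing at the
// wide prefix: movzwl 0x2(%r13),%ebx ; bswap %ebx ; shr $0x10,%ebx
inline uint32_t wide_index(const uint8_t* bcp) {
  uint32_t ebx = load_u16_le(bcp + 2);
  ebx = bswap32(ebx);
  ebx >>= 16;
  return ebx;
}

// neg %rbx ; mov %eax,(%r14,%rbx,8): local slot n lives n*8 bytes below r14.
inline int64_t local_offset_bytes(uint32_t index) {
  return -int64_t(index) * 8;
}
```

For the byte sequence 0xc4 0x36 0x01 0x02, the load gives 0x00000201, bswap turns it into 0x01020000, and the shift leaves 0x0102, which is the big-endian index (byte1 << 8) | byte2.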