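Copyright notice: this is an original article by the author and may not be reproduced without the author's permission.
Writing this by hand took effort, so please respect the work. Thank you.
This article takes a brief look at how Scala implements closures, from the point of view of the class files that the compiler generates from Scala code.
Let's start with a simple example: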
class ClosureDemo {
  def func() = {
    var i = 2
    val inc: () => Unit = () => i = i + 1
    val add: Int => Int = (ii: Int) => ii + i
    (inc, add)
  }
}
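In this code, inc and add both reference the variable i declared in func. Because functions are first-class values in Scala, inc and add become closures that capture the outer variable i.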
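Compiling the code above produces three class files:
ClosureDemo.class
ClosureDemo$$anonfun$1.class
ClosureDemo$$anonfun$2.class
These are the ClosureDemo class itself and the two closures. The compiler version used for this article generates one class file per closure (Scala 2.12 and later compile most function literals differently and no longer emit an $anonfun class for each one). When closures are nested deeply, the generated class names can become very long, which can trigger path-too-long errors on Windows.
In Spark's ClosureCleaner class we can find code like the following, which checks whether a class is a closure: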
// Check whether a class represents a Scala closure
private def isClosure(cls: Class[_]): Boolean = {
  cls.getName.contains("$anonfun$")
}
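The same name-based check is easy to try outside Spark. The following Java sketch is not Spark's code: `ClosureNameCheck` and `looksLikeScalaClosure` are hypothetical names, and the helper works on class-name strings so that it runs without any compiled Scala classes.

```java
// Hypothetical helper mirroring the name-based test shown above.
final class ClosureNameCheck {
    // Returns true when the class name contains the "$anonfun$" marker
    // that appears in the generated closure class names listed earlier.
    static boolean looksLikeScalaClosure(String className) {
        return className.contains("$anonfun$");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeScalaClosure("ClosureDemo$$anonfun$1")); // prints true
        System.out.println(looksLikeScalaClosure("ClosureDemo"));            // prints false
    }
}
```

Note that a check like this depends entirely on the compiler's naming scheme, so it only recognizes closures that were compiled into classes named this way.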
First, let's use javap to look at the contents of ClosureDemo.class:
{
public scala.Tuple2<scala.Function0<scala.runtime.BoxedUnit>, scala.Function1<java.lang.Object, java.lang.Object>> func();
descriptor: ()Lscala/Tuple2;
flags: ACC_PUBLIC
Code:
stack=4, locals=4, args_size=1
0: iconst_2
1: invokestatic #16 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
4: astore_1
5: new #18 // class ClosureDemo$$anonfun$1
8: dup
9: aload_0
10: aload_1
11: invokespecial #22 // Method ClosureDemo$$anonfun$1."<init>":(LClosureDemo;Lscala/runtime/IntRef;)V
14: astore_2
15: new #24 // class ClosureDemo$$anonfun$2
18: dup
19: aload_0
20: aload_1
21: invokespecial #25 // Method ClosureDemo$$anonfun$2."<init>":(LClosureDemo;Lscala/runtime/IntRef;)V
24: astore_3
25: new #27 // class scala/Tuple2
28: dup
29: aload_2
30: aload_3
31: invokespecial #30 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
34: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 35 0 this LClosureDemo;
5 29 1 i Lscala/runtime/IntRef;
15 19 2 inc Lscala/Function0;
25 9 3 add Lscala/Function1;
LineNumberTable:
line 3: 0
line 4: 5
line 5: 15
line 6: 25
Signature: #46 // ()Lscala/Tuple2<Lscala/Function0<Lscala/runtime/BoxedUnit;>;Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;>;
public ClosureDemo();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #41 // Method java/lang/Object."<init>":()V
4: return
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this LClosureDemo;
LineNumberTable:
line 8: 0
}
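Since the class has no fields, we focus on its methods. The output above shows two of them:
1. func(): the func method we defined
2. ClosureDemo(): the class constructor
Let's look closely at how func is implemented. It first pushes the int constant 2 onto the operand stack, then calls the static method create(Int): scala.runtime.IntRef of scala.runtime.IntRef to wrap that 2 in an IntRef. Here is the implementation of IntRef: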
package scala.runtime;
public class IntRef implements java.io.Serializable {
    private static final long serialVersionUID = 1488197132022872888L;
    public int elem;
    public IntRef(int elem) { this.elem = elem; }
    public String toString() { return java.lang.Integer.toString(elem); }
    public static IntRef create(int e) { return new IntRef(e); }
    public static IntRef zero() { return new IntRef(0); }
}
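The code is simple: it just wraps the int variable in an IntRef, which moves the variable from the stack to the heap. After that come the constructions of the two closure classes. One detail deserves attention: both closure constructors are called with this and the IntRef that was just created.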
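To see why the wrapper matters, here is a minimal Java sketch. `IntCell` is a hypothetical stand-in for scala.runtime.IntRef, used so that the example needs no Scala library. A plain int is copied whenever it is handed to another object, whereas two holders of the same cell observe each other's writes.

```java
// Hypothetical stand-in for scala.runtime.IntRef: a heap object with one mutable int.
final class IntCell {
    int elem;
    IntCell(int elem) { this.elem = elem; }
}

final class CellDemo {
    // Copies of a primitive are independent: changing one does not affect the other.
    static int copyThenIncrement() {
        int original = 2;
        int copy = original;      // what a captured-by-value field would hold
        original = original + 1;
        return copy;              // still 2
    }

    // Two references to one cell share state: a write through one is seen through the other.
    static int shareThenIncrement() {
        IntCell cell = new IntCell(2);
        IntCell alias = cell;     // what each closure's field would hold
        cell.elem = cell.elem + 1;
        return alias.elem;        // 3
    }

    public static void main(String[] args) {
        System.out.println(copyThenIncrement());  // prints 2
        System.out.println(shareThenIncrement()); // prints 3
    }
}
```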
Now let's step into the closure class. Below are the fields and methods of ClosureDemo$$anonfun$1.class, the bytecode generated for inc:
{
public static final long serialVersionUID;
descriptor: J
flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
ConstantValue: long 0l
private final scala.runtime.IntRef i$1;
descriptor: Lscala/runtime/IntRef;
flags: ACC_PRIVATE, ACC_FINAL
public final void apply();
descriptor: ()V
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokevirtual #23 // Method apply$mcV$sp:()V
4: return
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public void apply$mcV$sp();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=3, locals=1, args_size=1
0: aload_0
1: getfield #27 // Field i$1:Lscala/runtime/IntRef;
4: aload_0
5: getfield #27 // Field i$1:Lscala/runtime/IntRef;
8: getfield #33 // Field scala/runtime/IntRef.elem:I
11: iconst_1
12: iadd
13: putfield #33 // Field scala/runtime/IntRef.elem:I
16: return
LocalVariableTable:
Start Length Slot Name Signature
0 17 0 this LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public final java.lang.Object apply();
descriptor: ()Ljava/lang/Object;
flags: ACC_PUBLIC, ACC_FINAL, ACC_BRIDGE, ACC_SYNTHETIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokevirtual #36 // Method apply:()V
4: getstatic #42 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
7: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 8 0 this LClosureDemo$$anonfun$1;
LineNumberTable:
line 4: 0
public ClosureDemo$$anonfun$1(ClosureDemo, scala.runtime.IntRef);
descriptor: (LClosureDemo;Lscala/runtime/IntRef;)V
flags: ACC_PUBLIC
Code:
stack=2, locals=3, args_size=3
0: aload_0
1: aload_2
2: putfield #27 // Field i$1:Lscala/runtime/IntRef;
5: aload_0
6: invokespecial #46 // Method scala/runtime/AbstractFunction0$mcV$sp."<init>":()V
9: return
LocalVariableTable:
Start Length Slot Name Signature
0 10 0 this LClosureDemo$$anonfun$1;
0 10 1 $outer LClosureDemo;
0 10 2 i$1 Lscala/runtime/IntRef;
LineNumberTable:
line 4: 0
}
From the output above we can see that it has two fields and four methods:
public static final long serialVersionUID=0L;
private final scala.runtime.IntRef i$1;
public final void apply()
public void apply$mcV$sp()
public final java.lang.Object apply()
public ClosureDemo$$anonfun$1(ClosureDemo, scala.runtime.IntRef)
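Let's start with the constructor. When analyzing ClosureDemo.class we saw that the two closures were constructed with the outer class's this reference and the IntRef; this is the constructor that was being called. It is simple: it stores the second argument, the IntRef, in the field i$1. That IntRef is the object wrapping the number 2. It then calls the constructor of its superclass, scala/runtime/AbstractFunction0$mcV$sp. The class name is interesting; judging from a few experiments of mine, it follows these rules:
1. In the first half, scala/runtime/AbstractFunction0, the digit in AbstractFunction0 is the number of parameters the function takes: 0 means no parameters, and AbstractFunctionX means X parameters. The class extends the corresponding FunctionX.
2. In the second half, $mcV$sp, the V means the function's return type is void. An example from the Scala source is boolean apply$mcZIJ$sp(int v1, long v2); where Z is the boolean return type and I and J are the int and long parameters.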
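Reading the example name this way, the letters between $mc and $sp are JVM type descriptor characters, with the return type first and the parameter types after it. The following Java sketch spells out that reading. `SpecializedName`, `typeOf` and `describe` are hypothetical names, only a few primitive tags are handled, and the rule itself is inferred from the names seen in this article rather than taken from a specification.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for method names of the form name$mc<tags>$sp.
final class SpecializedName {
    // Maps one JVM descriptor character to a Java type name (subset only).
    static String typeOf(char tag) {
        switch (tag) {
            case 'V': return "void";
            case 'Z': return "boolean";
            case 'I': return "int";
            case 'J': return "long";
            case 'F': return "float";
            case 'D': return "double";
            default: throw new IllegalArgumentException("unhandled tag: " + tag);
        }
    }

    // Turns e.g. "apply$mcZIJ$sp" into "boolean apply$mcZIJ$sp(int, long)".
    static String describe(String methodName) {
        int start = methodName.indexOf("$mc");
        int end = methodName.lastIndexOf("$sp");
        if (start < 0 || end < start + 4) {
            throw new IllegalArgumentException("not a specialized name: " + methodName);
        }
        String tags = methodName.substring(start + 3, end);
        List<String> params = new ArrayList<>();
        // The first tag is the return type; the remaining tags are the parameters.
        for (int i = 1; i < tags.length(); i++) {
            params.add(typeOf(tags.charAt(i)));
        }
        return typeOf(tags.charAt(0)) + " " + methodName
                + "(" + String.join(", ", params) + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe("apply$mcZIJ$sp")); // boolean apply$mcZIJ$sp(int, long)
        System.out.println(describe("apply$mcV$sp"));   // void apply$mcV$sp()
    }
}
```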
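Now for the remaining methods in the class file above. There are two apply methods. One is a compiler-generated bridge method (ACC_BRIDGE, ACC_SYNTHETIC) that provides the erased signature Object apply() and returns BoxedUnit.UNIT; the other simply calls apply$mcV$sp. So the method to study is apply$mcV$sp, and its code is also very simple:
1. Load the field i$1 onto the stack
2. Read the elem field of the IntRef, that is, the value it wraps
3. Add 1 and write the result back into the IntRef
Because the IntRef is an object on the heap, everything else that holds a reference to the same IntRef sees the number incremented (ignoring multithreading).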
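The code in ClosureDemo$$anonfun$2.class is much the same as in ClosureDemo$$anonfun$1.class, except that it only returns the sum of the value in the IntRef and the Int argument. Because the same IntRef is passed when constructing ClosureDemo$$anonfun$1 and ClosureDemo$$anonfun$2, inc and add operate on the same number when they are called from outside, so it looks as if they were operating on the variable i inside func. This is how inc and add become closures over the outer variable i.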
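Putting the pieces together, the first example can be approximated by hand in Java. This is only a sketch of the shape seen in the bytecode, not compiler output. `IntCell`, `IncClosure`, `AddClosure` and `FirstExampleSketch` are hypothetical names, and the unused outer-instance constructor argument is left out.

```java
// Hypothetical stand-in for scala.runtime.IntRef.
final class IntCell {
    int elem;
    IntCell(int elem) { this.elem = elem; }
}

// Plays the role of ClosureDemo$$anonfun$1 (inc): increments the shared cell.
final class IncClosure {
    private final IntCell i;
    IncClosure(IntCell i) { this.i = i; }
    void apply() { i.elem = i.elem + 1; }
}

// Plays the role of ClosureDemo$$anonfun$2 (add): reads the shared cell.
final class AddClosure {
    private final IntCell i;
    AddClosure(IntCell i) { this.i = i; }
    int apply(int ii) { return ii + i.elem; }
}

final class FirstExampleSketch {
    // Mirrors func(): box the local, then hand the same box to both closures.
    static int incTwiceThenAdd(int ii) {
        IntCell i = new IntCell(2);
        IncClosure inc = new IncClosure(i);
        AddClosure add = new AddClosure(i);
        inc.apply();
        inc.apply();
        return add.apply(ii); // add sees both increments made through inc
    }

    public static void main(String[] args) {
        System.out.println(incTwiceThenAdd(10)); // prints 14
    }
}
```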
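You may have noticed that the constructors of both closures also receive the enclosing class instance, yet in this example it is never used, and it has the peculiar name $outer. Let's modify the example a little: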
class ClosureDemo {
  def func() = {
    def i = 2
    val j = 3
    var k = 4
    val add: Int => Int = (ii: Int) => ii + i + j + k
    k = k + 1
    add
  }
}
Compiling it produces two files:
ClosureDemo.class
ClosureDemo$$anonfun$1.class
Again, let's look at ClosureDemo.class first:
{
public scala.Function1<java.lang.Object, java.lang.Object> func();
descriptor: ()Lscala/Function1;
flags: ACC_PUBLIC
Code:
stack=5, locals=4, args_size=1
0: iconst_3
1: istore_1
2: iconst_4
3: invokestatic #16 // Method scala/runtime/IntRef.create:(I)Lscala/runtime/IntRef;
6: astore_2
7: new #18 // class ClosureDemo$$anonfun$1
10: dup
11: aload_0
12: iload_1
13: aload_2
14: invokespecial #22 // Method ClosureDemo$$anonfun$1."<init>":(LClosureDemo;ILscala/runtime/IntRef;)V
17: astore_3
18: aload_2
19: aload_2
20: getfield #26 // Field scala/runtime/IntRef.elem:I
23: iconst_1
24: iadd
25: putfield #26 // Field scala/runtime/IntRef.elem:I
28: aload_3
29: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 30 0 this LClosureDemo;
2 27 1 j I
7 22 2 k Lscala/runtime/IntRef;
18 11 3 add Lscala/Function1;
LineNumberTable:
line 4: 0
line 5: 2
line 6: 7
line 7: 18
line 8: 28
Signature: #43 // ()Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;
public final int ClosureDemo$$i$1();
descriptor: ()I
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=1, locals=1, args_size=1
0: iconst_2
1: ireturn
LocalVariableTable:
Start Length Slot Name Signature
0 2 0 this LClosureDemo;
LineNumberTable:
line 3: 0
public ClosureDemo();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #38 // Method java/lang/Object."<init>":()V
4: return
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this LClosureDemo;
LineNumberTable:
line 10: 0
}
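Because we defined a local method i inside func, a method named ClosureDemo$$i$1 was generated. Let's first look at how the two variables val j and var k are handled:
1. j is declared with val, so it is passed to the constructor of ClosureDemo$$anonfun$1 directly as an Int
2. k is declared with var, so it is wrapped in an IntRef before being passed to the constructor of ClosureDemo$$anonfun$1. Note that the later k = k + 1 is also performed through this IntRef wrapper.
Next, let's look at ClosureDemo$$anonfun$1.class: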
{
public static final long serialVersionUID;
descriptor: J
flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
ConstantValue: long 0l
private final ClosureDemo $outer;
descriptor: LClosureDemo;
flags: ACC_PRIVATE, ACC_FINAL, ACC_SYNTHETIC
private final int j$1;
descriptor: I
flags: ACC_PRIVATE, ACC_FINAL
private final scala.runtime.IntRef k$1;
descriptor: Lscala/runtime/IntRef;
flags: ACC_PRIVATE, ACC_FINAL
public final int apply(int);
descriptor: (I)I
flags: ACC_PUBLIC, ACC_FINAL
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: iload_1
2: invokevirtual #27 // Method apply$mcII$sp:(I)I
5: ireturn
LocalVariableTable:
Start Length Slot Name Signature
0 6 0 this LClosureDemo$$anonfun$1;
0 6 1 ii I
LineNumberTable:
line 6: 0
public int apply$mcII$sp(int);
descriptor: (I)I
flags: ACC_PUBLIC
Code:
stack=2, locals=2, args_size=2
0: iload_1
1: aload_0
2: getfield #32 // Field $outer:LClosureDemo;
5: invokevirtual #36 // Method ClosureDemo.ClosureDemo$$i$1:()I
8: iadd
9: aload_0
10: getfield #38 // Field j$1:I
13: iadd
14: aload_0
15: getfield #40 // Field k$1:Lscala/runtime/IntRef;
18: getfield #45 // Field scala/runtime/IntRef.elem:I
21: iadd
22: ireturn
LocalVariableTable:
Start Length Slot Name Signature
0 23 0 this LClosureDemo$$anonfun$1;
0 23 1 ii I
LineNumberTable:
line 6: 0
public final java.lang.Object apply(java.lang.Object);
descriptor: (Ljava/lang/Object;)Ljava/lang/Object;
flags: ACC_PUBLIC, ACC_FINAL, ACC_BRIDGE, ACC_SYNTHETIC
Code:
stack=2, locals=2, args_size=2
0: aload_0
1: aload_1
2: invokestatic #52 // Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
5: invokevirtual #54 // Method apply:(I)I
8: invokestatic #58 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
11: areturn
LocalVariableTable:
Start Length Slot Name Signature
0 12 0 this LClosureDemo$$anonfun$1;
0 12 1 v1 Ljava/lang/Object;
LineNumberTable:
line 6: 0
public ClosureDemo$$anonfun$1(ClosureDemo, int, scala.runtime.IntRef);
descriptor: (LClosureDemo;ILscala/runtime/IntRef;)V
flags: ACC_PUBLIC
Code:
stack=2, locals=4, args_size=4
0: aload_1
1: ifnonnull 6
4: aconst_null
5: athrow
6: aload_0
7: aload_1
8: putfield #32 // Field $outer:LClosureDemo;
11: aload_0
12: iload_2
13: putfield #38 // Field j$1:I
16: aload_0
17: aload_3
18: putfield #40 // Field k$1:Lscala/runtime/IntRef;
21: aload_0
22: invokespecial #65 // Method scala/runtime/AbstractFunction1$mcII$sp."<init>":()V
25: return
LocalVariableTable:
Start Length Slot Name Signature
0 26 0 this LClosureDemo$$anonfun$1;
0 26 1 $outer LClosureDemo;
0 26 2 j$1 I
0 26 3 k$1 Lscala/runtime/IntRef;
LineNumberTable:
line 6: 0
StackMapTable: number_of_entries = 1
frame_type = 6 /* same */
}
From the output above we can see that it has four fields and four methods:
public static final long serialVersionUID=0L;
private final ClosureDemo $outer;
private final int j$1;
private final scala.runtime.IntRef k$1;
public final int apply(int)
public int apply$mcII$sp(int)
public final java.lang.Object apply(java.lang.Object)
public ClosureDemo$$anonfun$1(ClosureDemo, int, scala.runtime.IntRef)
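Again we start with the constructor. It first checks whether its first argument is null: if it is, a NullPointerException is thrown; otherwise the argument is stored in the $outer field. It then stores j: Int and k: IntRef in the fields j$1 and k$1.
Since apply simply calls apply$mcII$sp(int), we go on to analyze apply$mcII$sp(int). It first calls the ClosureDemo$$i$1 method of ClosureDemo to get the value of i, then reads the Int field j$1, then reads the elem value of the IntRef field k$1, and returns the sum of these and the argument ii.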
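From this example we can conclude:
1. When a closure calls a method of the enclosing scope, the enclosing class instance is stored in the closure's $outer field, and the method is invoked on $outer with invokevirtual
2. When a closure uses an outer val, the value is simply stored in a correspondingly named field and read directly when needed
3. When a closure uses an outer var of a value type (AnyVal), a matching Ref object (IntRef here) is created to wrap it and stored in a field; for a reference type (AnyRef), an ObjectRef is created instead. The original value is read through the elem field.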
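The three rules above can be sketched as a hand-written Java approximation of the second example. This is not compiler output. `IntCell`, `Outer` and `AddClosure` are hypothetical names standing in for scala.runtime.IntRef, ClosureDemo and ClosureDemo$$anonfun$1.

```java
// Hypothetical stand-in for scala.runtime.IntRef.
final class IntCell {
    int elem;
    IntCell(int elem) { this.elem = elem; }
}

// Plays the role of the enclosing ClosureDemo class.
final class Outer {
    // Counterpart of the lifted local method `def i = 2`.
    int i() { return 2; }

    // Mirrors func(): j is copied, k is boxed, and k changes after the closure is built.
    AddClosure func() {
        int j = 3;
        IntCell k = new IntCell(4);
        AddClosure add = new AddClosure(this, j, k);
        k.elem = k.elem + 1; // visible to add, because add holds the same cell
        return add;
    }
}

// Plays the role of ClosureDemo$$anonfun$1 in the second example.
final class AddClosure {
    private final Outer outer; // rule 1: enclosing instance, used to call i()
    private final int j;       // rule 2: val captured by value
    private final IntCell k;   // rule 3: var captured through a cell

    AddClosure(Outer outer, int j, IntCell k) {
        // Same effect as the null check at the start of the generated constructor.
        if (outer == null) throw new NullPointerException();
        this.outer = outer;
        this.j = j;
        this.k = k;
    }

    int apply(int ii) { return ii + outer.i() + j + k.elem; }
}
```

With these definitions, `new Outer().func().apply(1)` evaluates to 11: the argument 1, plus 2 from i(), plus 3 from j, plus 5 from k, since k was incremented after the closure had been created.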
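This post only covered closures with a single level of nesting. Scala also allows closures nested many levels deep; the only difference is that each level, when it needs to, stores its nearest enclosing object in its $outer field. If you are interested, construct an example yourself and look at its class files.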
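As a rough illustration of that chain, here is a Java sketch with two levels. It is an assumption about the general shape only and has not been checked against real compiler output. `Enclosing`, `OuterClosure` and `InnerClosure` are hypothetical names.

```java
// Plays the role of the enclosing class.
final class Enclosing {
    int base() { return 2; }
}

// Outer closure: keeps the enclosing instance in its outer field.
final class OuterClosure {
    final Enclosing outer;
    OuterClosure(Enclosing outer) { this.outer = outer; }
    // Applying the outer closure builds the inner one, passing itself as the inner's outer.
    InnerClosure apply() { return new InnerClosure(this); }
}

// Inner closure: keeps only its nearest enclosing object, the outer closure.
final class InnerClosure {
    final OuterClosure outer;
    InnerClosure(OuterClosure outer) { this.outer = outer; }
    // Reaches the enclosing class by following the chain of outer references.
    int apply(int x) { return x + outer.outer.base(); }
}
```

Here `new OuterClosure(new Enclosing()).apply().apply(5)` evaluates to 7, with the inner closure reaching base() through two outer references.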