Reposted from: http://www.cnblogs.com/futuredo/archive/2012/10/19/2727204.html
Erlang/OTP R15B02
In R12B, the most natural way to write binary construction and matching is now significantly faster than in earlier releases.
To construct a binary, you can simply write:
DO (in R12B) / REALLY DO NOT (in earlier releases)
my_list_to_binary(List) ->
    my_list_to_binary(List, <<>>).

my_list_to_binary([H|T], Acc) ->
    my_list_to_binary(T, <<Acc/binary,H>>);
my_list_to_binary([], Acc) ->
    Acc.
(%% Translator's note: each list element is taken out and appended to the binary %%)
In releases before R12B, Acc would be copied in every iteration. In R12B, Acc is copied only in the first iteration, and extra space is allocated at the end of the copied binary. In the next iteration, H is written into the extra space. When the extra space runs out, the binary is reallocated with more extra space.
The extra space allocated (or reallocated) will be twice the size of the existing binary data, or 256, whichever is larger.
The most natural way to match binaries is now the fastest:
DO (in R12B)
my_binary_to_list(<<H,T/binary>>) ->
    [H|my_binary_to_list(T)];
my_binary_to_list(<<>>) -> [].
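As a quick sanity check, the two functions above can be put in one module and round-tripped. This is only a sketch: the module name bin_roundtrip and the helper roundtrip/1 are made up for illustration, and the check says nothing about speed, only that the results agree.

```erlang
-module(bin_roundtrip).
-export([my_list_to_binary/1, my_binary_to_list/1, roundtrip/1]).

%% Construction by appending to an accumulator, as in the text.
my_list_to_binary(List) ->
    my_list_to_binary(List, <<>>).

my_list_to_binary([H|T], Acc) ->
    my_list_to_binary(T, <<Acc/binary,H>>);
my_list_to_binary([], Acc) ->
    Acc.

%% Matching one byte at a time, as in the text.
my_binary_to_list(<<H,T/binary>>) ->
    [H|my_binary_to_list(T)];
my_binary_to_list(<<>>) -> [].

%% Hypothetical helper: a list of bytes should survive the round trip.
roundtrip(List) ->
    my_binary_to_list(my_list_to_binary(List)).
```

Each list element must be an integer in the range 0..255, because `<<Acc/binary,H>>` stores H as a single byte.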
How binaries are implemented
(%% Translator's note: this part covers internal mechanisms; it is not easy to follow, and the terminology is easy to confuse in translation %%)
Internally, binaries and bitstrings are implemented in the same way. In this section, we will call them binaries since that is what they are called in the emulator source code.
There are four types of binary objects internally. Two of them are containers for binary data and two of them are merely references to a part of a binary.
(%% Translator's note: the two containers are refc binaries and heap binaries %%)
The binary containers are called refc binaries (short for reference-counted binaries) and heap binaries.
Refc binaries consist of two parts: an object stored on the process heap, called a ProcBin, and the binary object itself stored outside all process heaps.
The binary object can be referenced by any number of ProcBins from any number of processes; the object contains a reference counter to keep track of the number of references, so that it can be removed when the last reference disappears.
All ProcBin objects in a process are part of a linked list, so that the garbage collector can keep track of them and decrement the reference counters in the binary when a ProcBin disappears.
Heap binaries are small binaries, up to 64 bytes, that are stored directly on the process heap. They will be copied when the process is garbage collected and when they are sent as a message. They don't require any special handling by the garbage collector.
There are two types of reference objects that can reference part of a refc binary or heap binary. They are called sub binaries and match contexts.
(%% Translator's note: the two reference types are sub binaries and match contexts %%)
A sub binary is created by split_binary/2 and when a binary is matched out in a binary pattern. A sub binary is a reference into a part of another binary (refc or heap binary, never into another sub binary). Therefore, matching out a binary is relatively cheap because the actual binary data is never copied.
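One way to observe that a sub binary refers to a larger binary rather than holding a copy is binary:referenced_byte_size/1 from the standard binary module (that function is not mentioned in the text above; it is brought in here only for illustration). The module name sub_bin_demo and the function sizes/2 are hypothetical.

```erlang
-module(sub_bin_demo).
-export([sizes/2]).

%% Build a binary of Total bytes, split off its first N bytes, and return
%% {size of the part, size of the binary that the part refers to}.
sizes(Total, N) when is_integer(Total), is_integer(N), N >= 0, N =< Total ->
    Big = binary:copy(<<0>>, Total),
    {Head, _Tail} = split_binary(Big, N),
    {byte_size(Head), binary:referenced_byte_size(Head)}.
```

For a call such as sub_bin_demo:sizes(1000, 10), the first element is 10; the second is expected to be the size of the underlying binary, not of the 10-byte part, since split_binary/2 does not copy the data.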
A match context is similar to a sub binary, but is optimized for binary matching; for instance, it contains a direct pointer to the binary data. For each field that is matched out of a binary, the position in the match context will be incremented.
In R11B, a match context was only used during a binary matching operation.
In R12B, the compiler tries to avoid generating code that creates a sub binary, only to shortly afterwards create a new match context and discard the sub binary. Instead of creating a sub binary, the match context is kept.
The compiler can only do this optimization if it can know for sure that the match context will not be shared. If it would be shared, the functional properties (also called referential transparency) of Erlang would break.
(%% Translator's note: I do not understand how sharing a match context affects the optimization %%)
(%%
Ruan Yifeng's blog post "A First Look at Functional Programming" explains referential transparency: a function's result does not depend on external variables or "state", only on its input arguments; whenever the arguments are the same, the function returns the same value. In other kinds of languages, a function's return value often depends on system state, so it differs from one state to another. This is called "referential opacity", and it makes program behavior hard to observe and understand.
Source: http://www.ruanyifeng.com/blog/2012/04/functional_programming.html
%%)
Constructing binaries
In R12B, appending to a binary or bitstring
<<Binary/binary, ...>>
<<Binary/bitstring, ...>>
is specially optimized by the run-time system. Because the run-time system handles the optimization (instead of the compiler), there are very few circumstances in which the optimization will not work.
To explain how it works, we will go through this code
Bin0 = <<0>>, %% 1
Bin1 = <<Bin0/binary,1,2,3>>, %% 2
Bin2 = <<Bin1/binary,4,5,6>>, %% 3
Bin3 = <<Bin2/binary,7,8,9>>, %% 4
Bin4 = <<Bin1/binary,17>>, %% 5 !!!
{Bin4,Bin3} %% 6
line by line.
The first line (marked with the %% 1 comment) assigns a heap binary to the variable Bin0.
The second line is an append operation. Since Bin0 has not been involved in an append operation, a new refc binary will be created and the contents of Bin0 will be copied into it. The ProcBin part of the refc binary will have its size set to the size of the data stored in the binary, while the binary object will have extra space allocated. The size of the binary object will be either twice the size of Bin0 or 256, whichever is larger. In this case it will be 256.
It gets more interesting in the third line. Bin1 has been used in an append operation, and it has 252 bytes of unused storage at the end (the 256-byte binary object minus the 4 bytes of Bin1), so the three new bytes will be stored there.
(%% Translator's note: the guide as published says 255 here; why 255? %%)
Same thing in the fourth line. There are 249 bytes left, so there is no problem storing another three bytes.
But in the fifth line something interesting happens. Note that we don't append to the previous result in Bin3, but to Bin1. We expect that Bin4 will be assigned the value <<0,1,2,3,17>>. We also expect that Bin3 will retain its value (<<0,1,2,3,4,5,6,7,8,9>>). Clearly, the run-time system cannot write the byte 17 into the binary directly after the four bytes of Bin1, because that would change the value of Bin3 to <<0,1,2,3,17,5,6,7,8,9>>.
(%% Translator's note: the guide as published gives <<0,1,2,3,4,17,6,7,8,9>> here; why would 17 not be written at the position of 4? %%)
What will happen?
The run-time system will see that Bin1 is the result from a previous append operation (not from the latest append operation), so it will copy the contents of Bin1 to a new binary and reserve extra storage and so on. (We will not explain here how the run-time system can know that it is not allowed to write into Bin1; it is left as an exercise to the curious reader to figure out how it is done by reading the emulator sources, primarily erl_bits.c.)
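The six lines discussed above can be wrapped in a function to confirm the values that the walkthrough expects. This sketch (module name append_demo and function run/0 are made up) only checks the observable results; it cannot show where copying happens, and the compiler is free to precompute constant binaries like these.

```erlang
-module(append_demo).
-export([run/0]).

%% Same sequence as in the text; returns {Bin4, Bin3}.
run() ->
    Bin0 = <<0>>,
    Bin1 = <<Bin0/binary,1,2,3>>,
    Bin2 = <<Bin1/binary,4,5,6>>,
    Bin3 = <<Bin2/binary,7,8,9>>,
    Bin4 = <<Bin1/binary,17>>,   %% appends to Bin1, not to Bin3
    {Bin4,Bin3}.
```

Whatever the run-time system does internally, Bin3 must keep all ten bytes and Bin4 must be the four bytes of Bin1 followed by 17.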
Circumstances in which you force copying
The optimization of the binary append operation requires that there is a single ProcBin and a single reference to the ProcBin for the binary. The reason is that the binary object can be moved (reallocated) during an append operation, and when that happens the pointer in the ProcBin must be updated. If there were more than one ProcBin pointing to the binary object, it would not be possible to find and update all of them.
(%% Translator's note: as mentioned earlier, ProcBins are kept in a linked list; if one pointer can be updated, why can the pointers in the other ProcBins not be updated too? %%)
Therefore, certain operations on a binary will mark it so that any future append operation will be forced to copy the binary. In most cases, the binary object will be shrunk at the same time to reclaim the extra space allocated for growing.
When appending to a binary
Bin = <<Bin0,...>>
only the binary returned from the latest append operation will support further cheap append operations. In the code fragment above, appending to Bin will be cheap, while appending to Bin0 will force the creation of a new binary and copying of the contents of Bin0.
If a binary is sent as a message to a process or port, the binary will be shrunk and any further append operation will copy the binary data into a new binary. For instance, in the following code fragment
Bin1 = <<Bin0,...>>,
PortOrPid ! Bin1,
Bin = <<Bin1,...>> %% Bin1 will be COPIED
Bin1 will be copied in the third line.
The same thing happens if you insert a binary into an ets table or send it to a port using erlang:port_command/2.
Matching a binary will also cause it to shrink and the next append operation will copy the binary data:
Bin1 = <<Bin0,...>>,
<<X,Y,Z,T/binary>> = Bin1,
Bin = <<Bin1,...>> %% Bin1 will be COPIED
The reason is that a match context contains a direct pointer to the binary data.
(%% Translator's note: I do not quite follow this explanation; a pointer makes it possible to copy the data it points to, but why must the data be copied? %%)
If a process simply keeps binaries (either in "loop data" or in the process dictionary), the garbage collector may eventually shrink the binaries. If only one such binary is kept, it will not be shrunk. If the process later appends to a binary that has been shrunk, the binary object will be reallocated to make place for the data to be appended.
Matching binaries
We will revisit the example shown earlier
DO (in R12B)
my_binary_to_list(<<H,T/binary>>) ->
    [H|my_binary_to_list(T)];
my_binary_to_list(<<>>) -> [].
to see what is happening under the hood.
The very first time my_binary_to_list/1 is called, a match context will be created. The match context will point to the first byte of the binary. One byte will be matched out and the match context will be updated to point to the second byte in the binary.
In R11B, at this point a sub binary would be created. In R12B, the compiler sees that there is no point in creating a sub binary, because there will soon be a call to a function (in this case, to my_binary_to_list/1 itself) that will immediately create a new match context and discard the sub binary.
Therefore, in R12B, my_binary_to_list/1 will call itself with the match context instead of with a sub binary. The instruction that initializes the matching operation will basically do nothing when it sees that it was passed a match context instead of a binary.
When the end of the binary is reached and the second clause matches, the match context will simply be discarded (removed in the next garbage collection, since there is no longer any reference to it).
To summarize, my_binary_to_list/1 in R12B only needs to create one match context and no sub binaries. In R11B, if the binary contains N bytes, N+1 match contexts and N sub binaries will be created.
(%% Translator's note: the sub binaries are easy to understand, but why N+1 match contexts? Is the extra one for the nonexistent position past the end? %%)
In R11B, the fastest way to match binaries is:
DO NOT (in R12B)
my_complicated_binary_to_list(Bin) ->
    my_complicated_binary_to_list(Bin, 0).

my_complicated_binary_to_list(Bin, Skip) ->
    case Bin of
        <<_:Skip/binary,Byte,_/binary>> ->
            [Byte|my_complicated_binary_to_list(Bin, Skip+1)];
        <<_:Skip/binary>> ->
            []
    end.
This function cleverly avoids building sub binaries, but it cannot avoid building a match context in each recursion step. Therefore, in both R11B and R12B, my_complicated_binary_to_list/1 builds N+1 match contexts. (In a future release, the compiler might be able to generate code that reuses the match context, but don't hold your breath.)
(%% Translator's note: I still do not quite understand how building sub binaries is avoided; is it because the compiler sees that a match context will be created and therefore drops the sub binary? %%)
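The two matching styles differ only in what they cost, not in what they return. The following sketch puts both in one module so that this can be checked; the module name match_compare and the shortened function names simple/1, skipping/1 and same_result/1 are made up for illustration and stand in for my_binary_to_list/1 and my_complicated_binary_to_list/1.

```erlang
-module(match_compare).
-export([simple/1, skipping/1, same_result/1]).

%% Straightforward style: match one byte, recurse on the rest.
simple(<<H,T/binary>>) ->
    [H|simple(T)];
simple(<<>>) -> [].

%% R11B style: always match against the whole binary, skipping
%% Skip bytes to reach the next byte of interest.
skipping(Bin) ->
    skipping(Bin, 0).

skipping(Bin, Skip) ->
    case Bin of
        <<_:Skip/binary,Byte,_/binary>> ->
            [Byte|skipping(Bin, Skip+1)];
        <<_:Skip/binary>> ->
            []
    end.

%% Hypothetical helper: both styles should produce the same list.
same_result(Bin) ->
    simple(Bin) =:= skipping(Bin).
```

Note that the second style never binds the tail of the binary to a variable, which is why no sub binary needs to be built.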
Returning to my_binary_to_list/1, note that the match context was discarded when the entire binary had been traversed. What happens if the iteration stops before it has reached the end of the binary? Will the optimization still work?
after_zero(<<0,T/binary>>) ->
    T;
after_zero(<<_,T/binary>>) ->
    after_zero(T);
after_zero(<<>>) ->
    <<>>.
Yes, it will. The compiler will remove the building of the sub binary in the second clause
.
.
.
after_zero(<<_,T/binary>>) ->
    after_zero(T);
.
.
.
but will generate code that builds a sub binary in the first clause
after_zero(<<0,T/binary>>) ->
    T;
.
.
.
Therefore, after_zero/1 will build one match context and one sub binary (assuming it is passed a binary that contains a zero byte).
(%% Translator's note: this relies on the compiler's analysis %%)
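To make the behavior of after_zero/1 concrete, here it is as a compilable module (the module name after_zero_demo is made up) with a few sample calls described afterwards. The sketch shows only the return values; how many sub binaries are built is not observable this way.

```erlang
-module(after_zero_demo).
-export([after_zero/1]).

%% Return everything after the first zero byte,
%% or an empty binary if there is no zero byte.
after_zero(<<0,T/binary>>) ->
    T;
after_zero(<<_,T/binary>>) ->
    after_zero(T);
after_zero(<<>>) ->
    <<>>.
```

For the input <<1,2,0,3,4>> the first clause matches on the third byte and the remaining <<3,4>> is returned; that returned tail is the one sub binary mentioned in the text. For <<1,2>> no zero byte is found and the third clause returns <<>>.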
Code like the following will also be optimized:
all_but_zeroes_to_list(Buffer, Acc, 0) ->
    {lists:reverse(Acc),Buffer};
all_but_zeroes_to_list(<<0,T/binary>>, Acc, Remaining) ->
    all_but_zeroes_to_list(T, Acc, Remaining-1);
all_but_zeroes_to_list(<<Byte,T/binary>>, Acc, Remaining) ->
    all_but_zeroes_to_list(T, [Byte|Acc], Remaining-1).
The compiler will remove building of sub binaries in the second and third clauses, and it will add an instruction to the first clause that will convert Buffer from a match context to a sub binary (or do nothing if Buffer already is a binary).
(%% Translator's note: is this the same principle again? The compiler determines that a match context will be used, so the sub binary is only created at the very end %%)
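A small usage sketch may help in reading all_but_zeroes_to_list/3: it consumes Remaining bytes from the buffer, collects the non-zero ones, and returns them together with the unread rest. The module name zeroes_demo is made up for illustration.

```erlang
-module(zeroes_demo).
-export([all_but_zeroes_to_list/3]).

%% Read Remaining bytes from the buffer, dropping zero bytes.
%% Returns {NonZeroBytesInOrder, RestOfBuffer}.
all_but_zeroes_to_list(Buffer, Acc, 0) ->
    {lists:reverse(Acc),Buffer};
all_but_zeroes_to_list(<<0,T/binary>>, Acc, Remaining) ->
    all_but_zeroes_to_list(T, Acc, Remaining-1);
all_but_zeroes_to_list(<<Byte,T/binary>>, Acc, Remaining) ->
    all_but_zeroes_to_list(T, [Byte|Acc], Remaining-1).
```

With the buffer <<1,0,2,3,4>>, an empty accumulator and Remaining set to 3, the bytes 1, 0 and 2 are consumed and the call returns {[1,2],<<3,4>>}. If the buffer holds fewer than Remaining bytes, no clause matches and the call fails with function_clause.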
Before you begin to think that the compiler can optimize any binary patterns, here is a function that the compiler (currently, at least) is not able to optimize:
non_opt_eq([H|T1], <<H,T2/binary>>) ->
    non_opt_eq(T1, T2);
non_opt_eq([_|_], <<_,_/binary>>) ->
    false;
non_opt_eq([], <<>>) ->
    true.
It was briefly mentioned earlier that the compiler can only delay creation of sub binaries if it can be sure that the binary will not be shared. In this case, the compiler cannot be sure.
(%% Translator's note: creation can only be delayed when there is no sharing; I still do not quite understand the mechanism %%)
We will soon show how to rewrite non_opt_eq/2 so that the delayed sub binary optimization can be applied, and more importantly, we will show how you can find out whether your code can be optimized.
The bin_opt_info option
Use the bin_opt_info option to have the compiler print a lot of information about binary optimizations. It can be given either to the compiler or erlc
erlc +bin_opt_info Mod.erl
or passed via an environment variable
export ERL_COMPILER_OPTIONS=bin_opt_info
Note that bin_opt_info is not meant to be a permanent option added to your Makefiles, because it is not possible to eliminate all messages that it generates. Therefore, passing the option through the environment is in most cases the most practical approach.
(%% Translator's note: my translation of this is awkward; presumably the point is that this is easier than changing every bin_opt_info setting in the Makefiles %%)
The warnings will look like this:
./efficiency_guide.erl:60: Warning: NOT OPTIMIZED: sub binary is used or returned
./efficiency_guide.erl:62: Warning: OPTIMIZED: creation of sub binary delayed
To make it clearer exactly what code the warnings refer to, in the examples that follow, the warnings are inserted as comments after the clause they refer to:
after_zero(<<0,T/binary>>) ->
    %% NOT OPTIMIZED: sub binary is used or returned
    T;
after_zero(<<_,T/binary>>) ->
    %% OPTIMIZED: creation of sub binary delayed
    after_zero(T);
after_zero(<<>>) ->
    <<>>.
The warning for the first clause tells us that it is not possible to delay the creation of a sub binary, because it will be returned. The warning for the second clause tells us that a sub binary will not be created (yet).
It is time to revisit the earlier example of the code that could not be optimized and find out why:
non_opt_eq([H|T1], <<H,T2/binary>>) ->
    %% INFO: matching anything else but a plain variable to
    %%    the left of binary pattern will prevent delayed
    %%    sub binary optimization;
    %%    SUGGEST changing argument order
    %% NOT OPTIMIZED: called function non_opt_eq/2 does not
    %%    begin with a suitable binary matching instruction
    non_opt_eq(T1, T2);
non_opt_eq([_|_], <<_,_/binary>>) ->
    false;
non_opt_eq([], <<>>) ->
    true.
The compiler emitted two warnings. The INFO warning refers to the function non_opt_eq/2 as a callee, indicating that any functions that call non_opt_eq/2 will not be able to make delayed sub binary optimization. There is also a suggestion to change argument order. The second warning (that happens to refer to the same line) refers to the construction of the sub binary itself.
We will soon show another example that should make the distinction between INFO and NOT OPTIMIZED warnings somewhat clearer, but first we will heed the suggestion to change argument order:
opt_eq(<<H,T1/binary>>, [H|T2]) ->
    %% OPTIMIZED: creation of sub binary delayed
    opt_eq(T1, T2);
opt_eq(<<_,_/binary>>, [_|_]) ->
    false;
opt_eq(<<>>, []) ->
    true.
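Changing the argument order does not change what the function computes. As a sketch, here is opt_eq/2 in a compilable module (the module name opt_eq_demo is made up) so that its results can be checked.

```erlang
-module(opt_eq_demo).
-export([opt_eq/2]).

%% Compare a binary with a list of bytes, element by element.
%% The binary comes first so that matching starts with a binary pattern.
opt_eq(<<H,T1/binary>>, [H|T2]) ->
    opt_eq(T1, T2);
opt_eq(<<_,_/binary>>, [_|_]) ->
    false;
opt_eq(<<>>, []) ->
    true.
```

As written in the text, the function covers inputs of equal length; if the binary and the list have different lengths and agree up to the shorter one, no clause matches and the call fails with function_clause.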
The compiler gives a warning for the following code fragment:
match_body([0|_], <<H,_/binary>>) ->
    %% INFO: matching anything else but a plain variable to
    %%    the left of binary pattern will prevent delayed
    %%    sub binary optimization;
    %%    SUGGEST changing argument order
    done;
.
.
.
The warning means that if there is a call to match_body/2 (from another clause in match_body/2 or another function), the delayed sub binary optimization will not be possible. There will be additional warnings for any place where a sub binary is matched out at the end of a binary and passed as the second argument to match_body/2. For instance:
(%% Translator's note: the suggestion is that, to get the optimization, the binary argument should be placed on the left %%)
match_head(List, <<_:10,Data/binary>>) ->
    %% NOT OPTIMIZED: called function match_body/2 does not
    %%    begin with a suitable binary matching instruction
    match_body(List, Data).
Unused variables
The compiler itself figures out if a variable is unused. The same code is generated for each of the following functions
count1(<<_,T/binary>>, Count) -> count1(T, Count+1);
count1(<<>>, Count) -> Count.
count2(<<H,T/binary>>, Count) -> count2(T, Count+1);
count2(<<>>, Count) -> Count.
count3(<<_H,T/binary>>, Count) -> count3(T, Count+1);
count3(<<>>, Count) -> Count.
In each iteration, the first 8 bits in the binary will be skipped, not matched out.
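A short check that the variants count the same way. This sketch (module name count_demo is made up) includes only count1/2 and count3/2; count2/2 behaves the same but makes the compiler warn about the unused variable H, so it is left out here.

```erlang
-module(count_demo).
-export([count1/2, count3/2]).

%% Anonymous variable for the skipped byte.
count1(<<_,T/binary>>, Count) -> count1(T, Count+1);
count1(<<>>, Count) -> Count.

%% Underscore-prefixed name for the skipped byte.
count3(<<_H,T/binary>>, Count) -> count3(T, Count+1);
count3(<<>>, Count) -> Count.
```

Both functions return the number of bytes in the binary added to the initial Count.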