Android Performance Tips

Translation (by Chikeong):

 

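This article mainly covers small optimizations that, used together, can improve an app's overall performance, but don't expect these changes to produce dramatic performance gains. You should put more effort into choosing suitable algorithms and data structures, though that is outside the scope of this article. To write efficient code, build these tips into your everyday coding habits.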

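There are two basic rules for writing efficient code:

  • Don't do work you don't need to do.
  • Avoid allocating memory whenever you can.

One of the trickiest problems when micro-optimizing an Android app is that the app will run on many different kinds of hardware. Different versions of the VM run on different processors at different speeds. You generally cannot just say "device X is faster than device Y by some factor" and scale that result from one device to another. In particular, measurements on the emulator tell you very little about performance on a real device. There can also be huge differences between devices with and without a JIT: the best code for a device with a JIT is not always the best code for a device without one.

To ensure your app runs well across a wide variety of devices, make sure your code is efficient at every level and optimize it as aggressively as you can.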
 

Avoid Creating Unnecessary Objects


 

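Object creation is never free. A generational garbage collector with per-thread allocation pools for temporary objects can make allocation cheaper, but an operation that allocates memory is always more expensive than one that does not.

As you allocate more objects, you force periodic garbage collection, which creates small "hiccups" in the user experience. The concurrent garbage collector introduced in Android 2.3 helps, but unnecessary work should still be avoided.

So avoid creating object instances you don't need. Some examples: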

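  • If you have a method that returns a String, and you know its result will always be appended to a StringBuffer, change the method's signature and implementation so that the method does the append directly instead of creating a temporary object.
  • When extracting strings from a set of input data, try to return a substring of the original data rather than creating a copy. You will create a new String object, but it will share the underlying char[] with the original data. (The trade-off is that even if you use only a small part of the original data, all of it stays in memory.)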

A more radical idea is to slice multidimensional arrays into parallel one-dimensional arrays:

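  • An array of ints is much better than an array of Integer objects, and this generalizes: two parallel arrays of ints are also much more efficient than an array of (int,int) objects. The same holds for any combination of primitive types.
  • If you need to implement a container that stores (Foo,Bar) tuples, remember that two parallel Foo[] and Bar[] arrays are generally much more efficient than a single array of custom (Foo,Bar) objects. (The exception, of course, is when you are designing an API for other code to use; there it is usually worth sacrificing a little speed for a good API design. In your own internal code, though, you should be as efficient as you can.)

Generally speaking, avoid creating short-lived temporary objects when you can. Fewer objects created means less frequent garbage collection, which directly affects the user experience.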

 

 

Prefer Static Over Virtual


 

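If your method does not use the object's fields (member variables), make it static. Invocations will be about 15%-20% faster. It is also good practice, because the method signature tells you that calling the method cannot change the object's state.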

 

Use Static Final For Constants


 

Suppose a class declares these fields:

static int intVal = 42;
static String strVal = "Hello, world!";

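The compiler generates a class initializer method, <clinit>, which is executed when the class is first used. The method stores 42 into intVal and extracts a reference from the classfile string constant table for strVal. When these values are used later, they are accessed with field lookups.

We can improve on this with the final keyword:

static final int intVal = 42;
static final String strVal = "Hello, world!";

The class no longer needs a <clinit> method, because the constants go into static field initializers in the dex file. Code that refers to intVal uses the integer value 42 directly, and accesses to strVal use a relatively inexpensive "string constant" instruction instead of a field lookup.

Note: This optimization applies only to primitive types and String constants, not to arbitrary types. Still, it is good practice to declare constants static final whenever possible.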

 

Avoid Internal Getters/Setters


 

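In native languages such as C++, it is common practice to use getters (i = getCount()) instead of accessing the field directly (i = mCount). This is a good habit in C++, and it is often carried over to other object-oriented languages such as C# and Java, because the compiler can usually inline the access, and if you need to restrict or debug field access you can add the code at any time.

However, this is a bad idea on Android. Method calls are much more expensive than field lookups. It is reasonable to follow common object-oriented practice and provide getters and setters in the public interface, but within a class you should always access fields directly.

Without a JIT, direct field access is about 3x faster than invoking a getter. With the JIT (where direct field access is as cheap as accessing a local), direct field access is about 7x faster than invoking a getter.

Note: if you are using ProGuard, you can have the best of both worlds, because ProGuard can inline the accessors for you.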

Use Enhanced For Loop Syntax


 

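The enhanced for loop (the "for-each" loop) can be used with collections that implement the Iterable interface and with arrays. With collections, an iterator is allocated to make interface calls to hasNext() and next(). With an ArrayList, a hand-written counted loop is about 3x faster than an iterator or for-each loop (with or without a JIT), but for other collections the enhanced for loop is equivalent to explicit iterator usage.

In general, you should use the enhanced for loop by default, but consider a hand-written counted loop for ArrayList iteration.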

 

Consider Package Instead of Private Access with Private Inner Classes


 

Consider the following class definition:

public class Foo {
    private class Inner {
        void stuff() {
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    private int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    private void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}

 

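What matters here is that we define a private inner class (Foo$Inner) whose method calls a private method and reads a private instance field of the outer class. This is legal, and the code prints "Value is 27" as expected.

The problem is that the VM treats direct access to Foo's private members from Foo$Inner as illegal, because Foo and Foo$Inner are different classes, even though the Java language allows an inner class to access its outer class's private members. To bridge the gap, the compiler generates a couple of synthetic methods:

/*package*/ static int Foo.access$100(Foo foo) {
    return foo.mValue;
}
/*package*/ static void Foo.access$200(Foo foo, int value) {
    foo.doStuff(value);
}

The inner class calls these static methods whenever it needs to access the mValue field or invoke the doStuff() method of the outer class. In other words, the code above boils down to accessing member fields through accessor methods. We noted earlier that accessors are slower than direct field access, so this is an example of a language idiom causing an "invisible" performance hit.

If you use code like this in a performance hotspot, you can avoid the overhead by giving the fields and methods accessed by inner classes package access rather than private access. Unfortunately, this means other classes in the same package can access them directly, so you should not do this in a public API.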

 

Avoid Using Floating-Point


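As a rule of thumb, floating-point is about 2x slower than integer on Android devices.

In terms of speed, there is no difference between float and double on modern hardware. In terms of space, double is twice as large. As on desktop machines, assuming space is not an issue, you should prefer double to float.

Also, even for integers, some processors have hardware multiply but lack hardware divide. In such cases, integer division and modulus are performed in software, which is worth keeping in mind when you design a hash table or do a lot of math.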

Know and Use the Libraries


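In addition to all the usual reasons to prefer library code over writing your own, bear in mind that the system is free to replace calls to library methods with hand-coded assembler, which may be better than the best code the JIT can produce for the equivalent Java. The typical example is String.indexOf() and related APIs, which Dalvik replaces with an inlined intrinsic. Similarly, System.arraycopy() is about 9x faster than a hand-coded loop on a Nexus One with the JIT.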

 

Use Native Methods Carefully


 


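An Android app developed with native code using the Android NDK is not necessarily more efficient than one written in the Java language. For one thing, there is a cost associated with the Java-native transition, and the JIT cannot optimize across that boundary. If you allocate native resources (memory on the native heap, file descriptors, or anything else), arranging timely collection of those resources can be significantly more difficult. You also need to compile your code for each architecture you want to run on, rather than relying on a JIT. You may even need to compile multiple versions for what you consider the same architecture: native code compiled for the ARM processor in the G1 cannot take full advantage of the ARM in the Nexus One, and code compiled for the ARM in the Nexus One will not run on the ARM in the G1.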

Performance Myths


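On devices without a JIT, invoking methods through a variable with an exact type rather than an interface is indeed slightly more efficient. (For example, it is cheaper to invoke methods on a HashMap map than on a Map map, even when the map is a HashMap in both cases.) But the difference is not 2x; it is more like 6%. Furthermore, the JIT makes the two effectively indistinguishable.

On devices without a JIT, caching a field in a local variable is about 20% faster than repeatedly accessing the field. With a JIT, field access costs about the same as local access, so this is not a worthwhile optimization unless you feel it makes your code easier to read. (This is true of final, static, and static final fields too.)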

 

Always Measure


 

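Before you start optimizing, make sure you have a problem that needs solving. Make sure you can accurately measure your existing performance; otherwise you will not be able to measure the benefit of the alternatives you try.

Every claim made in this article is backed by a benchmark. The source for these benchmarks can be found in the code.google.com "dalvik" project.

The benchmarks are built with Caliper, a microbenchmarking framework for Java. Microbenchmarks are hard to get right, so Caliper does the hard work for you and even detects some cases where you are not measuring what you think you are measuring (for example, because the VM has optimized all your code away). We highly recommend using Caliper to run your own microbenchmarks.

You may also find Traceview useful for profiling, but it is important to realize that it currently disables the JIT, which may cause it to misattribute time to code that the JIT could win back. After making changes suggested by Traceview data, it is especially important to confirm that the resulting code actually runs faster when run without Traceview.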

 

 

 

 

 

 

Original text: from Android developers

Performance Tips

This document primarily covers micro-optimizations that can improve overall app performance when combined, but it's unlikely that these changes will result in dramatic performance effects. Choosing the right algorithms and data structures should always be your priority, but is outside the scope of this document. You should use the tips in this document as general coding practices that you can incorporate into your habits for general code efficiency.

  

 

There are two basic rules for writing efficient code:

  • Don't do work that you don't need to do.
  • Don't allocate memory if you can avoid it.

One of the trickiest problems you'll face when micro-optimizing an Android app is that your app is certain to be running on multiple types of hardware. Different versions of the VM run on different processors at different speeds. It's not even generally the case that you can simply say "device X is a factor F faster/slower than device Y", and scale your results from one device to others. In particular, measurement on the emulator tells you very little about performance on any device. There are also huge differences between devices with and without a JIT: the best code for a device with a JIT is not always the best code for a device without.

To ensure your app performs well across a wide variety of devices, ensure your code is efficient at all levels and aggressively optimize your performance.

Avoid Creating Unnecessary Objects


Object creation is never free. A generational garbage collector with per-thread allocation pools for temporary objects can make allocation cheaper, but allocating memory is always more expensive than not allocating memory.

As you allocate more objects in your app, you will force a periodic garbage collection, creating little "hiccups" in the user experience. The concurrent garbage collector introduced in Android 2.3 helps, but unnecessary work should always be avoided.

 

 

Thus, you should avoid creating object instances you don't need to. Some examples of things that can help:

  • If you have a method returning a string, and you know that its result will always be appended to a StringBuffer anyway, change your signature and implementation so that the function does the append directly, instead of creating a short-lived temporary object.
  • When extracting strings from a set of input data, try to return a substring of the original data, instead of creating a copy. You will create a new String object, but it will share the char[] with the data. (The trade-off being that if you're only using a small part of the original input, you'll be keeping it all around in memory anyway if you go this route.)
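
The first suggestion above can be sketched roughly as follows. The class and method names are hypothetical, and the sketch uses StringBuilder where the text mentions StringBuffer, on the assumption that the buffer is used from a single thread:

```java
final class RecordFormatter {
    // Allocates a temporary String that the caller would append and then discard.
    static String describe(String name, int value) {
        return name + "=" + value;
    }

    // Appends straight into the caller's builder, so no intermediate String is created.
    static void appendDescription(StringBuilder out, String name, int value) {
        out.append(name).append('=').append(value);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        appendDescription(sb, "width", 42);
        sb.append(';');
        appendDescription(sb, "height", 7);
        System.out.println(sb);
    }
}
```

The trade-off is a slightly less convenient signature: callers must own a builder, which is usually acceptable for internal helpers.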

A somewhat more radical idea is to slice up multidimensional arrays into parallel one-dimensional arrays:

  • An array of ints is much better than an array of Integer objects, but this also generalizes to the fact that two parallel arrays of ints are also a lot more efficient than an array of (int,int) objects. The same goes for any combination of primitive types.
  • If you need to implement a container that stores tuples of (Foo,Bar) objects, try to remember that two parallel Foo[] and Bar[] arrays are generally much better than a single array of custom (Foo,Bar) objects. (The exception to this, of course, is when you're designing an API for other code to access. In those cases, it's usually better to make a small compromise to the speed in order to achieve a good API design. But in your own internal code, you should try and be as efficient as possible.)
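
As a minimal sketch of the parallel-array idea (PointStore and its methods are hypothetical names, not from the original document), two int arrays can replace an array of small pair objects, so adding an element allocates nothing:

```java
final class PointStore {
    // Parallel primitive arrays: index i in both arrays describes one (x, y) pair.
    private final int[] xs;
    private final int[] ys;
    private int size;

    PointStore(int capacity) {
        xs = new int[capacity];
        ys = new int[capacity];
    }

    // Stores one pair without allocating an object; throws if capacity is exceeded.
    void add(int x, int y) {
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    // Example traversal over both arrays in lockstep.
    long sumOfProducts() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += (long) xs[i] * ys[i];
        }
        return total;
    }
}
```

This layout is best kept behind an internal class like the one above, so that the rest of the code does not depend on it.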

Generally speaking, avoid creating short-term temporary objects if you can. Fewer objects created mean less-frequent garbage collection, which has a direct impact on user experience.

Prefer Static Over Virtual


If you don't need to access an object's fields, make your method static. Invocations will be about 15%-20% faster. It's also good practice, because you can tell from the method signature that calling the method can't alter the object's state.
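
A small sketch of this rule, using a hypothetical Volume class: the helper that touches no fields is declared static, while the methods that read or write instance state stay as instance methods.

```java
final class Volume {
    private int level;

    // Writes instance state, so this stays an instance method.
    void set(int requested) {
        level = clamp(requested, 0, 100);
    }

    int level() {
        return level;
    }

    // Uses only its arguments, so it can be static; the signature also
    // documents that it cannot change any Volume's state.
    static int clamp(int value, int min, int max) {
        return value < min ? min : (value > max ? max : value);
    }
}
```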

Use Static Final For Constants


Consider the following declaration at the top of a class:

static int intVal = 42;
static String strVal = "Hello, world!";

The compiler generates a class initializer method, called <clinit>, that is executed when the class is first used. The method stores the value 42 into intVal, and extracts a reference from the classfile string constant table for strVal. When these values are referenced later on, they are accessed with field lookups.

We can improve matters with the "final" keyword:

static final int intVal = 42;
static final String strVal = "Hello, world!";

The class no longer requires a <clinit> method, because the constants go into static field initializers in the dex file. Code that refers to intVal will use the integer value 42 directly, and accesses to strVal will use a relatively inexpensive "string constant" instruction instead of a field lookup.

Note: This optimization applies only to primitive types and String constants, not arbitrary reference types. Still, it's good practice to declare constants static final whenever possible.
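
To illustrate the note (the class and field names here are hypothetical), the first two fields below are compile-time constants, while the array is an ordinary reference type and is still set up by the class initializer when the class is first used:

```java
final class Limits {
    // Primitive constant: code that uses it gets the value 3 directly.
    static final int MAX_RETRIES = 3;

    // String constant: also covered by the optimization described above.
    static final String TAG = "Limits";

    // Reference type: declaring it static final is still good practice,
    // but the array itself is created in the class initializer.
    static final int[] BACKOFF_MS = {100, 200, 400};
}
```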

Avoid Internal Getters/Setters


In native languages like C++ it's common practice to use getters (i = getCount()) instead of accessing the field directly (i = mCount). This is an excellent habit for C++ and is often practiced in other object oriented languages like C# and Java, because the compiler can usually inline the access, and if you need to restrict or debug field access you can add the code at any time.

However, this is a bad idea on Android. Virtual method calls are expensive, much more so than instance field lookups. It's reasonable to follow common object-oriented programming practices and have getters and setters in the public interface, but within a class you should always access fields directly.

Without a JIT, direct field access is about 3x faster than invoking a trivial getter. With the JIT (where direct field access is as cheap as accessing a local), direct field access is about 7x faster than invoking a trivial getter.

Note that if you're using ProGuard, you can have the best of both worlds because ProGuard can inline accessors for you.
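
A minimal sketch of the distinction, with a hypothetical Counter class: the getter remains part of the public interface, but code inside the class reads mCount directly.

```java
final class Counter {
    private int mCount;

    // Public accessor for callers outside the class.
    public int getCount() {
        return mCount;
    }

    public void increment() {
        mCount++;
    }

    // Internal code reads the field directly instead of calling getCount().
    public boolean isEven() {
        return (mCount & 1) == 0;
    }
}
```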

Use Enhanced For Loop Syntax


The enhanced for loop (also sometimes known as "for-each" loop) can be used for collections that implement the Iterable interface and for arrays. With collections, an iterator is allocated to make interface calls to hasNext() and next(). With an ArrayList, a hand-written counted loop is about 3x faster (with or without JIT), but for other collections the enhanced for loop syntax will be exactly equivalent to explicit iterator usage.

There are several alternatives for iterating through an array:

static class Foo {
    int mSplat;
}

Foo[] mArray = ...

public void zero() {
    int sum = 0;
    for (int i = 0; i < mArray.length; ++i) {
        sum += mArray[i].mSplat;
    }
}

public void one() {
    int sum = 0;
    Foo[] localArray = mArray;
    int len = localArray.length;

    for (int i = 0; i < len; ++i) {
        sum += localArray[i].mSplat;
    }
}

public void two() {
    int sum = 0;
    for (Foo a : mArray) {
        sum += a.mSplat;
    }
}

zero() is slowest, because the JIT can't yet optimize away the cost of getting the array length once for every iteration through the loop.

one() is faster. It pulls everything out into local variables, avoiding the lookups. Only the array length offers a performance benefit.

two() is fastest for devices without a JIT, and indistinguishable from one() for devices with a JIT. It uses the enhanced for loop syntax introduced in version 1.5 of the Java programming language.

So, you should use the enhanced for loop by default, but consider a hand-written counted loop for performance-critical ArrayList iteration.

Tip: Also see Josh Bloch's Effective Java, item 46.
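
The array examples above do not show the ArrayList case, so here is a rough sketch of both loop styles over a list. The class and method names are hypothetical, and the speed difference quoted in the text is not something this snippet measures:

```java
import java.util.ArrayList;
import java.util.List;

final class ListLengths {
    // Hand-written counted loop: no Iterator is allocated, and the size is read once.
    static int totalLengthCounted(ArrayList<String> items) {
        int total = 0;
        int size = items.size();
        for (int i = 0; i < size; i++) {
            total += items.get(i).length();
        }
        return total;
    }

    // Enhanced for loop: the default choice, works for any Iterable.
    static int totalLengthForEach(List<String> items) {
        int total = 0;
        for (String s : items) {
            total += s.length();
        }
        return total;
    }
}
```

Both methods return the same result; the counted form is only worth its extra noise in code that profiling has shown to be hot.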

Consider Package Instead of Private Access with Private Inner Classes


Consider the following class definition:

public class Foo {
    private class Inner {
        void stuff() {
            Foo.this.doStuff(Foo.this.mValue);
        }
    }

    private int mValue;

    public void run() {
        Inner in = new Inner();
        mValue = 27;
        in.stuff();
    }

    private void doStuff(int value) {
        System.out.println("Value is " + value);
    }
}

What's important here is that we define a private inner class (Foo$Inner) that directly accesses a private method and a private instance field in the outer class. This is legal, and the code prints "Value is 27" as expected.

The problem is that the VM considers direct access to Foo's private members from Foo$Inner to be illegal because Foo and Foo$Inner are different classes, even though the Java language allows an inner class to access an outer class' private members. To bridge the gap, the compiler generates a couple of synthetic methods:

/*package*/ static int Foo.access$100(Foo foo) {
    return foo.mValue;
}
/*package*/ static void Foo.access$200(Foo foo, int value) {
    foo.doStuff(value);
}

The inner class code calls these static methods whenever it needs to access the mValue field or invoke the doStuff() method in the outer class. What this means is that the code above really boils down to a case where you're accessing member fields through accessor methods. Earlier we talked about how accessors are slower than direct field accesses, so this is an example of a certain language idiom resulting in an "invisible" performance hit.

If you're using code like this in a performance hotspot, you can avoid the overhead by declaring fields and methods accessed by inner classes to have package access, rather than private access. Unfortunately this means the fields can be accessed directly by other classes in the same package, so you shouldn't use this in public API.
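
A sketch of that change applied to the earlier example. The class is renamed PackageAccessFoo and the methods return the value so the result can be checked; both are assumptions made for illustration, not part of the original code:

```java
final class PackageAccessFoo {
    private class Inner {
        int stuff() {
            // Both members have package access, so the inner class can reach
            // them directly under the behaviour described above.
            return PackageAccessFoo.this.doStuff(PackageAccessFoo.this.mValue);
        }
    }

    // Package access (no modifier) instead of private.
    int mValue;

    public int run() {
        Inner in = new Inner();
        mValue = 27;
        return in.stuff();
    }

    // Package access instead of private.
    int doStuff(int value) {
        System.out.println("Value is " + value);
        return value;
    }
}
```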

Avoid Using Floating-Point


As a rule of thumb, floating-point is about 2x slower than integer on Android-powered devices.

In speed terms, there's no difference between float and double on the more modern hardware. Space-wise, double is 2x larger. As with desktop machines, assuming space isn't an issue, you should prefer double to float.

Also, even for integers, some processors have hardware multiply but lack hardware divide. In such cases, integer division and modulus operations are performed in software, something to think about if you're designing a hash table or doing lots of math.
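
One common way to act on the hash table remark, offered here as an illustration rather than something the original document prescribes, is to keep the table size a power of two so the bucket index can be computed with a bit mask instead of a modulus. The names below are hypothetical:

```java
final class BucketIndex {
    // General case: works for any positive table size, but uses integer modulus.
    static int byModulus(int hash, int tableSize) {
        return (hash & 0x7fffffff) % tableSize;
    }

    // Assumes tableSize is a power of two; a single AND replaces the modulus.
    static int byMask(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }
}
```

For power-of-two sizes both methods select the same bucket, since each keeps only the low bits of the hash.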

 

Know and Use the Libraries


In addition to all the usual reasons to prefer library code over rolling your own, bear in mind that the system is at liberty to replace calls to library methods with hand-coded assembler, which may be better than the best code the JIT can produce for the equivalent Java. The typical example here is String.indexOf() and related APIs, which Dalvik replaces with an inlined intrinsic. Similarly, the System.arraycopy() method is about 9x faster than a hand-coded loop on a Nexus One with the JIT.
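
For reference, the two ways of copying an array that the comparison refers to look roughly like this (ArrayCopyDemo is a hypothetical name; the snippet shows the calls, not the timing):

```java
final class ArrayCopyDemo {
    // Hand-coded loop: copies one element per iteration.
    static int[] copyByLoop(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Library call: arguments are (source, sourcePos, dest, destPos, length).
    static int[] copyByLibrary(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }
}
```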

Use Native Methods Carefully


Developing your app with native code using the Android NDK isn't necessarily more efficient than programming with the Java language. For one thing, there's a cost associated with the Java-native transition, and the JIT can't optimize across these boundaries. If you're allocating native resources (memory on the native heap, file descriptors, or whatever), it can be significantly more difficult to arrange timely collection of these resources. You also need to compile your code for each architecture you wish to run on (rather than rely on it having a JIT). You may even have to compile multiple versions for what you consider the same architecture: native code compiled for the ARM processor in the G1 can't take full advantage of the ARM in the Nexus One, and code compiled for the ARM in the Nexus One won't run on the ARM in the G1.

Native code is primarily useful when you have an existing native codebase that you want to port to Android, not for "speeding up" parts of your Android app written with the Java language.

If you do need to use native code, you should read our JNI Tips.

Performance Myths


On devices without a JIT, it is true that invoking methods via a variable with an exact type rather than an interface is slightly more efficient. (So, for example, it was cheaper to invoke methods on a HashMap map than a Map map, even though in both cases the map was a HashMap.) It was not the case that this was 2x slower; the actual difference was more like 6% slower. Furthermore, the JIT makes the two effectively indistinguishable.

On devices without a JIT, caching field accesses is about 20% faster than repeatedly accessing the field. With a JIT, field access costs about the same as local access, so this isn't a worthwhile optimization unless you feel it makes your code easier to read. (This is true of final, static, and static final fields too.)

 

Always Measure


Before you start optimizing, make sure you have a problem that you need to solve. Make sure you can accurately measure your existing performance, or you won't be able to measure the benefit of the alternatives you try.

Every claim made in this document is backed up by a benchmark. The source to these benchmarks can be found in the code.google.com "dalvik" project.

The benchmarks are built with the Caliper microbenchmarking framework for Java. Microbenchmarks are hard to get right, so Caliper goes out of its way to do the hard work for you, and even detect some cases where you're not measuring what you think you're measuring (because, say, the VM has managed to optimize all your code away). We highly recommend you use Caliper to run your own microbenchmarks.

You may also find Traceview useful for profiling, but it's important to realize that it currently disables the JIT, which may cause it to misattribute time to code that the JIT may be able to win back. It's especially important after making changes suggested by Traceview data to ensure that the resulting code actually runs faster when run without Traceview.
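
For a quick before/after sanity check, a rough timing helper along the following lines can be enough to see whether a change moved anything at all. This is only a sketch with hypothetical names, it is not Caliper, and it does none of the work a real microbenchmark framework does to guard against misleading results:

```java
final class RoughTimer {
    // Runs the task warmUp times untimed, then returns the average
    // nanoseconds per run over the timed runs. runs must be positive.
    static long averageNanos(Runnable task, int warmUp, int runs) {
        for (int i = 0; i < warmUp; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        final int[] data = new int[10_000];
        // The result is stored so the summing work cannot simply be discarded.
        final long[] sink = new long[1];
        long nanos = averageNanos(() -> {
            long sum = 0;
            for (int v : data) {
                sum += v;
            }
            sink[0] += sum;
        }, 1_000, 10_000);
        System.out.println("approx ns per run: " + nanos + " (sink=" + sink[0] + ")");
    }
}
```

Numbers from a helper like this vary from run to run and from device to device, so treat them as a hint and confirm anything important with a proper benchmark on real hardware.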
