A Partial Source Code Analysis of Integer and Long

Date: 2019-12-14

Tags: integer, long, partial source code analysis


Integer and Long are used very widely in Java. Here I mainly analyze the Integer.toString(int i) and Long.toString(long i) methods; the remaining methods are fairly easy to understand.

Of Integer.toString(int i) and Long.toString(long i), take Integer.toString(int i) as the example. First, the source:

    /**
     * Returns a {@code String} object representing the
     * specified integer. The argument is converted to signed decimal
     * representation and returned as a string, exactly as if the
     * argument and radix 10 were given as arguments to the {@link
     * #toString(int, int)} method.
     *
     * @param   i   an integer to be converted.
     * @return  a string representation of the argument in base&nbsp;10.
     */
    public static String toString(int i) {
        if (i == Integer.MIN_VALUE)
            return "-2147483648";
        int size = (i < 0) ? stringSize(-i) + 1 : stringSize(i);
        char[] buf = new char[size];
        getChars(i, size, buf);
        return new String(buf, true);
    }

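stringSize is called to compute the length of i, that is, its number of digits (plus one for the sign when i is negative), so that a character array buf of exactly the right size can be allocated. getChars is then called to fill in buf.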
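The explicit check for Integer.MIN_VALUE in the source above is worth a quick illustration. The following is a minimal sketch (the class and the helper negationOverflows are hypothetical names, not part of the JDK) showing that negating Integer.MIN_VALUE overflows back to a negative value, so stringSize(-i), which requires a positive argument, could not be used for it:

```java
class MinValueSketch {
    // Hypothetical helper: true when i is negative and -i is still negative,
    // which happens only for Integer.MIN_VALUE because of two's-complement overflow.
    static boolean negationOverflows(int i) {
        return i < 0 && -i < 0;
    }

    public static void main(String[] args) {
        System.out.println(negationOverflows(Integer.MIN_VALUE)); // true
        System.out.println(negationOverflows(-42));               // false
        System.out.println(Integer.toString(Integer.MIN_VALUE));  // -2147483648
        System.out.println(Integer.toString(-42).length());       // 3 (two digits plus the sign)
    }
}
```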

stringSize is implemented differently in Integer and in Long. First, the source:

Source of Integer.stringSize(int x):

    final static int [] sizeTable = { 9, 99, 999, 9999, 99999, 999999, 9999999,
                                      99999999, 999999999, Integer.MAX_VALUE };

    // Requires positive x
    static int stringSize(int x) {
        for (int i=0; ; i++)
            if (x <= sizeTable[i])
                return i+1;
    }

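The thresholds are stored in an array, and the length of x is the index, plus 1, of the first entry that is greater than or equal to x. With this design a single short loop of comparisons yields the length, with no division at all, so it is efficient.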
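The table lookup can be tried outside the JDK with a small standalone sketch. The class name StringSizeSketch and the constant name SIZE_TABLE are my own; the thresholds and the loop are the same as in the listing above, and each result is cross-checked against the length of the decimal string:

```java
class StringSizeSketch {
    // Same thresholds as the sizeTable listing: the largest value with 1, 2, ... 10 digits.
    static final int[] SIZE_TABLE = {9, 99, 999, 9999, 99999, 999999, 9999999,
                                     99999999, 999999999, Integer.MAX_VALUE};

    // Requires x >= 0. Returns the number of decimal digits of x.
    static int stringSize(int x) {
        for (int i = 0; ; i++) {
            if (x <= SIZE_TABLE[i]) {
                return i + 1;
            }
        }
    }

    public static void main(String[] args) {
        int[] samples = {0, 9, 10, 12345, 999999999, 1000000000, Integer.MAX_VALUE};
        for (int x : samples) {
            // Print the table-based length next to the length of the decimal string.
            System.out.println(x + " -> " + stringSize(x)
                    + " (string length " + String.valueOf(x).length() + ")");
        }
    }
}
```

The last entry is Integer.MAX_VALUE rather than 9999999999 because the latter does not fit in an int; it also guarantees that the loop terminates for every non-negative argument.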

Source of Long.stringSize(long x):

    // Requires positive x
    static int stringSize(long x) {
        long p = 10;
        for (int i=1; i<19; i++) {
            if (x < p)
                return i;
            p = 10*p;
        }
        return 19;
    }

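Because a long has at most 19 decimal digits, the length is found by repeatedly multiplying by 10. One might ask why the table approach of Integer.stringSize(int x) is not used here; I have not found a satisfactory explanation.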

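The conventional approach would probably be to divide by 10 repeatedly, but that is less efficient, because multiplication is generally faster than division on a computer.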
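To make the comparison concrete, here is a minimal sketch that puts the two strategies side by side. sizeByMultiplying follows the shape of the Long.stringSize listing above; sizeByDividing is a hypothetical helper written only for comparison and is not taken from the JDK. The sketch checks that both agree; it makes no attempt to measure speed:

```java
class LongStringSizeSketch {
    // Multiplication-based digit count, same shape as the Long.stringSize listing.
    // Requires x >= 0.
    static int sizeByMultiplying(long x) {
        long p = 10;
        for (int i = 1; i < 19; i++) {
            if (x < p) {
                return i;
            }
            p = 10 * p;
        }
        return 19; // x >= 10^18, and a long never has more than 19 digits
    }

    // Division-based digit count (hypothetical, for comparison only).
    // Requires x >= 0.
    static int sizeByDividing(long x) {
        int digits = 1;
        while (x >= 10) {
            x /= 10;
            digits++;
        }
        return digits;
    }

    public static void main(String[] args) {
        long[] samples = {0L, 9L, 10L, 999_999_999_999L, 1_000_000_000_000_000_000L, Long.MAX_VALUE};
        for (long x : samples) {
            System.out.println(x + " -> " + sizeByMultiplying(x) + " / " + sizeByDividing(x));
        }
    }
}
```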

Source of getChars(int i, int index, char[] buf):

    /**
     * Places characters representing the integer i into the
     * character array buf. The characters are placed into
     * the buffer backwards starting with the least significant
     * digit at the specified index (exclusive), and working
     * backwards from there.
     *
     * Will fail if i == Integer.MIN_VALUE
     */
    static void getChars(int i, int index, char[] buf) {
        int q, r;
        int charPos = index;
        char sign = 0;

        if (i < 0) {
            sign = '-';
            i = -i;
        }

        // Generate two digits per iteration
        while (i >= 65536) {
            q = i / 100;
            // really: r = i - (q * 100);
            r = i - ((q << 6) + (q << 5) + (q << 2));
            i = q;
            buf [--charPos] = DigitOnes[r];
            buf [--charPos] = DigitTens[r];
        }

        // Fall thru to fast mode for smaller numbers
        // assert(i <= 65536, i);
        for (;;) {
            q = (i * 52429) >>> (16+3);
            r = i - ((q << 3) + (q << 1));  // r = i-(q*10) ...
            buf [--charPos] = digits [r];
            i = q;
            if (i == 0) break;
        }
        if (sign != 0) {
            buf [--charPos] = sign;
        }
    }

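This is the core of the whole conversion. It first determines the sign. Then, while i >= 65536, it divides i by 100 and uses DigitOnes[r] and DigitTens[r] to look up the ones and tens digits of the remainder r. Because division is slow, dividing by 100 yields two digits per division and improves efficiency. DigitOnes and DigitTens are as follows: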

    final static char [] DigitTens = {
        '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
        '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
        '2', '2', '2', '2', '2', '2', '2', '2', '2', '2',
        '3', '3', '3', '3', '3', '3', '3', '3', '3', '3',
        '4', '4', '4', '4', '4', '4', '4', '4', '4', '4',
        '5', '5', '5', '5', '5', '5', '5', '5', '5', '5',
        '6', '6', '6', '6', '6', '6', '6', '6', '6', '6',
        '7', '7', '7', '7', '7', '7', '7', '7', '7', '7',
        '8', '8', '8', '8', '8', '8', '8', '8', '8', '8',
        '9', '9', '9', '9', '9', '9', '9', '9', '9', '9',
        } ;

    final static char [] DigitOnes = {
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        } ;

Suppose r = 34: the table lookup gives DigitOnes[r] = '4' and DigitTens[r] = '3'.
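The two-digits-per-division idea can be sketched in isolation. This is a simplified illustration, not the JDK code: the class TwoDigitsSketch and the method toDecimal are hypothetical, the tables are filled by a loop instead of being written out, the loop runs down to 100 rather than 65536, and it uses plain multiplication instead of shifts. It handles non-negative values only:

```java
class TwoDigitsSketch {
    // Lookup tables equivalent to DigitTens and DigitOnes, built programmatically.
    static final char[] DIGIT_TENS = new char[100];
    static final char[] DIGIT_ONES = new char[100];
    static {
        for (int r = 0; r < 100; r++) {
            DIGIT_TENS[r] = (char) ('0' + r / 10);
            DIGIT_ONES[r] = (char) ('0' + r % 10);
        }
    }

    // Requires i >= 0. Fills the buffer from the end, two digits per division.
    static String toDecimal(int i) {
        char[] buf = new char[10]; // a non-negative int has at most 10 digits
        int charPos = buf.length;
        while (i >= 100) {
            int q = i / 100;
            int r = i - q * 100;   // remainder in 0..99
            i = q;
            buf[--charPos] = DIGIT_ONES[r];
            buf[--charPos] = DIGIT_TENS[r];
        }
        // i is now in 0..99: emit one or two leading digits.
        buf[--charPos] = DIGIT_ONES[i];
        if (i >= 10) {
            buf[--charPos] = DIGIT_TENS[i];
        }
        return new String(buf, charPos, buf.length - charPos);
    }

    public static void main(String[] args) {
        System.out.println(DIGIT_TENS[34] + " " + DIGIT_ONES[34]); // 3 4
        System.out.println(toDecimal(12345));                      // 12345
        System.out.println(toDecimal(Integer.MAX_VALUE));          // 2147483647
    }
}
```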

q = (i * 52429) >>> (16+3); is in essence i / 10 with the fractional part discarded. 2^19 = 524288, and 52429/524288 = 0.10000038146972656. Why choose 52429/524288? The list below makes it clear:

    2^10=1024, 103/1024=0.1005859375
    2^11=2048, 205/2048=0.10009765625
    2^12=4096, 410/4096=0.10009765625
    2^13=8192, 820/8192=0.10009765625
    2^14=16384, 1639/16384=0.10003662109375
    2^15=32768, 3277/32768=0.100006103515625
    2^16=65536, 6554/65536=0.100006103515625
    2^17=131072, 13108/131072=0.100006103515625
    2^18=262144, 26215/262144=0.10000228881835938
    2^19=524288, 52429/524288=0.10000038146972656

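As the list shows, 52429/524288 is the closest to 0.1 among these candidates. For i up to 65536 the product i * 52429 can exceed Integer.MAX_VALUE, but it still fits in 32 bits, which is why the code uses the unsigned shift >>> rather than >>.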
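Since the range of i in the fast path is small, the claim can simply be checked exhaustively. The sketch below (the class MagicDivisionSketch and its methods are hypothetical names) compares the multiply-and-shift result with i / 10 for every i from 0 to a given limit and reports the first value where they differ:

```java
class MagicDivisionSketch {
    // Approximates i / 10 as (i * 52429) >>> 19, as in the fast path of getChars.
    static int divideByTen(int i) {
        return (i * 52429) >>> (16 + 3);
    }

    // Returns the smallest i in 0..limit where the trick differs from i / 10, or -1 if none.
    static int firstMismatch(int limit) {
        for (int i = 0; i <= limit; i++) {
            if (divideByTen(i) != i / 10) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Exact for the whole range the JDK comment assumes (i <= 65536).
        System.out.println(firstMismatch(65536));          // -1
        // The product overflows a signed int here, so a signed shift would go wrong.
        System.out.println(65536 * 52429 < 0);             // true
        System.out.println((65536 * 52429) >>> 19);        // 6553
        // Beyond the intended range the 32-bit product eventually wraps around.
        System.out.println(firstMismatch(100000));         // 81920
    }
}
```

The trick stops working at 81920 because 81920 * 52429 no longer fits in 32 bits; this is consistent with getChars switching to the fast path only once i has dropped below 65536.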

r = i - ((q << 3) + (q << 1)); // r = i-(q*10) ... uses bit shifts instead of multiplication, again to improve efficiency.
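The shift expressions are just multiplications by 10 and by 100 decomposed into powers of two: 10 = 8 + 2 and 100 = 64 + 32 + 4. A minimal sketch to confirm this (ShiftMultiplySketch, timesTen and timesHundred are hypothetical names used only here):

```java
class ShiftMultiplySketch {
    // q * 10 written as 8q + 2q.
    static int timesTen(int q) {
        return (q << 3) + (q << 1);
    }

    // q * 100 written as 64q + 32q + 4q.
    static int timesHundred(int q) {
        return (q << 6) + (q << 5) + (q << 2);
    }

    public static void main(String[] args) {
        int q = 1234;
        System.out.println(timesTen(q));     // 12340
        System.out.println(timesHundred(q)); // 123400
    }
}
```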

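Note: the analysis above is only my personal view (partly based on material found online). If anything is incorrect, I would welcome a discussion.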