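Background: In the early database design the primary key was a UUID, but this is a large-data application. After a long stretch of tinkering, the table grew past 100 million rows. Looking back at the table, I found that many of its fields are varchar2 but hold no more than 20 characters, and that most of the space is taken up by the UUID itself. That is where the idea of reworking the UUID came from.
Process: After some searching, the usual answer is to shorten the UUID to 22 characters.
One author got from short URLs to the idea of re-encoding the UUID in base 64: http://www.iteye.com/topic/1028058. The method is to remove the "-" characters from the UUID string, append one "0" to get a 33-digit hexadecimal number, and then represent that number with 22 base-64 digits.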
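The hex-based idea described above can be sketched roughly as follows. This is only an illustration: the class name `HexUuidShortener`, the method name `shorten`, and the URL-safe alphabet are assumptions, and the linked post may order its characters differently.

```java
import java.util.UUID;

public class HexUuidShortener {
    // Assumed alphabet (URL-safe Base64 order); the linked post may use another one.
    private static final char[] ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_".toCharArray();

    /** Strips the hyphens, appends one "0", and maps every 3 hex digits to 2 base-64 digits. */
    static String shorten(UUID uuid) {
        String hex = uuid.toString().replace("-", "") + "0"; // 32 + 1 = 33 hex digits
        StringBuilder sb = new StringBuilder(22);
        for (int i = 0; i < hex.length(); i += 3) {
            int v = Integer.parseInt(hex.substring(i, i + 3), 16); // 3 hex digits = 12 bits
            sb.append(ALPHABET[v >>> 6]);  // upper 6 bits
            sb.append(ALPHABET[v & 0x3f]); // lower 6 bits
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(shorten(UUID.randomUUID()));
    }
}
```

The arithmetic behind the extra "0": 33 hex digits carry 33 × 4 = 132 bits, which is exactly 22 × 6 bits, so the padded value splits cleanly into 22 base-64 digits.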
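Another author shortened the UUID with Base64 instead: http://cengjingdemimang.iteye.com/blog/1022149. Reading the code, it mainly takes the UUID's mostSigBits and leastSigBits, shifts them into bytes, and then uses the Base64 algorithm to turn the 16 bytes of the two long values into characters. After some more searching I found the Base64 code this author probably used directly: http://blog.csdn.net/lastsweetop/article/details/5314640 (^_^ that is, code used as-is without any wrapping).
Next I compared how fast the two approaches generate a 22-character UUID. The second one is somewhat faster, since it involves fewer string operations, so I went with the second approach. But the more I read it, the more confused I got, because it contains this odd piece of code: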
for (int i = 8; i < 16; i++) { buffer[i] = (byte) (lsb >>> 8 * (7 - i)); }
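That is a right shift by a negative amount. After some experimenting I reached a rough conclusion: a right shift by a negative amount still yields a usable number. The reason is that Java uses only the low-order bits of the shift distance: the low 5 bits for an int (distance & 31) and the low 6 bits for a long (distance & 63). In the loop above lsb is a long, so the shifts by -8, -16, ..., -64 behave like shifts by 56, 48, ..., 0.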
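int a = 3 << 28;               // 3 * 2^28 (in Java, ^ is XOR, not a power operator)
System.out.println(a >>> 28);  // prints 3
System.out.println(a >>> -4);  // prints 3, because -4 & 31 == 28
int b = 3 << 4;                // 3 * 2^4
System.out.println(b >>> 4);   // prints 3
System.out.println(b >>> -28); // prints 3, because -28 & 31 == 4 (still a right shift, not a disguised left shift)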
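To see why the loop with negative shift distances still picks the right bytes, the masking rule can be checked directly. The sketch below is only a demonstration; the class and helper names are made up for illustration, and the masks (low 5 bits for int, low 6 bits for long) are the rule from the Java Language Specification's section on shift operators.

```java
public class ShiftDistanceDemo {
    /** Shift distance Java actually applies when the left operand is an int. */
    static int intShiftDistance(int n) {
        return n & 0x1f;
    }

    /** Shift distance Java actually applies when the left operand is a long. */
    static int longShiftDistance(int n) {
        return n & 0x3f;
    }

    public static void main(String[] args) {
        int a = 3 << 28;
        // Both expressions shift by 28, so the comparison holds.
        System.out.println((a >>> -4) == (a >>> intShiftDistance(-4)));

        long lsb = 0x0102030405060708L;
        for (int i = 8; i < 16; i++) {
            int shift = 8 * (7 - i); // -8, -16, ..., -64
            byte viaNegative = (byte) (lsb >>> shift);
            byte viaMasked = (byte) (lsb >>> longShiftDistance(shift));
            System.out.println(shift + " is applied as " + longShiftDistance(shift)
                    + ", same byte: " + (viaNegative == viaMasked));
        }
    }
}
```

So the original loop is correct, but writing the distance as `8 * (15 - i)` would express the same thing without relying on the masking.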
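After that I studied the second author's method alongside Base64. Roughly, it converts mostSigBits and leastSigBits into a 16-byte array, splits every 3 bytes into four 6-bit groups, and maps each group to a base-64 character. Writing it my own way, I ended up with the following code.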
//import java.util.Date;
import java.util.UUID;

public class UUID22 {

    /**
     * URL-safe Base64 characters, i.e. "+/" replaced with "-_".
     * Index 64 ('=') is the padding character.
     */
    static private char[] alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_=".toCharArray();

    /**
     * Base64 encoding.
     * @param data the bytes to encode
     * @return the encoded characters, including '=' padding
     */
    private static char[] encode(byte[] data) {
        char[] out = new char[((data.length + 2) / 3) * 4];
        boolean quad, trip;
        for (int i = 0, index = 0; i < data.length; i += 3, index += 4) {
            quad = trip = false;
            int val = (0xFF & (int) data[i]);
            val <<= 8;
            if ((i + 1) < data.length) {
                val |= (0xFF & (int) data[i + 1]);
                trip = true;
            }
            val <<= 8;
            if ((i + 2) < data.length) {
                val |= (0xFF & (int) data[i + 2]);
                quad = true;
            }
            out[index + 3] = alphabet[(quad ? (val & 0x3F) : 64)];
            val >>= 6;
            out[index + 2] = alphabet[(trip ? (val & 0x3F) : 64)];
            val >>= 6;
            out[index + 1] = alphabet[val & 0x3F];
            val >>= 6;
            out[index + 0] = alphabet[val & 0x3F];
        }
        return out;
    }

    /**
     * Converts a random UUID to bytes.
     * @return the 16 bytes of the UUID, most significant byte first
     */
    // private static byte[] toBytes(String u) {
    //     UUID uuid = UUID.fromString(u);
    private static byte[] toBytes() {
        UUID uuid = UUID.randomUUID();
        long msb = uuid.getMostSignificantBits();
        long lsb = uuid.getLeastSignificantBits();
        byte[] buffer = new byte[16];
        for (int i = 0; i < 8; i++) {
            buffer[i] = (byte) ((msb >>> 8 * (7 - i)) & 0xFF);
            buffer[i + 8] = (byte) ((lsb >>> 8 * (7 - i)) & 0xFF);
        }
        return buffer;
    }

    // public static String getUUID(String u) {
    //     char[] res = encode(toBytes(u));
    public static String getUUID() {
        char[] res = encode(toBytes());
        System.out.println(new String(res)); // debug output: the padded 24-character form
        return new String(res, 0, res.length - 2);
    }

    public static void main(String[] args) {
        System.out.println(getUUID22());
        // System.out.println(getUUID("c19b9de1-f33a-494b-afbe-f06817218d64"));
        // System.out.println(getUUID22("c19b9de1-f33a-494b-afbe-f06817218d64"));
        // Date d1 = new Date();
        // for(int i = 0; i < 1000000; i++) {
        //     UUID.randomUUID().toString();
        //     getUUID22();
        // }
        // Date d2 = new Date();
        // System.out.print(d2.getTime() - d1.getTime());
    }

    /**
     * Converts a random UUID to a 22-character string.
     * @return the 22-character URL-safe Base64 form of a new random UUID
     */
    // public static String getUUID22(String u) {
    //     UUID uuid = UUID.fromString(u);
    public static String getUUID22() {
        UUID uuid = UUID.randomUUID();
        // System.out.println(uuid.toString());
        long msb = uuid.getMostSignificantBits();
        long lsb = uuid.getLeastSignificantBits();
        char[] out = new char[24];
        int tmp = 0, idx = 0;
        // Basic version
        /*tmp = (int) ((msb >>> 40) & 0xffffff);
        out[idx + 3] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 2] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 1] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx] = alphabet[tmp & 0x3f];
        idx += 4;
        tmp = (int) ((msb >>> 16) & 0xffffff);
        out[idx + 3] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 2] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 1] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx] = alphabet[tmp & 0x3f];
        idx += 4;
        tmp = (int) (((msb & 0xffff) << 8) | (lsb >>> 56 & 0xff));
        out[idx + 3] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 2] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 1] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx] = alphabet[tmp & 0x3f];
        idx += 4;
        tmp = (int) ((lsb >>> 32) & 0xffffff);
        out[idx + 3] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 2] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 1] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx] = alphabet[tmp & 0x3f];
        idx += 4;
        tmp = (int) ((lsb >>> 8) & 0xffffff);
        out[idx + 3] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 2] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx + 1] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx] = alphabet[tmp & 0x3f];
        idx += 4;
        tmp = (int) (lsb & 0xff);
        out[idx + 3] = alphabet[64];
        out[idx + 2] = alphabet[64];
        tmp <<= 4;
        out[idx + 1] = alphabet[tmp & 0x3f];
        tmp >>= 6;
        out[idx] = alphabet[tmp & 0x3f];
        idx += 4;*/
        // Loop version
        // bit: index of the first byte of the current 3-byte group
        // bt1 / bt2: bytes of msb / lsb not yet consumed
        int bit = 0, bt1 = 8, bt2 = 8;
        int mask = 0x00, offsetm = 0, offsetl = 0;
        for (; bit < 16; bit += 3, idx += 4) {
            offsetm = 64 - (bit + 3) * 8;
            offsetl = 0;
            tmp = 0;
            if (bt1 > 3) {
                mask = (1 << 8 * 3) - 1;
            } else if (bt1 >= 0) {
                mask = (1 << 8 * bt1) - 1;
                bt2 -= 3 - bt1;
            } else {
                mask = (1 << 8 * ((bt2 > 3) ? 3 : bt2)) - 1;
                bt2 -= 3;
            }
            if (bt1 > 0) {
                bt1 -= 3;
                tmp = (int) ((offsetm < 0) ? msb : (msb >>> offsetm) & mask);
                if (bt1 < 0) {
                    // The group straddles msb and lsb: make room for the lsb bytes.
                    tmp <<= Math.abs(offsetm);
                    mask = (1 << 8 * Math.abs(bt1)) - 1;
                }
            }
            if (offsetm < 0) {
                offsetl = 64 + offsetm;
                tmp |= ((offsetl < 0) ? lsb : (lsb >>> offsetl)) & mask;
            }
            if (bit == 15) {
                // Last group holds a single byte: two data characters plus two padding characters.
                out[idx + 3] = alphabet[64];
                out[idx + 2] = alphabet[64];
                tmp <<= 4;
            } else {
                out[idx + 3] = alphabet[tmp & 0x3f];
                tmp >>= 6;
                out[idx + 2] = alphabet[tmp & 0x3f];
                tmp >>= 6;
            }
            out[idx + 1] = alphabet[tmp & 0x3f];
            tmp >>= 6;
            out[idx] = alphabet[tmp & 0x3f];
        }
        return new String(out, 0, 22);
    }
}
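In this code, the lines commented out with // are test code and a few notes. The block commented out with /* */ is the simplest way to write it and is equivalent to the for-loop version below it. I rewrote it as a loop only because the basic version did not look "elegant" to me.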
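The core step that both versions repeat, turning one 3-byte group into four 6-bit characters, may be easier to follow in isolation. The sketch below is a standalone illustration, not part of the class above; the class name and `encodeGroup` are hypothetical.

```java
public class ThreeBytesToFourChars {
    private static final char[] ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_".toCharArray();

    /** Packs 3 bytes into a 24-bit value and emits its four 6-bit digits, highest first. */
    static String encodeGroup(byte b0, byte b1, byte b2) {
        int val = ((b0 & 0xff) << 16) | ((b1 & 0xff) << 8) | (b2 & 0xff);
        char[] out = new char[4];
        for (int i = 3; i >= 0; i--) {
            out[i] = ALPHABET[val & 0x3f]; // lowest 6 bits go to the rightmost free slot
            val >>>= 6;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        // The bytes of "Man" are the classic Base64 example.
        System.out.println(encodeGroup((byte) 'M', (byte) 'a', (byte) 'n')); // prints "TWFu"
    }
}
```

A UUID has 16 bytes, so five such groups cover 15 bytes and the last byte is handled separately: its 8 bits are shifted left by 4 to fill two characters, which is what the `tmp <<= 4` branch does.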
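In testing, the reworked getUUID22 method generates its string faster than the native UUID.randomUUID().toString(). Over 1,000,000 iterations, the native call took roughly 1850+ ms and the reworked method roughly 1040+ ms.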
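Both figures above include the cost of UUID.randomUUID() itself, which uses a cryptographically strong random generator. One rough way to look at the formatting cost alone is to generate the UUIDs first and time only the conversion. The harness below is a sketch under that assumption: `FormatTiming` and `timeMillis` are hypothetical names, it has no JIT warm-up, and a proper benchmark tool such as JMH would give more trustworthy numbers.

```java
import java.util.UUID;
import java.util.function.Function;

public class FormatTiming {
    /** Returns the elapsed milliseconds for applying the formatter to every UUID. */
    static long timeMillis(UUID[] ids, Function<UUID, String> formatter) {
        long sink = 0;
        long start = System.nanoTime();
        for (UUID id : ids) {
            sink += formatter.apply(id).length(); // use the result so the call is not skipped
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        if (sink < 0) {
            throw new IllegalStateException("unreachable: lengths are never negative");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        UUID[] ids = new UUID[1_000_000];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = UUID.randomUUID(); // generation happens outside the timed section
        }
        System.out.println("toString: " + timeMillis(ids, UUID::toString) + " ms");
        // A 22-character formatter that accepts a UUID could be timed the same way.
    }
}
```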
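The reason is probably that toString() builds its result by concatenating strings with "-", which slows it down. Concatenating strings with + is generally not very efficient.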
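Result: Reworking the code was a bit of a struggle. Writing the for loop, I kept getting stuck at one step or another. In the end I "made myself dumb", imitated how the computer thinks, went one step at a time, and got it written. I just don't know whether there is a better way to write it. If there is, please let me know.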
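One possibly simpler alternative, assuming Java 8 or later is available, is to let the standard java.util.Base64 class do the bit work instead of hand-written shifts. This swaps in a library encoder rather than reproducing the code above; the class name `Uuid22Codec` and its method names are hypothetical. Because the alphabet and bit layout are the same URL-safe Base64, the output should match getUUID22 for the same UUID, but that has not been checked against the original class here.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

public class Uuid22Codec {
    /** Encodes the 16 bytes of a UUID as 22 URL-safe Base64 characters (no padding). */
    static String encode(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16); // big-endian by default
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buf.array());
    }

    /** Restores the UUID from its 22-character form. */
    static UUID decode(String s) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getUrlDecoder().decode(s));
        long msb = buf.getLong();
        long lsb = buf.getLong();
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("c19b9de1-f33a-494b-afbe-f06817218d64");
        String shortId = encode(id);
        System.out.println(shortId + " -> " + decode(shortId));
    }
}
```

The decode direction is worth having in a database setting, since it lets the 22-character key be turned back into the standard 36-character form when needed.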