聊聊base62與tinyURL

base64你們確定是很熟悉了,那base62是什麼東東,它常被用來作短url的映射。算法

ascii編碼的62個字母數字

Value Encoding  Value Encoding  Value Encoding  Value Encoding
  0 a            17 r            34 I            51 Z
  1 b            18 s            35 J            52 0
  2 c            19 t            36 K            53 1
  3 d            20 u            37 L            54 2
  4 e            21 v            38 M            55 3
  5 f            22 w            39 N            56 4
  6 g            23 x            40 O            57 5
  7 h            24 y            41 P            58 6
  8 i            25 z            42 Q            59 7
  9 j            26 A            43 R            60 8
 10 k            27 B            44 S            61 9
 11 l            28 C            45 T
 12 m            29 D            46 U
 13 n            30 E            47 V
 14 o            31 F            48 W
 15 p            32 G            49 X
 16 q            33 H            50 Y

26個小寫字母+26個大寫字母+10個數字=62app

public static final String BASE_62_CHAR = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    public static final int BASE = BASE_62_CHAR.length();

62進制與十進制的映射

62進制轉10進制

還記得二進制轉十進制的算法麼,從右到左用二進制的每一個數去乘以2的相應次方,次方要從0開始。62進制轉10進制也相似,從右往左每一個數*62的N次方,N從0開始。測試

public static long toBase10(String str) {
        //從右邊開始
        return toBase10(new StringBuilder(str).reverse().toString().toCharArray());
    }

    private static long toBase10(char[] chars) {
        long n = 0;
        int pow = 0;
        for(char item: chars){
            n += toBase10(BASE_62_CHAR.indexOf(item),pow);
            pow++;
        }
        return n;
    }

    private static long toBase10(int n, int pow) {
        return n * (long) Math.pow(BASE, pow);
    }

十進制轉62進制

還記得十進制轉二進制的算法麼,除二取餘,而後倒序排列,高位補零。轉62進制也相似,不斷除以62取餘數,而後倒序。ui

public static String fromBase10(long i) {
        StringBuilder sb = new StringBuilder("");
        if (i == 0) {
            return "a";
        }
        while (i > 0) {
            i = fromBase10(i, sb);
        }
        return sb.reverse().toString();
    }

    private static long fromBase10(long i, final StringBuilder sb) {
        int rem = (int)(i % BASE);
        sb.append(BASE_62_CHAR.charAt(rem));
        return i / BASE;
    }

短url的轉換

主要思路,維護一個全局自增的id,每來一個長url,將其與一個自增id綁定,而後利用base62將該自增id轉換爲base62字符串,即完成轉換。編碼

public class Base62UrlShorter {

    private long autoIncrId = 10000;

    Map<Long, String> longUrlIdMap = new HashMap<Long, String>();

    public long incr(){
        return autoIncrId ++ ;
    }

    public String shorten(String longUrl){
        long id = incr();
        //add to mapping
        longUrlIdMap.put(id,longUrl);
        return Base62.fromBase10(id);
    }

    public String lookup(String shortUrl){
        long id = Base62.toBase10(shortUrl);
        return longUrlIdMap.get(id);
    }
}

測試url

@Test
    public void testLongUrl2Short(){
        Base62UrlShorter shorter= new Base62UrlShorter();
        String longUrl = "https://movie.douban.com/subject/26363254/";
        String shortUrl = shorter.shorten(longUrl);
        System.out.println("short url:"+shortUrl);
        System.out.println(shorter.lookup(shortUrl));
    }

關於容量

自增id爲long型,最大2^64 -1設計

doc

相關文章
相關標籤/搜索