Base64

Date: 2020-07-26


1: Origin of the Base64 algorithm

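Base64 was first used to solve a problem with e-mail transmission. In the early days, for historical reasons, e-mail allowed only ASCII characters. An e-mail containing non-ASCII characters could be damaged when it passed through a gateway with this legacy limitation: the gateway might alter the binary form of each non-ASCII character by setting the highest bit of its 8-bit byte to 0, and the recipient would then receive pure gibberish. Base64 was created to address this.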

2: Definition of the Base64 algorithm

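Base64 is an encoding algorithm based on 64 characters. RFC 2045 (http://www.ietf.org/rfc/rfc2045.txt) describes it this way: "The Base64 Content-Transfer-Encoding is designed to represent arbitrary sequences of octets in a form that need not be humanly readable." Base64-encoded data is longer than the original data, about 4/3 of the original length, and the number of characters in an encoded string is always a multiple of 4.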
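The 4/3 ratio and the multiple-of-4 rule can be checked with a short sketch. It uses the JDK's java.util.Base64 (Java 8 and later) rather than a library from this article, and the class and method names are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64LengthDemo {

    // Padded Base64 length for n input bytes:
    // every started group of 3 bytes becomes 4 output characters.
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "A", "AB", "ABC", "hello world"}) {
            byte[] raw = s.getBytes(StandardCharsets.US_ASCII);
            String encoded = Base64.getEncoder().encodeToString(raw);
            System.out.println(raw.length + " bytes -> " + encoded.length()
                    + " chars (formula: " + encodedLength(raw.length) + ")");
        }
    }
}
```

For short inputs the overhead is larger than 4/3 because of padding; the ratio approaches 4/3 as the input grows.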

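RFC 2045 also specifies that, in e-mail, encoded lines must be no longer than 76 characters and that each line ends with a carriage return and line feed ("\r\n"). A final line shorter than 76 characters is terminated the same way. In practice, this requirement is often ignored, depending on the needs of the application.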
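The line-length rule can be observed with the JDK's MIME encoder. This is a sketch using java.util.Base64, not the Commons Codec API discussed later; the class and method names are illustrative. Note that the JDK encoder does not append a separator after the last line.

```java
import java.util.Base64;

class MimeLineDemo {

    // Encodes with the JDK's MIME encoder, which breaks lines
    // at 76 characters and separates them with "\r\n".
    static String mimeEncode(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] data = new byte[100]; // 100 zero bytes -> 136 Base64 characters
        for (String line : mimeEncode(data).split("\r\n")) {
            System.out.println(line.length() + ": " + line);
        }
    }
}
```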

RFC 2045 gives the following character mapping table:

Base64 character mapping table

  Value    Encoding
  0-25     A-Z
  26-51    a-z
  52-61    0-9
  62       +
  63       /
  (pad)    =

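In this table, Value is the decimal code and Encoding is the corresponding character. The table maps 64 characters in total, which is where Base64 gets its name. The entry after those 64 characters is the equals sign, which is used for padding. This is why a string ending in "=" usually brings Base64 to mind.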

Base64 has a few siblings, Base32 and Base16. To allow binary data to be passed in HTTP requests using the GET method, the URL Base64 algorithm was derived from Base64.

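URL Base64 mainly replaces the characters with values 62 and 63 in the Base64 mapping table, that is, "+" and "/" become "-" and "_". For the padding character "=", one proposal is to use "~" and another is to use ".". The "~" character conflicts with file systems and is not recommended, while two consecutive "." characters are treated as an error. On the padding question, Commons Codec avoids padding characters altogether, while Bouncy Castle uses "." as the padding character.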
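The character substitution is easy to see with two bytes chosen so that the standard output contains both "+" and "/". This sketch uses the JDK's java.util.Base64 URL encoder (which keeps "=" padding unless withoutPadding() is called), not Commons Codec or Bouncy Castle; the class and method names are illustrative.

```java
import java.util.Base64;

class UrlSafeDemo {

    static String standard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Same bits, but values 62 and 63 map to '-' and '_'.
    static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    // URL-safe alphabet with the trailing '=' padding dropped.
    static String urlSafeNoPadding(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFB, (byte) 0xFF};
        System.out.println(standard(data));         // +/8=
        System.out.println(urlSafe(data));          // -_8=
        System.out.println(urlSafeNoPadding(data)); // -_8
    }
}
```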

3: The relationship between Base64 and encryption algorithms

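Base64's encode and decode operations can stand in for encryption and decryption, with the character mapping table playing the role of the key. Base64 borrows from the single-table (monoalphabetic) substitution cipher: the plaintext is converted to binary and then mapped through the character table to produce the "ciphertext". For this reason Base64 is often used as a simple form of "encryption" to protect some data.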

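Strictly speaking, Base64 is not an encryption algorithm. Kerckhoffs's principle requires that security rest on a secret key alone, and here the character mapping table that serves as the key is public, so nothing is secret. Its strength is also too low for it to count among the modern encryption algorithms we accept. Still, looking at it another way: if we change the mapping table into a custom, private one, could that serve as a simple way of protecting data? We come back to this idea at the end of the article.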

4: How Base64 works

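Base64 works on the numeric values that a character encoding (such as ASCII or UTF-8) assigns to the given characters. Encoding proceeds as follows:

  1) Convert the given string, character by character, into its character codes (for example ASCII codes).

  2) Convert the character codes into binary.

  3) Regroup the binary: every group of three 8-bit values becomes a group of four 6-bit values (if fewer than 6 bits remain, the low bits are padded with 0). This only changes the grouping; three 8-bit values and four 6-bit values are both 24 bits long.

  4) Pad each of the four 6-bit values by prepending two 0 bits at the high end, giving four 8-bit values.

  5) Convert the four 8-bit values into decimal.

  6) Convert each decimal value into the corresponding character in the Base64 character table.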
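The six steps can be sketched as a small hand-written encoder. This is an illustration of the procedure, not the implementation used by any library; the class name ManualBase64 is an assumption, and production code should rely on a tested library instead.

```java
import java.nio.charset.StandardCharsets;

class ManualBase64 {

    // Standard Base64 alphabet: index = 6-bit value, character = encoding.
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < data.length; i += 3) {
            int remaining = Math.min(3, data.length - i);
            // Steps 1-3: pack up to three 8-bit bytes into one 24-bit group;
            // missing bytes contribute zero bits (the low-bit padding).
            int group = 0;
            for (int j = 0; j < 3; j++) {
                group <<= 8;
                if (j < remaining) {
                    group |= data[i + j] & 0xFF;
                }
            }
            // Steps 4-6: cut the group into four 6-bit values and look each one up.
            for (int k = 0; k < 4; k++) {
                int index = (group >> (18 - 6 * k)) & 0x3F;
                // n input bytes carry data in n + 1 characters; the rest is '=' padding.
                out.append(k <= remaining ? ALPHABET.charAt(index) : '=');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("A".getBytes(StandardCharsets.UTF_8)));
        System.out.println(encode("hello world".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Masking with 0x3F after the shift is what step 4 describes: the value is held in a wider integer whose high bits are zero.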

4.1: Encoding an ASCII character

We encode the string "A" with Base64 as follows:

  Character           A
  ASCII code          65
  Binary              01000001
  4-6 binary          010000      010000
  4-8 binary          00010000    00010000
  Decimal             16          16
  Mapped character    Q           Q

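Thus the string "A" becomes "QQ==" after Base64 encoding: the single input byte yields only two characters, and two "=" signs pad the output to four characters.

Decoding is the inverse of encoding; running the steps above in reverse easily recovers the original text.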
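Running the steps in reverse can be sketched as follows: each character is mapped back to its 6-bit value, the bits are accumulated, and a byte is emitted whenever 8 bits are available. The class name is an illustrative assumption, and the sketch does no validation beyond rejecting characters outside the standard alphabet.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ManualBase64Decoder {

    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static byte[] decode(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int buffer = 0; // bits collected so far
        int bits = 0;   // number of valid bits in buffer
        for (char c : text.toCharArray()) {
            if (c == '=') {
                break; // padding marks the end of the data
            }
            int value = ALPHABET.indexOf(c);
            if (value < 0) {
                throw new IllegalArgumentException("Not a Base64 character: " + c);
            }
            buffer = (buffer << 6) | value;
            bits += 6;
            if (bits >= 8) {
                bits -= 8;
                out.write((buffer >> bits) & 0xFF); // emit the top 8 bits
                buffer &= (1 << bits) - 1;          // keep only the leftover bits
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(decode("QQ=="), StandardCharsets.UTF_8));
    }
}
```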

4.2: Encoding a non-ASCII character

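Base64 neatly solves the problem of transmitting non-ASCII characters, such as Chinese characters.

Because ASCII covers only a limited range, we use UTF-8 to encode the character here: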

  Character              密
  UTF-8 bytes (signed)   -27         -81         -122
  Binary                 11100101    10101111    10000110
  4-6 binary             111001      011010      111110      000110
  4-8 binary             00111001    00011010    00111110    00000110
  Decimal                57          26          62          6
  Mapped character       5           a           +           G

The string "密" becomes "5a+G" after Base64 encoding. With a different character encoding, the result takes a different form.
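The dependence on the character encoding can be checked directly. The sketch below encodes the same character with UTF-8 and, as an arbitrarily chosen second encoding, UTF-16BE; it uses the JDK's java.util.Base64, and the class name is illustrative. The character is written as the escape \u5bc6 so the source file's own encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class CharsetBase64Demo {

    // Base64 operates on bytes, so the charset decides what gets encoded.
    static String encode(String text, Charset charset) {
        return Base64.getEncoder().encodeToString(text.getBytes(charset));
    }

    public static void main(String[] args) {
        String mi = "\u5bc6"; // the character 密
        System.out.println(encode(mi, StandardCharsets.UTF_8));    // 5a+G
        System.out.println(encode(mi, StandardCharsets.UTF_16BE)); // W8Y=
    }
}
```

UTF-8 produces three bytes (E5 AF 86) for this character and UTF-16BE produces two (5B C6), which is why the second result needs a padding character.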

　　5.Commons Codec http://commons.apache.org/proper/commons-codec/

Apache Commons Codec (TM) software provides implementations of common encoders and decoders such as Base64, Hex, Phonetic and URLs. It implements Base64 in accordance with RFC 2045 and also supports general Base64 without line breaks.

package com.orange.encoder;
import org.apache.commons.codec.binary.Base64;
import java.io.UnsupportedEncodingException;


public class Base64Coder {

    // Character encoding used to convert between strings and bytes
    public  final  static String ENCODING="UTF-8";


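    /**
     * General Base64 encoding; does not apply the RFC 2045 line-length rule (single-line output).
     * @param data data to encode
     * @return   encoded data
     * @throws UnsupportedEncodingException if the ENCODING charset is not supported
     */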
   public static String encode(String data) throws UnsupportedEncodingException {
       byte[] bytes = Base64.encodeBase64(data.getBytes(ENCODING));
       return  new String(bytes,ENCODING);
   }

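    /**
     * Base64 encoding that follows RFC 2045: output is chunked into 76-character lines ending in CRLF.
     * @param data data to encode
     * @return   encoded data
     * @throws UnsupportedEncodingException if the ENCODING charset is not supported
     */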
    public static String encodeSafe(String data) throws UnsupportedEncodingException {
        byte[] bytes = Base64.encodeBase64(data.getBytes(ENCODING),true);
        return  new String(bytes,ENCODING);
    }

    /**
     * Base64 decoding
     * @param data Base64 text to decode
     * @return   decoded text
     * @throws UnsupportedEncodingException if the ENCODING charset is not supported
     */
    public  static String decode(String data) throws UnsupportedEncodingException {
        byte[] bytes = Base64.decodeBase64(data.getBytes(ENCODING));
        return  new String(bytes,ENCODING);
    }

}

package com.orange;

import com.orange.encoder.Base64Coder;
import org.junit.Assert;
import org.junit.Test;

import java.io.UnsupportedEncodingException;

public class Base64CoderTest {


    @Test
    public void test() throws UnsupportedEncodingException {
        String str="hello world";
        System.out.println(String.format("Original: %s",str));

        String encodeData = Base64Coder.encode(str);
        System.out.println(String.format("Encoded: %s",encodeData));

        String decodeData = Base64Coder.decode(encodeData);
        System.out.println(String.format("Decoded: %s",decodeData));

        // JUnit's argument order is (expected, actual)
        Assert.assertEquals(str,decodeData);
    }
}

The output is:

Original: hello world
Encoded: aGVsbG8gd29ybGQ=
Decoded: hello world

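Commons Codec also supports more input styles, such as stream-based input and output, and offers customizable Base64 implementations in which the number of characters per line and the line terminator can be set. See the Commons Codec documentation for details.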

The sun.misc package is an internal API provided by Sun for its own use, so the Base64 implementation in that package is not recommended.
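Since Java 8 the JDK ships a public java.util.Base64 class, which is one supported alternative to sun.misc. The sketch below is not the Commons Codec API; it shows the JDK counterparts of the two features mentioned above, stream-based encoding and a custom line length and separator. The class and method names are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class JdkBase64Demo {

    // Stream-based encoding: bytes written to the wrapper reach the sink Base64-encoded.
    static String encodeViaStream(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream encoder = Base64.getEncoder().wrap(sink)) {
            encoder.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the wrapper above wrote the final group and its padding.
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    // Custom line length (rounded down to a multiple of 4) and line separator.
    static String encodeWithLines(byte[] data, int lineLength, String separator) {
        byte[] sep = separator.getBytes(StandardCharsets.US_ASCII);
        return Base64.getMimeEncoder(lineLength, sep).encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] data = "hello world".getBytes(StandardCharsets.UTF_8);
        System.out.println(encodeViaStream(data));
        System.out.println(encodeWithLines(data, 8, "\n"));
    }
}
```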

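Closing: the Commons Codec Base64 source code is attached below. Here is a thought: if we rearranged the positions in the encoding table, would that give us a private Base64 encoding?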

 /**
     * This array is a lookup table that translates 6-bit positive integer index values into their "Base64 Alphabet"
     * equivalents as specified in Table 1 of RFC 2045.
     *
     * Thanks to "commons" project in ws.apache.org for this code.
     * http://svn.apache.org/repos/asf/webservices/commons/trunk/modules/util/
     */
    private static final byte[] STANDARD_ENCODE_TABLE = {
            'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
            'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
            'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/'
    };

    /**
     * This is a copy of the STANDARD_ENCODE_TABLE above, but with + and /
     * changed to - and _ to make the encoded Base64 results more URL-SAFE.
     * This table is only used when the Base64's mode is set to URL-SAFE.
     */
    private static final byte[] URL_SAFE_ENCODE_TABLE = {
            'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
            'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
            'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '-', '_'
    };

    /**
     * This array is a lookup table that translates Unicode characters drawn from the "Base64 Alphabet" (as specified
     * in Table 1 of RFC 2045) into their 6-bit positive integer equivalents. Characters that are not in the Base64
     * alphabet but fall within the bounds of the array are translated to -1.
     *
     * Note: '+' and '-' both decode to 62. '/' and '_' both decode to 63. This means decoder seamlessly handles both
     * URL_SAFE and STANDARD base64. (The encoder, on the other hand, needs to know ahead of time what to emit).
     *
     * Thanks to "commons" project in ws.apache.org for this code.
     * http://svn.apache.org/repos/asf/webservices/commons/trunk/modules/util/
     */
    private static final byte[] DECODE_TABLE = {
            -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
            -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
            -1, -1, -1, -1, -1, -1, -1, -1, -1, 62, -1, 62, -1, 63, 52, 53, 54,
            55, 56, 57, 58, 59, 60, 61, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, 3, 4,
            5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
            24, 25, -1, -1, -1, -1, 63, -1, 26, 27, 28, 29, 30, 31, 32, 33, 34,
            35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51
    };
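The private-table idea raised above can be sketched without modifying Commons Codec: encode with the standard table, then substitute each character with the one at the same index in a rearranged table. The rearrangement used here (the standard alphabet rotated by 13 positions) and all names are hypothetical choices for illustration. The result is obfuscation only; a fixed 64-symbol substitution is open to frequency analysis, so it should not be treated as real encryption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class PrivateTableBase64 {

    private static final String STANDARD =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Hypothetical private table: the standard alphabet rotated by 13 positions.
    private static final String PRIVATE =
            STANDARD.substring(13) + STANDARD.substring(0, 13);

    static String encode(byte[] data) {
        return translate(Base64.getEncoder().encodeToString(data), STANDARD, PRIVATE);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(translate(text, PRIVATE, STANDARD));
    }

    // Replaces each character by the one at the same index in the other table.
    private static String translate(String text, String from, String to) {
        StringBuilder out = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            int index = from.indexOf(c);
            out.append(index >= 0 ? to.charAt(index) : c); // '=' padding is left as-is
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String encoded = encode("hello world".getBytes(StandardCharsets.UTF_8));
        System.out.println(encoded);
        System.out.println(new String(decode(encoded), StandardCharsets.UTF_8));
    }
}
```

Only a party that knows the rearranged table can map the text back to standard Base64, which is the sense in which the table acts as a key.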
