[20200103] Converting GUID to GUID_BASE64.txt

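--//I have recently been tuning a project that uses uuid heavily. The pros and cons are discussed at
--//http://blog.itpub.net.x.y265/viewspace-2670513/ => [20191225] Pros and cons of using uuid as a primary key.txt
--//My own view: do not overuse it or apply it everywhere; using it where it fits is the better choice.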

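--//Yesterday, while reading a book on 12c, I learned that Oracle assigns every PDB a unique GUID. Looking at the view
--//V$CONTAINERs, I found a column GUID_BASE64 that is clearly derived from GUID, and I wanted to know how the conversion works.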

1. Environment:
SYS@192.168.x.y:1521/orclcdb> select banner from v$version;
BANNER
----------------------------------------------------------------------
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

SYS@192.168.x.y:1521/orclcdb> select sys_guid() from dual ;
SYS_GUID()
--------------------------------
9B25CF226E3E36A5E0558253DD747177

SYS@192.168.x.y:1521/orclcdb>  select CON_ID,DBID,CON_UID,NAME,GUID,GUID_BASE64 from V$CONTAINERs;
CON_ID       DBID    CON_UID NAME     GUID                             GUID_BASE64
------ ---------- ---------- -------- -------------------------------- ------------------------
     1 2756091850          1 CDB$ROOT 64A52F53A7683286E053CDA9E80AED76 ZKUvU6doMobgU82p6Artdg==
     2 1474312904 1474312904 PDB$SEED 742DCFA2CE044FDEE0558253DD747177 dC3Pos4ET97gVYJT3XRxdw==
     3  115310104  115310104 ORCL     74A69DC145F5662BE0558253DD747177 dKadwUX1ZivgVYJT3XRxdw==

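--//Note that the last 16 hex digits, E0558253DD747177, are identical in the sys_guid() output and in the PDB$SEED and ORCL
--//rows (CDB$ROOT differs). I do not know why.
--//GUID_BASE64 ends with the two characters '=='. All three rows ending in == cannot be a coincidence; it must be padding
--//that keeps the string length at 24.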

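2. First I had to pin down the encoding used by GUID_BASE64:
--//base64, as I understand it, is base-64 notation. Pinning down the alphabet matters: the first thing that comes to
--//mind is the rowid encoding, which is also base 64, so is the alphabet the same?
--//A search turned up the following link: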
https://docs.oracle.com/cd/E18150_01/javadocs/DevelopmentKit/com/stc/connector/framework/util/Base64.html

--//The content is as follows:
            Table 1: The Base64 Alphabet
Value Encoding  Value Encoding  Value Encoding  Value Encoding
   0 A            17 R            34 i            51 z
   1 B            18 S            35 j            52 0
   2 C            19 T            36 k            53 1
   3 D            20 U            37 l            54 2
   4 E            21 V            38 m            55 3
   5 F            22 W            39 n            56 4
   6 G            23 X            40 o            57 5
   7 H            24 Y            41 p            58 6
   8 I            25 Z            42 q            59 7
   9 J            26 a            43 r            60 8
  10 K            27 b            44 s            61 9
  11 L            28 c            45 t            62 +
  12 M            29 d            46 u            63 /
  13 N            30 e            47 v
  14 O            31 f            48 w         (pad) =
  15 P            32 g            49 x
  16 Q            33 h            50 y
--//'=' as the pad character agrees with the output seen above.

The encoded output stream must be represented in lines of no more than 76 characters each. All line breaks or other
characters not found in Table 1 must be ignored by decoding software. In base64 data, characters other than those in
Table 1, line breaks, and other white space probably indicate a transmission error, about which a warning message or
even a message rejection might be appropriate under some circumstances.

--//In short: encoded lines are at most 76 characters long, and a decoder must ignore line breaks and any character not
--//in Table 1, although such characters may indicate a transmission error.

Special processing is performed if fewer than 24 bits are available at the end of the data being encoded. A full
encoding quantum is always completed at the end of a body. When fewer than 24 input bits are available in an input
group, zero bits are added (on the right) to form an integral number of 6-bit groups. Padding at the end of the data is
performed using the "=" character. Since all base64 input is an integral number of octets, only the following cases can
arise: (1) the final quantum of encoding input is an integral multiple of 24 bits; here, the final unit of encoded
output will be an integral multiple of 4 characters with no "=" padding, (2) the final quantum of encoding input is
exactly 8 bits; here, the final unit of encoded output will be two characters followed by two "=" padding characters, or
(3) the final quantum of encoding input is exactly 16 bits; here, the final unit of encoded output will be three
characters followed by one "=" padding character.

--//In short: input is consumed in 24-bit groups, each producing 4 output characters. A short final group is filled with
--//zero bits on the right up to a multiple of 6 bits, which leaves only three cases:
--//(1) the final group is a full 24 bits: 4 characters, no "=" padding;
--//(2) the final group is exactly 8 bits: 2 characters followed by "==";
--//(3) the final group is exactly 16 bits: 3 characters followed by "=".
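
--//The three cases above reduce to simple arithmetic on the input length. A minimal bash sketch follows; b64_shape is a
--//made-up name used only for illustration, and it computes lengths without encoding anything.

```shell
#!/bin/bash
# Hypothetical helper: for an input of N bytes, print
# "<encoded length> <number of '=' pad characters>".
b64_shape() {
    local nbytes=$1
    # Every started 3-byte (24-bit) group yields 4 output characters.
    local len=$(( (nbytes + 2) / 3 * 4 ))
    # A final group of 1 byte needs 2 pads, 2 bytes need 1 pad, 3 bytes need none.
    local pad=$(( (3 - nbytes % 3) % 3 ))
    printf '%d %d\n' "$len" "$pad"
}

b64_shape 16   # a GUID is 16 bytes; prints: 24 2
```

--//For a 16-byte GUID this gives 24 characters, 2 of them padding, which agrees with the width of the GUID_BASE64 column.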

Because it is used only for padding at the end of the data, the occurrence of any "=" characters may be taken as
evidence that the end of the data has been reached (without truncation in transit). No such assurance is possible,
however, when the number of octets transmitted was a multiple of three and no "=" characters are present.

--//In short: a trailing "=" shows that the end of the data was reached; there is no such indication when the number of
--//octets is a multiple of three.

Any characters outside of the base64 alphabet are to be ignored in base64-encoded data.

--//In short: characters outside the base64 alphabet are ignored when decoding.

Care must be taken to use the proper octets for line breaks if base64 encoding is applied directly to text material that
has not been converted to canonical form. In particular, text line breaks must be converted into CRLF sequences prior to
base64 encoding. The important thing to note is that this may be done directly by the encoder rather than in a prior
canonicalization step in some implementations.

--//In short: when text is encoded directly, its line breaks must be converted to CRLF before base64 encoding; some
--//encoders do this themselves instead of in a separate canonicalization step.

NOTE: There is no need to worry about quoting potential boundary delimiters within base64-encoded bodies within
multipart entities because no hyphen characters are used in the base64 encoding.   

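--//Note: I relied on Kingsoft PowerWord (金山詞霸) to translate the passages above, so my reading may contain some flaws...
--//So the base64 alphabet is settled: A-Z a-z 0-9 + /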

3. I wrote a test script:
 $ cat 64base.sh
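#! /bin/bash
# convert an uppercase hex string ($1) to base-64 digits with bc,
# then map each digit to the base64 alphabet
v2=$1
BASE64=($( echo {A..Z} {a..z} {0..9} + / ))

res=''
# bc prints each base-64 digit as a decimal number; tr removes bc's line wrapping
for i in $(echo "obase=64;ibase=16; $v2" | bc| tr -d '\\\r\n')
do
    # 10# forces decimal so that digits such as 08 are not read as octal
    res=${res}${BASE64[$(( 10#$i ))]}
done

echo "$res"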

$ ./64base.sh 74A69DC145F5662BE0558253DD747177
B0pp3BRfVmK+BVglPddHF3

--//It does not match at all. I reread the explanation at
--//https://docs.oracle.com/cd/E18150_01/javadocs/DevelopmentKit/com/stc/connector/framework/util/Base64.html
When fewer than 24 input bits are available in an input group, zero bits are added (on the right) to form an integral
number of 6-bit groups. Padding at the end of the data is performed using the "=" character.

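--//There is one very clear hint: "zero bits are added (on the right) to form an integral number of 6-bit groups".
--//The GUID is displayed as 74A69DC145F5662BE0558253DD747177, 32 hex characters: 32*4 = 128 bits.
--//base64 corresponds to 2^6, i.e. 6 bits per base64 character.
--//128/6 = 21.333, which is not a whole number, so one 0 (4 bits) has to be appended at the end.
--//(128+4)/6 = 22, which divides evenly.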
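
--//Why the missing zero bits matter can be seen on a single byte. A number-based conversion such as bc's forms its
--//6-bit groups from the right, while base64 forms them from the left and fills the last group with zeros. The bash
--//sketch below is illustrative only; ALPHABET and to_b64_digits are made-up names, and 0x4D (the letter M) is an
--//arbitrary sample byte.

```shell
#!/bin/bash
ALPHABET='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# Hypothetical helper: write a positive integer in base 64 using the base64
# alphabet, taking 6 bits at a time from the right.
to_b64_digits() {
    local n=$1 out=''
    while (( n > 0 )); do
        out=${ALPHABET:$(( n & 63 )):1}$out
        n=$(( n >> 6 ))
    done
    printf '%s\n' "$out"
}

to_b64_digits $(( 16#4D ))    # 8 bits grouped as 01|001101      -> BN
to_b64_digits $(( 16#4D0 ))   # 12 bits grouped as 010011|010000 -> TQ
```

--//The standard base64 encoding of the single byte 0x4D is TQ==, so only the variant with the appended hex 0 lines up.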
--//Appending one 0 and recomputing gives:

$ ./64base.sh 74A69DC145F5662BE0558253DD7471770
dKadwUX1ZivgVYJT3XRxdw

--//^_^ An exact match. Let's verify with the other values.

$ echo 64A52F53A7683286E053CDA9E80AED760 742DCFA2CE044FDEE0558253DD7471770 74A69DC145F5662BE0558253DD7471770 | tr ' ' '\n' | xargs -IQ ./64base.sh Q
ZKUvU6doMobgU82p6Artdg
dC3Pos4ET97gVYJT3XRxdw
dKadwUX1ZivgVYJT3XRxdw

SYS@192.168.x.y:1521/orclcdb>  select CON_ID,DBID,CON_UID,NAME,GUID,GUID_BASE64 from V$CONTAINERs;
CON_ID       DBID    CON_UID NAME     GUID                             GUID_BASE64
------ ---------- ---------- -------- -------------------------------- ------------------------
     1 2756091850          1 CDB$ROOT 64A52F53A7683286E053CDA9E80AED76 ZKUvU6doMobgU82p6Artdg==
     2 1474312904 1474312904 PDB$SEED 742DCFA2CE044FDEE0558253DD747177 dC3Pos4ET97gVYJT3XRxdw==
     3  115310104  115310104 ORCL     74A69DC145F5662BE0558253DD747177 dKadwUX1ZivgVYJT3XRxdw==
--//The comparison matches completely, apart from the two trailing = characters of course.
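
--//As a cross-check that includes the trailing ==, the GUID can be turned into its 16 raw bytes and passed to the
--//coreutils base64 command. This is only a sketch: guid_to_base64 is a made-up name, and it assumes bash (for the
--//\xHH escapes of the builtin printf) and a base64 binary on the PATH.

```shell
#!/bin/bash
# Hypothetical helper: encode a hex GUID with the standard base64 tool.
guid_to_base64() {
    local hex=$1 fmt='' i
    # Build a printf format made of \xHH escapes, one per byte.
    for (( i = 0; i < ${#hex}; i += 2 )); do
        fmt+="\\x${hex:i:2}"
    done
    # printf emits the raw bytes; base64 encodes them and adds the padding.
    printf "$fmt" | base64
}

guid_to_base64 74A69DC145F5662BE0558253DD747177   # dKadwUX1ZivgVYJT3XRxdw==
```

--//If this agrees with the bc-based result, GUID_BASE64 is plain base64 of the 16 GUID bytes.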

4. The modified script:
$ cat o64base.sh
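#! /bin/bash
# convert guid to guid_base64
# set ODEBUG=1 to print the padded input and the unpadded result
odebug=${ODEBUG:-0}

# append one hex 0 (4 zero bits): 128+4 = 132 bits = 22 six-bit groups
v2=${1}0
BASE64=($( echo {A..Z} {a..z} {0..9} + / ))

res=''
# bc prints each base-64 digit as a decimal number; tr removes bc's line wrapping
for i in $(echo "obase=64;ibase=16; $v2" | bc| tr -d '\\\r\n')
do
    # 10# forces decimal so that digits such as 08 are not read as octal
    res=${res}${BASE64[$(( 10#$i ))]}
done

if [ "$odebug" -eq 1 ] ; then
    echo "v2=$v2 res=$res"
fi

# a 16-byte input always ends with two pad characters
res=${res}==
echo "$res"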

$ ./o64base.sh 74A69DC145F5662BE0558253DD747177
dKadwUX1ZivgVYJT3XRxdw==
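
--//One caveat of going through bc: it treats the GUID as a number, so leading zero digits are dropped, and a GUID whose
--//first six bits are all zero would come out one or more characters short. A possible bc-free variant is sketched
--//below. It is not the script above: the function name is made up, and it assumes a 32-digit hex GUID. With the
--//appended 0 the input has 33 hex digits, and every 3 hex digits (12 bits) give exactly 2 base64 characters.

```shell
#!/bin/bash
# Hypothetical helper: hex GUID -> GUID_BASE64 without bc.
guid_to_base64_nobc() {
    local hex=${1}0   # append 4 zero bits: 33 hex digits = 132 bits = 22 * 6
    local alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
    local res='' i chunk hi lo
    for (( i = 0; i < ${#hex}; i += 3 )); do
        chunk=$(( 16#${hex:i:3} ))   # 3 hex digits = 12 bits
        hi=$(( chunk >> 6 ))         # upper 6 bits
        lo=$(( chunk & 63 ))         # lower 6 bits
        res+=${alphabet:hi:1}${alphabet:lo:1}
    done
    # 16 input bytes always end with two pad characters.
    printf '%s==\n' "$res"
}

guid_to_base64_nobc 74A69DC145F5662BE0558253DD747177   # dKadwUX1ZivgVYJT3XRxdw==
```

--//Because each chunk is converted on its own, leading zeros are preserved and no large-number arithmetic is needed.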

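5. Summary:
--//Done purely out of boredom; I spent a whole afternoon digging into this.
--//While testing I used the online conversion tool at https://toolslick.com/conversion/data/guid; without it I
--//probably could not have guessed how the conversion is implemented.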
