To learn how Unicode's UTF-8 and UTF-16 encodings work, I wrote the following program.
import java.nio.charset.Charset;

public class MyStudy {
    // Left-aligned column, 20 characters wide.
    public static String field = "%-20s";

    public static void main(String[] args) {
        // Header row.
        System.out.format(field, "utf-16 length");
        System.out.format(field, "utf-8 length");
        System.out.format(field, "utf-16");
        System.out.format(field, "utf-8");
        System.out.format(field, "text");
        System.out.println();

        String[] arr = {"瓴", "龍", "瓴龍", "一", "一二", "一二三", "a", "ab", "abc"};
        for (String str : arr) {
            // Encode each sample once per charset, then print lengths and hex dumps.
            byte[] utf16 = str.getBytes(Charset.forName("UTF-16"));
            byte[] utf8 = str.getBytes(Charset.forName("UTF-8"));
            System.out.format(field, utf16.length);
            System.out.format(field, utf8.length);
            System.out.format(field, toHex(utf16));
            System.out.format(field, toHex(utf8));
            System.out.format(field, str);
            System.out.println();
        }
    }

    // Render a byte array as lowercase hex, two digits per byte.
    public static String toHex(byte[] b) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < b.length; i++) {
            builder.append(String.format("%02x", b[i]));
        }
        return builder.toString();
    }
}
The program's output is:
From it I drew the following conclusions:
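1. UTF-16 uses a two-byte code unit;
2. UTF-8 uses a one-byte code unit;
3. The UTF-16 bytes are prefixed with "feff". This is the byte order mark (BOM): "feff" indicates Big-Endian, as opposed to "fffe", which indicates Little-Endian.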
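The three points above can be checked with a small sketch. The class name `Utf16BomDemo` is made up for illustration, and the sketch assumes the standard JDK behavior that the `UTF-16` charset writes a big-endian BOM when encoding, while `UTF-16BE` and `UTF-16LE` write none. The sample character is written as the escape `\u4e00` ("一") so the result does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;

public class Utf16BomDemo {
    // Render a byte array as lowercase hex, two digits per byte.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u4e00"; // the character "一", code point U+4E00

        // "UTF-16": BOM fe ff, then one two-byte code unit in big-endian order.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16)));   // feff4e00
        // "UTF-16BE": same byte order, no BOM.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16BE))); // 4e00
        // "UTF-16LE": the two bytes of the code unit are swapped, no BOM.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16LE))); // 004e
        // UTF-8: three one-byte code units for this character.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_8)));    // e4b880
        // An ASCII letter needs only one UTF-8 code unit.
        System.out.println(toHex("a".getBytes(StandardCharsets.UTF_8)));  // 61
    }
}
```

Choosing `UTF-16BE` or `UTF-16LE` explicitly is one way to get UTF-16 bytes without the leading "feff", which makes the two-byte unit easier to see when counting lengths.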
See also: https://zh.wikipedia.org/wiki/UTF-16