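This article is excerpted from "Netkiller Java 手札".
望海路半島城邦三期, Shenzhen, Guangdong, China 518067 +86 13113668890 <netkiller@msn.com>
$Id: book.xml 606 2013-05-29 09:52:58Z netkiller $
Copyright © 2015-2018 Neo Chan
Copyright notice
To reprint this article, please contact the author. Any reprint must clearly cite the original source, the author information, and this notice.
http://netkiller.github.io
http://netkiller.sourceforge.net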
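Run the following program to write the integer value 1 to the file netkiller.bin, then look at how the file changes.

    // requires: import java.io.*;
    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeInt(1); // an int is written as 4 bytes, high byte first
    out.close();

Open a terminal and use the xxd command to view the binary file.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -b netkiller.bin
    00000000: 00000000 00000000 00000000 00000001  ....

The file contains the bit string 00000000 00000000 00000000 00000001. The following code converts that binary string to decimal; note that the spaces have to be removed first.

    int n = Integer.valueOf("00000000 00000000 00000000 00000001".replaceAll(" ", ""), 2);
    System.out.println(n);

The result is 1. Why are there so many leading zeros? Run the following program.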

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeInt(Integer.MAX_VALUE);
    out.close();

Now look at the result.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -b netkiller.bin
    00000000: 01111111 11111111 11111111 11111111  ....

    int n = Integer.valueOf("01111111 11111111 11111111 11111111".replaceAll(" ", ""), 2);
    System.out.println(n);

The output is 2147483647, the maximum value of int. What happens with 2147483647 + 1?

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeInt(Integer.MAX_VALUE + 1);
    out.close();
    System.out.println(Integer.MAX_VALUE + 1);

The output is -2147483648, whereas the mathematically correct result is 2147483648. This is integer overflow. An int is stored in 4 bytes (32 bits) in two's complement form. Non-negative values run from 00000000 00000000 00000000 00000000 to 01111111 11111111 11111111 11111111, and the first bit is the sign bit: 0 for a non-negative number, 1 for a negative number.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -b netkiller.bin
    00000000: 10000000 00000000 00000000 00000000  ....

Integer overflow demo. What should you do when a value exceeds the int range? Use the long type.

    System.out.println(Integer.MAX_VALUE);
    System.out.println(Integer.MAX_VALUE + 1);
    System.out.println(Integer.MIN_VALUE);
    System.out.println(Integer.MIN_VALUE - 1);

The output is as follows:

    2147483647
    -2147483648
    -2147483648
    2147483647
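The advice to move to long can be sketched as follows. This is a minimal illustration rather than part of the original text: the class name OverflowDemo and its two helper methods are hypothetical, while Math.addExact is a standard JDK method that throws ArithmeticException when an int addition overflows.

```java
class OverflowDemo {
    // Widen to long before adding, so the sum cannot wrap around the int range.
    static long addAsLong(int a, int b) {
        return (long) a + b;
    }

    // Report whether a + b would overflow an int, using Math.addExact.
    static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE + 1);           // wraps to -2147483648
        System.out.println(addAsLong(Integer.MAX_VALUE, 1)); // 2147483648
        System.out.println(overflows(Integer.MAX_VALUE, 1)); // true
    }
}
```

Widening is enough when the result is known to fit in a long; Math.addExact is the safer choice when silent wrap-around must never go unnoticed.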
Negative number demo

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeInt(-1); // only -1 is written, matching the 4-byte dump
    out.close();

The result for -1 is 11111111 11111111 11111111 11111111.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -b netkiller.bin
    00000000: 11111111 11111111 11111111 11111111  ....
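The all-ones pattern for -1 follows from two's complement, where negating a value means inverting every bit and adding 1. The sketch below is illustrative only: the class TwosComplementDemo and its helpers are hypothetical names. It also shows a detail worth knowing: Integer.valueOf(s, 2) rejects a 32-digit string that starts with 1, because it reads the digits as a positive magnitude, so Integer.parseUnsignedInt is used to parse such dumps back.

```java
class TwosComplementDemo {
    // Render an int as four space-separated groups of 8 bits, like `xxd -b`.
    static String bits(int value) {
        String s = String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
        return s.substring(0, 8) + " " + s.substring(8, 16) + " "
                + s.substring(16, 24) + " " + s.substring(24);
    }

    // Parse 32 binary digits; a leading 1 yields a negative int.
    static int parse(String groupedBits) {
        return Integer.parseUnsignedInt(groupedBits.replace(" ", ""), 2);
    }

    public static void main(String[] args) {
        System.out.println(bits(-1));   // 11111111 11111111 11111111 11111111
        System.out.println(bits(-128)); // 11111111 11111111 11111111 10000000
        System.out.println(parse("10000000 00000000 00000000 00000000")); // -2147483648
        System.out.println(~5 + 1);     // -5: invert the bits, then add 1
    }
}
```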
Now store two integer values.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeInt(1);
    out.writeInt(-1);
    out.close();

The two values, 1 and -1, are clearly visible in the file.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 4 -b netkiller.bin
    00000000: 00000000 00000000 00000000 00000001  ....
    00000004: 11111111 11111111 11111111 11111111  ....

Reading int data from the binary file

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        int i = in.readInt(); // reads the first 4 bytes as one int
        System.out.println(i);
    } catch (EOFException e) {
        e.printStackTrace();
    }
    in.close();
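Next, write a single byte.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeByte(1);
    out.close();

A byte occupies only one byte, that is, 8 bits.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 4 -b netkiller.bin
    00000000: 00000001

If -1 is written instead, the result is the dump that follows. As with int, the first bit is 0 for a non-negative number and 1 for a negative number, so the range of byte is -128 to 127. Values outside that range overflow as well.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 4 -b netkiller.bin
    00000000: 11111111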
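The byte range and its overflow behaviour can be checked with a few casts. This is a small sketch with hypothetical names (ByteRangeDemo, narrow, unsigned). Narrowing an int to a byte keeps only the low 8 bits, which is also what writeByte(int) stores, and masking with 0xFF recovers the unsigned reading of the same bits.

```java
class ByteRangeDemo {
    // Narrowing keeps only the low 8 bits of the int.
    static byte narrow(int value) {
        return (byte) value;
    }

    // Interpret the same 8 bits as an unsigned value from 0 to 255.
    static int unsigned(byte b) {
        return b & 0xFF; // equivalent to Byte.toUnsignedInt(b)
    }

    public static void main(String[] args) {
        System.out.println(narrow(127));         // 127, still in range
        System.out.println(narrow(128));         // -128, overflow wraps around
        System.out.println(narrow(255));         // -1, bit pattern 11111111
        System.out.println(unsigned((byte) -1)); // 255
    }
}
```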
Try writing the minimum and maximum values.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeByte(Byte.MIN_VALUE);
    out.writeByte(Byte.MAX_VALUE);
    out.close();

Result

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 1 -b netkiller.bin
    00000000: 10000000  .
    00000001: 01111111  .

Writing a single character

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeBytes("a"); // writes one byte per character
    out.close();

Result

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 1 -b netkiller.bin
    00000000: 01100001  a

In the ASCII table, 01100001 is decimal 97, hexadecimal 61, which corresponds to the letter a.

Writing a string

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeBytes("http://www.netkiller.cn");
    out.close();

Result

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 8 -b netkiller.bin
    00000000: 01101000 01110100 01110100 01110000 00111010 00101111 00101111 01110111  http://w
    00000008: 01110111 01110111 00101110 01101110 01100101 01110100 01101011 01101001  ww.netki
    00000010: 01101100 01101100 01100101 01110010 00101110 01100011 01101110  ller.cn
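One caveat about writeBytes() is worth a sketch: it writes a single byte per character by discarding the high eight bits of each char, so it is only lossless for text in the 0 to 255 range. The example below is an illustration with hypothetical names (WriteBytesDemo, viaWriteBytes) and writes to an in-memory buffer instead of netkiller.bin.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class WriteBytesDemo {
    // Return exactly the bytes that DataOutputStream.writeBytes would emit.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBytes(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        String chen = "\u9648"; // a Chinese character with code unit 0x9648
        System.out.println(viaWriteBytes("a").length);  // 1 byte, 0x61
        System.out.println(viaWriteBytes(chen).length); // 1 byte, only the low byte 0x48 survives
        System.out.println(chen.getBytes(StandardCharsets.UTF_8).length); // 3 bytes in UTF-8
    }
}
```

For non-ASCII text, writeChars(), writeUTF(), or an explicit getBytes(charset) call avoids this loss.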
Reading a byte string from the binary file. readAllBytes() reads all remaining bytes into a byte[] in one call.

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        System.out.println(new String(in.readAllBytes()));
    } catch (EOFException e) {
        e.printStackTrace();
    }
    in.close();

readByte() reads one byte at a time.

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        char c = ' ';
        while (true) {
            try {
                c = (char) in.readByte();
                System.out.print(c);
            } catch (EOFException e) {
                // end of file: finish the line and stop
                System.out.println();
                break;
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    in.close();
Now that we know how to work with bytes, here is an example that reads int data. An int is a group of 4 bytes, so we read 4 bytes at a time.

    // This example writes three values to netkiller.bin: 1024, -128 and 2147483647
    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeInt(1024);
    out.writeInt(-128);
    out.writeInt(Integer.MAX_VALUE);
    out.close();

The binary file looks like this:

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 4 -b netkiller.bin
    00000000: 00000000 00000000 00000100 00000000  ....
    00000004: 11111111 11111111 11111111 10000000  ....
    00000008: 01111111 11111111 11111111 11111111  ....

Reading the int data back from the binary file.

    // requires: import java.nio.ByteBuffer;
    String filename = "netkiller.bin";
    FileInputStream stream = new FileInputStream(filename);
    byte[] buffer = new byte[4];
    // readNBytes (Java 9+) fills the buffer unless the file ends first;
    // a plain read(buffer) is allowed to return fewer than 4 bytes
    while (stream.readNBytes(buffer, 0, buffer.length) == buffer.length) {
        ByteBuffer byteBuffer = ByteBuffer.wrap(buffer); // big-endian by default
        System.out.println(byteBuffer.getInt());
    }
    stream.close();

Result

    1024
    -128
    2147483647
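To see what ByteBuffer.getInt() does with each group of four bytes, the int can also be assembled by hand. The sketch below is illustrative: BigEndianDemo and toInt are hypothetical names, and the byte order is high byte first, matching what writeInt() produces in the dumps above.

```java
class BigEndianDemo {
    // Combine 4 bytes, most significant first, into one int.
    static int toInt(byte[] b, int offset) {
        return ((b[offset] & 0xFF) << 24)
                | ((b[offset + 1] & 0xFF) << 16)
                | ((b[offset + 2] & 0xFF) << 8)
                | (b[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        // The same 12 bytes as in the xxd dump of netkiller.bin
        byte[] file = {
            0x00, 0x00, 0x04, 0x00,
            (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0x80,
            0x7F, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF
        };
        for (int offset = 0; offset < file.length; offset += 4) {
            System.out.println(toInt(file, offset)); // 1024, -128, 2147483647
        }
    }
}
```

The `& 0xFF` mask matters: without it, a negative byte such as 0x80 would be sign-extended before shifting and corrupt the higher bits.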
Now write two boolean values to the file, one true and one false.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeBoolean(true);
    out.writeBoolean(false);
    out.close();

The result shows that a boolean occupies one byte, where a last bit of 1 means true and 0 means false. The smallest unit of storage in a binary file is the byte, so although a boolean needs only 1 bit, it is stored as a whole byte padded with seven leading zeros (0000000).

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 1 -b netkiller.bin
    00000000: 00000001  .
    00000001: 00000000  .

Listing the file with ll (an ls alias) shows that it occupies 2B (two bytes).

    neo@MacBook-Pro ~/workspace/netkiller % ll netkiller.bin
    -rw-r--r--  1 neo  staff  2B Oct 18 13:47 netkiller.bin

Reading boolean data from the binary file

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        boolean bool = in.readBoolean(); // reads the first byte only
        System.out.println(bool);
    } catch (EOFException e) {
        e.printStackTrace();
    }
    in.close();
Next, write a long value.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeLong(1);
    out.close();

After the int examples above, the following should be clear at a glance. A long is stored in 8 bytes, twice the size of an int. Its range is not spelled out here, but it can overflow too.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 8 -b netkiller.bin
    00000000: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001  ........

Value range

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeLong(Long.MIN_VALUE);
    out.writeLong(Long.MAX_VALUE);
    out.close();

Output file

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 8 -b netkiller.bin
    00000000: 10000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000  ........
    00000008: 01111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111  ........

Reading long data from the binary file

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        long l = in.readLong(); // reads the first 8 bytes as one long
        System.out.println(l);
    } catch (EOFException e) {
        e.printStackTrace();
    }
    in.close();
In C, a signed char ranges from -128 to 127 and an unsigned char from 0 to 255. Java's char is different: it is an unsigned 16-bit value ranging from 0 to 65535.
Working with char is similar to working with byte. First look up the character A in the ASCII table, which gives 65, and write 65 to the binary file. Then read the character back; the output is A.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeChar(65);
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        char c = in.readChar();
        System.out.println(c);
    } catch (EOFException e) {
        e.printStackTrace();
    }
    in.close();

The binary file shows that a char occupies 2 bytes (16 bits).

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 2 -b netkiller.bin
    00000000: 00000000 01000001  .A
Using writeChars() to write a string to the binary file

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeChars("http://www.netkiller.cn");
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    char c = ' ';
    while (true) {
        try {
            c = in.readChar();
            System.out.print(c);
        } catch (EOFException e) {
            // end of file: finish the line and stop
            System.out.println();
            break;
        }
    }
    in.close();

The binary file is shown here. The first byte of every character is unused, which produces many 00000000 bytes. For English text, byte is therefore the better fit, since char costs twice as much space.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 8 -b netkiller.bin
    00000000: 00000000 01101000 00000000 01110100 00000000 01110100 00000000 01110000  .h.t.t.p
    00000008: 00000000 00111010 00000000 00101111 00000000 00101111 00000000 01110111  .:././.w
    00000010: 00000000 01110111 00000000 01110111 00000000 00101110 00000000 01101110  .w.w...n
    00000018: 00000000 01100101 00000000 01110100 00000000 01101011 00000000 01101001  .e.t.k.i
    00000020: 00000000 01101100 00000000 01101100 00000000 01100101 00000000 01110010  .l.l.e.r
    00000028: 00000000 00101110 00000000 01100011 00000000 01101110  ...c.n
Storing a Chinese character

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    String s = "陈"; // simplified form, U+9648, matching the dump
    char name = s.charAt(s.length() - 1);
    out.writeChar(name);
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    char c = ' ';
    while (true) {
        try {
            c = in.readChar();
            System.out.print(c);
        } catch (EOFException e) {
            System.out.println();
            break;
        }
    }
    in.close();

The binary file is shown here; two bytes represent one Chinese character.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 2 -b netkiller.bin
    00000000: 10010110 01001000  .H

Converted to hexadecimal, this gives the two numbers 96 48.

    neo@MacBook-Pro ~/workspace/netkiller % hexdump netkiller.bin
    0000000 96 48
    0000002

Now search for "漢字內碼" (Chinese character internal codes) in a search engine and look up the character "陈". Its Unicode code in hexadecimal is 96 48, that is, U+9648.
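The lookup can also be done in code. The sketch below is illustrative, with hypothetical names (CharCodeDemo, codeUnitHex, bytesHex). It prints the 16-bit value of a char and the two bytes that writeChar() stores, which for a single char is the same as its UTF-16 big-endian encoding. Note that U+9648 is the simplified form 陈; the traditional form 陳 is a different character, U+9673, and would produce the bytes 96 73.

```java
import java.nio.charset.StandardCharsets;

class CharCodeDemo {
    // The 16-bit value of a char as four hex digits.
    static String codeUnitHex(char c) {
        return String.format("%04X", (int) c);
    }

    // The UTF-16 big-endian bytes of a string as space-separated hex pairs.
    static String bytesHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_16BE)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(codeUnitHex('\u9648')); // 9648
        System.out.println(bytesHex("\u9648"));    // 96 48
        System.out.println(bytesHex("\u9673"));    // 96 73
        System.out.println(codeUnitHex('A'));      // 0041
    }
}
```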
Now try writing a string of Chinese characters.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeChars("陈景峰"); // simplified forms U+9648 U+666F U+5CF0, matching the dump
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        char c = ' ';
        while (true) {
            try {
                c = in.readChar();
                System.out.print(c);
            } catch (EOFException e) {
                System.out.println();
                break;
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    in.close();

    neo@MacBook-Pro ~/workspace/netkiller % xxd -b netkiller.bin
    00000000: 10010110 01001000 01100110 01101111 01011100 11110000  .Hfo\.
This time we use a new file name, netkiller.txt.

    String filename = "netkiller.txt";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeUTF("峰"); // U+5CF0, matching the dump
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        System.out.println(in.readUTF());
    } catch (EOFException e) {
        e.printStackTrace();
    }
    in.close();

Look at the binary file. Why does one Chinese character take so many bytes?

    neo@MacBook-Pro ~/workspace/netkiller % xxd -b netkiller.txt
    00000000: 00000000 00000011 11100101 10110011 10110000  .....

View it in hexadecimal.

    neo@MacBook-Pro ~/workspace/netkiller % hexdump netkiller.txt
    0000000 00 03 e5 b3 b0
    0000005

Looking up the internal codes of "峰" online shows that its UTF-8 code is E5 B3 B0, because UTF-8 stores this character in three bytes. The leading 00000000 00000011 is not a BOM: writeUTF() first writes a two-byte unsigned length that gives the number of encoded bytes that follow, here 3.
Now try writing a whole string.

    out.writeUTF("陈景峰");

xxd -s 2 -c 3 skips the first two bytes and displays three columns per line.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -s 2 -c 3 -b netkiller.txt
    00000002: 11101001 10011001 10001000  ...
    00000005: 11100110 10011001 10101111  ...
    00000008: 11100101 10110011 10110000  ...

Text written with writeUTF() can be viewed directly with text tools.

    neo@MacBook-Pro ~/workspace/netkiller % cat netkiller.txt
    陈景峰
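The two bytes at the start of a writeUTF() record can be decoded directly. The following sketch is illustrative, with hypothetical names (WriteUtfDemo, encode, lengthPrefix). It writes to an in-memory buffer and reads the prefix as an unsigned 16-bit number, which is how readUTF() knows how many bytes belong to the string. The encoding that writeUTF() uses is Java's modified UTF-8, which produces the same bytes as standard UTF-8 for the characters used here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WriteUtfDemo {
    // Return exactly the bytes that DataOutputStream.writeUTF would emit.
    static byte[] encode(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decode the unsigned 16-bit length stored in the first two bytes.
    static int lengthPrefix(byte[] encoded) {
        return ((encoded[0] & 0xFF) << 8) | (encoded[1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] one = encode("\u5CF0");               // 00 03 E5 B3 B0
        byte[] three = encode("\u9648\u666F\u5CF0"); // 00 09 followed by 9 bytes
        System.out.println(lengthPrefix(one) + " of " + one.length + " bytes");     // 3 of 5 bytes
        System.out.println(lengthPrefix(three) + " of " + three.length + " bytes"); // 9 of 11 bytes
    }
}
```

Because the prefix is only 16 bits wide, a single writeUTF() call cannot store more than 65535 encoded bytes.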
Next, write a short value.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeShort(1);
    out.flush();
    out.close();

The result shows that a short is stored in two bytes (16 bits).

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 2 -b netkiller.bin
    00000000: 00000000 00000001  ..

A stored short can be read as signed or as unsigned. Java's short type itself is always signed; readUnsignedShort() returns the same two bytes as an int.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeShort(1);
    out.writeShort(1);
    out.writeShort(-1);
    out.writeShort(-1);
    out.flush();
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    try {
        System.out.println(in.readShort());
        System.out.println(in.readUnsignedShort());
        System.out.println(in.readShort());
        System.out.println(in.readUnsignedShort());
    } catch (EOFException e) {
        e.printStackTrace();
    }
    in.close();

Result

    1
    1
    -1
    65535

Signed value range

    Minimum: Short.MIN_VALUE = -32768 (-2 to the 15th power)
    Maximum: Short.MAX_VALUE = 32767 (2 to the 15th power, minus 1)

The unsigned value range is 0 to 65535.
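The signed and unsigned readings are two views of the same 16 bits, which a few lines of code make concrete. This is a small sketch with hypothetical names (UnsignedShortDemo, asUnsigned, asSigned); Short.toUnsignedInt is the standard JDK equivalent of the mask used here.

```java
class UnsignedShortDemo {
    // Read the 16 bits of a short as an unsigned value from 0 to 65535.
    static int asUnsigned(short s) {
        return s & 0xFFFF; // same result as Short.toUnsignedInt(s)
    }

    // Keep the low 16 bits of an int and read them as a signed short.
    static short asSigned(int u) {
        return (short) u;
    }

    public static void main(String[] args) {
        System.out.println(asUnsigned((short) -1));      // 65535
        System.out.println(asUnsigned(Short.MIN_VALUE)); // 32768
        System.out.println(asSigned(65535));             // -1
        System.out.println(asSigned(32768));             // -32768
    }
}
```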
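Next, write several float values and read them back.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeFloat(0);
    out.writeFloat(1.0f);
    out.writeFloat(1.1f);
    out.flush();
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    float c = 0;
    while (true) {
        try {
            c = in.readFloat();
            System.out.println(c);
        } catch (EOFException e) {
            System.out.println();
            break;
        }
    }
    in.close();

A float uses 4 bytes (32 bits) to represent a floating-point number. Unlike the previous data types, the value cannot be read straight off the bits; it has to be computed, which is somewhat involved.

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 4 -b netkiller.bin
    00000000: 00000000 00000000 00000000 00000000  ....
    00000004: 00111111 10000000 00000000 00000000  ?...
    00000008: 00111111 10001100 11001100 11001101  ?...

Layout of a float

    /------------- 32 bit ----------------\
    | 1 |    8     |          23          |
    |--------------------------------------|
     31  30      23 22                    0
      ^       ^                ^
    sign   exponent         mantissa

32 bits

The bit positions of a float are numbered from right to left: bit 0 is the last bit in the dump and bit 31 is the first.
Sign: 0 means positive, 1 means negative.
Exponent: stores the exponent of the scientific notation, using biased (offset) storage with a bias of 127.
Mantissa: the fractional part of the significand.

Steps to convert the binary form of a float stored in memory to decimal:
(1) Write out the bits from bit 22 down to bit 0 and prepend a "1" on the left, giving 24 significant bits. Place the binary point immediately to the right of that leading "1".
(2) Take the value n represented by bits 29 to 23. If bit 30 is "0", invert every bit of n. If bit 30 is "1", add 1 to n.
(3) Move the binary point n places to the left (when bit 30 is "0") or n places to the right (when bit 30 is "1") to obtain a real number in binary.
(4) Convert this binary real number to decimal, and add a plus or minus sign according to whether bit 31 is "0" or "1".

Steps (2) and (3) are equivalent to shifting the point by the 8-bit exponent field minus 127. These steps apply to ordinary (normalized) values, not to 0 or other special values.

    1.0f = 00111111 10000000 00000000 00000000
    Sign      bit 31      0, a positive number
    Exponent  bits 23~30  0111111 1 (127, so the shift is 127 - 127 = 0)
    Mantissa  bits 0~22   0000000 00000000 00000000
    giving | 0 | 0111111 1 | 0000000 00000000 00000000 |, that is, 1.0 x 2^0 = 1.0

For the details, see IEEE R32.24, the single-precision format of IEEE 754.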
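The decoding steps can be checked in code. The sketch below is an illustration rather than the author's method: FloatBitsDemo and its helpers are hypothetical names, Float.floatToIntBits is the standard JDK call that exposes the 32 raw bits, and rebuild() handles only ordinary (normalized) values, not 0, subnormals, infinities or NaN.

```java
class FloatBitsDemo {
    // Bit 31: 0 for positive, 1 for negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Bits 30..23 minus the bias of 127.
    static int exponent(float f) {
        return ((Float.floatToIntBits(f) >>> 23) & 0xFF) - 127;
    }

    // Bits 22..0, the fraction that follows the implicit leading 1.
    static int mantissa(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    // Recompute the value as (+/-) 1.mantissa x 2^exponent (normalized values only).
    static float rebuild(float f) {
        double significand = 1.0 + mantissa(f) / (double) (1 << 23);
        double magnitude = significand * Math.pow(2, exponent(f));
        return (float) (sign(f) == 0 ? magnitude : -magnitude);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(Float.floatToIntBits(1.1f))); // 3f8ccccd
        System.out.println(exponent(1.0f) + " " + mantissa(1.0f));           // 0 0
        System.out.println(rebuild(1.1f));                                   // 1.1
        System.out.println(rebuild(-12.5f));                                 // -12.5
    }
}
```

The hex value 3f8ccccd is the third row of the xxd dump above (00111111 10001100 11001100 11001101), so the decoded fields can be compared against the file bit by bit.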
Next, write a double value and read it back.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeDouble(12.5d);
    out.flush();
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    double d = 0d;
    while (true) {
        try {
            d = in.readDouble();
            System.out.println(d);
        } catch (EOFException e) {
            System.out.println();
            break;
        }
    }
    in.close();

Binary file

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 8 -b netkiller.bin
    00000000: 01000000 00101001 00000000 00000000 00000000 00000000 00000000 00000000  @)......

Layout of a double

    /------------------------- 64 bit ------------------------------\
    | 1 |     11      |                     52                       |
    |----------------------------------------------------------------|
     63  62         52 51                                            0
      ^        ^                           ^
    sign    exponent                    mantissa

64 bits

As with float, the bit positions of a double are numbered from right to left: bit 0 is the last bit in the dump and bit 63 is the first.
Sign: 0 means positive, 1 means negative.
Exponent: stores the exponent of the scientific notation, using biased (offset) storage with a bias of 1023.
Mantissa: the fractional part of the significand.
For the details, see IEEE R64.53, the double-precision format of IEEE 754.
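The 12.5 written above makes a convenient worked example of this layout. The sketch is illustrative, with hypothetical names (DoubleBitsDemo and its helpers); Double.doubleToLongBits is the standard JDK call that exposes the 64 raw bits. The exponent field 10000000010 is 1026, so the exponent is 1026 - 1023 = 3, and the mantissa bits 1001 give the significand 1.1001 in binary, which is 1.5625. The value is therefore 1.5625 x 2^3 = 12.5.

```java
class DoubleBitsDemo {
    // The 64 raw bits as 16 hex digits.
    static String hex(double d) {
        return String.format("%016x", Double.doubleToLongBits(d));
    }

    // Bits 62..52 minus the bias of 1023.
    static int exponent(double d) {
        return (int) ((Double.doubleToLongBits(d) >>> 52) & 0x7FF) - 1023;
    }

    // Bits 51..0, the fraction that follows the implicit leading 1.
    static long mantissa(double d) {
        return Double.doubleToLongBits(d) & 0xFFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(hex(12.5));      // 4029000000000000
        System.out.println(exponent(12.5)); // 3
        // Significand = 1 + mantissa / 2^52
        System.out.println(1.0 + mantissa(12.5) / (double) (1L << 52)); // 1.5625
    }
}
```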
Finally, a combined example that writes every type to one file and reads the values back in the same order.

    String filename = "netkiller.bin";
    DataOutputStream out = new DataOutputStream(new FileOutputStream(filename));
    out.writeInt(1024);
    out.writeShort(255);
    out.writeLong(100000000000L);
    out.writeFloat(3.14f);
    out.writeDouble(3.141592653579d);
    out.writeBoolean(true);
    out.writeChar(165);
    out.writeChars("陈景峰");
    out.writeUTF("Netkiller Java 手札 - http://www.netkiller.cn");
    out.writeChars("这是最后一行\r\n");
    out.flush();
    out.close();

    DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(filename)));
    System.out.println(in.readInt());
    System.out.println(in.readUnsignedShort());
    System.out.println(in.readLong());
    System.out.println(in.readFloat());
    System.out.println(in.readDouble());
    System.out.println(in.readBoolean());
    System.out.println(in.readChar());

    // The name was written with writeChars, so read exactly 3 chars
    int i = 0;
    String name = "";
    while (i < 3) {
        try {
            char c = in.readChar();
            name += c;
        } catch (EOFException e) {
            break;
        }
        i++;
    }
    System.out.println(name);
    System.out.println(in.readUTF());

    // The last line was written with writeChars, not writeUTF,
    // so read chars until the end of the file instead of calling readUTF again
    StringBuilder last = new StringBuilder();
    try {
        while (true) {
            last.append(in.readChar());
        }
    } catch (EOFException e) {
        // end of file reached
    }
    System.out.print(last);
    in.close();

One point to note: out.writeChars("陈景峰") writes a string of chars with no length information, so when reading you need to know the length of the string in advance and then read the chars one by one in a loop.

Contents of the binary file

    neo@MacBook-Pro ~/workspace/netkiller % xxd -c 8 -b netkiller.bin
    00000000: 00000000 00000000 00000100 00000000 00000000 11111111 00000000 00000000  ........
    00000008: 00000000 00010111 01001000 01110110 11101000 00000000 01000000 01001000  ..Hv..@H
    00000010: 11110101 11000011 01000000 00001001 00100001 11111011 01010100 01000011  ..@.!.TC
    00000018: 11001110 00101000 00000001 00000000 10100101 10010110 01001000 01100110  .(....Hf
    00000020: 01101111 01011100 11110000 00000000 00101111 01001110 01100101 01110100  o\../Net
    00000028: 01101011 01101001 01101100 01101100 01100101 01110010 00100000 01001010  killer J
    00000030: 01100001 01110110 01100001 00100000 11100110 10001001 10001011 11100110  ava ....
    00000038: 10011100 10101101 00100000 00101101 00100000 01101000 01110100 01110100  .. - htt
    00000040: 01110000 00111010 00101111 00101111 01110111 01110111 01110111 00101110  p://www.
    00000048: 01101110 01100101 01110100 01101011 01101001 01101100 01101100 01100101  netkille
    00000050: 01110010 00101110 01100011 01101110 10001111 11011001 01100110 00101111  r.cn..f/
    00000058: 01100111 00000000 01010100 00001110 01001110 00000000 10001000 01001100  g.T.N..L
    00000060: 00000000 00001101 00000000 00001010  ....

A hexadecimal view is easier to read.

    neo@MacBook-Pro ~/workspace/netkiller % hexdump -C netkiller.bin
    00000000  00 00 04 00 00 ff 00 00  00 17 48 76 e8 00 40 48  |..........Hv..@H|
    00000010  f5 c3 40 09 21 fb 54 43  ce 28 01 00 a5 96 48 66  |..@.!.TC.(....Hf|
    00000020  6f 5c f0 00 2f 4e 65 74  6b 69 6c 6c 65 72 20 4a  |o\../Netkiller J|
    00000030  61 76 61 20 e6 89 8b e6  9c ad 20 2d 20 68 74 74  |ava ...... - htt|
    00000040  70 3a 2f 2f 77 77 77 2e  6e 65 74 6b 69 6c 6c 65  |p://www.netkille|
    00000050  72 2e 63 6e 8f d9 66 2f  67 00 54 0e 4e 00 88 4c  |r.cn..f/g.T.N..L|
    00000060  00 0d 00 0a                                       |....|
    00000064
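Because writeChars() stores no length, one common way to avoid hard-coding the character count is to write the length first. The sketch below shows that idea under stated assumptions: the class LengthPrefixedChars, its methods, and the choice of a 4-byte int prefix are all hypothetical and not part of the original program, and the round trip uses an in-memory buffer instead of netkiller.bin.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class LengthPrefixedChars {
    // Write the number of chars as an int, then the chars themselves.
    static void writeString(DataOutputStream out, String s) throws IOException {
        out.writeInt(s.length());
        out.writeChars(s);
    }

    // Read the length first, then exactly that many chars.
    static String readString(DataInputStream in) throws IOException {
        int length = in.readInt();
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(in.readChar());
        }
        return sb.toString();
    }

    // Write two strings back to back and read the first one back.
    static String roundTrip(String first, String second) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                writeString(out, first);
                writeString(out, second);
            }
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            return readString(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The reader no longer needs to know that the name has 3 chars
        System.out.println(roundTrip("\u9648\u666F\u5CF0", "http://www.netkiller.cn"));
    }
}
```

writeUTF() applies the same principle with a 2-byte prefix that counts encoded bytes rather than chars.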