Garbled Chinese Characters When Reading Text Files in Java

Date: 2021-08-15


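I recently ran into a problem: when Java reads a text file (a CSV or TXT file, for example), any Chinese text in it comes out garbled. The reading code was as follows: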

List<String> lines = new ArrayList<String>();
BufferedReader br = new BufferedReader(new FileReader(fileName));
String line = null;
while ((line = br.readLine()) != null) {
    lines.add(line);
}
br.close();

After searching Baidu and Google, I finally found the cause. Let's start from the underlying principle:

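Reader is the parent class for reading characters in Java I/O, while InputStream is the parent class for reading bytes. InputStreamReader is the bridge between the two: it converts the bytes read during I/O into characters, delegating the actual decoding to StreamDecoder. Decoding requires a Charset. If you do not specify one, the default charset of the local environment is used, for example GBK in a Chinese-locale environment (JDK 18 and later default to UTF-8 unless configured otherwise). FileReader is a subclass of InputStreamReader, and the FileReader(String) constructor used above takes no charset argument, so it always decodes with that default. A UTF-8 file read on a system whose default is GBK therefore comes out garbled.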
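To make the mechanism concrete, here is a minimal sketch that decodes the same UTF-8 bytes with different charsets. The class name CharsetMismatchDemo and the helper decodeUtf8BytesAs are made up for illustration and are not part of the original code. The string literal uses Unicode escapes so that the demo does not depend on the encoding of the source file itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that bytes only become text once a charset is chosen.
class CharsetMismatchDemo {

    // Encodes the text as UTF-8, then decodes those bytes with the given charset.
    static String decodeUtf8BytesAs(String text, Charset charset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4E2D\u6587"; // the two characters of the word "Chinese"

        // The charset used when no charset is passed to a reader.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Matching charset: the text survives the round trip.
        System.out.println("As UTF-8: " + decodeUtf8BytesAs(original, StandardCharsets.UTF_8));

        // Mismatched charset: the same bytes are decoded as different characters.
        // GBK is an extended charset, so check that this runtime provides it.
        if (Charset.isSupported("GBK")) {
            System.out.println("As GBK: " + decodeUtf8BytesAs(original, Charset.forName("GBK")));
        }
        System.out.println("As ISO-8859-1: " + decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1));
    }
}
```

Only the first decode returns the original text. The others are what a reader produces when its charset does not match the file's actual encoding.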

In summary: when reading a data stream in Java, always specify the stream's encoding; otherwise the default charset of the local environment is used.

Based on the analysis above, the corrected code is:

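// Requires java.io.*, java.nio.charset.StandardCharsets and java.util.*
List<String> lines = new ArrayList<String>();
// Decode explicitly as UTF-8 instead of relying on the platform default charset.
// try-with-resources closes the reader even if readLine() throws.
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(fileName), StandardCharsets.UTF_8))) {
    String line;
    while ((line = br.readLine()) != null) {
        lines.add(line);
    }
}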
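As an alternative sketch (not the approach the original fix used), the same result can be had with java.nio.file.Files.readAllLines, which takes the charset as an explicit argument and closes the file itself. The class name ReadLinesSketch, the helper readLines and the temporary file are made up for illustration, and the sketch assumes the file really is UTF-8.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper class: reads all lines of a file with an explicit charset.
class ReadLinesSketch {

    // Reads every line of the file, decoding its bytes with the given charset.
    static List<String> readLines(Path file, Charset charset) throws IOException {
        return Files.readAllLines(file, charset);
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 file so that the demo is self-contained.
        Path tmp = Files.createTempFile("charset-demo", ".txt");
        try {
            // Unicode escapes keep the demo independent of the source file encoding.
            Files.write(tmp, "\u4E2D\u6587,abc\n".getBytes(StandardCharsets.UTF_8));
            for (String line : readLines(tmp, StandardCharsets.UTF_8)) {
                System.out.println(line);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Note that readAllLines loads the whole file into memory, so for very large files the BufferedReader loop is still the better fit.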