NIO Prelude: Path, Files, and AsynchronousFileChannel

Date: 2019-11-18

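Java 1.4 introduced the nio package, and Java 1.7 added true AIO (asynchronous I/O). AsynchronousFileChannel is a typical class for handling files asynchronously.

Previously, file handling was blocking: we had to wait for a write to finish before moving on, and likewise for reads, where the process could only sit idle until the kernel had the data ready. AsynchronousFileChannel lets us start a file operation, do other work, and come back to the data only when we actually need it.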

Path

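Along with NIO.2 in Java 7, many new classes were added to replace the old I/O style.
Path and Files are two of the more basic ones, so here is a brief introduction.

Creating a Path

// Create from a single path string; note that this is the static factory on Paths, not Path
Path path = Paths.get("data.txt"); // paths starting with "/" (Unix) or a drive letter (Windows) are absolute; on Windows a leading "/" alone is resolved against the current drive, often C:
// Create from several segments: basePath is the base, relativePath is relative to it, and further segments may follow as extra arguments
Path path2 = Paths.get(basePath, relativePath); // signature: Paths.get(String first, String... more)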
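To make the multi-segment form concrete, here is a minimal sketch. The class name PathCreateDemo, the helper join, and the segment names are made up for illustration; only Paths.get and the Path methods come from the JDK.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathCreateDemo {
    // Joins a base segment and any further segments into one Path (hypothetical helper).
    static Path join(String base, String... more) {
        return Paths.get(base, more);
    }

    public static void main(String[] args) {
        Path relative = Paths.get("data.txt");
        Path joined = join("logs", "2019", "app.log"); // illustrative names only

        System.out.println(relative.isAbsolute());  // false
        System.out.println(joined.getNameCount());  // 3
        System.out.println(joined.getFileName());   // app.log
    }
}
```

The separator that appears when a joined path is printed depends on the platform, which is why the sketch prints the name count and file name instead of the whole path.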

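Converting between Path, File, and URI:

File file = path.toFile();
Path fromFile = file.toPath();
URI uriFromPath = path.toUri();
URI uriFromFile = file.toURI();

Normalized, absolute, and real paths

A path may use . for the current directory and ../ for the parent directory, but this notation sometimes leaves redundant elements in the path. The following methods deal with that.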

Path path = Paths.get("./src");
System.out.println("path = " + path);//path = .\src
System.out.println("normalize : "+ path.normalize());//normalize : src
System.out.println("toAbsolutePath : "+path.toAbsolutePath());//toAbsolutePath : C:\Users\Dell\IdeaProjects\test\.\src
System.out.println("toRealPath : "+path.toRealPath());//toRealPath : C:\Users\Dell\IdeaProjects\test\src

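All of these methods still return a Path:
normalize(): removes redundant . and .. elements
toAbsolutePath(): converts to an absolute path
toRealPath(): the real path, roughly equivalent to calling both methods above; by default it also resolves symbolic links. (The path must exist when this method is called, otherwise an exception is thrown.)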
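The difference between the three methods is easiest to see on a path with redundant elements and on a path that does not exist. The following is a small sketch; PathResolveDemo, realOrNull, and the path names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathResolveDemo {
    // Returns the real path, or null when toRealPath() fails (for example, the path is missing).
    static Path realOrNull(Path p) {
        try {
            return p.toRealPath();
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path messy = Paths.get("a/./b/../c");
        // normalize() is purely syntactic: it never touches the file system
        System.out.println(messy.normalize().equals(Paths.get("a/c"))); // true
        // toAbsolutePath() resolves against the working directory but keeps . and ..
        System.out.println(messy.toAbsolutePath().isAbsolute());        // true
        // toRealPath() needs the path to exist
        System.out.println(realOrNull(Paths.get("surely-missing-demo-dir")));
        Path tmp = Files.createTempFile("real-path-demo", ".txt");
        System.out.println(realOrNull(tmp) != null);                    // true
        Files.delete(tmp);
    }
}
```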

Files

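Although Files looks like a utility class for File, it actually lives in the java.nio.file package. It is a utility class introduced later and is generally used together with Path.

Here are a few commonly used methods:

Copy
- copy(Path source, Path target): copies from Path to Path; an optional third argument takes copy options, such as whether to overwrite
- copy(InputStream in, Path target): from a stream to a Path
- copy(Path source, OutputStream out): from a Path to a stream
Create files and directories
- createFile(Path path): creates a file
- createDirectory(Path dir): creates a directory and fails if the parent directory does not exist
- createDirectories(Path dir): creates a directory, creating any missing parent directories first
Move: move(Path source, Path target) moves or renames a file. Directories can be moved as well, although moving a non-empty directory may fail when its entries would have to be moved to a different file store. Overwriting is optional here too.
Delete: delete(Path path) deletes a file or an empty directory
Check existence: exists(Path path)
Write text: Files.write(path, Arrays.asList("落霞與孤鶩齊飛，","秋水共長天一色"), StandardOpenOption.APPEND);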

See the API documentation for the full list; I won't go through every method here.
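As a rough illustration of how these calls fit together, the sketch below creates nested directories, writes and appends text, then copies and renames the file, all inside a temporary directory. FilesDemo, run, and the file names are invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

public class FilesDemo {
    // Runs create -> write -> append -> copy -> move and returns the lines of the final file.
    static List<String> run() {
        try {
            Path root = Files.createTempDirectory("files-demo");
            Path dir = Files.createDirectories(root.resolve("a").resolve("b")); // creates "a" too
            Path src = dir.resolve("poem.txt");
            // Without options, write() creates the file or truncates an existing one
            Files.write(src, Arrays.asList("line 1"), StandardCharsets.UTF_8);
            // APPEND alone requires the file to exist already
            Files.write(src, Arrays.asList("line 2"), StandardCharsets.UTF_8,
                    StandardOpenOption.APPEND);
            Path copy = root.resolve("copy.txt");
            Files.copy(src, copy, StandardCopyOption.REPLACE_EXISTING);
            Path moved = root.resolve("moved.txt");
            Files.move(copy, moved, StandardCopyOption.REPLACE_EXISTING); // acts as a rename
            return Files.readAllLines(moved, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // [line 1, line 2]
    }
}
```

The sketch leaves its temporary directory behind; cleaning it up is exactly the non-empty directory deletion that walkFileTree() handles.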

Files.walkFileTree()

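This is a rather powerful method. Previously, searching a directory for a file or deleting a non-empty directory meant writing the recursion ourselves (or maintaining a stack by hand), which is tedious and, on very deep trees, can overflow the call stack. walkFileTree() lets us traverse a directory tree elegantly.

First, here is the definition of one of its overloads:

walkFileTree(Path start, FileVisitor<? super Path> visitor)

The first parameter is the root directory of the traversal and the second is an interface that controls the traversal. Here is its definition:

public interface FileVisitor<T> {
    // What to do before visiting the entries of a directory
    FileVisitResult preVisitDirectory(T dir, BasicFileAttributes attrs) throws IOException;

    // What to do when visiting a file
    FileVisitResult visitFile(T file, BasicFileAttributes attrs) throws IOException;

    // What to do when a file cannot be visited
    FileVisitResult visitFileFailed(T file, IOException exc) throws IOException;

    // What to do after visiting the entries of a directory
    FileVisitResult postVisitDirectory(T dir, IOException exc) throws IOException;
}

The interface defines four methods that specify what to do before visiting a directory, when visiting a file, when a visit fails, and after visiting a directory. Those are the only moments that matter when we walk a tree by hand, so with this interface we no longer need to write the recursion ourselves.

FileVisitor declares four abstract methods. Sometimes we only care about what happens when a file is visited, not before or after a directory, yet we would still have to implement all of them, which is bloated.
In that case we can use its adapter class, SimpleFileVisitor, which provides trivial default implementations of all four methods, so we only override the ones we need.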
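To show how little code the adapter requires, here is a sketch that counts regular files by overriding visitFile only. CountFilesDemo, countFiles, and demo are hypothetical names, and the temporary tree exists only for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.atomic.AtomicInteger;

public class CountFilesDemo {
    // Counts regular files under root; the other three callbacks keep their defaults.
    static int countFiles(Path root) throws IOException {
        final AtomicInteger count = new AtomicInteger();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    count.incrementAndGet();
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return count.get();
    }

    // Builds root/a.txt and root/sub/b.txt in a temporary directory, then counts them.
    static int demo() {
        try {
            Path root = Files.createTempDirectory("count-demo");
            Files.createFile(root.resolve("a.txt"));
            Path sub = Files.createDirectory(root.resolve("sub"));
            Files.createFile(sub.resolve("b.txt"));
            return countFiles(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 2
    }
}
```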

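Here is a demo that deletes a non-empty directory:

Deleting a non-empty directory

Path path = Paths.get("D:/dir/test");
Files.walkFileTree(path,new SimpleFileVisitor<Path>(){
    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        System.out.println("Deleting file: "+file);
        Files.delete(file);
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        if (exc != null) throw exc; // iterating this directory failed, so do not delete it
        // By now every entry inside has been visited and deleted, so the directory is empty
        System.out.println("Deleting directory: "+dir);
        Files.delete(dir);
        return FileVisitResult.CONTINUE;
    }
});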

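The return value of the methods above is an enum that decides whether the traversal continues:

CONTINUE: continue the traversal
SKIP_SIBLINGS: continue, but skip the siblings of the current file or directory and go back up one level; when returned from preVisitDirectory, the entries of that directory are skipped as well
SKIP_SUBTREE: continue, but skip every entry in the current directory, files and subdirectories alike; it only has this effect when returned from preVisitDirectory, otherwise it behaves like CONTINUE
TERMINATE: stop the traversal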
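The behaviour of SKIP_SUBTREE is easy to get wrong, so here is a small sketch that skips one directory and records which files are still visited. SkipSubtreeDemo, the directory name "skipme", and the file names are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class SkipSubtreeDemo {
    // Collects the names of visited files, skipping any directory named "skipme".
    static List<String> visitedFiles(Path root) throws IOException {
        final List<String> names = new ArrayList<String>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                Path name = dir.getFileName(); // null for a root such as "/"
                if (name != null && name.toString().equals("skipme")) {
                    return FileVisitResult.SKIP_SUBTREE; // nothing inside is visited
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                names.add(file.getFileName().toString());
                return FileVisitResult.CONTINUE;
            }
        });
        return names;
    }

    // Builds root/keep.txt and root/skipme/hidden.txt, then walks the tree.
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("skip-demo");
            Files.createFile(root.resolve("keep.txt"));
            Path skipped = Files.createDirectory(root.resolve("skipme"));
            Files.createFile(skipped.resolve("hidden.txt"));
            return visitedFiles(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [keep.txt]
    }
}
```

hidden.txt never reaches visitFile, which shows that files inside a skipped directory are skipped along with its subdirectories.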

So when searching for a file, we can terminate the traversal as soon as the file is found. I won't spell out the implementation in detail.
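For reference, one possible shape of such a search is sketched below. It is not taken from the original article: FindFileDemo, find, demo, and the file names are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class FindFileDemo {
    // Returns the first file called `name` under root, or null if there is none.
    static Path find(Path root, final String name) throws IOException {
        final Path[] found = new Path[1]; // one-element holder the anonymous class can write to
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (file.getFileName().toString().equals(name)) {
                    found[0] = file;
                    return FileVisitResult.TERMINATE; // stop walking once we have a hit
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return found[0];
    }

    // Builds root/other.txt and root/a/target.txt, then searches for target.txt.
    static Path demo() {
        try {
            Path root = Files.createTempDirectory("find-demo");
            Files.createFile(root.resolve("other.txt"));
            Path a = Files.createDirectory(root.resolve("a"));
            Files.createFile(a.resolve("target.txt"));
            return find(root, "target.txt");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().getFileName()); // target.txt
    }
}
```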

AsynchronousFileChannel

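Before reading this section, please read another article first: "淺析Java NIO" (A Brief Analysis of Java NIO)

There are several asynchronous channels:

AsynchronousFileChannel: lock, read, write
AsynchronousSocketChannel: connect, read, write
AsynchronousDatagramChannel: read, write, send, receive (this class appeared in early JDK 7 builds but, as far as I know, is not part of the released java.nio.channels API)
AsynchronousServerSocketChannel: accept

They correspond to file I/O, TCP I/O, UDP I/O, and server-side TCP I/O, mirroring the non-asynchronous channels.

Here we only cover asynchronous file I/O: AsynchronousFileChannel

Creating an asynchronous file channel

AsynchronousFileChannel is created through a static factory method: open() opens an asynchronous channel for a Path:

Path path = Paths.get("data.txt");
AsynchronousFileChannel channel = AsynchronousFileChannel.open(path,StandardOpenOption.READ);

The first argument is the Path to open and the second is the open option (the access mode, if you like). It can be omitted, in which case the channel is opened for reading.

AsynchronousFileChannel supports two styles for both reading and writing: one returns a Future, the other registers a CompletionHandler callback. Here is a short demonstration of each.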

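Reading and writing with Future

Read:

Path path = Paths.get("data.txt");
AsynchronousFileChannel channel = AsynchronousFileChannel.open(path,StandardOpenOption.READ); // the second argument is the open option
ByteBuffer buffer = ByteBuffer.allocate(1024 * 1024);
// Returns immediately without blocking
Future<Integer> future = channel.read(buffer, 0); // the second argument is the file position to read from
// The main thread can carry on with other work
System.out.println("Main thread continues...");

// When the data is needed, first check whether the read has finished
while (!future.isDone()){
    System.out.println("Not finished yet...");
}
buffer.flip(); // switch to read mode first, so that remaining() is the number of bytes read
byte[] data = new byte[buffer.remaining()]; // sizing by limit() before flip() would cause a BufferUnderflowException
buffer.get(data);
System.out.println(new String(data));
buffer.clear();
channel.close();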
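The isDone() loop above spins the CPU while it waits. When the main thread has nothing left to do, Future.get() blocks until the result is ready instead. Below is a minimal sketch of that variant; FutureReadDemo, readStart, the 1 KiB buffer, and the 5 second timeout are arbitrary choices for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class FutureReadDemo {
    // Reads up to 1 KiB from the start of the file and decodes it as UTF-8.
    static String readStart(Path path) throws Exception {
        try (AsynchronousFileChannel channel =
                     AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            Future<Integer> future = channel.read(buffer, 0);
            // ... other work could happen here ...
            int n = future.get(5, TimeUnit.SECONDS); // blocks until the read completes
            if (n <= 0) {
                return ""; // -1 signals end of file, e.g. an empty file
            }
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    // Writes "hello" to a temporary file and reads it back asynchronously.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("future-read", ".txt");
            Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
            String text = readStart(tmp);
            Files.delete(tmp);
            return text;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // hello
    }
}
```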

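Write:

Path path = Paths.get("data2.txt");
if (!Files.exists(path)) {
    Files.createFile(path);
}
AsynchronousFileChannel channel = AsynchronousFileChannel.open(path, StandardOpenOption.WRITE);
ByteBuffer buffer = ByteBuffer.allocate(1024);
buffer.put("爲美好的世界獻上祝福".getBytes());
buffer.flip();
Future<Integer> future = channel.write(buffer, 0);
// The main thread carries on
System.out.println("Main thread continues...");

// When the result is needed
while (!future.isDone()){
    System.out.println("Write not finished");
    Thread.sleep(1);
}
System.out.println("Write finished!");

Unfortunately, the second argument of open cannot be StandardOpenOption.APPEND; in my experience the default implementation rejects it with an UnsupportedOperationException. Every write takes an explicit position, so writing at position 0 as above overwrites existing data from the start of the file. To append to a file that already has content, pass the current file size, channel.size(), as the position. (Someone on Stack Overflow also seems to have posted a workaround, though I did not fully follow it.)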
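A sketch of appending by position follows. It assumes no other writer touches the file between the size() call and the write, since the two steps are not atomic. AsyncAppendDemo, append, and demo are made-up names.

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AsyncAppendDemo {
    // Appends text by writing at the current end of the file.
    static void append(Path path, String text) throws Exception {
        try (AsynchronousFileChannel channel =
                     AsynchronousFileChannel.open(path, StandardOpenOption.WRITE)) {
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            long position = channel.size(); // current end of file
            while (buffer.hasRemaining()) { // a single write may be partial
                position += channel.write(buffer, position).get();
            }
        }
    }

    // Creates a file containing "first\n", appends "second\n", and returns the content.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("append-demo", ".txt");
            Files.write(tmp, "first\n".getBytes(StandardCharsets.UTF_8));
            append(tmp, "second\n");
            String content = new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8);
            Files.delete(tmp);
            return content;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Calling get() right after each write makes this sketch effectively synchronous; it is only meant to show the position arithmetic.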

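Reading and writing with a CompletionHandler callback

Read:

Path path = Paths.get("data.txt");
if (!Files.exists(path)) {
    Files.createFile(path);
}
AsynchronousFileChannel channel = AsynchronousFileChannel.open(path, StandardOpenOption.READ);
ByteBuffer buffer = ByteBuffer.allocate(1024*1024);
// Note that this uses a different overload of read()
channel.read(buffer, 0, buffer, new CompletionHandler<Integer, ByteBuffer>() {
    @Override
    public void completed(Integer result, ByteBuffer attachment) {
        System.out.println("Read finished, " + result + " bytes read");
        byte[] bytes = new byte[attachment.position()];
        attachment.flip();
        attachment.get(bytes);
        System.out.println(new String(bytes));
        attachment.clear();
    }

    @Override
    public void failed(Throwable exc, ByteBuffer attachment) {
        System.out.println("Read failed...");
    }
});
System.out.println("Main thread continues...");

This overload of read() takes a CompletionHandler callback: completed is invoked when the read succeeds and failed when it fails.
The first and second arguments are the destination buffer and the file position. The third is an attachment, which is passed through to the handler's methods (usually to carry context; the asynchronous large-file read later in this article does exactly that) and may be null. The fourth is the CompletionHandler itself, whose methods run on completion or failure.
Note that the buffer, the first argument of read, must not be touched until the asynchronous read has completed.
The first type parameter of CompletionHandler is the result type, here the number of bytes read, and the second is the type of the attachment, the third argument of read. In the example I used ByteBuffer.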
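When the caller eventually needs the result of a callback-style read, it has to wait for the handler somehow. One option, sketched here with a CountDownLatch, is to have both completed and failed release a latch. HandlerReadDemo, read, and demo are hypothetical names and the 1 KiB buffer is arbitrary.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class HandlerReadDemo {
    // Reads up to 1 KiB with a CompletionHandler and waits for the callback to fire.
    static String read(Path path) throws Exception {
        final CountDownLatch done = new CountDownLatch(1);
        final AtomicReference<String> text = new AtomicReference<String>();
        final AtomicReference<Throwable> error = new AtomicReference<Throwable>();
        try (AsynchronousFileChannel channel =
                     AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            channel.read(buffer, 0, buffer, new CompletionHandler<Integer, ByteBuffer>() {
                @Override
                public void completed(Integer result, ByteBuffer attachment) {
                    attachment.flip(); // empty if result was -1 (end of file)
                    text.set(StandardCharsets.UTF_8.decode(attachment).toString());
                    done.countDown();
                }

                @Override
                public void failed(Throwable exc, ByteBuffer attachment) {
                    error.set(exc);
                    done.countDown();
                }
            });
            done.await(); // wait inside the try block so the channel stays open
        }
        if (error.get() != null) {
            throw new IOException(error.get());
        }
        return text.get();
    }

    // Writes "callback" to a temporary file and reads it back.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("handler-read", ".txt");
            Files.write(tmp, "callback".getBytes(StandardCharsets.UTF_8));
            String result = read(tmp);
            Files.delete(tmp);
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // callback
    }
}
```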

Write:

Path path = Paths.get("data2.txt");
if (!Files.exists(path)){
    Files.createFile(path);
}
AsynchronousFileChannel channel = AsynchronousFileChannel.open(path, StandardOpenOption.WRITE);
ByteBuffer buffer = ByteBuffer.allocate(1024);
buffer.put("爲美好的世界獻上祝福".getBytes());
buffer.flip();
channel.write(buffer, 0, path, new CompletionHandler<Integer, Path>() {
    @Override
    public void completed(Integer result, Path attachment) {
        System.out.println("Write finished...");
    }

    @Override
    public void failed(Throwable exc, Path attachment) {
        System.out.println("Write failed...");
    }
});

System.out.println("Main thread continues...");

That covers all four ways of reading and writing.

But you may have noticed that when reading, with either the Future style or the callback style, the examples load the whole file into memory in a single read, so the buffer has to be at least as large as the file. With large files, memory use balloons and may even run out. Is there a way to read a large file with a buffer of only 1000 bytes?

The ever-helpful Stack Overflow says yes. Without further ado, here is the source:

import java.nio.*;
import java.nio.channels.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.io.IOException;

public class TryNio implements CompletionHandler<Integer, AsynchronousFileChannel> {

    // File position reached so far (long, so files over 2 GB work too)
    long pos = 0;
    ByteBuffer buffer = null;

    public void completed(Integer result, AsynchronousFileChannel attachment) {
        // A result of -1 means end of file: stop instead of issuing reads forever
        if (result == -1) {
            try { attachment.close(); } catch (IOException e) { e.printStackTrace(); }
            return;
        }
        pos += result;  // advance so the same data is not read twice

        // Process the data that was read; here it is simply printed
        System.out.print(new String(buffer.array(), 0, result, StandardCharsets.UTF_8));

        buffer.clear();  // reset the buffer for the next read

        // Start another asynchronous read
        attachment.read(buffer, pos , attachment, this );
    }

    public void failed(Throwable exc,
                       AsynchronousFileChannel attachment) {
        System.err.println ("Error!");
        exc.printStackTrace();
    }

    // Main logic
    public void doit() {
        Path file = Paths.get("data.txt");
        AsynchronousFileChannel channel =  null;
        try {
            channel = AsynchronousFileChannel.open(file);
        } catch (IOException e) {
            System.err.println ("Could not open file: " + file.toString());
            System.exit(1);
        }
        buffer = ByteBuffer.allocate(1000);

        // Start the asynchronous read
        channel.read(buffer, pos , channel, this );
        // This call returns immediately without blocking
    }

    public static void main (String [] args) {
        TryNio tn = new TryNio();
        tn.doit();
        // doit() returns immediately and the asynchronous read does not keep the JVM alive, so wait for input to stop the program from exiting.
        try { System.in.read(); } catch (IOException e) { }
    }
}

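The Stack Overflow answerer chose to implement the CompletionHandler interface directly rather than use an anonymous inner class, and gave this reason:

The magic happens when you initiate another asynchronous read during the complete method. This is why I discarded the anonymous class and implemented the interface itself.

In other words, the trick is to start the next asynchronous read from inside the completed method, passing the handler itself (this) along.

However, I tried the same thing with an anonymous inner class and found nothing magical about using a named class: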

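Path path = Paths.get("data.txt");
if (!Files.exists(path)) {
    Files.createFile(path);
}
AsynchronousFileChannel channel = AsynchronousFileChannel.open(path);
final ByteBuffer buffer = ByteBuffer.allocate(100);

channel.read(buffer, 0, channel, new CompletionHandler<Integer, AsynchronousFileChannel>() {
    long pos = 0;
    @Override
    public void completed(Integer result, AsynchronousFileChannel attachment) {
        // A result of -1 means end of file: stop instead of issuing reads forever
        if (result == -1) {
            return;
        }
        pos += result;  // advance so the same data is not read twice

        // Process the data that was read; here it is simply printed
        System.out.print(new String(buffer.array(),0,result));

        buffer.clear();  // reset the buffer for the next read

        // Start another asynchronous read
        attachment.read(buffer, pos , attachment, this );
    }

    @Override
    public void failed(Throwable exc, AsynchronousFileChannel attachment) {
        System.out.println("Read failed...");
    }
});

System.out.println("Main thread continues...");
new Scanner(System.in).nextLine(); // java.util.Scanner; keeps the JVM alive while reads are in flight

So I still don't quite understand why the answerer used a named implementation class instead of an anonymous inner class. Inside an anonymous class, this refers to the handler instance as well, so either form can re-issue the read.

This approach does read a large file asynchronously, but it has a drawback. It works by issuing the next asynchronous read from inside the completion callback, so every chunk (1000 bytes in the class above) requires starting a new asynchronous operation, and throughput is not as high as one might hope (although an asynchronous operation is still cheaper than a thread).
Another small flaw is that Chinese text can come out garbled. In UTF-8 a Chinese character usually takes 3 bytes, rare characters take 4, and a line feed takes 1, so a character can be split across two reads and decoded as garbage 😆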
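One way to avoid the garbling is to decode with a CharsetDecoder that keeps the undecoded tail of each chunk and prepends it to the next one. The sketch below simulates the chunks with a byte array rather than a channel; ChunkDecodeDemo and decodeInChunks are hypothetical names, and wiring this into the completion handler is left out.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

public class ChunkDecodeDemo {
    // Decodes UTF-8 bytes that arrive in chunks of chunkSize, even if a character is split.
    static String decodeInChunks(byte[] data, int chunkSize) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        // Up to 3 bytes of an incomplete character can be carried over between chunks
        ByteBuffer in = ByteBuffer.allocate(chunkSize + 4);
        CharBuffer out = CharBuffer.allocate(chunkSize + 4);
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            in.put(data, off, len);          // append the chunk after any leftover bytes
            in.flip();
            decoder.decode(in, out, false);  // false: more input may follow
            in.compact();                    // keep the undecoded tail for the next round
            out.flip();
            sb.append(out);
            out.clear();
        }
        in.flip();
        decoder.decode(in, out, true);       // true: this is the end of the input
        decoder.flush(out);
        out.flip();
        sb.append(out);
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u79cb\u6c34 abc"; // two 3-byte characters followed by ASCII
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // A chunk size of 2 splits every 3-byte character across reads
        System.out.println(decodeInChunks(bytes, 2).equals(text)); // true
    }
}
```

In the asynchronous reader, the same idea would mean compacting the read buffer instead of clearing it, so leftover bytes stay in front of the next chunk.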