Running and Terminating Shell Scripts from a Java Program

Date: 2019-11-12

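I ran into the following problem: controlling the running and termination of a shell script from a Java program. Specifically, the Java program needs at least three functions:

1. Run a shell script;
2. Wait for the shell script to finish;
3. Terminate a running shell script.

From the requirements, this looks fairly easy. Although I had never written a similar program, a quick Google search turned up an answer.

Runtime or ProcessBuilder can launch a program, and the waitFor() and destroy() methods of the Process class cover functions 2 and 3 respectively.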

import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;


public class ShellRunner extends Thread
{
  // Written by the runner thread, read by kill() from the main thread.
  private volatile Process proc;
  private String dir;
  private String shell;

  public ShellRunner(String dir, String shell)
  {
    super();
    this.proc = null;
    this.dir = dir;
    this.shell = shell;
  }

  @Override
  public void run() {
    try
    {
      ProcessBuilder builder = new ProcessBuilder("sh", dir + shell);
      builder.directory(new File(dir));

      proc = builder.start();
      System.out.println("Running ...");
      int exitValue = proc.waitFor();
      System.out.println("Exit Value: " + exitValue);
    }
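    catch (IOException e)
    {
      // Report the failure instead of silently discarding it.
      e.printStackTrace();
    }
    catch (InterruptedException e)
    {
      // Restore the interrupt flag so the caller can observe it.
      Thread.currentThread().interrupt();
    }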
  }

  public void kill()
  {
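    // proc is still null if the script has not started yet or failed to start.
    Process p = proc;
    if (p != null && this.getState() != State.TERMINATED) {
      p.destroy();
    }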
  }

  public static void main(String[] args) {
    ShellRunner runner = new ShellRunner("/tmp/", "run.sh");
    runner.start();

    InputStreamReader inputStreamReader = new InputStreamReader(System.in);
    BufferedReader reader = new BufferedReader(inputStreamReader);
    try
    {
      String line = null;
      while ( (line = reader.readLine()) != null ) {
        if (line.equals("kill")) {
          runner.kill();
        }
        else if (line.equals("break")) {
          break;
        }
        else {
          System.out.println(runner.getState());
        }
      }
      // Closing the BufferedReader also closes the wrapped InputStreamReader.
      reader.close();
    }
    catch (IOException e)
    {
      e.printStackTrace();
    }

  }
}

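Running the test program above shows that waitFor() correctly waits for the shell script to exit, but destroy() does not terminate the processes started by the shell script.

Why?

This is a bug.

JDK-4770092: Process.destroy() does not terminate grandchild processes. In the example above, the child process of the Java program is "sh run.sh", and every command in the shell script runs as a child of that "sh run.sh" process (possibly with grandchildren or more distant descendants of its own). As a result, the commands executed by the script are not terminated by Process.destroy(). This is a very old bug, but out of concern for cross-platform compatibility there is no official plan to fix it. Relying on Java alone for function 3 looks like a dead end.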
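To make the child versus grandchild distinction concrete, here is a minimal sketch that lists the descendants of a freshly started shell. It assumes Java 9 or later, where Process.descendants() and ProcessHandle are available; that API is not used anywhere in the original programs. The class name DescendantsDemo and the command "sleep 30 & wait" are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class DescendantsDemo {

  // Starts "sh -c 'sleep 30 & wait'", polls for up to about two seconds until
  // the backgrounded sleep shows up, and returns the descendant PIDs it saw.
  // Everything that was started is force-killed before returning.
  static List<Long> descendantPids() {
    Process child;
    try {
      child = new ProcessBuilder("sh", "-c", "sleep 30 & wait").start();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    List<Long> pids = Collections.emptyList();
    try {
      for (int i = 0; i < 20 && pids.isEmpty(); i++) {
        Thread.sleep(100);
        // The sleep is a child of sh, so it is a grandchild of this JVM.
        pids = child.descendants().map(ProcessHandle::pid).collect(Collectors.toList());
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      // Clean up the grandchild first, then the shell itself.
      child.descendants().forEach(ProcessHandle::destroyForcibly);
      child.destroyForcibly();
    }
    return pids;
  }

  public static void main(String[] args) {
    System.out.println("Descendants of the sh child: " + descendantPids());
  }
}
```

The listing is only a snapshot of the parent links that exist at the moment of the call, so it should not be read as a guarantee that every process the script ever started will be found.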

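The remaining problem comes down to this: how do you terminate every process in a process tree? Some of those processes may already have exited, which means some branches of the tree may already be disconnected.

A natural idea is to record the IDs of all processes rooted at the "sh run.sh" process and kill them together when needed. This takes a little background on Linux processes:

- Every Linux process has several ID attributes: PID (process ID), PPID (parent process ID), PGID (process group ID), and SID (the ID of the session the process belongs to).
- A child process inherits its parent's process group and session.
- A process can only create process groups that belong to the same session as itself, unless it creates a new session with the setsid system call.
- A process group cannot migrate between sessions. A process can change its process group, but only to another group within the same session.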
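The four IDs listed above can be inspected from Java by asking ps for them. The following is a minimal sketch, assuming a Linux ps that accepts the pid, ppid, pgid and sid column names; the class name PsIds and the helper parseRow are invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class PsIds {

  // Parses one data row of "ps -o pid,ppid,pgid,sid" into {pid, ppid, pgid, sid}.
  static long[] parseRow(String row) {
    String[] fields = row.trim().split("\\s+");
    if (fields.length != 4) {
      throw new IllegalArgumentException("expected 4 columns: " + row);
    }
    long[] ids = new long[4];
    for (int i = 0; i < 4; i++) {
      ids[i] = Long.parseLong(fields[i]);
    }
    return ids;
  }

  public static void main(String[] args) throws IOException {
    // "$$" expands to the PID of the shell that launches ps.
    Process ps = new ProcessBuilder("sh", "-c", "ps -o pid,ppid,pgid,sid -p $$").start();
    try (BufferedReader out = new BufferedReader(new InputStreamReader(ps.getInputStream()))) {
      out.readLine(); // skip the header line
      String row = out.readLine();
      if (row != null) {
        long[] ids = parseRow(row);
        System.out.println("PID=" + ids[0] + " PPID=" + ids[1]
            + " PGID=" + ids[2] + " SID=" + ids[3]);
      }
    }
  }
}
```

Because the child inherits its group and session, the PGID and SID printed here would normally match those of the Java process that started it.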

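It also takes a little knowledge of Linux command-line tools:

- "kill -9 -1234" kills every process in the process group whose PGID is 1234.
- strace (with the -f option) can trace the system calls made by a process and by all the child processes it forks.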
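The group form of kill can be issued from Java by passing the negated PGID as an argument. This is a small sketch rather than production code; the class GroupKill and its methods are hypothetical names, and the guard against small PGIDs is an added precaution that is not part of the original article.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class GroupKill {

  // Builds the argument list for "kill -9 -<pgid>".
  static List<String> killGroupCommand(long pgid) {
    // "kill -9 -1" would signal every process the user is allowed to signal,
    // so refuse anything that is not a plausible process group ID.
    if (pgid <= 1) {
      throw new IllegalArgumentException("refusing to signal process group " + pgid);
    }
    return Arrays.asList("kill", "-9", "-" + pgid);
  }

  // Runs the command and returns the exit status of kill.
  static int killGroup(long pgid) throws IOException, InterruptedException {
    return new ProcessBuilder(killGroupCommand(pgid)).inheritIO().start().waitFor();
  }

  public static void main(String[] args) {
    // Only prints the command; nothing is signalled here.
    System.out.println(killGroupCommand(1234)); // prints [kill, -9, -1234]
  }
}
```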

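With that background, the approach becomes clear: use strace to trace the setsid system call and record every SID those calls produce; when the process tree has to be killed, use ps to find all process groups in the tree that have not yet exited, and kill them all.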
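The SID-collection step can be sketched in Java as follows. It mirrors a grep-and-awk style of extraction: keep the lines that mention "setsid()" or "<... setsid resumed>" and take the last whitespace-separated field as the SID. The class name SetsidTrace and the sample trace lines are illustrative assumptions, not captured strace output.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetsidTrace {

  // Collects the SIDs returned by successful setsid calls in an strace log.
  static Set<Long> sessionIds(List<String> traceLines) {
    Set<Long> sids = new TreeSet<>();
    for (String line : traceLines) {
      if (!line.contains("setsid()") && !line.contains("<... setsid resumed>")) {
        continue; // not a setsid line
      }
      String[] fields = line.trim().split("\\s+");
      try {
        long sid = Long.parseLong(fields[fields.length - 1]);
        if (sid > 0) {
          sids.add(sid);
        }
      } catch (NumberFormatException e) {
        // A failed call ends with an error description, not a number; skip it.
      }
    }
    return sids;
  }

  public static void main(String[] args) {
    // Hypothetical log lines in the two shapes named above.
    List<String> sample = Arrays.asList(
        "4321  setsid() = 4321",
        "4330  <... setsid resumed> ) = 4330",
        "4340  setsid() = -1 EPERM (Operation not permitted)");
    System.out.println(sessionIds(sample)); // prints [4321, 4330]
  }
}
```

Doing the parsing in Java instead of a shell pipeline makes it easier to tolerate lines that do not end in a number, although it still depends on the strace output format.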

Put into code, this becomes:

import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;


public class ShellRunner2 extends Thread
{
  private Process proc;
  private String dir;
  private String shell;
  private File tmpFile;

  public ShellRunner2(String dir, String shell) throws IOException
  {
    super();
    this.proc = null;
    this.dir = dir;
    this.shell = shell;
    this.tmpFile = createTempFile(dir, shell);
  }

  @Override
  public void run() {
    try
    {
      ProcessBuilder builder = new ProcessBuilder("sh", "-c",
          "strace -o " + tmpFile.getPath() + " -f -e trace=setsid setsid sh " + dir + shell);
      builder.directory(new File(dir));

      proc = builder.start();
      System.out.println("Running ...");
      int exitValue = proc.waitFor();
      System.out.println("Exit Value: " + exitValue);
    }
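    catch (IOException e)
    {
      // Report the failure instead of silently discarding it.
      e.printStackTrace();
    }
    catch (InterruptedException e)
    {
      // Restore the interrupt flag so the caller can observe it.
      Thread.currentThread().interrupt();
    }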
  }

  public void kill()
  {
    if (this.getState() != State.TERMINATED) {
      try
      {
        ProcessBuilder builder = new ProcessBuilder("sh", "-c",
            "ps -o sid,pgid ax | " +
            "grep $(grep -e \"setsid()\" -e \"<... setsid resumed>\" " +
              tmpFile.getPath() +
            " | awk '{printf \" -e\" $NF}') | awk '{print $NF}' | " +
            "sort | uniq | sed 's/^/-/g' | xargs kill -9 2>/dev/null");
        builder.directory(new File(dir));
        // Use a distinct name so the field "proc" (the traced script) is not shadowed.
        Process killer = builder.start();
        killer.waitFor();
      }
      catch (IOException e)
      {
        e.printStackTrace();
      }
      catch (InterruptedException e)
      {
        e.printStackTrace();
      }
    }
  }

  public static void main(String[] args) throws IOException {
    ShellRunner2 runner = new ShellRunner2("/tmp/", "a.sh");
    runner.start();

    InputStreamReader inputStreamReader = new InputStreamReader(System.in);
    BufferedReader reader = new BufferedReader(inputStreamReader);
    try
    {
      String line = null;
      while ( (line = reader.readLine()) != null ) {
        if (line.equals("kill")) {
          runner.kill();
        }
        else if (line.equals("break")) {
          break;
        }
        else {
          System.out.println(runner.getState());
        }
      }
      // Closing the BufferedReader also closes the wrapped InputStreamReader.
      reader.close();
    }
    catch (IOException e)
    {
      e.printStackTrace();
    }

  }

  private File createTempFile(String dir, String prefix) throws IOException {
    String name = "." + prefix + "-" + System.currentTimeMillis();
    File tempFile = new File(dir, name);
    if (tempFile.createNewFile()) {
      return tempFile;
    }
    throw new IOException("Failed to create file " + tempFile.getPath() + ".");
  }
}

A targeted test:

The following program forks a child process and sets its PGID. For the test it is compiled into an executable named pgid_test.

#include <unistd.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

int main(int argc, char *argv[]) {
  pid_t pid;
  int slp = 10;

  if (argc > 1) {
      int tmp = atoi(argv[1]);
      if (tmp > 0 && tmp < 120) {
          slp = tmp;
      }
  }

  if ((pid = fork()) != 0) {
      if (setpgid(pid, 0) != 0) {
          fprintf(stderr, "setpgid() error - %s\n", strerror(errno));
      }
  }

  sleep(slp);
  
  return 0;
}

The shell scripts {a,b,c,d}.sh call one another to simulate the growth of a process tree whose processes have different SIDs or PGIDs.

a.sh

sh b.sh &
(sleep 15) &
sleep 10

b.sh

(sleep 12) &
sh c.sh &

c.sh

./pgid_test 20 &
setsid sh d.sh &
setsid sh d.sh &

d.sh

./pgid_test 30 &
(sleep 30; echo "bad killer " `date` >> /tmp/bad_killer) &

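Run the Java program and call kill(): all child processes are terminated successfully.

This approach is not especially elegant, and it may have problems:

- strace may report session IDs in more than those two forms, and the output may differ between strace versions.
- The shell script may start some processes as a different user, in which case kill may not have permission to terminate them.
- If a setsid call happens after ps runs but before kill runs, the processes in the session it creates are missed. Running the kill step several times can address this.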
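The "run it several times" remedy for the last point can be sketched as a small loop around whatever performs one ps-and-kill pass. The class RepeatedKill, the method runPasses, and the choice of pass count and pause are all assumptions made for illustration.

```java
public class RepeatedKill {

  // Runs killPass up to the given number of times, pausing between passes so
  // that sessions created during one pass can be picked up by the next one.
  // Returns the number of passes actually completed.
  static int runPasses(Runnable killPass, int passes, long pauseMillis) {
    int done = 0;
    for (int i = 0; i < passes; i++) {
      killPass.run();
      done++;
      if (i < passes - 1) {
        try {
          Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          break; // stop early if the calling thread is interrupted
        }
      }
    }
    return done;
  }

  public static void main(String[] args) {
    // Stand-in for a real kill pass; a caller would pass its own kill step here.
    int completed = runPasses(() -> System.out.println("kill pass"), 3, 200);
    System.out.println("Completed passes: " + completed);
  }
}
```

A fixed number of passes narrows the race window but does not close it; a script that keeps creating sessions could still outrun the loop.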
