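After the groundwork and the brain-teasing discussion in the previous two installments, we should have a fresh understanding of the question "How can the state of a Java application process be monitored, and if the monitored process goes down, is there a mechanism to bring it back up?" So in this installment, let us pull out the engineer's signature move, Ctrl + C and Ctrl + V, borrow a piece or two from the Resin source code, and put it into practice in a simple way.
Following the diagram, let us first demonstrate the result. Locate and run the program entry point, MonitorApp; the log output is as follows.
At this point, run the jps command in the console and take a look.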
18830 MonitorApp
18831 Resin
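Two processes, MonitorApp and Resin, have started successfully, exactly as with the Resin application server. What happens if we kill process 18831? The console prints some more log lines, and it looks as though the maid, Resin, has been started again.
Run jps in the console once more to confirm that the process ID has really changed.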
18830 MonitorApp
18935 Resin
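So how do we actually implement this? We may as well copy the pattern and imitate Resin's implementation (this is the signature move: imitate).
First, define the entry point of our monitoring application, MonitorApp. It is very simple: it just starts the task that creates the child process.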
package com.caucho.server.resin;

public class MonitorApp {
    public static void main(String[] args) {
        // Start the task that creates and supervises the child process.
        new WatchdogChildTask().start();
    }
}
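Next, write the code for the child-process task, WatchdogChildTask. Most of it comes from the Resin source, with a great deal removed and simplified. On closer inspection it is simple too: a loop keeps calling the run method of WatchdogChildProcess, so that the maid process is always kept running.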
package com.caucho.server.resin;

import java.util.concurrent.Executors;
import java.util.logging.Level;
import java.util.logging.Logger;

class WatchdogChildTask implements Runnable {
    private static final Logger log = Logger.getLogger(WatchdogChildTask.class.getName());
    private WatchdogChildProcess _process;

    /**
     * Starts management of the watchdog process
     */
    public void start() {
        // TODO: creating the thread pool manually would be better (Alibaba Java coding guidelines)
        Executors.newFixedThreadPool(1).execute(this);
    }

    /**
     * Main thread watching over the health of the Resin instances.
     */
    public void run() {
        try {
            int i = 0;
            long retry = Long.MAX_VALUE;
            // An int can never reach Long.MAX_VALUE, so this loop is effectively endless.
            while (i++ < retry) {
                WatchdogChildProcess process = new WatchdogChildProcess();
                _process = process;
                try {
                    log.log(Level.INFO, "Head steward here: about to get the maid nicknamed Resin running");
                    // Blocks until the child process exits.
                    process.run();
                } catch (Exception e) {
                    log.log(Level.WARNING, e.toString(), e);
                } finally {
                    _process = null;
                    if (process != null) {
                        log.log(Level.INFO, "Head steward here: the maid nicknamed Resin has run into trouble; releasing her resources so she can be started again");
                        process.kill();
                    }
                }
            }
        } catch (Exception e) {
            log.log(Level.WARNING, e.toString(), e);
        } finally {
            if (_process != null) {
                _process.kill();
                _process = null;
            }
        }
    }
}
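The TODO in start() suggests building the thread pool by hand rather than through Executors. A minimal sketch of what that could look like follows. The class name WatchdogExecutors, the thread name, and the queue size of 1 are all assumptions made for illustration; none of them come from Resin or from the original code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Resin: builds the single-thread pool explicitly.
public class WatchdogExecutors {

    /** Returns a pool with exactly one worker thread and a bounded queue of size 1. */
    public static ThreadPoolExecutor newWatchdogPool() {
        ThreadFactory factory = new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                // A descriptive thread name makes the watchdog easy to spot in thread dumps.
                return new Thread(r, "watchdog-child-task");
            }
        };
        return new ThreadPoolExecutor(
                1, 1,                                // core and maximum pool size
                0L, TimeUnit.MILLISECONDS,           // keep-alive time (unused when core == max)
                new ArrayBlockingQueue<Runnable>(1), // bounded queue instead of an unbounded one
                factory,
                new ThreadPoolExecutor.AbortPolicy()); // reject further tasks instead of queueing them silently
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newWatchdogPool();
        pool.execute(new Runnable() {
            @Override
            public void run() {
                System.out.println("running on " + Thread.currentThread().getName());
            }
        });
        pool.shutdown();
    }
}
```

With such a helper, start() could call WatchdogExecutors.newWatchdogPool().execute(this). The bounded queue and the explicit rejection policy are the point of building the pool by hand: the limits are visible in the code instead of hidden behind a factory method.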
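How the maid process actually gets started is delegated to WatchdogChildProcess. It first opens a socket communication port, then starts the Resin process with ProcessBuilder, and then waits for the maid process to establish a socket connection. Most of this also comes from the Resin source, heavily trimmed. One point worth stressing: to reuse it, you only need to change com.caucho.server.resin.Resin to the main class of the application you want to monitor.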
package com.caucho.server.resin;

import java.io.*;
import java.net.*;
import java.util.*;
import java.util.concurrent.atomic.AtomicReference;
import java.util.logging.Level;
import java.util.logging.Logger;

class WatchdogChildProcess {
    private static final Logger log = Logger.getLogger(WatchdogChildProcess.class.getName());
    private Socket _childSocket;
    private OutputStream _stdOs;
    private int _status = -1;
    // Typed reference so that getAndSet() returns a Process rather than an Object.
    private AtomicReference<Process> _processRef = new AtomicReference<Process>();

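    public void run() {
        ServerSocket ss = null;
        Socket s = null;
        try {
            // Port 0 lets the OS pick a free port; an empty host name resolves to the loopback address.
            ss = new ServerSocket(0, 5, InetAddress.getByName(""));
            int port = ss.getLocalPort();
            // String.valueOf avoids locale digit grouping (for example "54,321") in the log message.
            log.log(Level.INFO, "Head steward here: opened a socket on port {0} so the maids can talk to me in real time", String.valueOf(port));
            Process process = createProcess(port);
            if (process != null) {
                _processRef.compareAndSet(null, process);
                InputStream stdIs = process.getInputStream();
                _stdOs = process.getOutputStream();
                // TODO: do not create threads explicitly; use a thread pool (Alibaba Java coding guidelines)
                new Thread(new WatchdogProcessLogThread(stdIs)).start();
                s = connectToChild(ss);
                // Blocks until the child process exits.
                _status = process.waitFor();
                logStatus(_status);
            }
        } catch (Exception e) {
            log.log(Level.WARNING, e.toString(), e);
            try {
                // (statement missing from the source listing)
            } catch (Exception e1) {
            }
        } catch (Throwable e) {
            log.log(Level.WARNING, e.toString(), e);
        } finally {
            if (ss != null) {
                try {
                    ss.close();
                } catch (Throwable e) {
                }
            }
            try {
                if (s != null) {
                    s.close();
                }
            } catch (Throwable e) {
                log.log(Level.FINER, e.toString(), e);
            }
            synchronized (this) {
                // Wake up any thread waiting on this object for the child to finish.
                notifyAll();
            }
        }
    }
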
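    private void logStatus(int status) {
        String code = " (exit code=" + status + ")";
        // Assumed: the original log statement is missing from the source listing.
        log.log(Level.WARNING, "Child process exited" + code);
    }
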
    void kill() {
        // Take ownership of the process reference so that only one caller destroys it.
        Process process = _processRef.getAndSet(null);
        if (process != null) {
            try {
                process.destroy();
            } catch (Exception e) {
                log.log(Level.FINE, e.toString(), e);
            }
        }
        OutputStream stdOs = _stdOs;
        _stdOs = null;
        if (stdOs != null) {
            try {
                stdOs.close();
            } catch (Throwable e) {
                log.log(Level.FINE, e.toString(), e);
            }
        }
        Socket childSocket = _childSocket;
        _childSocket = null;
        if (childSocket != null) {
            try {
                childSocket.close();
            } catch (Throwable e) {
                log.log(Level.FINE, e.toString(), e);
            }
        }
        if (process != null) {
            try {
                // Wait until the destroyed process has actually terminated.
                process.waitFor();
            } catch (Exception e) {
                log.log(Level.INFO, e.toString(), e);
            }
        }
    }

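    /**
     * Waits for a socket connection from the child, returning the socket
     *
     * @param ss TCP ServerSocket from the watchdog for the child to connect to
     */
    private Socket connectToChild(ServerSocket ss)
            throws IOException {
        Socket s = null;
        try {
            // accept() throws SocketTimeoutException only if a timeout has been set
            // with ss.setSoTimeout(...); without one, the first accept() blocks.
            for (int i = 0; i < 120 && s == null; i++) {
                try {
                    s = ss.accept();
                } catch (SocketTimeoutException e) {
                    // timed out; try again
                }
            }
            if (s != null) {
                _childSocket = s;
            }
        } catch (Exception e) {
            log.log(Level.WARNING, e.toString(), e);
        }
        // Returning after the try block, rather than from a finally block, avoids silently swallowing errors.
        return s;
    }
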
    /**
     * Creates a new Process for the Resin JVM, initializing the environment
     * and passing value to the new process.
     *
     * @param socketPort the watchdog socket port
     */
    private Process createProcess(int socketPort)
            throws IOException {
        HashMap<String, String> env = buildEnv();
        ArrayList<String> jvmArgs = buildJvmArgs();
        ProcessBuilder builder = new ProcessBuilder();
        // ... code that applies env and passes socketPort to the child is omitted here ...
        builder = builder.command(jvmArgs);
        return builder.start();
    }

    private HashMap<String, String> buildEnv()
            throws IOException {
        HashMap<String, String> env = new HashMap<String, String>();
        StringBuilder classPath = new StringBuilder();
        // TODO: the path separator differs between systems; Windows uses a semicolon (;)
        String appPath = System.getProperty("user.dir");
        env.put("CLASSPATH", classPath.toString());
        // ... a great deal of code removed ...
        return env;
    }

    private ArrayList<String> buildJvmArgs() {
        ArrayList<String> jvmArgs = new ArrayList<String>();
        // ... more code removed ...
        return jvmArgs;
    }

    /**
     * Watchdog thread responsible for writing jvm-default.log by reading the
     * JVM's stdout and copying it to the log.
     */
    class WatchdogProcessLogThread implements Runnable {
        private InputStream _is;

        /**
         * @param is the stdout stream from the Resin
         */
        WatchdogProcessLogThread(InputStream is) {
            _is = is;
        }

        @Override
        public void run() {
            try {
                int len;
                byte[] data = new byte[4096];
                while ((len = _is.read(data, 0, data.length)) > 0) {
                    System.out.print(new String(data, 0, len));
                }
            } catch (Throwable e) {
                log.log(Level.WARNING, e.toString(), e);
            } finally {
                kill();
            }
        }
    }
}
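Because the bodies of buildEnv and buildJvmArgs were removed, it may help to see one way the child's command line could be assembled. The sketch below is an assumption, not the deleted Resin code. The class ChildCommand and the flag name -socketwait are placeholders chosen for illustration. The only constraint taken from the text is that the Resin class reads the port from args[1], so some flag has to occupy args[0].

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Resin: assembles the command used to launch the child JVM.
public class ChildCommand {

    /** Joins class path entries with the platform separator (":" on Unix-like systems, ";" on Windows). */
    public static String joinClassPath(List<String> entries) {
        return String.join(File.pathSeparator, entries);
    }

    /** Builds: java -cp classPath mainClass flag port, using the JVM that runs the watchdog. */
    public static List<String> build(String classPath, String mainClass, int socketPort) {
        List<String> command = new ArrayList<String>();
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        command.add(javaBin);
        command.add("-cp");
        command.add(classPath);
        command.add(mainClass);
        command.add("-socketwait");               // placeholder flag name; becomes args[0] in the child
        command.add(String.valueOf(socketPort));  // becomes args[1] in the child
        return command;
    }

    public static void main(String[] args) {
        String cp = joinClassPath(Arrays.asList(System.getProperty("user.dir"), "lib"));
        // The port is an arbitrary example value.
        System.out.println(build(cp, "com.caucho.server.resin.Resin", 6800));
    }
}
```

The resulting list can be passed directly to ProcessBuilder.command(List). Using File.pathSeparator is one way to address the TODO about the separator differing between Windows and other systems.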
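The following class deserves special attention, because when you take this model away you only need to modify the code of the Resin class below, which is in fact the application we want to monitor. It is simple: a single connect method communicates with the head steward, and as soon as that communication fails the process exits on its own.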
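package com.caucho.server.resin;

import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.logging.Level;
import java.util.logging.Logger;

public class Resin {
    private static ExecutorService executorService = Executors.newFixedThreadPool(1);
    private static final Logger log = Logger.getLogger(Resin.class.getName());

    public static void main(String[] args) {
        log.log(Level.INFO, "Maid nicknamed Resin here: the communication port given by the head steward is {0} {1}", args);
        // Read the port passed in as the second argument.
        int port = Integer.parseInt(args[1]);
        connect(port);
    }

    public static void connect(final int port) {
        log.log(Level.INFO, "Maid nicknamed Resin here: starting to talk to the head steward on port {0}", String.valueOf(port));
        executorService.execute(new Runnable() {
            public void run() {
                Socket socket = null;
                try {
                    // An empty host name resolves to the loopback address.
                    socket = new Socket("", port);
                    InputStream s = socket.getInputStream();
                    byte[] buf = new byte[1024];
                    int len;
                    // read() returns -1 once the head steward closes its end of the connection.
                    while ((len = s.read(buf)) != -1) {
                        log.log(Level.INFO, "Message received: {0}", new String(buf, 0, len));
                    }
                } catch (IOException e) {
                    log.log(Level.WARNING, "Maid nicknamed Resin here: error while talking to the head steward on port {0}", String.valueOf(port));
                } finally {
                    try {
                        // socket is still null if the connection attempt itself failed.
                        if (socket != null) {
                            socket.close();
                        }
                    } catch (IOException e) {
                        log.log(Level.WARNING, e.getMessage(), e);
                    }
                    log.log(Level.INFO, "Maid nicknamed Resin here: communication with the head steward on port {0} has ended; withdrawing now", String.valueOf(port));
                    // Assumed: the exit call is missing from the source listing; the text says the process exits itself.
                    System.exit(0);
                }
            }
        });
        log.log(Level.INFO, "Communication between maid and head steward complete");
    }
}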
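The maid's exit-on-disconnect behaviour relies on one property of TCP sockets: when the peer closes its end, a blocking read returns -1. The following self-contained sketch shows that property inside a single JVM, with both ends on the loopback interface. The class name SocketEofDemo and its method are made up for this illustration, and the supervisor "going away" is simulated by simply closing the accepted socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo, not part of Resin: shows how a child can detect that the supervisor is gone.
public class SocketEofDemo {

    /** Returns true if the child-side read reports end-of-stream after the parent side closes. */
    public static boolean childSeesEofWhenParentCloses() {
        try (ServerSocket ss = new ServerSocket(0, 5, InetAddress.getLoopbackAddress())) {
            try (Socket child = new Socket(InetAddress.getLoopbackAddress(), ss.getLocalPort());
                 Socket parentSide = ss.accept()) {
                // Simulate the supervisor disappearing: close its end of the connection.
                parentSide.close();
                InputStream in = child.getInputStream();
                // A read on the child side now reports end-of-stream instead of blocking forever.
                return in.read() == -1;
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("child saw end-of-stream: " + childSeesEofWhenParentCloses());
    }
}
```

In the Resin class above, the same signal ends the while loop in connect, after which the finally block runs and the process withdraws.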
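That completes the code. Feel free to take it, run it, and see for yourself whether it behaves as described. For the sake of the demonstration a lot of code was removed and some parts are rather inelegant; they should be tidied up according to the Alibaba Java coding guidelines. That is not the focus of this installment, though: the focus is the idea plus a little practice.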