http://www.cnblogs.com/me-sa/archive/2012/01/10/erlang0030.html
Supervisors are used to build an hierarchical process structure called a supervision tree, a nice way to structure a fault tolerant application.
--Erlang/OTP Doc
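The basic idea of a supervisor is to isolate and manage errors by building a hierarchy; concretely, it keeps its child processes alive by restarting them. If a supervisor is part of a process tree, it is terminated automatically by its own supervisor: when its supervisor tells it to shut down, it terminates all of its children in the reverse of their start order and finally terminates itself. The purpose of a restart is to bring the system back to a stable state; once it is stable again, a later failure can be retried. If initialization itself is unstable, the supervise-and-restart strategy that follows is of little value. In other words, the initialization phase of an application needs a reliability guarantee: it may read configuration files or load and recover data from a database, and it should wait for these to finish synchronously even if that takes a little longer. If the application depends on a non-local database or an external service, it can use a faster asynchronous startup instead, because such services often fail during normal operation anyway, so it hardly matters whether they come up a little earlier or later.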
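The distinction between synchronous and asynchronous initialization can be sketched as follows. This is only an illustration: the module name init_demo and the helpers load_local_config/0 and connect_remote/0 are hypothetical stand-ins, not part of log4erl. Local configuration is loaded inside init/1, so start_link does not return until it is done, while the connection to the external service is deferred to a message the server sends to itself.

```erlang
-module(init_demo).
-behaviour(gen_server).

-export([start_link/0, status/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

status() ->
    gen_server:call(?MODULE, status).

init([]) ->
    %% Synchronous part: the caller of start_link/0 blocks until this returns.
    Config = load_local_config(),
    %% Asynchronous part: handled in handle_info/2 after init/1 has returned.
    self() ! connect_remote,
    {ok, #{config => Config, remote => disconnected}}.

handle_call(status, _From, State) ->
    {reply, State, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(connect_remote, State) ->
    {noreply, State#{remote := connect_remote()}};
handle_info(_Other, State) ->
    {noreply, State}.

%% Hypothetical stand-in for reading a local configuration file.
load_local_config() ->
    [{level, debug}].

%% Hypothetical stand-in for connecting to an external service.
connect_remote() ->
    connected.
```

A real server would also need a retry policy for the deferred step; that is left out here to keep the sketch short.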
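In [Erlang 0025] Understanding Erlang/OTP - Application we used the log4erl project to study Erlang/OTP applications, and noted that the application's start function starts log4erl's top-level supervision tree. Today we continue from there: we look at how log4erl's supervision tree is built, and run experiments to see how the supervisor restores service through restarts. The process tree after starting it with application:start(log4erl):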
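Below is the start_link function from the log4erl_sup file. supervisor:start_link is synchronous: it does not return until all child processes have been started. supervisor:start_link invokes the callback function init/1.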
start_link(Default_logger) ->
    R = supervisor:start_link({local, ?MODULE}, ?MODULE, []),
    %log4erl:start_link(Default_logger),
    add_logger(Default_logger),
    ?LOG2("Result in supervisor is ~p~n",[R]),
    R.

%% The init/1 callback
init([]) ->
    {ok,
     {
       {one_for_one,3,10},
       []
     }
    }.
%% rabbit_sup.erl, from the well-known RabbitMQ
init([]) ->
    {ok, {{one_for_all, 0, 1}, []}}.

%% yaws_sup.erl, from the Yaws project (Yet Another Web Server)
init([]) ->
    ChildSpecs = child_specs(),
    %% 0, 1 means that we never want supervisor restarts
    {ok,{{one_for_all, 0, 1}, ChildSpecs}}.

%% ejabberd_sup, from the ejabberd project
init([]) ->
    Hooks =
        {ejabberd_hooks,
         {ejabberd_hooks, start_link, []},
    %% ......................... code omitted
    {ok, {{one_for_one, 10, 1},
          [Hooks,
           GlobalRouter,
           Cluster,
           ..................
           Listener]}}.
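To make the restart tuple in these init/1 callbacks concrete, here is a minimal sketch of a supervisor with a single child. The module name demo_sup and the child id demo_worker are hypothetical, and the worker is a bare spawned process rather than a real OTP behaviour. In {one_for_one, 3, 10} the first element is the restart strategy, and the two numbers mean that the supervisor gives up and terminates itself if more than 3 restarts occur within 10 seconds. With {one_for_all, 0, 1}, as in the rabbit_sup and yaws_sup examples, the first child crash already exceeds the limit.

```erlang
-module(demo_sup).
-behaviour(supervisor).

-export([start_link/0, init/1, start_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% {Strategy, MaxRestarts, MaxSeconds}
    RestartStrategy = {one_for_one, 3, 10},
    %% {Id, {M, F, A}, Restart, Shutdown, Type, Modules}
    Child = {demo_worker,
             {?MODULE, start_worker, []},
             permanent,
             5000,
             worker,
             [?MODULE]},
    {ok, {RestartStrategy, [Child]}}.

%% Runs in the supervisor process, so spawn_link links the worker to it.
start_worker() ->
    Pid = spawn_link(fun loop/0),
    {ok, Pid}.

%% The worker exits normally on 'stop'; being permanent, it is then restarted.
loop() ->
    receive
        stop -> ok
    end.
```

After demo_sup:start_link(), supervisor:which_children(demo_sup) lists the running worker.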
Note: one of the big differences between one_for_one and simple_one_for_one is that one_for_one holds a list of all the children it has (and had, if you don't clear it), started in order, while simple_one_for_one holds a single definition for all its children and works using a dict to hold its data. Basically, when a process crashes, the simple_one_for_one supervisor will be much faster when you have a large number of children.

Note: it is important to note that simple_one_for_one children are not respecting this rule with the Shutdown time. In the case of simple_one_for_one, the supervisor will just exit and it will be left to each of the workers to terminate on their own, after their supervisor is gone.

For the most part, writing a simple_one_for_one supervisor is similar to writing any other type of supervisor, except for one thing. The argument list in the {M,F,A} tuple is not the whole thing: the arguments you pass in supervisor:start_child(Sup, Args) are appended to it. That's right, supervisor:start_child/2 changes API. So instead of doing supervisor:start_child(Sup, Spec), which would call erlang:apply(M,F,A), we now have supervisor:start_child(Sup, Args), which calls erlang:apply(M,F,A++Args).
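The changed meaning of supervisor:start_child/2 is easier to see in a small example. The sketch below uses a hypothetical module sofo_demo whose worker simply remembers the arguments it was started with. According to the OTP documentation for supervisor:start_child/2, the list passed at call time is appended to the argument list from the child specification, which is what the worker here reports back.

```erlang
-module(sofo_demo).
-behaviour(supervisor).

-export([start_link/0, init/1, start_worker/2]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% A simple_one_for_one supervisor has exactly one child specification.
    ChildSpec = {worker,
                 {?MODULE, start_worker, [from_spec]},
                 temporary,
                 5000,
                 worker,
                 [?MODULE]},
    {ok, {{simple_one_for_one, 3, 10}, [ChildSpec]}}.

%% Invoked as apply(sofo_demo, start_worker, [from_spec] ++ Args),
%% where Args is the list given to supervisor:start_child/2.
start_worker(FromSpec, FromCall) ->
    Pid = spawn_link(fun() -> loop({FromSpec, FromCall}) end),
    {ok, Pid}.

loop(Args) ->
    receive
        {get_args, From} ->
            From ! {args, Args},
            loop(Args);
        stop ->
            ok
    end.
```

Calling supervisor:start_child(sofo_demo, [from_call]) starts a worker with start_worker(from_spec, from_call); sending it {get_args, self()} returns both arguments in that order.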
add_logger(Name) when is_atom(Name) ->
    N = atom_to_list(Name),
    add_logger(N);
add_logger(Name) when is_list(Name) ->
    C1 = {Name,
          {log_manager, start_link ,[Name]},
          permanent,
          10000,
          worker,
          [log_manager]},
    ?LOG2("Adding ~p to ~p~n",[C1, ?MODULE]),
    supervisor:start_child(?MODULE, C1).
Modules is a list of one element, the name of the callback module used by the child behavior. The exception to that is when you have callback modules whose identity you do not know beforehand (such as event handlers in an event manager). In this case, the value of Modules should be dynamic so that the whole OTP system knows who to contact when using more advanced features, such as releases.
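As an illustration of the dynamic case, the following sketch supervises a plain gen_event manager. The module name event_sup_demo and the registered name demo_event_mgr are hypothetical and are not taken from log4erl. Because the handlers installed in an event manager are not known when the child specification is written, the Modules field is the atom dynamic instead of a one-element list.

```erlang
-module(event_sup_demo).
-behaviour(supervisor).

-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% The event manager's callback modules are added at run time,
    %% so Modules is 'dynamic' rather than a list such as [some_module].
    EventManager = {demo_event_mgr,
                    {gen_event, start_link, [{local, demo_event_mgr}]},
                    permanent,
                    5000,
                    worker,
                    dynamic},
    {ok, {{one_for_one, 3, 10}, [EventManager]}}.
```

supervisor:which_children(event_sup_demo) then reports dynamic in the Modules position of the child entry.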
%% log4erl.conf file; I have condensed the contents slightly
%%mod
logger default_logger{
    file_appender default_app{
        dir = "./log", level = debug, file = default_log, type = size, max = 1000000, suffix = log, rotation = 50, format = ' %d %h:%m:%s.%i %l%n'
    }
}
%%mail mod
logger mail_logger{
    file_appender mail_app{
        dir = "./log", level = debug, file = mail_log, type = size, max = 1000000, suffix = log, rotation = 50, format = ' %d %h:%m:%s.%i %l%n'
    }
}
Let's follow the call chain and step through the code:
%==== File : log4erl_conf =======
%%log4erl_conf:conf(File).
conf(File) ->
    application:start(log4erl),  %% start log4erl
    Tree = parse(leex(File)),    %% parse the configuration file
    traverse(Tree).              %% walk the entries and build the supervision tree

%% Following the traversal: element/1 is called for each configuration entry
traverse([]) ->
    ok;
traverse([H|Tree]) ->
    element(H),
    traverse(Tree).

%% Our custom loggers take the {logger, Logger, Appenders} clause
element({cutoff_level, CutoffLevel}) ->
    log_filter_codegen:set_cutoff_level(CutoffLevel);
element({default_logger, Appenders}) ->
    appenders(Appenders);
element({logger, Logger, Appenders}) ->
    log4erl:add_logger(Logger),
    appenders(Logger, Appenders).
%==== File : log4erl =======
%% Continuing, we arrive at log4erl:add_logger/1
add_logger(Logger) ->
    try_msg({add_logger, Logger}).

%% try_msg is a generic wrapper that adds exception handling
try_msg(Msg) ->
    try
        handle_call(Msg)
    catch
        exit:{noproc, _M} ->
            io:format("log4erl has not been initialized yet. To do so, please run~n"),
            io:format("> application:start(log4erl).~n"),
            {error, log4erl_not_started};
        E:M ->
            ?LOG2("Error message received by log4erl is ~p:~p~n",[E, M]),
            {E, M}
    end.

%% Excerpt from handle_call
handle_call({add_logger, Logger}) ->
    log_manager:add_logger(Logger);

%==== File : log_manager =======
%% Control passes to log_manager's add_logger(Logger),
%% which ultimately calls log4erl_sup:add_logger(Logger); we analyzed that above
add_logger(Logger) ->
    log4erl_sup:add_logger(Logger).
%% After adding the logger, element/1 adds its appenders
appenders([]) ->
    ok;
appenders([H|Apps]) ->
    appender(H),
    appenders(Apps).

appenders(_, []) ->
    ok;
appenders(Logger, [H|Apps]) ->
    appender(Logger, H),
    appenders(Logger, Apps).

appender({appender, App, Name, Conf}) ->
    log4erl:add_appender({App, Name}, {conf, Conf}).

appender(Logger, {appender, App, Name, Conf}) ->
    log4erl:add_appender(Logger, {App, Name}, {conf, Conf}).
%==== File : log4erl =======
%% Appender = {Appender, Name}
add_appender(Logger, Appender, Conf) ->
    try_msg({add_appender, Logger, Appender, Conf}).

handle_call({add_appender, Logger, Appender, Conf}) ->
    log_manager:add_appender(Logger, Appender, Conf);

%==== File : log_manager =======
add_appender(Logger, {Appender, Name} , Conf) ->
    ?LOG2("add_appender ~p with name ~p to ~p with Conf ~p ~n",[Appender, Name, Logger, Conf]),
    log4erl_sup:add_guard(Logger, Appender, Name, Conf).

%==== File : log4erl_sup =======
add_guard(Logger, Appender, Name, Conf) ->
    C = {Name,
         {logger_guard, start_link ,[Logger, Appender, Name, Conf]},
         permanent,
         10000,
         worker,
         [logger_guard]},
    ?LOG2("Adding ~p to ~p~n",[C, ?MODULE]),
    supervisor:start_child(?MODULE, C).
%==== File : logger_guard =======
start_link(Logger, Appender, Name, Conf) ->
    %?LOG2("starting guard for logger ~p~n",[Logger]),
    {ok, Pid} = gen_server:start_link(?MODULE, [Appender, Name], []),
    case add_sup_handler(Pid, Logger, Conf) of
        {error, E} ->
            gen_server:call(Pid, stop),
            {error, E};
        _R ->
            {ok, Pid}
    end.

add_sup_handler(G_pid, Logger, Conf) ->
    ?LOG("add_sup()~n"),
    gen_server:call(G_pid, {add_sup_handler, Logger, Conf}).

handle_call({add_sup_handler, Logger, Conf}, _From, [{appender, Appender, Name}] = State) ->
    ?LOG2("Adding handler ~p with name ~p for ~p From ~p~n",[Appender, Name, Logger, _From]),
    try
        Res = gen_event:add_sup_handler(Logger, {Appender, Name}, Conf),
        {reply, Res, State}
    catch
        E:R ->
            {reply, {error, {E,R}}, State}
    end;
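gen_event:add_sup_handler links the event manager to the process that installs the event handler (here the logger_guard process), so let's change the code, comment this part out, and see what the supervision tree looks like:
    % ?LOG("add_sup()~n"),
    % gen_server:call(G_pid, {add_sup_handler, Logger, Conf}).
    ok.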
3> whereis(default_logger).
<0.45.0>
4> exit(whereis(default_logger),some_reason).
true
5> whereis(default_logger).
<0.45.0>
6> exit(whereis(default_logger),some_reason). %% gen_event sets process_flag(trap_exit, true) by default, so the some_reason exit signal does not kill it
true
7> whereis(default_logger).
<0.45.0>
8> exit(whereis(default_logger),kill). %% send the process a kill signal, which cannot be trapped
true
=SUPERVISOR REPORT==== 10-Jan-2012::10:35:21 === % first we see log4erl_sup report that a child process terminated
Supervisor: {local,log4erl_sup}
Context: child_terminated
Reason: killed
Offender: [{pid,<0.45.0>},
{name,"default_logger"},
{mfargs,{log_manager,start_link,["default_logger"]}},
{restart_type,permanent},
{shutdown,10000},
{child_type,worker}]
=PROGRESS REPORT==== 10-Jan-2012::10:35:21 === % log4erl_sup restarts default_logger; the new process pid is <0.69.0>
supervisor: {local,log4erl_sup}
started: [{pid,<0.69.0>},
{name,"default_logger"},
{mfargs,{log_manager,start_link,["default_logger"]}},
{restart_type,permanent},
{shutdown,10000},
{child_type,worker}]
=SUPERVISOR REPORT==== 10-Jan-2012::10:35:21 === % default_logger's exit reason becomes killed and propagates to its linked processes, so the corresponding logger_guard terminates
Supervisor: {local,log4erl_sup}
Context: child_terminated
Reason: killed
Offender: [{pid,<0.46.0>},
{name,default_app},
{mfargs,
{logger_guard,start_link,
[default_logger,file_appender,default_app,
{conf, [{dir,"./log"},{level,debug},{file,default_log},{type,size},
{max,1000000},{suffix,log}, {rotation,50},
{format," %d %h:%m:%s.%i %l%n"}]}]}},
{restart_type,permanent},
{shutdown,10000},
{child_type,worker}]
=PROGRESS REPORT==== 10-Jan-2012::10:35:21 === % logger_guard is restarted
supervisor: {local,log4erl_sup}
started: [{pid,<0.70.0>},
{name,default_app},
{mfargs,
{logger_guard,start_link,
[default_logger,file_appender,default_app,
{conf,
[{dir,"./log"},{level,debug}, {file,default_log},{type,size},
{max,1000000}, {suffix,log},{rotation,50},
{format," %d %h:%m:%s.%i %l%n"}]}]}},
{restart_type,permanent},
{shutdown,10000},
{child_type,worker}]
9> whereis(default_logger).
<0.69.0>
10> is_process_alive(pid(0,70,0)). % this is the newly started logger_guard process
true
11> exit(pid(0,70,0),some_reason). % send the process an exit signal
true
=SUPERVISOR REPORT==== 10-Jan-2012::11:07:51 ===
Supervisor: {local,log4erl_sup}
Context: child_terminated
Reason: some_reason
Offender: [{pid,<0.70.0>},
{name,default_app},
{mfargs,
{logger_guard,start_link,
[default_logger,file_appender,default_app,
{conf,
[{dir,"./log"},{level,debug},{file,default_log},{type,size},{max,1000000},
{suffix,log},{rotation,50},{format," %d %h:%m:%s.%i %l%n"}]}]}},
{restart_type,permanent},
{shutdown,10000},
{child_type,worker}]
12>
=PROGRESS REPORT==== 10-Jan-2012::11:07:51 ===
supervisor: {local,log4erl_sup}
started: [{pid,<0.76.0>},
{name,default_app},
{mfargs,
{logger_guard,start_link,
[default_logger,file_appender,default_app,
{conf,
[{dir,"./log"},{level,debug},{file,default_log},{type,size},{max,1000000},
{suffix,log},{rotation,50},{format," %d %h:%m:%s.%i %l%n"}]}]}},
{restart_type,permanent},
{shutdown,10000},{child_type,worker}]
12> is_process_alive(pid(0,70,0)).
false
13> whereis(default_logger). % the propagated exit signal has no effect on default_logger
<0.69.0>
14> whereis(log4erl_sup).
<0.44.0>
15> exit(whereis(log4erl_sup),some_reason). % a supervisor also sets process_flag(trap_exit, true) when it initializes
true
16> whereis(log4erl_sup).
<0.44.0>
17> exit(whereis(log4erl_sup),kill). % killing log4erl_sup stops the application
true
=CRASH REPORT==== 10-Jan-2012::13:26:23 ===
crasher:
initial call: gen_event:init_it/6
pid: <0.69.0>
registered_name: default_logger
exception exit: killed
in function gen_event:terminate_server/4
ancestors: [log4erl_sup,<0.43.0>]
messages: [{'EXIT',<0.76.0>,killed}]
links: [#Port<0.1891>,#Port<0.1885>]
dictionary: []
trap_exit: true
status: running
heap_size: 610
stack_size: 24
reductions: 720
neighbours:
18>
=CRASH REPORT==== 10-Jan-2012::13:26:23 ===
crasher:
initial call: gen_event:init_it/6
pid: <0.47.0>
registered_name: mail_logger
exception exit: killed
in function gen_event:terminate_server/4
ancestors: [log4erl_sup,<0.43.0>]
messages: [{'EXIT',<0.48.0>,killed}]
links: [#Port<0.546>]
dictionary: []
trap_exit: true
status: running
heap_size: 377
stack_size: 24
reductions: 411
neighbours:
18>
=CRASH REPORT==== 10-Jan-2012::13:26:25 ===
crasher:
initial call: application_master:init/4
pid: <0.42.0>
registered_name: []
exception exit: killed
in function application_master:terminate/2
ancestors: [<0.41.0>]
messages: []
links: [<0.6.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 610
stack_size: 24
reductions: 1555
neighbours:
18>
=INFO REPORT==== 10-Jan-2012::13:26:25 ===
application: log4erl
exited: killed
type: temporary
18>
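The difference between exit(Pid, some_reason) and exit(Pid, kill) seen in the session above can be reproduced without log4erl. The sketch below uses a hypothetical module trap_demo with a bare process that traps exits, much as gen_event and supervisor processes do. An ordinary exit signal arrives as an {'EXIT', From, Reason} message and the process keeps running, whereas kill cannot be trapped.

```erlang
-module(trap_demo).
-export([start/0]).

%% Starts a process that traps exits and returns its pid only after
%% the flag is set, so an early signal cannot slip in before it.
start() ->
    Parent = self(),
    Pid = spawn(fun() ->
                        process_flag(trap_exit, true),
                        Parent ! {ready, self()},
                        loop()
                end),
    receive
        {ready, Pid} -> Pid
    end.

%% Trapped exit signals arrive as ordinary messages.
loop() ->
    receive
        {'EXIT', _From, Reason} ->
            io:format("trapped exit signal: ~p~n", [Reason]),
            loop()
    end.
```

With P = trap_demo:start(), calling exit(P, some_reason) leaves the process alive, while exit(P, kill) terminates it with reason killed.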
Finally, here is the address of the log4erl project once more: http://code.google.com/p/log4erl/. I recommend downloading the code and running the experiments above yourself.