Elixir High Availability Series (5): Supervisor

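Overview

The OTP platform is highly fault tolerant because it provides mechanisms for monitoring the state of processes. When a process fails, the error is detected promptly and the process can be restarted or otherwise handled.

Supervisors are what make this practical. A supervisor watches one or more child processes, and a supervisor can itself be supervised by another supervisor, forming a supervision tree that improves the availability of the whole system.

Note: a supervisor should do nothing but supervise, and should not contain business logic. The closer a supervisor is to the root of the supervision tree, the simpler it should be: a simple supervisor is less likely to fail, and it is the key to keeping the system highly available.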

Supervisor examples

The examples below use the Supervisor module provided by Elixir to build simple supervision setups that show how availability is improved.

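Supervision strategies

There are 4 supervision strategies:

  1. :one_for_one restarts only the process that failed
  2. :one_for_all restarts all processes when any one of them fails
  3. :rest_for_one restarts the failed process and every process started after it (that is, the processes that may depend on the failed one)
  4. :simple_one_for_one is similar to :one_for_one, but the supervisor holds a single child specification and all of its children are instances of that specification, started dynamically (later Elixir releases deprecate this strategy in favor of DynamicSupervisor)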
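
The difference between the first three strategies can also be checked programmatically. The following is a minimal sketch, assuming a recent Elixir version that accepts child specs as maps. The StrategyDemo modules, the :demo_a, :demo_b and :demo_c names, and the helper restarted_after_killing_b/1 are hypothetical and exist only for this illustration. It kills the middle child and reports which registered names ended up with a new PID.

```elixir
defmodule StrategyDemo.Worker do
  # Hypothetical worker used only for this sketch; it holds no real state.
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, :ok, name: name)

  @impl true
  def init(:ok), do: {:ok, nil}
end

defmodule StrategyDemo do
  @names [:demo_a, :demo_b, :demo_c]

  # Starts a supervisor whose children are started in the order a, b, c.
  def start(strategy) do
    children =
      for name <- @names do
        # The :id must be unique within one supervisor.
        %{id: name, start: {StrategyDemo.Worker, :start_link, [name]}}
      end

    Supervisor.start_link(children, strategy: strategy)
  end

  # Kills :demo_b and returns the names whose PID changed afterwards.
  def restarted_after_killing_b(strategy) do
    {:ok, sup} = start(strategy)
    old = Map.new(@names, &{&1, Process.whereis(&1)})

    Process.exit(old.demo_b, :kill)
    wait_for_new_pid(:demo_b, old.demo_b)
    # A synchronous call to the supervisor returns only after it has
    # finished the restart it was busy with.
    Supervisor.count_children(sup)

    restarted = for name <- @names, Process.whereis(name) != old[name], do: name
    Supervisor.stop(sup)
    restarted
  end

  # Polls until the name is registered to a PID other than the old one.
  defp wait_for_new_pid(name, old_pid) do
    case Process.whereis(name) do
      pid when is_pid(pid) and pid != old_pid ->
        :ok

      _ ->
        Process.sleep(10)
        wait_for_new_pid(name, old_pid)
    end
  end
end
```

Calling StrategyDemo.restarted_after_killing_b(:rest_for_one) should report :demo_b and :demo_c, because :demo_c was started after the killed child, while :demo_a keeps its PID.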

Switching between strategies is very simple. The examples below demonstrate 2 of them:

one_for_one

defmodule PseudoServerA do
  use GenServer

  def start_link(state, opts \\ []) do
    GenServer.start_link(__MODULE__, state, opts)
  end

  def handle_call(:display, _from, []) do
    {:reply, 'ServerA PID: ' ++ :erlang.pid_to_list(self()), []}
  end

  def handle_cast(:err, []) do
    {:stop, "stop ServerA", []}
  end
end

defmodule PseudoServerB do
  use GenServer

  def start_link(state, opts \\ []) do
    GenServer.start_link(__MODULE__, state, opts)
  end

  def handle_call(:display, _from, []) do
    {:reply, 'ServerB PID: ' ++ :erlang.pid_to_list(self()), []}
  end

  def handle_cast(:err, []) do
    {:stop, "stop ServerB", []}
  end
end

defmodule PseudoServerC do
  use GenServer

  def start_link(state, opts \\ []) do
    GenServer.start_link(__MODULE__, state, opts)
  end

  def handle_call(:display, _from, []) do
    {:reply, 'ServerC PID: ' ++ :erlang.pid_to_list(self()), []}
  end

  def handle_cast(:err, []) do
    {:stop, "stop ServerC", []}
  end
end

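defmodule SupervisorTest do
  import Supervisor.Spec

  # Despite its name, this is a plain helper function, not a Supervisor callback
  def init() do
    # worker(module, args) builds a child spec; the supervisor starts each
    # child by calling module.start_link with the given args
    children = [
      worker(PseudoServerA, [[], [name: :server_a]]),
      worker(PseudoServerB, [[], [name: :server_b]]),
      worker(PseudoServerC, [[], [name: :server_c]])
    ]

    # Start the supervisor with children
    Supervisor.start_link(children, strategy: :one_for_one)
  end
end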
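
Supervisor.Spec, which SupervisorTest imports, is deprecated in later Elixir releases. The following is a minimal sketch of the same one_for_one arrangement without it, assuming a recent Elixir version. The SpecFreeExample modules and the :spec_free_a, :spec_free_b and :spec_free_c names are hypothetical, chosen so that they do not clash with the article's own modules. The server also returns its PID as a string via inspect/1 rather than as a charlist.

```elixir
defmodule SpecFreeExample.Server do
  use GenServer

  # Takes a single keyword list so that the child_spec/1 generated by
  # `use GenServer` can call start_link/1 directly.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, [], name: Keyword.fetch!(opts, :name))
  end

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call(:display, _from, state) do
    {:reply, "PID: " <> inspect(self()), state}
  end
end

defmodule SpecFreeExample.Supervisor do
  use Supervisor

  def start_link(opts \\ []) do
    Supervisor.start_link(__MODULE__, :ok, opts)
  end

  @impl true
  def init(:ok) do
    children =
      for name <- [:spec_free_a, :spec_free_b, :spec_free_c] do
        # {Module, arg} expands to Module.child_spec(arg). The id is
        # overridden because three children of the same module would
        # otherwise share one id.
        Supervisor.child_spec({SpecFreeExample.Server, [name: name]}, id: name)
      end

    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

After SpecFreeExample.Supervisor.start_link(), a call such as GenServer.call(:spec_free_a, :display) can be used in the same way as the :display calls in the test session.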

Testing:

$ iex -S mix

# Start the supervisor and the 3 processes it supervises
iex(1)> SupervisorTest.init
{:ok, #PID<0.145.0>}

# After startup, the PIDs of the 3 processes are as follows
iex(2)> GenServer.call(:server_a, :display)
'ServerA PID: <0.146.0>'
iex(3)> GenServer.call(:server_b, :display)
'ServerB PID: <0.147.0>'
iex(4)> GenServer.call(:server_c, :display)
'ServerC PID: <0.148.0>'

# Make serverA fail by sending it the :err message
iex(5)> GenServer.cast(:server_a, :err)
:ok
iex(6)>
14:47:53.119 [error] GenServer :server_a terminating
** (stop) "stop ServerA"
Last message: {:"$gen_cast", :err}
State: []

nil

# After serverA fails, check the PIDs of the 3 processes again: the supervisor restarted only serverA, as :one_for_one specifies
iex(7)> GenServer.call(:server_a, :display)
'ServerA PID: <0.155.0>'
iex(8)> GenServer.call(:server_b, :display)
'ServerB PID: <0.147.0>'
iex(9)> GenServer.call(:server_c, :display)
'ServerC PID: <0.148.0>'

one_for_all

Let's try a different supervision strategy. Only this part of the code above needs to change:

# Start the supervisor with children
Supervisor.start_link(children, strategy: :one_for_one)

to:

# Start the supervisor with children
Supervisor.start_link(children, strategy: :one_for_all)

The test steps are the same as for one_for_one:

$ iex -S mix

# Start the supervisor and the 3 processes it supervises
iex(1)> SupervisorTest.init
{:ok, #PID<0.145.0>}

# After startup, the PIDs of the 3 processes are as follows
iex(2)> GenServer.call(:server_a, :display)
'ServerA PID: <0.146.0>'
iex(3)> GenServer.call(:server_b, :display)
'ServerB PID: <0.147.0>'
iex(4)> GenServer.call(:server_c, :display)
'ServerC PID: <0.148.0>'

# Make serverA fail by sending it the :err message
iex(5)> GenServer.cast(:server_a, :err)
:ok
iex(6)>
14:55:16.183 [error] GenServer :server_a terminating
 ** (stop) "stop ServerA"
 Last message: {:"$gen_cast", :err}
 State: []

 nil

# After serverA fails, check the PIDs of the 3 processes again: the supervisor restarted all of them, as :one_for_all specifies
iex(7)> GenServer.call(:server_a, :display)
'ServerA PID: <0.153.0>'
iex(8)> GenServer.call(:server_b, :display)
'ServerB PID: <0.154.0>'
iex(9)> GenServer.call(:server_c, :display)
'ServerC PID: <0.156.0>'

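Supervision tree

Supervision is not one-dimensional: a supervisor can also supervise other supervisors, forming a tree of supervision relationships.

Modify the test code above as follows (only the supervisor part changes):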

defmodule PseudoServerA do
  use GenServer

  def start_link(state, opts \\ []) do
    GenServer.start_link(__MODULE__, state, opts)
  end

  def handle_call(:display, _from, []) do
    {:reply, 'ServerA PID: ' ++ :erlang.pid_to_list(self()), []}
  end

  def handle_cast(:err, []) do
    {:stop, "stop ServerA", []}
  end
end

defmodule PseudoServerB do
  use GenServer

  def start_link(state, opts \\ []) do
    GenServer.start_link(__MODULE__, state, opts)
  end

  def handle_call(:display, _from, []) do
    {:reply, 'ServerB PID: ' ++ :erlang.pid_to_list(self()), []}
  end

  def handle_cast(:err, []) do
    {:stop, "stop ServerB", []}
  end
end

defmodule PseudoServerC do
  use GenServer

  def start_link(state, opts \\ []) do
    GenServer.start_link(__MODULE__, state, opts)
  end

  def handle_call(:display, _from, []) do
    {:reply, 'ServerC PID: ' ++ :erlang.pid_to_list(self()), []}
  end

  def handle_cast(:err, []) do
    {:stop, "stop ServerC", []}
  end
end

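defmodule SupervisorBranch do
  import Supervisor.Spec

  # opts comes from the parent's child spec, e.g. [name: :supervisor_branch]
  def start_link(opts \\ []) do
    children = [
      worker(PseudoServerA, [[], [name: :server_a]]),
      worker(PseudoServerB, [[], [name: :server_b]])
    ]

    # Pass the caller's options along so that the :name is actually registered
    Supervisor.start_link(children, [strategy: :one_for_one] ++ opts)
  end
end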

defmodule SupervisorRoot do
  import Supervisor.Spec

  def init() do
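    # supervisor/2 marks the child as a supervisor; it is started by calling
    # SupervisorBranch.start_link([name: :supervisor_branch])
    children = [
      supervisor(SupervisorBranch, [[name: :supervisor_branch]]),
      worker(PseudoServerC, [[], [name: :server_c]])
    ]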

    # Start the supervisor with children
    Supervisor.start_link(children, strategy: :one_for_all)
  end

end

The test proceeds as follows:

# Start the root supervisor
iex(1)> SupervisorRoot.init
{:ok, #PID<0.149.0>}

# After startup, check the PIDs of the 3 processes
iex(2)> GenServer.call(:server_a, :display)
'ServerA PID: <0.151.0>'
iex(3)> GenServer.call(:server_b, :display)
'ServerB PID: <0.152.0>'
iex(4)> GenServer.call(:server_c, :display)
'ServerC PID: <0.153.0>'

# Make serverA fail by sending it the :err message
iex(5)> GenServer.cast(:server_a, :err)
:ok
iex(6)>
15:31:15.846 [error] GenServer :server_a terminating
 ** (stop) "stop ServerA"
 Last message: {:"$gen_cast", :err}
 State: []

 nil

 # After serverA fails, only serverA is restarted, because its supervisor SupervisorBranch uses :one_for_one
 iex(7)> GenServer.call(:server_a, :display)
 'ServerA PID: <0.158.0>'
 iex(8)> GenServer.call(:server_b, :display)
 'ServerB PID: <0.152.0>'
 iex(9)> GenServer.call(:server_c, :display)
 'ServerC PID: <0.153.0>'

 # Make serverC fail by sending it the :err message
 iex(10)> GenServer.cast(:server_c, :err)
 :ok

 15:31:35.264 [error] GenServer :server_c terminating
 ** (stop) "stop ServerC"
 Last message: {:"$gen_cast", :err}
 State: []

 # After serverC fails, all processes are restarted, because its supervisor SupervisorRoot uses :one_for_all
 iex(11)> GenServer.call(:server_a, :display)
 'ServerA PID: <0.166.0>'
 iex(12)> GenServer.call(:server_c, :display)
 'ServerC PID: <0.168.0>'
 iex(13)> GenServer.call(:server_b, :display)
 'ServerB PID: <0.167.0>'

With a supervision tree, processes can be divided into groups, and each group can have its own supervision strategy.
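
The shape of a running tree can be read back with Supervisor.which_children/1. The following is a minimal sketch that assumes nothing from the modules above: the TreeInspect module is hypothetical, and it uses Agent processes as stand-in workers in a tree laid out like the one in this section (a :one_for_one branch holding two workers, next to a third worker under a :one_for_all root).

```elixir
defmodule TreeInspect do
  # Builds a two-level tree: root (:one_for_all) -> [branch (:one_for_one) -> [a, b], c]
  def start_tree do
    # Agents are used as trivial stand-in workers for this sketch.
    worker = fn id -> %{id: id, start: {Agent, :start_link, [fn -> id end]}} end

    branch = %{
      id: :branch,
      type: :supervisor,
      start:
        {Supervisor, :start_link,
         [[worker.(:a), worker.(:b)], [strategy: :one_for_one]]}
    }

    Supervisor.start_link([branch, worker.(:c)], strategy: :one_for_all)
  end

  # Returns the child ids of a supervisor, descending into child supervisors.
  # Results are sorted so the output does not depend on the order in which
  # which_children/1 lists the children.
  def shape(sup) do
    sup
    |> Supervisor.which_children()
    |> Enum.map(fn
      {id, pid, :supervisor, _modules} when is_pid(pid) -> {id, shape(pid)}
      {id, _pid, _type, _modules} -> id
    end)
    |> Enum.sort()
  end
end
```

For the tree built by start_tree/0, shape/1 returns a nested list in which the branch appears as a tuple of its id and its own children, which makes the grouping of processes under each strategy visible at a glance.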

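Summary

With supervision in place, the state of every process can be tracked promptly, and a supervision tree allows different recovery mechanisms to be combined. Used well, the Supervisor module can therefore greatly improve the availability of a system.

For details on the Supervisor module, see: http://elixir-lang.org/docs/stable/elixir/Supervisor.html

Source: http://blog.iotalabs.io/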
