Troubleshooting memory leaks in Elixir/Erlang

Date: 2019-11-10

Tags: elixir, erlang, memory leak troubleshooting

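Preface

Memory leaks are a common problem for server-side programs. With Erlang, programmers do not manage memory by hand, so unless you write C drivers, typical memory problems are fairly easy to locate. This post summarizes the common types of memory leak and the techniques for tracking them down.

Types of memory leak

Process leaks

If etop is not available: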

iex(xxxx@xxxx.)1> :erlang.system_info(:process_count)
5369

process_count returns the number of processes that currently exist in the Erlang VM. If that number does not match what the workload actually needs, investigate further.
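When the process count looks too high, a natural next step is to see which kind of process is multiplying. The following is a minimal sketch, not from the original post: `ProcCensus` and `by_initial_call/1` are hypothetical names, and the grouping key (the `:initial_call` item of `:erlang.process_info/2`) is only one possible choice.

```elixir
defmodule ProcCensus do
  # Hypothetical helper: counts live processes per initial call {module, function, arity}.
  def by_initial_call(limit \\ 10) do
    :erlang.processes()
    |> Enum.map(&:erlang.process_info(&1, :initial_call))
    # process_info/2 returns :undefined for a process that has already exited.
    |> Enum.flat_map(fn
      {:initial_call, mfa} -> [mfa]
      :undefined -> []
    end)
    |> Enum.frequencies()
    |> Enum.sort_by(fn {_mfa, count} -> count end, :desc)
    |> Enum.take(limit)
  end
end

IO.inspect(ProcCensus.by_initial_call(5))
```

Processes started through proc_lib (gen_server, supervisor, and so on) tend to report `{:proc_lib, :init_p, 5}` here, so a large count under that key only narrows the search; `:current_function` may be a more useful grouping key in that case.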

Message queue buildup

iex(xxxx@xxxx.)5> spawn fn -> :etop.start([sort: :msg_q]) end 
#PID<0.6255.1>
                                              
========================================================================================
 'xxxx@xxxx.'                                  03:04:46
 Load:  cpu         0               Memory:  total      147234    binary       2839
        procs    5371                        processes   59008    code        42641
        runq        0                        atom         1722    ets          8239
                                              
Pid            Name or Initial Func    Time    Reds  Memory    MsgQ Current Function
----------------------------------------------------------------------------------------
<7796.0.0>     init                     '-'  339058   29540       0 init:loop/1         
<7796.1.0>     erts_code_purger         '-'  479850  285160       0 erts_code_purger:wai
<7796.2.0>     erts_literal_area_co     '-'  337591    2688       0 erts_literal_area_co
<7796.3.0>     erts_dirty_process_s     '-'   37924    2688       0 erts_dirty_process_s

A message backlog usually comes with growing memory, so sorting by either msg_q or memory makes the problem easy to spot.
If etop is not available:

iex(xxxx@xxxx.)8> Enum.map(:erlang.processes(), fn proc -> {:erlang.process_info(proc, :message_queue_len), proc} end) |> Enum.sort(fn({{_, a}, _}, {{_, b}, _}) -> a > b end) |> List.first
{{:message_queue_len, 0}, #PID<0.32638.0>}
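The one-liner above can fail if a process exits between `:erlang.processes()` and `:erlang.process_info/2`, because the latter then returns `:undefined`. A slightly more defensive sketch follows; `QueueTop` and `top/1` are hypothetical names introduced here for illustration.

```elixir
defmodule QueueTop do
  # Hypothetical helper: the n processes with the longest message queues.
  def top(n \\ 5) do
    entries =
      for pid <- :erlang.processes(),
          # A process that exited in the meantime yields :undefined and is skipped.
          {:message_queue_len, len} <- [:erlang.process_info(pid, :message_queue_len)] do
        {len, pid}
      end

    entries
    |> Enum.sort(:desc)
    |> Enum.take(n)
  end
end

IO.inspect(QueueTop.top(3))
```

Each entry is a `{queue_length, pid}` tuple, so the first element of the result is the process with the largest backlog.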

ETS table leaks

Find the ETS tables that use the most memory:

iex(7)> :ets.all() |> Enum.map(fn ets_name -> {:ets.info(ets_name, :memory), ets_name} end) |> Enum.sort(fn a, b -> a > b end)
[
  {18002942, :test},
  {41940, #Reference<0.3983585142.1897791489.87703>},
...
]
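Note that `:ets.info(tab, :memory)` reports words, not bytes. The sketch below converts the figure to bytes and keeps the table name next to the identifier; `EtsTop` and `by_bytes/1` are hypothetical names, not part of the original post.

```elixir
defmodule EtsTop do
  # Hypothetical helper: the n largest ETS tables as {bytes, name, table_id}.
  def by_bytes(n \\ 5) do
    wordsize = :erlang.system_info(:wordsize)

    entries =
      for tab <- :ets.all(),
          {:ok, words} <- [fetch_words(tab)] do
        {words * wordsize, :ets.info(tab, :name), tab}
      end

    entries
    |> Enum.sort_by(fn {bytes, _name, _tab} -> bytes end, :desc)
    |> Enum.take(n)
  end

  # :ets.info/2 returns :undefined when the table was deleted in the meantime.
  defp fetch_words(tab) do
    case :ets.info(tab, :memory) do
      words when is_integer(words) -> {:ok, words}
      _ -> :error
    end
  end
end

IO.inspect(EtsTop.by_bytes(3))
```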

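Overall memory analysis

:erlang.memory shows at a glance whether ETS tables are leaking.
Note that binaries larger than 64 bytes are reference-counted and stored outside the table, so they show up under the binary entry of :erlang.memory and are not counted under the ets entry.

65 bytes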

iex(1)> :ets.new(:test, [:public, :named_table])
:test
iex(2)> :erlang.memory
[
  total: 23688632,
  processes: 4940400,
  processes_used: 4939456,
  system: 18748232,
  atom: 463465,
  atom_used: 442288,
  binary: 27872,
  code: 8462310,
  ets: 589664
]
iex(3)> for num <- 1..1000000 do
...(3)> :ets.insert(:test, {num, :crypto.strong_rand_bytes(65)})
...(3)> end
[true, true, true, true, true, true, true, true, true, true, true, true, true,
 true, true, true, true, true, true, true, true, true, true, true, true, true,
 true, true, true, true, true, true, true, true, true, true, true, true, true,
 true, true, true, true, true, true, true, true, true, true, true, ...]
iex(4)> :erlang.memory
[
  total: 284511736,
  processes: 33626760,
  processes_used: 33625816,
  system: 250884976,
  atom: 463465,
  atom_used: 446381,
  binary: 112090520,
  code: 8553627,
  ets: 120619384
]

64 bytes

iex(1)> :ets.new(:test, [:public, :named_table])
:test
iex(2)> :erlang.memory
[
  total: 23569856,
  processes: 4778728,
  processes_used: 4777784,
  system: 18791128,
  atom: 463465,
  atom_used: 442288,
  binary: 70736,
  code: 8462310,
  ets: 589680
]
iex(3)> for num <- 1..1000000 do                                
...(3)> :ets.insert(:test, {num, :crypto.strong_rand_bytes(64)})
...(3)> end
[true, true, true, true, true, true, true, true, true, true, true, true, true,
 true, true, true, true, true, true, true, true, true, true, true, true, true,
 true, true, true, true, true, true, true, true, true, true, true, true, true,
 true, true, true, true, true, true, true, true, true, true, true, ...]
iex(4)> :erlang.memory
[
  total: 204325192,
  processes: 33373520,
  processes_used: 33372576,
  system: 170951672,
  atom: 463465,
  atom_used: 447944,
  binary: 39168,
  code: 8653586,
  ets: 152623976
]
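The two sessions above can be condensed into a small measurement. This is a sketch under assumptions: `BinThreshold` and `measure/2` are hypothetical names, the payload is built with `:binary.copy/2` instead of `:crypto.strong_rand_bytes/1` to avoid depending on the crypto application, and the deltas are only approximate because other processes allocate memory at the same time.

```elixir
defmodule BinThreshold do
  # Hypothetical helper: inserts `count` binaries of `size` bytes into a fresh
  # ETS table and returns how much the :binary and :ets totals grew, in bytes.
  def measure(size, count \\ 10_000) do
    tab = :ets.new(:bin_threshold_demo, [:public])
    before = :erlang.memory([:binary, :ets])

    Enum.each(1..count, fn n ->
      # Built at runtime so that every entry gets its own binary.
      :ets.insert(tab, {n, :binary.copy(<<rem(n, 256)>>, size)})
    end)

    later = :erlang.memory([:binary, :ets])
    :ets.delete(tab)

    %{binary: later[:binary] - before[:binary], ets: later[:ets] - before[:ets]}
  end
end

IO.inspect(BinThreshold.measure(65), label: "65 bytes")
IO.inspect(BinThreshold.measure(64), label: "64 bytes")
```

With 65-byte payloads most of the growth should appear under `:binary`; with 64-byte payloads it should appear under `:ets` instead.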

Oversized data

First, you should be able to estimate roughly how much memory the workload needs. process_info then helps find suspicious processes.

iex(11)> :erlang.process_info(:ets.info(:test, :owner), :memory)
{:memory, 28693220}

:sys.get_state can reveal bugs where a logic error makes a list or map grow without bound.

iex(xxxxx@xxxxx.)13> :sys.get_state(:erlang.list_to_pid('<0.2362.0>'))
{:state, {:local, :prometheus_sup}, :one_for_one, {[], %{}}, :undefined, 5, 1,
 [], 0, :prometheus_sup, []}
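The two steps above (rank processes by memory, then inspect the state of a suspect) can be sketched as follows. `MemTop`, `top/1` and `state_bytes/2` are hypothetical names; `state_bytes/2` assumes the target is an OTP-compliant process (gen_server, Agent, and so on), since `:sys.get_state/2` times out on a plain process.

```elixir
defmodule MemTop do
  # Hypothetical helper: the n processes using the most memory, as {bytes, pid}.
  def top(n \\ 5) do
    entries =
      for pid <- :erlang.processes(),
          # Skip processes that exited in the meantime (:undefined).
          {:memory, bytes} <- [:erlang.process_info(pid, :memory)] do
        {bytes, pid}
      end

    entries
    |> Enum.sort(:desc)
    |> Enum.take(n)
  end

  # Rough size in bytes of the state held by an OTP process.
  # :erts_debug.flat_size/1 returns words and ignores sharing.
  def state_bytes(pid, timeout \\ 1_000) do
    state = :sys.get_state(pid, timeout)
    :erts_debug.flat_size(state) * :erlang.system_info(:wordsize)
  end
end

{:ok, agent} = Agent.start_link(fn -> Enum.to_list(1..10_000) end)
IO.inspect(MemTop.top(3))
IO.inspect(MemTop.state_bytes(agent))
```

Calling `state_bytes/2` copies the whole state into the calling process, so on a production node it is safer to use it only on a few suspects.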

Memory estimation

https://github.com/okeuday/erlang_term
http://erlang.org/doc/efficiency_guide/advanced.html#id68923

Notes from reading the source code

Elixir data types

integer
float
boolean
atom
string
list
tuple

Map, MapSet, func, nil, ets?

#if ET_DEBUG
ERTS_GLB_INLINE unsigned tag_val_def(Wterm x, const char *file, unsigned line)
#else
ERTS_GLB_INLINE unsigned tag_val_def(Wterm x)
#define file __FILE__
#define line __LINE__
#endif
{
    static char *msg = "tag_val_def error";
    switch (x & _TAG_PRIMARY_MASK) {
    case TAG_PRIMARY_LIST:
    ET_ASSERT(_list_precond(x),file,line);
    return LIST_DEF;
      case TAG_PRIMARY_BOXED: {
      Eterm hdr = *boxed_val(x);
      ET_ASSERT(is_header(hdr),file,line);
      switch ((hdr & _TAG_HEADER_MASK) >> _TAG_PRIMARY_SIZE) {
        case (_TAG_HEADER_ARITYVAL >> _TAG_PRIMARY_SIZE):   return TUPLE_DEF;
        case (_TAG_HEADER_POS_BIG >> _TAG_PRIMARY_SIZE):    return BIG_DEF;
        case (_TAG_HEADER_NEG_BIG >> _TAG_PRIMARY_SIZE):    return BIG_DEF;
        case (_TAG_HEADER_REF >> _TAG_PRIMARY_SIZE):    return REF_DEF;
        case (_TAG_HEADER_FLOAT >> _TAG_PRIMARY_SIZE):  return FLOAT_DEF;
        case (_TAG_HEADER_EXPORT >> _TAG_PRIMARY_SIZE):     return EXPORT_DEF;
        case (_TAG_HEADER_FUN >> _TAG_PRIMARY_SIZE):    return FUN_DEF;
        case (_TAG_HEADER_EXTERNAL_PID >> _TAG_PRIMARY_SIZE):   return EXTERNAL_PID_DEF;
        case (_TAG_HEADER_EXTERNAL_PORT >> _TAG_PRIMARY_SIZE):  return EXTERNAL_PORT_DEF;
        case (_TAG_HEADER_EXTERNAL_REF >> _TAG_PRIMARY_SIZE):   return EXTERNAL_REF_DEF;
        case (_TAG_HEADER_MAP >> _TAG_PRIMARY_SIZE):    return MAP_DEF;
        case (_TAG_HEADER_REFC_BIN >> _TAG_PRIMARY_SIZE):   return BINARY_DEF;
        case (_TAG_HEADER_HEAP_BIN >> _TAG_PRIMARY_SIZE):   return BINARY_DEF;
        case (_TAG_HEADER_SUB_BIN >> _TAG_PRIMARY_SIZE):    return BINARY_DEF;
        case (_TAG_HEADER_BIN_MATCHSTATE >> _TAG_PRIMARY_SIZE): return MATCHSTATE_DEF;
      }
      break;
      }
      case TAG_PRIMARY_IMMED1: {
      switch ((x & _TAG_IMMED1_MASK) >> _TAG_PRIMARY_SIZE) {
        case (_TAG_IMMED1_PID >> _TAG_PRIMARY_SIZE):    return PID_DEF;
        case (_TAG_IMMED1_PORT >> _TAG_PRIMARY_SIZE):   return PORT_DEF;
        case (_TAG_IMMED1_IMMED2 >> _TAG_PRIMARY_SIZE): {
        switch ((x & _TAG_IMMED2_MASK) >> _TAG_IMMED1_SIZE) {
          case (_TAG_IMMED2_ATOM >> _TAG_IMMED1_SIZE):  return ATOM_DEF;
          case (_TAG_IMMED2_NIL >> _TAG_IMMED1_SIZE):   return NIL_DEF;
        }
        break;
        }
        case (_TAG_IMMED1_SMALL >> _TAG_PRIMARY_SIZE):  return SMALL_DEF;
      }
      break;
      }
    }
    erl_assert_error(msg, __FUNCTION__, file, line);
#undef file
#undef line
}
#endif

integer

small integer

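As the code shows, Erlang distinguishes small integers from big integers. A small integer uses N-4 bits for its value, where N is 64 or 32 depending on the system. The lowest 4 bits hold the tag 0xF, i.e. 0b1111.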

#define is_integer(x)       (is_small(x) || is_big(x))
/* fixnum ("small") access methods */
#if defined(ARCH_64)
#define SMALL_BITS  (64-4)
#define SMALL_DIGITS    (17)
#else
#define SMALL_BITS  (28)
#define SMALL_DIGITS    (8)
#endif
#define MAX_SMALL   ((SWORD_CONSTANT(1) << (SMALL_BITS-1))-1)
#define MIN_SMALL   (-(SWORD_CONSTANT(1) << (SMALL_BITS-1)))
#define _TAG_IMMED1_SMALL   ((0x3 << _TAG_PRIMARY_SIZE) | TAG_PRIMARY_IMMED1)
#define make_small(x)   (((Uint)(x) << _TAG_IMMED1_SIZE) + _TAG_IMMED1_SMALL)
#define is_small(x) (((x) & _TAG_IMMED1_MASK) == _TAG_IMMED1_SMALL)

Note the make_small macro in particular.

#define make_small(x)   (((Uint)(x) << _TAG_IMMED1_SIZE) + _TAG_IMMED1_SMALL)

So a small integer occupies exactly one machine word: 64 or 32 bits.
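To make the tagging arithmetic concrete, here is a sketch that mirrors make_small, is_small, MAX_SMALL and MIN_SMALL in Elixir. `SmallInt` and its functions are hypothetical names; the constants 4 and 0xF are taken from the macros above, and the word width is passed in explicitly.

```elixir
defmodule SmallInt do
  import Bitwise

  @tag_immed1_size 4
  @tag_immed1_small 0xF

  # Mirrors make_small(x): shift the value left by the tag size and add the tag.
  # The result is truncated to `word_bits` to imitate the C unsigned word.
  def make_small(x, word_bits) do
    mask = (1 <<< word_bits) - 1
    ((x <<< @tag_immed1_size) + @tag_immed1_small) &&& mask
  end

  # Mirrors is_small(x): the low 4 bits must equal the small-integer tag.
  def small?(word), do: (word &&& 0xF) == @tag_immed1_small

  # SMALL_BITS is word_bits - 4, one of which is the sign bit.
  def max_small(word_bits), do: (1 <<< (word_bits - 4 - 1)) - 1
  def min_small(word_bits), do: -(1 <<< (word_bits - 4 - 1))
end

bits = :erlang.system_info(:wordsize) * 8
IO.inspect(SmallInt.make_small(1, bits), base: :hex)
# Immediates need no heap space; one past MAX_SMALL becomes a boxed bignum.
IO.inspect(:erts_debug.flat_size(SmallInt.max_small(bits)))
IO.inspect(:erts_debug.flat_size(SmallInt.max_small(bits) + 1))
```

The last two lines check the boundary on the running VM: `:erts_debug.flat_size/1` reports heap words, so it should be 0 for a small integer and non-zero once the value no longer fits.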

big integer

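Is the lowest bit 0, i.e. is the term boxed? (TAG_PRIMARY_BOXED is 0x2, so the low two bits of a boxed term are 0b10.)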

#define make_big(x) make_boxed((x))
#define make_boxed(x)       _ET_APPLY(make_boxed,(x))
#define TAG_PRIMARY_BOXED   0x2
#define _unchecked_make_boxed(x) ((Uint)(x) + TAG_PRIMARY_BOXED)
#define _TAG_PRIMARY_MASK   0x3
#define _is_not_boxed(x)    ((x) & (_TAG_PRIMARY_MASK-TAG_PRIMARY_BOXED))
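The primary tag lives in the low two bits selected by _TAG_PRIMARY_MASK. The sketch below classifies a raw word by that tag. `PrimaryTag` is a hypothetical module; only the values 0x2 (boxed) and 0x3 (mask) appear in the excerpt, while 0x0 (header), 0x1 (list) and 0x3 (immediate) are recalled from erl_term.h and should be checked against the OTP version in use.

```elixir
defmodule PrimaryTag do
  import Bitwise

  @primary_mask 0x3

  # Classifies a raw term word by its two lowest bits.
  def classify(word) do
    case word &&& @primary_mask do
      0x0 -> :header
      0x1 -> :list
      0x2 -> :boxed
      0x3 -> :immed1
    end
  end

  # Mirrors _unchecked_make_boxed: a word-aligned pointer plus TAG_PRIMARY_BOXED.
  def make_boxed(ptr) when (ptr &&& 0x3) == 0, do: ptr + 0x2
end

IO.inspect(PrimaryTag.classify(PrimaryTag.make_boxed(0x1000)))
IO.inspect(PrimaryTag.classify(0x1F))
```

Because heap pointers are word aligned, their low two bits are free, which is what lets the VM store the tag there without losing address bits.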

atom

#define make_atom(x)  ((Eterm)(((x) << _TAG_IMMED2_SIZE) + _TAG_IMMED2_ATOM))
#define is_atom(x)  (((x) & _TAG_IMMED2_MASK) == _TAG_IMMED2_ATOM)

nil

A fixed Uint value.

#define NIL  ((~((Uint) 0) << _TAG_IMMED2_SIZE) | _TAG_IMMED2_NIL)

ets

map

flat_map

If the size does not exceed MAP_SMALL_MAP_LIMIT (32), the map is stored as a flat_map. Most maps fall into this category.

/* call site: heap words requested for a flat map with n entries */
erts_produce_heap(factory, 3 + 1 + (2 * n), 0);

ERTS_GLB_INLINE Eterm *
erts_produce_heap(ErtsHeapFactory* factory, Uint need, Uint xtra)
{
    Eterm* res;
    ASSERT((unsigned int)factory->mode > (unsigned int)FACTORY_CLOSED);
    if (factory->hp + need > factory->hp_end) {
        erts_reserve_heap__(factory, need, xtra);
    }
    res = factory->hp;
    factory->hp += need;
    return res;
}

That is, 4 + 2*n words are allocated, or (4 + 2*n) * wordsize bytes.

iex(1)> :erlang.system_info(:wordsize) 
8
iex(2)> :erts_debug.flat_size(%{})
4
iex(3)> :erlang_term.byte_size(%{})
# The Erlang term pointer itself takes 8 bytes; adding the 32 bytes above gives 40 bytes in total.
40
iex(4)> :erts_debug.flat_size(%{1 => 1})
6
iex(5)> :erlang_term.byte_size(%{1 => 1})
56
iex(6)> :erts_debug.flat_size(%{1 => 1, 2 => 2})
8
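The 4 + 2*n formula can be checked against the running VM. This is a sketch: `FlatMapSize` and its functions are hypothetical names, keys and values are small integers so that they take no extra heap space, and the measured figures may differ between OTP releases.

```elixir
defmodule FlatMapSize do
  # Heap words used by a map with n small-integer entries, built at runtime.
  def words(n) when n >= 1 do
    :erts_debug.flat_size(Map.new(1..n, &{&1, &1}))
  end

  # Words predicted by the allocation call: 3 + 1 + 2 * n.
  def predicted(n), do: 4 + 2 * n

  # The same figure in bytes (heap only, without the word holding the term itself).
  def bytes(n), do: words(n) * :erlang.system_info(:wordsize)

  # {n, measured, predicted} for n = 1..max.
  def compare(max), do: for(n <- 1..max, do: {n, words(n), predicted(n)})
end

IO.inspect(FlatMapSize.compare(4))
```

Every additional entry should cost two words (one key slot and one value slot) for as long as the map stays flat.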

hash_map
