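During the Mid-Autumn Festival I was out happily drinking coffee and playing on my laptop... when a Redis alert suddenly fired. Sentry surfaced this error from the application side: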
MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk. Commands that may modify the data set are disabled,
because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option).
Please check the Redis logs for details about the RDB error.
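So I happily got back to troubleshooting. It looked like a problem while taking the RDB snapshot for persistence, so I logged on to the Redis machine and checked the logs to pin it down: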
420:M 14 Sep 15:56:27.067 # Can't save in background: fork: Cannot allocate memory
420:M 14 Sep 15:56:33.071 * 10000 changes in 60 seconds. Saving...
420:M 14 Sep 15:56:33.072 # Can't save in background: fork: Cannot allocate memory
420:M 14 Sep 15:56:39.079 * 10000 changes in 60 seconds. Saving...
420:M 14 Sep 15:56:39.080 # Can't save in background: fork: Cannot allocate memory
420:M 14 Sep 15:56:45.083 * 10000 changes in 60 seconds. Saving...
420:M 14 Sep 15:56:45.083 # Can't save in background: fork: Cannot allocate memory
420:M 14 Sep 15:56:51.094 * 10000 changes in 60 seconds. Saving...
420:M 14 Sep 15:56:51.095 # Can't save in background: fork: Cannot allocate memory
420:M 14 Sep 15:56:57.002 * 10000 changes in 60 seconds. Saving...
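To get a quick sense of how often the background save is failing, a small helper like the one below can count the fork errors in a log file. This is only a sketch: count_fork_failures is a hypothetical name, and the log path in the usage comment is an assumption that depends on your logfile setting.

```shell
# Hypothetical helper: count "fork: Cannot allocate memory" failures in a Redis log.
count_fork_failures() {
    # grep -c prints the number of matching lines; "|| true" keeps the
    # function from returning a failure status when there are no matches.
    grep -c "Can't save in background: fork: Cannot allocate memory" "$1" || true
}

# Usage (the path is an assumption, adjust it to your setup):
#   count_fork_failures /var/log/redis/redis-server.log
```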
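It's pretty clear that the fork attempt ran short of memory and the Linux kernel refused to let it through.
There are two points here that I think are worth noting. One is that, in its default configuration, Redis enables the option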
stop-writes-on-bgsave-error yes
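In other words, if bgsave fails to store a snapshot, Redis will block further writes. If you set this option to no, Redis will not immediately stop serving writes even when the bgsave snapshot fails to reach disk.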
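To check from the shell whether the last snapshot really failed, something like the following might help. It is only a sketch: bgsave_status is a hypothetical helper name, and the usage lines assume redis-cli can reach the instance with its default connection settings. The helper reads the rdb_last_bgsave_status field from the output of INFO persistence.

```shell
# Hypothetical helper: extract rdb_last_bgsave_status from the text of
# `redis-cli INFO persistence` supplied on stdin. Prints "ok" or "err".
bgsave_status() {
    # INFO output may use CRLF line endings, so drop carriage returns first.
    tr -d '\r' | awk -F: '$1 == "rdb_last_bgsave_status" { print $2 }'
}

# Usage (assumes a reachable instance; not executed here):
#   redis-cli INFO persistence | bgsave_status
#   redis-cli CONFIG GET stop-writes-on-bgsave-error
#   redis-cli CONFIG SET stop-writes-on-bgsave-error no   # keeps writes open, but hides the failure
```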
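But being unable to bgsave and get the data onto disk is always a hidden risk: if the machine reboots, you're done for. So I went looking for hotfix-style ways to repair the problem.
In the end I found a Linux-side parameter, vm.overcommit_memory, that can solve it. Its default value is 0, and it accepts three values.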
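At this point memory is insufficient. What is the operating system supposed to do now? This is where our protagonist, the overcommit_memory parameter (/proc/sys/vm/overcommit_memory), comes in.

vm.overcommit_memory = 0 (heuristic policy)
The kernel compares the size of the virtual memory being requested with the currently free physical memory plus swap, and decides whether to allow it. When allocating virtual address space for a process, the system checks whether the requested size exceeds the remaining memory; if it does, the allocation fails. So when the process itself occupies a large virtual address space, or little memory is left, calls such as fork and malloc may fail.

vm.overcommit_memory = 1 (allow overcommit)
The request is allowed outright; the system places no limit at all when allocating virtual address space for a process. This avoids possible fork failures. However, malloc allocates virtual address space first, and real physical memory is only allocated later through a fault that traps into the kernel. When memory is short, this completely hides the system's memory state from the application: malloc always succeeds, and once memory really runs out the system's OOM killer starts killing processes, a consequence the application cannot predict.

vm.overcommit_memory = 2 (overcommit disabled)
An upper bound on virtual address space is fixed from the system's memory state. Because a process's virtual address space is often far larger than the physical memory it actually uses, once memory usage climbs, dynamically created processes (which need to copy the parent's address space) easily fail to be created. If the workload doesn't do much dynamic memory allocation or child process creation, the impact is small; otherwise it can be significant. In this mode the memory the system can allocate will not exceed CommitLimit (introduced earlier in the quoted article). Once that much has been used up, any later attempt to request memory returns an error, which usually means no new program can be started.

----------------
Copyright notice: this is an original article by CSDN blogger 「朱清震」, released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/zqz_zqz/article/details/53384854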
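The three modes can be inspected on a live machine with a couple of small helpers. This is a rough sketch rather than anything authoritative: overcommit_mode_name and commit_headroom_kb are hypothetical names, and the headroom figure is just CommitLimit minus Committed_AS as reported in /proc/meminfo, a limit the kernel only enforces in mode 2.

```shell
# Hypothetical helpers for inspecting the kernel's overcommit policy.

# Map the numeric vm.overcommit_memory value to a short label.
overcommit_mode_name() {
    case "$1" in
        0) echo "heuristic" ;;
        1) echo "always" ;;
        2) echo "never" ;;
        *) echo "unknown" ;;
    esac
}

# Print CommitLimit minus Committed_AS in kB, read from a meminfo-style file
# (defaults to /proc/meminfo). Only mode 2 actually enforces this limit.
commit_headroom_kb() {
    awk '/^CommitLimit:/  { limit = $2 }
         /^Committed_AS:/ { used = $2 }
         END { print limit - used }' "${1:-/proc/meminfo}"
}

# Usage on a Linux host:
#   overcommit_mode_name "$(cat /proc/sys/vm/overcommit_memory)"
#   commit_headroom_kb
```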
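So here, for bgsave, Redis tries to fork its main process, and the kernel refuses the request because there isn't enough memory. As a hotfix, I set vm.overcommit_memory to 1 so the fork is simply allowed through.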
In /etc/sysctl.conf:
vm.overcommit_memory=1
Then reload:
sysctl -p
Once that took effect, I checked the log again and the save succeeded.
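The hotfix could also be scripted so that running it twice doesn't leave duplicate lines in the config file. A minimal sketch, assuming a plain key=value file: set_sysctl_key is a hypothetical helper name, and the usage lines need root on a real host.

```shell
# Hypothetical helper: set KEY=VALUE in a sysctl-style file, replacing an
# existing uncommented entry for KEY or appending one if none exists.
set_sysctl_key() {
    ssk_file=$1
    ssk_key=$2
    ssk_value=$3
    ssk_tmp=$(mktemp) || return 1
    awk -F= -v k="$ssk_key" -v v="$ssk_value" '
        { name = $1; gsub(/[ \t]/, "", name) }            # key with whitespace stripped
        name == k { print k "=" v; found = 1; next }      # rewrite the existing entry
        { print }                                         # keep every other line as-is
        END { if (!found) print k "=" v }                 # append when the key was absent
    ' "$ssk_file" > "$ssk_tmp" && cat "$ssk_tmp" > "$ssk_file"
    ssk_status=$?
    rm -f "$ssk_tmp"
    return $ssk_status
}

# Usage (requires root; not executed here):
#   set_sysctl_key /etc/sysctl.conf vm.overcommit_memory 1
#   sysctl -p                                  # reload settings from /etc/sysctl.conf
#   cat /proc/sys/vm/overcommit_memory         # confirm the running value
```

Writing the result back with cat rather than mv keeps the original file's ownership and permissions.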
I also found that the official FAQ describes a similar problem:
Background saving fails with a fork() error under Linux even if I have a lot of free RAM!
Short answer:
echo 1 > /proc/sys/vm/overcommit_memory
:)
And now the long one:
Redis background saving schema relies on the copy-on-write semantic of fork in modern operating systems: Redis forks (creates a child process) that is an exact copy of the parent. The child process dumps the DB on disk and finally exits. In theory the child should use as much memory as the parent being a copy, but actually thanks to the copy-on-write semantic implemented by most modern operating systems the parent and child process will share the common memory pages. A page will be duplicated only when it changes in the child or in the parent. Since in theory all the pages may change while the child process is saving, Linux can't tell in advance how much memory the child will take, so if the overcommit_memory setting is set to zero fork will fail unless there is as much free RAM as required to really duplicate all the parent memory pages, with the result that if you have a Redis dataset of 3 GB and just 2 GB of free memory it will fail.
Setting overcommit_memory to 1 tells Linux to relax and perform the fork in a more optimistic allocation fashion, and this is indeed what you want for Redis.
A good source to understand how Linux Virtual Memory works and other alternatives for overcommit_memory and overcommit_ratio is this classic from Red Hat Magazine, "Understanding Virtual Memory". Beware, this article had 1 and 2 configuration values for overcommit_memory reversed: refer to the proc(5) man page for the right meaning of the available values.
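After the hotfix, we cleaned up some large keys that had gone unreleased for a long time, which brought memory back down to a fairly low level, and things have been stable since. This time I didn't mindlessly restart when the problem hit, but worked through it quickly with some method to it, and I feel my problem-solving approach has matured just a little.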
Reference:
https://zhuanlan.zhihu.com/p/36872365 How fork works and how it is implemented
https://stackoverflow.com/questions/11752544/redis-bgsave-failed-because-fork-cannot-allocate-memory redis bgsave failed because fork Cannot allocate memory
https://www.freebsd.org/doc/zh_CN/books/handbook/configtuning-sysctl.html 12.11. Tuning with sysctl
https://blog.csdn.net/zqz_zqz/article/details/53384854 redis Can't save in background: fork: Cannot allocate memory: solution and underlying principle
https://redis.io/topics/faq Official Redis FAQ