Understanding Redis RDB and AOF Persistence and How They Work

Date: 2019-12-13


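Preface

Because Redis keeps all of its data in memory, everything is lost when the process exits. To keep data safe, Redis supports two persistence mechanisms, RDB and AOF, that guard against data loss. RDB can be viewed as a snapshot of Redis at a point in time and is well suited to disaster recovery. AOF is a log of write operations. This article covers RDB, AOF, and using the two together.

1. Exploring RDB

RDB works like a camera pointed at Redis's in-memory data: it produces a snapshot and saves it to disk. RDB persistence can be triggered manually or automatically. Redis loads an RDB file quickly on restart, but RDB cannot persist in real time, so it is generally used for cold backups and for replication transfers.

Manual triggering

The save command: it saves synchronously on the Redis main thread, blocking the server and making the service unavailable until the RDB process completes. Whatever the size of the dataset, do not use it in production.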

127.0.0.1:6379> save
OK
(1.14s)
59117:M 13 Apr 13:34:51.948 * DB saved on disk

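The bgsave command: it creates a child process with fork() and saves in the background. Only the fork phase blocks the Redis server, not the whole RDB process, and that phase is usually short. For this reason, Redis uses the bgsave approach internally wherever RDB is involved. One point to note: for both RDB and AOF, because of copy-on-write, the forked child does not need to copy the parent's physical memory, but it does copy the parent's memory page tables.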

127.0.0.1:6379> bgsave
Background saving started
59117:M 13 Apr 13:44:40.312 * Background saving started by pid 59180
59180:C 13 Apr 13:44:40.314 * DB saved on disk
59117:M 13 Apr 13:44:40.317 * Background saving terminated with success
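To make the fork-and-snapshot idea concrete, here is a minimal Python sketch. It is not Redis code: the helper name background_snapshot, the JSON output, and the file name dump.json are assumptions chosen for illustration (Redis writes its own binary RDB format). It relies on os.fork, so it only runs on Unix-like systems.

```python
import json
import os
import tempfile


def background_snapshot(data, path):
    """Fork a child that writes `data`, as it looked at fork time, to `path`.

    Hypothetical helper for illustration; returns the child's pid.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: it sees a copy-on-write view of the parent's memory,
        # so later writes in the parent do not show up here.
        status = 1
        try:
            tmp_path = f"{path}.{os.getpid()}.tmp"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
            os.replace(tmp_path, path)  # atomically publish the finished file
            status = 0
        finally:
            os._exit(status)  # exit the child without running parent cleanup
    return pid


def demo():
    store = {"miao": "24"}
    path = os.path.join(tempfile.mkdtemp(), "dump.json")
    pid = background_snapshot(store, path)
    store["miao"] = "177"  # the parent keeps accepting writes after the fork
    _, wait_status = os.waitpid(pid, 0)
    with open(path) as f:
        saved = json.load(f)
    return saved, store, os.WEXITSTATUS(wait_status)
```

In demo(), the file holds the value from the moment of the fork, while the parent's dictionary already holds the newer value. That is the same trade-off bgsave makes: the snapshot is consistent but not real-time.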

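Automatic triggering

We normally do not generate RDB files by running commands by hand. Redis can trigger RDB persistence automatically, and the settings all live in redis.conf. Here is the default RDB-related configuration from that file; its English comments explain each option clearly.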

################################ SNAPSHOTTING  ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving completely by commenting out all "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

save 900 1
save 300 10
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes

# The filename where to dump the DB
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir /usr/local/var/db/redis/

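save m n: automatically triggers an RDB save once at least m seconds have passed since the last save and the data has been modified at least n times. This is a key parameter.
stop-writes-on-bgsave-error: if yes, Redis stops accepting writes when a bgsave fails.
rdbcompression: whether to compress the RDB file; LZF compression costs extra CPU.
rdbchecksum: whether to checksum the RDB file.
dbfilename: the name of the RDB file, dump.rdb by default.
dir: the directory where the RDB file is stored. This parameter is important.

How it works

First, look at the saveparams field in server.h. seconds is the number of seconds and changes is the number of changes, which correspond to the m and n of the save m n setting described above.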

struct redisServer {
    ....
    struct saveparam *saveparams;   /* Save points array for RDB */
    ...
};

struct saveparam {
    time_t seconds;
    int changes;
};

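Next, look at server.c (redis.c in older versions), which has a periodic function called serverCron. It runs periodically and does roughly the things listed in its comment, including triggering BGSAVE and AOF rewrite.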

/* This is our timer interrupt, called server.hz times per second.
 * Here is where we do a number of things that need to be done asynchronously.
 * For instance:
 *
 * - Active expired keys collection (it is also performed in a lazy way on
 *   lookup).
 * - Software watchdog.
 * - Update some statistic.
 * - Incremental rehashing of the DBs hash tables.
 * - Triggering BGSAVE / AOF rewrite, and handling of terminated children.
 * - Clients timeout of different kinds.
 * - Replication reconnection.
 * - Many more...
 *
 * Everything directly called here will be called server.hz times per second,
 * so in order to throttle execution of things we want to do less frequently
 * a macro is used: run_with_period(milliseconds) { .... }
 */

int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {

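Inside this function is the following fragment, shown here on its own. It walks the save points and, when both the changes threshold and the seconds threshold of one of them are met, starts a background save.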

/* If there is not a background saving/rewrite in progress check if
         * we have to save/rewrite now */
         for (j = 0; j < server.saveparamslen; j++) {
            struct saveparam *sp = server.saveparams+j;

            /* Save if we reached the given amount of changes,
             * the given amount of seconds, and if the latest bgsave was
             * successful or if, in case of an error, at least
             * CONFIG_BGSAVE_RETRY_DELAY seconds already elapsed. */
            if (server.dirty >= sp->changes &&
                server.unixtime-server.lastsave > sp->seconds &&
                (server.unixtime-server.lastbgsave_try >
                 CONFIG_BGSAVE_RETRY_DELAY ||
                 server.lastbgsave_status == C_OK))
            {
                serverLog(LL_NOTICE,"%d changes in %d seconds. Saving...",
                    sp->changes, (int)sp->seconds);
                rdbSaveBackground(server.rdb_filename); break;
            }
         }
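The condition in that loop can be restated as a small Python sketch. The names SavePoint and first_matching_save_point are made up for illustration, and the RETRY_DELAY value of 5 seconds is an assumed stand-in for CONFIG_BGSAVE_RETRY_DELAY; only the shape of the check is taken from the C fragment above.

```python
from dataclasses import dataclass

RETRY_DELAY = 5  # assumed stand-in for CONFIG_BGSAVE_RETRY_DELAY, in seconds


@dataclass(frozen=True)
class SavePoint:
    seconds: int  # the m in "save m n"
    changes: int  # the n in "save m n"


# The three default save points from the configuration shown earlier.
DEFAULT_SAVE_POINTS = [SavePoint(900, 1), SavePoint(300, 10), SavePoint(60, 10000)]


def first_matching_save_point(save_points, dirty, now, lastsave,
                              last_try, last_ok, retry_delay=RETRY_DELAY):
    """Return the first save point whose conditions hold, or None.

    dirty    -- number of changes since the last successful save
    now      -- current time in seconds
    lastsave -- time of the last successful save
    last_try -- time of the last bgsave attempt
    last_ok  -- whether the last bgsave succeeded
    """
    for sp in save_points:
        enough_changes = dirty >= sp.changes
        enough_time = now - lastsave > sp.seconds
        # After a failed bgsave, wait retry_delay seconds before trying again.
        may_try = last_ok or (now - last_try > retry_delay)
        if enough_changes and enough_time and may_try:
            return sp  # mirrors the `break` after rdbSaveBackground
    return None
```

For example, with 10 changes and 301 seconds since the last save, the 300/10 save point matches; with exactly 300 seconds it does not, because the comparison is strictly greater-than.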

Continuing, here is part of rdbSaveBackground in rdb.c. We can see that the child process is named "redis-rdb-bgsave".

int rdbSaveBackground(char *filename) {
    pid_t childpid;
    long long start;

    if (server.aof_child_pid != -1 || server.rdb_child_pid != -1) return C_ERR;

    server.dirty_before_bgsave = server.dirty;
    server.lastbgsave_try = time(NULL);

    start = ustime();
    if ((childpid = fork()) == 0) {
        int retval;

        /* Child */
        closeListeningSockets(0);
        redisSetProcTitle("redis-rdb-bgsave");
        retval = rdbSave(filename);
        if (retval == C_OK) {
            size_t private_dirty = zmalloc_get_private_dirty();

            if (private_dirty) {
                serverLog(LL_NOTICE,
                    "RDB: %zu MB of memory used by copy-on-write",
                    private_dirty/(1024*1024));
            }
        }
        exitFromChild((retval == C_OK) ? 0 : 1);
    }

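Finally, the RDB workflow:

1. Redis executes the bgsave command and checks whether a child process, such as an RDB or AOF child, is already running. If one is, bgsave returns immediately.
2. The parent forks a child process. The Redis parent process is blocked during the fork.
3. Once the fork completes, bgsave returns and the log shows: 59117:M 13 Apr 13:44:40.312 * Background saving started by pid 59180
4. The child process generates a snapshot file from the in-memory data.
5. The child process tells the parent that it has finished.

Exploring the RDB file

We can use redis-rdb-tools to analyze an RDB snapshot file. It can convert the snapshot to JSON or, as here, produce a memory report in CSV form, which is easier to read.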

rdb -c memory dump.rdb > testMjx.csv

Here is what the generated file looks like:

database,type,key,size_in_bytes,encoding,num_elements,len_largest_element,expiry
0,string,mjx3,56,string,4,4,
0,string,mjx5,56,string,4,4,
0,string,mjx2,56,string,4,4,
0,string,mjx,48,string,8,8,
0,string,mjx4,56,string,4,4,

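The columns are database (the Redis db that holds the key), type (the key's type), key (the key name), size_in_bytes (memory used by the key), encoding (how the value is stored), num_elements (number of elements in the value), len_largest_element (length of the largest element in the value), and expiry (expiration time).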
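A report like this is easy to post-process. As a small sketch, the following Python reads the CSV shown above with the standard csv module and totals the memory per report; the function name summarize is an illustrative assumption, and the embedded text is simply the sample output from this article.

```python
import csv
import io

# The sample memory report from above, embedded so the sketch is self-contained.
REPORT = """database,type,key,size_in_bytes,encoding,num_elements,len_largest_element,expiry
0,string,mjx3,56,string,4,4,
0,string,mjx5,56,string,4,4,
0,string,mjx2,56,string,4,4,
0,string,mjx,48,string,8,8,
0,string,mjx4,56,string,4,4,
"""


def summarize(report_text):
    """Return (number of keys, total bytes, name of the first largest key)."""
    rows = list(csv.DictReader(io.StringIO(report_text)))
    total = sum(int(row["size_in_bytes"]) for row in rows)
    largest = max(rows, key=lambda row: int(row["size_in_bytes"]))
    return len(rows), total, largest["key"]
```

Calling summarize(REPORT) on the sample gives five keys and 272 bytes in total. On a real dump the same approach helps find the keys that dominate memory.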

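Pros and cons

Advantages of RDB persistence:

Well suited to full backups
Faster recovery than AOF

Disadvantages of RDB persistence:

RDB cannot persist in real time
The RDB format has compatibility problems across Redis versions

2. Exploring AOF

RDB cannot guarantee durability for recent writes: if the Redis process crashes, everything written since the last RDB snapshot is lost. AOF addresses this real-time gap. It records every write command in a separate log, and on restart Redis re-executes the commands in the AOF file to restore the data. AOF first appends commands to the AOF buffer, then writes them to disk according to the configured policy (appendfsync); the parameters are described later. Next, the AOF rewrite command.

Manual triggering

The bgrewriteaof command: the Redis main process forks a child to perform the AOF rewrite. The child creates a new AOF file to hold the rewritten result so that the old file is not affected. Because fork uses copy-on-write, the child cannot see data produced after it was created. Redis keeps that new data in the "AOF rewrite buffer". When the child finishes, the parent writes the contents of the AOF rewrite buffer into the new AOF file and then replaces the old file with the new one.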

127.0.0.1:6379> bgrewriteaof
Background append only file rewriting started

Automatic triggering

As with RDB, the configuration lives in redis.conf, and you can also set it with the CONFIG SET command. First, the AOF-related settings:

############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.
appendonly no

# The name of the append only file (default: "appendonly.aof")

appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.

no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes

# When rewriting the AOF file, Redis is able to use an RDB preamble in the
# AOF file for faster rewrites and recoveries. When this option is turned
# on the rewritten AOF file is composed of two different stanzas:
#
#   [RDB file][AOF tail]
#
# When loading Redis recognizes that the AOF file starts with the "REDIS"
# string and loads the prefixed RDB file, and continues loading the AOF
# tail.
#
# This is currently turned off by default in order to avoid the surprise
# of a format change, but will at some point be used as the default.
aof-use-rdb-preamble no

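appendonly: whether AOF persistence is enabled
appendfilename: the name of the AOF file
appendfsync: how often to sync to disk
auto-aof-rewrite-min-size: an AOF file smaller than this value does not trigger an automatic rewrite; 64 MB by default
auto-aof-rewrite-percentage: Redis remembers the size of the AOF file after the latest rewrite; when the current AOF file has grown beyond that size by this percentage, a rewrite is triggered; 100 by default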
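The interplay of the last two settings can be sketched as a single predicate. This is a Python illustration based on the configuration comments, not the Redis source: the function name should_auto_rewrite and the use of integer arithmetic are assumptions, and the exact comparison inside Redis may differ in detail.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size, base_size, percentage=100,
                        min_size=64 * MB):
    """Decide whether an automatic AOF rewrite would be triggered.

    current_size -- current AOF size in bytes
    base_size    -- AOF size after the latest rewrite (or at startup)
    percentage   -- auto-aof-rewrite-percentage; 0 disables the feature
    min_size     -- auto-aof-rewrite-min-size in bytes
    """
    if percentage == 0:
        return False  # a percentage of zero turns automatic rewrites off
    if current_size <= min_size:
        return False  # still too small to be worth rewriting
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage
```

With the defaults, a file that was 64 MB after the last rewrite triggers a new rewrite once it reaches 128 MB, while a file that grew from 1 MB to 10 MB does not, because it is still below the minimum size.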

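The possible values of appendfsync:

always: after a command is written to the AOF buffer, every write is synced, and the call returns only once the data has been written to disk (blocking, fsync system call). This clearly runs against Redis's high performance and is not recommended.
everysec: after a command is written to the AOF buffer, it is written to the system buffer and the call returns immediately (write system call); a dedicated thread then syncs to disk once per second (blocking, fsync system call).
no: after a command is written to the AOF buffer, it is written to the system buffer and the call returns immediately (write system call). Writing to disk is then left to the operating system, usually within 30 seconds at most.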
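The difference between the three policies comes down to when fsync is called after write. The toy writer below sketches that in Python with os.write and os.fsync. The class name AofWriter, the injectable clock, and the helper count_fsyncs are assumptions for illustration; unlike Redis, this sketch checks the one-second interval inline on each append instead of using a dedicated thread.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy append-only log writer with the three appendfsync policies."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.clock = clock
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.last_fsync = clock()
        self.fsync_calls = 0

    def append(self, payload):
        os.write(self.fd, payload)  # data reaches the OS buffer, not yet the disk
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec" and self.clock() - self.last_fsync >= 1.0:
            self._fsync()
        # policy "no": never fsync here; the OS flushes when it chooses

    def _fsync(self):
        os.fsync(self.fd)  # block until the data is on disk
        self.last_fsync = self.clock()
        self.fsync_calls += 1

    def close(self):
        os.close(self.fd)


def count_fsyncs(policy, appends=5, step=0.5):
    """Append records `step` fake seconds apart and return the fsync count."""
    now = [0.0]  # fake clock so the demo does not need to sleep
    path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
    writer = AofWriter(path, policy, clock=lambda: now[0])
    for _ in range(appends):
        now[0] += step
        writer.append(b"*1\r\n$4\r\nPING\r\n")
    writer.close()
    return writer.fsync_calls
```

Five appends spread over 2.5 fake seconds cost five fsync calls under always, two under everysec, and none under no, which is the latency versus durability trade-off described above.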

How it works

Here is a piece of aof.c. We can see that it forks a child process named "redis-aof-rewrite".

/* This is how rewriting of the append only file in background works:
 *
 * 1) The user calls BGREWRITEAOF
 * 2) Redis calls this function, that forks():
 *    2a) the child rewrite the append only file in a temp file.
 *    2b) the parent accumulates differences in server.aof_rewrite_buf.
 * 3) When the child finished '2a' exists.
 * 4) The parent will trap the exit code, if it's OK, will append the
 *    data accumulated into server.aof_rewrite_buf into the temp file, and
 *    finally will rename(2) the temp file in the actual file name.
 *    The the new file is reopened as the new append only file. Profit!
 */
int rewriteAppendOnlyFileBackground(void) {
    pid_t childpid;
    long long start;

    if (server.aof_child_pid != -1 || server.rdb_child_pid != -1) return C_ERR;
    if (aofCreatePipes() != C_OK) return C_ERR;
    start = ustime();
    if ((childpid = fork()) == 0) {
        char tmpfile[256];

        /* Child */
        closeListeningSockets(0);
        redisSetProcTitle("redis-aof-rewrite");
        snprintf(tmpfile,256,"temp-rewriteaof-bg-%d.aof", (int) getpid());
        if (rewriteAppendOnlyFile(tmpfile) == C_OK) {
            size_t private_dirty = zmalloc_get_private_dirty();

            if (private_dirty) {
                serverLog(LL_NOTICE,
                    "AOF rewrite: %zu MB of memory used by copy-on-write",
                    private_dirty/(1024*1024));
            }
            exitFromChild(0);
        } else {
            exitFromChild(1);
        }
    }
...
...

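Likewise, the AOF workflow:

1. Every write command is appended to the AOF buffer.
2. The AOF buffer is synced to disk according to the appendfsync setting.
3. The AOF file is rewritten periodically.
4. When Redis restarts, it can load the AOF file to restore the data.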
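The last step, restoring data by re-executing the logged commands, can be sketched in Python. The AOF stores commands in the Redis protocol format, each as an array of length-prefixed strings separated by CRLF. The function names parse_aof and replay are hypothetical, and the replay handles only a tiny subset of commands (SELECT, SET, LPUSH, RPUSH) against a plain dict.

```python
def parse_aof(data):
    """Parse protocol-encoded bytes into a list of commands (lists of str)."""
    commands = []
    pos = 0

    def read_line():
        nonlocal pos
        end = data.index(b"\r\n", pos)
        line = data[pos:end]
        pos = end + 2  # skip the CRLF
        return line

    while pos < len(data):
        header = read_line()
        if not header.startswith(b"*"):
            raise ValueError(f"expected array header, got {header!r}")
        args = []
        for _ in range(int(header[1:])):
            size_line = read_line()
            if not size_line.startswith(b"$"):
                raise ValueError(f"expected bulk length, got {size_line!r}")
            size = int(size_line[1:])
            args.append(data[pos:pos + size].decode())
            pos += size + 2  # skip the payload and its trailing CRLF
        commands.append(args)
    return commands


def replay(commands):
    """Rebuild a dict by applying the commands in order (toy subset only)."""
    db = {}
    for name, *args in commands:
        name = name.upper()
        if name == "SELECT":
            continue  # this sketch models a single database
        if name == "SET":
            db[args[0]] = args[1]
        elif name == "LPUSH":
            items = db.setdefault(args[0], [])
            for value in args[1:]:
                items.insert(0, value)  # LPUSH adds at the head
        elif name == "RPUSH":
            db.setdefault(args[0], []).extend(args[1:])
        else:
            raise ValueError(f"unsupported command: {name}")
    return db
```

Replaying the log from the start is why AOF recovery is slower than loading an RDB file: the cost grows with the number of logged commands, not with the size of the final dataset.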

Exploring the AOF file

First, turn on AOF:

127.0.0.1:6379> CONFIG SET appendonly yes
OK

59117:M 13 Apr 19:24:53.940 * Background append only file rewriting started by pid 59895
59117:M 13 Apr 19:24:53.964 * AOF rewrite child asks to stop sending diffs.
59895:C 13 Apr 19:24:53.965 * Parent agreed to stop sending diffs. Finalizing AOF...
59895:C 13 Apr 19:24:53.965 * Concatenating 0.00 MB of AOF diff received from parent.
59895:C 13 Apr 19:24:53.966 * SYNC append only file rewrite performed
59117:M 13 Apr 19:24:53.996 * Background AOF rewrite terminated with success
59117:M 13 Apr 19:24:53.996 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
59117:M 13 Apr 19:24:53.997 * Background AOF rewrite finished successfully

Then we write some data:

127.0.0.1:6379> CONFIG SET appendonly yes
OK
127.0.0.1:6379> set miao 24
OK
127.0.0.1:6379> set miao 177
OK
127.0.0.1:6379> lpush mlist 1
(integer) 1
127.0.0.1:6379> lpush mlist 2
(integer) 2
127.0.0.1:6379> lpush mlist 3
(integer) 3
127.0.0.1:6379> keys *
1) "miao"
2) "mlist"

Next, look at the AOF file:

*2
$6
SELECT
$1
0
*3
$3
SET
$4
miao
$3
177
*2
$6
SELECT
$1
0
*3
$5
lpush
$5
mlist
$1
1
*3
$5
lpush
$5
mlist
$1
2
*3
$5
lpush
$5
mlist
$1
3
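Each entry in the file above follows the same pattern: `*N` gives the number of arguments, and every argument is preceded by `$len`, its length in bytes. A short Python sketch of that encoding follows; the function name encode_command is an assumption for illustration, and in the real file each line ends with CRLF.

```python
def encode_command(*args):
    """Encode one command the way it appears in the AOF file.

    Each argument may be str, bytes, or anything str() can render.
    """
    parts = [b"*%d\r\n" % len(args)]  # number of arguments
    for arg in args:
        raw = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n" % len(raw))  # byte length of the argument
        parts.append(raw + b"\r\n")  # the argument itself
    return b"".join(parts)
```

For instance, encode_command("SET", "miao", 177) yields the same seven lines as the SET entry in the file: `*3`, `$3`, `SET`, `$4`, `miao`, `$3`, `177`.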

Now we run the AOF rewrite command by hand:

127.0.0.1:6379> bgrewriteaof
Background append only file rewriting started

59117:M 13 Apr 19:29:31.017 * 10 changes in 300 seconds. Saving...
59117:M 13 Apr 19:29:31.017 * Background saving started by pid 59905
59905:C 13 Apr 19:29:31.020 * DB saved on disk
59117:M 13 Apr 19:29:31.120 * Background saving terminated with success
59117:M 13 Apr 19:29:49.409 * Background append only file rewriting started by pid 59906
59117:M 13 Apr 19:29:49.433 * AOF rewrite child asks to stop sending diffs.
59906:C 13 Apr 19:29:49.433 * Parent agreed to stop sending diffs. Finalizing AOF...
59906:C 13 Apr 19:29:49.434 * Concatenating 0.00 MB of AOF diff received from parent.
59906:C 13 Apr 19:29:49.434 * SYNC append only file rewrite performed
59117:M 13 Apr 19:29:49.533 * Background AOF rewrite terminated with success
59117:M 13 Apr 19:29:49.533 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
59117:M 13 Apr 19:29:49.534 * Background AOF rewrite finished successfully

Then look at the file again:

*2
$6
SELECT
$1
0
*3
$3
SET
$4
miao
$3
177
*5
$5
RPUSH
$5
mlist
$1
3
$1
2
$1
1

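Why did the AOF file get smaller? Because the AOF file keeps growing, Redis introduced the rewrite mechanism to shrink it. The file is smaller because:

Multiple write commands can be merged into one. For example, the three lpush commands above were merged into a single RPUSH.
After the rewrite, the AOF file keeps only the write commands needed for the final data.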
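Both points follow from one idea: the rewrite is generated from the current data, not from the old log. A minimal Python sketch of that idea, using a dict as the dataset, is shown below. The names apply_command, rewrite_commands, and rewrite_history are hypothetical, and only string and list values are handled.

```python
def apply_command(db, command):
    """Apply one SET or LPUSH command to the dict `db` (toy subset only)."""
    name, key, *values = command
    if name.upper() == "SET":
        db[key] = values[0]
    elif name.upper() == "LPUSH":
        items = db.setdefault(key, [])
        for value in values:
            items.insert(0, value)  # LPUSH adds at the head of the list
    else:
        raise ValueError(f"unsupported command: {name}")


def rewrite_commands(db):
    """Return the shortest command list that rebuilds `db` from scratch."""
    commands = []
    for key, value in db.items():
        if isinstance(value, list):
            if value:
                commands.append(["RPUSH", key, *value])  # one command per list
        else:
            commands.append(["SET", key, str(value)])  # only the final value
    return commands


# The write history from the session above.
HISTORY = [
    ["SET", "miao", "24"],
    ["SET", "miao", "177"],
    ["LPUSH", "mlist", "1"],
    ["LPUSH", "mlist", "2"],
    ["LPUSH", "mlist", "3"],
]


def rewrite_history(history):
    """Replay `history` into a dict, then emit the compacted commands."""
    db = {}
    for command in history:
        apply_command(db, command)
    return rewrite_commands(db)
```

Running rewrite_history(HISTORY) turns the five logged commands into two, a SET with the final value 177 and a single RPUSH of 3, 2, 1, which matches the rewritten file shown above.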

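Pros and cons

Advantages of AOF persistence:

With the default everysec policy, at most 1 to 2 seconds of data is lost (up to 2 seconds because of AOF append blocking)

Disadvantages of AOF persistence:

AOF files are larger than RDB files
Append blocking can occur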

