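All source code quoted in this article comes from Redis 2.8.2.
The code that implements the Redis AOF persistence mechanism lives in redis.c, redis.h, aof.c, bio.c, rio.c, and config.c.
Before reading this article, please read the earlier article in this series, "Analysis of the Redis AOF Persistence Mechanism: Configuration in Detail", which covers how the AOF parameters are parsed. Link:
http://blog.csdn.net/acceptedxukai/article/details/18135219
Following on from the previous article, this one describes how Redis implements AOF rewrite.
When reposting, please credit the source: http://blog.csdn.net/acceptedxukai/article/details/18181563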
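How AOF rewrite is triggered
If Redis merely appended every data-modifying client command to the AOF file, the file would grow without bound, because the commands are stored verbatim and never merged. The simplest remedy is to rewrite the AOF once the file meets certain conditions: the rewrite walks the data currently held in the in-memory database, writes it to a temporary AOF file, and replaces the original AOF file once writing is complete.
Redis triggers an AOF rewrite in three ways:
1. The server receives a BGREWRITEAOF command from a client. If neither an AOF rewrite nor an RDB save is running, the rewrite starts immediately. If an RDB save is running, the rewrite is scheduled and starts once the save finishes. If an AOF rewrite is already running, the command returns an error.
2. The user has set auto-aof-rewrite-percentage and auto-aof-rewrite-min-size in redis.conf. When the current AOF size server.aof_current_size exceeds auto-aof-rewrite-min-size (server.aof_rewrite_min_size) and the growth of the AOF file since the last rewrite is at least auto-aof-rewrite-percentage (server.aof_rewrite_perc), an AOF rewrite is triggered automatically.
3. The user turns AOF on with "config set appendonly yes"; the startAppendOnly function that is called then triggers a rewrite.
The three mechanisms are described in turn below.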
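Receiving the BGREWRITEAOF command
bgrewriteaofCommand, shown first, handles the command. When the rewrite request has to be deferred, it is picked up later in serverCron, shown in the second fragment.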
- void bgrewriteaofCommand(redisClient *c) {
- // An AOF rewrite is already running: reply with an error and return
- if (server.aof_child_pid != -1) {
- addReplyError(c,"Background append only file rewriting already in progress");
- } else if (server.rdb_child_pid != -1) {
- // No AOF rewrite is running but an RDB save is: mark the rewrite as scheduled
- // so that it starts after the RDB save finishes
- server.aof_rewrite_scheduled = 1;
- addReplyStatus(c,"Background append only file rewriting scheduled");
- } else if (rewriteAppendOnlyFileBackground() == REDIS_OK) {
- // Start the AOF rewrite right away
- addReplyStatus(c,"Background append only file rewriting started");
- } else {
- addReply(c,shared.err);
- }
- }
- /* Start a scheduled AOF rewrite if this was requested by the user while
- * a BGSAVE was in progress. */
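- // If the user issued BGREWRITEAOF, start the AOF rewrite in the background.
- // When BGREWRITEAOF arrives while an RDB file is being written, server.aof_rewrite_scheduled is set to 1,
- // and the AOF rewrite starts once the RDB file has been written.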
- if (server.rdb_child_pid == -1 && server.aof_child_pid == -1 &&
- server.aof_rewrite_scheduled)
- {
- rewriteAppendOnlyFileBackground();
- }
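The two fragments above amount to a small decision table. The sketch below restates it as plain C so the three outcomes are easy to see. It is an illustration only: the enum and both function names are hypothetical and do not exist in Redis, and the fork-failure path of bgrewriteaofCommand is left out.

```c
/* Hypothetical names for illustration; none of this is Redis code. */
typedef enum {
    REWRITE_REJECTED,   /* an AOF rewrite child is already running */
    REWRITE_SCHEDULED,  /* an RDB child is running: defer to serverCron */
    REWRITE_STARTED     /* nothing is running: fork right away */
} rewrite_decision;

/* Mirrors the if/else chain of bgrewriteaofCommand; -1 means "no child". */
rewrite_decision decide_bgrewriteaof(int aof_child_pid, int rdb_child_pid) {
    if (aof_child_pid != -1) return REWRITE_REJECTED;
    if (rdb_child_pid != -1) return REWRITE_SCHEDULED;
    return REWRITE_STARTED;
}

/* Mirrors the serverCron check: a deferred rewrite may start only when
 * both children are gone and the scheduled flag is set. */
int can_start_scheduled_rewrite(int aof_child_pid, int rdb_child_pid,
                                int scheduled) {
    return rdb_child_pid == -1 && aof_child_pid == -1 && scheduled;
}
```

For example, decide_bgrewriteaof(-1, 456) yields REWRITE_SCHEDULED, and can_start_scheduled_rewrite only returns non-zero once the RDB child has gone.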
Automatic AOF rewrite by the server
serverCron checks the conditions periodically:
- /* Trigger an AOF rewrite if needed */
- // Rewrite the AOF file when the conditions below are met
- if (server.rdb_child_pid == -1 &&
- server.aof_child_pid == -1 &&
- server.aof_rewrite_perc &&
- server.aof_current_size > server.aof_rewrite_min_size)
- {
- long long base = server.aof_rewrite_base_size ?
- server.aof_rewrite_base_size : 1;
- long long growth = (server.aof_current_size*100/base) - 100;
- if (growth >= server.aof_rewrite_perc) {
- redisLog(REDIS_NOTICE,"Starting automatic rewriting of AOF on %lld%% growth",growth);
- rewriteAppendOnlyFileBackground();
- }
- }
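The growth test is easier to follow with numbers. The sketch below isolates the arithmetic of the fragment above in a standalone function. The name should_auto_rewrite and its parameters are assumptions made for this example, and the checks on the child PIDs are deliberately left out.

```c
/* Returns 1 when the size-based conditions for an automatic rewrite hold,
 * 0 otherwise. Hypothetical helper, modelled on the serverCron fragment. */
int should_auto_rewrite(long long current_size, long long base_size,
                        long long min_size, int perc) {
    if (perc == 0) return 0;                   /* percentage 0 disables the feature */
    if (current_size <= min_size) return 0;    /* file still too small to bother */
    long long base = base_size ? base_size : 1;              /* avoid dividing by zero */
    long long growth = (current_size * 100 / base) - 100;    /* integer percent */
    return growth >= perc;
}
```

With a base size of 64, a minimum size of 64 and a percentage of 100 (all in the same unit), a current size of 128 gives a growth of 100 and triggers the rewrite, while a current size of 100 gives a growth of 56 and does not.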
config set appendonly yes
When a client sends this command, configSetCommand in config.c handles it and calls startAppendOnly, which performs an AOF rewrite.
- if (!strcasecmp(c->argv[2]->ptr,"appendonly")) {
- int enable = yesnotoi(o->ptr);
- if (enable == -1) goto badfmt;
- if (enable == 0 && server.aof_state != REDIS_AOF_OFF) { // appendonly no: turn AOF off
- stopAppendOnly();
- } else if (enable && server.aof_state == REDIS_AOF_OFF) { // appendonly yes: turn AOF on through a rewrite
- if (startAppendOnly() == REDIS_ERR) {
- addReplyError(c,
- "Unable to turn on AOF. Check server logs.");
- return;
- }
- }
- }
- int startAppendOnly(void) {
- server.aof_last_fsync = server.unixtime;
- server.aof_fd = open(server.aof_filename,O_WRONLY|O_APPEND|O_CREAT,0644);
- redisAssert(server.aof_state == REDIS_AOF_OFF);
- if (server.aof_fd == -1) {
- redisLog(REDIS_WARNING,"Redis needs to enable the AOF but can't open the append only file: %s",strerror(errno));
- return REDIS_ERR;
- }
- if (rewriteAppendOnlyFileBackground() == REDIS_ERR) {//rewrite
- close(server.aof_fd);
- redisLog(REDIS_WARNING,"Redis needs to enable the AOF but can't trigger a background AOF rewrite operation. Check the above logs for more info about the error.");
- return REDIS_ERR;
- }
- /* We correctly switched on AOF, now wait for the rerwite to be complete
- * in order to append data on disk. */
- server.aof_state = REDIS_AOF_WAIT_REWRITE;
- return REDIS_OK;
- }
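Implementation of the Redis AOF rewrite mechanism
As the analysis above shows, every trigger ends up calling rewriteAppendOnlyFileBackground, so that function is examined first. As the code below shows, Redis forks a child process to perform the rewrite. The child calls rewriteAppendOnlyFile, which leaves the data in a temporary file named temp-rewriteaof-bg-%d.aof. On success the child exits with status 0 (through exitFromChild(0)) to tell the parent that the rewrite has finished; serverCron collects the child's exit status with wait3 and then carries out the remaining AOF rewrite work, which is analyzed later. The parent's work consists mainly of clearing the server.aof_rewrite_scheduled flag, recording the child's PID (server.aof_child_pid = childpid), and recording the rewrite start time (server.aof_rewrite_time_start = time(NULL)).
Next comes rewriteAppendOnlyFile, listed after rewriteAppendOnlyFileBackground. Its main job is to iterate over the data in every database and write it to the temporary file temp-rewriteaof-%d.aof; the write functions are defined in rio.c and are fairly simple. It then flushes the data to disk and renames the file to the temporary name supplied by its caller. Read the code carefully: the file is not renamed to the real AOF filename at this point.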
- int rewriteAppendOnlyFileBackground(void) {
- pid_t childpid;
- long long start;
- // A background rewrite is already in progress
- if (server.aof_child_pid != -1) return REDIS_ERR;
- start = ustime();
- if ((childpid = fork()) == 0) {
- char tmpfile[256];
- /* Child */
- closeListeningSockets(0);//
- redisSetProcTitle("redis-aof-rewrite");
- snprintf(tmpfile,256,"temp-rewriteaof-bg-%d.aof", (int) getpid());
- if (rewriteAppendOnlyFile(tmpfile) == REDIS_OK) {
- size_t private_dirty = zmalloc_get_private_dirty();
- if (private_dirty) {
- redisLog(REDIS_NOTICE,
- "AOF rewrite: %zu MB of memory used by copy-on-write",
- private_dirty/(1024*1024));
- }
- exitFromChild(0);
- } else {
- exitFromChild(1);
- }
- } else {
- /* Parent */
- server.stat_fork_time = ustime()-start;
- if (childpid == -1) {
- redisLog(REDIS_WARNING,
- "Can't rewrite append only file in background: fork: %s",
- strerror(errno));
- return REDIS_ERR;
- }
- redisLog(REDIS_NOTICE,
- "Background append only file rewriting started by pid %d",childpid);
- server.aof_rewrite_scheduled = 0;
- server.aof_rewrite_time_start = time(NULL);
- server.aof_child_pid = childpid;
- updateDictResizePolicy();
- /* We set appendseldb to -1 in order to force the next call to the
- * feedAppendOnlyFile() to issue a SELECT command, so the differences
- * accumulated by the parent into server.aof_rewrite_buf will start
- * with a SELECT statement and it will be safe to merge. */
- server.aof_selected_db = -1;
- replicationScriptCacheFlush();
- return REDIS_OK;
- }
- return REDIS_OK; /* unreached */
- }
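If server.aof_rewrite_incremental_fsync is set, rioWrite calls fsync after each chunk of data has been written with fwrite (every REDIS_AOF_AUTOSYNC_BYTES, that is, 32 MB), so the file reaches the disk incrementally rather than in a single large flush at the end. At this point the AOF rewrite is half done. The previous article noted that when server.aof_state != REDIS_AOF_OFF, commands that modify data are appended to the AOF through feedAppendOnlyFile. The parent keeps serving clients while the child rewrites, so the data that changes during the rewrite has to be handled as well. Recall that feedAppendOnlyFile contains a fragment for exactly this purpose; it is shown after the rewriteAppendOnlyFile listing below.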
- int rewriteAppendOnlyFile(char *filename) {
- dictIterator *di = NULL;
- dictEntry *de;
- rio aof;
- FILE *fp;
- char tmpfile[256];
- int j;
- long long now = mstime();
- /* Note that we have to use a different temp name here compared to the
- * one used by rewriteAppendOnlyFileBackground() function. */
- snprintf(tmpfile,256,"temp-rewriteaof-%d.aof", (int) getpid());
- fp = fopen(tmpfile,"w");
- if (!fp) {
- redisLog(REDIS_WARNING, "Opening the temp file for AOF rewrite in rewriteAppendOnlyFile(): %s", strerror(errno));
- return REDIS_ERR;
- }
- rioInitWithFile(&aof,fp); // initialize the rio read/write functions (rio.c)
- // sets r->io.file.autosync = bytes; fsync after every 32 MB written
- if (server.aof_rewrite_incremental_fsync)
- rioSetAutoSync(&aof,REDIS_AOF_AUTOSYNC_BYTES);
- for (j = 0; j < server.dbnum; j++) { // iterate over every database
- char selectcmd[] = "*2\r\n$6\r\nSELECT\r\n";
- redisDb *db = server.db+j;
- dict *d = db->dict;
- if (dictSize(d) == 0) continue;
- di = dictGetSafeIterator(d);
- if (!di) {
- fclose(fp);
- return REDIS_ERR;
- }
- /* SELECT the new DB */
- if (rioWrite(&aof,selectcmd,sizeof(selectcmd)-1) == 0) goto werr;
- if (rioWriteBulkLongLong(&aof,j) == 0) goto werr;
- /* Iterate this DB writing every entry */
- while((de = dictNext(di)) != NULL) {
- sds keystr;
- robj key, *o;
- long long expiretime;
- keystr = dictGetKey(de);
- o = dictGetVal(de);
- initStaticStringObject(key,keystr);
- expiretime = getExpire(db,&key);
- /* If this key is already expired skip it */
- if (expiretime != -1 && expiretime < now) continue;
- /* Save the key and associated value */
- if (o->type == REDIS_STRING) {
- /* Emit a SET command */
- char cmd[]="*3\r\n$3\r\nSET\r\n";
- if (rioWrite(&aof,cmd,sizeof(cmd)-1) == 0) goto werr;
- /* Key and value */
- if (rioWriteBulkObject(&aof,&key) == 0) goto werr;
- if (rioWriteBulkObject(&aof,o) == 0) goto werr;
- } else if (o->type == REDIS_LIST) {
- if (rewriteListObject(&aof,&key,o) == 0) goto werr;
- } else if (o->type == REDIS_SET) {
- if (rewriteSetObject(&aof,&key,o) == 0) goto werr;
- } else if (o->type == REDIS_ZSET) {
- if (rewriteSortedSetObject(&aof,&key,o) == 0) goto werr;
- } else if (o->type == REDIS_HASH) {
- if (rewriteHashObject(&aof,&key,o) == 0) goto werr;
- } else {
- redisPanic("Unknown object type");
- }
- /* Save the expire time */
- if (expiretime != -1) {
- char cmd[]="*3\r\n$9\r\nPEXPIREAT\r\n";
- if (rioWrite(&aof,cmd,sizeof(cmd)-1) == 0) goto werr;
- if (rioWriteBulkObject(&aof,&key) == 0) goto werr;
- if (rioWriteBulkLongLong(&aof,expiretime) == 0) goto werr;
- }
- }
- dictReleaseIterator(di);
- }
- /* Make sure data will not remain on the OS's output buffers */
- fflush(fp);
- aof_fsync(fileno(fp)); // flush the temp file to disk
- fclose(fp);
- /* Use RENAME to make sure the DB file is changed atomically only
- * if the generate DB file is ok. */
- if (rename(tmpfile,filename) == -1) { // rename the file; note that the new name is still a temporary file name
- redisLog(REDIS_WARNING,"Error moving temp append only file on the final destination: %s", strerror(errno));
- unlink(tmpfile);
- return REDIS_ERR;
- }
- redisLog(REDIS_NOTICE,"SYNC append only file rewrite performed");
- return REDIS_OK;
- werr:
- fclose(fp);
- unlink(tmpfile);
- redisLog(REDIS_WARNING,"Write error writing append only file on disk: %s", strerror(errno));
- if (di) dictReleaseIterator(di);
- return REDIS_ERR;
- }
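The tail of rewriteAppendOnlyFile follows a common pattern: write to a scratch file, push the data to disk, and only then rename the file into place, so a reader never sees a half-written file under the target name. Below is a minimal sketch of that pattern using only standard POSIX calls. The function write_file_atomically is a hypothetical helper, not part of Redis, and it writes a single string instead of iterating over a database.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Writes `data` to `tmpname`, forces it to disk, then renames it to
 * `finalname`. Returns 0 on success, -1 on error. On any failure the
 * scratch file is removed and `finalname` is left untouched. */
int write_file_atomically(const char *tmpname, const char *finalname,
                          const char *data) {
    FILE *fp = fopen(tmpname, "w");
    if (!fp) return -1;

    size_t len = strlen(data);
    if (fwrite(data, 1, len, fp) != len) goto werr;
    if (fflush(fp) == EOF) goto werr;          /* stdio buffer -> kernel */
    if (fsync(fileno(fp)) == -1) goto werr;    /* kernel cache -> disk */
    if (fclose(fp) == EOF) {
        unlink(tmpname);
        return -1;
    }

    /* rename(2) replaces the target name in one step */
    if (rename(tmpname, finalname) == -1) {
        unlink(tmpname);
        return -1;
    }
    return 0;

werr:
    fclose(fp);
    unlink(tmpname);
    return -1;
}
```

Redis applies the pattern twice: the child renames temp-rewriteaof-%d.aof to temp-rewriteaof-bg-%d.aof, and the parent later renames that file to the configured AOF name.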
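If an AOF rewrite is in progress, the command string that modifies data is also stored in the server.aof_rewrite_buf_blocks list, as the fragment below shows, and is processed after the rewrite child finishes; the code that handles this data is in serverCron. Note that I am not familiar with wait3, so the comments on that code further below may not be entirely accurate.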
- if (server.aof_child_pid != -1)
- aofRewriteBufferAppend((unsigned char*)buf,sdslen(buf));
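For the data that changed during the AOF rewrite, the server uses backgroundRewriteDoneHandler to append the contents of the server.aof_rewrite_buf_blocks list to the new AOF file. The serverCron code below detects that the child has exited and calls the handler.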
- /* Check if a background saving or AOF rewrite in progress terminated. */
- // If an RDB bgsave or AOF rewrite child has been started, collect its exit status and carry out the follow-up work
- if (server.rdb_child_pid != -1 || server.aof_child_pid != -1) {
- int statloc;
- pid_t pid;
- if ((pid = wait3(&statloc,WNOHANG,NULL)) != 0) {
- int exitcode = WEXITSTATUS(statloc); // exit code of the child
- int bysignal = 0;
- if (WIFSIGNALED(statloc)) bysignal = WTERMSIG(statloc);
- if (pid == server.rdb_child_pid) {
- backgroundSaveDoneHandler(exitcode,bysignal);
- } else if (pid == server.aof_child_pid) {
- backgroundRewriteDoneHandler(exitcode,bysignal);
- } else {
- redisLog(REDIS_WARNING,
- "Warning, detected child with unmatched pid: %ld",
- (long)pid);
- }
- // Once neither BGSAVE nor BGREWRITEAOF is running, allow the dicts to rehash again
- updateDictResizePolicy();
- }
- }
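The exitcode and bysignal values passed to the handlers come from decoding the wait status. The sketch below shows that decoding in isolation. It is not the Redis code: it uses a blocking waitpid instead of wait3 with WNOHANG, it only reads the exit code when WIFEXITED is true, and the names child_result and run_and_reap are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The two values serverCron hands to the done handlers. */
typedef struct {
    int exitcode;  /* value passed to exit(); meaningful when bysignal == 0 */
    int bysignal;  /* number of the terminating signal, or 0 for a normal exit */
} child_result;

/* Forks a child that exits with `code`, or kills itself with `sig` when
 * sig != 0, then waits for it and decodes the status.
 * Returns {-1, 0} if fork or waitpid fails. */
child_result run_and_reap(int code, int sig) {
    child_result r = { -1, 0 };
    pid_t pid = fork();
    if (pid == -1) return r;
    if (pid == 0) {
        if (sig) raise(sig);   /* terminate abnormally */
        _exit(code);           /* terminate normally */
    }
    int statloc;
    if (waitpid(pid, &statloc, 0) == -1) return r;
    if (WIFEXITED(statloc)) r.exitcode = WEXITSTATUS(statloc);
    if (WIFSIGNALED(statloc)) r.bysignal = WTERMSIG(statloc);
    return r;
}
```

A child that calls _exit(0) yields exitcode 0 and bysignal 0, which is the only combination the rewrite handler treats as success; a child killed with SIGKILL yields bysignal equal to SIGKILL.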
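Steps performed by backgroundRewriteDoneHandler:
1. Check the child's exit status. A correct exit is exit(0), that is, exitcode is 0 and bysignal is 0; bysignal is non-zero only when the child was killed by a signal, in which case it holds the signal number. The handler goes on with the steps below only if the child exited correctly.
2. From the analysis of rewriteAppendOnlyFileBackground we know that the rewritten temporary AOF file is named temp-rewriteaof-bg-%d.aof (%d = server.aof_child_pid); open this temporary file.
3. Call aofRewriteBufferWrite to write the diff data held in server.aof_rewrite_buf_blocks to the temporary file.
4. If AOF is disabled (server.aof_fd == -1), open the old AOF file and store its descriptor in the local variable oldfd.
5. Rename the temporary AOF file to the regular AOF filename.
6. If AOF is disabled, only the new AOF file needs to be closed; server.aof_rewrite_buf_blocks should be empty in this case. If AOF is enabled, point server.aof_fd at newfd and flush the data to disk according to the configured fsync policy.
7. Call aofUpdateCurrentSize to measure the AOF file and update server.aof_rewrite_base_size, which the automatic rewrite check in serverCron relies on.
8. If the state was REDIS_AOF_WAIT_REWRITE, set server.aof_state to REDIS_AOF_ON. Only "config set appendonly yes" sets that state, meaning AOF must be switched on as soon as the rewrite has been written; BGREWRITEAOF does not need to switch AOF on.
9. Hand the old AOF file to a background thread to be closed.
The annotated code of backgroundRewriteDoneHandler follows: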
- /* A background append only file rewriting (BGREWRITEAOF) terminated its work.
- * Handle this. */
- void backgroundRewriteDoneHandler(int exitcode, int bysignal) {
- if (!bysignal && exitcode == 0) { // the child exited normally with status 0
- int newfd, oldfd;
- char tmpfile[256];
- long long now = ustime();
- redisLog(REDIS_NOTICE,
- "Background AOF rewrite terminated with success");
- /* Flush the differences accumulated by the parent to the
- * rewritten AOF. */
- snprintf(tmpfile,256,"temp-rewriteaof-bg-%d.aof",
- (int)server.aof_child_pid);
- newfd = open(tmpfile,O_WRONLY|O_APPEND);
- if (newfd == -1) {
- redisLog(REDIS_WARNING,
- "Unable to open the temporary AOF produced by the child: %s", strerror(errno));
- goto cleanup;
- }
- // Write out the diff data accumulated in server.aof_rewrite_buf_blocks
- if (aofRewriteBufferWrite(newfd) == -1) {
- redisLog(REDIS_WARNING,
- "Error trying to flush the parent diff to the rewritten AOF: %s", strerror(errno));
- close(newfd);
- goto cleanup;
- }
- redisLog(REDIS_NOTICE,
- "Parent diff successfully flushed to the rewritten AOF (%lu bytes)", aofRewriteBufferSize());
- /* The only remaining thing to do is to rename the temporary file to
- * the configured file and switch the file descriptor used to do AOF
- * writes. We don't want close(2) or rename(2) calls to block the
- * server on old file deletion.
- *
- * There are two possible scenarios:
- *
- * 1) AOF is DISABLED and this was a one time rewrite. The temporary
- * file will be renamed to the configured file. When this file already
- * exists, it will be unlinked, which may block the server.
- *
- * 2) AOF is ENABLED and the rewritten AOF will immediately start
- * receiving writes. After the temporary file is renamed to the
- * configured file, the original AOF file descriptor will be closed.
- * Since this will be the last reference to that file, closing it
- * causes the underlying file to be unlinked, which may block the
- * server.
- *
- * To mitigate the blocking effect of the unlink operation (either
- * caused by rename(2) in scenario 1, or by close(2) in scenario 2), we
- * use a background thread to take care of this. First, we
- * make scenario 1 identical to scenario 2 by opening the target file
- * when it exists. The unlink operation after the rename(2) will then
- * be executed upon calling close(2) for its descriptor. Everything to
- * guarantee atomicity for this switch has already happened by then, so
- * we don't care what the outcome or duration of that close operation
- * is, as long as the file descriptor is released again. */
- if (server.aof_fd == -1) {
- /* AOF disabled */
- /* Don't care if this fails: oldfd will be -1 and we handle that.
- * One notable case of -1 return is if the old file does
- * not exist. */
- oldfd = open(server.aof_filename,O_RDONLY|O_NONBLOCK);
- } else {
- /* AOF enabled */
- oldfd = -1; /* We'll set this to the current AOF filedes later. */
- }
- /* Rename the temporary file. This will not unlink the target file if
- * it exists, because we reference it with "oldfd". */
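- // Rename the temp file to the regular AOF filename. Because oldfd still refers to the file that used to have that name,
- // no unlink happens now; the kernel deletes the old file only when oldfd is closed and nothing references it any more.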
- if (rename(tmpfile,server.aof_filename) == -1) {
- redisLog(REDIS_WARNING,
- "Error trying to rename the temporary AOF file: %s", strerror(errno));
- close(newfd);
- if (oldfd != -1) close(oldfd);
- goto cleanup;
- }
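- // If AOF is disabled, only the new file needs handling, and it can simply be closed.
- // Could this close stall the server? newfd should be the last fd on the temp file, but the answer is no:
- // no data was written to the file in this function, because stopAppendOnly emptied the aof_rewrite_buf_blocks list.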
- if (server.aof_fd == -1) {
- /* AOF disabled, we don't need to set the AOF file descriptor
- * to this new file, so we can close it. */
- close(newfd);
- } else {
- /* AOF enabled, replace the old fd with the new one. */
- oldfd = server.aof_fd;
- // Point aof_fd at the new fd; thanks to the rename above it now refers to the file with the regular AOF name
- server.aof_fd = newfd;
- // fsync to disk
- if (server.aof_fsync == AOF_FSYNC_ALWAYS)
- aof_fsync(newfd);
- else if (server.aof_fsync == AOF_FSYNC_EVERYSEC)
- aof_background_fsync(newfd);
- server.aof_selected_db = -1; /* Make sure SELECT is re-issued */
- aofUpdateCurrentSize();
- server.aof_rewrite_base_size = server.aof_current_size;
- /* Clear regular AOF buffer since its contents was just written to
- * the new AOF from the background rewrite buffer. */
- // The rewritten file already holds the latest data, so the contents of aof_buf are meaningless and are simply discarded
- sdsfree(server.aof_buf);
- server.aof_buf = sdsempty();
- }
- server.aof_lastbgrewrite_status = REDIS_OK;
- redisLog(REDIS_NOTICE, "Background AOF rewrite finished successfully");
- /* Change state from WAIT_REWRITE to ON if needed */
- // Decide whether AOF needs to be switched on; bgrewriteaofCommand, for example, does not need this.
- if (server.aof_state == REDIS_AOF_WAIT_REWRITE)
- server.aof_state = REDIS_AOF_ON;
- /* Asynchronously close the overwritten AOF. */
- // Let a background thread close the old AOF fd; close alone is enough, the unlink follows automatically because of the rename above
- if (oldfd != -1) bioCreateBackgroundJob(REDIS_BIO_CLOSE_FILE,(void*)(long)oldfd,NULL,NULL);
- redisLog(REDIS_VERBOSE,
- "Background AOF rewrite signal handler took %lldus", ustime()-now);
- } else if (!bysignal && exitcode != 0) {
- server.aof_lastbgrewrite_status = REDIS_ERR;
- redisLog(REDIS_WARNING,
- "Background AOF rewrite terminated with error");
- } else {
- server.aof_lastbgrewrite_status = REDIS_ERR;
- redisLog(REDIS_WARNING,
- "Background AOF rewrite terminated by signal %d", bysignal);
- }
- cleanup:
- aofRewriteBufferReset();
- aofRemoveTempFile(server.aof_child_pid);
- server.aof_child_pid = -1;
- server.aof_rewrite_time_last = time(NULL)-server.aof_rewrite_time_start;
- server.aof_rewrite_time_start = -1;
- /* Schedule a new rewrite if we are waiting for it to switch the AOF ON. */
- if (server.aof_state == REDIS_AOF_WAIT_REWRITE)
- server.aof_rewrite_scheduled = 1;
- }
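The long comment in the handler relies on one property of rename(2): replacing a name does not destroy the old file while some descriptor still refers to it. The sketch below demonstrates that property with two small files. Both helper names are hypothetical and the file names in the usage note are arbitrary; only standard POSIX calls are used.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Creates `path` with the given contents. Returns 0 on success, -1 on error. */
static int put_file(const char *path, const char *contents) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;
    size_t len = strlen(contents);
    ssize_t n = write(fd, contents, len);
    close(fd);
    return n == (ssize_t)len ? 0 : -1;
}

/* Renames `newpath` over `oldpath` while holding a descriptor on the old
 * file, then reads through that descriptor into `out`. Returns the number
 * of bytes read from the OLD contents, or -1 on error. */
ssize_t read_old_after_rename(const char *oldpath, const char *newpath,
                              char *out, size_t outlen) {
    int oldfd = open(oldpath, O_RDONLY);
    if (oldfd == -1) return -1;
    if (rename(newpath, oldpath) == -1) {
        close(oldfd);
        return -1;
    }
    /* The name now refers to the new file, but oldfd still pins the old one. */
    ssize_t n = read(oldfd, out, outlen);
    close(oldfd);  /* last reference dropped: the old file is freed here */
    return n;
}
```

After put_file("old.aof", "old") and put_file("new.aof", "new!"), calling read_old_after_rename("old.aof", "new.aof", buf, 7) still reads the three bytes "old", while reopening old.aof by name returns "new!". The potentially slow part is the final close, which is why Redis hands oldfd to a background thread.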
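That completes the walk-through of AOF persistence. What remains is the handling of some details and an understanding of a few Linux library functions; for a deeper look at rename, unlink, wait3 and the like, search online.
Summary
The implementation of Redis AOF persistence has now been analyzed in reasonable detail over three articles, but only at the code level. Many analyses of the strengths and weaknesses of AOF are available online, and the official Redis website has an introduction in English. Redis also has a second persistence method, RDB, which will be covered another time. Thanks to this blog post for its great help in my understanding of Redis AOF persistence: http://chenzhenianqing.cn/articles/786.html. Its analysis of AOF is very detailed.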