Redis performance tuning notes (can not get Resource from jedis pool and jedis connect time out)

Date: 2019-11-10


These are my notes from a recent round of Redis performance tuning.

1. Redis is single-process and single-threaded

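Redis executes commands in a single process on a single thread. Unless you configure it otherwise, requests are queued FIFO inside Redis: every command you send waits in line and runs serially, first in, first out.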

Three things let it stay fast anyway:

1) Operations run in memory

2) The data structures are simple

3) Most operations are hash lookups

Basic Redis commands take microseconds, so even with one process and one thread Redis can sustain a very high QPS.
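The link between per-command cost and QPS is simple arithmetic, sketched below. The class name and the sample costs are made up for illustration; the only assumption is that commands run strictly one after another on one thread.

```java
/** Back-of-the-envelope ceiling for a server that runs commands one at a time. */
final class SerialThroughput {
    /** Commands per second when every command costs microsPerCommand on one thread. */
    static long maxOpsPerSecond(long microsPerCommand) {
        if (microsPerCommand <= 0) {
            throw new IllegalArgumentException("microsPerCommand must be positive");
        }
        return 1_000_000L / microsPerCommand;
    }

    public static void main(String[] args) {
        // Hypothetical per-command costs in microseconds, not measurements.
        long[] samples = {10, 100, 1_000, 10_000};
        for (long micros : samples) {
            System.out.println(micros + " us/command -> at most "
                    + maxOpsPerSecond(micros) + " ops/s");
        }
    }
}
```

A command that costs 10 microseconds leaves room for roughly 100000 commands per second, while one that costs 10 milliseconds caps the whole instance at about 100.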

2. can not get Resource from jedis pool and jedis connect time out

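If your Redis access misbehaves and throws either of the two exceptions above, you almost certainly need to tune how you use Redis or how it performs.

Here are a few things to try:

1) Raise the maximum number of Redis client connections

This setting is simple. The default is 10000, which is usually plenty.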

################################### LIMITS ####################################

# Set the max number of connected clients at the same time. By default
# this limit is set to 10000 clients, however if the Redis server is not
# able to configure the process file limit to allow for the specified limit
# the max number of allowed clients is set to the current file limit
# minus 32 (as Redis reserves a few file descriptors for internal uses).
#
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
#
maxclients 10000

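When you set this number, though, check it against your Linux open-file limit.

As a rule, maxclients cannot exceed the number of file descriptors the Redis process is allowed to open, because every client connection uses one.

You can of course raise the limit for the Linux user that runs Redis by editing /etc/security/limits.conf. Here we raised the open-file limit (nofile) for the arch user to 100000: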

#<domain>      <type>  <item>         <value>
arch          soft    nofile          100000
arch          hard    nofile          100000
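The redis.conf comment quoted above says that when Redis cannot raise the process file limit, the client cap falls back to the file limit minus 32. A minimal sketch of that rule follows; the class and method names are hypothetical, and it assumes Redis was not able to raise the limit on its own.

```java
/** Estimates the client cap Redis ends up with under a given open-file limit. */
final class MaxClientsCheck {
    /** File descriptors Redis reserves for internal use, per the redis.conf comment. */
    static final long RESERVED_FDS = 32;

    /**
     * configuredMaxClients is the maxclients value from redis.conf;
     * fileLimit is the nofile limit of the user running Redis.
     */
    static long effectiveMaxClients(long configuredMaxClients, long fileLimit) {
        long allowedByFileLimit = Math.max(0, fileLimit - RESERVED_FDS);
        return Math.min(configuredMaxClients, allowedByFileLimit);
    }

    public static void main(String[] args) {
        // 1024 is a common default nofile value; 100000 is the value set above.
        System.out.println(effectiveMaxClients(10_000, 1_024));
        System.out.println(effectiveMaxClients(10_000, 100_000));
    }
}
```

With a file limit of 1024 the configured 10000 cannot be honoured; with the limit raised to 100000 the full maxclients value applies.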

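2) Release connections safely!

Always release connections safely. Make sure the error paths are caught and still release the connection. Here is a release pattern that has worked for me in practice, using Jedis set as the example: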

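import redis.clients.jedis.Jedis;
import redis.clients.jedis.exceptions.JedisConnectionException;

// jedisPool is a JedisPool field of the enclosing class.
public void set(String key, String value) {
    Jedis jedis = null;
    boolean broken = false;
    try {
        jedis = jedisPool.getResource();
        jedis.set(key, value);
    } catch (JedisConnectionException ex) {
        // The connection can no longer be trusted: hand it back as broken.
        // jedis is still null if getResource() itself failed.
        broken = true;
        if (jedis != null) {
            jedisPool.returnBrokenResource(jedis);
        }
        throw ex;
    } finally {
        // Normal path: return the healthy connection to the pool.
        if (jedis != null && !broken) {
            jedisPool.returnResource(jedis);
        }
    }
}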
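To see why a missed release ends in a pool error, here is a toy model using only the JDK. ToyPool and Lease are made-up stand-ins, not Jedis classes; they reproduce just the borrow/return bookkeeping. The sketch assumes a fixed-size pool whose borrow call gives up after a maximum wait.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for a connection pool; it only tracks how many slots are free. */
final class ToyPool {
    private final Semaphore slots;
    private final long maxWaitMillis;

    ToyPool(int size, long maxWaitMillis) {
        this.slots = new Semaphore(size);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Borrows a slot, or fails once maxWaitMillis has passed with no slot free. */
    Lease borrow() {
        try {
            if (!slots.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("could not get a resource from the pool");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pool", e);
        }
        return new Lease();
    }

    int idle() {
        return slots.availablePermits();
    }

    /** Closing a lease gives the slot back; closing twice is harmless. */
    final class Lease implements AutoCloseable {
        private boolean closed;

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                slots.release();
            }
        }
    }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool(2, 50);
        for (int i = 0; i < 5; i++) {
            try (Lease lease = pool.borrow()) {
                throw new RuntimeException("simulated command failure");
            } catch (RuntimeException e) {
                // try-with-resources has already closed the lease at this point
            }
        }
        System.out.println("idle after 5 failed commands: " + pool.idle());

        pool.borrow(); // leaked: never closed
        pool.borrow(); // leaked: never closed
        try {
            pool.borrow();
        } catch (IllegalStateException e) {
            System.out.println("after two leaks: " + e.getMessage());
        }
    }
}
```

As far as I know, recent Jedis releases make Jedis itself closeable, with close() returning the connection to its pool, so the same try-with-resources shape can replace the explicit return calls there. Check this against the Jedis version you run.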

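3) Jedis timeout settings

There are usually two: maxWaitMillis and timeout. The first is how long to wait for a connection; the second bounds how long a command may take.

a) maxWaitMillis: how long the client waits to obtain a connection from the pool. When the wait reaches this value, it throws can not get Resource from jedis pool.

b) timeout: the client's read timeout. When a reply takes longer than this value, it throws a read timeout exception.

That said, if these two exceptions are erupting at scale, changing these two values alone generally does not help!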
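The read timeout can be reproduced without Redis at all. The sketch below uses only JDK sockets: a local server socket that never replies stands in for a Redis instance that is busy with other work. The class and method names are hypothetical, and setSoTimeout plays the role that timeout plays in the text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Stand-in for a busy server: the connection succeeds, but no reply is ever sent. */
final class ReadTimeoutDemo {
    /** Returns true when a read from the silent server gives up after timeoutMillis. */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // read timeout on the client socket
            try {
                client.getInputStream().read(); // blocks: nobody writes a reply
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

Raising the timeout here only makes the client wait longer for a reply that is not coming, which is the point of the warning above: the fix belongs on the side that is slow to answer.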

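4) In high-concurrency scenarios, watch out for slow queries

I have run into two kinds of slow query:

a) keys *

b) Reading or writing very large values, around 100 MB

If either operation runs under high concurrency, the problem surfaces. Because Redis executes commands in a single process on a single thread, a slow query necessarily blocks the connections and commands of every client queued behind it. When many slow queries arrive concurrently, timeouts and failed connections are no surprise.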
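A small simulation makes the blocking concrete. It assumes all commands are already queued and run strictly in order. The 66494 microsecond figure is the KEYS duration from the slow log sample in these notes; the 50 microsecond commands are hypothetical.

```java
/** Models a single-threaded server draining a queue of commands in FIFO order. */
final class SerialQueue {
    /** Time at which each command finishes, given that all of them arrive at t=0. */
    static long[] completionMicros(long[] executionMicros) {
        long[] done = new long[executionMicros.length];
        long clock = 0;
        for (int i = 0; i < executionMicros.length; i++) {
            clock += executionMicros[i]; // each command waits for everything ahead of it
            done[i] = clock;
        }
        return done;
    }

    public static void main(String[] args) {
        // One fast command, one slow KEYS call, then two more fast commands.
        long[] queue = {50, 66_494, 50, 50};
        long[] done = completionMicros(queue);
        for (int i = 0; i < queue.length; i++) {
            System.out.println("command " + i + " costs " + queue[i]
                    + " us, finishes at " + done[i] + " us");
        }
    }
}
```

The fast commands behind the slow one cost 50 microseconds each but finish more than 66 milliseconds after they were queued. A client with a short read timeout sees them as timeouts even though they are not slow themselves.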

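Cases like these usually require changes to the business code:

a) Split keys into smaller ones to reduce the execution time of each command;

b) Scale the Redis cluster horizontally and use bucketing to spread this kind of operation evenly across multiple instances.

Best of all, of course, is to change the business logic so that it does not use keys and does not cache large amounts of data in Redis.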
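Both remedies can be sketched without a Redis client. The helper names, the key:index naming scheme, and the hash-based bucket choice below are illustrative assumptions, not something prescribed by Redis or Jedis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Helpers for splitting one large value into small ones and spreading keys over instances. */
final class KeySplitting {
    /** Cuts value into pieces of at most chunkSize bytes, in order. */
    static List<byte[]> chunks(byte[] value, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> parts = new ArrayList<>();
        for (int from = 0; from < value.length; from += chunkSize) {
            int to = Math.min(value.length, from + chunkSize);
            parts.add(Arrays.copyOfRange(value, from, to));
        }
        return parts;
    }

    /** Key under which chunk number index would be stored (hypothetical naming scheme). */
    static String chunkKey(String key, int index) {
        return key + ":" + index;
    }

    /** Picks one of bucketCount instances for a key; floorMod keeps the result non-negative. */
    static int bucketFor(String key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        byte[] big = new byte[10 * 1024 * 1024];         // stand-in for a large cached value
        List<byte[]> parts = chunks(big, 1024 * 1024);   // 1 MB per chunk
        for (int i = 0; i < parts.size(); i++) {
            String key = chunkKey("report", i);
            System.out.println(key + " -> instance " + bucketFor(key, 4));
        }
    }
}
```

Each chunk is then written and read with its own short command. Because the chunk keys hash differently, they also tend to land on different instances.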

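3. Slow query analysis

If you operate your own Redis, checking its slow log regularly will turn up useful findings.

The slow log records Redis slow queries. It is configured as follows.

There are just two settings:

a) slowlog-log-slower-than, in microseconds: a command that runs longer than this counts as a slow query (negative: the slow log is disabled; 0: every command is logged; positive: commands that exceed the value are logged)
b) slowlog-max-len, a count of entries: how many of the most recent slow log entries Redis keeps in memory. Setting it to about 1000 is generally enough for analysis.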

################################## SLOW LOG ###################################

# The Redis Slow Log is a system to log queries that exceeded a specified
# execution time. The execution time does not include the I/O operations
# like talking with the client, sending the reply and so forth,
# but just the time needed to actually execute the command (this is the only
# stage of command execution where the thread is blocked and can not serve
# other requests in the meantime).
#
# You can configure the slow log with two parameters: one tells Redis
# what is the execution time, in microseconds, to exceed in order for the
# command to get logged, and the other parameter is the length of the
# slow log. When a new command is logged the oldest one is removed from the
# queue of logged commands.

# The following time is expressed in microseconds, so 1000000 is equivalent
# to one second. Note that a negative number disables the slow log, while
# a value of zero forces the logging of every command.
slowlog-log-slower-than 10000

# There is no limit to this length. Just be aware that it will consume memory.
# You can reclaim memory used by the slow log with SLOWLOG RESET.
slowlog-max-len 128

Query the slow log through redis-cli:

slowlog get [n]

n: number of entries to return, optional

10.93.84.53:6379> slowlog get 10
 1) 1) (integer) 105142
    2) (integer) 1503742342
    3) (integer) 66494
    4) 1) "KEYS"
       2) "prometheus:report:fuse:*"
 2) 1) (integer) 105141
    2) (integer) 1503742336
    3) (integer) 67145
    4) 1) "KEYS"
       2) "prometheus:report:fuse:*"

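Each entry has four values, top to bottom: log id, Unix timestamp of the event, execution time in microseconds, and the command with its arguments.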
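The raw integers are easier to read once converted. The small holder class below is hypothetical; it decodes the first three fields of the first entry in the sample output using only the JDK.

```java
import java.time.Instant;

/** Holds the numeric fields of one slow log entry in readable form. */
final class SlowlogEntry {
    final long id;
    final Instant loggedAt;
    final long executionMicros;

    SlowlogEntry(long id, long epochSeconds, long executionMicros) {
        this.id = id;
        this.loggedAt = Instant.ofEpochSecond(epochSeconds); // the timestamp is in seconds
        this.executionMicros = executionMicros;
    }

    /** Execution time converted from microseconds to milliseconds. */
    double executionMillis() {
        return executionMicros / 1000.0;
    }

    public static void main(String[] args) {
        // Values copied from the first entry of the sample output.
        SlowlogEntry entry = new SlowlogEntry(105142, 1503742342L, 66494);
        System.out.println("id=" + entry.id
                + " at=" + entry.loggedAt
                + " took=" + entry.executionMillis() + " ms");
    }
}
```

The KEYS call in the sample ran for about 66 milliseconds, well above the 10000 microsecond threshold in the configuration.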

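The slow log is an effective way to find likely Redis bottlenecks, but keep the following points in mind when using it:

slowlog-max-len: enlarge the slow log list in production. Redis truncates long commands when it records them, so the list does not use much memory. A longer list makes it less likely that entries are evicted; in production you can set it to 1000 or more.
slowlog-log-slower-than: by default a command that takes more than 10 milliseconds counts as a slow query. Adjust the value to your Redis concurrency. Redis responds to commands on a single thread, so in a high-traffic scenario, if commands take more than 1 millisecond each, Redis can sustain fewer than 1000 OPS. For high-OPS workloads, 1 millisecond is therefore the recommended setting.
The slow log records only command execution time, not queueing or network transfer time, so the time a client observes is longer than the actual execution time. Because commands queue for execution, a slow query causes cascading blocking of other commands. When clients report request timeouts, check whether a slow query occurred at that moment to determine whether cascading blocking is the cause.
The slow log is a first-in, first-out queue, so when slow queries are frequent some entries may be lost. To prevent that, run slowlog get periodically and persist the entries to other storage (for example MySQL or ElasticSearch), then query them with a visualization tool.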
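When slowlog get is polled on a schedule, consecutive polls overlap, so the persisting job needs a way to skip entries it has already stored. A minimal sketch follows, assuming ids only grow between polls, as they do in the sample output. The class is hypothetical, and it does not handle ids starting over after a Redis restart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Remembers the highest slow log id stored so far and filters out older entries. */
final class SlowlogCursor {
    private long lastSeenId = -1;

    /**
     * Takes the ids from one slowlog get call (newest first, as Redis returns them)
     * and returns only the ids not seen before, oldest first.
     */
    List<Long> unseen(List<Long> fetchedIds) {
        List<Long> fresh = new ArrayList<>();
        for (long id : fetchedIds) {
            if (id > lastSeenId) {
                fresh.add(id);
            }
        }
        Collections.sort(fresh);
        if (!fresh.isEmpty()) {
            lastSeenId = fresh.get(fresh.size() - 1);
        }
        return fresh;
    }

    public static void main(String[] args) {
        SlowlogCursor cursor = new SlowlogCursor();
        // First poll returns the two entries from the sample output.
        System.out.println(cursor.unseen(Arrays.asList(105142L, 105141L)));
        // A later poll overlaps with the first; 105143 is a made-up new id.
        System.out.println(cursor.unseen(Arrays.asList(105143L, 105142L, 105141L)));
    }
}
```

The second poll yields only the new entry, so each slow query is written to the external store once.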

4. Linux running low on memory

When you start Redis, you may see the following warning:

16890:M 21 Aug 12:43:18.354 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf 
and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

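With overcommit_memory=0 the kernel applies a heuristic check to each allocation, so the fork() behind a background save can be refused when memory is tight. Separately, when memory truly runs out, the Linux OOM killer picks processes with a high OOM score and kills them.

How do you change this value?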

# Edit this file
vim /etc/sysctl.conf
# Add the following setting
vm.overcommit_memory = 1
# Apply it without rebooting the machine
sysctl vm.overcommit_memory=1

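0: the kernel checks heuristically whether enough memory is available for the requesting process. If there is, the allocation is allowed; otherwise it fails and the error is returned to the process.

1: the kernel always allows the allocation, regardless of the current memory state.

2: the kernel never overcommits: total committed memory may not exceed swap plus a configurable percentage (vm.overcommit_ratio) of physical memory.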
