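After the last Redis MQ distributed refactoring was finished, the orchestrated containers ran stably for over a month. Yesterday a colleague on the ETL side suddenly told me that no parsed logs were being collected any more.
I hurried onto the server and found that the receiver container, which handles data ingestion, had died. I tried docker container start [containerid], and a few minutes later the container crashed again.
I checked the container log with docker logs [containerid]. The key line: CSRedis.RedisException: ERR max number of clients reached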
Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker[2] Executed action EqidManager.Controllers.EqidController.BatchPutEqidAndProfileIds (EqidReceiver) in 7.1767ms fail: Microsoft.AspNetCore.Server.Kestrel[13] Connection id "0HLPR3AP8ODKH", Request id "0HLPR3AP8ODKH:00000001": An unhandled exception was thrown by the application. CSRedis.RedisException: ERR max number of clients reached at CSRedis.CSRedisClient.GetAndExecute[T](RedisClientPool pool, Func`2 handler, Int32 jump, Int32 errtimes) at CSRedis.CSRedisClient.ExecuteScalar[T](String key, Func`3 hander) at CSRedis.CSRedisClient.LPush[T](String key, T[] value) at RedisHelper.LPush[T](String key, T[] value) at EqidManager.Controllers.EqidController.BatchPutEqidAndProfileIds(List`1 eqidPairs) in /home/gitlab-runner/builds/haD2h5xC/0/webdissector/datasource/eqid-manager/src/EqidReceiver/Controllers/EqidController.cs:line 31 at lambda_method(Closure , Object ) at Microsoft.Extensions.Internal.ObjectMethodExecutorAwaitable.Awaiter.GetResult() at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.AwaitableResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments) at System.Threading.Tasks.ValueTask`1.get_Result() at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeActionMethodAsync() at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeNextActionFilterAsync() at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow(ActionExecutedContext context) at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted) at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeInnerFilterAsync() at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeNextResourceFilter() at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Rethrow(ResourceExecutedContext context) at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Next(State& next, 
Scope& scope, Object& state, Boolean& isCompleted) at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeFilterPipelineAsync() at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeAsync() at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext) at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application) info: Microsoft.AspNetCore.Hosting.Internal.WebHost[2] Request finished in 8.9549ms 500 【dockerhost:6379/0】仍然不可用,下一次恢復檢查時間:09/17/2019 03:11:25,錯誤:(ERR max number of clients reached) 【dockerhost:6379/0】仍然不可用,下一次恢復檢查時間:09/17/2019 03:11:25,錯誤:(ERR max number of clients reached) 【dockerhost:6379/0】仍然不可用,下一次恢復檢查時間:09/17/2019 03:11:25,錯誤:(ERR max number of clients reached) 【dockerhost:6379/0】仍然不可用,下一次恢復檢查時間:09/17/2019 03:11:25,錯誤:(ERR max number of clients reached)【dockerhost:6379/0】仍然不可用,下一次恢復檢查時間:09/17/2019 03:11:25,錯誤:(ERR max number of clients reached) 【dockerhost:6379/0】仍然不可用,下一次恢復檢查時間:09/17/2019 03:11:25,錯誤:(ERR max number of clients reached) 【dockerhost:6379/0】仍然不可用,下一次恢復檢查時間:09/17/2019 03:11:25,錯誤:(ERR max number of clients reached)
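The log says the number of clients connected to the Redis server exceeded the limit. A quick mental check: one of the orchestrated containers uses CSRedisCore and instantiates 16 clients for the 16 Redis DBs, but surely that alone should not be enough to overwhelm a Redis server.
I went straight to the official redis.io site to gather relevant material.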
After the client is initialized, Redis checks if we are already at the limit of the number of clients that it is possible to handle simultaneously (this is configured using the maxclients configuration directive, see the next section of this document for further information).
In case it can't accept the current client because the maximum number of clients was already accepted, Redis tries to send an error to the client in order to make it aware of this condition, and closes the connection immediately. The error message will be able to reach the client even if the connection is closed immediately by Redis because the new socket output buffer is usually big enough to contain the error, so the kernel will handle the transmission of the error.
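The gist: the Redis server's maxclients directive sets the maximum number of client connections. Once that limit has been reached, Redis sends an error message to any new client and immediately closes that connection.
I logged in to the Redis server right away to check the default configuration and confirmed that the current maxclients is 10000 (the effective value is dynamic, determined by maxclients and the process's maximum number of file descriptors):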
# Set the max number of connected clients at the same time. By default
# this limit is set to 10000 clients, however if the Redis server is not
# able to configure the process file limit to allow for the specified limit
# the max number of allowed clients is set to the current file limit
# minus 32 (as Redis reserves a few file descriptors for internal uses).
#
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
# maxclients 10000
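To make the "dynamic value" concrete, here is a minimal sketch of the rule described in the redis.conf comment above. It is a simplified model, not Redis's actual startup code: the function name effective_maxclients and the example file limits are hypothetical, and only the "file limit minus 32" fallback comes from the quoted comment.

```python
RESERVED_FDS = 32  # descriptors Redis keeps for internal use, per the redis.conf comment


def effective_maxclients(configured: int, fd_limit: int) -> int:
    """Return the client limit a server could honour under a given file limit.

    Simplified model (an assumption, not Redis source): if the process may
    open enough files for `configured` clients plus the reserved descriptors,
    the configured value stands; otherwise the limit falls back to
    fd_limit - 32.
    """
    if fd_limit >= configured + RESERVED_FDS:
        return configured
    return max(fd_limit - RESERVED_FDS, 0)


if __name__ == "__main__":
    print(effective_maxclients(10000, 65536))  # 10000
    print(effective_maxclients(10000, 1024))   # 992
```

With a generous file limit the configured 10000 applies, which matches what was observed on this server.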
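The screenshot shows that a session logged in to the server through redis-cli was kicked off immediately.
It was fairly safe to conclude that the Redis client was being used incorrectly.
Reading further, I found that the redis-cli commands info clients and client list can be run on the Redis server to analyze client connections in detail.
info clients showed that there were indeed 10000 connections at the time. An excerpt of the client list output: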
id=2205 addr=172.16.1.3:36954 fd=1276 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2215 addr=172.16.1.3:45923 fd=1277 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2216 addr=172.16.1.3:44233 fd=1278 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2217 addr=172.16.1.3:41144 fd=1279 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2218 addr=172.16.1.3:44528 fd=1280 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2219 addr=172.16.1.3:41626 fd=1281 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2220 addr=172.16.1.3:39045 fd=1282 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2221 addr=172.16.1.3:42862 fd=1283 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2222 addr=172.16.1.3:41356 fd=1284 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=2223 addr=172.16.1.3:36076 fd=1285 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1798 addr=172.16.1.3:44865 fd=1070 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1799 addr=172.16.1.3:40315 fd=1072 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1800 addr=172.16.1.3:44051 fd=1073 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1801 addr=172.16.1.3:45183 fd=1074 name= 
age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1802 addr=172.16.1.3:42352 fd=1075 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1803 addr=172.16.1.3:44401 fd=1076 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1804 addr=172.16.1.3:41325 fd=1068 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1805 addr=172.16.1.3:42309 fd=1069 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1815 addr=172.16.1.3:33341 fd=1077 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1816 addr=172.16.1.3:42546 fd=1078 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1817 addr=172.16.1.3:43985 fd=1079 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=lpush id=1818 addr=172.16.1.3:44852 fd=1080 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1819 addr=172.16.1.3:40936 fd=1081 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1820 addr=172.16.1.3:36922 fd=1082 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1821 addr=172.16.1.3:40507 fd=1083 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1822 addr=172.16.1.3:37327 fd=1084 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1823 addr=172.16.1.3:44966 fd=1085 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 
multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1825 addr=172.16.1.3:38138 fd=1086 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1836 addr=172.16.1.3:45613 fd=1087 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1837 addr=172.16.1.3:39475 fd=1089 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1838 addr=172.16.1.3:43459 fd=1090 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1839 addr=172.16.1.3:37892 fd=1088 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1840 addr=172.16.1.3:40415 fd=1092 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1841 addr=172.16.1.3:37844 fd=1093 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1842 addr=172.16.1.3:34432 fd=1091 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1843 addr=172.16.1.3:38402 fd=1094 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1844 addr=172.16.1.3:41417 fd=1095 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1845 addr=172.16.1.3:44452 fd=1096 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1856 addr=172.16.1.3:37699 fd=1097 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1857 addr=172.16.1.3:43107 fd=1098 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 
events=r cmd=ping id=1858 addr=172.16.1.3:46324 fd=1099 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1859 addr=172.16.1.3:33636 fd=1100 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1860 addr=172.16.1.3:42645 fd=1101 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1861 addr=172.16.1.3:46533 fd=1102 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1862 addr=172.16.1.3:45811 fd=1103 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1863 addr=172.16.1.3:43083 fd=1104 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1864 addr=172.16.1.3:34539 fd=1105 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1865 addr=172.16.1.3:43872 fd=1106 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1874 addr=172.16.1.3:32960 fd=1107 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1875 addr=172.16.1.3:42920 fd=1109 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1876 addr=172.16.1.3:37355 fd=1110 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1877 addr=172.16.1.3:41505 fd=1111 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1878 addr=172.16.1.3:34633 fd=1112 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1879 addr=172.16.1.3:44362 
fd=1114 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1882 addr=172.16.1.3:41947 fd=1108 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1883 addr=172.16.1.3:46534 fd=1113 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1884 addr=172.16.1.3:36814 fd=1115 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1885 addr=172.16.1.3:42278 fd=1116 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1893 addr=172.16.1.3:43971 fd=1120 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1894 addr=172.16.1.3:42935 fd=1122 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1896 addr=172.16.1.3:42742 fd=1119 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1898 addr=172.16.1.3:34410 fd=1117 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1899 addr=172.16.1.3:34112 fd=1118 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1900 addr=172.16.1.3:39654 fd=1121 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1901 addr=172.16.1.3:41308 fd=1123 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1902 addr=172.16.1.3:44353 fd=1125 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1904 addr=172.16.1.3:36208 fd=1126 name= age=26 idle=26 flags=N db=0 sub=0 
psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1905 addr=172.16.1.3:32785 fd=1124 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1915 addr=172.16.1.3:32928 fd=1127 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=1916 addr=172.16.1.3:43645 fd=1128 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
The official explanation of the client list output fields:
addr: The client address, that is, the client IP and the remote port number it used to connect with the Redis server.
fd: The client socket file descriptor number.
name: The client name as set by CLIENT SETNAME.
age: The number of seconds the connection existed for.
idle: The number of seconds the connection is idle.
flags: The kind of client (N means normal client, check the full list of flags).
omem: The amount of memory used by the client for the output buffer.
cmd: The last executed command.
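Based on these explanations, the Redis server had received a large number of client connections from ip=172.16.1.3 (the failing container's IP address on the bridge network), and the last command issued on almost all of these connections was ping (a test command).
The Redis client used by the failing container is CSRedisCore, and the container does nothing more than write Msg into a Redis list data structure. A related GitHub issue on CSRedisCore gave me some hints.
I found that I had put the CSRedisClient instantiation code in the constructor of a .NET Core API Controller. A Redis client was therefore instantiated every time a request constructed the Controller, and eventually the number of Redis client connections reached the maximum allowed.
The three dependency injection lifetimes: singleton (a single instance in the system, created once and injected everywhere); transient (a new instance is created every time the service is requested); scoped (one instance per scope, typically one per HTTP request).
For how dotnet API controllers are activated with a transient lifetime, see the link.
One question remained: why didn't the Redis server release the idle client connections? If idle connections had been released, even my sloppy code would not have led to this.
From the official documentation: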
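The reading above can be checked mechanically instead of by eye. The sketch below groups CLIENT LIST entries by client IP and last command. It assumes one client per line, which is how redis-cli prints the list (the excerpt above is flattened by the page). The helper names are hypothetical, and the three sample lines are copied from the excerpt.

```python
from collections import Counter


def parse_client_line(line: str) -> dict:
    """Split one CLIENT LIST line ("k=v k=v ...") into a dict of strings."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition("=")
        fields[key] = value  # an empty value such as "name=" becomes ""
    return fields


def summarize(client_list: str) -> Counter:
    """Count connections per (client IP, last executed command)."""
    counts = Counter()
    for line in client_list.splitlines():
        if not line.strip():
            continue
        fields = parse_client_line(line)
        ip = fields["addr"].rsplit(":", 1)[0]  # drop the remote port
        counts[(ip, fields["cmd"])] += 1
    return counts


SAMPLE = """\
id=2205 addr=172.16.1.3:36954 fd=1276 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=2215 addr=172.16.1.3:45923 fd=1277 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping
id=1817 addr=172.16.1.3:43985 fd=1079 name= age=26 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=lpush
"""

if __name__ == "__main__":
    for (ip, cmd), n in summarize(SAMPLE).most_common():
        print(ip, cmd, n)
```

Run against the full output, a summary like this makes it obvious when a single address holds nearly every connection.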
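To see why per-request instantiation exhausts the server, here is a toy simulation. It is not CSRedisCore: FakeRedisServer and FakePooledClient are hypothetical stand-ins, and the sketch assumes every client instance opens a pool of 50 connections up front and never closes them.

```python
class FakeRedisServer:
    """Stand-in for a Redis server that only counts open connections."""

    def __init__(self, maxclients: int) -> None:
        self.maxclients = maxclients
        self.connected = 0

    def accept(self) -> None:
        if self.connected >= self.maxclients:
            raise ConnectionError("ERR max number of clients reached")
        self.connected += 1


class FakePooledClient:
    """Stand-in for a pooled client that opens `pool_size` connections at once."""

    def __init__(self, server: FakeRedisServer, pool_size: int) -> None:
        for _ in range(pool_size):
            server.accept()


def requests_until_failure(server, pool_size, per_request, max_requests):
    """Return the 1-based number of the first failing request, or None."""
    # Singleton style: build the client once, before any request arrives.
    shared = None if per_request else FakePooledClient(server, pool_size)
    for n in range(1, max_requests + 1):
        try:
            # Per-request style: a new client (and a new pool) on every request.
            client = FakePooledClient(server, pool_size) if per_request else shared
        except ConnectionError:
            return n
    return None


if __name__ == "__main__":
    leaky = FakeRedisServer(maxclients=10000)
    print(requests_until_failure(leaky, 50, True, 1000), leaky.connected)
    steady = FakeRedisServer(maxclients=10000)
    print(requests_until_failure(steady, 50, False, 1000), steady.connected)
```

Under these assumptions the per-request variant fails on request 201 with all 10000 slots taken, while the shared instance stays at 50 connections however many requests arrive.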
By default recent versions of Redis don't close the connection with the client if the client is idle for many seconds: the connection will remain open forever.
However if you don't like this behavior, you can configure a timeout, so that if the client is idle for more than the specified number of seconds, the client connection will be closed.
You can configure this limit via redis.conf or simply using CONFIG SET timeout <value>.
The gist: recent versions of the Redis server do not release idle client connections by default:
# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0
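Idle client connections can be released by changing this Redis setting.
Of course, our best practice is not to change the Redis idle timeout setting. The core of the problem is that I instantiated many clients, so I quickly moved the CSRedisCore instantiation code into startup.cs and registered the client as a singleton.
info clients now showed a stable 53 Redis connections.
client list showed: 172.16.1.3 (the failing container) had established 50 client connections, the other orchestrated container, webapp, had established 2 connections, and the redis-cli session logged in to the server had established 1 connection.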
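As a rough illustration of what a non-zero timeout would do, the sketch below picks out the connections whose idle time exceeds a candidate timeout. The function name and the sample values are hypothetical. It also assumes, as the Redis client-handling documentation states, that the timeout does not apply to Pub/Sub clients (flag P).

```python
def reaped_by_timeout(clients, timeout):
    """Return the clients an idle timeout of `timeout` seconds would close.

    `clients` is a list of dicts with "idle" (seconds, int) and "flags" (str),
    i.e. two of the fields CLIENT LIST reports for each connection.
    """
    if timeout <= 0:  # "timeout 0" disables idle reaping entirely
        return []
    return [
        c for c in clients
        if "P" not in c["flags"] and c["idle"] > timeout
    ]


# Hypothetical sample: two long-idle normal clients, one busy one, one subscriber.
CLIENTS = [
    {"id": 1, "idle": 3600, "flags": "N"},
    {"id": 2, "idle": 7200, "flags": "N"},
    {"id": 3, "idle": 4, "flags": "N"},
    {"id": 4, "idle": 9000, "flags": "P"},
]

if __name__ == "__main__":
    print([c["id"] for c in reaped_by_timeout(CLIENTS, 0)])    # []
    print([c["id"] for c in reaped_by_timeout(CLIENTS, 300)])  # [1, 2]
```

With the default timeout 0 nothing is ever reaped, which is why leaked connections only accumulate.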
127.0.0.1:6379> client list id=20409 addr=172.16.1.3:44834 fd=18 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20410 addr=172.16.1.3:39881 fd=20 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20411 addr=172.16.1.3:42756 fd=17 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20419 addr=172.16.1.3:46224 fd=21 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20423 addr=172.16.1.3:34748 fd=28 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20424 addr=172.16.1.3:37483 fd=22 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20425 addr=172.16.1.3:44064 fd=29 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20426 addr=172.16.1.3:43993 fd=25 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20427 addr=172.16.1.3:34092 fd=24 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20428 addr=172.16.1.3:35347 fd=27 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20429 addr=172.16.1.3:46126 fd=30 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20430 addr=172.16.1.3:42627 fd=23 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20431 
addr=172.16.1.3:35098 fd=26 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20442 addr=172.16.1.3:34471 fd=31 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20443 addr=172.16.1.3:35092 fd=32 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20444 addr=172.16.1.3:46168 fd=33 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20445 addr=172.16.1.3:42879 fd=34 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20446 addr=172.16.1.3:46627 fd=35 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20447 addr=172.16.1.3:44731 fd=36 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20448 addr=172.16.1.3:36705 fd=37 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20449 addr=172.16.1.3:38668 fd=38 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20450 addr=172.16.1.3:45484 fd=39 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20451 addr=172.16.1.3:40802 fd=40 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20459 addr=172.16.1.3:36973 fd=45 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20460 addr=172.16.1.3:37814 fd=41 name=receiver 
age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20462 addr=172.16.1.3:44642 fd=44 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20463 addr=172.16.1.3:35272 fd=42 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20465 addr=172.16.1.3:42843 fd=47 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20466 addr=172.16.1.3:46785 fd=48 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20468 addr=172.16.1.3:38481 fd=49 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20504 addr=127.0.0.1:40902 fd=60 name= age=1478 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=32742 obl=0 oll=0 omem=0 events=r cmd=client id=20469 addr=172.16.1.3:45822 fd=50 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20470 addr=172.16.1.3:37211 fd=43 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20471 addr=172.16.1.3:39386 fd=46 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20481 addr=172.16.1.3:37346 fd=51 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20482 addr=172.16.1.3:42387 fd=52 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20483 addr=172.16.1.3:41523 fd=53 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 
multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20484 addr=172.16.1.3:37088 fd=54 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20485 addr=172.16.1.3:41371 fd=55 name=receiver age=73384 idle=8 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=lpush id=20486 addr=172.16.1.3:34362 fd=56 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20487 addr=172.16.1.3:45409 fd=57 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20488 addr=172.16.1.3:36119 fd=58 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20489 addr=172.16.1.3:46631 fd=59 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20389 addr=172.16.1.3:42971 fd=8 name=receiver age=73387 idle=73387 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20392 addr=172.16.1.4:42699 fd=11 name=f176f125a4c5 age=73386 idle=4 flags=P db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20393 addr=172.16.1.4:40179 fd=12 name=f176f125a4c5 age=73386 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=rpop id=20400 addr=172.16.1.3:36255 fd=10 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20401 addr=172.16.1.3:36118 fd=9 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20402 addr=172.16.1.3:42346 fd=13 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 
omem=0 events=r cmd=ping id=20403 addr=172.16.1.3:40437 fd=14 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20404 addr=172.16.1.3:37910 fd=15 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20405 addr=172.16.1.3:35374 fd=16 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping id=20408 addr=172.16.1.3:34197 fd=19 name=receiver age=73384 idle=73384 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=pin
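So here is the next question: after the fix, why did the receiver container still hold a steady 50 Redis connections?
After further discussion with the original author of CSRedisCore, I confirmed that CSRedisCore has a preheat mechanism and by default preheats 50 connections in the connection pool.
Bingo. The failure and the remaining puzzles were all cleared up.
After this battle, there are two things worth understanding in depth when using the CSRedisCore client:
① StackExchange.Redis uses a multiplexed connection mechanism (so registering it as a singleton comes naturally), whereas the open-source CSRedisCore library uses a connection pool. In high-concurrency scenarios, registering it as a singleton is strongly recommended. Otherwise it may be mistakenly instantiated on every transient request in production, and the Redis server's client connections get used up within a few days.
② CSRedisCore creates a connection pool by default and preheats 50 connections. Developers should keep that in mind.
An extra note on method: try not to look for answers on a certain domestic search engine. Learn to ask questions, and try to find answers from official sources, stackoverflow, and the GitHub community. The pit you dug may well have been dug and filled in by someone else long ago.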
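------------------------------update: a few more words---------------------------------------------
Many readers said the problem was that I had not read the official CSRedisCore readme carefully (the readme recommends a singleton). It is true that I did not use a singleton:
③ Connection pools generally have a mechanism for releasing and reclaiming idle connections (CSRedisCore also uses a connection pool), so at the time I did not take the singleton advice seriously.
④ The key takeaway this time: by default Redis does not release idle client connections (yet it does enforce a maximum number of connections), which directly contributed to this container crash.
Well, I dug this pit myself.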