[root@iZ23nn1p4mjZ zookeeper-3.4.10]# bin/zkCli.sh -server localhost:2181,localhost:2182,localhost:2183 Connecting to localhost:2181,localhost:2182,localhost:2183 log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Welcome to ZooKeeper! JLine support is enabled [zk: localhost:2181,localhost:2182,localhost:2183(CONNECTING) 0] create -s /[sPacktPub] Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3 at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:706) at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:599) at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:371) at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:331) at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:290) [root@iZ23nn1p4mjZ zookeeper-3.4.10]# ls /
上述問題近發生在單機三節點模式時。html
ZooKeeper在啓動時會將zookeeper.out輸出到當前目錄,不只不友好,有時候可能會由於目錄權限問題引起一些沒必要要的麻煩。java
解決方案:設置ZOO_LOG_DIR環境變量,也能夠改在shell中寫,以下:shell
if [ $ZOO_LOG_DIR ] ; then export ZOO_LOG_DIR=/usr/local/zookeeper/logs fi if [ -z $ZOO_LOG_DIR ];then export ZOO_LOG_DIR=/usr/local/zookeeper/logs fi
conf/log4j.properties中定義:apache
zookeeper.log.dir=/usr/local/zookeeper/logs zookeeper.tracelog.dir=/usr/local/zookeeper/logs
有時候在網絡變化、抖動或者服務器負載高的狀況下會出現下列異常:服務器
14:18:06.812 [main] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (127.0.0.1:2181) and timeout (5000) / elapsed (59220) org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
重啓應用後仍然如此,重啓zk後便自動恢復。網絡
最近有個奇怪的現象{20200420更新},只要鏈接非本地,客戶端IDEA中老是第一次鏈接的時候超時,第二次就成功了,每一個客戶端均如此。以下:session
19:51:53.245 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=xx.xx.xx.xx:2181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@310d05c5 19:51:58.334 [main] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (xx.xx.xx.xx:2181) and timeout (5000) / elapsed (5104) org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:225) [curator-client-2.13.0.jar:?] at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:94) [curator-client-2.13.0.jar:?] at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:117) [curator-client-2.13.0.jar:?] at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:489) [curator-framework-2.13.0.jar:?] at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:740) [curator-framework-2.13.0.jar:?] at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:723) [curator-framework-2.13.0.jar:?] at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109) [curator-client-2.13.0.jar:?] at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:720) [curator-framework-2.13.0.jar:?] at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:484) [curator-framework-2.13.0.jar:?] at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:474) [curator-framework-2.13.0.jar:?] at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:454) [curator-framework-2.13.0.jar:?] at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44) [curator-framework-2.13.0.jar:?] at com.alibaba.dubbo.remoting.zookeeper.curator.CuratorZookeeperClient.createPersistent(CuratorZookeeperClient.java:96) [classes/:1.0.17]
鏈接超時時間過短,加大便可,有需求可有償解決。app
curator KeeperErrorCode = Unimplemented異常socket
Curator 4.0 supports ZooKeeper 3.4.x ensembles in a soft-compatibility mode. To use this mode you must exclude ZooKeeper when adding Curator to your dependency management tool.oop
Curator 4.0默認不支持zk 3.4,必須兼容性處理,以下:
<dependency> <groupId>org.apache.zookeeper</groupId> <artifactId>zookeeper</artifactId> <version>3.4.12</version> </dependency> <dependency> <groupId>org.apache.curator</groupId> <artifactId>curator-recipes</artifactId> <version>4.0.0</version> <!-- 4.0.0原生不兼容zk 3.4, 必須進行兼容性處理 --> <exclusions> <exclusion> <groupId>org.apache.zookeeper</groupId> <artifactId>zookeeper</artifactId> </exclusion> </exclusions> </dependency>
見http://curator.apache.org/zk-compatibility.html
其餘zk自身及zk與dubbo相關的異常可參見:http://www.javashuo.com/article/p-dbxfdyta-dn.html
2019-11-04 12:05:17,165 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x1001e6b86a30003, likely client has closed socket 2019-11-04 12:05:17,166 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /10.2.104.6:22804 which had sessionid 0x1001e6b86a30003 2019-11-04 12:05:17,240 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x1001e6b86a30000, likely client has closed socket 2019-11-04 12:05:17,240 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /10.2.104.6:22665 which had sessionid 0x1001e6b86a30000 2019-11-04 12:05:17,241 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x1001e6b86a30001, likely client has closed socket 2019-11-04 12:05:17,241 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /10.2.104.6:22672 which had sessionid 0x1001e6b86a30001 2019-11-04 12:05:17,241 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@376] - Unable to read additional data from client sessionid 0x1001e6b86a30002, likely client has closed socket 2019-11-04 12:05:17,242 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed socket connection for client /10.2.104.6:22673 which had sessionid 0x1001e6b86a30002 2019-11-04 12:05:47,455 [myid:] - INFO [SessionTracker:ZooKeeperServer@354] - Expiring session 0x1001e6b86a30002, timeout of 40000ms exceeded 2019-11-04 12:05:47,456 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor@487] - Processed session termination for sessionid: 0x1001e6b86a30002 2019-11-04 12:05:51,456 [myid:] - INFO [SessionTracker:ZooKeeperServer@354] - Expiring session 0x1001e6b86a30003, timeout of 40000ms exceeded 2019-11-04 12:05:51,456 [myid:] - INFO [SessionTracker:ZooKeeperServer@354] - Expiring session 0x1001e6b86a30000, timeout of 40000ms exceeded 2019-11-04 12:05:51,456 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor@487] - Processed session termination for sessionid: 0x1001e6b86a30003 2019-11-04 12:05:51,457 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor@487] - Processed session termination for sessionid: 0x1001e6b86a30000 2019-11-04 12:05:53,456 [myid:] - INFO [SessionTracker:ZooKeeperServer@354] - Expiring session 0x1001e6b86a30001, timeout of 40000ms exceeded 2019-11-04 12:05:53,457 [myid:] - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor@487] - Processed session termination for sessionid: 0x1001e6b86a30001
該異常看起來和https://www.cnblogs.com/zhyg/p/11089736.html所述很像。
zookeeper服務器日誌報鏈接數超出:
2020-04-16 17:16:11,135 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,241 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,371 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,381 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,463 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,527 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,553 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,580 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,609 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,686 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,796 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,818 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60 2020-04-16 17:16:11,853 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@211] - Too many connections from /10.20.25.225 - max is 60
該緣由大機率是客戶端有鏈接泄露,可經過netstat -ano | grep 2181分析,以下:
[root@hs-10-20-37-72 logs]# lsof -i:2181 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME java 2706 root 24u IPv6 18738 0t0 TCP *:eforward (LISTEN) java 2706 root 25u IPv6 3786862 0t0 TCP 10.20.25.225:eforward->10.20.25.225:63464 (ESTABLISHED) java 2706 root 26u IPv6 4748980 0t0 TCP 10.20.25.225:eforward->10.20.25.225:59063 (ESTABLISHED) java 2706 root 27u IPv6 6781917 0t0 TCP 10.20.25.225:eforward->10.20.25.225:51577 (ESTABLISHED) java 2706 root 28u IPv6 6781920 0t0 TCP 10.20.25.225:eforward->10.20.25.225:51578 (ESTABLISHED) java 2706 root 30u IPv6 6784242 0t0 TCP 10.20.25.225:eforward->10.20.25.225:51880 (ESTABLISHED) java 2706 root 31u IPv6 6784604 0t0 TCP 10.20.25.225:eforward->10.20.25.225:51989 (ESTABLISHED) java 2706 root 32u IPv6 6784610 0t0 TCP 10.20.25.225:eforward->10.20.25.225:51990 (ESTABLISHED) java 2706 root 33u IPv6 6788885 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52069 (ESTABLISHED) java 2706 root 34u IPv6 4412727 0t0 TCP 10.20.25.225:eforward->10.20.25.225:12441 (ESTABLISHED) java 2706 root 35u IPv6 6789241 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52127 (ESTABLISHED) java 2706 root 36u IPv6 6789263 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52130 (ESTABLISHED) java 2706 root 37u IPv6 6789329 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52140 (ESTABLISHED) java 2706 root 38u IPv6 6789422 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52149 (ESTABLISHED) java 2706 root 39u IPv6 6789572 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52166 (ESTABLISHED) java 2706 root 40u IPv6 6789693 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52183 (ESTABLISHED) java 2706 root 41u IPv6 6789807 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52200 (ESTABLISHED) java 2706 root 42u IPv6 6789981 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52220 (ESTABLISHED) java 2706 root 43u IPv6 6790066 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52239 (ESTABLISHED) java 2706 root 44u IPv6 6790173 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52250 (ESTABLISHED) java 2706 root 45u IPv6 6790238 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52259 (ESTABLISHED) java 2706 root 46u IPv6 6790287 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52272 (ESTABLISHED) java 2706 root 47u IPv6 6790356 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52285 (ESTABLISHED) java 2706 root 48u IPv6 6790447 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52297 (ESTABLISHED) java 2706 root 49u IPv6 6790485 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52307 (ESTABLISHED) java 2706 root 50u IPv6 6790597 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52337 (ESTABLISHED) java 2706 root 51u IPv6 6790638 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52343 (ESTABLISHED) java 2706 root 52u IPv6 6790677 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52356 (ESTABLISHED) java 2706 root 53u IPv6 6790726 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52368 (ESTABLISHED) java 2706 root 54u IPv6 6790842 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52387 (ESTABLISHED) java 2706 root 55u IPv6 6790871 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52395 (ESTABLISHED) java 2706 root 56u IPv6 6790883 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52402 (ESTABLISHED) java 2706 root 57u IPv6 6790992 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52415 (ESTABLISHED) java 2706 root 58u IPv6 6791084 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52433 (ESTABLISHED) java 2706 root 59u IPv6 6791118 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52449 (ESTABLISHED) java 2706 root 60u IPv6 6791187 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52460 (ESTABLISHED) java 2706 root 61u IPv6 6791270 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52479 (ESTABLISHED) java 2706 root 62u IPv6 6791334 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52488 (ESTABLISHED) java 2706 root 63u IPv6 6791427 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52501 (ESTABLISHED) java 2706 root 64u IPv6 6791470 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52518 (ESTABLISHED) java 2706 root 65u IPv6 6791542 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52531 (ESTABLISHED) java 2706 root 66u IPv6 6791631 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52545 (ESTABLISHED) java 2706 root 67u IPv6 6791669 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52555 (ESTABLISHED) java 2706 root 68u IPv6 6791719 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52566 (ESTABLISHED) java 2706 root 69u IPv6 6791797 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52581 (ESTABLISHED) java 2706 root 70u IPv6 6791847 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52592 (ESTABLISHED) java 2706 root 71u IPv6 6791883 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52602 (ESTABLISHED) java 2706 root 72u IPv6 6791931 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52615 (ESTABLISHED) java 2706 root 73u IPv6 6792017 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52627 (ESTABLISHED) java 2706 root 74u IPv6 6792056 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52641 (ESTABLISHED) java 2706 root 75u IPv6 6792090 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52648 (ESTABLISHED) java 2706 root 76u IPv6 6792199 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52661 (ESTABLISHED) java 2706 root 77u IPv6 6792283 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52681 (ESTABLISHED) java 2706 root 78u IPv6 6792299 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52688 (ESTABLISHED) java 2706 root 79u IPv6 6792345 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52698 (ESTABLISHED) java 2706 root 80u IPv6 6792413 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52709 (ESTABLISHED) java 2706 root 81u IPv6 6792495 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52727 (ESTABLISHED) java 2706 root 82u IPv6 6792563 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52736 (ESTABLISHED) java 2706 root 83u IPv6 6792625 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52745 (ESTABLISHED) java 2706 root 84u IPv6 6792674 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52758 (ESTABLISHED) java 2706 root 85u IPv6 6792755 0t0 TCP 10.20.25.225:eforward->10.20.25.225:52775 (ESTABLISHED) java 2706 root 86u IPv6 6290356 0t0 TCP 10.20.25.225:eforward->192.168.166.145:34608 (ESTABLISHED) java 2871 root 176u IPv6 4748979 0t0 TCP 10.20.25.225:59063->10.20.25.225:eforward (ESTABLISHED)
找到對應的進程和服務器。而後分析泄露,一般大機率是由於本身手寫zk相關鏈接和監控、而不是使用客戶端庫如curator所致。
21:12:03.568 [pool-taskSchedule-thread-54-SendThread(10.20.25.225:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 10.20.25.225/10.20.25.225:2181. Will not attempt to authenticate using SASL (unknown error) 21:12:03.568 [pool-taskSchedule-thread-54-SendThread(10.20.25.225:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 10.20.25.225/10.20.25.225:2181, initiating session 21:12:03.576 [pool-taskSchedule-thread-54-SendThread(10.20.25.225:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server 10.20.25.225/10.20.25.225:2181, unexpected error, closing socket connection and attempting reconnect java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_191] at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_191] at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_191] at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_191] at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_191] at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68) ~[zookeeper-3.4.8.jar!/:3.4.8--1] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366) ~[zookeeper-3.4.8.jar!/:3.4.8--1] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141) [zookeeper-3.4.8.jar!/:3.4.8--1] 21:12:03.729 [pool-taskSchedule-thread-52-SendThread(10.20.25.225:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 10.20.25.225/10.20.25.225:2181. Will not attempt to authenticate using SASL (unknown error) 21:12:03.730 [pool-taskSchedule-thread-52-SendThread(10.20.25.225:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 10.20.25.225/10.20.25.225:2181, initiating session 21:12:03.730 [pool-taskSchedule-thread-52-SendThread(10.20.25.225:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server 10.20.25.225/10.20.25.225:2181, unexpected error, closing socket connection and attempting reconnect java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_191] at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_191] at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_191] at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_191] at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_191] at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68) ~[zookeeper-3.4.8.jar!/:3.4.8--1] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366) ~[zookeeper-3.4.8.jar!/:3.4.8--1] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141) [zookeeper-3.4.8.jar!/:3.4.8--1]
Caused by: org.apache.zookeeper.KeeperException$OperationTimeoutException: KeeperErrorCode = OperationTimeout
at com.dangdang.ddframe.job.reg.zookeeper.ZookeeperRegistryCenter.init(ZookeeperRegistryCenter.java:102) ~[elastic-job-common-core-2.1.5.jar:?]
檢查網絡是否正常,ping/telnet看下,基本上就這一個緣由。
This ZooKeeper instance is not currently serving requests
說明zookeeper還活着,但不提供服務了。此時要查看zk的日誌,dataLogDir指定的目錄,通常裏面能夠看到爲何zk不提供服務了。以下:
2020-08-17 00:13:20,050 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 2 (n.leader), 0x103c00000494 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 2 (n.sid), 0x103c (n.peerEpoch) LEADING (my state) 2020-08-17 00:13:20,252 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 2 (n.leader), 0x103c00000494 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 2 (n.sid), 0x103c (n.peerEpoch) LEADING (my state) 2020-08-17 00:13:20,654 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 2 (n.leader), 0x103c00000494 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 2 (n.sid), 0x103c (n.peerEpoch) LEADING (my state) 2020-08-17 00:13:21,463 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 2 (n.leader), 0x103c00000494 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 2 (n.sid), 0x103c (n.peerEpoch) LEADING (my state) 2020-08-17 00:13:23,065 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 2 (n.leader), 0x103c00000494 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 2 (n.sid), 0x103c (n.peerEpoch) LEADING (my state) 2020-08-17 00:13:26,271 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 2 (n.leader), 0x103c00000494 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 2 (n.sid), 0x103c (n.peerEpoch) LEADING (my state) 2020-08-17 00:13:32,679 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 2 (n.leader), 0x103c00000494 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 2 (n.sid), 0x103c (n.peerEpoch) LEADING (my state) 2020-08-17 00:13:40,040 [myid:0] - WARN [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181:QuorumPeer@998] - Unexpected exception java.lang.InterruptedException: Timeout while waiting for epoch to be acked by quorum at org.apache.zookeeper.server.quorum.Leader.waitForEpochAck(Leader.java:936) at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:415) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:995) 2020-08-17 00:13:40,040 [myid:0] - INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181:Leader@520] - Shutting down 2020-08-17 00:13:40,040 [myid:0] - INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181:Leader@526] - Shutdown called java.lang.Exception: shutdown Leader! reason: Forcing shutdown at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:526) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1001) 2020-08-17 00:13:40,041 [myid:0] - INFO [LearnerCnxAcceptor-/10.20.30.17:2182:Leader$LearnerCnxAcceptor@340] - exception while shutting down acceptor: java.net.SocketException: Socket closed 2020-08-17 00:13:40,041 [myid:0] - INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181:QuorumPeer@910] - LOOKING 2020-08-17 00:13:40,042 [myid:0] - INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@813] - New election. My id = 0, proposed zxid=0x103c00000496 2020-08-17 00:13:40,042 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 0 (n.leader), 0x103c00000496 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 0 (n.sid), 0x103c (n.peerEpoch) LOOKING (my state) 2020-08-17 00:13:40,043 [myid:0] - WARN [WorkerSender[myid=0]:QuorumCnxManager@584] - Cannot open channel to 1 at election address/*緣由和解決方法可見http://www.javashuo.com/article/p-npkfofmn-hn.html*/ /10.20.30.13:2183 java.net.ConnectException: Connection refused (Connection refused) at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:589) at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:558) at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534) at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454) at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435) at java.lang.Thread.run(Thread.java:748) 2020-08-17 00:13:40,043 [myid:0] - INFO [WorkerSender[myid=0]:QuorumPeer$QuorumServer@185] - Resolved hostname: 10.20.30.13 to address: /10.20.30.13 2020-08-17 00:13:40,045 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@595] - Notification: 1 (message format version), 0 (n.leader), 0x103c00000496 (n.zxid), 0x17af2 (n.round), LOOKING (n.state), 2 (n.sid), 0x103c (n.peerEpoch) LOOKING (my state) 2020-08-17 00:13:40,245 [myid:0] - INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181:QuorumPeer@992] - LEADING 2020-08-17 00:13:40,263 [myid:0] - INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@174] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datad ir /home/component/zookeeper/log/version-2 snapdir /home/component/zookeeper/data/version-2 2020-08-17 00:13:40,264 [myid:0] - INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181:Leader@380] - LEADING - LEADER ELECTION TOOK - 222 2020-08-17 00:13:41,271 [myid:0] - INFO [LearnerHandler-/10.20.30.18:11030:LearnerHandler@346] - Follower sid: 2 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@1efeac 2020-08-17 00:13:41,307 [myid:0] - ERROR [LearnerHandler-/10.20.30.18:11030:LearnerHandler@648] - Unexpected exception causing shutdown while sock still open java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:85) at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99) at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:373) 2020-08-17 00:13:41,308 [myid:0] - WARN [LearnerHandler-/10.20.30.18:11030:LearnerHandler@661] - ******* GOODBYE /10.20.30.18:11030 ********
因此,此處根本緣由是集羣的其餘節點連不上,致使節點異常退出。可是socket沒有關閉,這樣是爲了在其餘節點恢復後,集羣總體自動恢復。