Recently, while setting up a Hadoop HA QJM cluster by following the reference documentation
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
I ran into a problem.
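1. Observed symptoms
With HA configured as planned, the NameNode does not stay up after startup. Right after startup, jps shows the NameNode process, but a minute or two later it is gone.
Further testing, however, revealed the following two cases:
1) If the JournalNodes are started first and HDFS is started afterwards, the NameNode starts and runs normally.
2) When everything is started with start-dfs.sh, the NameNode process disappears after a while. Starting it again on its own with hadoop-daemon.sh start namenode succeeds, and the NameNode then runs stably. Killing or stopping the Active NameNode triggers an automatic switch between the active and standby nodes, so the high-availability setup itself is not affected.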
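Case 1) suggests that the NameNode only survives startup if the JournalNodes are already listening. As a rough illustration, the sketch below polls the JournalNode RPC port until a majority of the nodes accept TCP connections, which could be run before starting the NameNode by hand. The host names and port 8485 are taken from the log in this post; the function names port_open and wait_for_quorum and the timeout values are hypothetical choices for this example, and a successful TCP connect only shows that the port is open, not that the JournalNode is healthy.

```python
import socket
import time

# JournalNode RPC endpoints as they appear in the NameNode log (port 8485).
JOURNAL_NODES = [("master", 8485), ("slave01", 8485), ("slave02", 8485)]


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False


def wait_for_quorum(nodes, deadline_s=60.0, poll_s=1.0):
    """Poll until a majority of nodes accept connections, or the deadline passes.

    Returns True once a majority (e.g. 2 of 3) is reachable, False on timeout.
    """
    needed = len(nodes) // 2 + 1
    end = time.monotonic() + deadline_s
    while True:
        reachable = sum(port_open(host, port) for host, port in nodes)
        if reachable >= needed:
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(poll_s)
```

A possible use is to call wait_for_quorum(JOURNAL_NODES) after starting the JournalNodes and to run hadoop-daemon.sh start namenode only once it returns True.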
The NameNode log I collected is shown below.
2016-11-26 22:01:35,294 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG: /************************************************************ STARTUP_MSG: Starting NameNode STARTUP_MSG: host = master.com/192.168.31.136 STARTUP_MSG: args = [] STARTUP_MSG: version = 2.7.2 STARTUP_MSG: classpath = /home/jxlgzwh/hadoop-2.7.2/etc/hadoop:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-compress-1.4.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jersey-server-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jets3t-0.9.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jersey-core-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/hadoop-auth-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-digester-1.8.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/log4j-1.2.17.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/curator-client-2.7.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jetty-util-6.1.26.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/xmlenc-0.52.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/activation-1.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/curator-framework-2.7.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/netty-3.6.2.Final.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-collections-3.2.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jetty-6.1.26.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-configuration-1.6.jar:/
home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/asm-3.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-io-2.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-codec-1.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/curator-recipes-2.7.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/mockito-all-1.8.5.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-math3-3.1.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-net-3.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jsch-0.1.42.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/stax-api-1.0-2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jsp-api-2.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/httpclient-4.2.5.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/guava-11.0.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/zookeeper-3.4.6.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-lang-2.6.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/xz-1.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/hadoop-annotations-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jersey-json-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/httpcore-4.2.5.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/avro-1.7.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/serv
let-api-2.5.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/gson-2.2.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-cli-1.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/junit-4.11.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jettison-1.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/jsr305-3.0.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-logging-1.1.3.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/hamcrest-core-1.3.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-httpclient-3.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/lib/paranamer-2.3.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/hadoop-nfs-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2-tests.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar
:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/htrace-core-3.1.0-incubating.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/asm-3.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-io-2.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/guava-11.0.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-2.7.2-tests.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/hdfs/hadoop-hdfs-nfs-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-server-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-core-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/log4j-1.2.17.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-client-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/activation-1.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/home/jxlgzwh/hadoop-
2.7.2/share/hadoop/yarn/lib/commons-collections-3.2.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/aopalliance-1.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jetty-6.1.26.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/asm-3.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/commons-io-2.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/commons-codec-1.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/zookeeper-3.4.6-tests.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/javax.inject-1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/guice-3.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/guava-11.0.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/commons-lang-2.6.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/xz-1.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jersey-json-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/servlet-api-2.5.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/commons-cli-1.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jettison-1.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/jsr305-3.0.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-api-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-applicationhi
storyservice-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-registry-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-client-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-common-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-common-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/yarn/hadoop-yarn-server-tests-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/leveldbjni-all-1.8.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/snap
py-java-1.0.4.1.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/xz-1.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/hadoop-annotations-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/junit-4.11.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2-tests.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.2.jar:/home/jxlgzwh/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.2.jar:/contrib/capacity-scheduler/*.jar:/contrib/capacity-scheduler/*.jar:/contrib/capacity-scheduler/*.jar STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r b165c4fe8a74265c792ce23f546c64604acf0e41; compiled by 'jenkins' on 2016-01-26T00:08Z STARTUP_MSG: java = 1.8.0_111 ************************************************************/ 2016-11-26 22:01:35,352 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT] 2016-11-26 
22:01:35,375 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode [] 2016-11-26 22:01:36,496 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties 2016-11-26 22:01:36,767 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s). 2016-11-26 22:01:36,767 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started 2016-11-26 22:01:36,771 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: fs.defaultFS is hdfs://mycluster 2016-11-26 22:01:36,774 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Clients are to use mycluster to access this namenode/service. 2016-11-26 22:01:37,394 INFO org.apache.hadoop.hdfs.DFSUtil: Starting Web-server for hdfs at: http://master:50070 2016-11-26 22:01:37,608 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 2016-11-26 22:01:37,629 INFO org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets. 
2016-11-26 22:01:37,703 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.namenode is not defined 2016-11-26 22:01:37,716 INFO org.apache.hadoop.http.HttpServer2: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter) 2016-11-26 22:01:37,724 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs 2016-11-26 22:01:37,725 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs 2016-11-26 22:01:37,725 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static 2016-11-26 22:01:37,818 INFO org.apache.hadoop.http.HttpServer2: Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter) 2016-11-26 22:01:37,827 INFO org.apache.hadoop.http.HttpServer2: addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/* 2016-11-26 22:01:37,897 INFO org.apache.hadoop.http.HttpServer2: Jetty bound to port 50070 2016-11-26 22:01:37,897 INFO org.mortbay.log: jetty-6.1.26 2016-11-26 22:01:40,790 INFO org.mortbay.log: Started HttpServer2$SelectChannelConnectorWithSafeStartup@master:50070 2016-11-26 22:01:41,378 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories! 2016-11-26 22:01:41,591 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: No KeyProvider found. 
2016-11-26 22:01:41,592 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsLock is fair:true 2016-11-26 22:01:42,191 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000 2016-11-26 22:01:42,191 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true 2016-11-26 22:01:42,194 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000 2016-11-26 22:01:42,197 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: The block deletion will start around 2016 十一月 26 22:01:42 2016-11-26 22:01:42,201 INFO org.apache.hadoop.util.GSet: Computing capacity for map BlocksMap 2016-11-26 22:01:42,201 INFO org.apache.hadoop.util.GSet: VM type = 64-bit 2016-11-26 22:01:42,311 INFO org.apache.hadoop.util.GSet: 2.0% max memory 966.7 MB = 19.3 MB 2016-11-26 22:01:42,311 INFO org.apache.hadoop.util.GSet: capacity = 2^21 = 2097152 entries 2016-11-26 22:01:42,466 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false 2016-11-26 22:01:42,467 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication = 3 2016-11-26 22:01:42,467 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication = 512 2016-11-26 22:01:42,467 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication = 1 2016-11-26 22:01:42,467 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams = 2 2016-11-26 22:01:42,467 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000 2016-11-26 22:01:42,467 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer = false 2016-11-26 22:01:42,467 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxNumBlocksToLog = 1000 2016-11-26 22:01:42,477 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner = jxlgzwh (auth:SIMPLE) 2016-11-26 22:01:42,477 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup = supergroup 2016-11-26 22:01:42,477 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = false 2016-11-26 22:01:42,478 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Determined nameservice ID: mycluster 2016-11-26 22:01:42,478 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: true 2016-11-26 22:01:42,480 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true 2016-11-26 22:01:46,225 INFO org.apache.hadoop.util.GSet: Computing capacity for map INodeMap 2016-11-26 22:01:46,225 INFO org.apache.hadoop.util.GSet: VM type = 64-bit 2016-11-26 22:01:46,225 INFO org.apache.hadoop.util.GSet: 1.0% max memory 966.7 MB = 9.7 MB 2016-11-26 22:01:46,225 INFO org.apache.hadoop.util.GSet: capacity = 2^20 = 1048576 entries 2016-11-26 22:01:46,229 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: ACLs enabled? false 2016-11-26 22:01:46,229 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: XAttrs enabled? 
true 2016-11-26 22:01:46,229 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Maximum size of an xattr: 16384 2016-11-26 22:01:46,230 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times 2016-11-26 22:01:46,276 INFO org.apache.hadoop.util.GSet: Computing capacity for map cachedBlocks 2016-11-26 22:01:46,276 INFO org.apache.hadoop.util.GSet: VM type = 64-bit 2016-11-26 22:01:46,277 INFO org.apache.hadoop.util.GSet: 0.25% max memory 966.7 MB = 2.4 MB 2016-11-26 22:01:46,277 INFO org.apache.hadoop.util.GSet: capacity = 2^18 = 262144 entries 2016-11-26 22:01:46,279 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033 2016-11-26 22:01:46,279 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0 2016-11-26 22:01:46,279 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000 2016-11-26 22:01:46,296 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10 2016-11-26 22:01:46,296 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10 2016-11-26 22:01:46,296 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25 2016-11-26 22:01:46,299 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache on namenode is enabled 2016-11-26 22:01:46,299 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis 2016-11-26 22:01:46,303 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache 2016-11-26 22:01:46,303 INFO org.apache.hadoop.util.GSet: VM type = 64-bit 2016-11-26 22:01:46,311 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB 
2016-11-26 22:01:46,311 INFO org.apache.hadoop.util.GSet: capacity = 2^15 = 32768 entries 2016-11-26 22:01:46,664 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /home/jxlgzwh/hadoop-2.7.2/data/tmp/dfs/name/in_use.lock acquired by nodename 4185@master.com 2016-11-26 22:01:58,241 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:01:58,251 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:01:58,251 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:01:59,251 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:01:59,255 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:01:59,255 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:00,061 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 6001 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:00,251 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. 
Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:00,256 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:00,273 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:01,067 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 7008 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:01,260 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:01,261 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:01,281 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:02,069 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 8009 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:02,261 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. 
Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:02,261 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:02,282 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:03,069 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 9010 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:03,262 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:03,263 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:03,284 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:04,072 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 10013 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:04,263 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. 
Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:04,264 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:04,285 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:05,075 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 11015 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:05,265 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:05,269 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:05,287 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:06,080 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 12021 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:06,265 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. 
Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:06,272 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:06,290 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:07,086 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 13026 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:07,268 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:07,275 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:07,299 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:07,306 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [192.168.31.136:8485, 192.168.31.130:8485, 192.168.31.229:8485]. Skipping. org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 
3 exceptions thrown: 192.168.31.136:8485: Call From master.com/192.168.31.136 to master:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 192.168.31.229:8485: Call From master.com/192.168.31.136 to slave02:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 192.168.31.130:8485: Call From master.com/192.168.31.136 to slave01:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81) at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223) at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142) at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471) at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1508) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1532) at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:652) at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681) at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:584) at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:644) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811) at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795) at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554) 2016-11-26 22:02:07,403 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: No edit log streams selected. 2016-11-26 22:02:07,910 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode: Loading 1 INodes. 2016-11-26 22:02:08,104 INFO org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf: Loaded FSImage in 0 seconds. 2016-11-26 22:02:08,104 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Loaded image for txid 0 from /home/jxlgzwh/hadoop-2.7.2/data/tmp/dfs/name/current/fsimage_0000000000000000000 2016-11-26 22:02:08,113 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Need to save fs image? false (staleImage=true, haEnabled=true, isRollingUpgrade=false) 2016-11-26 22:02:08,130 INFO org.apache.hadoop.hdfs.server.namenode.NameCache: initialized with 0 entries 0 lookups 2016-11-26 22:02:08,130 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 21814 msecs 2016-11-26 22:02:11,546 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: RPC server is binding to master:8020 2016-11-26 22:02:11,610 INFO org.apache.hadoop.ipc.CallQueueManager: Using callQueue class java.util.concurrent.LinkedBlockingQueue 2016-11-26 22:02:11,781 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8020 2016-11-26 22:02:12,209 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemState MBean 2016-11-26 22:02:12,435 INFO org.apache.hadoop.hdfs.server.namenode.LeaseManager: Number of blocks under construction: 0 2016-11-26 22:02:12,436 INFO org.apache.hadoop.hdfs.server.namenode.LeaseManager: Number of blocks under construction: 0 2016-11-26 22:02:12,436 INFO org.apache.hadoop.hdfs.StateChange: STATE* Leaving safe mode after 30 secs 2016-11-26 22:02:12,437 INFO org.apache.hadoop.hdfs.StateChange: STATE* Network topology has 0 racks 
and 0 datanodes 2016-11-26 22:02:12,437 INFO org.apache.hadoop.hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks 2016-11-26 22:02:12,574 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor: Number of failed storage changes from 0 to 0 2016-11-26 22:02:12,859 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode RPC up at: master/192.168.31.136:8020 2016-11-26 22:02:12,860 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for standby state 2016-11-26 22:02:12,864 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Will roll logs on active node at slave01/192.168.31.130:8020 every 120 seconds. 2016-11-26 22:02:12,912 INFO org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Starting standby checkpoint thread... Checkpointing active NN at http://slave01:50070 Serving checkpoints at http://master:50070 2016-11-26 22:02:12,878 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting 2016-11-26 22:02:12,878 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8020: starting 2016-11-26 22:02:14,015 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:14,016 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:14,016 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:15,018 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. 
Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:15,033 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:15,034 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:16,038 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:16,039 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:16,039 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:17,056 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:17,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:17,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. 
Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:18,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:18,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:18,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:19,004 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 6003 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:19,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:19,062 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:19,063 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:20,014 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 7013 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 
2016-11-26 22:02:20,064 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:20,065 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:20,065 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:21,016 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 8014 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:21,065 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:21,067 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:21,068 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:22,017 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 9016 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:22,068 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. 
Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:22,072 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:22,074 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:23,019 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 10017 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet. 2016-11-26 22:02:23,070 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:23,074 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:23,076 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:23,079 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [192.168.31.136:8485, 192.168.31.130:8485, 192.168.31.229:8485]. Skipping. org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 
3 exceptions thrown: 192.168.31.136:8485: Call From master.com/192.168.31.136 to master:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 192.168.31.229:8485: Call From master.com/192.168.31.136 to slave02:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 192.168.31.130:8485: Call From master.com/192.168.31.136 to slave01:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81) at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223) at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142) at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471) at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1508) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1532) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:214) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:331) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297) 2016-11-26 22:02:24,714 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state 2016-11-26 22:02:24,718 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer interrupted java.lang.InterruptedException: sleep interrupted at java.lang.Thread.sleep(Native Method) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:347) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415) at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297) 2016-11-26 22:02:24,795 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state 2016-11-26 22:02:24,905 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Starting recovery process for unclosed journal segments... 2016-11-26 22:02:25,970 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:25,974 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:25,992 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:26,971 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. 
Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:26,976 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:26,994 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:27,989 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:27,991 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:27,995 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:28,990 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:28,994 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:28,998 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. 
Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:29,992 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:29,996 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:29,999 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:30,993 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:30,998 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:31,002 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:31,999 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:32,000 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. 
Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:32,011 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:32,999 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:33,004 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:33,012 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:34,000 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:34,004 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:34,013 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:35,001 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.31.136:8485. 
Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:35,005 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave01/192.168.31.130:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:35,014 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: slave02/192.168.31.229:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS) 2016-11-26 22:02:35,016 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [192.168.31.136:8485, 192.168.31.130:8485, 192.168.31.229:8485], stream=null)) org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown: 192.168.31.229:8485: Call From master.com/192.168.31.136 to slave02:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 192.168.31.130:8485: Call From master.com/192.168.31.136 to slave01:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused 192.168.31.136:8485: Call From master.com/192.168.31.136 to master:8485 failed on connection exception: java.net.ConnectException: 拒絕鏈接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81) at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223) at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142) at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createNewUniqueEpoch(QuorumJournalManager.java:182) at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:436) at org.apache.hadoop.hdfs.server.namenode.JournalSet$8.apply(JournalSet.java:624) at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393) at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:621) at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1439) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1112) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1710) at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61) at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64) at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49) at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1583) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1478) at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107) at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) 
2016-11-26 22:02:35,018 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-11-26 22:02:35,021 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master.com/192.168.31.136
************************************************************/
2. Problem analysis
The log shows the NameNode repeatedly trying to reach the JournalNodes over RPC, and it shows that the maximum number of connection attempts is 10 (retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)).
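The start-dfs.sh startup output shows that the NameNode (NN) is started first and the JournalNodes (JN) afterwards. Combined with the behavior observed above, we can conclude that the NN uses up its maximum number of connection attempts before the JNs have finished starting. It then cannot reach a quorum of 2 out of 3 JournalNodes, logs the FATAL "recoverUnfinalizedSegments failed for required journal" error, and exits with status 1.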
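One way to check this diagnosis is to probe the JournalNode RPC port while the NameNode is retrying. The following is a minimal sketch, not part of the original troubleshooting: the helper names `is_listening` and `quorum_reachable` are made up for illustration, and the host names and port 8485 are taken from the log above. It only tests whether a TCP connection is accepted, which is the same condition the "connection refused" errors in the log report on.

```python
import socket

# Hosts and port as they appear in the NameNode log (adjust for your cluster).
JOURNAL_NODES = [("master", 8485), ("slave01", 8485), ("slave02", 8485)]


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port is accepted within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def quorum_reachable(nodes, timeout=1.0):
    """Return True when a strict majority of (host, port) pairs accept connections."""
    up = sum(1 for host, port in nodes if is_listening(host, port, timeout))
    return up >= len(nodes) // 2 + 1
```

Calling `quorum_reachable(JOURNAL_NODES)` right after start-dfs.sh should return False for as long as fewer than 2 of the 3 JournalNodes are listening, which is the window in which the NameNode in the log gave up.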
3. Fix
Change the IPC client parameters in core-site.xml:
<property>
  <name>ipc.client.connect.max.retries</name>
  <value>100</value>
</property>
<property>
  <name>ipc.client.connect.retry.interval</name>
  <value>10000</value>
</property>
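These two parameters set the number of retries and the interval between retries (in milliseconds) for the IPC connections that the NameNode opens to the JournalNodes. On the test cluster it took about 2 minutes for the NameNode to connect to the JournalNodes, and the connection was stable afterwards.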
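To see why the larger values help, it is useful to compare the time the NameNode is willing to wait under each setting. The sketch below is a rough estimate only: the function name `retry_budget_seconds` is made up for illustration, and it assumes each refused connection returns almost immediately, so the total wait is dominated by the sleep between attempts.

```python
def retry_budget_seconds(max_retries, retry_interval_ms):
    """Approximate upper bound on the time spent sleeping between connection attempts."""
    return max_retries * retry_interval_ms / 1000.0


# Values reported by the retry policy in the log (maxRetries=10, sleepTime=1000 ms).
default_budget = retry_budget_seconds(10, 1000)
# Values set in core-site.xml in the fix.
tuned_budget = retry_budget_seconds(100, 10000)

print(f"default: {default_budget:.0f} s")
print(f"tuned:   {tuned_budget:.0f} s (~{tuned_budget / 60:.1f} min)")
```

Under these assumptions the original settings give the JournalNodes roughly 10 seconds to come up, while the new settings allow up to about 1000 seconds, comfortably more than the 2 minutes observed on the test cluster. Smaller values would likely also work as long as the budget exceeds the JournalNode startup time.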