After installing a Hadoop cluster, the YARN console shows both the node id and the node HTTP address as localhost:
```bash
[hadoop@master sbin]$ yarn node -list
20/12/17 12:21:19 INFO client.RMProxy: Connecting to ResourceManager at master/172.16.8.42:18040
Total Nodes:1
         Node-Id      Node-State  Node-Http-Address  Number-of-Running-Containers
 localhost:43141         RUNNING     localhost:8042                             0
```
When a job is submitted, the YARN log likewise prints the node address as localhost / 127.0.0.1 and uses it to reach the node, so the connection is bound to fail:
```
2020-12-17 00:53:30,721 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1607916354082_0008_01_000001, AllocationRequestId: 0, Version: 0, NodeId: localhost:43141, NodeHttpAddress: localhost:8042, Resource: <memory:2048, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 127.0.0.1:35845 }, ExecutionType: GUARANTEED, ] for AM appattempt_1607916354082_0008_000001
2020-12-17 00:56:30,801 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Error launching appattempt_1607916354082_0008_000001. Got exception: java.net.ConnectException: Call From master/172.16.8.42 to localhost:43141 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
	at sun.reflect.GeneratedConstructorAccessor46.newInstance(Unknown Source)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:827)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:757)
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1553)
	at org.apache.hadoop.ipc.Client.call(Client.java:1495)
	at org.apache.hadoop.ipc.Client.call(Client.java:1394)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
```
In the Hadoop source code, the node id is built as follows:
```java
private NodeId buildNodeId(InetSocketAddress connectAddress,
    String hostOverride) {
  if (hostOverride != null) {
    connectAddress = NetUtils.getConnectAddress(
        new InetSocketAddress(hostOverride, connectAddress.getPort()));
  }
  return NodeId.newInstance(
      connectAddress.getAddress().getCanonicalHostName(),
      connectAddress.getPort());
}
```
The host name comes from `connectAddress.getAddress().getCanonicalHostName()`. A host name can also be obtained via `getHostName()`, so what is the difference? `getCanonicalHostName()` returns the fully qualified domain name, while `getHostName()` returns the plain host name: for example, if the host name is definesys but DNS maps it to definesys.com, `getCanonicalHostName()` performs a reverse lookup (via DNS or the hosts file) and returns the fully qualified name. In our case `getAddress()` actually returns 127.0.0.1, and the hosts file is configured as follows:
```
127.0.0.1 localhost localhost.localdomain
```
so the address was resolved to localhost.
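The lookup behaviour can be reproduced with a small standalone program (a sketch; the printed values depend on the local /etc/hosts and DNS configuration, so no fixed output is guaranteed):

```java
import java.net.InetAddress;

public class HostNameDemo {
    public static void main(String[] args) throws Exception {
        InetAddress addr = InetAddress.getByName("127.0.0.1");
        // getCanonicalHostName() triggers a reverse lookup, so 127.0.0.1
        // resolves to whatever the first matching hosts-file entry says.
        System.out.println("getHostName:          " + addr.getHostName());
        System.out.println("getCanonicalHostName: " + addr.getCanonicalHostName());
        // With "127.0.0.1 localhost localhost.localdomain" in /etc/hosts,
        // both typically print "localhost" -- exactly the value YARN ends
        // up advertising as the NodeManager address.
    }
}
```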
Hadoop's recommended fix (on the ConnectionRefused wiki page linked in the exception above) suggests removing the 127.0.0.1 and 127.0.1.1 entries from the hosts file. After deleting them, the cluster came back up normally and the problem was solved.
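After the fix, a minimal /etc/hosts for this cluster might look like the following. Only the master entry is taken from the logs above; any additional node entries are placeholders to adapt to your own cluster:

```
# Map each node's real IP to its host name; do NOT map the
# machine's own host name to 127.0.0.1 or 127.0.1.1.
172.16.8.42 master
# Worker entries follow the same pattern, e.g.:
# 172.16.8.xx worker1   (illustrative; use your cluster's actual addresses)
```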