# This host has not established contact with the Cloudera Manager Server

The monitor services on the server side are running normally, but the agent cannot connect.
# This host has established contact with the Cloudera Manager Server. This host has not established contact with the Host Monitor.
[20/Feb/2020 16:51:51 +0000] 22086 MonitorDaemon-Reporter firehoses INFO Creating a connection to the ACTIVITYMONITOR.
[20/Feb/2020 16:51:51 +0000] 22086 MonitorDaemon-Reporter firehoses INFO Creating a connection to the SERVICEMONITOR.
[20/Feb/2020 16:51:51 +0000] 22086 MonitorDaemon-Reporter firehoses INFO Creating a connection to the HOSTMONITOR.
[20/Feb/2020 16:51:51 +0000] 22086 MonitorDaemon-Reporter throttling_logger ERROR Error sending messages to firehose: mgmt-HOSTMONITOR-d592ed6aea0516a09027c2cf834d8979
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/monitor/firehose.py", line 121, in _send
    self._port)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.7/httplib.py", line 833, in connect
    self.timeout, self.source_address)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
Reference:

In the server log:
2020-02-20 17:25:06,371 WARN New I/O boss #388:com.cloudera.server.cmf.log.AgentResponseAsyncHandler: (2 skipped) Exception thrown while trying to get log search results from agent on host: creative
java.net.ConnectException: Connection timed out: creative/172.19.40.203:9000
...
2020-02-20 17:35:17,209 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: (10 skipped) Unable to retrieve remote parcel repository manifest
java.util.concurrent.ExecutionException: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
cloudera agent monitor firehose error: [Errno 111] Connection refused
# Re-added the host
2020-02-20 20:19:57,879 ERROR scm-web-4143:com.cloudera.cmf.model.DbCommand: Command null(DeployClusterClientConfig) has completed. finalstate:FINISHED, success:false, msg:Command Deploy Client Configuration is not currently available for execution.
2020-02-20 20:19:57,894 INFO scm-web-4143:com.cloudera.enterprise.JavaMelodyFacade: Exiting HTTP Operation: Method:POST, Path:/v7/clusters/LogServerClu/commands/deployClientConfig, Status:200
2020-02-20 20:19:57,978 WARN scm-web-4105:com.cloudera.cmf.command.flow.SeqFlowCmd: Invalid command state json
com.cloudera.enterprise.JsonUtil2$JsonRuntimeException: com.fasterxml.jackson.databind.exc.MismatchedInputException: No content to map due to end-of-input
 at [Source: (String)""; line: 1, column: 0]
    at com.cloudera.enterprise.JsonUtil2.valueFromString(JsonUtil2.java:193)
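The JDK is not the cause!
After a whole day of fiddling, the last-resort approach:
Stop the agents on the four hosts 170, 171, 172 and 221, and stop the server on 170; then restart the server, followed by the four agents.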
# on all four hosts
systemctl stop cloudera-scm-agent
# on 170 (the server)
systemctl stop cloudera-scm-server
systemctl start cloudera-scm-server
# on all four hosts
systemctl start cloudera-scm-agent
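Node 221 (internal-IP mapping) was still not fixed. Removed it from the cluster in Cloudera, configured the hosts mapping on all four nodes to use 221's public IP, and then re-added it to the cluster.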
# scm-status.log
[20/Feb/2020 21:56:44 +0000] 5440 MainThread _cplogging INFO [20/Feb/2020:21:56:44] ENGINE Started monitor thread 'Autoreloader'.
[20/Feb/2020 21:56:44 +0000] 5440 MainThread _cplogging INFO [20/Feb/2020:21:56:44] ENGINE Started monitor thread '_TimeoutMonitor'.
[20/Feb/2020 21:56:44 +0000] 5440 HTTPServer Thread-3 _cplogging ERROR [20/Feb/2020:21:56:44] ENGINE Error in HTTP server: shutting down
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cherrypy/process/servers.py", line 225, in _start_http_thread
    self.httpserver.start()
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cheroot/server.py", line 1326, in start
    raise socket.error(msg)
error: No socket could be created -- (('47.103.112.221', 9000): [Errno 99] Cannot assign requested address)

[20/Feb/2020 21:56:44 +0000] 5440 HTTPServer Thread-3 _cplogging INFO [20/Feb/2020:21:56:44] ENGINE Bus STOPPING
[20/Feb/2020 21:56:44 +0000] 5440 HTTPServer Thread-3 _cplogging INFO [20/Feb/2020:21:56:44] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('creative', 9000)) already shut down
[20/Feb/2020 21:56:44 +0000] 5440 HTTPServer Thread-3 _cplogging INFO [20/Feb/2020:21:56:44] ENGINE Stopped thread '_TimeoutMonitor'.
[20/Feb/2020 21:56:44 +0000] 5440 HTTPServer Thread-3 _cplogging INFO [20/Feb/2020:21:56:44] ENGINE Stopped thread 'Autoreloader'.
[20/Feb/2020 21:56:44 +0000] 5440 HTTPServer Thread-3 _cplogging INFO [20/Feb/2020:21:56:44] ENGINE Bus STOPPED
[20/Feb/2020 21:56:44 +0000] 5440 HTTPServer Thread-3 _cplogging INFO [20/Feb/2020:21:56:44] ENGINE Bus EXITING
[20/Feb/2020 21:56:44 +0000] 5440 HTTPServer Thread-3 _cplogging INFO [20/Feb/2020:21:56:44] ENGINE Bus EXITED
# scm-agent.log
[20/Feb/2020 21:56:35 +0000] 5322 MainThread _cplogging INFO [20/Feb/2020:21:56:35] ENGINE Serving on http://127.0.0.1:9001
[20/Feb/2020 21:56:35 +0000] 5322 MainThread _cplogging INFO [20/Feb/2020:21:56:35] ENGINE Bus STARTED
[20/Feb/2020 21:56:37 +0000] 5322 MainThread main ERROR Top-level exception: <Fault 40: 'ABNORMAL_TERMINATION: status_server'>
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/main.py", line 107, in main_impl
    ag.start(legacy_supervisor)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 839, in start
    self.supervisor_client.start_process(STATUS_SERVER_PROC)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/util/__init__.py", line 531, in new_fn
    return fn(self, *args, **kwargs)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/supervisor.py", line 406, in start_process
    raise RetryableProcessException(fault)
RetryableProcessException: <Fault 40: 'ABNORMAL_TERMINATION: status_server'>

### Check the IP-to-hostname mapping
[root@creative cloudera-scm-agent]# python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'
creative 47.103.112.221
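The `[Errno 99] Cannot assign requested address` error means the status server tried to bind to an address that is not configured on any local interface, which fits the hostname resolving to the public IP. The following is a minimal Python 3 sketch of that check, not part of the Cloudera agent; `can_bind` and `resolved_address` are hypothetical helper names introduced here for illustration.

```python
import errno
import socket


def can_bind(ip, port=0):
    """Return True if a TCP socket can be bound to ip on this machine.

    Port 0 lets the kernel pick any free port, so only the address is tested.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, port))
        return True
    except OSError as exc:
        if exc.errno == errno.EADDRNOTAVAIL:  # reported as Errno 99 on Linux
            return False
        raise
    finally:
        sock.close()


def resolved_address():
    """Resolve this host's FQDN the same way the one-liner above does."""
    fqdn = socket.getfqdn()
    return fqdn, socket.gethostbyname(fqdn)


if __name__ == "__main__":
    try:
        fqdn, ip = resolved_address()
    except OSError as exc:
        print("hostname does not resolve:", exc)
    else:
        state = "bindable" if can_bind(ip) else "NOT assignable on this host"
        print(fqdn, ip, state)
```

If the resolved address is reported as not assignable, the hosts-file entry for the local hostname probably points at an IP that the machine does not own (for example a NAT-mapped public IP).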
In the end, removed the agent, reinstalled it, and configured the hosts file mapping with the public IP.
creative: IOException thrown while collecting data from host: Connection refused (Connection refused)
# agent.log
[20/Feb/2020 22:48:42 +0000] 11398 MonitorDaemon-Reporter throttling_logger ERROR (10 skipped) Error sending messages to firehose: mgmt-HOSTMONITOR-d592ed6aea0516a09027c2cf834d8979
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/monitor/firehose.py", line 121, in _send
    self._port)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.7/httplib.py", line 833, in connect
    self.timeout, self.source_address)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
# /var/log/cloudera-scm-firehose
# Activity Monitor log
2020-02-20 21:01:43,753 WARN com.cloudera.cmf.BasicScmProxy: Exception while getting current fragments hashes
java.net.ConnectException: Connection refused (Connection refused)
...
2020-02-20 21:02:40,203 INFO com.cloudera.cmon.firehose.Main: Starting Firehose. JVM Args: [-XX:+UseConcMarkSweepGC, -XX:+UseParNewGC, -Dmgmt.log.file=mgmt-cmf-mgmt-ACTIVITYMONITOR-hz-seeing-bg-01.log.out, -Djava.awt.headless=true, -Djava.net.preferIPv4Stack=true, -Dfirehose.schema.dir=/opt/cloudera/cm/schema, -Xms1073741824, -Xmx1073741824, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=/tmp/mgmt_mgmt-ACTIVITYMONITOR-d592ed6aea0516a09027c2cf834d8979_pid43982.hprof, -XX:OnOutOfMemoryError=/opt/cloudera/cm-agent/service/common/killparent.sh], Args: [--pipeline-type, ACTIVITY_MONITORING_TREE, --mgmt-home, /opt/cloudera/cm], Version: 6.2.0 (#968826 built by jenkins on 20190314-1704 git: 16bbe6211555460a860cf22d811680b35755ea81)
...
# Host Monitor log
2020-02-20 21:02:45,838 WARN com.cloudera.cmon.firehose.HMONToSMONHostSubjectRecordPublisher: Failed to send messages to SMON.
java.lang.reflect.UndeclaredThrowableException
    at com.sun.proxy.$Proxy23.writeStatusRecords(Unknown Source)
    at com.cloudera.cmon.firehose.BasicFirehoseClient.writeStatusRecords(BasicFirehoseClient.java:75)
    at com.cloudera.cmon.firehose.HMONToSMONHostSubjectRecordPublisher.processRecords(HMONToSMONHostSubjectRecordPublisher.java:107)
    at com.cloudera.cmon.tstore.leveldb.LDBSubjectRecordStore.write(LDBSubjectRecordStore.java:399)
    at com.cloudera.cmon.kaiser.HMONTestRunner.runHostTestsForSession(HMONTestRunner.java:86)
    at com.cloudera.cmon.kaiser.HMONTestRunner.runTestsForSession(HMONTestRunner.java:66)
    at com.cloudera.cmon.kaiser.BaseTestRunner.runTestsOnAllSubjects(BaseTestRunner.java:143)
    at com.cloudera.cmon.kaiser.KaiserService$KaiserServiceRunner.run(KaiserService.java:138)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.avro.AvroRemoteException: java.net.ConnectException: Connection refused (Connection refused)
The SMON service uses port 9999 and the firehose uses port 9998.

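By comparison, only the server host listens on ports 9999 and 9998, and every agent must be able to reach both ports.
However, the Alibaba Cloud machine 221 cannot reach port 9999 on the IDC machine 170 (the server).
Only machines on the internal network can reach it; the port is not reachable through the server's public IP, even though it is the same machine.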
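A quick way to confirm this from an agent host is to attempt a plain TCP connection to each port. The sketch below assumes Python 3 is available on the agent; `check_port` is a hypothetical helper name, and it only distinguishes an open port from a refused or timed-out connection.

```python
import socket


def check_port(host, port, timeout=3.0):
    """Try a TCP connection and report 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Host is reachable, but nothing listens on that address and port
        return "refused"
    except socket.timeout:
        # No answer at all, e.g. packets dropped by a firewall or NAT
        return "timeout"
    except OSError as exc:
        return "error: %s" % exc
```

For example, calling `check_port(server_ip, 9999)` and `check_port(server_ip, 9998)` once with the server's internal IP and once with its public IP should show which path the agent can actually use; `server_ip` here stands for whatever address the agent's hosts file maps the server to.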

Bind the service behind port 9999 to the wildcard address: Cloudera Manager Server > Configuration > Activity Monitor, change the setting to bind to the wildcard address.
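Why the wildcard binding matters can be shown with two plain sockets: a listener bound to one specific address refuses connections that arrive on any other address of the same machine, while a listener bound to `0.0.0.0` accepts them. This is an illustrative sketch only, assuming Linux, where every address in 127.0.0.0/8 is local, so 127.0.0.2 stands in for "the other IP of the same machine"; the function names are made up for this example.

```python
import socket


def listen_on(bind_ip):
    """Open a TCP listener on bind_ip with a kernel-chosen port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_ip, 0))
    srv.listen(5)
    return srv


def reachable(ip, port, timeout=1.0):
    """Return True if a TCP connection to ip:port succeeds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def demo():
    """Compare a listener bound to one address with one bound to the wildcard."""
    results = {}
    for bind_ip in ("127.0.0.1", "0.0.0.0"):
        srv = listen_on(bind_ip)
        port = srv.getsockname()[1]
        # Connect through a *different* local address than the specific binding
        results[bind_ip] = reachable("127.0.0.2", port)
        srv.close()
    return results


if __name__ == "__main__":
    for bind_ip, ok in demo().items():
        print("listener on %-9s reachable via 127.0.0.2: %s" % (bind_ip, ok))
```

The same logic applies to the monitor ports: if the service binds only to the internal IP, clients that arrive through another address of the machine are refused.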

cd /var/log/cloudera-scm-firehose
# Now only the Host Monitor reports errors; the Activity Monitor no longer does
2020-02-21 11:18:07,529 INFO com.cloudera.cmon.tstore.leveldb.LDBPartitionManager: Opening partition LDBPartitionMetadataWrapper{tableName=ts_subject, partitionName=ts_subject_2020-02-11T07:41:01.428Z, startTime=2020-02-11T07:41:01.428Z, endTime=null, version=9, state=CLOSED}
2020-02-21 11:18:07,546 WARN com.cloudera.cmon.firehose.HMONToSMONHostSubjectRecordPublisher: Failed to send messages to SMON.
java.lang.reflect.UndeclaredThrowableException
    at com.sun.proxy.$Proxy23.writeStatusRecords(Unknown Source)
    ...
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.avro.AvroRemoteException: java.net.ConnectException: Connection refused (Connection refused)
    at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:104)
    ... 9 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
Next, apply the same change: just tick the option.

MainThread main ERROR Top-level exception: <Fault 40: 'ABNORMAL_TERMINATION: status_server'>
# Check cloudera-scm-eventserver
2020-02-21 11:34:07,569 INFO org.apache.avro.ipc.NettyServer: [id: 0xe2bcd0eb, /192.168.20.170:51594 => /192.168.20.170:7184] OPEN
2020-02-21 11:34:07,570 INFO org.apache.avro.ipc.NettyServer: [id: 0xe2bcd0eb, /192.168.20.170:51594 => /192.168.20.170:7184] BOUND: /192.168.20.170:7184
2020-02-21 11:34:07,570 INFO org.apache.avro.ipc.NettyServer: [id: 0xe2bcd0eb, /192.168.20.170:51594 => /192.168.20.170:7184] CONNECTED: /192.168.20.170:51594
2020-02-21 11:34:07,576 ERROR com.cloudera.cmf.eventcatcher.server.EventMetricsPublisher: Could not publish metrics to HMON:
java.lang.reflect.UndeclaredThrowableException
...
2020-02-21 11:34:07,590 ERROR com.cloudera.cmf.eventcatcher.server.EventMetricsPublisher: Could not publish metrics to SMON:
java.lang.reflect.UndeclaredThrowableException
    at com.sun.proxy.$Proxy22.writeMetrics(Unknown Source)
    at com.cloudera.cmon.firehose.BasicFirehoseClient.writeMetrics(BasicFirehoseClient.java:87)
    at com.cloudera.cmf.eventcatcher.server.EventMetricsPublisher.publishToSMON(EventMetricsPublisher.java:233)
    at com.cloudera.cmf.eventcatcher.server.EventMetricsPublisher.run(EventMetricsPublisher.java:110)
    at com.cloudera.enterprise.PeriodicEnterpriseService$UnexceptionablePeriodicRunnable.run(PeriodicEnterpriseService.java:67)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.avro.AvroRemoteException: java.net.ConnectException: Connection refused (Connection refused)
    at org.apache.avro.ipc.specific.SpecificRequestor.invoke(SpecificRequestor.java:104)
    ... 6 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
# Finally enabled the wildcard binding for the Service Monitor as well; still the same error as above. Check the agent's scm-status.log
[21/Feb/2020 11:57:55 +0000] 16366 MainThread _cplogging INFO [21/Feb/2020:11:57:55] ENGINE Started monitor thread '_TimeoutMonitor'.
[21/Feb/2020 11:57:55 +0000] 16366 HTTPServer Thread-3 _cplogging ERROR [21/Feb/2020:11:57:55] ENGINE Error in HTTP server: shutting down
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cherrypy/process/servers.py", line 225, in _start_http_thread
    self.httpserver.start()
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cheroot/server.py", line 1326, in start
    raise socket.error(msg)
error: No socket could be created -- (('47.103.112.221', 9000): [Errno 99] Cannot assign requested address)
# supervisord
2020-02-21 11:42:12,122 INFO gave up: status_server entered FATAL state, too many start retries too quickly
2020-02-21 11:57:46,783 INFO spawned: 'status_server' with pid 16328
2020-02-21 11:57:47,355 INFO exited: status_server (exit status 70; not expected)

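Port 9000 has to bind to the internal IP. Is that the cause? => Switch the agent to the internal-IP mapping.
The server is mapped to the internal IP

The server is mapped to the external IP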






Final result

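# Addendum
On 159, starting cloudera-manager failed: the event-server failed during startup, and the three monitors that start after it then failed as well.
So check the event-server log: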
2020-02-21 23:27:04,647 INFO com.cloudera.enterprise.DebugServer: Running debug HTTP server on 0.0.0.0:8084
2020-02-21 23:27:04,766 ERROR com.cloudera.cmf.eventcatcher.server.EventCatcherService: Error starting EventServer
org.jboss.netty.channel.ChannelException: Failed to bind to: 0.0.0.0/0.0.0.0:7184
    at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:298)
    at org.apache.avro.ipc.CustomNettyServer.<init>(CustomNettyServer.java:76)
    at com.cloudera.cmf.eventcatcher.server.AvroEventStoreServer.<init>(AvroEventStoreServer.java:107)
    at com.cloudera.cmf.eventcatcher.server.EventCatcherService.main(EventCatcherService.java:179)
Caused by: java.net.BindException: Address already in use
netstat -nltpa

# Connections are still waiting to close
ss -ano | grep 7184  # add -p to also show the owning process ID
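Before restarting the event-server it can help to confirm that port 7184 is actually free again. The sketch below is one way to test that from Python 3; `port_in_use` is a hypothetical helper, not a Cloudera tool. It deliberately does not set `SO_REUSEADDR`, so lingering connections on the port may also be reported as "in use".

```python
import errno
import socket


def port_in_use(port, ip="0.0.0.0"):
    """Return True if binding ip:port fails with EADDRINUSE."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, port))
        return False
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:  # "Address already in use"
            return True
        raise
    finally:
        sock.close()


if __name__ == "__main__":
    # 7184 is the event-server port from the log above
    print("7184 in use:", port_in_use(7184))
```

Once this reports the port as free, the `BindException: Address already in use` from the event-server log should no longer be triggered by a leftover socket.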