1. Introduction to Ganglia
Ganglia is an open-source monitoring project started at UC Berkeley, designed to scale to thousands of nodes. Every machine runs a daemon called gmond that collects and sends metric data (processor speed, memory usage, and so on) gathered from the operating system and from specified hosts. A host that receives all of the metric data can display it and can pass a condensed form of it up a hierarchy. It is this hierarchical model that lets Ganglia scale so well. gmond adds very little system load, which makes it practical to run on every machine in a cluster without affecting user performance.
1.1 Ganglia Components
The Ganglia monitoring suite consists of three main parts: gmond, gmetad, and the web interface, usually called ganglia-web.
Gmond: a daemon that runs on every node you want to monitor. It collects monitoring statistics and sends and receives them over a single multicast or unicast channel. If it is a sender (mute=no), it collects basic metrics such as system load (load_one) and CPU utilization, and it also sends any custom metrics the user adds through C/Python modules. If it is a receiver (deaf=no), it aggregates all of the metrics sent to it by other hosts and keeps them in an in-memory buffer.
Gmetad: also a daemon. It periodically polls the gmonds, pulls their data, and stores the metrics in the RRD storage engine. It can query multiple clusters and aggregate their metrics, and it is also what drives the web front end of the user interface.
Ganglia-web: as the name suggests, it should be installed on the machine running gmetad so that it can read the RRD files. A cluster is a logical grouping of hosts and metric data, for example database servers, web servers, production, test, QA, and so on. Clusters are completely separate from one another, and you need to run a separate gmond instance for each cluster.
In general you need one receiving gmond per cluster and one gmetad per site.
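To make the sender/receiver roles concrete, here is a minimal sketch of the relevant gmond.conf sections (the option names are standard gmond settings; the cluster name is just an example):

```
globals {
  deaf = no    /* no = also receive and aggregate metrics from other hosts */
  mute = no    /* yes would make this host a receive-only aggregator */
}
cluster {
  name = "my cluster"   /* must match the data_source name in gmetad.conf */
}
```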
Figure 1: the Ganglia workflow
The Ganglia workflow is shown in Figure 1:
On the left are the gmond processes running on each node. Each process is configured solely by the /etc/gmond.conf file on its node, so this file must be installed and configured on every monitored node.
At the top right is the central machine that carries the heavier responsibility (usually one of the nodes in the cluster, though it does not have to be). It runs the gmetad process, which collects the information from all the nodes and stores it with RRDtool; this process is configured solely by /etc/gmetad.conf.
At the bottom right is the web side: when we browse the site, PHP scripts are invoked that pull data from the RRDtool database and dynamically generate the various charts.
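The pull step is easy to observe by hand: gmond serves its aggregated state as XML over its TCP port (8649 by default), and that XML is exactly what gmetad reads. A quick manual check (the host and port are the ones used later in this article):

```
# gmond dumps its XML metric tree to anyone who connects to the TCP port
nc 10.171.29.191 8649 | head -20
```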
1.2 Ganglia Operating Modes (Unicast and Multicast)
Ganglia can collect data in unicast or multicast mode; multicast is the default.
Unicast: a node sends the monitoring data it collects to one or more specific machines, which may be on a different network segment.
Multicast: a node sends the monitoring data it collects to all machines on the same network segment, and at the same time receives the monitoring data sent by every machine on that segment. Because the data is sent as broadcast-style packets, the nodes must be on the same segment; within a segment, however, you can still define different sending channels. The two modes differ only in the send channel, as sketched below.
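A gmond.conf sketch of the two send-channel styles (239.2.11.71 is Ganglia's stock multicast group; the unicast host is this article's master):

```
/* Multicast (default): every gmond on the segment sends to the group */
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}

/* Unicast: send directly to one designated collector; can cross segments */
udp_send_channel {
  host = 10.171.29.191
  port = 8649
  ttl = 1
}
```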
2. Installing Ganglia
Step 1: Topology
Three hosts:
- 10.171.29.191 master
- 10.171.94.155 slave1
- 10.251.0.197 slave3
master runs gmetad and the web interface; all three machines run gmond.
All of the steps below are executed as root.
Step 2: On master, install gmetad and the web front end:

```
yum install ganglia-web.x86_64
yum install ganglia-gmetad.x86_64
```
Step 3: Install gmond on all three machines:

```
yum install ganglia-gmond.x86_64
```
Step 4: On all three machines, edit /etc/ganglia/gmond.conf, changing the following:

```
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname. Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  mcast_join = 10.171.29.191
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
}
```

That is, change the default multicast address to the master's address, and comment out the two IPs in udp_recv_channel.
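To confirm the edited file parses and to watch metrics being sent and received, gmond can be run in the foreground with its standard debug flags (the output format varies by version):

```
# Run gmond non-daemonized with verbose debugging against the edited config
gmond --conf=/etc/ganglia/gmond.conf --debug=5
```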
Step 5: On master, edit /etc/ganglia/gmetad.conf
Change data_source to:

```
data_source "my cluster" 10.171.29.191
```
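data_source takes a cluster name followed by one or more host[:port] entries; listing several hosts gives gmetad alternates to poll if the first is down. For example, using this article's nodes (the second entry is optional):

```
data_source "my cluster" 10.171.29.191:8649 10.171.94.155:8649
```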
Step 6: Link the web files into Apache's document root:

```
ln -s /usr/share/ganglia /var/www/ganglia
```

If this causes problems, copy the contents of /usr/share/ganglia directly into /var/www/ganglia instead, as shown below.
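A plain recursive copy is enough for the fallback (you may also need to fix ownership for the httpd user, depending on the distro):

```
mkdir -p /var/www/ganglia
cp -a /usr/share/ganglia/. /var/www/ganglia/
```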
Step 7: Edit /etc/httpd/conf.d/ganglia.conf so that it reads:

```
#
# Ganglia monitoring system php web frontend
#

Alias /ganglia /usr/share/ganglia

<Location /ganglia>
  Order deny,allow
  Allow from all
  Allow from 127.0.0.1
  Allow from ::1
  # Allow from .example.com
</Location>
```

That is, change "Deny from all" to "Allow from all"; otherwise the page will fail with a permission error.
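Before (re)starting Apache it is worth checking the edited file for syntax errors with the standard httpd tooling:

```
# Should print "Syntax OK"
apachectl configtest
```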
Step 8: Start the services:

```
service gmetad start
service gmond start
/usr/sbin/apachectl start
```
Step 9: Browse to the web interface:
http://ip/ganglia
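If nothing renders, a headless check from master itself helps separate web-server problems from data problems (localhost and the /ganglia alias come from the steps above):

```
# Expect "HTTP/1.1 200 OK" if Apache and the alias are working
curl -sI http://localhost/ganglia/ | head -1
```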
A few things to watch out for:
1. The metrics gmetad collects are stored under /var/lib/ganglia/rrds/.
2. You can check whether data is actually being transmitted with a command along the lines of the one below.
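The original article omits the command itself; a common way to verify the traffic is to sniff the gmond port (8649 in this setup; the interface name is an assumption, adjust for your machine):

```
# Watch gmond metric packets arriving on the collector (Ctrl-C to stop)
tcpdump -i eth0 udp port 8649
```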
3. Configuring Hadoop and HBase
Step 1: Configure Hadoop
Edit hadoop-metrics2.properties:
```
# syntax: [prefix].[source|sink|jmx].[instance].[options]
# See package.html for org.apache.hadoop.metrics2 for details

*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink

#namenode.sink.file.filename=namenode-metrics.out
#datanode.sink.file.filename=datanode-metrics.out
#jobtracker.sink.file.filename=jobtracker-metrics.out
#tasktracker.sink.file.filename=tasktracker-metrics.out
#maptask.sink.file.filename=maptask-metrics.out
#reducetask.sink.file.filename=reducetask-metrics.out

# Below are for sending metrics to Ganglia
#
# for Ganglia 3.0 support
# *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink30
#
# for Ganglia 3.1 support
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31

*.sink.ganglia.period=10

# default for supportsparse is false
*.sink.ganglia.supportsparse=true

*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40

namenode.sink.ganglia.servers=10.171.29.191:8649
datanode.sink.ganglia.servers=10.171.29.191:8649
jobtracker.sink.ganglia.servers=10.171.29.191:8649
tasktracker.sink.ganglia.servers=10.171.29.191:8649
maptask.sink.ganglia.servers=10.171.29.191:8649
reducetask.sink.ganglia.servers=10.171.29.191:8649
```
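After the Hadoop daemons are restarted, the new metrics should appear in gmond's XML tree on the collector; a quick sanity check (host and port from this setup; the grep pattern is illustrative):

```
# List some of the Hadoop JVM metrics that have reached the collector
nc 10.171.29.191 8649 | grep -o 'METRIC NAME="jvm[^"]*"' | sort -u | head
```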
Step 2: Configure HBase
Edit hadoop-metrics.properties:
```
# See http://wiki.apache.org/hadoop/GangliaMetrics
# Make sure you know whether you are using ganglia 3.0 or 3.1.
# If 3.1, you will have to patch your hadoop instance with HADOOP-4675
# And, yes, this file is named hadoop-metrics.properties rather than
# hbase-metrics.properties because we're leveraging the hadoop metrics
# package and hadoop-metrics.properties is an hardcoded-name, at least
# for the moment.
#
# See also http://hadoop.apache.org/hbase/docs/current/metrics.html
# GMETADHOST_IP is the hostname (or) IP address of the server on which the ganglia
# meta daemon (gmetad) service is running

# Configuration of the "hbase" context for NullContextWithUpdateThread
# NullContextWithUpdateThread is a null context which has a thread calling
# periodically when monitoring is started. This keeps the data sampled
# correctly.
hbase.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
hbase.period=10

# Configuration of the "hbase" context for file
# hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# hbase.fileName=/tmp/metrics_hbase.log

# HBase-specific configuration to reset long-running stats (e.g. compactions)
# If this variable is left out, then the default is no expiration.
hbase.extendedperiod = 3600

# Configuration of the "hbase" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
hbase.period=10
hbase.servers=10.171.29.191:8649

# Configuration of the "jvm" context for null
jvm.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
jvm.period=10

# Configuration of the "jvm" context for file
# jvm.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# jvm.fileName=/tmp/metrics_jvm.log

# Configuration of the "jvm" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=10.171.29.191:8649

# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
rpc.period=10

# Configuration of the "rpc" context for file
# rpc.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# rpc.fileName=/tmp/metrics_rpc.log

# Configuration of the "rpc" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=10.171.29.191:8649

# Configuration of the "rest" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rest.period=10
rest.servers=10.171.29.191:8649
```
Finally, restart Hadoop and HBase so that the new metrics configuration takes effect; the restart commands are sketched below.
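For a JobTracker/TaskTracker-era cluster like this one, the restart amounts to the stock Hadoop 1.x and HBase scripts (the *_HOME paths are assumptions about your layout):

```
# Restart HDFS + MapReduce, then HBase, so both re-read their metrics config
$HADOOP_HOME/bin/stop-all.sh && $HADOOP_HOME/bin/start-all.sh
$HBASE_HOME/bin/stop-hbase.sh && $HBASE_HOME/bin/start-hbase.sh
```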