Installing a Storm Cluster



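Basic Storm concepts

  Topologies: a topology, informally a job
  Spouts: the message sources of a topology
  Bolts: the processing logic units of a topology
  Tuple: a message tuple, the format in which data is passed between spouts and bolts
  Streams: streams, unbounded sequences of tuples
  Stream groupings: the strategies for partitioning a stream among tasks
  Tasks: task processing units
  Executors: worker threads
  Workers: worker processes
  Configuration: the configuration of a topology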

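Official site: http://storm.apache.org/
Storm:
  Storm performs real-time online computation and is used for stream processing: data keeps arriving like flowing water, and Storm has to process it as it comes in.
  Storm is usually not used on its own because it does not store data. Typically data arrives from a message queue, and the processed results are stored in MySQL or another database.
  Apache Storm is a free and open source distributed real-time computation system. Apache Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing. Apache Storm is simple, can be used with any programming language, and is a lot of fun to use!
  Apache Storm has many use cases: real-time analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Apache Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
  Apache Storm integrates with the queueing and database technologies you already use. An Apache Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation as needed.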


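Storm compared with Hadoop
Topology vs. MapReduce
  One key difference: a MapReduce job eventually finishes, whereas a topology runs forever (unless you kill it manually).
Nimbus vs. JobTracker
  A Storm cluster has two kinds of nodes: the master node and the worker nodes (by default each machine has at most 4 slots). The master node runs a daemon called Nimbus, whose role is similar to Hadoop's JobTracker. Nimbus is responsible for distributing code around the cluster, assigning tasks to machines, and monitoring their status.
Supervisor vs. TaskTracker
  Each worker node runs a daemon called the Supervisor. The Supervisor listens for work assigned to its machine and starts or stops worker processes as needed. Each worker process executes a subset of a topology; a running topology consists of many worker processes spread across many machines.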

 

Installation steps:

1. Install a ZooKeeper cluster
2. Download the Storm package and extract it
3. Edit the configuration file storm.yaml

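# Hosts of the ZooKeeper cluster to use
storm.zookeeper.servers:
    - "hadoop01"
    - "hadoop02"
    - "hadoop03"

# Hostname of the machine running Nimbus
# (Storm 1.0 and later use nimbus.seeds instead, as in the 2.0.0 configuration further down)
nimbus.host: "hadoop01"
# 4 slots by default; more can be configured depending on machine capacity
supervisor.slots.ports:
    - 6701
    - 6702
    - 6703
    - 6704
    - 6705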
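
The settings above can also be generated from a host list, which helps keep indentation consistent across machines. The following is only a sketch: render_storm_yaml is a hypothetical helper, not part of Storm, and it emits nimbus.seeds (the key used in the Storm 2.0.0 configuration later in this post) rather than nimbus.host.

```shell
# Hypothetical helper (not part of Storm): print a storm.yaml fragment.
# Usage: render_storm_yaml NIMBUS_HOST FIRST_PORT SLOT_COUNT ZK_HOST...
# The body runs in a subshell, so its variables do not leak to the caller.
render_storm_yaml() (
  nimbus_host=$1
  first_port=$2
  slot_count=$3
  shift 3

  echo "storm.zookeeper.servers:"
  for zk_host in "$@"; do
    printf '    - "%s"\n' "$zk_host"
  done

  printf 'nimbus.seeds: ["%s"]\n' "$nimbus_host"

  # One port per worker slot, counted up from FIRST_PORT.
  echo "supervisor.slots.ports:"
  i=0
  while [ "$i" -lt "$slot_count" ]; do
    printf '    - %d\n' "$((first_port + i))"
    i=$((i + 1))
  done
)

# Five slots starting at 6701, matching the port list above.
render_storm_yaml hadoop01 6701 5 hadoop01 hadoop02 hadoop03
```

Redirecting the output to conf/storm.yaml would overwrite the existing file, so it is safer to review the printed fragment first.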

# Start Storm
# On the Nimbus host
nohup ./storm nimbus > /dev/null 2>&1 &
nohup ./storm ui > /dev/null 2>&1 &

# On the supervisor hosts
nohup ./storm supervisor > /dev/null 2>&1 &
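
The start commands rely on sending both output streams to /dev/null. A small sketch with a stand-in function (fake_daemon is hypothetical and only imitates a program that writes to stdout and stderr) shows what the redirection does, and why a file descriptor number must touch the > sign.

```shell
# Stand-in for a Storm daemon: reports its argument count and writes to both streams.
fake_daemon() {
  echo "args: $#"
  echo "warning" >&2
}

# Both streams discarded: nothing is left to capture.
quiet=$(fake_daemon nimbus > /dev/null 2>&1)
echo "quiet=[$quiet]"

# With a space before ">", "1" is no longer a file descriptor but an extra argument.
# Only stderr is discarded here, so the argument count stays visible.
fake_daemon nimbus 1 2> /dev/null
```

The first call prints quiet=[] and the second prints args: 2, so a stray "1" ends up being passed to the program instead of selecting stdout.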

 

1. The ZooKeeper cluster was already installed earlier

2. Download the Storm package and extract it

[linyouyi@hadoop01 software]$ wget https://mirrors.aliyun.com/apache/storm/apache-storm-2.0.0/apache-storm-2.0.0.tar.gz
[linyouyi@hadoop01 software]$ ll
total 739172
-rw-rw-r-- 1 linyouyi linyouyi 312465430 Apr 30 06:17 apache-storm-2.0.0.tar.gz
-rw-r--r-- 1 linyouyi linyouyi 218720521 Aug  3 17:56 hadoop-2.7.7.tar.gz
-rw-rw-r-- 1 linyouyi linyouyi 132569269 Mar 18 14:28 hbase-2.0.5-bin.tar.gz
-rw-r--r-- 1 linyouyi linyouyi  54701720 Aug  3 17:47 server-jre-8u144-linux-x64.tar.gz
-rw-r--r-- 1 linyouyi linyouyi  37676320 Aug  8 09:36 zookeeper-3.4.14.tar.gz
[linyouyi@hadoop01 software]$ tar -zxvf apache-storm-2.0.0.tar.gz -C /hadoop/module/
[linyouyi@hadoop01 software]$ cd /hadoop/module/apache-storm-2.0.0
[linyouyi@hadoop01 apache-storm-2.0.0]$ ll
total 308
drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 bin
drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 conf
-rw-r--r--  1 linyouyi linyouyi 91939 Apr 30 05:13 DEPENDENCY-LICENSES
drwxr-xr-x 19 linyouyi linyouyi  4096 Apr 30 05:13 examples
drwxrwxr-x 19 linyouyi linyouyi  4096 Aug 12 21:11 external
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 extlib
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 extlib-daemon
drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 lib
drwxrwxr-x  5 linyouyi linyouyi  4096 Aug 12 21:11 lib-tools
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 lib-webapp
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:58 lib-worker
-rw-r--r--  1 linyouyi linyouyi 82390 Apr 30 05:13 LICENSE
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:13 licenses
drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 log4j2
-rw-r--r--  1 linyouyi linyouyi 34065 Apr 30 05:13 NOTICE
drwxrwxr-x  6 linyouyi linyouyi  4096 Aug 12 21:11 public
-rw-r--r--  1 linyouyi linyouyi  7914 Apr 30 05:13 README.markdown
-rw-r--r--  1 linyouyi linyouyi     6 Apr 30 05:13 RELEASE
-rw-r--r--  1 linyouyi linyouyi 23865 Apr 30 05:13 SECURITY.md

3. Edit the configuration file storm.yaml

[linyouyi@hadoop01 apache-storm-2.0.0]$ vim conf/storm.yaml
# ZooKeeper addresses
storm.zookeeper.servers:
    - "hadoop01"
    - "hadoop02"
    - "hadoop03"
nimbus.seeds: ["hadoop01"]
#nimbus.seeds: ["host1", "host2", "host3"]

[linyouyi@hadoop01 apache-storm-2.0.0]$ cd ../ 
[linyouyi@hadoop01 module]$ scp -r apache-storm-2.0.0 linyouyi@hadoop02:/hadoop/module/
[linyouyi@hadoop01 module]$ scp -r apache-storm-2.0.0 linyouyi@hadoop03:/hadoop/module/
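
With more than a couple of worker hosts, the copy step can be written as a loop. This is a minimal sketch, assuming the same directory layout on every host; print_copy_commands is a hypothetical helper that only prints the scp commands so they can be checked before anything is copied.

```shell
# Hypothetical helper: print one scp command per target host (dry run).
# Usage: print_copy_commands SRC_DIR DEST_DIR USER@HOST...
print_copy_commands() (
  src_dir=$1
  dest_dir=$2
  shift 2
  for target in "$@"; do
    printf 'scp -r %s %s:%s\n' "$src_dir" "$target" "$dest_dir"
  done
)

print_copy_commands apache-storm-2.0.0 /hadoop/module/ linyouyi@hadoop02 linyouyi@hadoop03
```

Piping the output to sh would run the copies; keeping it as a dry run by default avoids overwriting a remote directory by accident.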

4. Start the services

[linyouyi@hadoop01 module]$ cd apache-storm-2.0.0
// If it reports that JAVA_HOME cannot be found, configure the conf/storm-env.sh file
[linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm nimbus &
[linyouyi@hadoop01 apache-storm-2.0.0]$ jps
30051 Nimbus
44057 QuorumPeerMain
30381 Jps
[linyouyi@hadoop01 apache-storm-2.0.0]$ netstat -tnpl | grep 30684
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6       0      0 :::6627                 :::*                    LISTEN      30684/java
[linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm ui &
[linyouyi@hadoop01 apache-storm-2.0.0]$ jps
32674 UIServer
44057 QuorumPeerMain
30684 Nimbus
32989 Jps
[linyouyi@hadoop01 apache-storm-2.0.0]$ netstat -tnpl | grep 32674
tcp6       0      0 :::8080                 :::*                    LISTEN      32674/java
// Opening http://hadoop01:8080 in a browser shows that the slot counts are all 0. Once the supervisors are started on hadoop02 and hadoop03 below, the slot counts are no longer 0.
[linyouyi@hadoop02 apache-storm-2.0.0]$ bin/storm supervisor &
[linyouyi@hadoop02 apache-storm-2.0.0]$ jps
70952 Jps
70794 Supervisor
34879 QuorumPeerMain
[linyouyi@hadoop03 apache-storm-2.0.0]$ bin/storm supervisor &
[linyouyi@hadoop03 apache-storm-2.0.0]$ jps
119587 QuorumPeerMain
116291 Jps
116143 Supervisor
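
As a rough check on what the UI should show afterwards: the total slot count is the number of supervisors multiplied by the number of ports each one offers. The sketch below assumes the 2.0.0 storm.yaml edited earlier leaves supervisor.slots.ports unset, so each supervisor falls back to Storm's default of four ports; expected_slots is a hypothetical helper.

```shell
# Hypothetical helper: total slots expected once every supervisor has registered.
# Usage: expected_slots SUPERVISOR_COUNT PORTS_PER_SUPERVISOR
expected_slots() {
  echo "$(($1 * $2))"
}

# hadoop02 and hadoop03, four default ports each
expected_slots 2 4
```

This prints 8. A lower number in the UI would suggest that one of the supervisors has not registered with Nimbus yet.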

 

 

Common commands for submitting topologies to Storm

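// Command format: storm jar [jar path] [topology package.topology class] [Storm IP address] [Storm port] [topology name] [arguments]
// Everything after the class name is passed to the topology's main method as arguments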
[linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm jar --help
usage: storm jar [-h] [--jars JARS] [--artifacts ARTIFACTS]
                 [--artifactRepositories ARTIFACTREPOSITORIES]
                 [--mavenLocalRepositoryDirectory MAVENLOCALREPOSITORYDIRECTORY]
                 [--proxyUrl PROXYURL] [--proxyUsername PROXYUSERNAME]
                 [--proxyPassword PROXYPASSWORD] [--storm-server-classpath]
                 [--config CONFIG] [-storm_config_opts STORM_CONFIG_OPTS]
                 topology-jar-path topology-main-class
                 [topology_main_args [topology_main_args ...]]

positional arguments:
  topology-jar-path     will upload the jar at topology-jar-path when the
                        topology is submitted.
  topology-main-class   main class of the topology jar being submitted
  topology_main_args    Runs the main method with the specified arguments.

optional arguments:
  --artifactRepositories ARTIFACTREPOSITORIES
                        When you need to pull the artifacts from other than
                        Maven Central, you can pass remote repositories to
                        --artifactRepositories option with a comma-separated
                        string. Repository format is "<name>^<url>". '^' is
                        taken as separator because URL allows various
                        characters. For example, --artifactRepositories
                        "jboss-repository^http://repository.jboss.com/maven2,H
                        DPRepo^http://repo.hortonworks.com/content/groups/publ
                        ic/" will add JBoss and HDP repositories for
                        dependency resolver.
  --artifacts ARTIFACTS
                        When you want to ship maven artifacts and its
                        transitive dependencies, you can pass them to
                        --artifacts with comma-separated string. You can also
                        exclude some dependencies like what you're doing in
                        maven pom. Please add exclusion artifacts with '^'
                        separated string after the artifact. For example,
                        -artifacts "redis.clients:jedis:2.9.0,org.apache.kafka
                        :kafka-clients:1.0.0^org.slf4j:slf4j-api" will load
                        jedis and kafka-clients artifact and all of transitive
                        dependencies but exclude slf4j-api from kafka.
  --config CONFIG       Override default storm conf file
  --jars JARS           When you want to ship other jars which are not
                        included to application jar, you can pass them to
                        --jars option with comma-separated string. For
                        example, --jars "your-local-jar.jar,your-local-
                        jar2.jar" will load your-local-jar.jar and your-local-
                        jar2.jar.
  --mavenLocalRepositoryDirectory MAVENLOCALREPOSITORYDIRECTORY
                        You can provide local maven repository directory via
                        --mavenLocalRepositoryDirectory if you would like to
                        use specific directory. It might help when you don't
                        have '.m2/repository' directory in home directory,
                        because CWD is sometimes non-deterministic (fragile).
  --proxyPassword PROXYPASSWORD
                        password of proxy if it requires basic auth
  --proxyUrl PROXYURL   You can also provide proxy information to let
                        dependency resolver utilizing proxy if needed. URL
                        representation of proxy ('http://host:port')
  --proxyUsername PROXYUSERNAME
                        username of proxy if it requires basic auth
  --storm-server-classpath
                        If for some reason you need to have the full storm
                        classpath, not just the one for the worker you may
                        include the command line option `--storm-server-
                        classpath`. Please be careful because this will add
                        things to the classpath that will not be on the worker
                        classpath and could result in the worker not running.
  -h, --help            show this help message and exit
  -storm_config_opts STORM_CONFIG_OPTS, -c STORM_CONFIG_OPTS
                        Override storm conf properties , e.g.
                        nimbus.ui.port=4443


[linyouyi@hadoop01 apache-storm-2.0.0]$ storm jar /home/storm/storm-starter.jar storm.start.WordCountTopology wordcountTop

This submits storm-starter.jar to the remote cluster and starts the wordcountTop topology.
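
Because a topology keeps running until it is killed, submission and shutdown are usually scripted together. The following is a minimal sketch that only prints the commands. submit_cmd and kill_cmd are hypothetical helpers, and it assumes wordcountTop is a separate argument that the main class uses as the topology name.

```shell
# Hypothetical helpers: print (do not run) the storm CLI commands.

# Usage: submit_cmd JAR_PATH MAIN_CLASS [MAIN_ARGS...]
submit_cmd() (
  jar_path=$1
  main_class=$2
  shift 2
  printf 'storm jar %s %s' "$jar_path" "$main_class"
  # Everything after the main class is handed to the topology's main method.
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
  printf '\n'
)

# Usage: kill_cmd TOPOLOGY_NAME
kill_cmd() {
  # storm kill stops a running topology by name.
  printf 'storm kill %s\n' "$1"
}

submit_cmd /home/storm/storm-starter.jar storm.start.WordCountTopology wordcountTop
kill_cmd wordcountTop
```

Printing the commands first makes it easy to confirm that the class name and the topology name are separate words before anything is submitted.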
