Note: environment: skylin-linux
How to download Flume:
wget http://www.apache.org/dyn/closer.lua/flume/1.6.0/apache-flume-1.6.0-bin.tar.
tar -zvxf apache-flume-1.6.0-bin.tar.
Go into Flume's conf directory, create the file with touch flume.conf, then run cp flume-conf.properties.template flume.conf
Edit the configuration file with vim or gedit (vim/gedit flume.conf). Note that a Flume conf file uses the key-value format of a Java properties file.
1. Name the Agent currently in use.
2. Name the sources under the Agent.
3. Name the channels under the Agent.
4. Name the sinks under the Agent.
5. Bind the sources and sinks together through the channels.
# The Agent is named agent_name, the source is named source_name, and so on
agent_name.sources = source_name
agent_name.channels = channel_name
agent_name.sinks = sink_name
If we need to configure n sinks and m channels on one Agent (n > 1, m > 1), list the names on one line, separated by whitespace:
# The Agent is named agent_name, the source is named source_name, and so on
agent_name.sources = source_name source_name1
agent_name.channels = channel_name channel_name1
agent_name.sinks = sink_name sink_name1
The configuration above describes an Agent with two sources, two sinks, and two channels, as shown in the figure.
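To make the two-of-each layout concrete, the sketch below wires together the names from the listing above. The wiring itself is an assumption made for illustration (the original does not say which source feeds which channel): both sources write to both channels, and each sink drains one channel.

```properties
# Assumed wiring, not taken from the original text.
# A source may write to several channels, so the key is plural.
agent_name.sources.source_name.channels = channel_name channel_name1
agent_name.sources.source_name1.channels = channel_name channel_name1

# A sink reads from exactly one channel, so the key is singular.
agent_name.sinks.sink_name.channel = channel_name
agent_name.sinks.sink_name1.channel = channel_name1
```

With this layout every event from either source is copied into both channels, since replication is Flume's default behavior when a source lists more than one channel.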
That covers the case of multiple sinks, channels, and sources. For multiple Agents, you only need to give each Agent a unique name.
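As a rough sketch of the multi-Agent case, one file can hold the definitions of several Agents, each under its own prefix. The Agent names collector and forwarder, and the component names src1, ch1, and sink1, are hypothetical.

```properties
# Hypothetical agent names: collector and forwarder.
collector.sources = src1
collector.channels = ch1
collector.sinks = sink1

# Component names only need to be unique within one agent,
# because every key is prefixed by the agent name.
forwarder.sources = src1
forwarder.channels = ch1
forwarder.sinks = sink1
```

Which Agent actually runs is chosen at launch time through the `--name` option of `flume-ng agent`; definitions under other prefixes in the same file are ignored by that process.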
Sources | Channels | Sinks |
You can combine the types above freely to suit your needs. For example, with an Avro source, a Memory channel, and an HDFS sink for storage, we can continue the configuration above like this:
# The Agent is named agent_name, the source is named source_name, and so on
agent_name.sources = Avro
agent_name.channels = MemoryChannel
agent_name.sinks = HDFS
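Once the Agent's components are named, you still need to describe each of its sources, sinks, and channels. We go through them one by one below.
Note: when an Agent has N (N > 1) sources, every source must be configured separately. First set the source's type, then set the properties that belong to that type. The general pattern is: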
agent_name.sources.source_name.type = value
agent_name.sources.source_name.property2 = value
agent_name.sources.source_name.property3 = value
# The Agent is named agent_name, the source is named source_name, and so on
agent_name.sources = Avro
agent_name.channels = MemoryChannel
agent_name.sinks = HDFS
#------------------ source configuration ------------------#
agent_name.sources.Avro.type = avro
agent_name.sources.Avro.bind = localhost
agent_name.sources.Avro.port = 9696
# Bind the source to the MemoryChannel channel
agent_name.sources.Avro.channels = MemoryChannel
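The same type-then-properties pattern applies to other source types. As a hedged illustration, a netcat source could be described as follows; the source name NetcatSrc and the port 44444 are assumptions chosen for this sketch, not values from the original.

```properties
# Hypothetical source name; it must also appear in agent_name.sources.
agent_name.sources.NetcatSrc.type = netcat
# Address and port to listen on (port chosen arbitrarily here).
agent_name.sources.NetcatSrc.bind = localhost
agent_name.sources.NetcatSrc.port = 44444
# Bind the source to the channel, as with the Avro source.
agent_name.sources.NetcatSrc.channels = MemoryChannel
```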
agent_name.channels.channel_name.type = value
agent_name.channels.channel_name.property2 = value
agent_name.channels.channel_name.property3 = value
As a concrete example, if we choose the memory channel type, we first configure the channel's type:
agent_name.channels.MemoryChannel.type = memory
agent_name.sources.Avro.channels = MemoryChannel
agent_name.sinks.HDFS.channel = MemoryChannel
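Besides its type, a memory channel is usually given explicit size limits. The sketch below reuses the values that appear in the full file at the end of this post; whether they suit a given workload is an assumption to verify against available heap memory.

```properties
# Maximum number of events the channel can hold at once.
agent_name.channels.MemoryChannel.capacity = 10000000
# Maximum number of events moved in a single transaction.
agent_name.channels.MemoryChannel.transactionCapacity = 10000
# Seconds to wait when adding or removing an event before timing out.
agent_name.channels.MemoryChannel.keep-alive = 10
```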
agent_name.sinks.sink_name.type = value
agent_name.sinks.sink_name.property2 = value
agent_name.sinks.sink_name.property3 = value
As a concrete example, if we set the sink type to HDFS, the configuration looks like this:
agent_name.sinks.HDFS.type = hdfs
agent_name.sinks.HDFS.hdfs.path = <HDFS path>
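Putting the pieces together, a minimal Avro-to-HDFS Agent might look like the sketch below. The HDFS URI is a hypothetical placeholder, and the `hdfs.fileType` line is an optional addition that is not part of the original walkthrough.

```properties
# Name the components of the agent.
agent_name.sources = Avro
agent_name.channels = MemoryChannel
agent_name.sinks = HDFS

# Source: listen for Avro events on localhost:9696.
agent_name.sources.Avro.type = avro
agent_name.sources.Avro.bind = localhost
agent_name.sources.Avro.port = 9696
agent_name.sources.Avro.channels = MemoryChannel

# Channel: buffer events in memory.
agent_name.channels.MemoryChannel.type = memory

# Sink: write events to HDFS.
agent_name.sinks.HDFS.type = hdfs
# Hypothetical path; replace with a directory on your own cluster.
agent_name.sinks.HDFS.hdfs.path = hdfs://namenode:8020/flume/events
# Optional (assumption): write plain event bodies instead of SequenceFiles.
agent_name.sinks.HDFS.hdfs.fileType = DataStream
agent_name.sinks.HDFS.channel = MemoryChannel
```

Assuming the file is saved as conf/flume.conf, an Agent like this can be started from the Flume directory with `bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name agent_name`.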
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agent'

# Define the agent
agent.sources = seqGenSrc
agent.channels = memoryChannel
agent.sinks = loggerSink kafkaSink

# For each one of the sources, the type is defined
# Default mode; possible types include seq, netcat, and avro
agent.sources.seqGenSrc.type = avro
agent.sources.seqGenSrc.bind = localhost
agent.sources.seqGenSrc.port = 9696
# Data source
#agent.sources.seqGenSrc.command = tail -F /home/gongxijun/Qunar/data/data.log

# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = memoryChannel

#+++++++++++++++ Define sink +++++++++++++++++++++#
# Each sink's type must be defined
#agent.sinks.loggerSink.type = logger
agent.sinks.loggerSink.type = hbase
# Table name
agent.sinks.loggerSink.table = flume
# Column family
agent.sinks.loggerSink.columnFamily = gxjun
agent.sinks.loggerSink.serializer = org.apache.flume.sink.hbase.MyHbaseEventSerializer
#agent.sinks.loggerSink.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
agent.sinks.loggerSink.zookeeperQuorum = localhost:2181
agent.sinks.loggerSink.znodeParent = /hbase
# Specify the channel the sink should use
agent.sinks.loggerSink.channel = memoryChannel

# Each channel's type is defined.
# memory
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.keep-alive = 10

# Other config values specific to each type of channel (sink or source)
# can be defined as well
# In this case, it specifies the capacity of the memory channel
#agent.channels.memoryChannel.checkpointDir = /home/gongxijun/Qunar/data
#agent.channels.memoryChannel.dataDirs = /home/gongxijun/Qunar/data , /home/gongxijun/Qunar/tmpData
agent.channels.memoryChannel.capacity = 10000000
agent.channels.memoryChannel.transactionCapacity = 10000

# Define sink 2: kafka
#+++++++++++++++ Define sink +++++++++++++++++++++#
# Each sink's type must be defined
#agent.sinks.kafkaSink.type = logger
agent.sinks.kafkaSink.type = org.apache.flume.sink.kafka.KafkaSink
#agent.sinks.kafkaSink.server=localhost:9092
agent.sinks.kafkaSink.topic = kafka-topic
agent.sinks.kafkaSink.batchSize = 20
agent.sinks.kafkaSink.brokerList = localhost:9092
# Specify the channel the sink should use
agent.sinks.kafkaSink.channel = memoryChannel
Author: 龔細軍