Flume Spooling Directory Source Sink to HDFS

Spooling Directory Source (from the Apache Flume documentation):

This source lets you ingest data by placing files to be ingested into a "spooling" directory on disk.
This source will watch the specified directory for new files, and will parse events out of new files as they appear.
The event parsing logic is pluggable. After a given file has been fully read into the channel, it is renamed to indicate completion (or optionally deleted).

Unlike the Exec source, this source is reliable and will not miss data, even if Flume is restarted or killed. In exchange for this reliability, only immutable, uniquely-named files must be dropped into the spooling directory. Flume tries to detect these problem conditions and will fail loudly if they are violated.

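Because Flume requires the files in the spooling directory to be immutable and uniquely named, a common pattern is to finish writing a file somewhere else and then move it into the spool directory in one atomic step. A minimal sketch, assuming a staging directory /root/log_tmp on the same filesystem (the directory and file names here are illustrative, not part of the configuration below):

# finish writing the file outside the spool directory first
echo "sample event line" > /root/log_tmp/access.log.$(date +%s)
# mv within the same filesystem is an atomic rename, so Flume never sees a half-written file
mv /root/log_tmp/access.log.* /root/log/

Writing or copying directly into /root/log/ risks Flume picking the file up while it is still being written, which is exactly the condition the source fails loudly on.
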
# Define the names of the three components
ag1.sources = source1
ag1.sinks = sink1
ag1.channels = channel1

# Configure the source component
ag1.sources.source1.type = spooldir
ag1.sources.source1.spoolDir = /root/log/
ag1.sources.source1.fileSuffix = .FINISHED
ag1.sources.source1.deserializer.maxLineLength = 5120

# Configure the sink component
ag1.sinks.sink1.type = hdfs
ag1.sinks.sink1.hdfs.path = hdfs://hdp-01:9000/access_log/%y-%m-%d/%H-%M
ag1.sinks.sink1.hdfs.filePrefix = app_log
ag1.sinks.sink1.hdfs.fileSuffix = .log
ag1.sinks.sink1.hdfs.batchSize = 100
ag1.sinks.sink1.hdfs.fileType = DataStream
ag1.sinks.sink1.hdfs.writeFormat = Text

## roll: rules that control when the sink rolls over to a new HDFS file
ag1.sinks.sink1.hdfs.rollSize = 512000    ## roll by file size, in bytes
ag1.sinks.sink1.hdfs.rollCount = 1000000  ## roll by number of events
ag1.sinks.sink1.hdfs.rollInterval = 60    ## roll by time interval, in seconds

## rules that control how the target directories are generated (see the layout example after this config)
ag1.sinks.sink1.hdfs.round = true
ag1.sinks.sink1.hdfs.roundValue = 10
ag1.sinks.sink1.hdfs.roundUnit = minute

ag1.sinks.sink1.hdfs.useLocalTimeStamp = true

# Configure the channel component
ag1.channels.channel1.type = memory
ag1.channels.channel1.capacity = 500000   ## channel capacity, in events
ag1.channels.channel1.transactionCapacity = 600  ## buffer needed by Flume transaction control: 600 events per transaction

# Bind the source and the sink to the channel
ag1.sources.source1.channels = channel1
ag1.sinks.sink1.channel = channel1
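
With round = true, roundValue = 10 and roundUnit = minute, the %H-%M part of the HDFS path is rounded down to the nearest 10 minutes, so a new directory is opened every 10 minutes, while rollInterval / rollSize / rollCount decide how many files appear inside each directory. A rough sketch of the resulting layout (dates and timestamps are illustrative):

hdfs dfs -ls -R /access_log
# /access_log/18-06-01/09-00/app_log.<timestamp1>.log
# /access_log/18-06-01/09-10/app_log.<timestamp2>.log
# /access_log/18-06-01/09-10/app_log.<timestamp3>.log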

bin/flume-ng agent -c conf/ -f dir-hdfs.conf -n ag1 -Dflume.root.logger=INFO,console
-c : configuration directory
-f : collection configuration file (the agent config above)
-n : agent name
-Dflume.root.logger=INFO,console : print logs to the console
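
A quick way to smoke-test the whole pipeline after starting the agent is to drop a uniquely named file into the spool directory and confirm it is marked finished and shows up on HDFS. A minimal sketch (file names are illustrative):

echo "hello flume" > /root/log_tmp/test.log.$(date +%s)
mv /root/log_tmp/test.log.* /root/log/
# after a short delay the file is renamed with the configured suffix, e.g. test.log.<timestamp>.FINISHED
ls /root/log/
# the events land under the time-bucketed path written by the HDFS sink
hdfs dfs -ls -R /access_log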

Another agent (here named a2) can be started the same way by pointing -f at a different configuration file; the file name suggests this one routes events through a Kafka channel:
bin/flume-ng agent -n a2 -f /usr/local/devtools/flume/apache-flume-1.7.0-bin/conf/flume-kafkaChannel.properties -Dflume.root.logger=INFO,console
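
The flume-kafkaChannel.properties file referenced above is not shown in this article. As a hedged sketch only, a Kafka channel definition in Flume 1.7 typically looks something like the following (agent and component names, broker address and topic are assumptions, not the author's actual file):

# hypothetical sketch of a Kafka channel definition for agent a2
a2.channels = c1
a2.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a2.channels.c1.kafka.bootstrap.servers = hdp-01:9092
a2.channels.c1.kafka.topic = access_log
a2.channels.c1.kafka.consumer.group.id = flume-a2
a2.channels.c1.parseAsFlumeEvent = true

A source and sink would still need to be defined and bound to c1 in the same file for the agent to run.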
