1. Kafka depends on ZooKeeper. Download the kafka_2.10-0.10.0.0 package and zookeeper-3.4.10 (this guide runs on Ubuntu).
2. Configure and start ZooKeeper
Configuration items: ZOOKEEPER_HOME and PATH. For reference:
export ZOOKEEPER_HOME=/home/t/source/zookeeper-3.4.10
export JAVA_HOME=/home/t/source/jdk1.8.0_121
export PATH=/home/t/source/jdk1.8.0_121/bin:/home/t/source/scala/scala-2.10.6/bin:/home/t/source/spark/spark-1.6.2-bin-hadoop2.6/bin:$PATH:$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/conf:/home/t/source/sbt/sbt/bin:/home/t/source/hadoop-2.6.4/bin
export HADOOP_HOME=/home/t/source/hadoop-2.6.4
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_HOME=/home/t/source/hadoop-2.6.4
export YARN_CONF_DIR=${YARN_HOME}/etc/hadoop
Edit zoo.cfg under zookeeper-3.4.10/conf (the file is zoo.cfg, not zoo.conf; copy it from zoo_sample.cfg if it does not exist yet).
Settings:
dataDir=/home/t/source/zookeeper-3.4.10/dataDir
dataLogDir=/home/t/source/zookeeper-3.4.10/dataLogDir
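For context, a minimal standalone zoo.cfg might look like the following sketch; aside from the two directory settings above, the values are the defaults shipped in zoo_sample.cfg:

```properties
# Minimal standalone ZooKeeper configuration (non-directory values
# follow the zoo_sample.cfg defaults).
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/t/source/zookeeper-3.4.10/dataDir
dataLogDir=/home/t/source/zookeeper-3.4.10/dataLogDir
clientPort=2181
```

clientPort=2181 matches the localhost:2181 address used by the Kafka commands below.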
Start ZooKeeper (from zookeeper-3.4.10/bin):
./zkServer.sh start
3. Configure and start Kafka
Modify the Kafka configuration (config/server.properties):
For external access to Kafka, set advertised.listeners=PLAINTEXT://x.x.x.x:9092 (replace x.x.x.x with the broker's externally reachable IP).
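Taken together, the listener-related lines in config/server.properties might look like this sketch (x.x.x.x stands for the public IP as above; the remaining values are stock defaults):

```properties
broker.id=0
# address the broker binds to locally
listeners=PLAINTEXT://0.0.0.0:9092
# address handed out to external clients
advertised.listeners=PLAINTEXT://x.x.x.x:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181
```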
Start Kafka:
./kafka-server-start.sh ../config/server.properties
Create a topic (a category of messages):
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Produce messages:
./kafka-console-producer.sh --broker-list localhost:9092 --topic test
Consume messages (--from-beginning reads the topic's entire data from the start):
./kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Describe a topic:
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic myTest4
End result:
Whatever is typed on the producer side is printed on the consumer side.
4. Set up a multi-broker cluster
Make several copies of server.properties and change broker.id in each; since this is a single-machine deployment, also change the listener port and log.dirs in each copy.
Copy server.properties and start multiple brokers:
bin/kafka-server-start.sh config/server-1.properties
bin/kafka-server-start.sh config/server-2.properties
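As a sketch, the extra property files might differ from the original server.properties only in these lines (ports 9093/9094 and the log directories are arbitrary single-machine choices):

```properties
# config/server-1.properties
broker.id=1
listeners=PLAINTEXT://:9093
log.dirs=/tmp/kafka-logs-1

# config/server-2.properties
broker.id=2
listeners=PLAINTEXT://:9094
log.dirs=/tmp/kafka-logs-2
```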
Create a topic:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic myTest2
Without writing any data, notice that the partition directory is created only under the log.dirs of broker 0 (the topic's single replica was assigned there).
Write some random data into it.
Since there are now three brokers, topics with replication-factor <= 3 can be created:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic myTest3
If the replication factor exceeds 3, the following error is thrown:
t@ubuntu:~/source/kafka_2.10-0.10.0.0$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 5 --partitions 1 --topic myTest2
Error while executing topic command : replication factor: 5 larger than available brokers: 3
[2017-09-05 17:24:55,153] ERROR kafka.admin.AdminOperationException: replication factor: 5 larger than available brokers: 3
    at kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:117)
    at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:403)
    at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:110)
    at kafka.admin.TopicCommand$.main(TopicCommand.scala:61)
    at kafka.admin.TopicCommand.main(TopicCommand.scala)
 (kafka.admin.TopicCommand$)
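The check behind this error can be sketched in Python. This is a simplified round-robin version of the replica assignment done in kafka.admin.AdminUtils.assignReplicasToBrokers (the real code additionally randomizes the starting broker and shifts follower placement); it shows why a replication factor above the broker count is impossible, since each replica of a partition must land on a distinct broker:

```python
# Simplified sketch of round-robin replica assignment: partition p's first
# replica goes to broker (p mod n), and the remaining replicas follow on
# successive brokers, wrapping around.

def assign_replicas(brokers, partitions, replication_factor):
    if replication_factor > len(brokers):
        # mirrors Kafka's AdminOperationException message
        raise ValueError(
            "replication factor: %d larger than available brokers: %d"
            % (replication_factor, len(brokers)))
    assignment = {}
    for p in range(partitions):
        first = p % len(brokers)
        # each subsequent replica lands on the next broker (wrapping around),
        # so all replicas of one partition sit on distinct brokers
        assignment[p] = [brokers[(first + i) % len(brokers)]
                         for i in range(replication_factor)]
    return assignment

print(assign_replicas([0, 1, 2], partitions=1, replication_factor=3))
```

With 3 brokers and replication factor 5, the guard raises before any assignment is attempted, which is the CLI failure shown above.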
Checking the log.dirs of all three brokers, each now contains a myTest3-0/ directory.
Use the describe topics command:
t@ubuntu:~/source/kafka_2.10-0.10.0.0$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic myTest3
Topic:myTest3  PartitionCount:1  ReplicationFactor:3  Configs:
    Topic: myTest3  Partition: 0  Leader: 2  Replicas: 2,0,1  Isr: 2,0,1
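To pull the Leader, Replicas, and Isr (in-sync replicas) fields out of that output programmatically, a small parsing sketch (the parser itself is hypothetical; the field names follow the describe output above):

```python
# Parse one partition line of `kafka-topics.sh --describe` output into a dict.
# Line format: "Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ..."

def parse_partition_line(line):
    fields = {}
    tokens = line.replace("\t", " ").split()
    i = 0
    while i < len(tokens) - 1:
        if tokens[i].endswith(":"):
            fields[tokens[i].rstrip(":")] = tokens[i + 1]
            i += 2
        else:
            i += 1
    return {
        "topic": fields["Topic"],
        "partition": int(fields["Partition"]),
        "leader": int(fields["Leader"]),  # broker id serving reads/writes
        "replicas": [int(b) for b in fields["Replicas"].split(",")],
        "isr": [int(b) for b in fields["Isr"].split(",")],  # in-sync replicas
    }

line = "Topic: myTest3  Partition: 0  Leader: 2  Replicas: 2,0,1  Isr: 2,0,1"
print(parse_partition_line(line))
```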
Since the relationship between replication-factor and partitions is a bit confusing, try this as well:
t@ubuntu:~/source/kafka_2.10-0.10.0.0$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 5 --topic myTest4
The result: all three brokers' log.dirs now contain the five partition directories (myTest4-0 through myTest4-4), since with replication factor 3 equal to the broker count, every broker holds a replica of every partition.
5. Kafka provides connect-standalone.sh in the bin directory to automatically import and export data.
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
connect-file-source.properties configures the import (source) connector class and its topic:
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test
connect-file-sink.properties configures the export (sink) connector class and its topic:
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test
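Conceptually, this standalone pipeline tails test.txt into the connect-test topic and appends every record to test.sink.txt. A rough in-memory simulation of that flow (the class names are illustrative, not the Kafka Connect API; a plain list stands in for the topic):

```python
# In-memory simulation of the FileStreamSource -> topic -> FileStreamSink
# flow configured above; real Kafka Connect moves the records through the
# broker instead of a list.

class FileStreamSourceSim:
    """Reads lines from an input file and publishes each as a record."""
    def __init__(self, path, topic):
        self.path, self.topic = path, topic
    def poll(self):
        with open(self.path) as f:
            for line in f:
                self.topic.append(line.rstrip("\n"))

class FileStreamSinkSim:
    """Appends every record on the topic to an output file."""
    def __init__(self, path, topic):
        self.path, self.topic = path, topic
    def put(self):
        with open(self.path, "a") as f:
            for record in self.topic:
                f.write(record + "\n")

topic = []  # stands in for connect-test
with open("test.txt", "w") as f:
    f.write("hello\nkafka connect\n")
FileStreamSourceSim("test.txt", topic).poll()
FileStreamSinkSim("test.sink.txt", topic).put()
print(open("test.sink.txt").read())
```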