kafka學習(二)---- Kafka簡單的Java版本的Hello World實例

kafka學習(二)---- Kafka簡單的Java版本的Hello World實例

源碼git地址:http://git.oschina.net/zhengweishan/Kafka_study_demohtml

github下載地址java

一、開發環境

我使用的是官網的kafka_2.11-0.10.0.0版本,最新的是kafka_2.11-0.10.0.1版本,你們自行下載安裝配置。點擊進入下載地址點擊進入如何win下配置開發環境 ##二、 建立項目 ## 兩種方式:git

(a)普通的方式建立github

注意:開發時候,須要將下載kafka-2.11-0.10.0.0.jar包加入到classpath下面,這個包包含了全部Kafka的api的實現。因爲kafka是使用Scala編寫的,因此可能下載的kafka中的libs文件中的kafka-2.11-0.10.0.0.jar放到項目中不能用,並且還依賴scala-library-2.11.8.jar,因此推薦使用第二種方式構建項目。apache

項目結構圖:api

(b)maven構建項目 maven下載配置這裏再也不敘述,請參看:eclipse建立maven多模塊項目中有關maven的介紹。好處在於不用本身去添加依賴了,maven本身幫咱們加載依賴。session

項目結構圖:多線程

三、實例源碼

3.1 生產者

package com.kafka.demo;

import java.util.Date;
import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

/**
 * [@see](http://my.oschina.net/weimingwei) https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example
 * [@see](http://my.oschina.net/weimingwei) http://kafka.apache.org/documentation.html#producerapi
 * [@author](http://my.oschina.net/arthor) wesley
 *
 */
public class ProducerDemo {
	@SuppressWarnings("deprecation")
	public static void main(String[] args) {
		int events = 20;
		// [@see](http://my.oschina.net/weimingwei) http://kafka.apache.org/08/configuration.html-- 3.3 Producer
		// Configs
		// @see http://kafka.apache.org/documentation.html#producerconfigs
		// 設置配置屬性
		Properties props = new Properties();
		props.put("metadata.broker.list", "127.0.0.1:9092"); // 配置kafka的IP和端口
		props.put("serializer.class", "kafka.serializer.StringEncoder");
		// key.serializer.class默認爲serializer.class
		props.put("key.serializer.class", "kafka.serializer.StringEncoder");
		// 可選配置,若是不配置,則使用默認的partitioner
		props.put("partitioner.class", "com.kafka.demo.PartitionerDemo");
		// 觸發acknowledgement機制,不然是fire and forget,可能會引發數據丟失
		// 值爲0,1,-1,能夠參考
		props.put("request.required.acks", "1");
		ProducerConfig config = new ProducerConfig(props);

		// 建立producer
		Producer<String, String> producer = new Producer<String, String>(config);
		// 產生併發送消息
		long start = System.currentTimeMillis();
		for (long i = 0; i < events; i++) {
			long runtime = new Date().getTime();
			String ip = "192.168.1." + i;
			String msg = runtime + "--www.kafkademo.com--" + ip;
			// 若是topic不存在,則會自動建立,默認replication-factor爲1,partitions爲0
			KeyedMessage<String, String> data = new KeyedMessage<String, String>("page_visits", ip, msg);
			System.out.println("-----Kafka Producer----createMessage----" + data);
			producer.send(data);
		}
		System.out.println("Time consuming:" + (System.currentTimeMillis() - start));
		// 關閉producer
		producer.close();
	}
}

3.2 生產者須要配置的Partition類

package com.kafka.demo;

import kafka.producer.Partitioner;
import kafka.utils.VerifiableProperties;

@SuppressWarnings("deprecation")
public class PartitionerDemo implements Partitioner {
	
    public PartitionerDemo (VerifiableProperties props) {
 
    }
 
    public int partition(Object key, int a_numPartitions) {
        int partition = 0;
        String stringKey = (String) key;
        int offset = stringKey.lastIndexOf('.');
        if (offset > 0) {
           partition = Integer.parseInt( stringKey.substring(offset+1)) % a_numPartitions;
        }
       return partition;
  }
 
}

運行以後的效果:

查看控制檯:

紅色部分就是新生成的待消費的信息。併發

3.3 消費者(單線程實例)

package com.kafka.demo;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
/**
 * @see http://kafka.apache.org/documentation.html#consumerapi
 * @see https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
 * @see https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
 * @author wesley
 *
 */
public class ConsumerSimpleDemo extends Thread {
	// 消費者鏈接
	private final ConsumerConnector consumer;
	// 要消費的話題
	private final String topic;

	public ConsumerSimpleDemo(String topic) {
		consumer = kafka.consumer.Consumer.createJavaConsumerConnector(createConsumerConfig());
		this.topic = topic;
	}

	// 配置相關信息
	private static ConsumerConfig createConsumerConfig() {
		Properties props = new Properties();
		// props.put("zookeeper.connect","localhost:2181,10.XX.XX.XX:2181,10.XX.XX.XX:2181");
		// 配置要鏈接的zookeeper地址與端口
		props.put("zookeeper.connect", "127.0.0.1:2181");
		// 配置zookeeper的組id
		props.put("group.id", "group-1");
		// 配置zookeeper鏈接超時間隔
		props.put("zookeeper.session.timeout.ms", "10000");
		// 配置zookeeper異步執行時間
		props.put("zookeeper.sync.time.ms", "200");
		// 配置自動提交時間間隔
		props.put("auto.commit.interval.ms", "1000");
		return new ConsumerConfig(props);
	}

	public void run() {

		Map<String, Integer> topickMap = new HashMap<String, Integer>();
		topickMap.put(topic, 1);
		Map<String, List<KafkaStream<byte[], byte[]>>> streamMap = consumer.createMessageStreams(topickMap);

		KafkaStream<byte[], byte[]> stream = streamMap.get(topic).get(0);
		ConsumerIterator<byte[], byte[]> it = stream.iterator();
		System.out.println("*********Results********");
		while (true) {
			if (it.hasNext()) {
				// 打印獲得的消息
				System.err.println(Thread.currentThread() + " get data:" + new String(it.next().message()));
			}
			try {
				Thread.sleep(1000);
			} catch (InterruptedException e) {
				e.printStackTrace();
			}
		}
	}
	
	public static void main(String[] args) {
		ConsumerSimpleDemo consumerThread = new ConsumerSimpleDemo("page_visits");
		consumerThread.start();
	}
}

運行以後的效果:

3.4 消費者(線程池實例)

package com.kafka.demo;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

/* https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
 * http://kafka.apache.org/documentation.html#consumerapi
 */
public class ConsumerDemo {
	private final ConsumerConnector consumer;
	private final String topic;
	private ExecutorService executor;

	public ConsumerDemo(String a_zookeeper, String a_groupId, String a_topic) {
		consumer = Consumer.createJavaConsumerConnector(createConsumerConfig(a_zookeeper,a_groupId));
		this.topic = a_topic;
	}

	public void shutdown() {
		if (consumer != null)
			consumer.shutdown();
		if (executor != null)
			executor.shutdown();
		try {
            if (!executor.awaitTermination(5000, TimeUnit.MILLISECONDS)) {
                System.out.println("Timed out waiting for consumer threads to shut down, exiting uncleanly");
            }
        } catch (InterruptedException e) {
            System.out.println("Interrupted during shutdown, exiting uncleanly");
        }
	}

	public void run(int numThreads) {
		System.out.println("-----Consumers begin to execute-------");
		Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
		topicCountMap.put(topic, new Integer(numThreads));
		Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer
				.createMessageStreams(topicCountMap);
		List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);
		System.err.println("-----Need to consume content----"+streams);

		// now launch all the threads
		executor = Executors.newFixedThreadPool(numThreads);

		// now create an object to consume the messages
		int threadNumber = 0;
		for (final KafkaStream<byte[], byte[]> stream : streams) {
			System.out.println("-----Consumers begin to consume-------"+stream);
			executor.submit(new ConsumerMsgTask(stream, threadNumber));
			threadNumber++;
		}
	}

	private static ConsumerConfig createConsumerConfig(String a_zookeeper,
			String a_groupId) {
		Properties props = new Properties();
		// see http://kafka.apache.org/08/configuration.html --3.2 Consumer Configs
		// http://kafka.apache.org/documentation.html#consumerconfigs
		props.put("zookeeper.connect", a_zookeeper); //配置ZK地址
		props.put("group.id", a_groupId); //必填字段
		props.put("zookeeper.session.timeout.ms", "400");
		props.put("zookeeper.sync.time.ms", "200");
		props.put("auto.commit.interval.ms", "1000");
		return new ConsumerConfig(props);
	}

	public static void main(String[] arg) {
		String[] args = { "127.0.0.1:2181", "group-1", "page_visits", "10" };
		String zooKeeper = args[0];
		String groupId = args[1];
		String topic = args[2];
		int threads = Integer.parseInt(args[3]);

		ConsumerDemo demo = new ConsumerDemo(zooKeeper, groupId, topic);
		demo.run(threads);

		try {
			Thread.sleep(10000);
		} catch (InterruptedException ie) {

		}
		demo.shutdown();
	}
}

注意:這裏要調用處理消息的類eclipse

3.5 處理消息的類

package com.kafka.demo;

import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;

public class ConsumerMsgTask implements Runnable {

	private KafkaStream<byte[], byte[]> m_stream;
	private int m_threadNumber;

	public ConsumerMsgTask(KafkaStream<byte[], byte[]> stream, int threadNumber) {
		m_threadNumber = threadNumber;
		m_stream = stream;
	}

	public void run() {
		System.out.println("-----Consumers begin to consume-------");
		ConsumerIterator<byte[], byte[]> it = m_stream.iterator();
		while (it.hasNext()){
			System.out.println("Thread " + m_threadNumber + ": "+ new String(it.next().message()));
		}
		System.out.println("Shutting down Thread: " + m_threadNumber);
	}

}

運行效果圖:

實例到此結束,你們能夠多看看kafka的文檔,多瞭解一些kafka的知識,這裏只是演示了怎麼用,其實也都是文檔中的東西,本身總結了一下。

說明

爲何使用High Level Consumer?

有些場景下,從Kafka中讀取消息的邏輯不處理消息的offset,僅僅是獲取消息數據。High Level Consumer就提供了這種功能。首先要知道的是,High Level Consumer在ZooKeeper上保存最新的offset(從指定的分區中讀取)。這個offset基於consumer group名存儲。Consumer group名在Kafka集羣上是全局性的,在啓動新的consumer group的時候要當心集羣上沒有關閉的consumer。當一個consumer線程啓動了,Kafka會將它加入到相同的topic下的相同consumer group裏,而且觸發從新分配。在從新分配時,Kafka將partition分配給consumer,有可能會移動一個partition給另外一個consumer。若是老的、新的處理邏輯同時存在,有可能一些消息傳遞到了老的consumer上。使用High LevelConsumer首先要知道的是,它應該是多線程的。消費者線程的數量跟tipic的partition數量有關,它們之間有一些特定的規則:

  • 若是線程數量大於主題的分區數量,一些線程將得不到任何消息

  • 若是分區數大於線程數,一些線程將獲得多個分區的消息

  • 若是一個線程處理多個分區的消息,它接收到消息的順序是不能保證的。好比,先從分區10獲取了5條消息,從分區11獲取了6條消息,而後從分區10獲取了5條,緊接着又從分區10獲取了5條,雖然分區11還有消息。

  • 添加更多了同consumer group的consumer將觸發Kafka從新分配,某個分區原本分配給a線程的,重新分配後,有可能分配給了b線程。

四、參考資料:

  1. http://kafka.apache.org/documentation.html
  2. https://cwiki.apache.org/confluence/display/KAFKA/Index
相關文章
相關標籤/搜索