Kafka Documentation, Part 1

Date: 2019-11-12

Tags: kafka, documentation. Category: Kafka


Getting Started
1.1 Introduction
Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
What does all that mean?
First let's review some basic messaging terminology:
    1. Kafka maintains feeds of messages in categories called topics.
    2. We'll call processes that publish messages to a Kafka topic producers.
    3. We'll call processes that subscribe to topics and process the feed of published messages consumers.
    4. Kafka is run as a cluster comprised of one or more servers, each of which is called a broker.
So, at a high level, producers send messages over the network to the Kafka cluster, which in turn serves them up to consumers.

Communication between the clients and the servers is done with a simple, high-performance, language-agnostic TCP protocol. We provide a Java client for Kafka, but clients are available in many languages.

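Kafka is built around the concept of a topic. Producers publish messages to a topic; each topic consists of multiple partitions, and each partition is an ordered sequence of messages that does not change once written.
     Kafka retains all messages, whether or not they have been consumed, and discards them after a configured period of time.
     Each consumer's position is recorded in ZooKeeper as an offset. The consumer controls this offset and can also reset it.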
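
The offset behaviour described above can be pictured with a small in-memory model. This is only a sketch: `PartitionLogSketch` and `OffsetConsumerSketch` are hypothetical names invented for illustration, not Kafka classes, and retention and ZooKeeper storage are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of a single partition; not Kafka's storage code.
class PartitionLogSketch {
    private final List<String> messages = new ArrayList<>();

    // Appends a message and returns the offset assigned to it.
    long append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    // Reads the message stored at the given offset.
    String read(long offset) {
        return messages.get((int) offset);
    }

    // Offset that the next appended message will receive.
    long endOffset() {
        return messages.size();
    }
}

// Hypothetical consumer that tracks its own position in the log.
class OffsetConsumerSketch {
    private final PartitionLogSketch log;
    private long position = 0;

    OffsetConsumerSketch(PartitionLogSketch log) {
        this.log = log;
    }

    // Returns the next message, or null when the consumer has caught up.
    String poll() {
        if (position >= log.endOffset()) {
            return null;
        }
        return log.read(position++);
    }

    // Resets the position, for example to re-read older messages.
    void seek(long offset) {
        position = offset;
    }
}
```

Because reading does not remove anything from the log, calling `seek(0)` lets the same consumer process the messages again.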
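     Each partition has one server that acts as the "leader" and zero or more servers that act as "followers". The leader handles all read and write requests for the partition, while the followers passively replicate the leader. If the leader fails, one of the followers automatically becomes the new leader. Each server acts as the leader for some of its partitions and as a follower for others, so that load is balanced across the cluster.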
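
As a rough illustration of the failover described above, the sketch below promotes the first remaining follower when the leader fails. `PartitionReplicasSketch` is a hypothetical class, and "pick the first follower" is an assumption made for brevity; it is not a description of how Kafka actually elects a leader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the replicas of one partition, identified by broker id.
class PartitionReplicasSketch {
    private int leader;
    private final List<Integer> followers;

    PartitionReplicasSketch(int leader, List<Integer> followers) {
        this.leader = leader;
        this.followers = new ArrayList<>(followers);
    }

    int leader() {
        return leader;
    }

    List<Integer> followers() {
        return new ArrayList<>(followers);
    }

    // Simulates a leader failure: the first remaining follower takes over.
    void failLeader() {
        if (followers.isEmpty()) {
            throw new IllegalStateException("no follower available to take over");
        }
        leader = followers.remove(0);
    }
}
```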
     Producers can choose the partition for a message based on the message key.
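
One common way to map a key to a partition is to hash it and take the result modulo the number of partitions, so that messages with the same key always land in the same partition. The sketch below assumes this scheme; `KeyPartitionerSketch` is a hypothetical helper and is not claimed to match the partitioner shipped with any Kafka client.

```java
// Hypothetical key-based partitioner: same key, same partition.
class KeyPartitionerSketch {
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod keeps the result in [0, numPartitions) even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

For example, `KeyPartitionerSketch.partitionFor("a", 4)` returns 1, because `"a".hashCode()` is 97 and 97 mod 4 is 1.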
     Kafka consumers support both messaging models: queuing and publish-subscribe.
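
Kafka covers both models through consumer groups: consumers that share a group divide the partitions among themselves (queue behaviour), while consumers in separate groups each receive every message (publish-subscribe behaviour). The sketch below shows the division within a single group. `GroupAssignmentSketch` is hypothetical, and the round-robin rule is an assumption chosen for simplicity, not Kafka's own assignment algorithm.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical assignment of partitions to the consumers of one group.
class GroupAssignmentSketch {
    // Spreads partitions 0..numPartitions-1 over the consumers, round-robin.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("a group needs at least one consumer");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }
}
```

With two consumers in one group and four partitions, each consumer owns two partitions. A consumer that is alone in its group owns all four, which is how each separate group ends up seeing the full feed.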

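     Broker Configs
     broker.id              Unique ID of the broker within the cluster.
     log.dirs               Directories in which the message logs are stored.
     port                   Port on which the broker accepts client requests.
     zookeeper.connect      Address of ZooKeeper.
     message.max.bytes      Maximum size of a message.
     num.network.threads    Number of threads that handle network requests.
     num.io.threads         Number of I/O threads that execute requests, including writes to disk.
     queued.max.requests    Maximum number of requests that can wait in the request queue.
     host.name              Hostname of the broker, as registered in ZooKeeper.
     advertised.host.name   Hostname given out to producers and consumers.
     num.partitions         Default number of partitions per topic.
     num.replica.fetchers   Number of threads a follower uses to replicate messages from leaders (not the number of replicas).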
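
To show how these settings fit together, the following sketch loads a small set of broker properties with `java.util.Properties`. The property names come from the list above; every value is an illustrative assumption for a single local broker, not a recommendation, and `BrokerConfigSketch` is a hypothetical class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for an example broker configuration.
class BrokerConfigSketch {
    // Illustrative values only; adjust them for a real deployment.
    static final String SERVER_PROPERTIES = String.join("\n",
            "broker.id=0",
            "log.dirs=/tmp/kafka-logs",
            "port=9092",
            "zookeeper.connect=localhost:2181",
            "message.max.bytes=1000000",
            "num.network.threads=3",
            "num.io.threads=8",
            "queued.max.requests=500",
            "num.partitions=1",
            "num.replica.fetchers=1");

    // Parses the text above in the standard .properties format.
    static Properties load() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(SERVER_PROPERTIES));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

In a real installation the same key=value lines would normally live in the broker's properties file rather than in a Java string.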

     Consumer:
     auto.commit.enable      Whether the consumer commits its offsets automatically.
     rebalance.max.retries   Maximum number of rebalance attempts when a new consumer joins.
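
A minimal sketch of a consumer configuration using these two settings. The `zookeeper.connect` and `group.id` entries are added on the assumption that a ZooKeeper-based consumer needs them; all values, and the class name `ConsumerConfigSketch`, are illustrative.

```java
import java.util.Properties;

// Hypothetical builder for an example consumer configuration.
class ConsumerConfigSketch {
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", "localhost:2181"); // assumed local ZooKeeper
        props.setProperty("group.id", "example-group");           // assumed group name
        props.setProperty("auto.commit.enable", "true");          // commit offsets automatically
        props.setProperty("rebalance.max.retries", "4");          // give up rebalancing after 4 attempts
        return props;
    }
}
```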
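     Producer:
     request.required.acks   Whether the producer waits for acknowledgement from the cluster, that is, confirmation that the data has reached the partition's replicas.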
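
The usual values of this setting are 0 (do not wait), 1 (wait for the leader) and -1 (wait for all in-sync replicas). The sketch below turns that rule into a small function. `RequiredAcksSketch` is a hypothetical helper that covers only these three values, and it models the rule rather than the producer's real code path.

```java
// Hypothetical helper that interprets a request.required.acks value.
class RequiredAcksSketch {
    // Returns how many replica acknowledgements the producer waits for.
    static int acksToWaitFor(int requestRequiredAcks, int inSyncReplicas) {
        switch (requestRequiredAcks) {
            case 0:
                return 0;              // fire and forget
            case 1:
                return 1;              // the leader only
            case -1:
                return inSyncReplicas; // every in-sync replica
            default:
                throw new IllegalArgumentException(
                        "unsupported value in this sketch: " + requestRequiredAcks);
        }
    }
}
```

Higher values trade latency for durability: with -1 a message is acknowledged only after every in-sync replica has it.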
