SolrCloud Wiki Translation (2): Nodes, Cores, Clusters & Leaders

Nodes and Cores

In SolrCloud, a node is a Java Virtual Machine instance running Solr, commonly called a server. Each Solr core can also be considered a node. Any node can contain both an instance of Solr and various kinds of data.

A Solr core is basically an index of the text and fields found in documents. A single Solr instance can contain multiple "cores", which are separate from each other based on local criteria. It might be that they are going to provide different search interfaces to users (customers in the US and customers in Canada, for example), or they have security concerns (some users cannot have access to some documents), or the documents are really different and just won't mix well in the same index (a shoe database and a dvd database).

When you start a new core in SolrCloud mode, it registers itself with ZooKeeper. This involves creating an Ephemeral node that will go away if the Solr instance goes down, as well as registering information about the core and how to contact it (such as the base Solr URL, core name, etc.). Smart clients and nodes in the cluster can use this information to determine who they need to talk to in order to fulfill a request.
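
A quick way to watch this registration is to list the ephemeral entries under /live_nodes with the standard ZooKeeper CLI. This is a minimal sketch; it assumes ZooKeeper is reachable at localhost:2181 and that zkCli.sh is on the path:

zkCli.sh -server localhost:2181 ls /live_nodes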

New Solr cores may also be created and associated with a collection via CoreAdmin. Additional cloud-related parameters are discussed in the Parameter Reference page. Terms used for the CREATE action are:

  • collection: the name of the collection to which this core belongs. Default is the name of the core.
  • shard: the shard id this core represents. (Optional: normally you want to be auto-assigned a shard id.)
  • collection.<param>=<value>: causes a property of <param>=<value> to be set if a new collection is being created. For example, use collection.configName=<configname> to point to the config for a new collection.

For example:

curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore&collection=collection1&shard=shard2'
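
If the core's collection does not exist yet, collection.<param>=<value> properties can be passed in the same call. A hedged variant (the core, collection, and config names here are hypothetical, and the myconf config set must already be in ZooKeeper):

curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=mycore2&collection=collection2&collection.configName=myconf'
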
Clusters

A cluster is a set of Solr nodes managed by ZooKeeper as a single unit. When you have a cluster, you can always make requests to the cluster and if the request is acknowledged, you can be sure that it will be managed as a unit and be durable, i.e., you won't lose data. Updates can be seen right after they are made and the cluster can be expanded or contracted.

Creating a Cluster

A cluster is created as soon as you have more than one Solr instance registered with ZooKeeper. The section Getting Started with SolrCloud reviews how to set up a simple cluster.

Resizing a Cluster

Clusters contain a settable number of shards. You set the number of shards for a new cluster by passing a system property, numShards, when you start up Solr. The numShards parameter must be passed on the first startup of any Solr node, and is used to auto-assign which shard each instance should be part of. Once you have started up more Solr nodes than numShards, the nodes will create replicas for each shard, distributing them evenly across the nodes, as long as they all belong to the same collection.
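
With the Jetty-based example distribution, for instance, the property goes on the command line of the very first node. This is only a sketch, assuming an external ZooKeeper at localhost:2181:

java -DzkHost=localhost:2181 -DnumShards=2 -jar start.jar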

To add more cores to your collection, simply start the new core. You can do this at any time and the new core will sync its data with the current replicas in the shard before becoming active.
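
For example, a new core might be attached to an existing collection with a CoreAdmin CREATE call like the following (the core name is hypothetical; the shard assignment is left to the cluster):

curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=collection1_replica2&collection=collection1'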

You can also avoid numShards and manually assign a core a shard ID if you choose.

The number of shards determines how the data in your index is broken up, so you cannot change the number of shards of the index after initially setting up the cluster.

However, you do have the option of breaking your index into multiple shards to start with, even if you are only using a single machine. You can then expand to multiple machines later. To do that, follow these steps (a command-level sketch follows the list):

  1. Set up your collection by hosting multiple cores on a single physical machine (or group of machines). Each of these cores will be the leader for its shard. 
  2. When you're ready, you can migrate shards onto new machines by starting up a new replica for a given shard on each new machine. 
  3. Remove the shard from the original machine. ZooKeeper will promote the replica to the leader for that shard.
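
In CoreAdmin terms, the three steps might look roughly like this (hostnames, collection name, and core names are hypothetical):

# Step 1: two shards hosted on one machine
curl 'http://host1:8983/solr/admin/cores?action=CREATE&name=core_shard1&collection=mycoll&shard=shard1'
curl 'http://host1:8983/solr/admin/cores?action=CREATE&name=core_shard2&collection=mycoll&shard=shard2'
# Step 2: replicate shard2 onto a new machine
curl 'http://host2:8983/solr/admin/cores?action=CREATE&name=core_shard2_replica&collection=mycoll&shard=shard2'
# Step 3: remove the original shard2 core; its replica is promoted to leader
curl 'http://host1:8983/solr/admin/cores?action=UNLOAD&core=core_shard2'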

Leaders and Replicas

The concept of a leader is similar to that of master when thinking of traditional Solr replication. The leader is responsible for making sure the replicas are up to date with the same information stored in the leader.

However, with SolrCloud, you don't simply have one master and one or more "slaves", instead you likely have distributed your search and index traffic to multiple machines. If you have bootstrapped Solr with numShards=2, for example, your indexes are split across both shards. In this case, both shards are considered leaders. If you start more Solr nodes after the initial two, these will be automatically assigned as replicas for the leaders.

Replicas are assigned to shards in the order they are started the first time they join the cluster. This is done in a round-robin manner, unless the new node is manually assigned to a shard with the shardId parameter during startup. This parameter is used as a system property, as in -DshardId=1, the value of which is the ID number of the shard the new node should be attached to.
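
For instance, a node could be pinned to shard 1 at startup like this (again a sketch for the Jetty example layout, with ZooKeeper assumed at localhost:2181):

java -DzkHost=localhost:2181 -DshardId=1 -jar start.jar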

On subsequent restarts, each node joins the same shard that it was assigned to the first time the node was started (whether that assignment happened manually or automatically). A node that was previously a replica, however, may become the leader if the previously assigned leader is not available.

Consider this example:

  • Node A is started with the bootstrap parameters, pointing to a stand-alone ZooKeeper, with the numShards parameter set to 2.
  • Node B is started and pointed to the stand-alone ZooKeeper. (Startup commands for both nodes are sketched below.)
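
Under the Solr 4.x Jetty example layout, these two startups might look like the following sketch (ports, paths, and the config name are assumptions):

# Node A: first startup uploads the config and fixes the shard count
java -DzkHost=localhost:2181 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -jar start.jar
# Node B: joins the same cluster on another port
java -Djetty.port=7574 -DzkHost=localhost:2181 -jar start.jar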

Nodes A and B are both shards, and have fulfilled the 2 shard slots we defined when we started Node A. If we look in the Solr Admin UI, we'll see that both nodes are considered leaders (indicated with a solid black circle).

  • Node C is started and pointed to the stand-alone ZooKeeper.

Node C will automatically become a replica of Node A because we didn't specify any other shard for it to belong to, and it cannot become a new shard because we only defined two shards and those have both been taken.

  • Node D is started and pointed to the stand-alone ZooKeeper.

Node D will automatically become a replica of Node B, for the same reasons why Node C is a replica of Node A.

Upon restart, suppose that Node C starts before Node A. What happens? Node C will become the leader, while Node A becomes a replica of Node C.
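
One way to confirm who is leading after such a restart is to read the cluster state out of ZooKeeper. In Solr 4.x the state lives in a single clusterstate.json at the ZooKeeper root; localhost:2181 is again an assumption:

zkCli.sh -server localhost:2181 get /clusterstate.json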
