SolrCloud Wiki Translation (6): Near Real Time Searching, Replication, and Disaster Recovery

SolrCloud and Replication

Replication ensures redundancy for your data, and enables you to send an update request to any node in the shard. If that node is a replica, it will forward the request to the leader, which then forwards it to all existing replicas, using versioning to make sure every replica has the most up-to-date version. This architecture enables you to be certain that your data can be recovered in the event of a disaster, even if you are using Near Real Time searching.
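
To make that request flow concrete, here is a minimal sketch in Python (using the requests library) that posts a JSON update to an arbitrary node; the hostname, collection name, and document fields are placeholders, not values from the wiki. Whichever node receives the request, the document ends up on its shard leader and is forwarded from there to every replica.

    import requests  # third-party HTTP client

    # Any node in the cluster can accept the update; the hostname and
    # collection name below are placeholders.
    url = "http://solr-node1:8983/solr/collection1/update"

    docs = [{"id": "doc-1", "title": "hello solrcloud"}]

    # The receiving node routes each document to its shard leader, and the
    # leader forwards it to all replicas of that shard.
    resp = requests.post(url, json=docs, params={"commit": "true", "wt": "json"})
    resp.raise_for_status()
    print(resp.json())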

Near Real Time Searching

If you want to use the NearRealtimeSearch support, enable auto soft commits in your solrconfig.xml file before storing it into ZooKeeper. Otherwise you can send explicit soft commits to the cluster as you need.
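
For reference, a minimal excerpt of the relevant solrconfig.xml section might look like the following; the interval values are illustrative, not recommendations.

    <updateHandler class="solr.DirectUpdateHandler2">
      <!-- Hard commits flush data to stable storage; openSearcher=false keeps
           them from controlling search visibility. -->
      <autoCommit>
        <maxTime>60000</maxTime>
        <openSearcher>false</openSearcher>
      </autoCommit>
      <!-- Soft commits make newly indexed documents searchable (NRT). -->
      <autoSoftCommit>
        <maxTime>1000</maxTime>
      </autoSoftCommit>
    </updateHandler>

An explicit soft commit can also be requested on demand by adding the softCommit=true parameter to an update request.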

SolrCloud doesn't work very well with separated data clusters connected by an expensive pipe. The root problem is that SolrCloud's architecture sends documents to all the nodes in the cluster (on a per-shard basis), and that architecture is really dictated by the NRT functionality.

Imagine that you have a set of servers in China and one in the US that are aware of each other. Assuming 5 replicas, a single update to a shard may make multiple trips over the expensive pipe before it's all done (if, say, the shard leader sits in the US and several of its replicas sit in China, the leader must forward every update across the pipe once per remote replica), probably slowing indexing speed unacceptably.

So the SolrCloud recommendation for this situation is to maintain these clusters separately; nodes in China don't even know that nodes exist in the US and vice-versa. When indexing, you send the update request to one node in the US and one in China, and all the node-routing after that is local to the separate clusters. Requests can go to any node in either country and maintain a consistent view of the data.
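
A minimal sketch of the indexing side of this arrangement, assuming one hypothetical entry point per cluster: each update batch is delivered once to every cluster, and all further routing stays local.

    import requests  # third-party HTTP client

    # One entry point per independent cluster; hostnames and the collection
    # name are placeholders.
    CLUSTER_ENDPOINTS = [
        "http://us-solr1:8983/solr/collection1/update",
        "http://cn-solr1:8983/solr/collection1/update",
    ]

    def deliver(docs):
        """Send the same batch of documents to each cluster.

        Only this initial delivery crosses the expensive pipe; inside a
        cluster, the receiving node routes documents to the shard leaders.
        """
        for url in CLUSTER_ENDPOINTS:
            resp = requests.post(url, json=docs, params={"commitWithin": 10000})
            resp.raise_for_status()

    deliver([{"id": "doc-1", "title": "hello from the indexing program"}])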

However, if your US cluster goes down, you have to re-synchronize the downed cluster with up-to-date information from China. The process requires you to replicate the index from China to the repaired US installation and then get everything back up and working.

Disaster Recovery for an NRT system

Use of Near Real Time (NRT) searching affects the way that systems using SolrCloud behave during disaster recovery.

The procedure outlined below assumes that you are maintaining separate clusters, as described above. Consider, for example, an event in which the US cluster goes down (say, because of a hurricane), but the China cluster is intact. Disaster recovery consists of creating the new system and letting the intact cluster create a replica for each shard on it, then promoting those replicas to be leaders of the newly created US cluster.

Here are the steps to take:

  1. Take the downed system offline to all end users.
  2. Take the indexing process offline.
  3. Repair the system.
  4. Bring up one machine per shard in the repaired system as part of the ZooKeeper cluster on the good system, and wait for replication to happen, creating a replica on that machine. (Soft commits will not be repeated, but data will be pulled from the transaction logs if necessary.) See the status-check sketch after this list.

     Note: SolrCloud will automatically use old-style replication for the bulk load. By temporarily having only one replica, you'll minimize data transfer across a slow connection.

  5. Bring the machines of the repaired cluster down, and reconfigure them to be a separate ZooKeeper cluster again, optionally adding more replicas for each shard.
  6. Make the repaired system visible to end users again.
  7. Start the indexing program again, delivering updates to both systems.
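
For step 4, one way to watch the bulk load is to query the replication handler on the machine being brought up. The host and core names below are placeholders, and the sketch only reads a few fields of the status report; it is an illustration, not part of the wiki procedure.

    import requests  # third-party HTTP client

    # Host and core name are placeholders for the machine being brought up
    # in step 4; run this repeatedly to watch the bulk pull progress.
    STATUS_URL = "http://us-repaired1:8983/solr/collection1_shard1_replica1/replication"

    def replication_status():
        """Fetch the replication handler's 'details' report once."""
        resp = requests.get(STATUS_URL, params={"command": "details", "wt": "json"})
        resp.raise_for_status()
        details = resp.json().get("details", {})
        return {
            "indexVersion": details.get("indexVersion"),
            "generation": details.get("generation"),
            "indexSize": details.get("indexSize"),
        }

    print(replication_status())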


End of article.
