[Original] RabbitMQ Official Documentation Translation -- Clustering Guide

      To make things easier to use at work, I spent some spare weekend time translating the RabbitMQ clustering documentation. Given my limited ability, the translation inevitably contains mistakes; questions and corrections are welcome. The original text and the translation are presented together, paragraph by paragraph.

Official source: http://www.rabbitmq.com/clustering.html

============== Divider ==============

Clustering Guide
Cluster Configuration


A RabbitMQ broker is a logical grouping of one or several Erlang nodes, each running the RabbitMQ application and sharing users, virtual hosts, queues, exchanges, etc. Sometimes we refer to the collection of nodes as a cluster.

A RabbitMQ broker is a logical grouping of one or more Erlang nodes; each node runs the RabbitMQ application and shares users, vhosts, queues, exchanges and so on. We usually call such a collection of nodes a cluster.


All data/state required for the operation of a RabbitMQ broker is replicated across all nodes, for reliability and scaling, with full ACID properties. An exception to this are message queues, which by default reside on the node that created them, though they are visible and reachable from all nodes. To replicate queues across nodes in a cluster, see the documentation on high availability (note that you will need a working cluster first).

All data/state needed to operate a RabbitMQ broker is replicated across every node, both for reliability and for scalability, with full ACID properties. One exception is message queues: by default a queue resides only on the node that created it, although it is visible and reachable from all nodes. To replicate a queue across the nodes of a cluster, see the [ high availability ] documentation (note that you will first need a working cluster).


RabbitMQ clustering does not tolerate network partitions well, so it should not be used over a WAN. The shovel or federation plugins are better solutions for connecting brokers across a WAN.

RabbitMQ clustering does not handle network partitions well, so a RabbitMQ cluster should not be used over a WAN. The [shovel] or [federation] plugins are better solutions for connecting brokers across a WAN.


(Translator's note) A network partition occurs when all network connections between two groups of nodes in a system fail at the same time. When this happens, each side of the split may restart applications on its own, which can lead to duplicated services or split-brain. Split-brain arises when two independent systems configured in the same cluster each assume exclusive access to a given resource (usually a file system or volume). The most serious problem caused by a network partition is the damage it can do to data on shared disks.


The composition of a cluster can be altered dynamically. All RabbitMQ brokers start out as running on a single node. These nodes can be joined into clusters, and subsequently turned back into individual brokers again.

The composition of a cluster can be changed dynamically. Every RabbitMQ broker starts out running on a single node. These nodes can be joined into a cluster and can later be turned back into individual brokers again.


RabbitMQ brokers tolerate the failure of individual nodes. Nodes can be started and stopped at will.

A RabbitMQ broker tolerates the failure of individual nodes; nodes can be started and stopped at will.


A node can be a disk node or a RAM node. (Note: disk and disc are used interchangeably. Configuration syntax or status messages normally use disc.) RAM nodes keep their state only in memory (with the exception of queue contents, which can reside on disc if the queue is persistent or too big to fit in memory). Disk nodes keep state in memory and on disk. As RAM nodes don't have to write to disk as much as disk nodes, they can perform better. However, note that since the queue data is always stored on disc, the performance improvements will affect only resources management (e.g. adding/removing queues, exchanges, or vhosts), but not publishing or consuming speed. Because state is replicated across all nodes in the cluster, it is sufficient (but not recommended) to have just one disk node within a cluster, to store the state of the cluster safely.

A node is either a disk node or a RAM node. (Note: "disk" and "disc" are used interchangeably; configuration syntax and status messages normally use "disc".) A RAM node keeps its state only in memory (except for queue contents, which can live on disc if the queue is persistent or too large to fit in memory). A disk node keeps state both in memory and on disk. Since a RAM node does not have to write to disk as much as a disk node, it can perform better. Note, however, that because queue data is always stored on disc, the performance gain only affects resource management (for example adding/removing queues, exchanges, or vhosts), not publishing or consuming speed. Because the state is replicated across all nodes in the cluster, having just one disk node in a cluster is sufficient (though not recommended) to store the cluster state safely.


Clustering transcript
A clustering walkthrough


The following is a transcript of setting up and manipulating a RabbitMQ cluster across three machines - rabbit1, rabbit2, rabbit3, with two of the machines replicating data on ram and disk, and the other replicating data in ram only.

Below is a walkthrough of setting up and manipulating a RabbitMQ cluster across three machines, rabbit1, rabbit2 and rabbit3, two of which replicate data both on disk and in RAM, while the third replicates data in RAM only.


We assume that the user is logged into all three machines, that RabbitMQ has been installed on the machines, and that the rabbitmq-server and rabbitmqctl scripts are in the user's PATH.

We assume the user is already logged in on all three machines, that RabbitMQ has been installed on them, and that the rabbitmq-server and rabbitmqctl scripts are on the user's PATH.


Initial setup
Initial setup


Erlang nodes use a cookie to determine whether they are allowed to communicate with each other - for two nodes to be able to communicate they must have the same cookie.

Erlang nodes use a cookie value to decide whether they are allowed to talk to each other; for two nodes to communicate they must have the same cookie.


The cookie is just a string of alphanumeric characters. It can be as long or short as you like.

The cookie is just a string of letters and digits; make it as long or as short as you like.


Erlang will automatically create a random cookie file when the RabbitMQ server starts up. This will be typically located in /var/lib/rabbitmq/.erlang.cookie on Unix systems and C:\Users\Current User\.erlang.cookie or C:\Documents and Settings\Current User\.erlang.cookie on Windows systems. The easiest way to proceed is to allow one node to create the file, and then copy it to all the other nodes in the cluster.

Erlang automatically creates a file containing a random cookie when the RabbitMQ server starts up. It is normally located at /var/lib/rabbitmq/.erlang.cookie on Unix systems, and at C:\Users\Current User\.erlang.cookie or C:\Documents and Settings\Current User\.erlang.cookie on Windows. The easiest approach is to let one node create the file and then copy it by hand to every other node in the cluster.
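For example, on a Unix system the copy could be done with scp. This is only a sketch: it assumes the default cookie location, SSH access between the machines, and that the RabbitMQ servers are stopped while the cookie is replaced; the file must remain readable only by the user that runs RabbitMQ.

rabbit1$ scp /var/lib/rabbitmq/.erlang.cookie rabbit2:/var/lib/rabbitmq/.erlang.cookie
rabbit1$ scp /var/lib/rabbitmq/.erlang.cookie rabbit3:/var/lib/rabbitmq/.erlang.cookie
rabbit2$ chmod 400 /var/lib/rabbitmq/.erlang.cookie
rabbit3$ chmod 400 /var/lib/rabbitmq/.erlang.cookie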


As an alternative, you can insert the option "-setcookie cookie" in the erl call in the rabbitmq-server and rabbitmqctl scripts.

Alternatively, you can add the option "-setcookie cookie" to the erl call inside the rabbitmq-server and rabbitmqctl scripts.
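The flag itself is plain Erlang. As an illustration only (the cookie value topsecret and the host name rabbit1 are placeholders), two Erlang shells started with the same cookie can reach each other:

rabbit1$ erl -sname foo -setcookie topsecret -detached
rabbit1$ erl -sname bar -setcookie topsecret
(bar@rabbit1)1> net_adm:ping('foo@rabbit1').
pong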


Starting independent nodes
Starting each node independently


Clusters are set up by re-configuring existing RabbitMQ nodes into a cluster configuration. Hence the first step is to start RabbitMQ on all nodes in the normal way:

To build a cluster, each existing RabbitMQ node has to be re-configured into the cluster configuration. The first step is therefore to start RabbitMQ on every node in the normal way:
rabbit1$ rabbitmq-server -detached
rabbit2$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached


This creates three independent RabbitMQ brokers, one on each node, as confirmed by the cluster_status command:

This creates three independent RabbitMQ brokers, one per node, which can be confirmed with the cluster_status command:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]}]},{running_nodes,[rabbit@rabbit1]}]
...done.
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit2]}]},{running_nodes,[rabbit@rabbit2]}]
...done.
rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit3]}]},{running_nodes,[rabbit@rabbit3]}]
...done.

The node name of a RabbitMQ broker started from the rabbitmq-server shell script is rabbit@shorthostname, where the short node name is lower-case (as in rabbit@rabbit1, above). If you use the rabbitmq-server.bat batch file on Windows, the short node name is upper-case (as in rabbit@RABBIT1). When you type node names, case matters, and these strings must match exactly.

The node name of a RabbitMQ broker started with the rabbitmq-server shell script has the form rabbit@shorthostname, where the short node name is lower-case (as in rabbit@rabbit1 above). If you run the rabbitmq-server.bat batch file on Windows, the short node name is upper-case (as in rabbit@RABBIT1). Case matters when you type node names, and the strings must match exactly.


Creating the cluster
Creating the cluster


In order to link up our three nodes in a cluster, we tell two of the nodes, say rabbit@rabbit2 and rabbit@rabbit3, to join the cluster of the third, say rabbit@rabbit1.

To link our three nodes into a cluster, we tell two of them, say rabbit@rabbit2 and rabbit@rabbit3, to join the cluster of the third, say rabbit@rabbit1.


We first join rabbit@rabbit2 as a ram node in a cluster with rabbit@rabbit1. To do that, on rabbit@rabbit2 we stop the RabbitMQ application and join the rabbit@rabbit1 cluster enabling the --ram flag, and restart the RabbitMQ application. Note that joining a cluster implicitly resets the node, thus removing all resources and data that were previously present on that node.

We first join rabbit@rabbit2 to the cluster of rabbit@rabbit1 as a RAM node. To do that we stop the RabbitMQ application on rabbit@rabbit2, join the rabbit@rabbit1 cluster with the --ram flag enabled, and then restart the RabbitMQ application. Note that joining a cluster implicitly resets the node, removing all resources and data previously held on it.

rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl join_cluster --ram rabbit@rabbit1
Clustering node rabbit@rabbit2 with [rabbit@rabbit1] ...done.
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.

We can see that the two nodes are joined in a cluster by running the cluster_status command on either of the nodes:

Running the cluster_status command on either rabbit@rabbit1 or rabbit@rabbit2 shows that the two nodes are now in the same cluster:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]}]
...done.
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}]
...done.

Now we join rabbit@rabbit3 as a disk node to the same cluster. The steps are identical to the ones above, except that we omit the --ram flag in order to turn it into a disk rather than ram node. This time we'll cluster to rabbit2 to demonstrate that the node chosen to cluster to does not matter - it is enough to provide one online node and the node will be clustered to the cluster that the specified node belongs to.

Now we join rabbit@rabbit3 to the same cluster as a disk node. The steps are the same as above, except that we omit the "--ram" flag so that it joins as a disk node instead of a RAM node. This time we cluster to rabbit2 (which is in the same cluster as rabbit1), to show that it does not matter which node we choose to cluster to: as long as we supply one reachable node that is already in a cluster, the joining node becomes part of that cluster.

rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl join_cluster rabbit@rabbit2
Clustering node rabbit@rabbit3 with rabbit@rabbit2 ...done.
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.

We can see that the three nodes are joined in a cluster by running the cluster_status command on any of the nodes:

Running the cluster_status command on any of the nodes shows that all three nodes are now in the same cluster:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit3]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit3,rabbit@rabbit2,rabbit@rabbit1]}]
...done.
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit3]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit3,rabbit@rabbit1,rabbit@rabbit2]}]
...done.
rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit3,rabbit@rabbit1]},{ram,[rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]}]
...done.

By following the above steps we can add new nodes to the cluster at any time, while the cluster is running.

Following the steps above, we can add new nodes to the cluster at any time while the cluster is running.


Changing node types
Changing a node's type


We can change the type of a node from ram to disk and vice versa. Say we wanted to reverse the types of rabbit@rabbit2 and rabbit@rabbit3, turning the former from a ram node into a disk node and the latter from a disk node into a ram node. To do that we can use the change_cluster_node_type command. The node must be stopped first.

We can change a node's type from RAM to disk or the other way round. Say we want to swap the types of rabbit@rabbit2 and rabbit@rabbit3, turning the former into a disk node and the latter into a RAM node. We use the change_cluster_node_type command for this; the node has to be stopped first.

rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl change_cluster_node_type disc
Turning rabbit@rabbit2 into a disc node ...
...done.
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.

rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl change_cluster_node_type ram
Turning rabbit@rabbit3 into a ram node ...
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.


Restarting cluster nodes
Restarting nodes in a cluster


Nodes that have been joined to a cluster can be stopped at any time. It is also ok for them to crash. In both cases the rest of the cluster continues operating unaffected, and the nodes automatically "catch up" with the other cluster nodes when they start up again.

Nodes that have joined a cluster can be stopped at any time, and it is also fine if they crash. In both cases the rest of the cluster continues to run unaffected, and the stopped or crashed nodes automatically catch up with the other cluster nodes when they start again.


We shut down the nodes rabbit@rabbit1 and rabbit@rabbit3 and check on the cluster status at each step:

We shut down the nodes rabbit@rabbit1 and rabbit@rabbit3 and check the cluster status at each step:

rabbit1$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit1 ...done.

rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit3,rabbit@rabbit2]}]
...done.

rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit2,rabbit@rabbit1]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit3]}]
...done.

rabbit3$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit3 ...done.

rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit2]}]
...done.

Now we start the nodes again, checking on the cluster status as we go along:

Now we start the nodes again, checking the cluster status as we go:

rabbit1$ rabbitmq-server -detached

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]}]
...done.

rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}]
...done.

rabbit3$ rabbitmq-server -detached

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]}]
...done.

rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2,rabbit@rabbit3]}]
...done.

rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit2,rabbit@rabbit1]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1,rabbit@rabbit3]}]
...done.

There are some important caveats:

There are a few important caveats:


At least one disk node should be running at all times to prevent data loss. RabbitMQ will prevent the creation of a RAM-only cluster in many situations, but it still won't stop you from stopping and forcefully resetting all the disc nodes, which will lead to a RAM-only cluster. Doing this is not advisable and makes losing data very easy.

At least one disk node should be running at all times to prevent data loss. RabbitMQ will prevent the creation of a RAM-only cluster in many situations, but it cannot stop you from stopping all the disk nodes or forcefully resetting them, which indirectly leaves you with a RAM-only cluster. Doing so is unwise and makes it very easy to lose data.


When the entire cluster is brought down, the last node to go down must be the first node to be brought online. If this doesn't happen, the nodes will wait 30 seconds for the last disc node to come back online, and fail afterwards. If the last node to go offline cannot be brought back up, it can be removed from the cluster using the forget_cluster_node command - consult the rabbitmqctl manpage for more information.

When the entire cluster goes down, the last node to go down must be the first node to be brought back online. If this is not the case, the nodes will wait 30 seconds for the last disc node to come back online and then fail. If the last node to go offline cannot be brought back, it can be removed from the cluster with the forget_cluster_node command; consult the rabbitmqctl manpage for more information.


Breaking up a cluster
Breaking up the cluster


Nodes need to be removed explicitly from a cluster when they are no longer meant to be part of it. We first remove rabbit@rabbit3 from the cluster, returning it to independent operation. To do that, on rabbit@rabbit3 we stop the RabbitMQ application, reset the node, and restart the RabbitMQ application.

When nodes are no longer meant to be part of a cluster, they need to be removed from it explicitly. We first remove rabbit@rabbit3 from the cluster, returning it to independent operation. To do that, on rabbit@rabbit3 we stop the RabbitMQ application, reset the node, and restart the RabbitMQ application.

rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl reset
Resetting node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.

Note that it would have been equally valid to list rabbit@rabbit3 as a node.

Note that it would have been equally valid to name rabbit@rabbit3 explicitly as the node to operate on.


Running the cluster_status command on the nodes confirms that rabbit@rabbit3 now is no longer part of the cluster and operates independently:

Running the cluster_status command on the nodes confirms that rabbit@rabbit3 is no longer part of the cluster and now operates independently:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit2,rabbit@rabbit1]}]
...done.

rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2]}]
...done.

rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit3]}]},{running_nodes,[rabbit@rabbit3]}]
...done.

We can also remove nodes remotely. This is useful, for example, when having to deal with an unresponsive node. We can for example remove rabbit@rabbit1 from rabbit@rabbit2.

We can also remove a node remotely, which is useful, for example, when dealing with an unresponsive node. For instance, we can remove rabbit@rabbit1 from rabbit@rabbit2.

rabbit1$ rabbitmqctl stop_app
Stopping node rabbit@rabbit1 ...done.

rabbit2$ rabbitmqctl forget_cluster_node rabbit@rabbit1
Removing node rabbit@rabbit1 from cluster ...
...done.

Note that rabbit1 still thinks it's clustered with rabbit2, and trying to start it will result in an error. We will need to reset it to be able to start it again.

Note that rabbit1 still believes it is clustered with rabbit2, so trying to start it results in an error. We have to reset it before it can be started again.

rabbit1$ rabbitmqctl start_app
Starting node rabbit@rabbit1 ...
Error: inconsistent_cluster: Node rabbit@rabbit1 thinks it's clustered with node rabbit@rabbit2, but rabbit@rabbit2 disagrees

rabbit1$ rabbitmqctl reset
Resetting node rabbit@rabbit1 ...done.

rabbit1$ rabbitmqctl start_app
Starting node rabbit@rabbit1 ...
...done.

The cluster_status command now shows all three nodes operating as independent RabbitMQ brokers:

Running the cluster_status command now shows all three nodes operating as independent RabbitMQ brokers:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1]}]},{running_nodes,[rabbit@rabbit1]}]
...done.

rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit2]}]},{running_nodes,[rabbit@rabbit2]}]
...done.

rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit3]}]},{running_nodes,[rabbit@rabbit3]}]
...done.

Note that rabbit@rabbit2 retains the residual state of the cluster, whereas rabbit@rabbit1 and rabbit@rabbit3 are freshly initialised RabbitMQ brokers. If we want to re-initialise rabbit@rabbit2 we follow the same steps as for the other nodes:

Note that rabbit@rabbit2 keeps the residual state of the cluster, whereas rabbit@rabbit1 and rabbit@rabbit3 are freshly initialised RabbitMQ brokers. If we want to re-initialise rabbit@rabbit2, we follow the same steps as for the other nodes:

rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.

rabbit2$ rabbitmqctl reset
Resetting node rabbit@rabbit2 ...done.

rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.


Auto-configuration of a cluster
Automatic configuration of a cluster


Instead of configuring clusters "on the fly" using the cluster command, clusters can also be set up via the RabbitMQ configuration file. The file should set the cluster_nodes field in the rabbit application to a tuple containing a list of rabbit nodes, and an atom - either disc or ram - indicating whether the node should join them as a disc node or not.

Rather than configuring the cluster "on the fly" with the cluster commands, a cluster can also be set up through the [ RabbitMQ configuration file ]. The file should set the cluster_nodes field of the rabbit application to a tuple containing a list of rabbit nodes and an atom, either disc or ram, indicating whether the node should join them as a disc node or not.


If cluster_nodes is specified, RabbitMQ will try to cluster to each node provided, and stop after it can cluster with one of them. RabbitMQ will try to cluster to any node which is online that has the same version of Erlang and RabbitMQ. If no suitable nodes are found, the node is left unclustered.

If cluster_nodes is specified, RabbitMQ tries to cluster with each of the listed nodes and stops as soon as it has clustered with one of them. RabbitMQ only tries to cluster with nodes that are online and run the same versions of Erlang and RabbitMQ. If no suitable node is found, the node is left unclustered.


Note that the cluster configuration is applied only to fresh nodes. A fresh node is a node which has just been reset or is being started for the first time. Thus, the automatic clustering won't take place after restarts of nodes. This means that any change to the clustering via rabbitmqctl will take precedence over the automatic clustering configuration.

Note that the cluster configuration is applied only to fresh nodes. A fresh node is one that has just been reset or is being started for the very first time. Automatic clustering therefore does not happen after a node restart, which means that any change made to the clustering via rabbitmqctl takes precedence over the automatic clustering configuration.


A common use of cluster configuration via the RabbitMQ config file is to automatically configure nodes to join a common cluster. For this purpose the same cluster nodes can be specified on all nodes, plus the boolean to determine disc nodes.

The most common use of cluster configuration via the RabbitMQ config file is to have nodes join a common cluster automatically. For this purpose the same list of cluster nodes is specified on every node, together with the flag that determines whether the node is a disc node.


Say we want to join our three separate nodes of our running example back into a single cluster, with rabbit@rabbit1 and rabbit@rabbit2 being the disk nodes of the cluster. First we reset and stop all nodes, to make sure that we're working with fresh nodes:

Say we want to join the three separate nodes from our running example back into a single cluster, with rabbit@rabbit1 and rabbit@rabbit2 as the disk nodes of the cluster. First we reset and stop all the nodes, to make sure that we are working with fresh nodes:

rabbit1$ rabbitmqctl stop_app
Stopping node rabbit@rabbit1 ...done.

rabbit1$ rabbitmqctl reset
Resetting node rabbit@rabbit1 ...done.

rabbit1$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit1 ...done.

rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.

rabbit2$ rabbitmqctl reset
Resetting node rabbit@rabbit2 ...done.

rabbit2$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit2 ...done.

rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.

rabbit3$ rabbitmqctl reset
Resetting node rabbit@rabbit3 ...done.

rabbit3$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit3 ...done.

Now we set the relevant field in the config file:

Now we set the relevant field in the config file:

[
  ...
  {rabbit, [
        ...
        {cluster_nodes, {['rabbit@rabbit1', 'rabbit@rabbit2', 'rabbit@rabbit3'], disc}},
        ...
  ]},
  ...
].

For instance, if this were the only field we needed to set, we would simply create the RabbitMQ config file with the contents:

For instance, if this were the only field we needed to set, we would simply create the RabbitMQ config file with the following contents:

[{rabbit,
  [{cluster_nodes, {['rabbit@rabbit1', 'rabbit@rabbit2', 'rabbit@rabbit3'], disc}}]}].
  

Since we want rabbit@rabbit3 to be a ram node, we need to specify that in its configuration file:

Since we want rabbit@rabbit3 to be a RAM node, we need to say so in its configuration file:

[{rabbit,
  [{cluster_nodes, {['rabbit@rabbit1', 'rabbit@rabbit2', 'rabbit@rabbit3'], ram}}]}].
  

(Note for Erlang programmers and the curious: this is a standard Erlang configuration file. For more details, see the configuration guide and the Erlang Config Man Page.)

(Note: the file above is a standard Erlang configuration file; for more details see the [ configuration guide ] and the [ Erlang Config Man Page ].)


Once we have the configuration files in place, we simply start the nodes:

Once the configuration files are in place, we simply start the nodes:

rabbit1$ rabbitmq-server -detached
rabbit2$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached

We can see that the three nodes are joined in a cluster by running the cluster_status command on any of the nodes:

Running the cluster_status command on any of the nodes shows that the three nodes have indeed joined the same cluster:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2,rabbit@rabbit3]}]
...done.

rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2,rabbit@rabbit3]}]
...done.

rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...
[{nodes,[{disc,[rabbit@rabbit1,rabbit@rabbit2]},{ram,[rabbit@rabbit3]}]},
 {running_nodes,[rabbit@rabbit1,rabbit@rabbit2,rabbit@rabbit3]}]
...done.

Note that, in order to remove a node from an auto-configured cluster, it must first be removed from the rabbitmq.config files of the other nodes in the cluster. Only then, can it be reset safely.

Note that in order to remove a node from an auto-configured cluster, the node must first be removed from the rabbitmq.config files of the other nodes in the cluster. Only then can it be reset safely.
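A minimal sketch of that order of operations, assuming rabbit@rabbit3 is the node being removed and that its entry has already been deleted from the cluster_nodes list in rabbitmq.config on rabbit1 and rabbit2:

rabbit3$ rabbitmqctl stop_app
rabbit3$ rabbitmqctl reset
rabbit3$ rabbitmqctl start_app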


Upgrading clusters
Upgrading a cluster


When upgrading from one version of RabbitMQ to another, RabbitMQ will automatically update its persistent data structures if necessary. In a cluster, this task is performed by the first disc node to be started (the "upgrader" node). Therefore when upgrading a RabbitMQ cluster, you should not attempt to start any RAM nodes first; any RAM nodes started will emit an error message and fail to start up.

When upgrading from one version of RabbitMQ to another, RabbitMQ automatically upgrades its persistent data structures if necessary. In a cluster, this task is performed by the first disc node to be started (the "upgrader" node). Therefore, when upgrading a RabbitMQ cluster you must not start any RAM node first; any RAM node started first will emit an error message and fail to start.


All nodes in a cluster must be running the same versions of Erlang and RabbitMQ, although they may have different plugins installed. Therefore it is necessary to stop all nodes in the cluster, then start all nodes when performing an upgrade.

All nodes in a cluster must run the same versions of Erlang and RabbitMQ, although they may have different plugins installed. It is therefore necessary to stop all nodes in the cluster and then start them all again when performing an upgrade.


While not strictly necessary, it is a good idea to decide ahead of time which disc node will be the upgrader, stop that node last, and start it first. Otherwise changes to the cluster configuration that were made between the upgrader node stopping and the last node stopping will be lost.

Although not strictly necessary, it is a good idea to decide in advance which disc node will act as the upgrader, then stop that node last and start it first. Otherwise, changes made to the cluster configuration between the upgrader node stopping and the last node stopping will be lost.
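A minimal sketch of that ordering, assuming rabbit@rabbit1 has been chosen as the upgrader node:

rabbit2$ rabbitmqctl stop
rabbit3$ rabbitmqctl stop
rabbit1$ rabbitmqctl stop
(install the new RabbitMQ release on all three machines)
rabbit1$ rabbitmq-server -detached
rabbit2$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached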


Automatic upgrades are only possible from RabbitMQ versions 2.1.1 and later. If you have an earlier cluster, you will need to rebuild it to upgrade.

Automatic upgrades are only possible from RabbitMQ version 2.1.1 onwards. If you have an older cluster, you will need to rebuild it in order to upgrade.


A cluster on a single machine
Running a cluster on a single machine


Under some circumstances it can be useful to run a cluster of RabbitMQ nodes on a single machine. This would typically be useful for experimenting with clustering on a desktop or laptop without the overhead of starting several virtual machines for the cluster. The two main requirements for running more than one node on a single machine are that each node should have a unique name and bind to a unique port / IP address combination for each protocol in use.

In some circumstances it can be useful to run a cluster of RabbitMQ nodes on a single machine, typically for experimenting with clustering on a desktop or laptop without the overhead of starting several virtual machines for the cluster. The two main requirements for running more than one node on a single machine are that each node must have a unique name and must bind to a unique port/IP-address combination for each protocol in use.


You can start multiple nodes on the same host manually by repeated invocation of rabbitmq-server ( rabbitmq-server.bat on Windows). You must ensure that for each invocation you set the environment variables RABBITMQ_NODENAME and RABBITMQ_NODE_PORT to suitable values.

You can start multiple nodes on the same host manually by invoking rabbitmq-server (rabbitmq-server.bat on Windows) repeatedly; you must make sure that for every invocation the environment variables RABBITMQ_NODENAME and RABBITMQ_NODE_PORT are set to suitable values.


For example:

$ RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit rabbitmq-server -detached
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_NODENAME=hare rabbitmq-server -detached
$ rabbitmqctl -n hare stop_app
$ rabbitmqctl -n hare reset
$ rabbitmqctl -n hare cluster rabbit@`hostname -s`
$ rabbitmqctl -n hare start_app

will set up a two node cluster with one disc node and one ram node. Note that if you have RabbitMQ opening any ports other than AMQP, you'll need to configure those not to clash as well - for example:

The commands above set up a two-node cluster with one disc node and one RAM node. Note that if RabbitMQ opens any ports other than AMQP, you will need to configure those not to clash as well, for example:

$ RABBITMQ_NODE_PORT=5672 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15672}]" RABBITMQ_NODENAME=rabbit rabbitmq-server -detached
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15673}]" RABBITMQ_NODENAME=hare rabbitmq-server -detached

will start two nodes (which can then be clustered) when the management plugin is installed.

This starts two nodes (which can then be clustered) when the management plugin is installed.


Firewalled nodes
Nodes behind firewalls


The case for firewalled clustered nodes exists when nodes are in a data center or on a reliable network, but separated by firewalls. Again, clustering is not recommended over a WAN or when network links between nodes are unreliable.

This situation arises when clustered nodes sit in a data centre or on an otherwise reliable network but are separated by firewalls. To repeat: clustering is not recommended over a WAN or when the network links between nodes are unreliable.


If different nodes of a cluster are in the same data center, but behind firewalls then additional configuration will be necessary to ensure inter-node communication. Erlang makes use of a Port Mapper Daemon (epmd) for resolution of node names in a cluster. Nodes must be able to reach each other and the port mapper daemon for clustering to work.

If the nodes of a cluster are in the same data centre but behind firewalls, additional configuration is needed for the nodes to communicate with each other. Erlang uses a Port Mapper Daemon (epmd) to resolve node names in a cluster; the nodes must be able to reach both each other and the port mapper daemon for clustering to work.
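As a quick check of what epmd knows about, it can be asked which node names are registered on a host. The output below is only indicative; the exact wording and port numbers vary between versions:

rabbit1$ epmd -names
epmd: up and running on port 4369 with data:
name rabbit at port 25672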


The default epmd port is 4369, but this can be changed using the ERL_EPMD_PORT environment variable. All nodes must use the same port. Firewalls must permit traffic on this port to pass between clustered nodes. For further details see the Erlang epmd manpage.

The default epmd port is 4369, but it can be changed with the ERL_EPMD_PORT environment variable. All nodes must use the same port, and the firewalls must allow traffic on this port to pass between the clustered nodes. See the [ Erlang epmd manpage ] for further details.
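For instance, to run epmd on port 4370 instead (an arbitrary value used purely for illustration), the variable would be exported in the environment of every node, and of rabbitmqctl, before start-up:

rabbit1$ ERL_EPMD_PORT=4370 rabbitmq-server -detached
rabbit1$ ERL_EPMD_PORT=4370 rabbitmqctl cluster_status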


Once a distributed Erlang node address has been resolved via epmd, other nodes will attempt to communicate directly with that address using the Erlang distributed node protocol. The port range for this communication can be configured with two parameters for the Erlang kernel application:

Once a distributed Erlang node's address has been resolved via epmd, the other nodes will try to communicate with that address directly, using the Erlang distributed node protocol. The port range used for this communication can be configured with two parameters of the Erlang kernel application (a configuration sketch follows the list below):

inet_dist_listen_min
inet_dist_listen_max
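For example, the range can be pinned to a single port (25672 below is only an illustrative value) by adding a kernel section alongside the rabbit section of the configuration file; this is a sketch of the idea, not a prescription:

[
  {kernel, [
        {inet_dist_listen_min, 25672},
        {inet_dist_listen_max, 25672}
  ]},
  {rabbit, [
        ...
  ]}
].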

Firewalls must permit traffic in this range to pass between clustered nodes (assuming all nodes use the same port range). The default port range is unrestricted.

The firewalls must allow traffic in this range to pass between the clustered nodes (assuming all nodes use the same port range). By default the port range is unrestricted.


The Erlang kernel_app manpage contains more details on the port range that distributed Erlang nodes listen on. See the configuration page for information on how to create and edit a configuration file.

The [ Erlang kernel_app manpage ] contains more details on the port range that distributed Erlang nodes listen on. See the [ configuration ] page for information on how to create and edit a configuration file.


Connecting to Clusters from Clients
Connecting to a cluster from clients


A client can connect as normal to any node within a cluster. If that node should fail, and the rest of the cluster survives, then the client should notice the closed connection, and should be able to reconnect to some surviving member of the cluster. Generally, it's not advisable to bake in node hostnames or IP addresses into client applications: this introduces inflexibility and will require client applications to be edited, recompiled and redeployed should the configuration of the cluster change or the number of nodes in the cluster change. Instead, we recommend a more abstracted approach: this could be a dynamic DNS service which has a very short TTL configuration, or a plain TCP load balancer, or some sort of mobile IP achieved with pacemaker or similar technologies. In general, this aspect of managing the connection to nodes within a cluster is beyond the scope of RabbitMQ itself, and we recommend the use of other technologies designed specifically to solve these problems.

A client can connect as normal to any node in a cluster. If that node fails while the rest of the cluster survives, the client should notice the closed connection and be able to reconnect to a surviving member of the cluster. In general it is very unwise to hard-code node hostnames or IP addresses into client applications: it is inflexible, and it means the client applications have to be edited, recompiled and redeployed whenever the cluster configuration or the number of nodes changes. Instead we recommend a more abstracted approach: a dynamic DNS service with a very short TTL, a plain TCP load balancer, or some form of mobile IP implemented with pacemaker or similar technologies. In general, managing connections to the nodes of a cluster is beyond the scope of RabbitMQ itself, and we recommend using other technologies designed specifically to solve these problems.