Dubbo Cluster Fault-Tolerance Modes: The Failback Implementation

Date: 2019-12-19

Tags: dubbo, cluster fault tolerance, failback, implementation


Note: the Dubbo version discussed here is 2.6.2.

Figure 1: Class inheritance diagram of Dubbo's FailbackClusterInvoker

1. What Failback means

Failback can be understood as recording failed requests in the background and resending them on a timer.

2. How Failback is implemented

The core code is in FailbackClusterInvoker's doInvoke(Invocation, List<Invoker<T>>, LoadBalance) method. The source is as follows.

@Override
protected Result doInvoke(Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    try {
        checkInvokers(invokers, invocation);
        Invoker<T> invoker = select(loadbalance, invocation, invokers, null);
        return invoker.invoke(invocation);
    } catch (Throwable e) {
        logger.error("Failback to invoke method " + invocation.getMethodName() + ", wait for retry in background. Ignored exception: "
                + e.getMessage() + ", ", e);
        addFailed(invocation, this);
        return new RpcResult(); // ignore
    }
}

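First, one service provider is selected from the provider list according to loadbalance.
Then that provider's service is invoked. If the call succeeds, its result is returned directly. If the call fails, the error is written to the error log, the information for this request (parameters, context) is saved, and a scheduled task resends it later. In the failure case the caller receives an empty RpcResult rather than an exception.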
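The control flow described above can be illustrated without any Dubbo classes. The following is a minimal sketch, not Dubbo's code: FailbackCallSketch, its failed map, and the string request ids are hypothetical names invented for illustration, and returning null stands in for the empty RpcResult.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the failback idea; not part of Dubbo.
class FailbackCallSketch {
    // Calls that failed and still await a background retry, keyed by a request id.
    final Map<String, Callable<String>> failed = new ConcurrentHashMap<>();

    // Try the call once; on any failure, remember it and return an empty result.
    String invoke(String requestId, Callable<String> call) {
        try {
            return call.call();
        } catch (Throwable e) {
            failed.put(requestId, call); // saved so a background task can retry it
            return null;                 // plays the role of the empty RpcResult
        }
    }

    public static void main(String[] args) {
        FailbackCallSketch sketch = new FailbackCallSketch();
        System.out.println(sketch.invoke("ok", () -> "hello"));
        System.out.println(sketch.invoke("bad", () -> {
            throw new IllegalStateException("provider down");
        }));
        System.out.println(sketch.failed.keySet());
    }
}
```

The point of the sketch is that the caller never sees the failure: it gets an empty result immediately, and the failed call stays in the map for later.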

Next, let's analyze the implementation of the addFailed method.

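The source of addFailed(invocation, this) is as follows. It puts the invocation and the router (here the FailbackClusterInvoker itself, passed as this) into failed, which is a ConcurrentHashMap. The first call also starts the scheduled retry task.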

private void addFailed(Invocation invocation, AbstractClusterInvoker<?> router) {
    if (retryFuture == null) {
        synchronized (this) {
            if (retryFuture == null) {
                retryFuture = scheduledExecutorService.scheduleWithFixedDelay(new Runnable() {

                    @Override
                    public void run() {
                        // collect retry statistics
                        try {
                            retryFailed();
                        } catch (Throwable t) { // Defensive fault tolerance
                            logger.error("Unexpected error occur at collect statistic", t);
                        }
                    }
                }, RETRY_FAILED_PERIOD, RETRY_FAILED_PERIOD, TimeUnit.MILLISECONDS);
            }
        }
    }
    failed.put(invocation, router);
}
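The lazy start of the retry task in addFailed uses double-checked locking. Below is a minimal sketch of that pattern in isolation, under stated assumptions: LazyRetryTimerSketch, ensureTimerStarted, and timersStarted are hypothetical names, and the field is declared volatile here because the unsynchronized first check needs it to be safe.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the lazy-start pattern; names are illustrative only.
class LazyRetryTimerSketch {
    final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    // Counts how many times a timer was actually scheduled (for demonstration).
    final AtomicInteger timersStarted = new AtomicInteger();
    // volatile so the unsynchronized first check sees a fully published value.
    volatile ScheduledFuture<?> retryFuture;

    void ensureTimerStarted(Runnable retryTask, long periodMillis) {
        if (retryFuture == null) {             // fast path, no lock taken
            synchronized (this) {
                if (retryFuture == null) {     // re-check while holding the lock
                    timersStarted.incrementAndGet();
                    retryFuture = scheduler.scheduleWithFixedDelay(() -> {
                        try {
                            retryTask.run();
                        } catch (Throwable t) {
                            // Swallow: an exception escaping the task would
                            // suppress all later executions of the schedule.
                        }
                    }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
                }
            }
        }
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

However many threads report failures concurrently, only one timer is scheduled, and once it exists later callers skip the lock entirely. The defensive catch matters because scheduleWithFixedDelay stops rescheduling a task after any run that throws.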

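The source of retryFailed() is as follows. It iterates over a copy of the entries in failed and re-invokes each one. If one of the requests throws an exception, the method only writes an error log; it does not rethrow, so the remaining entries are still processed.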

void retryFailed() {
    if (failed.size() == 0) {
        return;
    }
    for (Map.Entry<Invocation, AbstractClusterInvoker<?>> entry : new HashMap<Invocation, AbstractClusterInvoker<?>>(
            failed).entrySet()) {
        Invocation invocation = entry.getKey();
        Invoker<?> invoker = entry.getValue();
        try {
            invoker.invoke(invocation);
            failed.remove(invocation);
        } catch (Throwable e) {
            logger.error("Failed retry to invoke method " + invocation.getMethodName() + ", waiting again.", e);
        }
    }
}
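The retry pass can be sketched in a self-contained form as follows. This is an illustration under assumptions, not Dubbo's code: RetrySnapshotSketch and the string keys are hypothetical, and the method returns the recovered ids only so the behavior is easy to observe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of one retry pass; names are illustrative only.
class RetrySnapshotSketch {
    final Map<String, Callable<String>> failed = new ConcurrentHashMap<>();

    // Retries every pending call once and returns the ids that succeeded.
    List<String> retryFailed() {
        List<String> recovered = new ArrayList<>();
        if (failed.isEmpty()) {
            return recovered;
        }
        // Iterate over a copy so this pass works on a fixed set of entries,
        // even if other threads add new failures in the meantime.
        Map<String, Callable<String>> snapshot =
                new HashMap<String, Callable<String>>(failed);
        for (Map.Entry<String, Callable<String>> entry : snapshot.entrySet()) {
            try {
                entry.getValue().call();
                failed.remove(entry.getKey()); // success: stop retrying this one
                recovered.add(entry.getKey());
            } catch (Throwable e) {
                // Failure: keep the entry for the next pass and continue
                // with the remaining entries.
            }
        }
        return recovered;
    }
}
```

An entry leaves the map only after a successful call, so a request that keeps failing is retried on every pass. Nothing in this sketch caps the number of attempts.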

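Suppose the scheduled task runs once every 10 s and has already run at 0 s. If requests A, B, and C fail between 0 s and 10 s, then at the 10 s mark the task starts re-invoking A, B, and C. Because the task is scheduled with scheduleWithFixedDelay, the interval is measured from the end of one run to the start of the next.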

The key point is that failed requests are recorded and then resent periodically.