Enabling Tail-Recursion Optimization in Python

Date: 2019-11-10


Tail-Recursion Optimization in Python

Ordinary recursion vs. tail recursion

Ordinary recursion:

def normal_recursion(n):
    if n == 1:
        return 1
    else:
        return n + normal_recursion(n-1)

Execution trace:

normal_recursion(5)
5 + normal_recursion(4)
5 + 4 + normal_recursion(3)
5 + 4 + 3 + normal_recursion(2)
5 + 4 + 3 + 2 + normal_recursion(1)
5 + 4 + 3 + 3
5 + 4 + 6
5 + 10
15

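As the trace shows, every level of an ordinary recursion creates new local variables and needs a new stack frame. As the recursion gets deeper, more and more frames pile up, and eventually the stack overflows.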
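A minimal sketch of the overflow described above, assuming CPython, which stops runaway recursion by raising RecursionError once the depth reported by sys.getrecursionlimit() is exceeded. The helper name overflows is hypothetical and only used for illustration.

```python
import sys

def normal_recursion(n):
    if n == 1:
        return 1
    return n + normal_recursion(n - 1)

def overflows(n):
    """Return True if normal_recursion(n) runs past the recursion limit."""
    try:
        normal_recursion(n)
    except RecursionError:
        return True
    return False

limit = sys.getrecursionlimit()   # 1000 by default in CPython
print(overflows(10))              # shallow recursion completes normally
print(overflows(limit + 100))     # one frame per level: exceeds the limit
```

The limit can be raised with sys.setrecursionlimit, but that only moves the ceiling; each level still costs one frame.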

Tail recursion

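Tail recursion is built on tail calls: each level directly returns the result of the recursive call, so the current frame has no work left and could simply be updated in place. No new local variables are needed, and the implementation resembles iteration: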

def tail_recursion(n, total=0):
    if n == 0:
        return total
    else:
        return tail_recursion(n-1, total+n)

Execution trace:

tail_recursion(5, 0)
tail_recursion(4, 5)
tail_recursion(3, 9)
tail_recursion(2, 12)
tail_recursion(1, 14)
tail_recursion(0, 15)
15

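As the trace shows, the calls in a tail recursion take a "linear" form. This suggests an idea: although each tail-recursive call still creates a new stack frame by default, an optimizer can make every level share a single frame, which solves both the stack overflow and the recursion depth limit.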
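To make the "shared frame" idea concrete, here is a hand-written sketch of what a tail-call optimizer would effectively turn tail_recursion into: the parameters are rebound and control jumps back to the top instead of calling. The name tail_recursion_loop is hypothetical; CPython does not perform this rewrite on its own.

```python
def tail_recursion(n, total=0):
    if n == 0:
        return total
    return tail_recursion(n - 1, total + n)

def tail_recursion_loop(n, total=0):
    # Same logic, but the "call" becomes a rebinding of n and total,
    # so a single frame is reused for every step.
    while n != 0:
        n, total = n - 1, total + n
    return total

print(tail_recursion(5))             # 15
print(tail_recursion_loop(5))        # 15
print(tail_recursion_loop(100000))   # 5000050000, far beyond the recursion limit
```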

Tail-recursion optimization in C

gcc enables tail-recursion optimization when invoked with -O2:

int tail_recursion(int n, int total) {
    if (n == 0) {
        return total;
    }
    else {
        return tail_recursion(n-1, total+n);
    }
}

int main(void) {
    int total = 0, n = 4;
    tail_recursion(n, total);
    return 0;
}

Generate the assembly:

$ gcc -S tail_recursion.c -o normal_recursion.S
$ gcc -S -O2 tail_recursion.c -o tail_recursion.S   # gcc with tail-recursion optimization enabled

Figure: side-by-side comparison of the two assembly listings (AT&T syntax; the optimized version is on the left).

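Before optimization, the recursion uses a call instruction, which creates a new stack frame each time (see LBB0_3). With optimization enabled, no new frame is generated: per the listing, the code pops the %rbp saved by pushq %rbp and returns into _tail_recursion without issuing another call, so the same stack frame is reused throughout.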

Enabling tail-recursion optimization in Python

CPython itself does not support tail-recursion optimization, but a clever programmer came up with a workaround: a tail_call_optimized decorator.

#!/usr/bin/env python3
# This program shows off a Python decorator
# which implements tail call optimization. It
# does this by throwing an exception if it is
# its own grandparent, and catching such
# exceptions to recall the stack.

import sys

class TailRecurseException(Exception):
    """Carries the arguments of the pending tail call."""
    def __init__(self, args, kwargs):
        self.args = args
        self.kwargs = kwargs

def tail_call_optimized(g):
    """
    This function decorates a function with tail call
    optimization. It does this by throwing an exception
    if it is its own grandparent, and catching such
    exceptions to fake the tail call optimization.

    This function fails if the decorated
    function recurses in a non-tail context.
    """
    def func(*args, **kwargs):
        f = sys._getframe()
        if f.f_back and f.f_back.f_back \
            and f.f_back.f_back.f_code == f.f_code:
            # Raise an exception to unwind back to the outermost wrapper
            raise TailRecurseException(args, kwargs)
        else:
            while True:
                try:
                    return g(*args, **kwargs)
                except TailRecurseException as e:
                    # Retry with the arguments of the intercepted tail call
                    args = e.args
                    kwargs = e.kwargs
    func.__doc__ = g.__doc__
    return func

@tail_call_optimized
def factorial(n, acc=1):
    "calculate a factorial"
    if n == 0:
        return acc
    return factorial(n-1, n*acc)

# 10000 levels is well past CPython's default recursion limit of 1000.
# The result has tens of thousands of digits, and recent Python 3 releases
# cap int-to-str conversion by default, so print its bit length instead.
print(factorial(10000).bit_length())

Here is an explanation of the sys._getframe() function:

sys._getframe([depth]):
Return a frame object from the call stack.
If optional integer depth is given, return the frame object that many calls below the top of the stack.
If that is deeper than the call stack, ValueError is raised. The default for depth is zero,
returning the frame at the top of the call stack.

In other words, it returns the frame object at the given depth in the call stack.

import sys

def get_cur_info():
    print(sys._getframe().f_code.co_filename)  # current file name
    print(sys._getframe().f_code.co_name)  # current function name
    print(sys._getframe().f_lineno)  # current line number
    print(sys._getframe().f_back)  # the caller's frame
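
As a further illustration of frame objects, the sketch below uses sys._getframe(1) to name the caller and follows f_back links to count how many frames are on the stack. The helper names who_called_me, stack_depth, and nested are hypothetical; sys._getframe is a CPython implementation detail and may not exist in other interpreters.

```python
import sys

def who_called_me():
    # Depth 1 is the frame of whoever called this function.
    return sys._getframe(1).f_code.co_name

def stack_depth():
    """Count frames from the caller down to the bottom of the stack."""
    frame = sys._getframe(1)
    depth = 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth

def outer():
    return who_called_me()

def nested(levels):
    # Each recursive level adds exactly one frame.
    if levels == 0:
        return stack_depth()
    return nested(levels - 1)

base = stack_depth()
print(outer())            # outer
print(nested(0) - base)   # 1
print(nested(5) - base)   # 6
```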

For more on using sys._getframe, see Frame Hacks.
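How tail_call_optimized works: once a recursive function is wrapped by this decorator, the recursion is driven from the while loop inside the decorator. Whenever a recursive call would add a new frame, which the check f.f_back.f_back.f_code == f.f_code detects, the wrapper captures the arguments of the tail call and raises an exception. The exception unwinds the recursive frames, and the loop then calls the function again by hand with the captured arguments. The stack therefore stays at a constant depth throughout the recursion instead of growing, which is the goal of the optimization.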
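
The constant-depth claim can be checked with a small experiment. This is a sketch under assumptions: it repeats a compact Python 3 version of the decorator so that it is self-contained, and the helpers stack_depth, depth_plain, and depth_optimized are hypothetical names that report how deep the stack is when the recursion bottoms out.

```python
import sys

class TailRecurseException(Exception):
    def __init__(self, args, kwargs):
        self.args = args
        self.kwargs = kwargs

def tail_call_optimized(g):
    def func(*args, **kwargs):
        f = sys._getframe()
        if f.f_back and f.f_back.f_back and f.f_back.f_back.f_code == f.f_code:
            # Called from the decorated function itself: unwind instead of nesting.
            raise TailRecurseException(args, kwargs)
        while True:
            try:
                return g(*args, **kwargs)
            except TailRecurseException as e:
                args, kwargs = e.args, e.kwargs
    return func

def stack_depth():
    """Count frames from the caller down to the bottom of the stack."""
    frame, depth = sys._getframe(1), 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth

def depth_plain(n):
    if n == 0:
        return stack_depth()
    return depth_plain(n - 1)

@tail_call_optimized
def depth_optimized(n):
    if n == 0:
        return stack_depth()
    return depth_optimized(n - 1)

base = stack_depth()
print(depth_plain(100) - base)         # 101: one frame per level
print(depth_optimized(100) - base)     # 2: the wrapper plus one call of the function
print(depth_optimized(100000) - base)  # 2: depth does not depend on n
```

Strictly speaking two frames remain alive (the wrapper and the current call of the decorated function), and a third exists briefly before each exception is raised, but the depth no longer grows with n.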
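To show more clearly how the call stack changes before and after enabling tail-recursion optimization, and how the exception raised by the tail_call_optimized decorator unwinds the recursive frames, I made animations with the pudb debugger: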

Call stack before enabling tail-recursion optimization