500 Lines or Less: A Template Engine

Date: 2019-11-08
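Introduction: Most programs contain a lot of logic and only a little literal text, and programming languages are well suited to that kind of work. Some tasks, however, involve only a little logic and a great deal of text. For those tasks we want a tool designed for text-heavy problems, and a template engine is such a tool. In this chapter we build a simple template engine.
The most common text-heavy task is a web application. An important part of any web application is generating HTML for the browser. Very few HTML pages are completely static: they contain at least a little dynamic data, such as the user's name, and most contain a lot of it: product listings, friends' updates, and so on.
At the same time, every HTML page contains large amounts of static text, and these pages are big, often tens of thousands of bytes. So the web application developer has a problem to solve: how best to generate a large string that mixes static and dynamic data.
To make this concrete, suppose we want to produce this HTML:
<p>Welcome, Charlie!</p><p>Products:</p><ul>
    <li>Apple: $1.00</li>
    <li>Fig: $1.50</li>
    <li>Pomegranate: $3.25</li></ul>
Here the user's name is dynamic, as are the product names and prices. Even the number of products is not fixed: at different times there may be more or fewer products to display.
One way to produce this HTML is to keep string constants in the code and join them together. Dynamic data is inserted with string substitution. Some of the dynamic data is repetitive, like the product list, which means chunks of HTML repeat; these have to be handled separately and then combined with the rest of the page.
A page produced this way looks like this: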
# The main HTML for the whole page.
PAGE_HTML = """<p>Welcome, {name}!</p><p>Products:</p><ul>{products}</ul>"""
# The HTML for each product displayed.
PRODUCT_HTML = "<li>{prodname}: {price}</li>\n"
def make_page(username, products):
    product_html = ""
    for prodname, price in products:
        product_html += PRODUCT_HTML.format(
            prodname=prodname, price=format_price(price))
    html = PAGE_HTML.format(name=username, products=product_html)
    return html
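This works, but it is messy. The HTML is scattered across string constants embedded in our code. The logic of the page is hard to see because the static text is broken into separate pieces. To change the HTML, a front-end engineer has to edit Python code. Imagine what the code would look like if the page were ten or a hundred times more complex: this approach would quickly become unworkable.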
 
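Templates:
A better way to produce HTML pages is with templates. The page is authored as a template, meaning the file is mostly static HTML with the dynamic data inserted through special notation. The toy page above, written as a template: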
<p>Welcome, {{user_name}}!</p><p>Products:</p><ul>
{% for product in product_list %}
    <li>{{ product.name }}:
        {{ product.price|format_price }}</li>
{% endfor %}</ul>
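Here the focus is on the HTML text, with the logic embedded in it. Compare this text-centric approach with the logic-centric code above: the earlier program was mostly Python code with HTML embedded in it, whereas now the file is mostly static HTML markup.
How does the language inside a template work? In Python, most of a source file is executable code. If you need literal static text, you embed it in a string literal: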
def hello():
    print("Hello, world!")
 
hello()
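When Python reads a source file like this, it interprets text such as def hello(): as instructions to execute. The double quotes in print("Hello, world!") indicate that the text between them is meant literally.
This is how most programming languages work: the file is mostly dynamic, and the static parts are enclosed in double quotes.
A template file flips this around: it is mostly static text, with special notation marking the executable, dynamic parts.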
<p>Welcome, {{user_name}}!</p>
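So in the HTML page, {{ marks the start of dynamic mode, and the value of the user_name variable is substituted into the output.
In Python, "foo = {foo}!".format(foo=17) builds a piece of text from a string literal and the data to insert. Templates take a similar approach.
To use HTML templates we need a template engine. The engine combines the static data in the template with the dynamic data: its job is to interpret the template and replace the dynamic parts with real data.
In Python, the following three forms each mean something different: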
dict["key"]
obj.attr
obj.method()
In template syntax, however, all three operations are written with a dot:
dict.key
obj.attr
obj.method
For example:
<p>The price is: {{product.price}}, with a {{product.discount}}% discount.</p>
Values can also be modified with filters, which are applied with a pipe character:
<p>Short name: {{story.subject|slugify|lower}}</p>
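Read left to right, a chain of filters corresponds to nested function calls. The sketch below shows that correspondence in plain Python. The `slugify` function here is a hypothetical stand-in written for this illustration (it is not part of the engine), and `lower` is simply `str.lower`.

```python
import re


def slugify(text):
    """Hypothetical filter: turn runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")


lower = str.lower

subject = "Template Engines: A Tour"

# {{ story.subject|slugify|lower }} corresponds to lower(slugify(subject)):
# the leftmost filter is applied first, the rightmost last.
short_name = lower(slugify(subject))
print(short_name)
```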
Conditionals:
{% if user.is_logged_in %}
    <p>Welcome, {{ user.name }}!</p>
{% endif %}
Loops:
<p>Products:</p><ul>
{% for product in product_list %}
    <li>{{ product.name }}: {{ product.price|format_price }}</li>
{% endfor %}</ul>
 
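There are two ways to process a template: 1. interpretation and 2. compilation. The author uses compilation here.
First, consider the following template: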
<p>Welcome, {{user_name}}!</p><p>Products:</p><ul>
{% for product in product_list %}
    <li>{{ product.name }}:
        {{ product.price|format_price }}</li>
{% endfor %}</ul>
Below is the Python function the template is compiled into; calling it produces the HTML above:
def render_function(context, do_dots):
    c_user_name = context['user_name']
    c_product_list = context['product_list']
    c_format_price = context['format_price']
 
    result = []
    append_result = result.append
    extend_result = result.extend
    to_str = str
 
    extend_result([
        '<p>Welcome, ',
        to_str(c_user_name),
        '!</p>\n<p>Products:</p>\n<ul>\n'
    ])
    for c_product in c_product_list:
        extend_result([
            '\n    <li>',
            to_str(do_dots(c_product, 'name')),
            ':\n        ',
            to_str(c_format_price(do_dots(c_product, 'price'))),
            '</li>\n'
        ])
    append_result('\n</ul>\n')
    return ''.join(result)
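As this code shows, the pieces of HTML are collected in a list and then joined into one complete string with ''.join(result).
Now let's write the engine:
The engine class: first construct an instance of the class from the template text, then call its render() method to produce a complete page.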
templite = Templite('''
    <h1>Hello {{name|upper}}!</h1>
    {% for topic in topics %}
        <p>You are interested in {{topic}}.</p>
    {% endfor %}
    ''',
    {'upper': str.upper},
)
# Later, use it to render some data.
text = templite.render({
    'name': "Ned",
    'topics': ['Python', 'Geometry', 'Juggling'],
})
 
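The code builder:
As described earlier, the engine has to turn the dynamic parts of the template into code, which is executed to produce the data shown in the page. The code builder is what assembles that executable Python code.
Each code builder starts with two attributes, self.code and self.indent_level. code stores the pieces of Python source, and indent_level tracks the current indentation, which starts at 0.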
class CodeBuilder(object):
    """Build source code conveniently."""
 
    def __init__(self, indent=0):
        self.code = []
        self.indent_level = indent
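add_line appends a line of code to self.code. Each line is stored as the indentation (a space multiplied by the current indent level), then the line itself, then a newline. This matches Python's formatting, since Python uses indentation to delimit blocks.
    def add_line(self, line):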
        """Add a line of source to the code.
        Indentation and newline will be added for you, don't provide them.
        """
        self.code.extend([" " * self.indent_level, line, "\n"])
 
indent and dedent control the indentation. The step size INDENT_STEP is 4: indent increases the current indent level by 4, and dedent decreases it by 4.
    INDENT_STEP = 4      # PEP8 says so!
 
    def indent(self):
        """Increase the current indent for following lines."""
        self.indent_level += self.INDENT_STEP
 
    def dedent(self):
        """Decrease the current indent for following lines."""
        self.indent_level -= self.INDENT_STEP
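add_section reserves a place at the current indent level where code can be added later. It creates a new CodeBuilder with the current indent level, appends it to self.code, and returns it.
    def add_section(self):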
        """Add a section, a sub-CodeBuilder."""
        section = CodeBuilder(self.indent_level)
        self.code.append(section)
        return section
Finally, __str__ produces the string form of the builder. It simply joins everything in self.code together.
    def __str__(self):
        return "".join(str(c) for c in self.code)
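To see why a section is useful, here is a small standalone sketch. It repeats a condensed copy of the CodeBuilder methods described above so that it runs on its own; the function name `f` and the variable `x` are made up for the demonstration. Lines added to the section later still appear at the position where the section was created.

```python
class CodeBuilder:
    """Condensed copy of the class described above, for a standalone demo."""

    INDENT_STEP = 4

    def __init__(self, indent=0):
        self.code = []
        self.indent_level = indent

    def __str__(self):
        return "".join(str(c) for c in self.code)

    def add_line(self, line):
        self.code.extend([" " * self.indent_level, line, "\n"])

    def add_section(self):
        section = CodeBuilder(self.indent_level)
        self.code.append(section)
        return section

    def indent(self):
        self.indent_level += self.INDENT_STEP

    def dedent(self):
        self.indent_level -= self.INDENT_STEP


code = CodeBuilder()
code.add_line("def f():")
code.indent()
header = code.add_section()   # placeholder, still empty at this point
code.add_line("return x + 1")
header.add_line("x = 41")     # added later, but emitted before the return
code.dedent()

source = str(code)
print(source)
```

This is the same trick the engine relies on for vars_code: the variable-extraction lines are only known after the whole template has been parsed, yet they must appear at the top of the generated function.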
get_globals executes the Python code. It calls str(self), which invokes __str__, to get the complete Python source.
 
    def get_globals(self):
        """Execute the code, and return a dict of globals it defines."""
        # A check that the caller really finished all the blocks they started.
        assert self.indent_level == 0
        # Get the Python source as a single string.
        python_source = str(self)
        # Execute the source, defining globals, and return them.
        global_namespace = {}
        exec(python_source, global_namespace)
        return global_namespace
exec() runs the Python source and stores the names it defines in the global_namespace dictionary. Here is an example:
python_source = """\
SEVENTEEN = 17

def three():
    return 3
"""
global_namespace = {}
exec(python_source, global_namespace)
After this, global_namespace['SEVENTEEN'] is 17, and global_namespace['three'] is a function that returns 3.
Next, let's check what CodeBuilder produces:
code = CodeBuilder()
code.add_line("def render_function(context, do_dots):")
code.indent()
vars_code = code.add_section()
code.add_line("result = []")
code.add_line("if 'a' in result:")
code.indent()
code.add_line("pass")
code.dedent()
code.add_line("append_result = result.append")
code.add_line("extend_result = result.extend")
code.add_line("to_str = str")
print(code)
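Call indent each time the code needs to go one level deeper, then add lines with add_line. When the block is finished, call dedent to leave it, and carry on adding lines.
Output:
def render_function(context, do_dots):
    result = []
    if 'a' in result:
        pass
    append_result = result.append
    extend_result = result.extend
    to_str = str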
 
 
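Now for the template class:
First, __init__ takes text and *contexts. text is the template text, and *contexts are zero or more dictionaries of values supplied at construction time.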
    def __init__(self, text, *contexts):
        self.context = {}
        for context in contexts:
            self.context.update(context)
For example, with the initialization below, contexts is ({'upper': str.upper},) and its contents are merged into self.context:
Templite('''
            <h1>Hello {{name|upper}}!</h1>
            {% for topic in topics %}
                <p>You are interested in {{topic}}.</p>
            {% endfor %}
            ''',
            {'upper': str.upper},
        )
 
Two sets are defined: all_vars holds every variable name the template uses, and loop_vars holds the names defined by loops.
self.all_vars = set()
self.loop_vars = set()
Next, a code builder is instantiated and the start of the render_function function is written:
code = CodeBuilder()
code.add_line("def render_function(context, do_dots):")
code.indent()
vars_code = code.add_section()
code.add_line("result = []")
code.add_line("append_result = result.append")
code.add_line("extend_result = result.extend")
code.add_line("to_str = str")
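1. First, the function render_function is defined.
2. Then the code is indented and a section, vars_code, is created. Later on, the code that extracts variables from the context is added to this section.
3. Finally come four fixed lines, which define the result list and local shortcuts for the list's append and extend methods and for str.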
 
 
The buffered list and flush_output:
        buffered = []
        def flush_output():
            """Force `buffered` to the code builder."""
            if len(buffered) == 1:
                code.add_line("append_result(%s)" % buffered[0])
            elif len(buffered) > 1:
                code.add_line("extend_result([%s])" % ", ".join(buffered))
            del buffered[:]
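buffered holds the string expressions for template variables and literal page text that are waiting to be written out. The parsing code described later shows how tokens such as {{ and {% are handled. A token that starts with {{ is definitely an expression, so it is stored with buffered.append("to_str(%s)" % expr).
If a token starts with none of {{, {%, or {#, that is, it is neither an expression, a control tag, nor a comment, it can only be literal page text, and it is stored with buffered.append(repr(token)).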
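The buffering idea can be tried in isolation. In the sketch below a plain list named `lines` stands in for the CodeBuilder (an assumption made only to keep the example short), and the buffered items are made up. Several buffered pieces become a single extend_result call, one piece becomes an append_result call, and an empty buffer emits nothing.

```python
lines = []      # stand-in for the CodeBuilder: collects generated source lines
buffered = []   # string expressions waiting to be written out


def flush_output():
    """Emit one append_result/extend_result call for everything buffered."""
    if len(buffered) == 1:
        lines.append("append_result(%s)" % buffered[0])
    elif len(buffered) > 1:
        lines.append("extend_result([%s])" % ", ".join(buffered))
    del buffered[:]


# Literal text is stored as its repr; expressions are wrapped in to_str().
buffered.append(repr("<h1>Hello "))
buffered.append("to_str(c_name)")
buffered.append(repr("!</h1>"))
flush_output()

buffered.append(repr("<p>done</p>"))
flush_output()

flush_output()  # nothing buffered, so nothing is emitted

for line in lines:
    print(line)
```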
 
Now back to flush_output. When buffered holds exactly one item, the generated code uses append_result; when it holds more than one, it uses extend_result. Why the difference? Consider this example:
result = []
buffer = []
buffer.append('<h1>Hello')
buffer.append('!</h1>')
buffer.append('name')
result.append(', '.join(buffer))
del buffer[:]
print(result)
buffer.append('topic')
result.append(buffer)
print(result)
Output:
['<h1>Hello, !</h1>, name']
['<h1>Hello, !</h1>, name', ['topic']]
append adds its argument to the list as a single object, whereas extend adds each element of a sequence. That is why result.append(buffer) adds the nested list ['topic'].
If the last call is changed to result.extend(buffer),
the final result is ['<h1>Hello, !</h1>, name', 'topic'].
 
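Finally, look at where flush_output is called. flush_output is a closure. When the parser meets a token that starts with {% (a control tag such as if or for), it first calls flush_output to write out the code for whatever is waiting in buffered.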
 
 
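Now we reach the heart of the code, parsing the template text:
1. tokens = re.split(r"(?s)({{.*?}}|{%.*?%}|{#.*?#})", text) splits the template with a regular expression. Here (?s) is an inline flag for single-line mode (DOTALL): it changes the meaning of . so that it matches every character, including the newline \n.
Common regular expression modifiers:
(?i): case-insensitive matching.
(?s): single-line mode. Changes the meaning of . so that it matches every character, including the newline \n.
(?m): multi-line mode. Changes the meaning of ^ and $ so that they match at the start and end of every line, not only at the start and end of the whole string. (In this mode, $ matches the position before each \n as well as the end of the string.)
(?x): whitespace in the pattern is ignored unless it is escaped.
The remaining modifiers come from PHP's PCRE functions, where they are written after the closing delimiter (as in "/a/A"); they are not inline flags of Python's re module:
e: only affects the replacement, which is evaluated as PHP code.
A: the pattern must match at the start of the subject string. For example, "/a/A" matches "abcd".
E: the opposite of m; $ matches only at the absolute end of the string, not before a newline. This mode is on by default.
U: works much like the ? suffix; it sets the greediness of quantifiers.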
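A quick way to check what the inline flags supported by Python's re module do is to run the same pattern with and without them. The sample string below is made up for the demonstration.

```python
import re

text = "first line\nSecond line"

# (?s): without it "." stops at the newline, with it "." also matches "\n".
without_s = re.findall(r"first.*line", text)
with_s = re.findall(r"(?s)first.*line", text)

# (?m): "^" matches at the start of every line, not only the whole string.
without_m = re.findall(r"^\w+", text)
with_m = re.findall(r"(?m)^\w+", text)

# (?i): letter case is ignored.
with_i = re.findall(r"(?i)second", text)

print(without_s, with_s)
print(without_m, with_m)
print(with_i)
```

The template tokenizer needs (?s) because a tag such as a comment may span several lines, and without the flag .*? would never get past the first newline.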
The tokens separate the different kinds of content in the page. Take this template text, for example:
text="""
    <h1>Hello {{name|upper}}!</h1>
            {% for topic in topics %}
                <p>You are interested in {{topic}}.</p>
            {% endfor %}
            {'upper': str.upper}
"""
Splitting it gives the following tokens (surrounding whitespace omitted):
<h1>Hello
{{name|upper}}
!</h1>
{% for topic in topics %}
<p>You are interested in
{{topic}}
.</p>
{% endfor %}
{'upper': str.upper}
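The split can be reproduced with a few lines of standalone Python. The template string below is a shortened, made-up variant, and the `kind` helper is a hypothetical name used only to label each token. Because the pattern is wrapped in a capturing group, re.split keeps the tags in the result, and it yields empty strings where two tags touch or a tag ends the text.

```python
import re

text = (
    "<h1>Hello {{name|upper}}!</h1>"
    "{% for topic in topics %}<p>{{topic}}</p>{% endfor %}"
    "{# done #}"
)

tokens = re.split(r"(?s)({{.*?}}|{%.*?%}|{#.*?#})", text)


def kind(token):
    """Hypothetical helper: classify a token by its opening delimiter."""
    if token.startswith("{#"):
        return "comment"
    if token.startswith("{{"):
        return "expression"
    if token.startswith("{%"):
        return "tag"
    return "literal"


for token in tokens:
    if token:  # skip the empty strings produced between adjacent tags
        print("%-10s %r" % (kind(token), token))
```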
 
2. Each token is then handled according to its type:
 
(1) If it is a comment, simply ignore it:
            if token.startswith('{#'):
                # Comment: ignore it and move on.
                continue
 
(2) If it starts with {{, it is an expression. self._expr_code(token[2:-2].strip()) is called first to turn it into a Python expression:
            elif token.startswith('{{'):
                # An expression to evaluate.
                expr = self._expr_code(token[2:-2].strip())
                buffered.append("to_str(%s)" % expr)
A note on _expr_code. There are three forms to handle: {{name}}, {{user.name}}, and {{name|func}}. The first is simple: token[2:-2].strip() extracts the name, and the result is returned as c_name.
The {{user.name}} form is handled as follows:
        elif "." in expr:
            dots = expr.split(".")
            code = self._expr_code(dots[0])
            args = ", ".join(repr(d) for d in dots[1:])
            code = "do_dots(%s, %s)" % (code, args)
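The expression is first split on the dots, and the pieces are then passed to do_dots. There are two parts, code and args: for user.name, code is c_user and args is 'name'.
So what does do_dots actually do? Here is the code:
For each name in dots, it first tries to read the name as an attribute of value and uses the attribute if it exists. If it is not an attribute, value is usually a dictionary, so the name is looked up as a key. If the result is callable, it is called and replaced by its return value.
    def _do_dots(self, value, *dots):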
        for dot in dots:
            try:
                value = getattr(value, dot)
            except AttributeError:
                value = value[dot]
            if callable(value):
                value = value()
        return value
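The three cases (dictionary key, attribute, method) can be exercised with a standalone copy of the function. In the sketch below do_dots is written as a plain function rather than a method, and the `Product` class and the sample data are made up for the demonstration.

```python
def do_dots(value, *dots):
    """Standalone copy of _do_dots: resolve dotted names at run time."""
    for dot in dots:
        try:
            value = getattr(value, dot)
        except AttributeError:
            value = value[dot]
        if callable(value):
            value = value()
    return value


class Product:
    """Made-up class with one plain attribute and one method."""

    def __init__(self, name, price):
        self.name = name
        self._price = price

    def price(self):
        return self._price


context = {"product": Product("Fig", 1.5), "user": {"name": "Charlie"}}

print(do_dots(context["user"], "name"))      # dictionary key
print(do_dots(context["product"], "name"))   # attribute
print(do_dots(context["product"], "price"))  # method, called automatically
print(do_dots(context, "user", "name"))      # several dots in a row
```

One consequence of trying getattr first: a dictionary key that shares its name with a dict method, such as items, resolves to the method rather than to the key.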
 
The {{user|func}} form is handled as follows:
        if "|" in expr:
            pipes = expr.split("|")
            code = self._expr_code(pipes[0])
            for func in pipes[1:]:
                self._variable(func, self.all_vars)
                code = "c_%s(%s)" % (func, code)
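A pipe can be read as func(user), where func is the function name and user is the variable. The variable part is compiled first. If there are several pipes, each filter function is taken in turn, recorded in the all_vars set, and wrapped around the code built so far. For {{user|func}} the result is c_func(c_user). For {{user|func|display}} the result is c_display(c_func(c_user)).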
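Putting the three branches together, the translation from template expression to Python expression can be sketched as one recursive function. This is a simplified standalone version: it takes the set of names as a parameter and adds to it directly, instead of going through the class's _variable method, so any name checking done there is left out.

```python
def expr_code(expr, all_vars):
    """Translate a template expression into a Python expression string."""
    if "|" in expr:
        pipes = expr.split("|")
        code = expr_code(pipes[0], all_vars)
        for func in pipes[1:]:
            all_vars.add(func)  # filters are looked up in the context too
            code = "c_%s(%s)" % (func, code)
    elif "." in expr:
        dots = expr.split(".")
        code = expr_code(dots[0], all_vars)
        args = ", ".join(repr(d) for d in dots[1:])
        code = "do_dots(%s, %s)" % (code, args)
    else:
        all_vars.add(expr)
        code = "c_%s" % expr
    return code


names = set()
for expr in ["name", "user.name", "user|func|display", "product.price|format_price"]:
    print("%-28s -> %s" % (expr, expr_code(expr, names)))
print(sorted(names))
```

Pipes are checked before dots, so in product.price|format_price the dotted lookup ends up inside the filter call.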
 
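(3) If it starts with {%, it is a control tag (a conditional or a loop). flush_output() is called first to write out whatever is buffered.
For an if tag, 'if' is pushed onto the ops_stack stack, then the if statement is emitted and the code is indented: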
                if words[0] == 'if':
                    # An if statement: evaluate the expression to determine if.
                    if len(words) != 2:
                        self._syntax_error("Don't understand if", token)
                    ops_stack.append('if')
                    code.add_line("if %s:" % self._expr_code(words[1]))
                    code.indent()
 
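For a for tag, 'for' is pushed onto ops_stack, then self._variable(words[1], self.loop_vars) records the loop variable in loop_vars, while the object being iterated over is recorded in all_vars. Finally the for statement is emitted and the code is indented. For example, with for topic in topics, topic goes into loop_vars and topics goes into all_vars: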
                elif words[0] == 'for':
                    # A loop: iterate over expression result.
                    if len(words) != 4 or words[2] != 'in':
                        self._syntax_error("Don't understand for", token)
                    ops_stack.append('for')
                    self._variable(words[1], self.loop_vars)
                    code.add_line(
                        "for c_%s in %s:" % (
                            words[1],
                            self._expr_code(words[3])
                        )
                    )
                    code.indent()
 
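An end tag closes a control structure. end_what = words[0][3:] determines whether an if or a for is being closed. start_what = ops_stack.pop() then pops the most recently opened tag; if it does not match this end tag, an error is raised. Finally code.dedent() closes the indented block: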
                elif words[0].startswith('end'):
                    # Endsomething.  Pop the ops stack.
                    if len(words) != 1:
                        self._syntax_error("Don't understand end", token)
                    end_what = words[0][3:]
                    if not ops_stack:
                        self._syntax_error("Too many ends", token)
                    start_what = ops_stack.pop()
                    if start_what != end_what:
                        self._syntax_error("Mismatched end tag", end_what)
                    code.dedent()
                else:
                    self._syntax_error("Don't understand tag", words[0])
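The stack discipline can be isolated from the code generation. The sketch below is a simplified standalone checker, not the engine's code: `check_tags` is a hypothetical name, it receives the tag contents without the {% %} delimiters, and it returns a message instead of raising. The final check for tags that were never closed is an assumption added here; the excerpt above does not show how the engine handles that case.

```python
def check_tags(tags):
    """Return None if the tags nest correctly, else an error message."""
    ops_stack = []
    for tag in tags:
        word = tag.split()[0]
        if word in ("if", "for"):
            ops_stack.append(word)
        elif word.startswith("end"):
            end_what = word[3:]
            if not ops_stack:
                return "Too many ends: %r" % tag
            start_what = ops_stack.pop()
            if start_what != end_what:
                return "Mismatched end tag: %r" % end_what
        else:
            return "Don't understand tag: %r" % word
    if ops_stack:
        # Assumption: report tags that were opened but never closed.
        return "Unmatched action tag: %r" % ops_stack[-1]
    return None


print(check_tags(["for topic in topics", "if topic", "endif", "endfor"]))
print(check_tags(["for topic in topics", "endif"]))
print(check_tags(["endfor"]))
print(check_tags(["if x"]))
```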
 
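At this point parsing is complete, and every variable name is recorded in all_vars or loop_vars. Now the variables defined outside loops need to be extracted.
Take this template, for example: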
<h1>Hello {{name|upper}}!</h1>
                {% for topic in topics %}
                    <p>You are interested in {{topic}}.</p>
                {% endfor %}
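all_vars contains {'topic', 'upper', 'name', 'topics'}.
loop_vars contains {'topic'}.
topic is defined by the loop itself, so iterating over self.all_vars - self.loop_vars yields exactly the variables that come from outside the loop. Code that extracts each of them is then added to the vars_code section created earlier.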
        for var_name in self.all_vars - self.loop_vars:
            vars_code.add_line("c_%s = context[%r]" % (var_name, var_name))
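The same set difference can be tried on its own with the names from the example. The call to sorted() is an addition made here only to give the output a stable order; the engine's loop iterates over the set directly, so the order of the generated lines may vary between runs.

```python
all_vars = {"topic", "upper", "name", "topics"}
loop_vars = {"topic"}

# Names that must be read from the context: everything except loop variables.
context_vars = sorted(all_vars - loop_vars)

lines = ["c_%s = context[%r]" % (name, name) for name in context_vars]
for line in lines:
    print(line)
```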
 
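At the end of __init__, the return statement is added and the indentation is closed. _render_function is then set to code.get_globals()['render_function'], which is the render_function function object, for example <function render_function at 0x7f4eb1632510>.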
 
code.add_line("return ''.join(result)")
code.dedent()
self._render_function = code.get_globals()['render_function']
 
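The last step is the render method. It copies self.context, updates the copy with the values passed in, and returns the result of calling the compiled function with that context.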
    def render(self, context=None):
        render_context = dict(self.context)
        if context:
            render_context.update(context)
        return self._render_function(render_context, self._do_dots)
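The copy-then-update step is worth a closer look, because it keeps the values given at construction time untouched between calls. Below is a minimal standalone sketch of just that step; `build_render_context` is a hypothetical helper name and the sample values are made up.

```python
base_context = {"upper": str.upper, "name": "default"}


def build_render_context(base, context=None):
    """Copy the base context, then overlay the per-render values."""
    render_context = dict(base)
    if context:
        render_context.update(context)
    return render_context


merged = build_render_context(base_context, {"name": "Ned"})

print(merged["name"])        # the per-render value takes precedence
print(base_context["name"])  # the base context itself is not modified
```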
 
For example, in the call below, {'upper': str.upper} is first merged into self.context in __init__:
templite = Templite('''
                <h1>Hello {{name|upper}}!</h1>
                {% for topic in topics %}
                    <p>You are interested in {{topic}}.</p>
                {% endfor %}
                ''',
                        {'upper': str.upper},
                        )
When templite.render is then called, 'name': "Ned" and 'topics': ['Python', 'Geometry', 'Juggling'] are merged into a copy of self.context, which is finally passed to render_function:
text = templite.render({
    'name': "Ned",
    'topics': ['Python', 'Geometry', 'Juggling'],
})
 
That completes the code. Let's run it and look at the result:
class var_init(object):
    def __init__(self):
        self.value=1
 
if __name__=="__main__":
    v=var_init()
    templite = Templite('''
                <h1>Hello {{name|upper}}!</h1>
                {% for topic in topics %}
                    <p>You are interested in {{topic}}.</p>
                {% endfor %}
                {% if para.value %}
                    <p>it is true.</p>
                {% endif %}    
                ''',
                        {'upper': str.upper},
                        )
    text = templite.render({
        'name': "Ned",
        'topics': ['Python', 'Geometry', 'Juggling'],'para':v
    })
    print(text)
Output:
The generated code, printed:
def render_function(context, do_dots):
    c_topics = context['topics']
    c_upper = context['upper']
    c_para = context['para']
    c_name = context['name']
    result = []
    append_result = result.append
    extend_result = result.extend
    to_str = str
    extend_result(['\n                <h1>Hello ', to_str(c_upper(c_name)), '!</h1>\n                '])
    for c_topic in c_topics:
        extend_result(['\n                    <p>You are interested in ', to_str(c_topic), '.</p>\n                '])
    append_result('\n                ')
    if do_dots(c_para, 'value'):
        append_result('\n                    <p>it is true.</p>\n                ')
    append_result('    \n                ')
    print(result)
    return ''.join(result)
The rendered page, printed:
                <h1>Hello NED!</h1>
                
                    <p>You are interested in Python.</p>
                
                    <p>You are interested in Geometry.</p>
                
                    <p>You are interested in Juggling.</p>
                    <p>it is true.</p>