String formatting: % vs. .format

Date: 2019-12-18

Tags: string formatting, format


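Python 2.6 introduced the str.format() method, whose syntax differs slightly from the existing % operator. Which is better, and in which situations?

The following uses each method and produces the same result, so what is the difference?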

#!/usr/bin/python
sub1 = "python string!"
sub2 = "an arg"

a = "i am a %s" % sub1
b = "i am a {0}".format(sub1)

c = "with %(kwarg)s!" % {'kwarg':sub2}
d = "with {kwarg}!".format(kwarg=sub2)

print a    # "i am a python string!"
print b    # "i am a python string!"
print c    # "with an arg!"
print d    # "with an arg!"

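Furthermore, when does string formatting actually occur in Python? For example, if my logging level is set to HIGH, will I still take a hit for performing the following % operation? And if so, is there a way to avoid this?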
```
log.debug("some debug info: %s" % some_info)
```

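Reply #1

Be careful, though: while trying to replace every % with .format in existing code, I discovered one issue. '{}'.format(unicode_string) will try to encode unicode_string and will probably fail.

Just look at this Python interactive session log: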

Python 2.7.2 (default, Aug 27 2012, 19:52:55) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
; s='й'
; u=u'й'
; s
'\xd0\xb9'
; u
u'\u0439'

s is just a string (called a "byte array" in Python 3) and u is a Unicode string (called a "string" in Python 3):

; '%s' % s
'\xd0\xb9'
; '%s' % u
u'\u0439'

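As the % examples above show, when you give a Unicode object as a parameter to the % operator, it produces a Unicode string even if the original string was not Unicode. The same inputs passed to .format behave differently: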

; '{}'.format(s)
'\xd0\xb9'
; '{}'.format(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0439' in position 0: ordinal not in range(256)

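As the traceback above shows, the .format function raises UnicodeEncodeError in that case. With a Unicode format string, the call succeeds: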

; u'{}'.format(s)
u'\xd0\xb9'
; u'{}'.format(u)
u'\u0439'

That is, .format works with a Unicode argument only if the original string is itself Unicode.

; '{}'.format(u'i')
'i'

It also works if the argument string can be converted to a string (the so-called "byte array"), as with u'i' above.
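The session above is Python 2. As a rough sketch of how the same mix of types behaves on Python 3 (an assumption about the reader's interpreter, not part of the original reply), neither % nor .format raises there; both silently embed the repr of the bytes object, so an explicit decode is needed. The variable names mirror the session above.

```python
# Python 3 sketch: bytes and str are distinct types and never mix implicitly.
s = '\u0439'.encode('utf-8')   # bytes, the Python 3 "byte array"
u = '\u0439'                   # str, the Python 3 "string"

# Neither operator raises; both fall back to the repr of the bytes object.
percent_result = '%s' % s
format_result = '{}'.format(s)

# Decoding explicitly gives the intended text with either style.
decoded_percent = '%s' % s.decode('utf-8')
decoded_format = '{}'.format(s.decode('utf-8'))

print(percent_result)
print(format_result)
print(decoded_percent == u, decoded_format == u)
```

On Python 3 the choice between % and .format therefore does not change the outcome here; what matters is decoding bytes before formatting.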

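Reply #2

As I discovered today, the old way of formatting strings via % does not support Decimal (Python's module for decimal fixed-point and floating-point arithmetic) out of the box: the value is converted to a binary float before it is formatted.

Example (using Python 3.3.5):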

#!/usr/bin/env python3

from decimal import *

getcontext().prec = 50
d = Decimal('3.12375239e-24') # no magic number, I rather produced it by banging my head on my keyboard

print('%.50f' % d)
print('{0:.50f}'.format(d))

Output:

0.00000000000000000000000312375239000000009907464850
0.00000000000000000000000312375239000000000000000000

There may well be workarounds, but you might still consider using the format() method right away.
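A minimal sketch of why the two lines differ, assuming Python 3: the % operator turns the Decimal into a binary float before formatting, whereas str.format passes the format spec to the Decimal itself. The variable names `via_percent` and `via_format` are illustrative only.

```python
from decimal import Decimal

d = Decimal('3.12375239e-24')

# %-formatting with 'f' converts the Decimal to a float first, so the
# trailing digits come from the nearest binary float, not from d itself.
via_percent = '%.50f' % d

# str.format delegates to Decimal.__format__, which keeps the exact value.
via_format = '{0:.50f}'.format(d)

print(via_percent)
print(via_format)
print(via_percent == via_format)
```

If % has to be used, formatting the Decimal separately (for example with format(d, '.50f')) and inserting the result with %s is one possible workaround.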

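Reply #3

As a side note, you do not have to take a performance hit to use new-style formatting with logging. You can pass any object that implements the __str__ magic method to logging.debug, logging.info, and so on. When the logging module has decided that it must emit your message object (whatever it is), it calls str(message_object) before doing so. So you could do something like this: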

import logging


class NewStyleLogMessage(object):
    def __init__(self, message, *args, **kwargs):
        self.message = message
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        args = (i() if callable(i) else i for i in self.args)
        kwargs = dict((k, v() if callable(v) else v) for k, v in self.kwargs.items())

        return self.message.format(*args, **kwargs)

N = NewStyleLogMessage

# Neither one of these messages are formatted (or calculated) until they're
# needed

# Emits "Lazily formatted log entry: 123 foo" in log
logging.debug(N('Lazily formatted log entry: {0} {keyword}', 123, keyword='foo'))


def expensive_func():
    # Do something that takes a long time...
    return 'foo'

# Emits "Expensive log entry: foo" in log
logging.debug(N('Expensive log entry: {keyword}', keyword=expensive_func))

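This is all described in the Python 3 documentation ( https://docs.python.org/3/howto/logging-cookbook.html#formatting-styles ). However, it works with Python 2.6 as well ( https://docs.python.org/2.6/library/logging.html#using-arbitrary-objects-as-messages ).

One advantage of this technique, besides being agnostic about formatting style, is that it allows lazy values, such as the function expensive_func above. This provides a more elegant alternative to the advice given in the Python docs: https://docs.python.org/2.6/library/logging.html#optimization .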
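For comparison, here is a small sketch of the two standard-library options that the question and the optimization advice point at: passing arguments to the logging call instead of applying % yourself, and guarding with isEnabledFor. The logger name and the `Tracked` helper are hypothetical and exist only to make the deferred formatting observable.

```python
import logging

log = logging.getLogger("format_demo")  # hypothetical logger name
log.setLevel(logging.INFO)              # DEBUG records are discarded


class Tracked(object):
    """Hypothetical helper that counts how often it is turned into a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "some info"


some_info = Tracked()

# Eager: the % operation runs before log.debug() is even called.
log.debug("some debug info: %s" % some_info)
eager_calls = some_info.str_calls

# Deferred: the logging module merges the argument into the message
# only if the record is actually going to be emitted.
log.debug("some debug info: %s", some_info)
deferred_calls = some_info.str_calls - eager_calls

# Guard: skip even building the arguments when DEBUG is disabled.
if log.isEnabledFor(logging.DEBUG):
    log.debug("expensive debug info: %s", Tracked())

print(eager_calls, deferred_calls)  # prints: 1 0
```

The eager call pays for the string conversion even though nothing is logged, while the deferred call does not.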

Reply #4

One more advantage of .format (which I do not see in the other answers): it can take object attributes.

In [12]: class A(object):
   ....:     def __init__(self, x, y):
   ....:         self.x = x
   ....:         self.y = y
   ....:         

In [13]: a = A(2,3)

In [14]: 'x is {0.x}, y is {0.y}'.format(a)
Out[14]: 'x is 2, y is 3'

Or, as a keyword argument:

In [15]: 'x is {a.x}, y is {a.y}'.format(a=a)
Out[15]: 'x is 2, y is 3'

As far as I know, this is not possible with %.
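As a brief sketch, the same replacement-field syntax also reaches into sequences and dictionaries, while with % the attributes have to be pulled out by the caller. The `Point` class and the sample values are hypothetical.

```python
class Point(object):  # hypothetical example class
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(2, 3)
coords = [10, 20]
settings = {'unit': 'cm'}

# Attribute access inside the replacement field.
by_attr = 'x is {p.x}, y is {p.y}'.format(p=p)

# Index access into a sequence.
by_index = 'first is {0[0]}, second is {0[1]}'.format(coords)

# Key access into a dict (the key is written without quotes).
by_key = 'unit is {s[unit]}'.format(s=settings)

# With %, the caller extracts the attributes explicitly.
by_percent = 'x is %s, y is %s' % (p.x, p.y)

print(by_attr)
print(by_index)
print(by_key)
print(by_percent)
```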

Reply #5

One situation where % may help is when you are formatting regular expressions. For example,

'{type_names} [a-z]{2}'.format(type_names='triangle|square')

raises IndexError. In this situation, you can use:

'%(type_names)s [a-z]{2}' % {'type_names': 'triangle|square'}

This avoids writing the regex as '{type_names} [a-z]{{2}}'. This can be useful when you have two regexes, where one is used alone without formatting, but the concatenation of both is formatted.
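A short sketch putting the three variants side by side: the failing call, the doubled-brace version, and the % version. The variable names are illustrative only.

```python
import re

type_names = 'triangle|square'

# Unescaped braces: '{2}' is read as positional argument 2, which is missing.
failure = None
try:
    '{type_names} [a-z]{2}'.format(type_names=type_names)
except IndexError as exc:
    failure = exc

# With .format, literal braces must be doubled.
pattern_format = '{type_names} [a-z]{{2}}'.format(type_names=type_names)

# With %, braces are ordinary characters and need no escaping.
pattern_percent = '%(type_names)s [a-z]{2}' % {'type_names': type_names}

print(pattern_format == pattern_percent)
print(bool(re.match(pattern_percent, 'square ab')))
```

The trade-off runs the other way for patterns that contain a literal %, which would have to be written as %% when using the % operator.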
