Python 3, Unicode, UTF-8, and Encoding

Date: 2019-11-29


# A literal with the u prefix; in Python 3 the prefix is optional because every str is Unicode.
text = u'你好，今天天气不错'
text
print(text)

# \uXXXX escapes in an ordinary string literal are resolved when the source is parsed.
text = '\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519'
text
print(text)

# The same literal with the u prefix; the result is identical.
text = u'\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519'
text
print(text)

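# Doubled backslashes keep each escape as literal text: a backslash, 'u', and four hex digits.
text = '\\u4f60\\u597d\\uff0c\\u4eca\\u5929\\u5929\\u6c14\\u4e0d\\u9519'
text
print(text)
# The string is pure ASCII, so encoding it to bytes and decoding with unicode_escape resolves the escapes.
text = text.encode('utf-8').decode('unicode_escape')
text
print(text)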

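# Escapes mixed with literal Chinese characters: decode only the matched \uXXXX sequences
# so that the characters already present are left untouched.
text = '\\u4f60\\u597d\\uff0c今天天气不错'
text
print(text)
import re
text = re.sub(r'(\\u[0-9a-fA-F]{4})', lambda matched: matched.group(1).encode('utf-8').decode('unicode_escape'), text)
text
print(text)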

The code above was run in the interpreter; the results are as follows:

>>> text = u'你好，今天天气不错'
>>> text
'你好，今天天气不错'
>>> print(text)
你好，今天天气不错

>>> text = '\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519'
>>> text
'你好，今天天气不错'
>>> print(text)
你好，今天天气不错

>>> text = u'\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519'
>>> text
'你好，今天天气不错'
>>> print(text)
你好，今天天气不错

>>> text = '\\u4f60\\u597d\\uff0c\\u4eca\\u5929\\u5929\\u6c14\\u4e0d\\u9519'
>>> text
'\\u4f60\\u597d\\uff0c\\u4eca\\u5929\\u5929\\u6c14\\u4e0d\\u9519'
>>> print(text)
\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519
>>> text = text.encode('utf-8').decode('unicode_escape')
>>> text
'你好，今天天气不错'
>>> print(text)
你好，今天天气不错

>>> text = '\\u4f60\\u597d\\uff0c今天天气不错'
>>> text
'\\u4f60\\u597d\\uff0c今天天气不错'
>>> print(text)
\u4f60\u597d\uff0c今天天气不错
>>> import re
>>> text = re.sub(r'(\\u[0-9a-fA-F]{4})', lambda matched: matched.group(1).encode('utf-8').decode('unicode_escape'), text)
>>> text
'你好，今天天气不错'
>>> print(text)
你好，今天天气不错
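
The last example decodes only the matched escapes instead of the whole string. A minimal sketch of why that matters is below. The helper name `decode_unicode_escapes` is hypothetical and not from the original post, and it swaps the encode/decode round trip for `chr(int(..., 16))`, which should give the same result for four-digit escapes. The sketch also shows what seems to go wrong when the whole mixed string is passed through `unicode_escape`: that codec reads non-escape bytes as Latin-1, so the UTF-8 bytes of the literal Chinese characters come back garbled.

```python
import re

# Matches one literal \uXXXX escape: a backslash, 'u', and four hex digits.
_ESCAPE = re.compile(r'\\u([0-9a-fA-F]{4})')


def decode_unicode_escapes(text):
    """Replace literal \\uXXXX sequences with the characters they name.

    Hypothetical helper for illustration. Characters that are not part of
    an escape are returned unchanged.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


mixed = '\\u4f60\\u597d\\uff0c今天天气不错'

# Decoding only the matched escapes keeps the literal characters intact.
fixed = decode_unicode_escapes(mixed)
print(fixed)           # 你好，今天天气不错

# Decoding the whole mixed string resolves the escapes but garbles the rest,
# because the UTF-8 bytes of the literal characters are read as Latin-1.
naive = mixed.encode('utf-8').decode('unicode_escape')
print(naive == fixed)  # False
```

This sketch does not handle escaped surrogate pairs (two consecutive `\uXXXX` escapes that together encode one character outside the Basic Multilingual Plane); those would need an extra step.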