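As we know, Python 2 has two kinds of strings: ordinary byte strings (str) and unicode strings. Strings read from a database are usually converted to unicode strings automatically.
Back to the main point. The usual way to call json.dumps is:
>>> import json
>>> obj={"name":"测试"}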
>>> json.dumps(obj)
'{"name": "\\u6d4b\\u8bd5"}'
>>> print json.dumps(obj)
{"name": "\u6d4b\u8bd5"}
>>> json.dumps(obj).encode("utf-8")
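'{"name": "\\u6d4b\\u8bd5"}'
As you can see, the result is an ordinary str, but its content is the escaped form of the unicode string. Calling encode("utf-8") on the result changes nothing, because the string already consists solely of ASCII escape sequences.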
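For comparison, here is a minimal sketch of the same behaviour, written for Python 3 (an assumption; the session above runs on Python 2.6). There every str is unicode, and the default ensure_ascii=True still produces a pure-ASCII result that json.loads turns back into the original characters:

```python
import json

obj = {"name": "测试"}

# With the default ensure_ascii=True, non-ASCII characters are
# written as \uXXXX escape sequences.
escaped = json.dumps(obj)

# Every character in the result is ASCII, which is why encoding it
# to UTF-8 does not change what you see.
is_ascii = all(ord(ch) < 128 for ch in escaped)

# Parsing the escaped text recovers the original characters.
restored = json.loads(escaped)

print(escaped)
print(is_ascii)
print(restored == obj)
```

The escapes are only a representation: any JSON parser that reads this output gets the real characters back.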
To get the actual characters in the output, pass ensure_ascii=False (the default is True):
>>> json.dumps(obj,ensure_ascii=False)
'{"name": "\xe6\xb5\x8b\xe8\xaf\x95"}'
>>> print json.dumps(obj,ensure_ascii=False)
{"name": "测试"}
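A rough Python 3 counterpart of the example above (Python 3 is an assumption, the original session is Python 2.6). With ensure_ascii=False the result is a unicode str, so getting UTF-8 bytes for a file or socket takes an explicit encode step:

```python
import json

obj = {"name": "测试"}

# ensure_ascii=False keeps non-ASCII characters unescaped in the result.
text = json.dumps(obj, ensure_ascii=False)

# On Python 3 the result is text, not bytes; encode it explicitly
# when UTF-8 bytes are needed.
payload = text.encode("utf-8")

print(text)
print(payload)
```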
Pitfall: try the following usage (for example, a key read from a database will be a unicode string):
>>> key=u"name"
>>> obj={key:"测试"}
>>> json.dumps(obj,ensure_ascii=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/json/__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "/usr/lib64/python2.6/json/encoder.py", line 368, in encode
    return ''.join(chunks)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 1: ordinal not in range(128)
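This happens because keys and values must not mix ordinary strings and unicode strings. With ensure_ascii=False the encoded pieces are joined as they are, so joining the unicode key with the UTF-8 byte string value forces an implicit ASCII decode of those bytes, which fails on byte 0xe6.
Changing it as follows works (everything is an ordinary string, or everything is a unicode string):
>>> key=u"name"
>>> obj={key:u"测试"}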
>>> json.dumps(obj,ensure_ascii=False)
u'{"name": "\u6d4b\u8bd5"}'
>>> obj={key.encode("utf-8"):u"测试".encode("utf-8")}
>>> json.dumps(obj,ensure_ascii=False)
'{"name": "\xe6\xb5\x8b\xe8\xaf\x95"}'
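One way to avoid mixing the two types is to normalize everything to text before calling json.dumps. The sketch below assumes Python 3, where the mix shows up as bytes versus str, and assumes the byte strings are UTF-8. The helper name to_text is hypothetical and not part of the json module:

```python
import json


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: recursively decode byte strings to text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, dict):
        # Normalize both keys and values.
        return {to_text(k, encoding): to_text(v, encoding)
                for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_text(item, encoding) for item in value]
    # Numbers, booleans, None and text pass through unchanged.
    return value


# A text key paired with a UTF-8 encoded byte string value.
mixed = {u"name": u"测试".encode("utf-8")}

result = json.dumps(to_text(mixed), ensure_ascii=False)
print(result)
```

Normalizing once at the boundary, right after reading from the database, keeps the rest of the code free of per-call encode and decode steps.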
Another parameter worth mentioning is default.
Consider the following scenario:
>>> class Data:
...     def __init__(self):
...         self.name = ""
...         self.detail = ""
...
>>> data=Data()
>>> data.name="名字"
>>> data.detail="细节"
>>> obj={"data":data}
>>> json.dumps(obj,ensure_ascii=False)
This raises the following exception:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/json/__init__.py", line 237, in dumps
    **kw).encode(obj)
  File "/usr/lib64/python2.6/json/encoder.py", line 367, in encode
    chunks = list(self.iterencode(o))
  File "/usr/lib64/python2.6/json/encoder.py", line 309, in _iterencode
    for chunk in self._iterencode_dict(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 275, in _iterencode_dict
    for chunk in self._iterencode(value, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 317, in _iterencode
    for chunk in self._iterencode_default(o, markers):
  File "/usr/lib64/python2.6/json/encoder.py", line 323, in _iterencode_default
    newobj = self.default(o)
  File "/usr/lib64/python2.6/json/encoder.py", line 344, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <__main__.Data instance at 0x11e87e8> is not JSON serializable
This is because json.dumps does not know how to serialize a Data object. You need to define a conversion function and pass it as the default parameter:
>>> def convert_to_builtin_type(obj):
...     d = {}
...     d.update(obj.__dict__)
...     return d
...
>>> json.dumps(obj,ensure_ascii=False, default=convert_to_builtin_type)
'{"data": {"name": "\xe5\x90\x8d\xe5\xad\x97", "detail": "\xe7\xbb\x86\xe8\x8a\x82"}}'
>>> print json.dumps(obj,ensure_ascii=False, default=convert_to_builtin_type)
{"data": {"name": "名字", "detail": "细节"}}
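A slightly more defensive variant of the same idea, sketched for Python 3 (an assumption; the session above is Python 2.6). The function only converts the types it knows about and raises TypeError for anything else, which is what json.dumps expects from a default function. The constructor arguments and the use of sort_keys are additions for illustration:

```python
import json


class Data(object):
    def __init__(self, name="", detail=""):
        self.name = name
        self.detail = detail


def convert_to_builtin_type(obj):
    # Convert only known types; copy the attribute dict so the
    # caller cannot modify the object through the returned value.
    if isinstance(obj, Data):
        return dict(obj.__dict__)
    # Anything else is reported the same way the json module does.
    raise TypeError(repr(obj) + " is not JSON serializable")


obj = {"data": Data("名字", "细节")}

# sort_keys=True makes the key order deterministic.
text = json.dumps(obj, ensure_ascii=False,
                  default=convert_to_builtin_type, sort_keys=True)
print(text)
```

Returning obj.__dict__ for every object would also expose attributes of types you did not intend to serialize, so checking the type first is the safer choice.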