encode: to encode (str to bytes); decode: to decode (bytes to str); encoding: the encoding format
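# -*- coding: utf-8 -*-
import chardet  # used to inspect the encoding

# Read the file as raw bytes so that no decoding happens implicitly.
with open("d:/kk/kk.txt", "rb") as temp:
    tem = temp.read()

print(tem)                  # the raw bytes
print(chardet.detect(tem))  # chardet's guess for the raw bytes

text = tem.decode(encoding="gb2312")  # bytes -> str
print(text)

utf8_bytes = text.encode("utf-8")     # str -> UTF-8 bytes
print(utf8_bytes)
print(chardet.detect(utf8_bytes))     # chardet's guess after re-encoding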
Result:
b'\xc4\xe3\xba\xc3\r\n\xc4\xe3\xba\xc3\r\n\xc4\xe3\xba\xc3'
{'encoding': 'TIS-620', 'confidence': 0.3598212120361634, 'language': 'Thai'}
你好
你好
你好
b'\xe4\xbd\xa0\xe5\xa5\xbd\r\n\xe4\xbd\xa0\xe5\xa5\xbd\r\n\xe4\xbd\xa0\xe5\xa5\xbd'
{'encoding': 'utf-8', 'confidence': 0.99, 'language': ''}
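Note that chardet's first guess in the result above (TIS-620, confidence about 0.36) is wrong for this file, which is why the decode call names gb2312 explicitly. The same decode-then-encode round trip can be sketched without a file on disk. The helper name `transcode` and its default arguments are made up for this illustration and are not part of chardet or the standard library:

```python
def transcode(data: bytes, source: str = "gb2312", target: str = "utf-8") -> bytes:
    """Decode `data` from `source`, then re-encode the text as `target`.

    Hypothetical helper for illustration only.
    """
    return data.decode(source).encode(target)


# Build GB2312 bytes in memory instead of reading them from a file.
gb_bytes = "你好".encode("gb2312")
utf8_bytes = transcode(gb_bytes)

print(gb_bytes)    # b'\xc4\xe3\xba\xc3'
print(utf8_bytes)  # b'\xe4\xbd\xa0\xe5\xa5\xbd'
```

The two printed values match the per-line byte sequences in the result above: the text is the same, only its byte representation changes.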
Other encoding conversions:
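# aa holds literal \uXXXX escape sequences as plain ASCII text, not the characters themselves.
aa = "\\u672c\\u7248\\u672c\\u5185\\u4e0d\\u652f\\u6301\\u7684\\u63a5\\u53e3\\u6216\\u8005\\u63a5\\u53e3\\u5df2\\u7ecf\\u88ab\\u5e9f\\u5f03\\uff0c\\u8bf7\\u53c2\\u8003\\u8be5\\u63a5\\u53e3\\u7684\\u6587\\u6863\\u3002"
print(type(aa))  # <class 'str'>
# Encode to bytes first, then let the unicode_escape codec expand the escapes.
print(aa.encode('utf-8').decode('unicode_escape'))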
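One caveat: the unicode_escape codec treats every non-escape byte as Latin-1, so the approach above is only safe when the string is pure ASCII apart from the escapes. If the text may already contain real non-ASCII characters, a regex substitution is one possible alternative. This is a minimal sketch; the name `unescape_unicode` is made up for this illustration, and it only handles four-digit \uXXXX escapes (surrogate pairs are not combined):

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape_unicode(text: str) -> str:
    """Replace each literal \\uXXXX sequence with the character it names.

    Hypothetical helper for illustration only.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Mixed input: two escapes followed by characters that are already decoded.
sample = "\\u4f60\\u597d, 世界"
print(unescape_unicode(sample))  # 你好, 世界
```

Characters that are already decoded pass through untouched, because the pattern only matches the escape sequences themselves.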