html = html.encode('gbk') still raises an error:
'gbk' codec can't encode character u'\u2f45' in position 392: illegal multibyte sequence
html = html.encode('gbk', 'ignore') fixes it (characters that GBK cannot represent are silently dropped).
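A minimal sketch of the difference, assuming a short made-up string in place of the real page (the sample text is hypothetical; only the character U+2F45 comes from the error message above). It also shows the 'replace' handler as an alternative, which keeps a visible '?' where a character was lost:

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text; U+2F45 is the character named in the error message.
html = u"abc\u2f45def"

# Default (strict) error handling raises UnicodeEncodeError.
try:
    html.encode("gbk")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'ignore' drops characters that GBK cannot represent.
ignored = html.encode("gbk", "ignore")

# 'replace' substitutes '?' for them instead of dropping them.
replaced = html.encode("gbk", "replace")
```

'ignore' is fine when the lost characters do not matter; 'replace' may be preferable if you want to see where something was dropped.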