Yesterday I was writing a web-scraping program, and a request built with Jsoup.connect(String url) failed with java.nio.charset.IllegalCharsetNameException: utf-8.
The full error was:
java.nio.charset.IllegalCharsetNameException: UTF-8
at java.nio.charset.Charset.checkName(Charset.java:284)
at java.nio.charset.Charset.lookup2(Charset.java:458)
at java.nio.charset.Charset.lookup(Charset.java:437)
at java.nio.charset.Charset.isSupported(Charset.java:476)
at org.jsoup.helper.DataUtil.getCharsetFromContentType(DataUtil.java:132)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:448)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:393)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:159)
at org.jsoup.helper.HttpConnection.post(HttpConnection.java:154)
at domain.Working.main(Working.java:18)
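The trace shows the exception coming out of Charset.isSupported, which jsoup calls on the charset name it reads from the Content-Type header. The sketch below only illustrates how that JDK method behaves. The malformed names in it (quotes, a trailing semicolon, a leading space) are assumptions for illustration; the post does not show what the server actually sent. CharsetNameCheck and isLegalName are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

class CharsetNameCheck {
    // Returns true if the JVM accepts the name as syntactically legal,
    // false if Charset.isSupported rejects the name itself.
    static boolean isLegalName(String name) {
        try {
            Charset.isSupported(name); // throws when the name contains illegal characters
            return true;
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed examples of what a sloppy Content-Type header might contain.
        String[] candidates = { "UTF-8", "utf-8", "\"UTF-8\"", "UTF-8;", " UTF-8" };
        for (String name : candidates) {
            System.out.println("[" + name + "] legal name: " + isLegalName(name));
        }
    }
}
```

If the header carried something like the last three candidates, the message would still print as UTF-8 plus characters that are easy to miss, which might explain why changing editor encodings makes no difference.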
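At first this error made no sense to me. I assumed it was an encoding problem on the SDK side, so I changed the editor encoding and Eclipse's default encoding and tried pretty much every encoding there was. The error did not change at all. Then I asked an expert, who said: add a request header. So I added something like this to the request:
Accept-Language: zh-CN,zh;q=0.8
The result was exactly the same. After all sorts of detours and nearly three hours of fiddling that afternoon, I finally went and checked the API docs and found this method:
parse(InputStream in, String charsetName, String baseUri)
I changed the code to: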
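// Open the connection with the JDK instead of Jsoup.connect
url = new URL(htmlUrl);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
InputStream is = connection.getInputStream();
// Pass the charset explicitly so jsoup does not take it from the Content-Type header
doc = Jsoup.parse(is, "utf-8", htmlUrl);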
Problem solved -_-...
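Hard-coding "utf-8" works when the page really is UTF-8. As a minimal sketch of a slightly more general variant, the helper below tries to pull a usable charset name out of a Content-Type value and falls back to a default when the name is missing, unsupported, or malformed. SafeCharset and fromContentType are hypothetical names, not part of jsoup, and the header values in the usage note are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

class SafeCharset {
    // Hypothetical helper: returns a charset name the JVM supports,
    // or the fallback if the header gives nothing usable.
    static String fromContentType(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the "charset=" parameter prefix.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                // Drop quotes and surrounding whitespace some servers add.
                String name = p.substring(8).replace("\"", "").replace("'", "").trim();
                try {
                    if (!name.isEmpty() && Charset.isSupported(name)) {
                        return name;
                    }
                } catch (IllegalCharsetNameException e) {
                    // Malformed name: ignore it and use the fallback below.
                }
            }
        }
        return fallback;
    }
}
```

For example, fromContentType("text/html; charset=\"UTF-8\"", "UTF-8") yields UTF-8, and the returned name could then be passed as the charsetName argument of parse(InputStream in, String charsetName, String baseUri). With HttpURLConnection, the header value is available from connection.getContentType().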