A Simple Guide to Using HttpClient 4.x

I have long used HttpClient 4 to fetch the pages behind URLs. So how is HttpClient actually used? Without further ado, here is the code.

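import java.io.IOException;

import org.apache.http.HttpEntity;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class HTTPUtils {

    // Give up after 5 seconds when connecting or when waiting for response data.
    private static final RequestConfig requestConfig = RequestConfig.custom()
            .setSocketTimeout(5000).setConnectTimeout(5000).build();

    /**
     * Fetches the page at the given URL and returns its body as a string.
     *
     * @param url the address of the page to fetch
     * @return the response body, decoded as GB18030 unless the response declares a charset
     * @throws IOException if the request fails or times out
     */
    public static String getHTML(String url) throws IOException {
        HttpGet request = new HttpGet(url);
        request.setConfig(requestConfig);
        // try-with-resources closes the response and the client even if an exception is thrown
        try (CloseableHttpClient httpClient = HttpClients.createDefault();
             CloseableHttpResponse response = httpClient.execute(request)) {
            HttpEntity entity = response.getEntity();
            // GB18030 is only the fallback; a charset declared by the response takes precedence
            return EntityUtils.toString(entity, "GB18030");
        }
    }
}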
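To see what the socket timeout protects against, the sketch below uses only the JDK rather than HttpClient. HttpClient's socket timeout corresponds to the plain socket option SO_TIMEOUT, so a bare `Socket` reading from a server that never replies shows the same effect. The class name `ReadTimeoutDemo`, the method `readTimesOut`, and the 200 ms value are made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local server that accepts the connection at the OS level
     * but never sends a byte, then tries to read from it.
     * Returns true if the read was aborted by the timeout.
     */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket silentServer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(),
                                        silentServer.getLocalPort())) {
            // A value of 0 would mean "wait forever", which is the hang described in this post.
            client.setSoTimeout(timeoutMillis);
            try {
                client.getInputStream().read();
                return false; // data or end-of-stream arrived before the timeout
            } catch (SocketTimeoutException e) {
                return true;  // the read gave up after timeoutMillis
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean timedOut = readTimesOut(200);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("timed out: " + timedOut + " after about " + elapsedMs + " ms");
    }
}
```

With a timeout the blocked read fails fast and the caller can move on to the next URL. Without one, a single unresponsive server can stall the whole batch.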
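The key part of this code is the definition of requestConfig. If no timeout is set, a batch job that fetches a large number of pages can hang indefinitely while waiting on a single request. This is a serious problem because it greatly increases the manual intervention required, so timeouts are added to keep it under control. When fetching an HTML page you should also specify the page encoding; otherwise, when the response does not declare a charset, the body is decoded as ISO_8859_1 by default.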
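The encoding fallback can be illustrated without HttpClient. The following is a minimal JDK-only sketch of the idea, not HttpClient's own implementation: use the charset named in the Content-Type header if there is one, otherwise use the caller's default. The class `CharsetFallback` and its methods `resolve` and `decode` are hypothetical names introduced for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetFallback {

    /**
     * Returns the charset named in a Content-Type header value, or the
     * fallback when the header is missing, carries no charset parameter,
     * or names a charset this JVM does not support.
     */
    static Charset resolve(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // malformed or unsupported charset name
                }
            }
        }
        return fallback;
    }

    /** Decodes raw body bytes with the resolved charset. */
    static String decode(byte[] body, String contentType, Charset fallback) {
        return new String(body, resolve(contentType, fallback));
    }

    public static void main(String[] args) {
        Charset gb18030 = Charset.forName("GB18030");
        // Two Chinese characters, written as escapes so the source file encoding does not matter.
        String original = "\u4e2d\u6587";
        byte[] body = original.getBytes(gb18030);

        // Header declares the charset: the fallback is ignored.
        String declared = decode(body, "text/html; charset=GB18030", StandardCharsets.ISO_8859_1);
        // No charset in the header and an ISO-8859-1 fallback: the text is garbled.
        String garbled = decode(body, "text/html", StandardCharsets.ISO_8859_1);
        // No charset in the header but a GB18030 fallback: the text survives.
        String rescued = decode(body, "text/html", gb18030);

        System.out.println("declared ok: " + declared.equals(original));
        System.out.println("garbled ok:  " + garbled.equals(original));
        System.out.println("rescued ok:  " + rescued.equals(original));
    }
}
```

This is why passing "GB18030" matters for pages that omit the charset: the bytes are the same in every case, and only the fallback decides whether they decode to readable text.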