Simulating a Renren Login with HttpClient

Purpose:
Use HttpClient 4.0.1 to log in to Renren and scrape data from a specific page.


Summary & notes:

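  • An HttpClient (DefaultHttpClient) instance represents one session. Within that session, HttpClient manages cookies automatically (you can also control them from your own code).
  • Within a session, before issuing a new POST or GET you usually need to call abort() on the previous request (or fully consume its response entity); otherwise an exception is thrown because the connection is still in use.
  • Some sites redirect (302, 303) after a successful login, and Renren is one of them. If the request was a POST, you need to read the Location value from the response headers and send another request to that address to get the final data.
  • Don't run the scraper too often. Most sites have anti-abuse mechanisms, and Renren locks accounts that access it too frequently.
  • Use a recording tool to capture the request parameters sent to the site during login. Here I used Badboy and exported the recording as a JMeter file; JMeter then shows the list of parameters sent at login and their values.
  • Renren's login flow is relatively simple. The next post will cover a site that is harder to deal with.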
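The redirect step in the notes above can be sketched with the JDK alone. This is a minimal illustration, not the author's code: the class `RedirectSketch`, its two methods, and the `/home` path are hypothetical names chosen for the example. It assumes the status code has already been read from the response (in HttpClient 4.x, `response.getStatusLine().getStatusCode()`), and it covers the case where the Location value is relative rather than absolute.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of the original RenRen class.
class RedirectSketch {

    // 302 and 303 are the redirect codes mentioned in the notes.
    static boolean isRedirect(int statusCode) {
        return statusCode == 302 || statusCode == 303;
    }

    // Resolves a Location header value against the URL that was requested.
    // An absolute Location comes back unchanged; a relative one is
    // interpreted relative to the request URL.
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String loginUrl = "http://www.renren.com/PLogin.do";
        // "/home" is an assumed path, used only to show relative resolution.
        System.out.println(resolveLocation(loginUrl, "/home"));
        System.out.println(resolveLocation(loginUrl,
                "http://blog.renren.com/blog/304317577/449470467"));
        System.out.println(isRedirect(302));
    }
}
```

Resolving against the request URL is a defensive choice: if the site always returns an absolute Location, the helper simply passes it through.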

Code (Java):

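import java.util.ArrayList;
import java.util.List;

import org.apache.http.Header;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.protocol.HTTP;

public class RenRen {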
    // The configuration items
    private static String userName = "YourMailinRenren";
    private static String password = "YourPassword";
    private static String redirectURL = "http://blog.renren.com/blog/304317577/449470467";

    // Don't change the following URL
    private static String renRenLoginURL = "http://www.renren.com/PLogin.do";

    // The HttpClient is used in one session
    private HttpResponse response;
    private DefaultHttpClient httpclient = new DefaultHttpClient();

    private boolean login() {
        HttpPost httpost = new HttpPost(renRenLoginURL);
        // All the parameters posted to the web site (captured with the recording tool)
        List<NameValuePair> nvps = new ArrayList<NameValuePair>();
        nvps.add(new BasicNameValuePair("origURL", redirectURL));
        nvps.add(new BasicNameValuePair("domain", "renren.com"));
        nvps.add(new BasicNameValuePair("isplogin", "true"));
        nvps.add(new BasicNameValuePair("formName", ""));
        nvps.add(new BasicNameValuePair("method", ""));
        nvps.add(new BasicNameValuePair("submit", "登陸"));
        nvps.add(new BasicNameValuePair("email", userName));
        nvps.add(new BasicNameValuePair("password", password));
        try {
            httpost.setEntity(new UrlEncodedFormEntity(nvps, HTTP.UTF_8));
            response = httpclient.execute(httpost);
        } catch (Exception e) {
            e.printStackTrace();
            return false;
        } finally {
            httpost.abort();
        }
        return true;
    }

    private String getRedirectLocation() {
        Header locationHeader = response.getFirstHeader("Location");
        if (locationHeader == null) {
            return null;
        }
        return locationHeader.getValue();
    }

    private String getText(String redirectLocation) {
        HttpGet httpget = new HttpGet(redirectLocation);
        // Create a response handler
        ResponseHandler<String> responseHandler = new BasicResponseHandler();
        String responseBody = "";
        try {
            responseBody = httpclient.execute(httpget, responseHandler);
        } catch (Exception e) {
            e.printStackTrace();
            responseBody = null;
        } finally {
            httpget.abort();
            httpclient.getConnectionManager().shutdown();
        }
        return responseBody;
    }

    public void printText() {
        if (login()) {
            String redirectLocation = getRedirectLocation();
            if (redirectLocation != null) {
                System.out.println(getText(redirectLocation));
            }
        }
    }

    public static void main(String[] args) {
        RenRen renRen = new RenRen();
        renRen.printText();
    }
}
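To compare what `login()` sends with the parameters captured by the recording tool, it can help to print the form body as text. The sketch below builds an `application/x-www-form-urlencoded` string using only the JDK. It is an approximation for inspection: `FormBodySketch` and `encode` are hypothetical names, the email value is a placeholder, and the exact bytes produced by HttpClient's `UrlEncodedFormEntity` may differ in small details.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original RenRen class.
class FormBodySketch {

    // Joins name=value pairs with '&', URL-encoding both sides as UTF-8.
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> entry : params.entrySet()) {
                if (body.length() > 0) {
                    body.append('&');
                }
                body.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(entry.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("domain", "renren.com");
        params.put("isplogin", "true");
        params.put("email", "user@example.com"); // placeholder value
        System.out.println(encode(params));
    }
}
```

Printing the body this way makes it easy to spot a missing or misspelled parameter when the login does not redirect as expected.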