A multithreaded crawler example with httpclient

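While studying security testing recently, I stumbled on a vulnerability in a certain site: fetching resources required no verification at all, so the per-user daily limit on resource requests no longer applied. My first thought was to crawl some extra data, but the volume was too large for a single thread, so I turned to multithreading. After a few attempts I got it working. The script is fairly rough, since I never really intended to crawl everything. The plan was about 100,000 records split across 10 threads, 10,000 per thread, with 100 IDs per request (the endpoint turned out to be a GET interface, so the URL length is limited).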
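The split described above (10 threads, 10,000 IDs each, 100 IDs per request) can be sketched on its own, without any HTTP calls. This is only an illustration of the arithmetic: `BatchPlan` and `batchesFor` are hypothetical names that do not appear in my script, and the assumption is that IDs are contiguous integers starting at 0.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPlan {

    /**
     * Returns the ID batches for one worker (hypothetical helper).
     * The worker owns [threadIndex * perThread, (threadIndex + 1) * perThread),
     * cut into chunks of at most batchSize IDs.
     */
    public static List<int[]> batchesFor(int threadIndex, int perThread, int batchSize) {
        List<int[]> batches = new ArrayList<>();
        int start = threadIndex * perThread;
        int end = start + perThread;
        for (int from = start; from < end; from += batchSize) {
            // The last chunk may be shorter if perThread is not a multiple of batchSize
            int size = Math.min(batchSize, end - from);
            int[] ids = new int[size];
            for (int k = 0; k < size; k++) {
                ids[k] = from + k;
            }
            batches.add(ids);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<int[]> batches = batchesFor(3, 10_000, 100);
        int[] first = batches.get(0);
        int[] last = batches.get(batches.size() - 1);
        // prints "100 batches, ids 30000..39999"
        System.out.println(batches.size() + " batches, ids " + first[0] + ".." + last[last.length - 1]);
    }
}
```

Each worker therefore issues 100 requests, and no two workers ever touch the same ID.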

Here is the code, for reference.

package practise;
 
import java.util.Date;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.http.client.methods.HttpGet;
import net.sf.json.JSONObject;
import source.ApiLibrary;
 
public class LoginDz extends ApiLibrary {
 
	public static void main(String[] args) {
		LoginDz loginDz = new LoginDz();
		loginDz.excuteTreads();
		testOver();
	}
 
	public JSONObject getTi(int[] code, String name) {
		JSONObject response = null;
		String url = "***********";
		JSONObject args = new JSONObject();
		// args.put("ID_List", getTiId(884969));
		args.put("ID_List", getTiId(code));
		HttpGet httpGet = getHttpGet(url, args);
		response = getHttpResponseEntityByJson(httpGet);
		// output(response.toString());
		String text = response.toString();
		if (!text.equals("{\"success_response\":[]}"))
			logLog("name", response.toString());
		output(response);
		return response;
	}
 
 
	public String getTiId(int... id) {
		StringBuilder result = new StringBuilder();
		int length = id.length;
		for (int i = 0; i < length; i++) {
			String abc = "filter[where][origDocID][inq]=" + id[i] + "&";
			result.append(abc);
		}
		return result.toString();
	}
 
	/**
	 * Run the multithreaded tasks
	 */
	public void excuteTreads() {
		int threads = 10;
		ExecutorService executorService = Executors.newFixedThreadPool(threads);
		CountDownLatch countDownLatch = new CountDownLatch(threads);
		Date start = new Date();
		for (int i = 0; i < threads; i++) {
			executorService.execute(new More(countDownLatch, i));
		}
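		try {
			countDownLatch.await();
		} catch (InterruptedException e) {
			// Restore the interrupt flag so callers can see the interruption
			Thread.currentThread().interrupt();
		} finally {
			// Always release the pool, even if the wait was interrupted
			executorService.shutdown();
		}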
		Date end = new Date();
		outputTimeDiffer(start, end);
	}
 
	/**
	 * Worker task executed by each thread
	 */
	class More implements Runnable {
		public CountDownLatch countDownLatch;
		public int num;
 
		public More(CountDownLatch countDownLatch, int num) {
			this.countDownLatch = countDownLatch;
			this.num = num;
		}
 
		@Override
		public void run() {
			int bound = num * 10000;
 
			try {
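				for (int i = bound; i < bound + 10000; i += 100) {
					int[] ids = new int[100];
					for (int k = 0; k < 100; k++) {
						// Index by k (0..99); indexing by i would run past the end of the array
						ids[k] = i + k;
					}
					// Send one request per full batch of 100 IDs
					getTi(ids, bound + "");
				}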
			} finally {
				countDownLatch.countDown();
			}
		}
 
	}
 
}
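The listing above depends on my own `ApiLibrary` base class, so it will not compile on its own. As a rough, self-contained sketch of the same thread pool plus `CountDownLatch` pattern, the version below replaces the HTTP call with a caller-supplied `Consumer<int[]>`. The names `LatchCrawlSketch`, `crawl`, and `fetch` are hypothetical and exist only for this illustration; a real run would plug the request logic into `fetch`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class LatchCrawlSketch {

    /**
     * Runs `threads` workers; each covers perThread IDs in batches of batchSize
     * and hands every batch to `fetch`. Returns the number of batches processed.
     */
    public static int crawl(int threads, int perThread, int batchSize, Consumer<int[]> fetch) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch latch = new CountDownLatch(threads);
        AtomicInteger batches = new AtomicInteger();
        try {
            for (int t = 0; t < threads; t++) {
                final int start = t * perThread;
                pool.execute(() -> {
                    try {
                        for (int from = start; from < start + perThread; from += batchSize) {
                            int size = Math.min(batchSize, start + perThread - from);
                            int[] ids = new int[size];
                            for (int k = 0; k < size; k++) {
                                ids[k] = from + k;
                            }
                            fetch.accept(ids); // stand-in for the real HTTP request
                            batches.incrementAndGet();
                        }
                    } finally {
                        // Count down even if fetch throws, so await() cannot hang
                        latch.countDown();
                    }
                });
            }
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return batches.get();
    }

    public static void main(String[] args) {
        AtomicInteger idsSeen = new AtomicInteger();
        int batches = crawl(10, 10_000, 100, ids -> idsSeen.addAndGet(ids.length));
        // prints "1000 batches, 100000 ids"
        System.out.println(batches + " batches, " + idsSeen.get() + " ids");
    }
}
```

Counting down in a `finally` block matters here: if one worker fails midway, the main thread still wakes up instead of waiting forever.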
