When doing security testing, you often need to switch IPs to probe a target or bypass its defenses. Some websites offer free or paid proxy IPs, but neither kind comes with any real guarantee that the proxy servers actually work, and trying them one by one by hand is painful. We can therefore use a script to automatically scrape proxy IPs from these sites, test them, and filter out a batch of usable ones.
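The availability check at the heart of this idea can be sketched with Python's standard library alone. This is a minimal illustration, not the tool's actual code; the test URL and timeout are arbitrary choices.

```python
import urllib.request


def is_proxy_alive(ip, port, timeout=3):
    """Return True if a plain HTTP request routed through ip:port succeeds."""
    handler = urllib.request.ProxyHandler({"http": "http://%s:%s" % (ip, port)})
    opener = urllib.request.build_opener(handler)
    try:
        # Fetch a known page through the proxy; any network error means "dead".
        resp = opener.open("http://example.com/", timeout=timeout)
        return resp.getcode() == 200
    except Exception:
        return False
```

A real crawler would run many such checks concurrently, since most scraped proxies are dead and each failure can take the full timeout to report.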
The code is hosted on GitHub.
Proxy Server Crawler is a tool for crawling public proxy servers from proxy websites. Whenever it crawls a proxy server (ip::port::type), it automatically tests the server's functionality.
Currently supported websites:
http://www.66ip.cn
Currently supported tests (for HTTP proxies):
SSL support
POST support
speed (tested against 10 frequently used sites)
type (high/anonymous/transparent)
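The high/anonymous/transparent classification can be derived from the headers a test server receives through the proxy. The heuristics below are common conventions for this, sketched as a pure function; they are not necessarily the exact rules this tool applies.

```python
def classify_anonymity(headers, real_ip):
    """Classify a proxy from the headers a test endpoint saw through it.

    transparent: the client's real IP leaks (e.g. in X-Forwarded-For)
    anonymous:   the proxy reveals itself (Via / Proxy-Connection) but hides the IP
    high:        no proxy-related headers reach the target at all
    """
    h = {k.lower(): v for k, v in headers.items()}
    forwarded = h.get("x-forwarded-for", "")
    if real_ip and real_ip in forwarded:
        return "transparent"
    if "via" in h or "proxy-connection" in h or forwarded:
        return "anonymous"
    return "high"
```

In practice the crawler would need a controlled endpoint that echoes back the request headers it received, then feed that echo into a check like this.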
Requirements:
Python >= 2.7
Scrapy 1.3.0 (not tested with lower versions)
Node.js (for some sites, needed to bypass JavaScript-based WAFs)
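The Node.js dependency exists because some proxy-list sites guard their pages with a JavaScript challenge that must be evaluated to obtain a valid cookie. One way to handle that from Python is to shell out to node; the helper below is a hypothetical sketch, not the tool's actual code.

```python
import subprocess


def eval_js(expression, timeout=5):
    """Evaluate a JavaScript expression with the system's node binary
    and return its printed result as a string."""
    result = subprocess.run(
        ["node", "-e", "console.log(%s)" % expression],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return result.stdout.strip()
```

A crawler would typically extract the challenge script from the blocked page, evaluate it this way, and attach the computed cookie to its next request.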
Usage:
cd proxy_server_crawler
scrapy crawl chunzhen
Example output:
[ result] ip: 59.41.214.218 , port: 3128 , type: http, proxy server not alive or healthy.
[ result] ip: 117.90.6.67 , port: 9000 , type: http, proxy server not alive or healthy.
[ result] ip: 117.175.183.10 , port: 8123 , speed: 984 , type: high
[ result] ip: 180.95.154.221 , port: 80 , type: http, proxy server not alive or healthy.
[ result] ip: 110.73.0.206 , port: 8123 , type: http, proxy server not alive or healthy.
[ proxy] ip: 124.88.67.54 , port: 80 , speed: 448 , type: high , post: True , ssl: False
[ result] ip: 117.90.2.149 , port: 9000 , type: http, proxy server not alive or healthy.
[ result] ip: 115.212.165.170, port: 9000 , type: http, proxy server not alive or healthy.
[ proxy] ip: 118.123.22.192 , port: 3128 , speed: 769 , type: high , post: True , ssl: False
[ proxy] ip: 117.175.183.10 , port: 8123 , speed: 908 , type: high , post: True , ssl: True
The MIT License (MIT)