PHP利用socket_bind函數切換IP地址採集數據

在利用PHP進行數據採集的過程當中,一般會遇到IP被屏蔽或出現驗證碼的狀況;爲了可以繼續採集,咱們須要切換不一樣的ip,每訪問一次,隨機切換一個IP。固然也能夠經過收集大量代理,經過切換代理的方式進行採集,原理大抵類似。
       由於本人在實際工做中遇到這種狀況,恰好發生的場景在美國站羣的服務器,上面有已經綁定了200多個ip(這種服務器1300元一月),所以能夠輕鬆的利用socket_bind()函數進行出口ip的綁定,只須要隨機抽取一個IP進行綁定就能夠。
           在C#中一樣能夠經過Socket.Bind()函數進行ip的綁定,以此切換服務器中不一樣ip進行採集!php

<?php
    //輸出內容
    echo Getdata("http://www.baidu.com/s?wd=ip");

    //Getdata()採集函數
    function Getdata($url){
        //隨機ip
        require_once('D:\fang360_100dir\datas\Iplist.php');
        $ip = $ip_arr[rand(0,count($ip_arr)-1)];
 
        //host post path
        $arr = parse_url($url);
        $path=$arr['path']?$arr['path']:"/";
        $host=$arr['host'];
        $port=isset($arr['port'])?$arr['port']:80;
        if ( $arr['query'] ){
            $path .= "?".$arr['query'];
        }
 
        // Create a new socket 
        $sockHttp = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
        if (!$sockHttp){
            echo "socket_create() failed: reason: " .socket_strerror(socket_last_error()) . "\n";
        } 
 
        // Bind the source address 
        if (socket_bind($sockHttp, $ip) === false) {
            echo "socket_bind() failed: reason: " .socket_strerror(socket_last_error($sockHttp)) . "\n";
        }
 
        // Connect to destination address
        $resSockHttp = socket_connect($sockHttp, $host, $port);
        if (!$resSockHttp){  
            echo 'socket_connect() failed!';  
        }
 
        $user_agent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 xttest/24.0";
        $cookie = '';
        $timeout = 25;
        $out = "GET {$path} HTTP/1.0\r\n";
        $out .= "Host: {$host}\r\n";
        $out .= "User-Agent: {$user_agent}\r\n";
        $out .= "Accept: */*\r\n";
        $out .= "Accept-Language: zh-cn\r\n";
        $out .= "Accept-Encoding: identity\r\n";
        $out .= "Referer: {$url}\r\n";
        $out .= "Cookie: {$cookie}\r\n";
        $out .= "Connection: Close\r\n\r\n";
 
        // Write
        socket_write($sockHttp, $out,strlen($out));
 
        $httpCode = substr(socket_read($sockHttp, 13),9,3);
 
        $data ='';
        while ($sRead = socket_read($sockHttp, 4096)){  
        $data .= $sRead;  
    }
    // Close
    socket_close($sockHttp);
 
    if (preg_match("#Content-Type:([^\r\n]*)#i", $data, $matches) && trim($matches[1]) != '')
    {
        $content_type_array = explode(';', $matches[1]);
        $ContentType = strtolower(trim($content_type_array[0]));
    }
    else
    {
        $ContentType = 'text/html';
    }
 
    header("Content-type: $ContentType"); 
    $data=preg_replace("/^[^<]*?\r\n\r\n/","",$data);
 
    if($httpCode>=400){
        $data = "Request Error";
    } 
 
    return $data;
}
 
?>                        
相關文章
相關標籤/搜索