C# Web Crawler Basics

URL Encoding and Decoding

  • First, add a reference to the System.Web.dll assembly.

To decode the value of a URL parameter, call the following method:
System.Web.HttpUtility.UrlDecode(string)
To encode a URL parameter value:
string s = "[1,2]";
string result = System.Web.HttpUtility.UrlEncode(s);
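As a minimal sketch of how the two calls fit together, the helper below builds a query string from a raw parameter value and then decodes it again. The class name UrlCodecDemo, the method names, and the example.com address are hypothetical and used only for illustration; the code assumes System.Web is referenced as described above.

```csharp
using System;
using System.Web; // HttpUtility lives here; requires a reference to System.Web

// Hypothetical helper, not part of any library.
static class UrlCodecDemo
{
    // Appends one encoded name=value pair to a base URL.
    public static string BuildQuery(string baseUrl, string name, string value)
    {
        return baseUrl + "?" + HttpUtility.UrlEncode(name) + "=" + HttpUtility.UrlEncode(value);
    }

    // Encodes and then decodes a value; the result should equal the input.
    public static string RoundTrip(string value)
    {
        string encoded = HttpUtility.UrlEncode(value);
        return HttpUtility.UrlDecode(encoded);
    }
}
```

For example, UrlCodecDemo.BuildQuery("http://example.com/search", "ids", "[1,2]") yields a URL in which the brackets and the comma are percent-encoded, and UrlCodecDemo.RoundTrip("[1,2]") gives back the original string.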

Using HttpWebRequest and HttpWebResponse

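using System.IO;   // Stream, StreamReader
using System.Net;  // WebRequest, HttpWebRequest, HttpWebResponse

// WebRequest.Create needs an absolute URI, so the scheme must be included.
string url = "http://www.baidu.com";
HttpWebRequest request = WebRequest.Create(url) as HttpWebRequest;
// request.Accept = ... (set according to your needs)
// request.Method = ... (set according to your needs)
HttpWebResponse response = request.GetResponse() as HttpWebResponse;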

using(Stream s = response.GetResponseStream())
{
    using(StreamReader reader = new StreamReader(s))
    {
        string data = reader.ReadToEnd();
    }
    s.Close();
}

response.Close();

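Note that Stream and HttpWebResponse both implement the IDisposable interface, so wrap them in a using statement or call their Dispose() method yourself. Also, once you are done with them, call their Close() method to close the connection.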
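The disposal advice above can be sketched as a small helper in which both the response and its stream sit inside using statements, so they are released even if reading throws. The names PageFetcher, Fetch, and ReadAll are hypothetical, and decoding the body as UTF-8 is an assumption; a real page may declare a different encoding.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper, not part of the .NET class library.
static class PageFetcher
{
    // Reads an entire stream as text. Disposing the reader also disposes
    // the stream it wraps.
    public static string ReadAll(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Downloads the body of an http/https URL as a string.
    public static string Fetch(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        // Both objects are disposed when the block exits, with or without an exception.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        {
            return ReadAll(stream, Encoding.UTF8); // UTF-8 is an assumption
        }
    }
}
```

Keeping the reading logic in ReadAll makes it possible to try that part against a MemoryStream without any network access.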

Parsing HTML with Html Agility Pack
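The original gives no example for this step, so the following is only a rough sketch of one common use: collecting the href values of all links in an HTML string. It assumes the HtmlAgilityPack NuGet package is installed; the class name LinkExtractor and the method ExtractLinks are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

// Hypothetical helper, not part of Html Agility Pack itself.
static class LinkExtractor
{
    // Returns the href attribute of every <a> element that has one.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // SelectNodes takes an XPath expression and returns null, not an
        // empty collection, when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }
}
```

The html argument would typically be the string read from the response stream in the previous section.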
