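1. Problems:
(1) After a simulated login, content inside nested iframes is hard to read from the outer page. Instead, navigate directly to the iframe's own URL and scrape that page.
(2) C#: clearing the WebBrowser control's session and cookies
Reference: http://www.360doc.com/content/14/0810/12/9200790_400769010.shtml
The code is as follows: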
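// Members of the form class. Requires:
// using System; using System.Runtime.InteropServices; using System.Windows.Forms;

// INTERNET_OPTION_END_BROWSER_SESSION: ends the current WinINet browser session
private const int INTERNET_OPTION_END_BROWSER_SESSION = 42;

[DllImport("wininet.dll", SetLastError = true)]
private static extern bool InternetSetOption(IntPtr hInternet, int dwOption, IntPtr lpBuffer, int lpdwBufferLength);

private void timer_Tick(object sender, EventArgs e)
{
    // Drop session state (session cookies, cached credentials) for this process
    InternetSetOption(IntPtr.Zero, INTERNET_OPTION_END_BROWSER_SESSION, IntPtr.Zero, 0);

    if (this.webBrowser.Document != null)
    {
        // Document.Cookie is a plain string, so calling Remove() on it changes nothing.
        // Expire each cookie by name instead (only reaches script-visible cookies on path "/").
        string cookie = this.webBrowser.Document.Cookie;
        if (!string.IsNullOrEmpty(cookie))
        {
            foreach (string pair in cookie.Split(';'))
            {
                string name = pair.Split('=')[0].Trim();
                this.webBrowser.Document.Cookie = name + "=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=/";
            }
        }
    }

    // Delete the persistent cookie files; skip any file that is locked or protected
    string[] cookies = System.IO.Directory.GetFiles(Environment.GetFolderPath(Environment.SpecialFolder.Cookies));
    foreach (string currentFile in cookies)
    {
        try
        {
            System.IO.File.Delete(currentFile);
        }
        catch
        {
        }
    }

    // Reload the start page with a clean session
    this.webBrowser.Navigate(SysInfo.WEBURL);
}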
(3) Referencing IHTMLDocument2: References > COM > Microsoft HTML Object Library
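For item (1), the iframe's address has to be turned into an absolute URL before the control can navigate to it. Below is a minimal sketch of that step. The `FrameUrl` class and its `Resolve` method are hypothetical names introduced here for illustration, and the commented usage assumes the first iframe on the page is the one wanted.

```csharp
using System;

// Hypothetical helper (not part of the original code): turns an iframe's
// src attribute into an absolute URL that WebBrowser.Navigate can load.
public static class FrameUrl
{
    public static string Resolve(string pageUrl, string frameSrc)
    {
        // No src means there is nothing to navigate to
        if (string.IsNullOrWhiteSpace(frameSrc))
        {
            return null;
        }

        // Uri(baseUri, relative) resolves relative paths against the page URL
        // and leaves an already absolute src unchanged
        Uri resolved = new Uri(new Uri(pageUrl), frameSrc.Trim());
        return resolved.AbsoluteUri;
    }
}

// Usage inside the form (assumes the first iframe is the one to scrape):
// HtmlElementCollection frames = this.webBrowser.Document.GetElementsByTagName("iframe");
// if (frames.Count > 0)
// {
//     string url = FrameUrl.Resolve(this.webBrowser.Url.AbsoluteUri, frames[0].GetAttribute("src"));
//     if (url != null) this.webBrowser.Navigate(url);
// }
```

Because the login session lives in the control's cookies, loading the frame URL in the same WebBrowser instance should normally keep the logged-in state.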
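2. Simulating login
Simulating a login that has no CAPTCHA is fairly simple with WebBrowser: assign values to the user name and password fields, then simulate a click on the login button.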
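// Requires: using System.Threading; using System.Windows.Forms;

// Fill in the user name and password fields
this.webBrowser.Document.GetElementById("user").SetAttribute("value", "user");
this.webBrowser.Document.GetElementById("password").SetAttribute("value", "password");

// Call the page's own SetCookie script function
this.webBrowser.Document.InvokeScript("SetCookie");

// Wait about 1 second before logging in
for (int i = 0; i < 10; i++)
{
    Thread.Sleep(100);
    Application.DoEvents(); // keep the message loop running while waiting
}

// Simulate a click on the login button
HtmlElement btnLogin = this.webBrowser.Document.GetElementById("login");
btnLogin.InvokeMember("Click");

// Wait about 0.5 seconds for the redirect after login
for (int i = 0; i < 5; i++)
{
    Thread.Sleep(100);
    Application.DoEvents(); // lets the control process the login navigation
}

// Go to the page that holds the data
this.webBrowser.Navigate(SysInfo.DATAURL);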
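Fixed waits can be too short on a slow connection. As an alternative, here is a rough sketch that drives the same steps from the control's `DocumentCompleted` event. The `LoginFlow` class, its `Step` enum and the `DataPageLoaded` event are hypothetical names for illustration. The sketch assumes that clicking the login button causes a full page navigation; if the site logs in through AJAX, the second `DocumentCompleted` never fires and this approach does not apply.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper: runs login and the data-page load step by step,
// advancing each time the top-level document finishes loading.
public sealed class LoginFlow
{
    private enum Step { WaitingForLoginPage, WaitingForLoginResult, WaitingForDataPage, Done }

    private readonly WebBrowser browser;
    private readonly string dataUrl;
    private Step step = Step.Done;

    // Raised once the data page has finished loading
    public event EventHandler DataPageLoaded;

    public LoginFlow(WebBrowser browser, string dataUrl)
    {
        this.browser = browser;
        this.dataUrl = dataUrl;
        browser.DocumentCompleted += OnDocumentCompleted;
    }

    public void Start(string loginUrl)
    {
        step = Step.WaitingForLoginPage;
        browser.Navigate(loginUrl);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // The event also fires for each frame; react only to the top-level document
        if (e.Url != browser.Url)
        {
            return;
        }

        switch (step)
        {
            case Step.WaitingForLoginPage:
                // Element ids "user", "password" and "login" are the ones used above
                HtmlElement user = browser.Document.GetElementById("user");
                HtmlElement password = browser.Document.GetElementById("password");
                HtmlElement login = browser.Document.GetElementById("login");
                if (user == null || password == null || login == null)
                {
                    return; // not the login page yet
                }
                user.SetAttribute("value", "user");
                password.SetAttribute("value", "password");
                browser.Document.InvokeScript("SetCookie");
                step = Step.WaitingForLoginResult;
                login.InvokeMember("Click");
                break;

            case Step.WaitingForLoginResult:
                // Login navigation finished; now load the data page
                step = Step.WaitingForDataPage;
                browser.Navigate(dataUrl);
                break;

            case Step.WaitingForDataPage:
                step = Step.Done;
                EventHandler handler = DataPageLoaded;
                if (handler != null)
                {
                    handler(this, EventArgs.Empty);
                }
                break;
        }
    }
}
```

Usage would look like `new LoginFlow(this.webBrowser, SysInfo.DATAURL).Start(SysInfo.WEBURL);`, with the scraping code attached to `DataPageLoaded`.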
3. Scraping data
Navigate to the target URL and, once the page has loaded, read the element values.
HtmlElement div = this.webBrowser.Document.GetElementById("style1");
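Once the element is found, its values can be read through `InnerText` or by walking its children. The sketch below assumes, purely for illustration, that the element holds a table and collects the text of its `td` cells; `ElementReader` and `ReadCells` are hypothetical names, and the real page structure may differ.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper: collects the text of every <td> under a given element.
public static class ElementReader
{
    public static List<string> ReadCells(HtmlDocument document, string elementId)
    {
        var cells = new List<string>();
        if (document == null)
        {
            return cells; // page not loaded yet
        }

        HtmlElement root = document.GetElementById(elementId);
        if (root == null)
        {
            return cells; // element is not on this page
        }

        foreach (HtmlElement td in root.GetElementsByTagName("td"))
        {
            // InnerText is null for empty cells, so skip those
            string text = td.InnerText;
            if (!string.IsNullOrEmpty(text))
            {
                cells.Add(text.Trim());
            }
        }
        return cells;
    }
}
```

For example, `ElementReader.ReadCells(this.webBrowser.Document, "style1")` returns an empty list rather than throwing when the page or the element is missing.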
Reference blog: Using the WebBrowser control in C# (C#中的WebBrowser控件的使用)