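I recently looked into how to scrape content from WizNote (爲知筆記). When fetching the images embedded in a note, the request always failed with a 403 error, so I inspected it with Chrome's developer tools: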
![](http://static.javashuo.com/static/loading.gif)
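The cookies on this request come from two domains, so my guess is that WizNote validates the token (which is only available after logging in).
The code that downloads the image: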
- var path = "https://note.wiz.cn/" + str.TrimStart('/');
- var extension = Path.GetExtension(path);
- var filepath = AppPath.Combine("Images/" + DateTime.Now.Ticks + extension);
-
- const string userAgent ="Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36";
- const string accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
- const string acceptLanguage = "zh-CN,zh;q=0.8";
- const string acceptEncoding = "gzip,deflate,sdch";
- var cookieContainer = new CookieContainer();
- var cookie = new Cookie
- {
- Name = "token",
- Value = Token,
- Domain = ".wiz.cn"
- };
- cookieContainer.Add(cookie);
- string[] cookiesArr = txtCookie.Text.Split(';');
- foreach (string s in cookiesArr)
- {
- // Split on the first '=' only, because cookie values may themselves contain '='.
- string[] keyValuePair = s.Split(new[] { '=' }, 2);
- if (keyValuePair.Length > 1)
- {
- cookie = new Cookie
- {
- Name = keyValuePair[0].Trim(),
- Value = keyValuePair[1].Trim(),
- Domain = "note.wiz.cn"
- };
- cookieContainer.Add(cookie);
- }
- }
-
- var newUri = new Uri(path);
- var webRequest = (HttpWebRequest)WebRequest.Create(newUri);
- webRequest.Timeout = 20000;
- webRequest.UserAgent = userAgent;
- webRequest.Accept = accept;
- webRequest.Headers["Accept-Language"] = acceptLanguage;
- webRequest.Headers["Accept-Encoding"] = acceptEncoding;
- webRequest.KeepAlive = true;
- webRequest.Headers["Cache-Control"] = "no-cache";
- webRequest.Headers["Upgrade-Insecure-Requests"] = "1";
- webRequest.Headers["Pragma"] = "no-cache";
- webRequest.Headers["Cookie"] = "token=" + Token + ";" + txtCookie.Text.Trim();
-
- webRequest.Referer = newUri.AbsoluteUri;
- // Dispose the response, stream, and image even if decoding or saving throws.
- using (var rsp = (HttpWebResponse)webRequest.GetResponse())
- using (var stream = rsp.GetResponseStream())
- using (var image = Image.FromStream(stream))
- {
- image.Save(filepath);
- }
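The odd part: when I assign the cookies with `webRequest.CookieContainer = cookieContainer;`, the token parameter never gets sent.
After switching to `webRequest.Headers["Cookie"] = "token=" + Token + ";" + txtCookie.Text.Trim();` it worked.
Doesn't CookieContainer support cookies for multiple domains? Can cross-domain cookies really only be set through `webRequest.Headers["Cookie"]`? I haven't figured this out, so if anyone knows, I'd appreciate an explanation.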
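
One way to investigate this is to ask the container directly which cookies it would attach to a given URL. This is a diagnostic sketch, not a confirmed explanation of the behaviour above. `CookieContainer.GetCookieHeader(Uri)` is a standard .NET method; the `CookieDebug` class, the `Build` helper, the sample cookie values, and the image URL are all hypothetical and only serve the illustration.

```csharp
using System;
using System.Net;

public static class CookieDebug
{
    // Hypothetical helper: fills a CookieContainer from a token plus a raw
    // "name=value; name2=value2" string copied from the browser.
    public static CookieContainer Build(string token, string rawCookies)
    {
        var container = new CookieContainer();

        // The leading dot makes this a domain cookie, intended for wiz.cn subdomains.
        container.Add(new Cookie("token", token, "/", ".wiz.cn"));

        string[] parts = rawCookies.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            // Split on the first '=' only, because cookie values may contain '='.
            string[] pair = part.Split(new[] { '=' }, 2);
            if (pair.Length < 2) continue;

            string name = pair[0].Trim();
            if (name.Length == 0) continue;

            try
            {
                container.Add(new Cookie(name, pair[1].Trim(), "/", "note.wiz.cn"));
            }
            catch (CookieException ex)
            {
                // CookieContainer rejects some values, e.g. unquoted ones containing ','.
                Console.WriteLine("Skipped cookie '" + name + "': " + ex.Message);
            }
        }
        return container;
    }

    public static void Main()
    {
        // Sample values and URL are placeholders, not real WizNote data.
        CookieContainer container = Build("abc123", "sid=xyz; lang=zh-CN");
        var uri = new Uri("https://note.wiz.cn/some/image.png");

        // Prints the Cookie header the container would send for this URL.
        Console.WriteLine(container.GetCookieHeader(uri));
    }
}
```

If `token` is missing from the printed header, the container is not matching that cookie's domain or path against the request URL. If it is present, the next things to check are whether the container was actually assigned to the request and whether any cookie from the raw string was rejected or truncated while being parsed.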