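A long time ago I had the idea of writing a small tool that replies to forum threads automatically, but I never got around to it. I finally had some free time recently and looked into it. I am a beginner who only half understands the HTTP protocol, so I had to ask the great Google. Once I had explained my idea, Google said "this kid is all right, he is contributing to the firefighting cause!" and bestowed the following divine tools on me: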
HttpClient: sends the HTTP requests. For its full feature list, see the HttpClient home page.
jsoup: parses the HTML that comes back.
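Enough chatter, on to the real topic. The HttpClient source package contains an examples folder with some basic usage samples, and those are enough to get started. Find ClientFormLogin.java; its comments explain everything clearly. The gist is to simulate the HTTP requests and store the cookies.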
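To make "simulate the request and store the cookies" concrete, here is a minimal sketch that uses only the JDK's own `java.net.CookieManager` instead of HttpClient. The class name `CookieJarSketch`, the host `bbs.example.com` and the cookie `sid=abc123` are all made up for illustration; the real forum's cookies will look different.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CookieJarSketch {

    /** Feeds one Set-Cookie header value into the jar, as if it had been received from uri. */
    static void remember(CookieManager jar, URI uri, String setCookieValue) {
        Map<String, List<String>> headers = new HashMap<>();
        headers.put("Set-Cookie", Collections.singletonList(setCookieValue));
        try {
            jar.put(uri, headers);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the Cookie header values the jar would attach to a request for uri. */
    static List<String> cookiesFor(CookieManager jar, URI uri) {
        try {
            Map<String, List<String>> out = jar.get(uri, new HashMap<String, List<String>>());
            List<String> cookies = out.get("Cookie");
            return cookies == null ? Collections.<String>emptyList() : cookies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CookieManager jar = new CookieManager();
        // Hypothetical login response: the server hands out a session cookie.
        remember(jar, URI.create("http://bbs.example.com/member.php"), "sid=abc123; Path=/");
        // A later request to the same host gets the cookie back.
        System.out.println(cookiesFor(jar, URI.create("http://bbs.example.com/forum-43-1.html")));
    }
}
```

HttpClient's cookie store plays the same role: once the login response has set its cookies, every later request made through the same client carries them.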
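Test site: http://bbs.dakele.com/
This site handles login in its own way, so it may differ somewhat from a standard DZ (Discuz!) forum; adjust the code as needed.
I analyzed the site with Chrome's built-in Inspect Element, which took quite a bit of fiddling.
Login address: http://passport.dakele.com/login.do?product=bbs
If you enter a wrong username and password, you will see that the address the form is really submitted to is http://passport.dakele.com/logon.do. Note the i/n difference (login vs. logon); I missed it at first and thought I was seeing ghosts.
The error response: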
{"err_msg":"賬號或密碼錯誤"}
With the correct credentials it returns:
{"result":true,"redirect":"http://bbs.dakele.com/member.php?mod=logging&action=login&loginsubmit=yes&infloat=yes&lssubmit=yes&inajax=0&fastloginfield=username&quickforward=yes&handlekey=ls&cookietime=2592000&remember=0&username=youname&AccessKey=[]"}
Opening the redirect link directly logs you in, just like a normal login.
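The two response shapes above are easy to tell apart before handing the body to a JSON library. The following is only a rough sketch: `LoginReplySketch` and the sample bodies are made up, and the regular expressions assume exactly the two shapes shown above. For anything more complex, a real JSON parser is the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LoginReplySketch {

    private static final Pattern RESULT_TRUE = Pattern.compile("\"result\"\\s*:\\s*true");
    private static final Pattern ERR_MSG = Pattern.compile("\"err_msg\"\\s*:\\s*\"([^\"]*)\"");

    /** True when the body contains "result":true. */
    static boolean isSuccess(String body) {
        return RESULT_TRUE.matcher(body).find();
    }

    /** Returns the err_msg text, or null when the body has no err_msg field. */
    static String errorMessage(String body) {
        Matcher m = ERR_MSG.matcher(body);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Placeholder bodies; the real server answers in Chinese and with a much longer redirect.
        String failure = "{\"err_msg\":\"bad credentials\"}";
        String success = "{\"result\":true,\"redirect\":\"http://bbs.example.com/\"}";
        System.out.println(isSuccess(failure) + " / " + errorMessage(failure));
        System.out.println(isSuccess(success) + " / " + errorMessage(success));
    }
}
```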
Getting the redirect link:
// Uses org.apache.http.* (HttpClient 4.3) and net.sf.json.* (json-lib)
private LoginResult getRedirectUrl() {
    LoginResult loginResult = null;
    HttpPost httpost = new HttpPost(LOGINURL);
    httpost.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
    httpost.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
    httpost.setHeader("Cache-Control", "max-age=0");
    httpost.setHeader("Connection", "keep-alive");
    httpost.setHeader("Host", "passport.dakele.com");
    httpost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
    List<NameValuePair> nvps = new ArrayList<NameValuePair>();
    nvps.add(new BasicNameValuePair("product", "bbs"));
    nvps.add(new BasicNameValuePair("surl", "http://bbs.dakele.com/"));
    nvps.add(new BasicNameValuePair("username", "yourname")); // username
    nvps.add(new BasicNameValuePair("password", "yourpass")); // password
    nvps.add(new BasicNameValuePair("remember", "0"));
    httpost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));
    // try-with-resources closes the response and the client even when execute() throws;
    // the original finally block hit a NullPointerException in that case
    try (CloseableHttpClient httpClient = HttpClients.createDefault();
         CloseableHttpResponse response2 = httpClient.execute(httpost)) {
        if (response2.getStatusLine().getStatusCode() == 200) {
            HttpEntity entity = response2.getEntity();
            String entityString = EntityUtils.toString(entity);
            // wrap the single JSON object in an array so json-lib can map it to LoginResult[]
            JSONArray jsonArray = JSONArray.fromObject("[" + entityString + "]");
            JsonConfig jsonConfig = new JsonConfig();
            jsonConfig.setArrayMode(JsonConfig.MODE_OBJECT_ARRAY);
            jsonConfig.setRootClass(LoginResult.class);
            LoginResult[] results = (LoginResult[]) JSONSerializer.toJava(jsonArray, jsonConfig);
            if (results.length == 1) {
                loginResult = results[0];
            }
        }
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return loginResult;
}
Login code:
public boolean login() {
    boolean flag = false;
    LoginResult loginResult = getRedirectUrl();
    // getRedirectUrl() returns null when the request fails
    if (loginResult != null && "true".equals(loginResult.getResult())) {
        cookieStore = new BasicCookieStore();
        globalClient = HttpClients.custom().setDefaultCookieStore(cookieStore).build();
        HttpGet httpGet = new HttpGet(loginResult.getRedirect());
        httpGet.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        httpGet.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        httpGet.setHeader("Connection", "keep-alive");
        httpGet.setHeader("Host", HOST);
        httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
        try (CloseableHttpResponse response = globalClient.execute(httpGet)) {
            // consume the body so the connection goes back to the pool
            EntityUtils.consume(response.getEntity());
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        List<Cookie> cookies2 = cookieStore.getCookies();
        if (cookies2.isEmpty()) {
            log.error("cookie is empty");
        } else {
            for (Cookie cookie : cookies2) {
                log.debug(cookie.toString());
            }
            // the original never set the flag, so login() always returned false
            flag = true;
        }
    }
    return flag;
}
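At this point the login has succeeded and you can do the things only a logged-in account can do. What? You don't know what those are? Putting out fires, of course.
First, collect the addresses of the threads to reply to. The list pages follow a regular pattern, so I did not write any automatic discovery and simply wrote a loop: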
for (int i = 1; i < 200; i++) {
    String basurl = "http://bbs.dakele.com/forum-43-" + i + ".html";
    log.info(basurl);
    List<String> urls = dakele.getThreadURLs(basurl);
    for (String url : urls) {
        //log.info(url);
        ReplayContent content = dakele.preReplay(url);
        if (content != null) {
            log.info(content.getUrl());
            log.info(content.getMessage());
            //dakele.replay(content);
            //Thread.sleep(15300);
        }
    }
}
Extracting the thread addresses from a list page:
String html = EntityUtils.toString(entity);
Document document = Jsoup.parse(html, HOST);
// thread title links in the normal (non-sticky) rows of the list
Elements elements = document.select("tbody[id^=normalthread_] > tr > td.new > a.xst");
for (int i = 0; i < elements.size(); i++) {
    Element e = elements.get(i);
    urList.add(e.attr("abs:href")); // abs: resolves the href against the base URI
}
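The `abs:href` lookup above relies on jsoup resolving each link against the base URI passed to `Jsoup.parse`. The JDK can do the same resolution on its own, which makes the behaviour easy to see. This is only an illustration: `AbsoluteLinkSketch` and the thread paths are invented, and the base is assumed to be an absolute URL that ends with a slash.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AbsoluteLinkSketch {

    /** Resolves every href against baseUrl, the way a browser resolves links on a page. */
    static List<String> toAbsolute(String baseUrl, List<String> hrefs) {
        URI base = URI.create(baseUrl);
        List<String> out = new ArrayList<>();
        for (String href : hrefs) {
            out.add(base.resolve(href).toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical hrefs as they might appear in a list page.
        List<String> hrefs = Arrays.asList(
                "thread-1001-1-1.html",                  // relative to the page
                "/thread-1002-1-1.html",                 // relative to the site root
                "http://bbs.dakele.com/forum-43-2.html"  // already absolute
        );
        for (String url : toAbsolute("http://bbs.dakele.com/", hrefs)) {
            System.out.println(url);
        }
    }
}
```

If the base passed in is not an absolute URL, relative links cannot be resolved this way, so it is worth checking what the base constant really contains.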
Inside the thread to be replied to, find the address the reply form submits to and build the reply content:
public ReplayContent preReplay(String url) {
    ReplayContent content = null;
    HttpGet get = new HttpGet(url);
    get.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
    get.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
    get.setHeader("Connection", "keep-alive");
    get.setHeader("Host", HOST);
    get.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
    // try-with-resources closes the response so the connection returns to the pool
    try (CloseableHttpResponse response = globalClient.execute(get)) {
        HttpEntity entity = response.getEntity();
        String html = EntityUtils.toString(entity);
        Document document = Jsoup.parse(html, HOST);
        Element postForm = document.getElementById("fastpostform");
        // no quick-reply form on the page: nothing to post to (the original threw a NullPointerException here)
        if (postForm == null) {
            log.warn("fastpostform not found: " + url);
            return null;
        }
        // "您如今無權發帖" is the site's "you are not allowed to post" notice
        if (!postForm.toString().contains("您如今無權發帖")) {
            content = new ReplayContent();
            content.setUrl(url);
            log.debug(postForm.attr("abs:action"));
            content.setAction(postForm.attr("abs:action"));
            // concatenate the text of every post in the thread
            Elements teElements = document.select("td[id^=postmessage_]");
            String message = "";
            for (int i = 0; i < teElements.size(); i++) {
                // strip all HTML tags
                String temp = teElements.get(i).html().replaceAll("(?is)<.*?>", "");
                // "發表於" ("posted on") marks a quote header; keep only the last token after it
                if (temp.contains("發表於")) {
                    String[] me = temp.split("\\s+");
                    temp = me[me.length - 1];
                }
                message += temp.replaceAll("\\s+", "");
            }
            log.debug(message.replaceAll("\\s+", ""));
            /* Alternative: take only the last comment
            Element messageElement = document.select("td[id^=postmessage_]").last();
            // String message = messageElement.html().replaceAll("\\&[a-zA-Z]{1,10};", "").replaceAll("<[^>]*>", "").replaceAll("[(/>)<]", "");
            String message = messageElement.html().replaceAll("(?is)<.*?>", "");
            */
            if (message.contains("發表於")) {
                String[] me = message.split("\\s+");
                message = me[me.length - 1];
            }
            // drop spaces and the words "upload", "attachment", "download"
            content.setMessage(message.replaceAll(" ", "").replaceAll("上傳", "").replaceAll("附件", "").replaceAll("下載", ""));
            // copy the hidden form fields the server expects back
            Elements inputs = postForm.getElementsByTag("input");
            for (Element input : inputs) {
                log.debug(input.attr("name") + ":" + input.attr("value"));
                if (input.attr("name").equals("posttime")) {
                    content.setPosttime(input.attr("value"));
                } else if (input.attr("name").equals("formhash")) {
                    content.setFormhash(input.attr("value"));
                } else if (input.attr("name").equals("usesig")) {
                    content.setUsesig(input.attr("value"));
                } else if (input.attr("name").equals("subject")) {
                    content.setSubject(input.attr("value"));
                }
            }
        } else {
            log.warn("您如今無權發帖:" + url);
        }
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return content;
}
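The two regular expressions that do the text cleanup in the method above are easier to follow on a small sample. The sketch below applies them to an invented HTML fragment; `ReplyTextSketch` and the fragment are made up, and note that these patterns do not decode HTML entities such as `&nbsp;`.

```java
final class ReplyTextSketch {

    /** Removes everything that looks like an HTML tag; (?is) makes the match case-insensitive and lets . span line breaks. */
    static String stripTags(String html) {
        return html.replaceAll("(?is)<.*?>", "");
    }

    /** Removes every run of whitespace, including line breaks. */
    static String squeeze(String text) {
        return text.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        // Hypothetical post body as it might appear inside a postmessage_ cell.
        String html = "<div class=\"t\">Nice <b>post</b>!<br/>\n thanks</div>";
        String text = stripTags(html);
        System.out.println(text);          // tags gone, whitespace kept
        System.out.println(squeeze(text)); // whitespace gone as well
    }
}
```

Squeezing out all whitespace is harmless for Chinese text, which has no spaces between words, but it glues English words together, so it is a choice that fits this particular forum.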
We have the address and we have the content, so now it is time to turn on the water:
public void replay(ReplayContent content) {
    HttpPost httpost = new HttpPost(content.getAction());
    httpost.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
    httpost.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
    httpost.setHeader("Cache-Control", "max-age=0");
    httpost.setHeader("Connection", "keep-alive");
    httpost.setHeader("Host", HOST);
    httpost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
    List<NameValuePair> nvps = new ArrayList<NameValuePair>();
    nvps.add(new BasicNameValuePair("posttime", content.getPosttime()));
    nvps.add(new BasicNameValuePair("formhash", content.getFormhash()));
    nvps.add(new BasicNameValuePair("usesig", content.getUsesig()));
    nvps.add(new BasicNameValuePair("subject", content.getSubject()));
    nvps.add(new BasicNameValuePair("message", content.getMessage()));
    httpost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));
    // The response must be consumed and closed: until then the connection is not
    // released back to the pool and later requests block. I missed this at first
    // and got stuck right here.
    try (CloseableHttpResponse response2 = globalClient.execute(httpost)) {
        //log.info(content.getAction());
        //log.info(content.getMessage());
        HttpEntity entity = response2.getEntity();
        EntityUtils.consume(entity);
        // BufferedWriter bw = new BufferedWriter(new FileWriter("d:/tt1.html"));
        // bw.write(EntityUtils.toString(response2.getEntity()));
        // bw.flush();
        // bw.close();
        //System.out.println(EntityUtils.toString(response2.getEntity()));
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
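For anyone wondering what actually goes over the wire in the POST above: the form fields are sent as one `application/x-www-form-urlencoded` string. The sketch below builds such a string with the JDK alone. `FormBodySketch` and the field values are invented, and it only approximates what HttpClient's form entity produces.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormBodySketch {

    /** Joins the fields as name=value pairs separated by &, percent-encoding each part as UTF-8. */
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(field.getKey())).append('=').append(enc(field.getValue()));
        }
        return body.toString();
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical values; the real formhash comes from the hidden input on the thread page.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("formhash", "abc123");
        fields.put("message", "nice post & thanks");
        System.out.println(encode(fields));
    }
}
```

The `&` inside the message has to be escaped, otherwise the server would read it as the start of a new field. That is one reason to let the library build the body rather than concatenating strings by hand.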
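Of course, this only works for forums without a CAPTCHA; for forums that have one you will have to find another way.
As for the reply content: at first I took only the last comment in the current thread and posted it back as my reply, and I got a warning! After that I used the IK tokenizer to extract keywords. That code is borrowed; see the reference link.
Reference link:
This is my first post; corrections and criticism are welcome.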
------------------------------------------
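Download: http://pan.baidu.com/s/1jGjwA5g
I tidied up the code this morning and am sharing it with everyone. It is the MyEclipse project packaged as-is, so after unpacking you can import it directly.
Change the username and password in IKFenci.java and it runs as-is.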