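Fiddler is an HTTP debugging proxy. It can record and inspect all HTTP traffic between your computer and the internet, set breakpoints, and view all data passing in and out of Fiddler (cookies, HTML, JS, CSS, and other files).
Fiddler is simpler to use than other network debuggers because it not only exposes HTTP traffic but also presents it in a user-friendly format. Similar tools include HttpWatch, Firebug, and Wireshark.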
For Fiddler usage, see: http://www.javashuo.com/article/p-hmddygzb-hc.html
Download Fiddler: https://www.telerik.com/fiddler
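Installation is straightforward: click through the installer with the default options.
Open Fiddler and go to its settings (menu bar: Tools -> Options).
Configure the HTTPS tab as follows:
Configure the Connections tab as follows. The default port 8888 is used here; you can change it, but make sure it does not conflict with a port that is already in use: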
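The port-conflict warning above can be checked before Fiddler is started. The following is a minimal sketch, assuming Fiddler will listen on the local machine; the helper name `port_is_free` is hypothetical and not part of Fiddler.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is taken
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    # 8888 is the default port from the Connections tab
    print("port 8888 free:", port_is_free(8888))
```

Run it before launching Fiddler; once Fiddler is running, the same check should report the port as taken.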
After downloading, open the file on the phone, give it a name, and install it.
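To capture phone traffic with Fiddler, first make sure the phone and the computer are on the same local network, for example by connecting both to the same router. Alternatively, have the computer open a Wi-Fi hotspot and connect the phone to it.
Here, I connect the phone and the computer to the same router, and then have the phone go online through the computer acting as its proxy.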
First, find the computer's IP address: run the ipconfig command in cmd, locate the IPv4 address of the wireless LAN adapter (WLAN), and note it down.
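As an alternative to reading the ipconfig output by hand, the address can be guessed from Python. This is a rough sketch: the function name `local_ipv4` and the probe address `8.8.8.8` are assumptions for illustration, and on a machine with several adapters the result may not be the WLAN address, so compare it with ipconfig.

```python
import socket


def local_ipv4(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Best-effort guess of the IPv4 address this machine uses for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only makes the OS pick a route
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, offline): fall back to loopback
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print("use this as the proxy host on the phone:", local_ipv4())
```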
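On the phone, tap the connected Wi-Fi network, modify its settings, and add a proxy. Choose manual configuration: the host name is the IP address found above, and the port is the one set in Fiddler, 8888: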
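The same host and port can also be used from a script on another machine in the network, which is a quick way to confirm that the proxy is reachable before configuring the phone. A small sketch using only the standard library; the helper names and the address `192.168.1.100` are hypothetical placeholders for the IPv4 address you noted.

```python
import urllib.request


def fiddler_proxies(host: str, port: int = 8888) -> dict:
    """Proxy mapping that sends both HTTP and HTTPS traffic through Fiddler."""
    address = f"http://{host}:{port}"
    return {"http": address, "https": address}


def build_opener_via_fiddler(host: str, port: int = 8888) -> urllib.request.OpenerDirector:
    """Create a urllib opener whose requests are routed through the proxy."""
    handler = urllib.request.ProxyHandler(fiddler_proxies(host, port))
    return urllib.request.build_opener(handler)


# Usage (needs Fiddler running and reachable at the given address):
# opener = build_opener_via_fiddler("192.168.1.100")  # hypothetical IP
# with opener.open("http://example.com/", timeout=5) as resp:
#     print(resp.status)
```

If the request shows up in Fiddler's session list, the phone should be able to use the same host and port. Note that remote machines can connect only if Fiddler is configured to accept remote connections.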
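In the phone's browser, open http://localhost:8888/ with localhost replaced by the computer's IP address noted above (localhost on the phone refers to the phone itself, not to the computer running Fiddler), tap FiddlerRoot certificate, and download the security certificate: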
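Taking a Huawei phone as an example:
On the phone, go to Settings ---> Advanced settings --> Security ----> Show trusted CA certificates ---> User
After a successful installation, it appears as follows: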
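Once all the steps above are done, open the Toutiao (今日頭條) app on the phone; screenshot below:
Now let's look at the data Fiddler captured:
You can copy the URL and the header content: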
GET http://cards.iqiyi.com/views_search/3.0/search?card_v=3.0&scrn_res=1080,1788&keyword=%E5%8E%A6%E9%97%A8%E8%A7%86%E9%A2%91%E5%A4%B4%E6%9D%A1&source=suggest&qr=0&mode=1&duration_level=0&publish_date=0&bitrate=0&need_qc=0&s_sr=1&from_rpage=qy_home&origin=0&psp_vip=0&s_token=main%23%E5%8E%A6%E9%97%A8%E8%A7%86%E9%A2%91&app_k=3179f25bc69e815ad828327ccf10c539&app_v=10.3.5&platform_id=10&dev_os=7.0&dev_ua=HUAWEI+CAZ-AL10&net_sts=1&qyid=864590038380239&cupid_v=3.35.002&psp_uid=1732414636&psp_cki=03RdTbm2uf4Km2X6Mvs1lAVDAg4l6om2Uf0HWm32122YH5VCFgxKvr4m2UFiOwCwBuvlcCu9c&imei=c0497fcececef4b5a2a4f4156d6fd726&aid=47628a3804ad50be&mac=14:5F:94:B3:E0:AD&scrn_scale=3&secure_p=GPhone&secure_v=1&core=1&api_v=8.8&profile=%7B%22group%22%3A%221%2C2%22%2C%22counter%22%3A2%7D&province_id=2007&service_filter=&service_sort=&layout_v=44.115&device_type=0&cupid_uid=864590038380239&psp_status=1&app_gv=&gps=116.373202,39.962811&bdgps=116.385157,39.970314&lang=zh_CN&app_lm=cn&req_times=0&req_sn=1556011726525 HTTP/1.1
qyid: 864590038380239_47628a3804ad50be_14Z5FZ94ZB3ZE0ZAD
Connection: Keep-Alive
t: 512025323
sign: 04876862c652470b54dd1add698631d5
Host: cards.iqiyi.com
Accept-Encoding: gzip
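Copying the URL and headers by hand is error-prone for a request this long. The sketch below shows one way to split a copied raw request into method, URL, header dictionary, and query parameters. It assumes the text has one header per line; the function name `parse_raw_request` and the shortened `SAMPLE` request are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def parse_raw_request(raw: str):
    """Split a raw HTTP request (request line plus header lines) into its parts."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    method, url, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as MAC addresses survive
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return method, url, headers, params


# Shortened, illustrative version of the captured request
SAMPLE = """\
GET http://cards.iqiyi.com/views_search/3.0/search?card_v=3.0&mode=1&service_filter= HTTP/1.1
Host: cards.iqiyi.com
Connection: Keep-Alive
Accept-Encoding: gzip
"""

if __name__ == "__main__":
    method, url, headers, params = parse_raw_request(SAMPLE)
    print(method, headers["Host"], sorted(params))
```

The resulting `headers` dictionary can be passed directly as the `headers=` argument of `requests.get`.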
4. Testing with Python code
With the information above, we can write the code:
# -*- coding: UTF-8 -*-
import time

import requests
from pyquery import PyQuery as pq
from requests.packages.urllib3.exceptions import InsecureRequestWarning
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Requests below use verify=False, so silence the resulting certificate warnings
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)


class app_data:
    def __init__(self):
        # Headers copied from the app's API request as captured in Fiddler
        self.headers = {
            'Accept-Charset': 'UTF-8',
            'X-Requested-With': 'XMLHttpRequest',
            'Host': 'lf-hl.snssdk.com',
            'Connection': 'Keep-Alive',
            'Accept-Encoding': 'gzip',
            'X-SS-REQ-TICKET': '1544235590880',
            'sdk-version': '1',
            'User-Agent': 'Dalvik/2.1.0 (Linux; U; Android 7.0; HUAWEI CAZ-AL10 Build/HUAWEICAZ-AL10) NewsArticle/7.0.1 cronet/TTNetVersion:pre_blink_merge-277498-gd2bb364e 2018-08-24',
            'X-SS-TC': '0',
        }
        # Headers copied from the request made by the app's embedded browser
        self.headers2 = {
            'Accept': '*/*',
            'Accept-Encoding': 'gzip,deflate',
            'Accept-Language': 'zh-CN,en-US;q=0.8',
            'User-Agent': 'Mozilla/5.0 (Linux; Android 7.0; HUAWEI CAZ-AL10 Build/HUAWEICAZ-AL10; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/56.0.2924.87 Mobile Safari/537.36 JsSdk/2 NewsArticle/7.0.1 NetType/wifi',
            'X-Requested-With': 'com.ss.android.article.news',
        }

    def catch_app_data(self, link):
        """Fetch an account profile and print its main fields."""
        if not link:
            # The original fell back to self.heros_url1, which is never defined
            raise ValueError('link is required')
        req = requests.get(url=link, headers=self.headers, verify=False).json()
        data = req.get("data")
        print('Account:', data.get("name"))
        print('Verification:', data.get("verified_content"))
        print('Location:', data.get("area"))
        print('Description:', data.get("description"))
        print('user_id:', data.get("user_id"))

    def cat_app_list(self, keyword='中餐廳'):
        """Search by keyword and print the result URLs."""
        # Query string copied from the captured search request
        url = ('https://lf-hl.snssdk.com/api/search/content/?from=search_tab'
               '&keyword=' + keyword +
               '&cur_tab_title=search_tab'
               '&plugin_enable=3'
               '&iid=53115531269'
               '&device_id=52727404130'
               '&ac=wifi'
               '&channel=huawei&aid=13'
               '&app_name=news_article'
               '&version_code=701'
               '&version_name=7.0.1'
               '&device_platform=android'
               '&ab_group=94567'
               '%252C102749%252C181430'
               '&abflag=3'
               '&device_type=HUAWEI%2BCAZ-AL10'
               '&device_brand=HUAWEI'
               '&language=zh'
               '&os_api=24'
               '&os_version=7.0'
               '&uuid=864590038380239'
               '&openudid=47628a3804ad50be'
               '&manifest_version_code=701'
               '&resolution=1080*1788'
               '&dpi=480'
               '&update_version_code=70108'
               '&_rticket=1544497762334'
               '&fp=DrT_L2w1cST5FlT_F2U1FYK7FrxO'
               '&tma_jssdk_version=1.5.4.2'
               '&rom_version=emotionui_5.0.4_caz-al10c00b386'
               '&plugin=26958&search_sug=1'
               '&forum=1&count=10'
               '&format=json'
               '&source=input'
               '&pd=synthesis'
               '&keyword_type='
               '&action_type=input_keyword_search'
               '&search_position=search_tab'
               '&from_search_subtab='
               '&offset=0'
               '&search_id='
               '&has_count=0&qc_query=')
        # The original built a local dict identical to self.headers2
        req = requests.get(url=url, headers=self.headers2, verify=False).json()
        data = req.get("data") or []
        url_list = []
        for item in data:
            display_list = item.get('display')
            if display_list:
                album_group_dict = display_list.get('album_group')
                if album_group_dict:
                    # 'extra' is a comma-separated string; pick out album_group_url
                    extra = str(album_group_dict.get('extra'))
                    item_list = extra.split(',')
                    for e in item_list:
                        if e.find("album_group_url") > -1:
                            url = e[e.find(":") + 1:]
                            url = url.replace("\"", "")
                            url_list.append(url)
                            break
                else:
                    url = display_list.get('url')
                    if url:
                        url_list.append(url)
        print(url_list)

    def _new_driver(self):
        """Create a headless Chrome driver (shared by the two page scrapers)."""
        chrome_options = Options()
        chrome_options.add_argument('--headless')
        chrome_options.add_argument('--disable-gpu')
        # Selenium 4 takes options=; the original used the removed chrome_options=
        driver = webdriver.Chrome(options=chrome_options)
        driver.set_page_load_timeout(10)
        driver.maximize_window()
        # Alternative from the original, commented out there as well:
        # driver = webdriver.PhantomJS(service_args=['--load-images=false'])
        # driver.set_page_load_timeout(20)
        return driver

    def catchdata_sogo(self, url):
        """Render a Sogou Baike page and print its text content."""
        self.driver = self._new_driver()
        try:
            self.driver.get(url)
            print(url)
            # handles = self.driver.window_handles  # list of current window handles
            # self.driver.switch_to.window(handles[2 - 1])
            time.sleep(2)
            selenium_html = self.driver.execute_script("return document.documentElement.outerHTML")
            doc = pq(selenium_html)
            elements = doc("div[class='content-txt']").find("p")
            for element in elements.items():
                print(element.text())
            elements = doc("p[class='mod-base-item']").find("span")
            for element in elements.items():
                print(element.text())
        except Exception as ex:
            print(ex)
        finally:
            self.driver.quit()  # release the browser process

    def catchdata_so(self, url):
        """Render a 360 video page and print the title and info text."""
        self.driver = self._new_driver()
        try:
            self.driver.get(url)
            print(url)
            time.sleep(2)
            selenium_html = self.driver.execute_script("return document.documentElement.outerHTML")
            doc = pq(selenium_html)
            elements = doc("div[class='cp-info-main']")
            for element in elements.items():
                print(element('h3').text())
                # print(element("p[class='js-info-upinfo']").text())
                print(element('p').text())
        except Exception as ex:
            print(ex)
        finally:
            self.driver.quit()  # release the browser process

    def test(self, link):
        """Fetch a URL with the second header set and print the raw body."""
        req = requests.get(url=link, headers=self.headers2, verify=False)
        json_str = req.content.decode()
        print(json_str)


if __name__ == '__main__':
    obj = app_data()
    # http://m.video.so.com/android/va/Zs5sb3Ny7JA4DT.html
    # https://m.douguo.com/search/trecipe/%E4%B8%AD%E9%A4%90%E5%8E%85/0?f=tt
    # https://baike.sogou.com/m/fullLemma?ch=jrtt.search.item&cid=xm.click&lid=167408303
    # obj.catch_app_data('')
    # keywords = ['湖南衛視中餐廳','農廣天地','看臺','十年','CCTV-4遠方的家','CCTV熱線12']
    # for keyword in keywords:
    #     obj.cat_app_list(keyword)
    #     print('\n')
    obj.cat_app_list('CCTV熱線12')
    # obj.catchdata_sogo('https://baike.sogou.com/m/fullLemma?ch=jrtt.search.item&cid=xm.click&lid=167408303#lemmaHome')
    # obj.catchdata_so('http://m.video.so.com/android/va/Zs5sb3Ny7JA4DT.html')
    # obj.catchdata_so('http://m.video.so.com/android/va/YcMpcKVv82YBDz.html')
    # obj.parserurl()
    # obj.test('http://m.video.so.com/android/va/Zs5sb3Ny7JA4DT.html')
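The output is as follows:
Note that Fiddler must be running before you do anything on the phone, and the order matters: if you use the phone while Fiddler is not running, the phone will have no network access.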
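Cannot capture HTTPS packets: Fiddler does not support every protocol
Fiddler does not support every protocol. Those known to be unsupported at the time of writing include HTTP/2, raw TCP, UDP, and WebSocket; if an app uses one of these, Fiddler cannot capture its traffic.
HTTP/2: Fiddler is built on the .NET Framework, which does not support HTTP/2, so Fiddler cannot capture HTTP/2 traffic.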
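Fiddler captures traffic by acting as a man in the middle: it deceives both sides, the client and the server. If the HTTPS certificate is hard-coded in the app (certificate pinning), the app does not trust the certificate Fiddler issues to it and trusts only its own certificate. Fiddler can no longer fool the client, so it cannot capture the packets.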
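Why a hard-coded certificate defeats the proxy can be shown with a toy check. This is only a sketch of the idea, not how any particular app implements it: the byte strings stand in for real DER-encoded certificates, and the names `fingerprint`, `is_trusted`, and `PINNED` are hypothetical.

```python
import hashlib
import hmac


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a certificate's raw bytes, as a hex string."""
    return hashlib.sha256(cert_der).hexdigest()


def is_trusted(presented_cert_der: bytes, pinned_fingerprint: str) -> bool:
    """Accept the connection only if the presented certificate matches the pin."""
    return hmac.compare_digest(fingerprint(presented_cert_der), pinned_fingerprint)


# Stand-in byte strings, not real certificates
SERVER_CERT = b"real-server-certificate"
FIDDLER_CERT = b"certificate-generated-by-fiddler"

# The app ships with the fingerprint of the real server certificate
PINNED = fingerprint(SERVER_CERT)

if __name__ == "__main__":
    print("direct connection trusted:", is_trusted(SERVER_CERT, PINNED))
    print("via Fiddler trusted:", is_trusted(FIDDLER_CERT, PINNED))
```

Installing the Fiddler root certificate on the phone changes what the operating system trusts, but it does not change a fingerprint compiled into the app, so the comparison still fails when Fiddler sits in between.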
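One more note: if it is an app you develop yourself, then for easier debugging you can import the server certificate (together with its private key, which decryption requires) into a tool such as Wireshark to capture and decrypt the traffic.
Reference: https://blog.csdn.net/memoryofyck/article/details/80955615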