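First, use the browser's developer tools to find the URL that needs to be crawled, as shown in the figure.
Inspecting the response shows that the data is in JSON format (earlier blog posts by others call this the JSON data type).
Use loads from the json library to convert it into a dict, pull out each element, then open a file and write the data.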
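As a small illustration of the conversion step, here is a sketch that runs json.loads on an inline string instead of a live response. The field names are the ones used in this post; the values and the example.com URL are made up, and the real response may contain more fields than shown.

```python
import json

# Hypothetical sample payload: only the field names come from the post,
# the values are invented for illustration.
sample_text = (
    '{"data": [{"StudentNo": 1001, "RealName": "Alice", '
    '"DateAdded": "2018-12-01T10:30:00", "Title": "Homework 1", '
    '"Url": "https://example.com/hw1"}]}'
)

parsed = json.loads(sample_text)  # JSON object -> Python dict
records = parsed["data"]          # JSON array  -> Python list of dicts

first = records[0]
# The timestamp uses "T" between date and time; swap it for a space.
added = first["DateAdded"].replace("T", " ")

print(type(parsed).__name__, type(records).__name__)
print(first["StudentNo"], first["RealName"], added)
```

Each element of the list is an ordinary dict, so individual fields are read by key.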
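import json

import requests

try:
    # Fetch the homework answer list; the timeout keeps the call from hanging.
    r = requests.get('https://edu.cnblogs.com/Homework/GetAnswers?homeworkId=2420&_=1544072161608', timeout=10)
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    # Parse the JSON text into a dict and take the list stored under 'data'.
    datas = json.loads(r.text)['data']
except (requests.RequestException, ValueError, KeyError):
    # Request failed, or the body was not the expected JSON.
    print("Network error")
else:
    crawl = ''
    for data in datas:
        # One comma-separated line per submission; "T" in the timestamp becomes a space.
        crawl += str(data["StudentNo"]) + "," + data["RealName"] + "," + data["DateAdded"].replace("T", " ") + "," + data["Title"] + "," + data["Url"] + "\n"
    with open('hwlist.csv', 'w') as f:
        f.write(crawl)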
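One thing to watch with joining fields by hand: if a title contains a comma, the row gets an extra column. A possible alternative, sketched here with the standard csv module, quotes such fields automatically. The helper name rows_to_csv and the sample record are hypothetical and not part of the original script.

```python
import csv
import io

# Column order used in the post.
FIELDS = ["StudentNo", "RealName", "DateAdded", "Title", "Url"]


def rows_to_csv(records, out):
    """Write one CSV row per record to the file-like object `out`."""
    writer = csv.writer(out, lineterminator="\n")
    for rec in records:
        row = [rec[name] for name in FIELDS]
        row[2] = row[2].replace("T", " ")  # same timestamp cleanup as above
        writer.writerow(row)


# Hypothetical record whose title contains a comma.
sample = [{
    "StudentNo": 1001,
    "RealName": "Alice",
    "DateAdded": "2018-12-01T10:30:00",
    "Title": "Homework 1, part A",
    "Url": "https://example.com/hw1",
}]

buffer = io.StringIO()
rows_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, the same function can be given a file opened with open('hwlist.csv', 'w', newline='', encoding='utf-8'); the encoding choice is an assumption and may need adjusting for the program that opens the file.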
The above is the source code; the result is shown below.