requests

Date 2019-11-16

Tags requests

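requests is a very practical Python HTTP client library, often used when writing crawlers and when testing server responses. It is fair to say that Requests covers what today's web requires of an HTTP client.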

Official documentation: http://docs.python-requests.org/en/master/

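What is the requests module

The requests module is a third-party Python library for making network requests. It is not part of the standard library and has to be installed separately. Its main purpose is to simulate a browser sending requests. It is powerful, and its API is concise and efficient, which has made it one of the dominant tools in web scraping.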

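Why use the requests module

Working with the urllib module is inconvenient in several ways:
- URL encoding has to be handled manually
- POST request parameters have to be handled manually
- Handling cookies and proxies is tedious
- ...
With the requests module:
- URL encoding is handled automatically
- POST request parameters are handled automatically
- Cookie and proxy handling is simplified
- ...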
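
As a rough illustration of the URL-encoding point above, the sketch below builds the same query string both ways. The search path `/web` and the parameter names `query` and `page` are assumptions chosen for illustration and do not come from the original text. The request is only prepared, never sent.

```python
from urllib.parse import urlencode

import requests

# Hypothetical endpoint and query parameters, used only for illustration.
base_url = "https://www.sogou.com/web"
params = {"query": "python 爬蟲", "page": 1}

# urllib style: encode the query string yourself and join it to the URL.
manual_url = base_url + "?" + urlencode(params)

# requests style: pass the dict and let the library do the encoding.
# Preparing the request builds the final URL without sending anything.
prepared = requests.Request("GET", base_url, params=params).prepare()

print(manual_url)
print(prepared.url)
```

In everyday use the same `params` dict can be passed straight to `requests.get(url, params=params)`; the prepared request is used here only so that the encoded URL can be inspected without network access.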

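How to use the requests module

Installation:
- pip install requests
Purpose and characteristics
- Purpose: simulating a browser accessing the web.
- Characteristics: simple and efficient
Workflow
- Specify the URL
- Send the request with the requests module
- Read the data from the response object
- Persist the data to storage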

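Without parameters:

# Scrape the page data of the Sogou home page
import requests
# 1. Specify the URL
url = 'https://www.sogou.com/'
# 2. Send the request
response = requests.get(url=url)
# 3. Read the response data
page_text = response.text  # .text returns the body as a string
# 4. Persist the data to disk
with open('./sogou.html','w',encoding='utf-8') as fp:
    fp.write(page_text)
print('over!')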

With parameters:

# Baidu Translate
import requests

url = 'https://fanyi.baidu.com/sug'
word = input('enter an English word:')
# Pack the request parameters into a dict
data = {
    'kw':word
}
# Spoof the User-Agent so the request looks like it comes from a browser
headers = {
    'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'
}
response = requests.post(url=url,data=data,headers=headers)
# .text returns a string; .json() returns the parsed object
obj_json = response.json()

print(obj_json)
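
To make the difference between `response.text` and `response.json()` concrete without depending on an external site, here is a minimal sketch that posts to a throwaway local server. The `EchoHandler` class, the `/sug` path on localhost, and the echoed payload are all hypothetical stand-ins and do not describe how the real Baidu endpoint behaves.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

import requests


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a JSON API: echoes the posted 'kw' field."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = parse_qs(self.rfile.read(length).decode("utf-8"))
        body = json.dumps({"kw": form.get("kw", [""])[0]}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/sug"

session = requests.Session()
session.trust_env = False  # ignore proxy environment settings so the call stays local
response = session.post(url, data={"kw": "dog"}, timeout=5)

raw = response.text       # str: the response body as text
parsed = response.json()  # dict: the same body parsed as JSON

print(type(raw).__name__, raw)
print(type(parsed).__name__, parsed["kw"])

server.shutdown()
server.server_close()
```

`response.json()` is the convenient choice when the server is known to return JSON; `response.text` is useful for inspecting the raw body when parsing fails.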

Dynamically loaded data

# Scrape the locations of KFC restaurants for any given city
import requests

city = input('enter a cityName:')
url = 'http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword'
# Form fields sent with the POST request
data = {
    "cname": "",
    "pid": "",
    "keyword": city,
    "pageIndex": "2",
    "pageSize": "10",
}
# Spoof the User-Agent so the request looks like it comes from a browser
headers = {
    'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'
}
response = requests.post(url=url,headers=headers,data=data)

# The body is JSON; .text returns it as a raw string
json_text = response.text

print(json_text)
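
The example above requests a single page (`pageIndex` set to "2"). A possible way to walk through several pages is sketched below. The helper names `build_payload` and `fetch_page` are hypothetical, and the sketch assumes that `pageIndex` starts at 1 and that the endpoint still accepts these form fields, neither of which the original text confirms. Only the payloads are built here; `fetch_page` is defined but not called, so nothing is sent over the network.

```python
import requests

URL = "http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"
}


def build_payload(city, page_index, page_size=10):
    """Build the form fields for one page (hypothetical helper)."""
    return {
        "cname": "",
        "pid": "",
        "keyword": city,
        "pageIndex": str(page_index),
        "pageSize": str(page_size),
    }


def fetch_page(city, page_index):
    """Request one page and return the raw response body (hypothetical helper)."""
    response = requests.post(
        URL, headers=HEADERS, data=build_payload(city, page_index), timeout=10
    )
    response.raise_for_status()  # fail loudly on HTTP error status codes
    return response.text


# Building the payloads does not touch the network.
# Assumption: page numbering starts at 1.
payloads = [build_payload("北京", page) for page in range(1, 4)]
for payload in payloads:
    print(payload["pageIndex"], payload["keyword"])
```

To actually fetch the data, call `fetch_page(city, page)` inside the loop and stop once a page comes back with no stores.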