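requests is a powerful, easy-to-use HTTP library. You can install it with the command
pip install requests
Below we introduce the most commonly used methods in requests; for full details, see the official documentation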
Before we start, here is a website you can use for testing: http://www.httpbin.org/
This site returns information about the request you sent right in the page, which makes it ideal for practice
Alright, let's get started!
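The requests.get method sends a request to the target URL and receives the response
It returns a Response object. Its commonly used attributes and methods are shown in the example that follows; two of them have handy equivalents:
response.text is equivalent to response.content.decode('utf-8') when the body is UTF-8 encoded
response.json() is equivalent to json.loads(response.text)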
>>> import requests
>>> response = requests.get('http://www.httpbin.org/get')
>>> type(response)
# <class 'requests.models.Response'>
>>> print(response.url) # the URL that was requested
# http://www.httpbin.org/get
>>> print(response.status_code) # the status code of the response
# 200
>>> print(response.encoding) # the encoding of the response
# None
>>> print(response.cookies) # the cookies of the response
# <RequestsCookieJar[]>
>>> print(response.headers) # the response headers
# {'Connection': 'keep-alive', 'Server': 'gunicorn/19.9.0', 'Date': 'Sat, 18 Aug 2018 02:00:23 GMT', 'Content-Type': 'application/json', 'Content-Length': '275', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true', 'Via': '1.1 vegur'}
>>> type(response.content) # the response body as bytes
# <class 'bytes'>
>>> type(response.text) # the response body as str
# <class 'str'>
>>> type(response.json()) # the response body as dict
# <class 'dict'>
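The relationship between content, text, and json() can be checked without any network access. The sketch below is only an illustration: it builds a Response by hand and sets the private _content attribute, which real code would never need to do, and the body bytes are made up for the example.

```python
import json

import requests

# Hand-built Response, for illustration only; real ones come from requests.get().
response = requests.Response()
response.status_code = 200
response.encoding = 'utf-8'
# Private attribute, set here only to avoid a network call (made-up body).
response._content = '{"city": "廣州"}'.encode('utf-8')

# content holds the raw bytes; text is those bytes decoded with response.encoding.
text_matches = response.text == response.content.decode('utf-8')

# json() parses the body and gives the same result as json.loads(response.text).
json_matches = response.json() == json.loads(response.text)

print(text_matches, json_matches)
```

Note that when response.encoding is None, requests guesses an encoding for text, so the UTF-8 equivalence holds only for UTF-8 bodies.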
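The parameters of this method are described below:
url: required; specifies the request URL
params: dict; specifies the query parameters, typically used when sending GET requests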
>>> import requests
>>> url = 'http://www.httpbin.org/get'
>>> params = {
...     'key1': 'value1',
...     'key2': 'value2'
... }
>>> response = requests.get(url=url, params=params)
>>> print(response.text)
# {
#   "args": {    # the query parameters we set
#     "key1": "value1",
#     "key2": "value2"
#   },
#   "headers": {
#     "Accept": "*/*",
#     "Accept-Encoding": "gzip, deflate",
#     "Connection": "close",
#     "Host": "www.httpbin.org",
#     "User-Agent": "python-requests/2.19.1"
#   },
#   "origin": "110.64.88.141",
#   "url": "http://www.httpbin.org/get?key1=value1&key2=value2"
# }
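To see how params ends up in the URL without sending anything, one option is to prepare the request and inspect it. This is a minimal sketch; the tag parameter in the second half is a hypothetical name used only to show how a list value is encoded.

```python
import requests

url = 'http://www.httpbin.org/get'
params = {'key1': 'value1', 'key2': 'value2'}

# Build the request without sending it, then inspect the final URL.
prepared = requests.Request('GET', url, params=params).prepare()
print(prepared.url)
# http://www.httpbin.org/get?key1=value1&key2=value2

# A list value is expanded into one key=value pair per item.
# ('tag' is a made-up parameter name for this illustration.)
list_url = requests.Request('GET', url, params={'tag': ['a', 'b']}).prepare().url
print(list_url)
# http://www.httpbin.org/get?tag=a&tag=b
```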
data: dict; specifies the form data, typically used when sending POST requests
Note: use the post method here; simply replace get with post
>>> import requests
>>> url = 'http://www.httpbin.org/post'
>>> data = {
...     'key1': 'value1',
...     'key2': 'value2'
... }
>>> response = requests.post(url=url, data=data)
>>> print(response.text)
# {
#   "args": {},
#   "data": "",
#   "files": {},
#   "form": {    # the form data we set
#     "key1": "value1",
#     "key2": "value2"
#   },
#   "headers": {
#     "Accept": "*/*",
#     "Accept-Encoding": "gzip, deflate",
#     "Connection": "close",
#     "Content-Length": "23",
#     "Content-Type": "application/x-www-form-urlencoded",
#     "Host": "www.httpbin.org",
#     "User-Agent": "python-requests/2.19.1"
#   },
#   "json": null,
#   "origin": "116.16.107.178",
#   "url": "http://www.httpbin.org/post"
# }
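What data actually turns into on the wire can be inspected offline as well. The sketch below prepares the same POST request without sending it and prints the encoded body and the headers that requests derives from it.

```python
import requests

url = 'http://www.httpbin.org/post'
data = {'key1': 'value1', 'key2': 'value2'}

# Prepare the request without sending it.
prepared = requests.Request('POST', url, data=data).prepare()

print(prepared.body)                       # key1=value1&key2=value2
print(prepared.headers['Content-Type'])    # application/x-www-form-urlencoded
print(prepared.headers['Content-Length'])  # 23
```

The dict is form-encoded into the request body, which is why httpbin reports it under "form" rather than "args".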
headers: dict; specifies the request headers
>>> import requests
>>> url = 'http://www.httpbin.org/headers'
>>> headers = {
...     'USER-AGENT': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
... }
>>> response = requests.get(url=url, headers=headers)
>>> print(response.text)
# {
#   "headers": {
#     "Accept": "*/*",
#     "Accept-Encoding": "gzip, deflate",
#     "Connection": "close",
#     "Host": "www.httpbin.org",
#     "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36"    # the request header we set
#   }
# }
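Header names are case-insensitive, which is why 'USER-AGENT' above comes back as "User-Agent". The following sketch shows this offline, and also shows one way to reuse the same headers for many requests through a Session; the User-Agent value my-crawler/0.1 is a made-up placeholder.

```python
import requests

url = 'http://www.httpbin.org/headers'
headers = {'USER-AGENT': 'my-crawler/0.1'}  # hypothetical User-Agent value

# Header lookups ignore case, whatever spelling was used to set them.
prepared = requests.Request('GET', url, headers=headers).prepare()
print(prepared.headers['User-Agent'])  # my-crawler/0.1
print(prepared.headers['user-agent'])  # my-crawler/0.1

# Headers stored on a Session are merged into every request made through it,
# replacing the default python-requests User-Agent.
session = requests.Session()
session.headers.update(headers)
merged = session.prepare_request(requests.Request('GET', url))
print(merged.headers['User-Agent'])    # my-crawler/0.1
```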
proxies: dict; specifies the proxies to use, keyed by URL scheme
>>> import requests
>>> url = 'http://www.httpbin.org/ip'
>>> proxies = {
...     'http': 'http://182.88.178.128:8123',
...     'https': 'http://61.135.217.7:80'
... }
>>> response = requests.get(url=url, proxies=proxies)
>>> print(response.text)
# {
#   "origin": "182.88.178.128"
# }
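The keys of the proxies dict matter: each URL scheme gets its own entry, and repeating a key silently drops the earlier value. The sketch below illustrates both points offline using select_proxy, a helper from requests.utils that requests uses internally to pick a proxy; treating it as a stable public interface is an assumption.

```python
from requests.utils import select_proxy

# A dict literal keeps only the last value for a repeated key.
duplicated = {
    'http': 'http://182.88.178.128:8123',
    'http': 'http://61.135.217.7:80',
}
print(duplicated)  # {'http': 'http://61.135.217.7:80'}

# One entry per scheme: the request URL's scheme decides which proxy is used.
proxies = {
    'http': 'http://182.88.178.128:8123',
    'https': 'http://61.135.217.7:80',
}
http_proxy = select_proxy('http://www.httpbin.org/ip', proxies)
https_proxy = select_proxy('https://www.httpbin.org/ip', proxies)
print(http_proxy)   # http://182.88.178.128:8123
print(https_proxy)  # http://61.135.217.7:80
```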
cookies: dict; specifies the cookies to send
>>> import requests
>>> url = 'http://www.httpbin.org/cookies'
>>> cookies = {
...     'name1': 'value1',
...     'name2': 'value2'
... }
>>> response = requests.get(url=url, cookies=cookies)
>>> print(response.text)
# {
#   "cookies": {
#     "name1": "value1",
#     "name2": "value2"
#   }
# }
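Under the hood, the cookies dict is sent as a single Cookie request header. A minimal offline sketch that prepares the request and splits that header back into its name=value pairs:

```python
import requests

url = 'http://www.httpbin.org/cookies'
cookies = {'name1': 'value1', 'name2': 'value2'}

# Prepare the request without sending it and read the generated header.
prepared = requests.Request('GET', url, cookies=cookies).prepare()
cookie_header = prepared.headers['Cookie']

# The header joins the pairs with "; "; sort them for a stable display order.
pairs = sorted(cookie_header.split('; '))
print(pairs)  # ['name1=value1', 'name2=value2']
```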
auth: tuple; specifies the username and password used to log in
>>> import requests
>>> url = 'http://www.httpbin.org/basic-auth/user/password'
>>> auth = ('user', 'password')
>>> response = requests.get(url=url, auth=auth)
>>> print(response.text)
# {
#   "authenticated": true,
#   "user": "user"
# }
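A (username, password) tuple is treated as HTTP Basic authentication: requests base64-encodes "user:password" and sends it in the Authorization header. The sketch below checks this offline by preparing the request instead of sending it.

```python
import base64

import requests

url = 'http://www.httpbin.org/basic-auth/user/password'
auth = ('user', 'password')

# Prepare the request without sending it.
prepared = requests.Request('GET', url, auth=auth).prepare()

# Basic auth is "Basic " followed by base64("username:password").
expected = 'Basic ' + base64.b64encode(b'user:password').decode('ascii')
header_matches = prepared.headers['Authorization'] == expected
print(header_matches)  # True
```

Because base64 is an encoding rather than encryption, Basic authentication is only safe over HTTPS.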
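verify: bool; specifies whether to verify the site's certificate when making the request. The default is True, meaning the certificate is verified
If you do not want certificate verification, set it to False
>>> import requests
>>> response = requests.get(url='https://www.httpbin.org/', verify=False)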
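In that case, however, a Warning usually appears, because requests (through urllib3) wants us to use certificate verification
If you would rather not see the Warning, you can silence it with the following command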
>>> requests.packages.urllib3.disable_warnings()
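disable_warnings() turns the warning off for the whole process. If that is broader than you want, one alternative is to silence it only around the calls that need it, using the standard warnings module. This is a sketch under assumptions: insecure_warnings_suppressed is a hypothetical helper name, and the request itself is left commented out because it needs network access.

```python
import warnings
from contextlib import contextmanager

from urllib3.exceptions import InsecureRequestWarning


@contextmanager
def insecure_warnings_suppressed():
    """Hypothetical helper: ignore InsecureRequestWarning inside the with-block only."""
    with warnings.catch_warnings():
        warnings.simplefilter('ignore', InsecureRequestWarning)
        yield


# Intended use (commented out because it requires network access):
# import requests
# with insecure_warnings_suppressed():
#     response = requests.get('https://www.httpbin.org/', verify=False)
```

Outside the with-block the warning filters are restored, so other unverified requests still produce the warning.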
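timeout: specifies the timeout in seconds; if no response arrives within that time, an exception is raised
exceptions is the module in requests that defines its exception classes, including common ones such as Timeout, ConnectionError, and HTTPError
Note: all exceptions that requests explicitly raises inherit from requests.exceptions.RequestException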
>>> import requests
>>> try:
...     response = requests.get('http://www.httpbin.org/get', timeout=0.1)
... except requests.exceptions.RequestException as e:
...     if isinstance(e, requests.exceptions.Timeout):
...         print("Time out")
...
# Time out
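Since every exception shares the RequestException base class, a single except clause can catch them all and then tell them apart. The sketch below is one possible way to organize that; describe_failure and fetch are hypothetical helper names, and the timeout values are arbitrary.

```python
import requests


def describe_failure(exc):
    """Hypothetical helper: map a requests exception to a short label."""
    # Check Timeout first: ConnectTimeout inherits from both
    # ConnectionError and Timeout.
    if isinstance(exc, requests.exceptions.Timeout):
        return 'timeout'
    if isinstance(exc, requests.exceptions.ConnectionError):
        return 'connection error'
    if isinstance(exc, requests.exceptions.HTTPError):
        return 'bad HTTP status'
    return 'other request error'


def fetch(url, timeout=(3, 10)):
    """Return the response text, or None after printing why the request failed."""
    try:
        # A tuple sets separate (connect, read) timeouts in seconds.
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # raises HTTPError for 4xx/5xx status codes
        return response.text
    except requests.exceptions.RequestException as exc:
        print(describe_failure(exc))
        return None
```

Calling fetch('http://www.httpbin.org/get', timeout=0.1) would typically print "timeout" on a slow connection, matching the example above.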