Pitfalls of POST payloads in Scrapy, with background notes [POST parameter formats explained, with Scrapy and requests implementations]

1. Problem and solution:

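When sending a POST request with Scrapy, I used the wrong submission format.

The request should have been sent as application/x-www-form-urlencoded, but I sent it as application/json.

Fixing it requires two changes: how the dict passed to body is encoded, and the Content-Type value in the headers.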
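
The two changes can be sketched with the standard library alone. This is only an illustration, not the original spider code: `build_post` is a hypothetical helper and the payload is made up. It shows that the body encoding and the Content-Type header have to be switched together.

```python
import json
from urllib.parse import urlencode


def build_post(payload, as_json):
    """Return (body, headers) for a POST request.

    Hypothetical helper for illustration: the body encoding and the
    Content-Type header are always chosen as a pair.
    """
    if as_json:
        # JSON body: the format that was sent by mistake
        body = json.dumps(payload)
        content_type = 'application/json'
    else:
        # Form body: the format the server actually expected
        body = urlencode(payload)
        content_type = 'application/x-www-form-urlencoded'
    return body, {'Content-Type': content_type}


payload = {'page': 1, 'branch': 'guide'}  # made-up example parameters
wrong_body, wrong_headers = build_post(payload, as_json=True)
right_body, right_headers = build_post(payload, as_json=False)
print(wrong_body)  # {"page": 1, "branch": "guide"}
print(right_body)  # page=1&branch=guide
```

A server that expects form data will usually not find any parameters in the first body, even though both requests carry the same dict.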

 

[Request screenshot]

 

Code: (red marks the original, incorrect code; green marks the corrected code; yellow marks the parts that changed)

 

 

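2. POST parameter formats explained, with Scrapy and requests implementations:

(1) application/x-www-form-urlencoded

This is the default for HTML forms, and it is what requests uses when data is a dict and no Content-Type is set. The submitted data is encoded as key1=val1&key2=val2.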
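
To see what the key1=val1&key2=val2 encoding looks like in practice, here is a small round trip using only the standard library. The keys and values are made up; the second value contains a space and an `&` to show the escaping.

```python
from urllib.parse import parse_qs, urlencode

payload = {'key1': 'val1', 'key2': 'a b&c'}  # made-up values

# urlencode joins the pairs with '&' and escapes reserved characters
encoded = urlencode(payload)
print(encoded)  # key1=val1&key2=a+b%26c

# parse_qs reverses the encoding; every value comes back as a list of strings
decoded = parse_qs(encoded)
print(decoded)  # {'key1': ['val1'], 'key2': ['a b&c']}
```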

  • requests:
# -*- encoding:UTF-8 -*-
import sys

import requests

# Import urlencode from the right module for the running Python version
if sys.version_info[0] > 2:
    from urllib.parse import urlencode
else:
    from urllib import urlencode

url = "http://xxxx.com"
payload_dict = {'aaa': '111'}
# Encode the dict as aaa=111; passing the dict itself as data= has the same effect
data = urlencode(payload_dict)
headers = {'Content-Type': "application/x-www-form-urlencoded"}
response = requests.request("POST", url, data=data, headers=headers)
print(response.text)

 

  • scrapy:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys

import scrapy

if sys.version_info[0] > 2:
    from urllib.parse import urlencode
else:
    from urllib import urlencode


# Method of a scrapy.Spider subclass; the two yields are alternatives, use one of them
def start_requests(self):
    url = 'http://xxxx.com'
    # FormRequest needs string values, so '1' rather than 1
    payload_dict = {'page': '1'}
    form_headers = {'Content-Type': 'application/x-www-form-urlencoded'}
    # With a plain Request, url-encode the dict yourself and pass it as body
    yield scrapy.Request(url=url, method='POST', body=urlencode(payload_dict),
                         headers=form_headers, callback=self.parse, dont_filter=True)
    # With Scrapy's own FormRequest, pass the dict to formdata; it is encoded for you
    yield scrapy.FormRequest(url=url, method='POST', formdata=payload_dict,
                             headers=form_headers, callback=self.parse)
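
One detail to watch when switching between the two variants: `urlencode` converts integer values silently, whereas `FormRequest`, as far as I know, rejects non-string values in `formdata` with a TypeError. A minimal sketch of normalizing a payload first, using only the standard library. `stringify_formdata` is a hypothetical helper, not part of Scrapy.

```python
from urllib.parse import urlencode


def stringify_formdata(payload):
    """Return a copy of payload with every key and value converted to str.

    Hypothetical helper: list values stay lists (of strings) so that
    repeated keys such as tags=a&tags=b remain possible.
    """
    result = {}
    for key, value in payload.items():
        if isinstance(value, (list, tuple)):
            result[str(key)] = [str(item) for item in value]
        else:
            result[str(key)] = str(value)
    return result


formdata = stringify_formdata({'page': 1, 'tags': ['a', 'b']})
print(formdata)  # {'page': '1', 'tags': ['a', 'b']}

# doseq=True expands list values into repeated keys
print(urlencode(formdata, doseq=True))  # page=1&tags=a&tags=b
```

The resulting dict is the kind of value that can be passed as `formdata=` to `scrapy.FormRequest`.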

 

(2) application/json:

The request parameters are written into the body as JSON, and the backend parses them as JSON as well.
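
A short standard-library sketch of what "written as JSON, parsed as JSON" means in practice. The payload is made up, and the "server side" here is just `json.loads` standing in for whatever the backend actually does. The comparison at the end shows that form encoding turns every value into a string, while JSON keeps types and nesting.

```python
import json
from urllib.parse import parse_qs, urlencode

payload = {'page': 1, 'filters': ['Y>=1978', 'Y<=1978'], 'ascending': True}

# Client side: serialize the dict into the request body
body = json.dumps(payload)

# Server side (sketch): parse the body back; types and nesting survive
received = json.loads(body)
print(received == payload)     # True
print(type(received['page']))  # <class 'int'>

# For comparison, form encoding flattens every value to a string
form_received = parse_qs(urlencode(payload, doseq=True))
print(form_received['page'])       # ['1']
print(form_received['ascending'])  # ['True']
```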

  • requests:
# -*- encoding:UTF-8 -*-
import json

import requests

url = "https://xxxx.com"
# Parameters to send
payload = {'page': 1, 'branch': 'guide'}
headers = {'Content-Type': "application/json"}

# Serialize the parameters to JSON and pass the string as the body
response = requests.request("POST", url, data=json.dumps(payload), headers=headers)
print(response.json())

 

  • scrapy:
# -*- coding: utf-8 -*-
import json

import scrapy

# 'true' is sent as the JSON string "true"; use True if the API expects a boolean
data_raw = {
    "query": "coronavirus ",
    "queryExpression": "",
    "filters": ["Y>=1978", "Y<=1978"],
    "orderBy": 0,
    "skip": 0,
    "sortAscending": 'true',
    "take": 10,
    "includeCitationContexts": 'true',
    "profileId": "",
}
url = 'https://academic.microsoft.com/api/search'


# Method of a scrapy.Spider subclass
def start_requests(self):
    # Pass the JSON-encoded parameters as body
    yield scrapy.Request(url, method="POST", body=json.dumps(data_raw),
                         headers={'Content-Type': 'application/json'}, callback=self.parse)

 

(3) multipart/form-data: used when the submitted form uploads files.
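
To make the format concrete, here is a rough sketch of how a multipart/form-data body is put together by hand. `encode_multipart`, the field names, and the file content are all made up for illustration, and the sketch does not escape quotes in names or filenames. In real code, requests builds this body for you when you pass the `files=` argument.

```python
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body (illustrative sketch).

    fields: dict of name -> str value
    files:  dict of name -> (filename, content bytes, content type)
    Returns (body bytes, Content-Type header value).
    """
    boundary = uuid.uuid4().hex
    delimiter = b'--' + boundary.encode('ascii')
    lines = []
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(('Content-Disposition: form-data; name="%s"' % name).encode('utf-8'))
        lines.append(b'')  # blank line separates part headers from content
        lines.append(value.encode('utf-8'))
    for name, (filename, content, ctype) in files.items():
        lines.append(delimiter)
        lines.append(('Content-Disposition: form-data; name="%s"; filename="%s"'
                      % (name, filename)).encode('utf-8'))
        lines.append(('Content-Type: %s' % ctype).encode('ascii'))
        lines.append(b'')
        lines.append(content)
    lines.append(delimiter + b'--')  # closing delimiter
    lines.append(b'')
    return b'\r\n'.join(lines), 'multipart/form-data; boundary=' + boundary


body, content_type = encode_multipart(
    {'title': 'demo'},
    {'file': ('hello.txt', b'hello', 'text/plain')},
)
print(content_type)
print(body.decode('utf-8'))
```

The boundary has to be repeated in the Content-Type header so the server knows where each part starts and ends.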

(4) text/xml: rarely used nowadays (XML is verbose, and in most scenarios JSON is more flexible and convenient).
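
For comparison, the same small payload written as an XML body and as a JSON body, using only the standard library. The root element name `request` and the payload are arbitrary choices for this sketch, not something a particular API requires.

```python
import json
import xml.etree.ElementTree as ET

payload = {'page': '1', 'branch': 'guide'}  # made-up parameters

# Build one child element per parameter under an arbitrary root element
root = ET.Element('request')
for key, value in payload.items():
    ET.SubElement(root, key).text = value
xml_body = ET.tostring(root, encoding='unicode')
json_body = json.dumps(payload)

print(xml_body)   # <request><page>1</page><branch>guide</branch></request>
print(json_body)  # {"page": "1", "branch": "guide"}

# The XML body would be sent with this header
headers = {'Content-Type': 'text/xml'}
```

Every key appears twice in the XML version (opening and closing tag), which is where most of the extra size comes from.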
