This walkthrough uses the first example from the official Scrapy tutorial. Create the project with:
scrapy startproject tutorial
Open the project in PyCharm, create a file named quotes_spider.py in the tutorial/tutorial/spiders directory, and put the following code in it:
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
            'http://quotes.toscrape.com/page/2/',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = 'quotes-%s.html' % page
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log('Saved file %s' % filename)
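The file name in parse comes from the second-to-last piece of the URL. Here is a small standalone sketch of that step, with no Scrapy needed; page_filename is a hypothetical helper name used only for illustration:

```python
def page_filename(url):
    # The URL ends with "/", so the last element after splitting is an
    # empty string and the page number sits at index -2.
    page = url.split("/")[-2]
    return "quotes-%s.html" % page


print(page_filename("http://quotes.toscrape.com/page/1/"))  # quotes-1.html
print(page_filename("http://quotes.toscrape.com/page/2/"))  # quotes-2.html
```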
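Find the cmdline.py file inside the installed scrapy package (for example, mine is at D:\Language\Miniconda3\envs\default\Lib\site-packages\scrapy\cmdline.py).
Copy it into the root directory of the tutorial project (the same directory that contains scrapy.cfg).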
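If you would rather not hunt for the file by hand, the lookup and copy can be sketched in Python. This is only a rough sketch: find_package_file and copy_into_project are hypothetical helper names, and it assumes the package is installed as a normal directory in the active interpreter's environment.

```python
import importlib.util
import shutil
from pathlib import Path


def find_package_file(package, filename):
    """Return the path of filename inside an installed top-level package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None


def copy_into_project(package, filename, project_root):
    """Copy the package file into project_root and return the new path."""
    source = find_package_file(package, filename)
    if source is None:
        raise FileNotFoundError(f"{filename} not found in package {package!r}")
    return Path(shutil.copy(source, Path(project_root) / filename))
```

For this project the call would be something like copy_into_project("scrapy", "cmdline.py", r"F:\PycharmProjects\tutorial"), run with the same interpreter that PyCharm uses for the project.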
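In PyCharm, open Run > Edit Configurations, add a new Python configuration, and fill in the following fields:
Name--the same as the spider file created above; I call mine quotes_spider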
Script path--select the cmdline.py in the current project; mine is F:\PycharmProjects\tutorial\cmdline.py
Parameters--crawl followed by the name of the spider to debug; mine is crawl quotes
Working directory--the project's root directory; mine is F:\PycharmProjects\tutorial
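For reference, the Parameters and Working directory fields correspond roughly to the following Python. This is a sketch of an alternative sometimes used instead of the copied cmdline.py, not part of the steps above: build_argv and run_spider are hypothetical names, and it assumes Scrapy is installed in the project interpreter. It relies on scrapy.cmdline.execute, which accepts an argv list.

```python
import os


def build_argv(spider_name):
    # Mirrors the Parameters field: "crawl <spider name>".
    # The first element stands in for the program name.
    return ["scrapy", "crawl", spider_name]


def run_spider(spider_name, project_dir):
    # Mirrors the Working directory field: Scrapy looks for scrapy.cfg
    # starting from the current directory.
    os.chdir(project_dir)
    # Imported here so build_argv can be used without Scrapy installed.
    from scrapy.cmdline import execute
    execute(build_argv(spider_name))


# Example call (assumed project path from this walkthrough):
# run_spider("quotes", r"F:\PycharmProjects\tutorial")
```

Saved as a script in the project root, a file like this could be used as the Script path, with the Parameters field left empty.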
Finally, make sure to click "Apply"; do not just click "OK".
Choose Debug, and the program stops at the breakpoint as expected.
Choose Run, and the program also runs through successfully.