Scrapy: accessing database content in a spider through the pipeline

http://stackoverflow.com/questions/23105590/how-to-get-the-pipeline-object-in-scrapy-spiderpython

The idea from that answer: implement open_spider() in the pipeline and have it attach the pipeline instance to the spider, so the spider can later call pipeline methods (here, one that reads dates from MongoDB) inside parse().

# This is my pipeline
import pymongo
from scrapy.conf import settings  # legacy settings access, as used in the original snippet


class MongoDBPipeline(object):
    def __init__(self, mongodb_db=None, mongodb_collection=None):
        # MongoClient replaces the deprecated pymongo.Connection
        self.connection = pymongo.MongoClient(settings['MONGODB_SERVER'], settings['MONGODB_PORT'])
        ...

    def process_item(self, item, spider):
        ...

    def get_date(self):
        ...

    def open_spider(self, spider):
        # Called by Scrapy when the spider opens; expose this pipeline on the spider
        spider.myPipeline = self

from scrapy import Spider


class TestSpider(Spider):
    name = "test"

    def __init__(self, *args, **kwargs):
        super(TestSpider, self).__init__(*args, **kwargs)
        self.myPipeline = None  # set by MongoDBPipeline.open_spider()

    def parse(self, response):
        # open_spider() has already run by the time parse() is called
        self.myPipeline.get_date()
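
For open_spider() to be called at all, the pipeline must be enabled in the project settings. A minimal sketch, assuming the pipeline lives in a module named myproject.pipelines; the module path, priority number and MongoDB values below are placeholders, not from the original post:

# settings.py (sketch)
ITEM_PIPELINES = {
    'myproject.pipelines.MongoDBPipeline': 300,
}

MONGODB_SERVER = 'localhost'
MONGODB_PORT = 27017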