Sending email with Scrapy

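Use case: get an email notification when a spider closes or goes idle.

Sending is implemented on top of Twisted's non-blocking IO. The code can live directly in the spider, or in a middleware or an extension, depending on what you need.

I looked at a lot of tutorials online. They were either many years old or copied from the official documentation, with no working code at all, so I tried it myself. I am new to scraping too, so please go easy on me!

Before reading the sample code below, have a look at the official documentation and get familiar with the basic attributes.

Official documentation, Sending e-mail: <https://docs.scrapy.org/en/latest/topics/email.html?highlight=MailSender>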

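  1. First, create an extendions (extensions) folder in the same directory as settings. The EXTENSIONS path used in step 2 refers to a sendmail module inside this folder.

    The code is as follows: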

    import logging

    from scrapy import signals
    from scrapy.exceptions import NotConfigured
    from scrapy.mail import MailSender

    logger = logging.getLogger(__name__)


    class SendEmail(object):
        """Extension that sends a notification email when the spider closes."""

        def __init__(self, sender, crawler):
            self.sender = sender
            crawler.signals.connect(self.spider_idle, signal=signals.spider_idle)
            crawler.signals.connect(self.spider_closed, signal=signals.spider_closed)

        @classmethod
        def from_crawler(cls, crawler):
            # The extension stays disabled unless MYEXT_ENABLED is set in settings.
            if not crawler.settings.getbool('MYEXT_ENABLED'):
                raise NotConfigured

            mail_host = crawler.settings.get('MAIL_HOST')  # SMTP server used to send the mail
            mail_port = crawler.settings.getint('MAIL_PORT')  # SMTP port
            mail_user = crawler.settings.get('MAIL_USER')  # sender account
            # Not the password you registered with, but the authorization code. Remember this!
            mail_pass = crawler.settings.get('MAIL_PASS')

            # The sender address and the login account are the same here,
            # so mail_user is passed for both mailfrom and smtpuser.
            sender = MailSender(mail_host, mail_user, mail_user, mail_pass, mail_port)
            return cls(sender, crawler)

        def spider_idle(self, spider):
            logger.info('idle spider %s', spider.name)

        def spider_closed(self, spider):
            logger.info('closed spider %s', spider.name)
            body = 'spider[%s] is closed' % spider.name
            subject = '[%s] good!!!' % spider.name
            # Return the result of send(); calling send() without returning it
            # raises the 'bio_read' error described below.
            return self.sender.send(to=['zfeijun@foxmail.com'], subject=subject, body=body)

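    Why return self.sender.send here? Calling sender.send without returning its result raises builtins.AttributeError: 'NoneType' object has no attribute 'bio_read' (the email is still sent successfully). I don't fully understand the cause, so if anyone knows, please explain. A likely explanation is that send returns a Twisted Deferred, and returning it from the spider_closed handler lets Scrapy wait for the send to finish before shutting down.

    The fix comes from: <https://github.com/scrapy/scrapy/issues/3478>

    Putting return in front of sender.send is all it takes.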

  2. Once the extension code is written, enable it in settings:

    EXTENSIONS = {
        # 'scrapy.extensions.telnet.TelnetConsole': 300,
        'bukalapak.extendions.sendmail.SendEmail': 300,
    }
    MYEXT_ENABLED = True
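
The extension also reads MAIL_HOST, MAIL_PORT, MAIL_USER and MAIL_PASS from settings, which the snippet above does not show. Below is a minimal sketch of those entries. The host, the account and the SCRAPY_MAIL_PASS environment variable name are placeholders I made up for illustration, so replace them with the values from your own mail provider.

```python
import os

# SMTP server of your mail provider (placeholder host, an assumption).
MAIL_HOST = "smtp.example.com"

# SMTP port. 25 is Scrapy's default for MAIL_PORT; your provider may need another.
MAIL_PORT = 25

# Account used both to log in and as the sender address (placeholder).
MAIL_USER = "sender@example.com"

# Authorization code issued by the provider, not the registration password.
# Read from an environment variable (hypothetical name) so that it is not
# committed together with the project.
MAIL_PASS = os.environ.get("SCRAPY_MAIL_PASS", "")
```

Reading the authorization code from the environment is only one option. Writing it directly in settings works as well, as long as the file is not published.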

Please credit the source when reposting!
