Ways to Implement Scheduled Tasks in Python

Date: 2019-12-08


This article is reposted from: https://lz5z.com/Python%E5%AE%9A%E6%97%B6%E4%BB%BB%E5%8A%A1%E7%9A%84%E5%AE%9E%E7%8E%B0%E6%96%B9%E5%BC%8F/

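Background

My current project team often needs to run scheduled tasks, so we chose to implement the timers in Python.

Implementing scheduled tasks in Python

Looping with sleep

This is the simplest approach: put the task inside a loop, sleep for a while, then run it again.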

from datetime import datetime
import time
# Run the task once every n seconds
def timer(n):
    while True:
        print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
        time.sleep(n)
# 5s
timer(5)

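The drawback is that this only handles tasks that repeat at a fixed interval. It cannot run a task at a specific time of day, such as waking me up at 6:30 in the morning. In addition, sleep is a blocking call: while the thread is sleeping, it can do nothing else.

Timer in the threading module

Timer in the threading module is non-blocking (it runs the callback in a separate thread), which makes it slightly better than sleep, but it still cannot wake me up at a given time.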

from datetime import datetime
from threading import Timer
# Print the current time, then start a new Timer for the next run
def printTime(inc):
    print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
    t = Timer(inc, printTime, (inc,))
    t.start()
# 5s
printTime(5)

Timer takes the interval in seconds as its first argument, the function to call as its second, and the arguments for that function (as a tuple) as its third.
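As a rough illustration of the non-blocking behaviour, the sketch below starts a one-shot Timer, lets the main thread carry on, and then cancels a second Timer before it fires. The helper name run_once_after and the two Event objects are made up for this example and are not part of the original article.

```python
import threading
from threading import Timer

def run_once_after(delay, func, *args):
    """Start a one-shot Timer and return it so the caller can cancel it."""
    t = Timer(delay, func, args)
    t.daemon = True  # do not keep the interpreter alive just for this timer
    t.start()        # returns immediately; func runs later in another thread
    return t

fired = threading.Event()
t1 = run_once_after(0.1, fired.set)
print("main thread keeps going while the timer waits")
fired.wait(2)        # block here only to observe the result

cancelled = threading.Event()
t2 = run_once_after(5, cancelled.set)
t2.cancel()          # stops the timer while it is still waiting
t2.join(1)
```

Timer.cancel() only has an effect while the timer is still waiting; once the callback has started, it runs to completion.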

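Using the sched module

sched is a module in the Python standard library. It is a general-purpose event scheduler (a delayed-execution mechanism): every time you want a task to run, you must enter a new event into the scheduler.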

import sched
import time
from datetime import datetime
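# Create an instance of the scheduler class from the sched module.
# The first argument is a function that returns the current timestamp;
# the second is a function that blocks until the scheduled time arrives.
schedule = sched.scheduler(time.time, time.sleep)
# Function triggered periodically by the scheduler
def printTime(inc):
    print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
    # Re-enter the event so that it runs again after inc seconds
    schedule.enter(inc, 0, printTime, (inc,))
# Default interval: 60 s
def main(inc=60):
    # enter takes four arguments: the delay, the priority (which orders
    # events scheduled for the same time), the function to call,
    # and the arguments for that function (as a tuple)
    schedule.enter(0, 0, printTime, (inc,))
    schedule.run()
# Print once every 10 s
main(10)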

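The steps for using sched are as follows:

(1) Create a scheduler:
s = sched.scheduler(time.time, time.sleep)
The first argument is a function that returns the current timestamp; the second is a function that blocks until the scheduled time arrives.

(2) Add an event
Both enter and enterabs are available (enter takes a relative delay, enterabs an absolute time); we use enter as the example.
s.enter(x1, x2, x3, x4)
The four arguments are: the delay, the priority (which orders events scheduled for the same time; a lower number means a higher priority), the function to call, and the arguments for that function. Note that the arguments should be given as a tuple; with a single argument, write (xx,).

(3) Run
s.run()
Note that sched does not loop: once an event has run, it is finished. To run it again, call enter again.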
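The role of the priority argument is easiest to see with enterabs, which schedules events at an absolute timestamp. The following sketch (the function name run_in_priority_order is only illustrative) queues two events for the same instant and one later event, then records the order in which they run.

```python
import sched
import time

def run_in_priority_order():
    """Queue three events and return the labels in execution order."""
    order = []
    s = sched.scheduler(time.time, time.sleep)
    when = time.time() + 0.1  # one shared absolute timestamp
    # Same time, different priorities: the lower number runs first
    s.enterabs(when, 2, order.append, ("low priority",))
    s.enterabs(when, 1, order.append, ("high priority",))
    # Relative delay of 0.2 s, so this one runs after the other two
    s.enter(0.2, 1, order.append, ("relative delay",))
    s.run()  # blocks until the queue is empty
    return order

print(run_in_priority_order())
```

Because run() returns once the queue is empty, nothing here repeats; a periodic task has to re-enter itself, as in the printTime example.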

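The APScheduler framework

Finally, a way to wake me up at a set time every day.

APScheduler is a task scheduling framework for Python that is very convenient to use. It supports date-based, fixed-interval, and crontab-style jobs, can persist jobs, and can run the application as a daemon.

APScheduler must be installed before use:

$ pip install apscheduler

First, an example that wakes me up at 6:30 every morning from Monday to Friday: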

from apscheduler.schedulers.blocking import BlockingScheduler
from datetime import datetime
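# Print the current time
def job():
    print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
# BlockingScheduler
scheduler = BlockingScheduler()
# In APScheduler, numeric day_of_week values start at 0 = Monday,
# so Monday to Friday is 'mon-fri' (or '0-4'), not '1-5'
scheduler.add_job(job, 'cron', day_of_week='mon-fri', hour=6, minute=30)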
scheduler.start()

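What is the BlockingScheduler in this code?

BlockingScheduler is one of APScheduler's schedulers. The two most commonly used are BlockingScheduler and BackgroundScheduler: use BlockingScheduler when the scheduler is the only thing running in the process, and BackgroundScheduler when you want the scheduler to run in the background of your application. The APScheduler documentation describes the available schedulers as follows: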

BlockingScheduler: use when the scheduler is the only thing running in your process

BackgroundScheduler: use when you’re not using any of the frameworks below, and want the scheduler to run in the background inside your application

AsyncIOScheduler: use if your application uses the asyncio module

GeventScheduler: use if your application uses gevent

TornadoScheduler: use if you’re building a Tornado application

TwistedScheduler: use if you’re building a Twisted application

QtScheduler: use if you’re building a Qt application

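The four components of APScheduler

APScheduler has four kinds of components: triggers, job stores, executors, and schedulers.

Triggers

Triggers contain the scheduling logic. Each job has its own trigger, which determines when the job should run next. Apart from their initial configuration, triggers are completely stateless.
APScheduler has three built-in trigger types:

date: fires once at a specific point in time
interval: fires at fixed time intervals
cron: fires periodically at specific times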

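Job stores

Job stores hold the scheduled jobs. The default job store simply keeps jobs in memory; the others keep them in a database. A job's data is serialized when it is saved to a persistent job store and deserialized when it is loaded back. A job store must not be shared between schedulers.
APScheduler uses MemoryJobStore by default; this can be changed to a database-backed store.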

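Executors

Executors handle the running of jobs. They typically do this by submitting the job's designated callable to a thread or process pool. When the job is done, the executor notifies the scheduler.
The two most commonly used executors are:

ProcessPoolExecutor
ThreadPoolExecutor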

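Schedulers

An application usually has only one scheduler. The application developer does not normally deal with job stores, executors, or triggers directly; instead, the scheduler provides the interface for handling them. Configuring job stores and executors is done through the scheduler, as is adding, modifying, and removing jobs.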

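Configuring the scheduler

APScheduler offers several ways to configure the scheduler: you can pass a configuration dictionary or pass the options as keyword arguments. You can also create the scheduler first and then configure it and add jobs, which gives you more flexibility across different environments.

Here is a simple BlockingScheduler example: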

from apscheduler.schedulers.blocking import BlockingScheduler
from datetime import datetime

def job():
    print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
# Create a BlockingScheduler
sched = BlockingScheduler()
sched.add_job(job, 'interval', seconds=5)
sched.start()

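The code above creates a BlockingScheduler with the default in-memory job store and the default executor (MemoryJobStore and ThreadPoolExecutor respectively, with a maximum of 10 threads in the pool). Once configured, the scheduler is started with the start() method.

To set the job store (MongoDB here) and the executors explicitly, you can write: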

from datetime import datetime
from pymongo import MongoClient
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.jobstores.memory import MemoryJobStore
from apscheduler.jobstores.mongodb import MongoDBJobStore
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
# MongoDB connection parameters
host = '127.0.0.1'
port = 27017
client = MongoClient(host, port)
# Print the current time
def job():
    print(datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
# Job stores
jobstores = {
    'mongo': MongoDBJobStore(collection='job', database='test', client=client),
    'default': MemoryJobStore()
}
executors = {
    'default': ThreadPoolExecutor(10),
    'processpool': ProcessPoolExecutor(3)
}
job_defaults = {
    'coalesce': False,
    'max_instances': 3
}
scheduler = BlockingScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults)
scheduler.add_job(job, 'interval', seconds=5, jobstore='mongo')
scheduler.start()

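The time is printed for the first time 5 seconds after the program starts.
The state of the job can be seen in MongoDB.

Operating on jobs

Adding a job

There are two ways to add a job:

add_job()
scheduled_job()

The second is only suitable for jobs that will not change while the application is running, whereas the first returns an apscheduler.job.Job instance that can be used to modify or remove the job later.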

from apscheduler.schedulers.blocking import BlockingScheduler
sched = BlockingScheduler()
# Decorator
@sched.scheduled_job('interval', id='my_job_id', seconds=5)
def job_function():
    print("Hello World")
# Start the scheduler
sched.start()

@sched.scheduled_job() is used as a Python decorator.

Removing a job

There are also two ways to remove a job:

remove_job()
job.remove()

remove_job() removes a job by its job ID.
job.remove() is called on the instance returned by add_job().

job = scheduler.add_job(myfunc, 'interval', minutes=2)
job.remove()
# Remove by job ID
scheduler.add_job(myfunc, 'interval', minutes=2, id='my_job_id')
scheduler.remove_job('my_job_id')

Pausing and resuming a job

To pause a job:

apscheduler.job.Job.pause()
apscheduler.schedulers.base.BaseScheduler.pause_job()

To resume a job:

apscheduler.job.Job.resume()
apscheduler.schedulers.base.BaseScheduler.resume_job()

Recall that apscheduler.job.Job is the type of the instance returned by add_job().

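Getting the list of jobs

To get the list of scheduled jobs, use get_jobs(), which returns all the job instances.

You can also use print_jobs() to print a formatted list of all jobs.

Modifying a job

Every attribute of a job except its ID can be modified, using apscheduler.job.Job.modify() or modify_job(). To change a job's trigger, use reschedule_job() instead.

job.modify(max_instances=6, name='Alternate name')
scheduler.reschedule_job('my_job_id', trigger='cron', minute='*/5')

Shutting down the scheduler

By default, the scheduler shuts down its job stores and executors and waits until all currently executing jobs have finished. Set the wait option to False to shut down without waiting.

scheduler.shutdown()
scheduler.shutdown(wait=False)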

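Scheduler events

You can attach event listeners to the scheduler; they are called when particular events occur.

from apscheduler.events import EVENT_JOB_EXECUTED, EVENT_JOB_ERROR

def my_listener(event):
    if event.exception:
        print('The job crashed :(')
    else:
        print('The job worked :)')
# Add the listener for job-executed and job-error events
scheduler.add_listener(my_listener, EVENT_JOB_EXECUTED | EVENT_JOB_ERROR)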

Trigger rules

date

The most basic form of scheduling: the job runs only once. Its parameters are:

run_date (datetime|str) – the date/time to run the job at
timezone (datetime.tzinfo|str) – time zone for run_date if it doesn’t have one already

from datetime import date, datetime
from apscheduler.schedulers.blocking import BlockingScheduler
sched = BlockingScheduler()
def my_job(text):
    print(text)
# The job will be executed on November 6th, 2009
sched.add_job(my_job, 'date', run_date=date(2009, 11, 6), args=['text'])
sched.add_job(my_job, 'date', run_date=datetime(2009, 11, 6, 16, 30, 5), args=['text'])
sched.add_job(my_job, 'date', run_date='2009-11-06 16:30:05', args=['text'])
# The 'date' trigger and datetime.now() as run_date are implicit
sched.add_job(my_job, args=['text'])
sched.start()

cron

year (int|str) – 4-digit year
month (int|str) – month (1-12)
day (int|str) – day of the month (1-31)
week (int|str) – ISO week (1-53)
day_of_week (int|str) – number or name of weekday (0-6 or mon,tue,wed,thu,fri,sat,sun)
hour (int|str) – hour (0-23)
minute (int|str) – minute (0-59)
second (int|str) – second (0-59)
start_date (datetime|str) – earliest possible date/time to trigger on (inclusive)
end_date (datetime|str) – latest possible date/time to trigger on (inclusive)
timezone (datetime.tzinfo|str) – time zone to use for the date/time calculations (defaults to scheduler timezone)

Example expressions:

from apscheduler.schedulers.blocking import BlockingScheduler

def job_function():
    print("Hello World")
# BlockingScheduler
sched = BlockingScheduler()
# Schedules job_function to be run on the third Friday
# of June, July, August, November and December at 00:00, 01:00, 02:00 and 03:00
sched.add_job(job_function, 'cron', month='6-8,11-12', day='3rd fri', hour='0-3')
# Runs from Monday to Friday at 5:30 (am) until 2014-05-30 00:00:00
sched.add_job(job_function, 'cron', day_of_week='mon-fri', hour=5, minute=30, end_date='2014-05-30')
sched.start()

interval

Parameters:

weeks (int) – number of weeks to wait
days (int) – number of days to wait
hours (int) – number of hours to wait
minutes (int) – number of minutes to wait
seconds (int) – number of seconds to wait
start_date (datetime|str) – starting point for the interval calculation
end_date (datetime|str) – latest possible date/time to trigger on
timezone (datetime.tzinfo|str) – time zone to use for the date/time calculations

from datetime import datetime
from apscheduler.schedulers.blocking import BlockingScheduler

def job_function():
    print("Hello World")
# BlockingScheduler
sched = BlockingScheduler()
# Schedule job_function to be called every two hours
sched.add_job(job_function, 'interval', hours=2)
# The same as before, but starts on 2010-10-10 at 9:30 and stops on 2014-06-15 at 11:00
sched.add_job(job_function, 'interval', hours=2, start_date='2010-10-10 09:30:00', end_date='2014-06-15 11:00:00')
sched.start()