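Python caching

The purpose of caching

Caching keeps a certain amount of data around so that later requests for it can be served directly; the goal is to speed up data retrieval.

A simple cache class of our own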

import datetime
import pprint
import random

class MyCache(object):
    def __init__(self):
        self.cache = {}
        self.max_cache_size = 10

    def __contains__(self, key):
        """
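        Check whether a key exists in the cache.
        Implementing this magic method lets callers write `key in cache_instance`.
        :param key: the key to look up
        :return: True if the key is cached, otherwise False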
        """
        return key in self.cache

    def update(self, key, value):
        """
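        Add or update a cache entry, evicting the oldest entry first if the cache is full.
        :param key: the key to store
        :param value: the value to cache
        :return: None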
        """
        if key not in self.cache and len(self.cache) >= self.max_cache_size:
            self.remove_oldest()
        self.cache[key] = {"date_accessed": datetime.datetime.now(), "value": value}

    def remove_oldest(self):
        """
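        Remove the entry with the oldest access time.
        :return: None
        """
        oldest_entry = None
        for key in self.cache:
            # Compare against None so that falsy keys such as 0 or "" are handled correctly.
            if oldest_entry is None:
                oldest_entry = key
            elif self.cache[key]["date_accessed"] < self.cache[oldest_entry]["date_accessed"]:
                oldest_entry = key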
        self.cache.pop(oldest_entry)

    @property
    def size(self):
        """
        Number of entries currently in the cache.
        :return: the entry count
        """
        return len(self.cache)
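
  • __contains__: strictly speaking this method is optional here, but it lets us test an instance of the class with `in` to find out whether it holds the key we are looking for.
  • update adds a new key/value pair to the cache dictionary. If the cache has already reached its maximum size, it first removes the entry with the oldest timestamp.
  • remove_oldest does the actual work of deleting the oldest entry from the dictionary.
  • Finally, the size property returns the number of entries currently cached.

Running this code shows that once the cache is full, the entry with the oldest timestamp is removed. Note, however, that the example never refreshes the access time: date_accessed is only set in update, so reading an entry does not mark it as recently used.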
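Refreshing the access time on reads could look roughly like the following. This is a minimal sketch and not part of the original class: TimestampedCache, its get method, and the clock parameter are names assumed here for illustration. The clock is injectable so that the demo can use a simple counter and behave deterministically.

```python
import datetime
import itertools


class TimestampedCache:
    """Hypothetical variant of MyCache whose get() refreshes the access time."""

    def __init__(self, max_cache_size=10, clock=datetime.datetime.now):
        self.cache = {}
        self.max_cache_size = max_cache_size
        self.clock = clock  # assumed parameter: any callable returning comparable values

    def get(self, key, default=None):
        """Return the cached value and mark the entry as just accessed."""
        entry = self.cache.get(key)
        if entry is None:
            return default
        entry["date_accessed"] = self.clock()
        return entry["value"]

    def update(self, key, value):
        """Store a value, evicting the least recently accessed entry when full."""
        if key not in self.cache and len(self.cache) >= self.max_cache_size:
            oldest = min(self.cache, key=lambda k: self.cache[k]["date_accessed"])
            del self.cache[oldest]
        self.cache[key] = {"date_accessed": self.clock(), "value": value}


# Demo with a counter as the clock so every access gets a distinct "time".
ticks = itertools.count()
cache = TimestampedCache(max_cache_size=2, clock=lambda: next(ticks))
cache.update("a", 1)
cache.update("b", 2)
cache.get("a")        # "a" becomes the most recently accessed entry
cache.update("c", 3)  # the cache is full, so "b" is evicted
print(sorted(cache.cache))  # ['a', 'c']
```

With this change the class behaves like an LRU cache: an entry that is read often survives even if it was inserted early.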

Testing it:

if __name__ == "__main__":
    keys = ["test", "red", "fox", "fence", "junk",
            "other", "alpha", "bravo", "cal", "devo",
            "ele"]

    s = "abcdefghijklmnop"
    cache = MyCache()
    for i, key in enumerate(keys):
        if key in cache:
            continue
        else:
            value = "".join(random.choice(s) for j in range(20))
            cache.update(key, value)
        print(f"{i+1} iterations, {cache.size} cached entries")
        print()
    print(pprint.pformat(cache.cache))
    print("test" in cache)   # uses __contains__; False, because "test" was the oldest entry and was evicted
    print("cal" in cache)    # True

Using the lru_cache decorator

import time
import urllib.error
import urllib.request
from functools import lru_cache

@lru_cache(maxsize=24)
def get_webpage(module):
    """
    Fetch the documentation page for a given Python module.
    """
    webpage = "https://docs.python.org/3/library/{}.html".format(module)
    try:
        with urllib.request.urlopen(webpage) as request:
            return request.read()
    except urllib.error.HTTPError:
        return None


if __name__ == '__main__':
    t1 = time.time()
    modules = ['functools', 'collections', 'os', 'sys']
    for module in modules:
        page = get_webpage(module)
        if page:
            print("{} module page found".format(module))
    t2 = time.time()
    for m in modules:
        page = get_webpage(m)
        if page:
            print(f"{m} get again ...")
    t3 = time.time()

    print(t2-t1)
    print(t3-t2)
    print((t2-t1) / (t3-t2))

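We decorate get_webpage with lru_cache and set its maximum size to 24 cached calls. Inside the function we build the page URL from the module name that is passed in and fetch that page. The script then calls the function in two loops over the same modules. The first loop is comparatively slow because every page has to be downloaded; the second loop, running in the same session, is far faster, which shows that lru_cache has cached the results of those calls.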
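Besides timing the calls, the cache can be inspected directly: functions wrapped by lru_cache expose cache_info() and cache_clear(). The sketch below uses slow_square, a hypothetical stand-in for get_webpage, so that it runs without network access.

```python
from functools import lru_cache

calls = []  # records every real (uncached) execution


@lru_cache(maxsize=24)
def slow_square(n):
    """Hypothetical stand-in for an expensive call such as get_webpage()."""
    calls.append(n)
    return n * n


# Two distinct arguments, each requested twice.
for n in (2, 3, 2, 3):
    slow_square(n)

info = slow_square.cache_info()
print(info)   # CacheInfo(hits=2, misses=2, maxsize=24, currsize=2)
print(calls)  # [2, 3]
```

The function body ran only twice; the other two calls were answered from the cache. Calling slow_square.cache_clear() empties the cache and resets these counters.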

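We can also pass a typed argument to the decorator. It is a Boolean: when typed is set to True, arguments of different types are cached separately, so that for example f(3) and f(3.0) are treated as distinct calls.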
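The effect of typed is easiest to see side by side. The sketch below assumes two illustrative functions, add_untyped and add_typed, that differ only in the typed flag; the behaviour shown for the untyped version is what CPython does for a call with two positional arguments.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def add_untyped(a, b):
    return a + b


@lru_cache(maxsize=None, typed=True)
def add_typed(a, b):
    return a + b


add_untyped(1, 2)
shared = add_untyped(1.0, 2.0)  # (1.0, 2.0) equals (1, 2), so this is a cache hit

add_typed(1, 2)
separate = add_typed(1.0, 2.0)  # argument types differ, so this is computed again

print(add_untyped.cache_info().hits, type(shared).__name__)     # 1 int
print(add_typed.cache_info().misses, type(separate).__name__)   # 2 float
```

Without typed=True the float call gets back the integer result that was cached for the int call, which matters whenever the result depends on the argument type.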

Using the cachetools module

Code source: www.thepythoncorner.com/2018/04/how…

The original article explains how to use caching to speed up a Python program and gives the following two examples. Without a cache:

import time
import datetime


def get_candy_price(candy_id):
    # let's use a sleep to simulate the time your function spends trying to connect to
    # the web service, 5 seconds will be enough.
    time.sleep(5)

    # let's pretend that the price returned by the web service is $1 for candies with an
    # odd candy_id and $1.5 for candies with an even candy_id

    price = 1.5 if candy_id % 2 == 0 else 1

    return (datetime.datetime.now().strftime("%c"), price)


# now, let's simulate 20 customers in your shop.
# They are asking for candy with id 2 and candy with id 3...
for i in range(0, 20):
    print(get_candy_price(2))
    print(get_candy_price(3))

After adding the cache:

import time
import datetime

from cachetools import cached, TTLCache  # 1 - let's import the "cached" decorator and the "TTLCache" object from cachetools
cache = TTLCache(maxsize=100, ttl=300)  # 2 - let's create the cache object.


@cached(cache)  # 3 - it's time to decorate the method to use our cache system!
def get_candy_price(candy_id):
    # let's use a sleep to simulate the time your function spends trying to connect to
    # the web service, 5 seconds will be enough.
    time.sleep(5)

    # let's pretend that the price returned by the web service is $1 for candies with an
    # odd candy_id and $1.5 for candies with an even candy_id

    price = 1.5 if candy_id % 2 == 0 else 1

    return (datetime.datetime.now().strftime("%c"), price)


# now, let's simulate 20 customers in your shop.
# They are asking for candy with id 2 and candy with id 3...
for i in range(0, 20):
    print(get_candy_price(2))
    print(get_candy_price(3))

The output is not shown here; copy the code and run it yourself.
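To make the idea behind a TTL cache concrete, here is a rough standard-library sketch of a time-limited memoizing decorator. It is not how cachetools is implemented and it has no size limit; ttl_cached and its timer parameter are names assumed for this illustration. The timer is injectable so that expiry can be shown without actually waiting.

```python
import functools
import time


def ttl_cached(ttl, timer=time.monotonic):
    """Memoize a function; a cached result is reused until it is ttl seconds old."""
    def decorator(func):
        store = {}  # maps the positional arguments to (expires_at, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = timer()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]          # still fresh: skip the real call
            value = func(*args)        # missing or expired: call through
            store[args] = (now + ttl, value)
            return value

        return wrapper
    return decorator


# Demo with a fake clock instead of real waiting.
fake_now = [0.0]
real_calls = []


@ttl_cached(ttl=300, timer=lambda: fake_now[0])
def get_candy_price(candy_id):
    real_calls.append(candy_id)
    return 1.5 if candy_id % 2 == 0 else 1


get_candy_price(2)
get_candy_price(2)      # served from the cache
fake_now[0] = 301.0     # pretend more than 300 seconds have passed
get_candy_price(2)      # entry expired, so the function runs again
print(real_calls)       # [2, 2]
```

In the cachetools example above, TTLCache(maxsize=100, ttl=300) plays the role of the store and additionally bounds the number of entries.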

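Multi-level caching

The caching approaches above are all much alike, but they do not solve my problem. I want to store and look up cached values by several conditions, using the cache like a simple queryable database rather than a flat set of key/value pairs. I found the cacheout module and tried to build what I need with it.

Using cacheout

Links

github.com/dgilland/ca… cacheout.readthedocs.io/en/latest/m…

Introduction

cacheout is a caching library for Python.

Features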

  • In-memory caching using dictionary backend
  • Cache manager for easily accessing multiple cache objects
  • Reconfigurable cache settings for runtime setup when using module-level cache objects
  • Maximum cache size enforcement
  • Default cache TTL (time-to-live) as well as custom TTLs per cache entry
  • Bulk set, get, and delete operations
  • Bulk get and delete operations filtered by string, regex, or function
  • Memoization decorators
  • Thread safe
  • Multiple cache implementations:
    • FIFO (First In, First Out)
    • LIFO (Last In, First Out)
    • LRU (Least Recently Used)
    • MRU (Most Recently Used)
    • LFU (Least Frequently Used)
    • RR (Random Replacement)
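The cache implementations in the list differ mainly in which entry they evict when the cache is full. The sketch below contrasts two of them, FIFO and LRU, on top of collections.OrderedDict. It only illustrates the policies and is not cacheout's code; BoundedCache and its policy argument are names assumed here.

```python
from collections import OrderedDict


class BoundedCache:
    """Tiny bounded cache; policy is "fifo" or "lru" (names chosen for this sketch)."""

    def __init__(self, maxsize, policy="lru"):
        self.maxsize = maxsize
        self.policy = policy
        self.data = OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        if self.policy == "lru":
            self.data.move_to_end(key)  # a read makes the entry "recent" again
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data[key] = value
            if self.policy == "lru":
                self.data.move_to_end(key)
            return
        if len(self.data) >= self.maxsize:
            self.data.popitem(last=False)  # evict the entry at the front
        self.data[key] = value


fifo = BoundedCache(2, policy="fifo")
lru = BoundedCache(2, policy="lru")
for c in (fifo, lru):
    c.set("a", 1)
    c.set("b", 2)
    c.get("a")     # only the LRU cache takes note of this read
    c.set("c", 3)  # the cache is full, so one entry must go

print(list(fifo.data))  # ['b', 'c']  (oldest insertion evicted)
print(list(lru.data))   # ['a', 'c']  (least recently used evicted)
```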

Roadmap

  • Layered caching (multi-level caching)
  • Cache event listener support (e.g. on-get, on-set, on-delete)
  • Cache statistics (e.g. cache hits/misses, cache frequency, etc)

Installation

pip install cacheout

Requirements

Python >= 3.4

Basic usage

Create a cache object:

# start with some basic caching by creating a cache object:
from cacheout import Cache
cache = Cache()

By default the cache holds at most 256 entries and has no expiration time, so cache = Cache() is equivalent to:

# By default the cache object has a maximum size of 256 and default TTL expiration turned off.
# These values can be set with:
import time
cache = Cache(maxsize=256, ttl=0, timer=time.time, default=None)  # defaults

Set a value:

# Set a cache key using cache.set():
cache.set(1, 'foobar')

Get a value:

# Get the value of a cache key with cache.get():
assert cache.get(1) == 'foobar'

Get a fallback value when the key is not in the cache:

# Get a default value when cache key isn't set:
assert cache.get(2) is None
assert cache.get(2, default=False) is False
assert 2 not in cache

The default is only returned, though; it is not stored in the cache.

Set a cache-wide default:

# Provide a global default:
cache2 = Cache(default=True)
assert cache2.get('missing') is True
assert 'missing' not in cache2

cache3 = Cache(default=lambda key: key)
assert cache3.get('missing') == 'missing'
# with a callable default, the computed value is stored under the missing key
assert 'missing' in cache3

Set an expiration time for a cache entry:

# Set the TTL (time-to-live) expiration per entry:
cache.set(3, {'data': {}}, ttl=1)
assert cache.get(3) == {'data': {}}
time.sleep(1)
assert cache.get(3) is None

Cache the results of a function:

# Memoize a function where cache keys are generated from the called function parameters:
@cache.memoize()
def func(a, b):
    return a + b 

# Provide a TTL for the memoized function and incorporate argument types into generated cache keys:
@cache.memoize(ttl=5, typed=True)
def func(a, b):
    print("--- into --- func ---")
    return a + b

# func(1, 2) has different cache key than func(1.0, 2.0), whereas,
# with "typed=False" (the default), they would have the same key

print(func(1, 2))
print(func(1, 2))
print(func.uncached(1, 2))  # call the original, un-memoized function directly
print(func(1, 2))
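For comparison, the standard library offers a similar escape hatch: a function wrapped by functools.lru_cache keeps the undecorated function in its __wrapped__ attribute. The sketch below uses only the standard library; add and real_runs are illustrative names.

```python
from functools import lru_cache

real_runs = []  # one entry per real execution of the function body


@lru_cache(maxsize=None)
def add(a, b):
    real_runs.append((a, b))
    return a + b


add(1, 2)              # computed and cached
add(1, 2)              # served from the cache
add.__wrapped__(1, 2)  # bypasses the cache and runs the original function
print(len(real_runs))  # 2
```

Bypassing the cache this way does not touch the cached entry or the hit and miss counters.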

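Get a copy of the cache

# Get a copy of the entire cache with cache.copy().
# The comparison assumes the cache holds exactly keys 1 and 2 at this point:
cache.set(2, ('foo', 'bar', 'baz'))
assert cache.copy() == {1: 'foobar', 2: ('foo', 'bar', 'baz')}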

Delete a key from the cache

# Delete a cache key with cache.delete():
cache.delete(1)
assert cache.get(1) is None

Clear the entire cache

# Clear the entire cache with cache.clear():
cache.clear()
assert len(cache) == 0

Bulk set, get, and delete

# Perform bulk operations with cache.set_many(), cache.get_many(), and cache.delete_many():
cache.set_many({'a': 1, 'b': 2, 'c': 3})
assert cache.get_many(['a', 'b', 'c']) == {'a': 1, 'b': 2, 'c': 3}
cache.delete_many(['a', 'b', 'c'])
assert len(cache) == 0

Filtering keys in bulk get and delete

# Use complex filtering in cache.get_many() and cache.delete_many():

import re
cache.set_many({'a_1': 1, 'a_2': 2, '123': 3, 'b': 4})

assert cache.get_many('a_*') == {'a_1': 1, 'a_2': 2}
assert cache.get_many(re.compile(r'\d')) == {'123': 3}
assert cache.get_many(lambda key: '2' in key) == {'a_2': 2, '123': 3}

cache.delete_many('a_*')
assert dict(cache.items()) == {'123': 3, 'b': 4}
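The three kinds of filter (glob string, compiled regex, function) can be imitated on a plain dict to see what each one selects. The sketch below is a standard-library illustration, not cacheout's implementation; filter_keys is a hypothetical helper. It applies the regex with match, which anchors at the start of the key and is consistent with the results shown above.

```python
import fnmatch
import re


def filter_keys(data, selector):
    """Return the items of data whose (string) keys match selector.

    selector may be a glob pattern string, a compiled regex, or a
    predicate function that takes a key and returns True or False.
    """
    if isinstance(selector, str):
        def matches(key):
            return fnmatch.fnmatchcase(key, selector)
    elif isinstance(selector, re.Pattern):
        def matches(key):
            return selector.match(key) is not None
    elif callable(selector):
        matches = selector
    else:
        raise TypeError("selector must be a str, a compiled regex, or a callable")
    return {key: value for key, value in data.items() if matches(key)}


data = {'a_1': 1, 'a_2': 2, '123': 3, 'b': 4}
print(filter_keys(data, 'a_*'))                  # {'a_1': 1, 'a_2': 2}
print(filter_keys(data, re.compile(r'\d')))      # {'123': 3}
print(filter_keys(data, lambda key: '2' in key)) # {'a_2': 2, '123': 3}
```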

Reconfigure a cache object after creation

# Reconfigure the cache object after creation with cache.configure():
cache.configure(maxsize=1000, ttl=5 * 60)

Get keys, values, and items as with a dict

# Get keys, values, and items from the cache with cache.keys(), cache.values(), and cache.items():

cache.clear()  # start from an empty cache so the assertions below hold
cache.set_many({'a': 1, 'b': 2, 'c': 3})
assert list(cache.keys()) == ['a', 'b', 'c']
assert list(cache.values()) == [1, 2, 3]
assert list(cache.items()) == [('a', 1), ('b', 2), ('c', 3)]

Iterate over the cache

# Iterate over cache keys:

for key in cache:
    print(key, cache.get(key))
    # 'a' 1
    # 'b' 2
    # 'c' 3

Check whether a key exists

# Check if key exists with cache.has() and key in cache:
assert cache.has('a')
assert 'a' in cache

Managing multi-level caches with CacheManager

from cacheout import CacheManager

cacheman = CacheManager({'a': {'maxsize': 100},
                         'b': {'maxsize': 200, 'ttl': 900},
                         'c': {}})

cacheman['a'].set('key1', 'value1')
value = cacheman['a'].get('key1')

cacheman['b'].set('key2', 'value2')
assert cacheman['b'].maxsize == 200
assert cacheman['b'].ttl == 900

cacheman['c'].set('key3', 'value3')

cacheman.clear_all()
for name, cache in cacheman:
    assert name in cacheman
    assert len(cache) == 0

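The multi-level setup shown last should solve my problem. As in the figure, if my API takes two independent variables, the stock type and the time, the stock type can select the first-level cache and the time can be the key in the second level:

The code would look roughly like this: [image]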
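Since the original image is not available, here is a rough sketch of the two-level idea using only the standard library. It is an assumption about what such code could look like, not the author's implementation: TwoLevelCache, its method names, and the sample stock names and dates are all hypothetical. The first level maps a stock type to its own bounded cache, and the second level maps a time to the cached value.

```python
from collections import OrderedDict


class TwoLevelCache:
    """First level: stock type. Second level: bounded mapping of time -> value."""

    def __init__(self, maxsize_per_type=256):
        self.maxsize_per_type = maxsize_per_type
        self.levels = {}  # stock type -> OrderedDict of time -> value

    def set(self, stock_type, time_key, value):
        inner = self.levels.setdefault(stock_type, OrderedDict())
        if time_key not in inner and len(inner) >= self.maxsize_per_type:
            inner.popitem(last=False)  # drop the oldest entry for this stock type
        inner[time_key] = value

    def get(self, stock_type, time_key, default=None):
        return self.levels.get(stock_type, {}).get(time_key, default)

    def clear_type(self, stock_type):
        """Invalidate everything cached for one stock type."""
        self.levels.pop(stock_type, None)


prices = TwoLevelCache(maxsize_per_type=2)
prices.set("stock_a", "2019-01-01", 10.5)
prices.set("stock_a", "2019-01-02", 10.8)
prices.set("stock_b", "2019-01-01", 99.0)

print(prices.get("stock_a", "2019-01-02"))  # 10.8
print(prices.get("stock_b", "2019-01-02"))  # None
```

With cacheout, the same shape could be approximated by giving each stock type its own named cache in a CacheManager and using the time as the key within that cache.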

My earlier ideas were to (1) keep the cache in a class variable, or (2) use Redis as the cache.
