Caching

Date: 2019-12-05

1. Caching techniques

1.1 Guava Cache

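Guava Cache is a fully in-memory local cache implementation, and it is thread-safe.

There are two ways to create a Guava cache:
- CacheLoader
- Callable callback

Compared with the usual approach of caching in a plain Map, caches built either way share one piece of logic: look up the value for key X in the cache; if it is already cached, return the cached value; if not, obtain it through some method. The difference is that a CacheLoader is defined broadly, for the whole cache, and can be seen as one uniform way to load a value from a key, while the Callable approach is more flexible and lets you specify the loading logic at each get call.

CacheLoader example: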

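import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import org.junit.Test; // assuming JUnit 4; the original does not say which test framework is used

    @Test
    public void testLoadingCache() throws Exception {
        // The loader is defined once for the whole cache and runs on every miss.
        LoadingCache<String, String> cache = CacheBuilder
        .newBuilder()
        .build(new CacheLoader<String, String>() {
            @Override
            public String load(String key) throws Exception {
                return "hello " + key + "!";
            }
        });

        // The original called apply(key), which is deprecated on LoadingCache;
        // getUnchecked(key) is the recommended replacement when load() throws no checked exception.
        System.out.println("jerry value:" + cache.getUnchecked("jerry"));
        System.out.println("jerry value:" + cache.get("jerry"));
        System.out.println("peida value:" + cache.get("peida"));
        System.out.println("peida value:" + cache.getUnchecked("peida"));
        System.out.println("lisa value:" + cache.getUnchecked("lisa"));
        // put() stores a value directly, bypassing the loader.
        cache.put("harry", "ssdded");
        System.out.println("harry value:" + cache.get("harry"));
    }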

Callable callback example:

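import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.Callable;
import org.junit.Test; // assuming JUnit 4; the original does not say which test framework is used

    @Test
    public void testCallableCache() throws Exception {
        Cache<String, String> cache = CacheBuilder.newBuilder().maximumSize(1000).build();
        // The Callable is supplied per call and runs only if "jerry" is not cached yet.
        String resultVal = cache.get("jerry", new Callable<String>() {
            public String call() {
                return "hello " + "jerry" + "!";
            }
        });
        System.out.println("jerry value : " + resultVal);

        // A different key can use different loading logic.
        resultVal = cache.get("peida", new Callable<String>() {
            public String call() {
                return "hello " + "peida" + "!";
            }
        });
        System.out.println("peida value : " + resultVal);
    }

Output:
jerry value : hello jerry!
peida value : hello peida!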

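Removing data from a Guava cache:

Guava removes cached data in two ways: passive removal (eviction) and active removal (invalidation).
Guava provides three kinds of passive removal:
1. Size-based removal: as the name suggests, entries are removed according to the size of the cache. When the cache is about to reach the configured size, entries that have not been used recently are evicted.
It is usually configured with CacheBuilder.maximumSize(long). There is also a weight-based variant (CacheBuilder.maximumWeight(long) together with a Weigher), which I personally find is rarely needed in practice. For the common maximumSize case, a few points are worth noting:
    First, the size is the number of entries in the cache, not memory usage or anything else.
    Second, eviction does not necessarily wait until the size is reached exactly; the cache may begin evicting less-used entries as it approaches the limit.
    Third, if an entry has been evicted and you request it again, a cache built with a CacheLoader will load the value through the CacheLoader once more; if the loader still cannot produce a value, an exception is thrown.
2. Time-based removal: Guava provides two methods
expireAfterAccess(long, TimeUnit): removes an entry once the given duration has passed since it was last accessed
expireAfterWrite(long, TimeUnit): removes an entry once the given duration has passed since it was created or its value was replaced
3. Reference-based removal:
This relies on Java garbage collection: entries are removed according to how their keys or values are referenced (CacheBuilder.weakKeys(), weakValues() and softValues()).
There are three ways to remove data actively:
1. A single entry: Cache.invalidate(key)
2. A batch of entries: Cache.invalidateAll(keys)
3. All entries: Cache.invalidateAll()
If something needs to happen when an entry is removed, you can register a RemovalListener. Note that by default the listener runs synchronously with the removal; to make it asynchronous, consider RemovalListeners.asynchronous(RemovalListener, Executor).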
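
The removal options above can be combined on one builder. The following is a minimal sketch, assuming Guava is on the classpath; the class name EvictionDemo, the limit of 2 entries and the 10-minute TTL are illustrative choices, not values from the original article. Which entry is evicted first is up to Guava, so the sketch only prints what the listener reports.

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.RemovalListener;
import com.google.common.cache.RemovalNotification;
import java.util.Arrays;
import java.util.concurrent.TimeUnit;

class EvictionDemo {
    // Builds a cache with size-based and time-based eviction plus a removal listener.
    // The listener is passed as a typed variable so the builder infers <String, String>.
    static Cache<String, String> newCache(RemovalListener<String, String> listener) {
        return CacheBuilder.newBuilder()
                .maximumSize(2)                          // limit counts entries, not bytes
                .expireAfterWrite(10, TimeUnit.MINUTES)  // illustrative TTL
                .removalListener(listener)               // runs synchronously by default
                .build();
    }

    public static void main(String[] args) {
        RemovalListener<String, String> listener = (RemovalNotification<String, String> n) ->
                System.out.println(n.getKey() + " removed, cause=" + n.getCause());
        Cache<String, String> cache = newCache(listener);

        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3");                          // over the limit: passive (size-based) removal

        cache.invalidate("c");                        // active removal of one key
        cache.invalidateAll(Arrays.asList("a", "b")); // active removal of a batch
        cache.invalidateAll();                        // active removal of everything
        System.out.println("entries left: " + cache.size());
    }
}
```

Wrapping the listener with RemovalListeners.asynchronous(listener, executor) before passing it to newCache would move the callback off the calling thread.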

2. Multi-level caches

2.1 Cache policies

Multi-level caches introduce new design decisions. For instance, in some processors, all data in the L1 cache must also be somewhere in the L2 cache. These caches are called strictly inclusive. Other processors (like the AMD Athlon) have exclusive caches: data is guaranteed to be in at most one of the L1 and L2 caches, never in both. Still other processors (like the Intel Pentium II, III, and 4), do not require that data in the L1 cache also reside in the L2 cache, although it may often do so. There is no universally accepted name for this intermediate policy.

The advantage of exclusive caches is that they store more data. This advantage is larger when the exclusive L1 cache is comparable to the L2 cache, and diminishes if the L2 cache is many times larger than the L1 cache. When the L1 misses and the L2 hits on an access, the hitting cache line in the L2 is exchanged with a line in the L1. This exchange is quite a bit more work than just copying a line from L2 to L1, which is what an inclusive cache does.[33]

One advantage of strictly inclusive caches is that when external devices or other processors in a multiprocessor system wish to remove a cache line from the processor, they need only have the processor check the L2 cache. In cache hierarchies which do not enforce inclusion, the L1 cache must be checked as well. As a drawback, there is a correlation between the associativities of L1 and L2 caches: if the L2 cache does not have at least as many ways as all L1 caches together, the effective associativity of the L1 caches is restricted. Another disadvantage of inclusive cache is that whenever there is an eviction in L2 cache, the (possibly) corresponding lines in L1 also have to get evicted in order to maintain inclusiveness. This is quite a bit of work, and would result in a higher L1 miss rate.[33]

Another advantage of inclusive caches is that the larger cache can use larger cache lines, which reduces the size of the secondary cache tags. (Exclusive caches require both caches to have the same size cache lines, so that cache lines can be swapped on a L1 miss, L2 hit.) If the secondary cache is an order of magnitude larger than the primary, and the cache data is an order of magnitude larger than the cache tags, this tag area saved can be comparable to the incremental area needed to store the L1 cache data in the L2.[34]

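Rough summary:
A multi-level cache can follow one of three designs:

exclusive: a line in the L1 cache is never also in L2
strictly inclusive: everything in the L1 cache is guaranteed to be in L2 as well
the third kind (no official name): lines in L1 are not required to be in L2, though they often are

Pros and cons
The exclusive design can store more data. Of course, if L2 is much larger than L1, this advantage shrinks. On an L1 miss with an L2 hit, the exclusive design has to swap the hit line in L2 with a line in L1, which is more work than the inclusive design, where the hit line is simply copied from L2 to L1.

One advantage of the strictly inclusive design is that when an external device or another processor wants to remove a cache line from this processor, the processor only needs to check the L2 cache. With the first and third designs, L1 has to be checked as well. One drawback of strict inclusion is that when a line is evicted from L2, any corresponding line in L1 must also be evicted, which can raise the L1 miss rate.

Another advantage of the strictly inclusive design is that the larger cache can use larger cache lines, which can shrink the L2 tags. The exclusive design needs L1 and L2 to use the same line size so that lines can be swapped. If the L2 cache is much larger than L1, and the cache data is much larger than the tags, the tag area saved can be comparable to the extra area needed to hold the L1 data in L2.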
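
To make the capacity difference concrete, here is a toy software model of the exclusive and strictly inclusive policies. It is only a sketch under simplifying assumptions: both levels are fully associative with LRU replacement, lines are identified by integers, L2 is at least as large as L1, and the class and method names (TwoLevelCacheSketch, access, residentLines) are made up for illustration. Real hardware is considerably more involved.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

class TwoLevelCacheSketch {
    enum Policy { INCLUSIVE, EXCLUSIVE }

    private final Policy policy;
    private final int l1Capacity;
    private final int l2Capacity;
    // Iteration order is insertion order; re-inserting on access keeps the oldest line first (LRU).
    private final LinkedHashSet<Integer> l1 = new LinkedHashSet<>();
    private final LinkedHashSet<Integer> l2 = new LinkedHashSet<>();

    TwoLevelCacheSketch(Policy policy, int l1Capacity, int l2Capacity) {
        this.policy = policy;
        this.l1Capacity = l1Capacity;
        this.l2Capacity = l2Capacity;
    }

    void access(int line) {
        if (l1.remove(line)) {              // L1 hit: refresh its LRU position
            l1.add(line);
            return;
        }
        if (policy == Policy.INCLUSIVE) {
            if (l2.remove(line)) {
                l2.add(line);               // L2 hit: the line stays in L2 and is copied to L1
            } else {
                if (l2.size() >= l2Capacity) {
                    int victim = evictOldest(l2);
                    l1.remove(victim);      // back-invalidate L1 to keep inclusion
                }
                l2.add(line);
            }
            if (l1.size() >= l1Capacity) {
                evictOldest(l1);            // dropped from L1 only; L2 still holds it
            }
            l1.add(line);
        } else {
            l2.remove(line);                // on an L2 hit the line moves up instead of being copied
            if (l1.size() >= l1Capacity) {
                int victim = evictOldest(l1);
                if (l2.size() >= l2Capacity) {
                    evictOldest(l2);
                }
                l2.add(victim);             // the L1 victim moves down into L2 (the "swap")
            }
            l1.add(line);
        }
    }

    /** Number of distinct lines held anywhere in the hierarchy. */
    int residentLines() {
        Set<Integer> all = new HashSet<>(l1);
        all.addAll(l2);
        return all.size();
    }

    private static int evictOldest(LinkedHashSet<Integer> level) {
        Iterator<Integer> it = level.iterator();
        int oldest = it.next();
        it.remove();
        return oldest;
    }

    public static void main(String[] args) {
        for (Policy p : Policy.values()) {
            TwoLevelCacheSketch c = new TwoLevelCacheSketch(p, 4, 4);
            for (int line = 0; line < 8; line++) {
                c.access(line);
            }
            System.out.println(p + ": " + c.residentLines() + " distinct lines resident");
        }
    }
}
```

With L1 and L2 of equal size, the exclusive hierarchy ends up holding twice as many distinct lines as the inclusive one, which is the "stores more data" advantage described above. Making L2 much larger than L1 shrinks the relative gap.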

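3. Problems encountered

3.1 Cache penetration

When we use a cache in a project, we usually check first whether the data is in the cache. If it is, we return the cached content; if not, we query the database, cache the result, and return it. If the data we ask for is never in the cache (for example because it does not exist in the database either), every request ends up querying the DB, the cache becomes pointless, and under heavy traffic the DB may go down.

Is there a good way to deal with this?

If someone hammers our application with keys that do not exist, this becomes a vulnerability. One fairly neat approach is to store a preset placeholder value for the nonexistent key, for example "key" -> "&&".

When the cache returns &&, the application knows that the key does not exist and can decide whether to keep waiting and retry, or to give up on the operation. If it keeps waiting, it requests the key again after a polling interval; if the value is no longer &&, the key now has a real value. This avoids passing the request through to the database, and a large number of similar requests are stopped at the cache.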
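
The placeholder idea can be sketched as follows. This is a minimal in-process illustration, not a production design: the class name NullSentinelCache, the Map that stands in for the database, and the dbQueries counter are all assumptions made for the example. A real deployment would normally give the placeholder a short expiration time so that a key created later becomes visible.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class NullSentinelCache {
    // Placeholder cached for keys the database does not have ("&&" in the text above).
    static final String MISSING = "&&";

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, String> database;          // hypothetical stand-in for the real DB
    final AtomicInteger dbQueries = new AtomicInteger(); // counts how often the DB is hit

    NullSentinelCache(Map<String, String> database) {
        this.database = database;
    }

    Optional<String> get(String key) {
        String cached = cache.get(key);
        if (cached == null) {
            // Cache miss: ask the DB once and remember the answer, even if it is "not found".
            dbQueries.incrementAndGet();
            String fromDb = database.get(key);
            cached = (fromDb == null) ? MISSING : fromDb;
            cache.put(key, cached);
        }
        return MISSING.equals(cached) ? Optional.empty() : Optional.of(cached);
    }

    public static void main(String[] args) {
        NullSentinelCache c = new NullSentinelCache(Map.of("user:1", "alice"));
        for (int i = 0; i < 1000; i++) {
            c.get("user:does-not-exist");   // repeated lookups of a nonexistent key
        }
        System.out.println("DB queries for the missing key: " + c.dbQueries.get()); // prints 1
    }
}
```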

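3.2 Cache concurrency

When a site has high concurrency and a cache entry expires, several processes may query the DB and set the cache at the same time. If concurrency is really high, this can put too much pressure on the DB and also causes the cache to be updated over and over.
My current idea is to lock the cache lookup: if the key is missing, take a lock, query the DB, write the result to the cache, and release the lock. Other processes that find the lock held wait, and once it is released they either return the cached data or go on to query the DB themselves.

This is somewhat similar to the preset-value approach described above, except that with locking some requests have to wait.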
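
Within a single JVM, the locking idea might look like the sketch below. It uses one lock object per key and re-checks the cache after acquiring the lock, so only the first caller queries the database. The class name LockedLoadCache and the dbLoader function are hypothetical; across several processes a distributed lock would be needed instead, and this sketch never removes lock objects and assumes the loader never returns null.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class LockedLoadCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();
    private final Function<String, String> dbLoader; // hypothetical stand-in for the DB query

    LockedLoadCache(Function<String, String> dbLoader) {
        this.dbLoader = dbLoader;
    }

    String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value;                          // fast path: no locking on a hit
        }
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {                      // other callers for this key wait here
            value = cache.get(key);                // re-check: another caller may have loaded it
            if (value == null) {
                value = dbLoader.apply(key);
                cache.put(key, value);
            }
        }
        return value;
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger dbQueries = new AtomicInteger();
        LockedLoadCache c = new LockedLoadCache(k -> {
            dbQueries.incrementAndGet();
            return "value-of-" + k;
        });

        // Eight threads request the same missing key at the same moment.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < 8; i++) {
            pool.submit(() -> {
                start.await();
                return c.get("hot-key");
            });
        }
        start.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("DB queries: " + dbQueries.get());
    }
}
```

The waiting callers pay some latency, which is the trade-off mentioned above.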

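3.3 Cache expiration

The main cause of this problem is again high concurrency. When we set cache expiration times, we typically use values such as 1 minute or 5 minutes. Under high concurrency, many entries may be created at the same moment with the same expiration time. When that time is up, they all expire together, every request is forwarded to the DB, and the DB may be overloaded.

How can this be solved?

One simple approach is to spread the expiration times out. For example, add a random value, say 1 to 5 minutes, to the base expiration time. Expiration times then rarely coincide, and a mass expiration becomes unlikely.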
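
Adding the random offset is a one-liner. The helper below is a small sketch; the name withJitter and its parameters are illustrative, and the resulting Duration would be passed to whatever cache client is in use when the entry is written.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

class TtlJitter {
    /** Returns baseTtl plus a uniformly random extra between minJitter and maxJitter (inclusive). */
    static Duration withJitter(Duration baseTtl, Duration minJitter, Duration maxJitter) {
        long extraMillis = ThreadLocalRandom.current()
                .nextLong(minJitter.toMillis(), maxJitter.toMillis() + 1);
        return baseTtl.plusMillis(extraMillis);
    }

    public static void main(String[] args) {
        Duration base = Duration.ofMinutes(5);
        // Entries written at the same moment now get different expiration times.
        for (int i = 0; i < 3; i++) {
            Duration ttl = withJitter(base, Duration.ofMinutes(1), Duration.ofMinutes(5));
            System.out.println("ttl = " + ttl.getSeconds() + " s");
        }
    }
}
```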

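3.4 Cache avalanche

Scenario: between the moment a key's cached value expires and the moment the new value is cached, every request for that key queries the database, which raises DB load and causes unnecessary DB work.

Solutions:
- Lock and queue the rebuild, so requests are serialized instead of all querying the database.
- Suppose the key expires after time A. Create a companion key_sign whose expiration time is shorter than A. When reading key, check whether key_sign has expired; if so, take a lock and start a background thread that refreshes the value of key asynchronously while the actual cached value is still valid (if the actual value has also expired, fall back to locking and queueing the rebuild). The cost is that two cache entries are kept per key.
- Store an expiration value B inside the value itself, with B smaller than A. On read, use B to decide whether the value is stale; if it is, proceed as in the previous solution.
- Sacrifice some user experience: when the data is not in the cache, return a failure immediately and put the requested key into a distributed queue; background threads consume the queue asynchronously and refresh the cache entries that need updating.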
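
The third solution (an expiration value B stored inside the cached value) can be sketched in-process as shown below. This is an illustration under stated assumptions, not a reference implementation: SoftExpiryCache, Entry and dbLoader are hypothetical names, the real expiration A of an external cache is not modeled (the map never drops entries on its own), and a single background thread does the refreshing.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class SoftExpiryCache {
    /** Cached value plus its logical expiration time B, which is earlier than the real TTL A. */
    static final class Entry {
        final String value;
        final long softExpiryMillis;

        Entry(String value, long softExpiryMillis) {
            this.value = value;
            this.softExpiryMillis = softExpiryMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    // Acts as the per-key "lock": only the caller that inserts the flag schedules a refresh.
    private final ConcurrentHashMap<String, Boolean> refreshing = new ConcurrentHashMap<>();
    private final ExecutorService refresher = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "cache-refresher");
        t.setDaemon(true);                       // do not keep the JVM alive for refreshes
        return t;
    });
    private final Function<String, String> dbLoader; // hypothetical stand-in for the DB query
    private final long softTtlMillis;

    SoftExpiryCache(Function<String, String> dbLoader, long softTtlMillis) {
        this.dbLoader = dbLoader;
        this.softTtlMillis = softTtlMillis;
    }

    private Entry load(String key) {
        return new Entry(dbLoader.apply(key), System.currentTimeMillis() + softTtlMillis);
    }

    String get(String key) {
        Entry e = cache.get(key);
        if (e == null) {
            // Real miss: rebuild synchronously; computeIfAbsent serializes loads per key.
            return cache.computeIfAbsent(key, this::load).value;
        }
        boolean stale = System.currentTimeMillis() >= e.softExpiryMillis;
        if (stale && refreshing.putIfAbsent(key, Boolean.TRUE) == null) {
            refresher.execute(() -> {
                try {
                    cache.put(key, load(key));   // refresh in the background
                } finally {
                    refreshing.remove(key);
                }
            });
        }
        return e.value;                          // possibly stale, but no caller waits on the DB
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger version = new AtomicInteger();
        SoftExpiryCache c = new SoftExpiryCache(k -> k + "@v" + version.incrementAndGet(), 50);
        System.out.println(c.get("item"));       // first read loads synchronously
        Thread.sleep(100);
        System.out.println(c.get("item"));       // past B: old value returned, refresh scheduled
        Thread.sleep(100);
        System.out.println(c.get("item"));       // normally sees the refreshed value by now
    }
}
```

The key_sign variant works the same way, except that the staleness signal lives in a second cache entry instead of inside the value.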