NoSQL and Redis

Date: 2019-11-30


Reposted from: http://www.cnblogs.com/fxjwind/archive/2011/12/10/2283344.html

First, why do we need NoSQL?

This blog post explains it well: http://robbin.iteye.com/blog/524977. An excerpt follows.

Web 2.0 sites face three "high" problems:

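1. High performance: the need for highly concurrent database reads and writes
Web 2.0 sites generate dynamic pages and serve dynamic information in real time based on each user's personalized data, so static page generation is largely unusable. Database concurrency is therefore very high, often exceeding ten thousand read/write requests per second. A relational database can just about cope with ten thousand SQL queries per second, but ten thousand SQL writes per second is more than disk I/O can bear. Even an ordinary BBS site often needs highly concurrent writes, for example JavaEye's real-time tracking of online user status, click counts on popular posts, and vote counting, so this is a quite common requirement.
2. Huge storage: the need to store and access massive data efficiently
On SNS sites such as Facebook, Twitter, and FriendFeed, users generate a massive volume of activity every day. FriendFeed, for example, reached 250 million user updates in a single month. For a relational database, running SQL queries against a table with 250 million rows is extremely inefficient, to the point of being intolerable. Another example is the login system of a large website such as Tencent or Shanda, which routinely holds hundreds of millions of accounts; relational databases struggle with that too.
3. High scalability and high availability: the need for a database that scales and stays available
In a web-based architecture, the database is the hardest tier to scale horizontally. When an application's user count and traffic grow day by day, the database cannot gain performance and load capacity simply by adding more hardware and service nodes, the way web servers and app servers can. For the many sites that must provide uninterrupted 24-hour service, upgrading and scaling the database system is very painful and often requires downtime for maintenance and data migration. Why can't a database scale by continually adding server nodes?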

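Meanwhile, for Web 2.0 sites, many of the main features of relational databases often go unused, for example:

1. Transactional consistency
Many real-time web systems do not require strict database transactions. Their read-consistency requirements are low, and in some cases write consistency matters little either. Transaction management therefore becomes a heavy burden for a database under high load.
2. Real-time writes and reads
In a relational database, a query issued immediately after an insert is guaranteed to see the new row. Many web applications do not need that level of immediacy. For example, after I (robbin of JavaEye) post a message, it is entirely acceptable for my subscribers to see the update a few seconds, or even ten-odd seconds, later.
3. Complex SQL queries, especially multi-table joins
Any web system with large data volumes avoids joins across multiple large tables, as well as complex analytical and reporting SQL queries. SNS-type sites in particular rule these out at the requirements and product-design level. What remains is mostly single-table primary-key lookups and simple conditional, paginated queries on a single table, so much of SQL's power goes unused.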

Redis Cookbook

1 Introduction

1.1 Are your application and data a good fit for NoSQL?

When working on the web, chances are your data and data model keep changing with added functionality and business updates. Evolving the schema to support these changes in a relational database is a painful process, especially if you can’t really afford downtime—which people most often can’t these days, because applications are expected to run 24/7.

Examples of data that are a particularly good fit for nonrelational storage are transactional details, historical data, and server logs. These are normally highly dynamic and change quite often, their storage tends to grow quickly, and they don’t typically feel "relational".

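ACID is an acronym for the four basic properties a database transaction needs in order to execute correctly.

Atomicity

All operations in a transaction either complete in full or not at all; the transaction cannot stall partway. If an error occurs during execution, the transaction is rolled back to the state before it began, as if it had never run.

Consistency

The database's integrity constraints are intact both before the transaction starts and after it ends.

Isolation

Two transactions execute without interfering with each other; one transaction cannot see the intermediate data of another transaction while it is running.

Durability

Once a transaction completes, its changes to the database are stored durably and will not be rolled back.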

Performance can also be a key factor. NoSQL databases are generally faster, particularly for write operations, making them a good fit for applications that are write-heavy.

NoSQL databases generally don’t provide ACID or do it only partially.

Redis provides partial ACID compliance by design due to the fact that it is single threaded (which guarantees consistency and isolation), and full compliance if configured with appendfsync always, providing durability as well.

Once you’ve weighed all the options, picking between SQL (for stable, predictable, relational data) and NoSQL (for temporary, highly dynamic data) should be an easy task.

There are also big differences between NoSQL databases that you should account for.

MongoDB (a popular NoSQL database) is a feature-heavy document database that allows you to perform range queries, regular expression searches, indexing, and MapReduce.

Redis is extremely fast, making it perfectly suited for applications that are write-heavy, data that changes often, and data that naturally fits one of Redis’s data structures (for instance, analytics data).

1.2 Using Redis Data Types

Unlike most other NoSQL solutions and key-value storage engines, Redis includes several built-in data types, allowing developers to structure their data in meaningful semantic ways—with the added benefit of being able to perform data-type specific operations inside Redis, which is typically faster than processing the data externally.

Strings
The simplest data type in Redis is a string. Strings are also the typical (and frequently the sole) data type in other key-value storage engines. You can store strings of any kind, including binary data. You might, for example, want to cache image data for avatars in a social network. The only thing you need to keep in mind is that a specific value inside Redis shouldn’t go beyond 512MB of data.

Lists
Lists in Redis are ordered lists of binary safe strings, implemented on the idea of a linked list.

Lists are implemented as linked lists rather than arrays, so they are a poor fit for access by index and a good fit for implementing queues and stacks.

Hashes
Much like traditional hashtables, hashes in Redis store several fields and their values inside a specific key.

Like a dictionary in Python; a very useful structure.

Sets and Sorted Sets
Sets in Redis are an unordered collection of binary-safe strings. Elements in a given set can have no duplicates. Sets allow you to perform typical set operations such as intersections and unions.

For web applications, think of friend circles, for example...

1.3 Using Redis from Python with redis-py

easy_install redis

>>> import redis
>>> r = redis.Redis(host='localhost', port=6379, db=0)
>>> r.smembers('circle:jdoe:soccer')
set(['users:toby', 'users:adam', 'users:apollo', 'users:mike'])
>>> r.sadd('circle:jdoe:soccer', 'users:fred')
True

In order to squeeze a bit more performance out of your Redis and Python setup, you may want to install the Python bindings for Hiredis, a C-based Redis client library developed by the Redis authors.

easy_install hiredis

redis-py will then automatically detect the Python bindings and use Hiredis to connect to the server and process responses—hopefully much faster than before.

2 Leveraging Redis

2.1 Using Redis as a Key/Value Store

Storing application usage counters

Let’s begin by storing something quite basic: counters. Imagine we run a business social network and want to track profile/page visit data. We could just add a column to whatever table is storing our page data in our RDBMS, but hopefully our traffic is high enough that updates to this column have trouble keeping up. We need something much faster to update and to query. So we’ll use Redis for this instead.

SET visits:1:totals 21389
SET visits:2:totals 1367894
INCR visits:635:totals
GET visits:635:totals

Storing object data in hashes

This provides functionality similar to a Python dictionary.

Let’s also assume we want to store a number of fields about our users, such as a full name, email address, phone number, and number of visits to our application. We’ll use Redis’s hash management commands (like HSET, HGET, and HINCRBY) to store this information.

redis> hset users:jdoe name "John Doe"
(integer) 1
redis> hset users:jdoe email "jdoe@test.com"
(integer) 1
redis> hset users:jdoe phone "+1555313940"
(integer) 1
redis> hincrby users:jdoe visits 1
(integer) 1

redis> hget users:jdoe email
"jdoe@test.com"
redis> hgetall users:jdoe
1) "name"
2) "John Doe"
3) "email"
4) "jdoe@test.com"
5) "phone"
6) "+1555313940"
7) "visits"
8) "1"

Storing user "Circles" using sets

Let’s look at how we can use Redis’s support for sets to create functionality similar to the circles in the recently launched Google+. Sets are a natural fit for circles, because sets represent collections of data, and have native functionality to do interesting things like intersections and unions.

redis> sadd circle:jdoe:family users:anna
(integer) 1
redis> sadd circle:jdoe:family users:richard
(integer) 1
redis> sadd circle:jdoe:family users:mike
(integer) 1

redis> sadd circle:jdoe:soccer users:mike
(integer) 1
redis> sadd circle:jdoe:soccer users:adam
(integer) 1
redis> sadd circle:jdoe:soccer users:toby
(integer) 1
redis> sadd circle:jdoe:soccer users:apollo
(integer) 1

redis> sinter circle:jdoe:family circle:jdoe:soccer
1) "users:mike"
redis> sunion circle:jdoe:family circle:jdoe:soccer
1) "users:anna"
2) "users:mike"
3) "users:apollo"
4) "users:adam"
5) "users:richard"
6) "users:toby"
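The same set algebra can be sketched in plain Python, whose built-in set type behaves comparably to SADD, SINTER, and SUNION. This is only an in-memory illustration, not Redis client code, and the variable names are assumptions made for the example.

```python
# In-memory illustration of the circle example using Python sets.
# The names `family` and `soccer` are illustrative, not part of any Redis API.
family = {"users:anna", "users:richard", "users:mike"}
soccer = {"users:mike", "users:adam", "users:toby", "users:apollo"}

# Adding an existing member leaves the set unchanged, as with SADD.
family.add("users:mike")

in_both = family & soccer      # comparable to SINTER
in_either = family | soccer    # comparable to SUNION

print(sorted(in_both))
print(len(in_either))
```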

2.2 Inspecting Your Data

While developing (or perhaps debugging) with Redis, you may find you need to take a look at your data. Even though it’s not as simple (or powerful) as MySQL’s SHOW TABLES; and SELECT * FROM table WHERE conditions; commands, there are ways of viewing data with Redis.

KEYS pattern
Lists all the keys in the current database that match the given pattern.
TYPE key-name
Tells the type of the key. Possible types are: string, list, hash, set, zset, and none.
MONITOR
Outputs the commands received by the Redis server in real time.

Keep in mind that every time you use the KEYS command, Redis has to scan all the keys in the database. This can really slow down your server, so you probably shouldn’t use it as a normal operation. If you need a list of all your keys (or a subset), you might want to add those keys to a set and then query it. In short, use KEYS with caution: it has to traverse every key and can overload the server.

2.3 Implementing OAuth on Top of Redis

http://zh.wikipedia.org/zh/OAuth

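I am not very familiar with OAuth. This scenario really serves to introduce key expiration: you can set a time to live on a key, and the key is deleted automatically once it expires.

Authentication and authorization are time-limited, so this Redis feature is a good fit.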

EXPIRE key seconds
Sets an expiration timeout on a key, after which it will be deleted. This can be used on any type of key (strings, hashes, lists, sets or sorted sets) and is one of the most powerful Redis features.
EXPIREAT key timestamp
Performs the same operation as EXPIRE, except you can specify a UNIX timestamp (seconds since midnight, January 1, 1970) instead of the number of elapsed seconds.
TTL key
Tells you the remaining time to live of a key with an expiration timeout.
PERSIST key
Removes the expiration timeout on the given key.
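To make the semantics of these four commands concrete, here is a minimal in-memory model in Python. It is a sketch under assumptions, not how Redis implements expiry: the class name `ExpiringStore`, the injectable `clock` parameter, and the choice to return `None` from `ttl` for missing or non-expiring keys are all made up for illustration.

```python
import time


class ExpiringStore:
    """Toy in-memory model of EXPIRE / TTL / PERSIST (not a Redis client)."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable so the example can be tested
        self._data = {}          # key -> value
        self._deadline = {}      # key -> absolute expiry time in seconds

    def _purge(self, key):
        # Drop the key if its deadline has passed.
        deadline = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._data.pop(key, None)
            self._deadline.pop(key, None)

    def set(self, key, value):
        self._data[key] = value
        self._deadline.pop(key, None)   # a plain write clears any timeout

    def get(self, key):
        self._purge(key)
        return self._data.get(key)

    def expire(self, key, seconds):
        self._purge(key)
        if key not in self._data:
            return False
        self._deadline[key] = self._clock() + seconds
        return True

    def ttl(self, key):
        # Remaining seconds, or None when the key is missing or has no timeout.
        self._purge(key)
        if key not in self._deadline:
            return None
        return self._deadline[key] - self._clock()

    def persist(self, key):
        self._purge(key)
        return self._deadline.pop(key, None) is not None
```

In an OAuth-style flow, a token would be stored with `set` and given a lifetime with `expire`; once the lifetime passes, `get` simply returns nothing and the token is treated as invalid.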

2.4 Using Redis’s Pub/Sub Functionality to Create a Chat System

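The chat-room scenario introduces another very useful Redis feature. Publish/subscribe resembles traditional message routing: publishers publish messages, subscribers receive them, and the bridge between the two is the channel or pattern that is subscribed to. This design pattern keeps publishers and subscribers loosely coupled.

Redis is an in-memory server. Clients use simple commands to create channels, subscribe, and publish, while Redis keeps the channel state and handles message routing and delivery. This feature differs from Redis's core functionality and is something of a sideline.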

PUBLISH
Publishes to a specific channel
SUBSCRIBE
Subscribes to a specific channel
UNSUBSCRIBE
Unsubscribes from a specific channel

PSUBSCRIBE (bulk subscription)
Subscribes to channels that match a given pattern
PUNSUBSCRIBE
Unsubscribes from channels that match a given pattern
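The routing behavior behind these five commands can be sketched as a small in-process broker. This is an illustrative model only: `ToyBroker` and its callbacks are hypothetical, and Python's `fnmatchcase` stands in for Redis's glob-style pattern matching, which it resembles but may not match in every detail.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


class ToyBroker:
    """In-process model of channel and pattern subscriptions (not Redis)."""

    def __init__(self):
        self._channels = defaultdict(list)   # channel -> callbacks
        self._patterns = defaultdict(list)   # glob pattern -> callbacks

    def subscribe(self, channel, callback):
        self._channels[channel].append(callback)

    def unsubscribe(self, channel, callback):
        if callback in self._channels[channel]:
            self._channels[channel].remove(callback)

    def psubscribe(self, pattern, callback):
        self._patterns[pattern].append(callback)

    def punsubscribe(self, pattern, callback):
        if callback in self._patterns[pattern]:
            self._patterns[pattern].remove(callback)

    def publish(self, channel, message):
        """Deliver to matching subscribers; return how many received it."""
        receivers = list(self._channels.get(channel, []))
        for pattern, callbacks in self._patterns.items():
            if fnmatchcase(channel, pattern):
                receivers.extend(callbacks)
        for callback in receivers:
            callback(channel, message)
        return len(receivers)
```

A chat room maps naturally onto this: each room is a channel such as `room:lobby`, and a moderator tool could use a pattern such as `room:*` to see every room. The publisher never needs to know who is listening, which is the loose coupling described above.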

2.5 Implementing an Inverted-Index Text Search with Redis

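Implementing an in-memory inverted index with Redis is fairly easy, thanks to its built-in data structure support.

When building the index, create a set of doc_ids for each word: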

$redis.sadd("word:#{word}", doc_id)

When searching, intersect the doc_id sets of all the search words:

$redis.sinter(*terms.map{|t| "word:#{t}"})
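The two Ruby one-liners above amount to "one set of document ids per word, intersected at query time". A minimal Python sketch of the same idea, using a dict of sets in place of Redis keys, might look like this; the function names and the whitespace tokenizer are assumptions made for the example.

```python
from collections import defaultdict

# word -> set of document ids, comparable to one Redis set per "word:<word>" key
index = defaultdict(set)


def add_document(doc_id, text):
    """Index a document by adding its id to the set of every word it contains."""
    for word in text.lower().split():
        index[word].add(doc_id)            # comparable to SADD word:<word> doc_id


def search(*terms):
    """Return ids of documents containing all terms (comparable to SINTER)."""
    sets = [index.get(term.lower(), set()) for term in terms]
    if not sets:
        return set()
    return set.intersection(*sets)


add_document(1, "Redis is fast")
add_document(2, "Redis supports sets")
add_document(3, "Sets are fast")
print(search("redis", "fast"))
```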

Quite simple. However, the real purpose of this scenario in the chapter is to introduce Redis's transaction features.

document_ids =

The MULTI and EXEC commands allow transactional behavior in Redis.

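Using DISCARD inside a transaction will abort the transaction, discarding all the commands and returning to the normal state.

The code above is used in search. zinterstore creates a temp_set to hold the intermediate result, so if several clients call search at the same time they collide on temp_set: a race condition.

The fix is to put this code inside a Multi…last transaction; EXEC is then called automatically by Redis at Last.

Redis is also built on an asynchronous, single-threaded event loop, so this approach solves the problem.

2.6 Analytics and Time-Based Data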

Storing analytics or other time-based data poses somewhat of a challenge for traditional storage systems (like an RDBMS). Perhaps you want to do rate limiting on incoming traffic (which requires fast and highly concurrent updates) or simply track visitors (or more complex metrics) to your website and plot them on a chart.

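This is a very typical, near-perfect use of Redis: real-time statistics. A relational database clearly struggles with a large volume of rapid updates and massive queries within a short time.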

Redis is ideally suited for storing this type of data, and for tracking events in particular.
A good and memory-efficient way of storing this data in Redis is to use hashes to store the counters, increment them using HINCRBY, and then fetch them using HGET and HMGET. Finding the top elements is also easily achieved using the SORT command.
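The hash-of-counters layout can be sketched in Python as follows. This models the data shape only: the helper names mirror the Redis commands, the key `stats:<date>` and the page-name fields are an assumed naming scheme, and `top` uses `Counter.most_common` where the text suggests the SORT command.

```python
from collections import Counter, defaultdict

# One counter per hash key, e.g. "stats:2011-12-10"; fields are page names.
# The key and field naming scheme is an assumption made for this sketch.
hashes = defaultdict(Counter)


def hincrby(key, field, amount=1):
    """Increment a field inside a hash and return the new value."""
    hashes[key][field] += amount
    return hashes[key][field]


def hmget(key, *fields):
    """Fetch several counters at once; missing fields come back as None."""
    stored = hashes.get(key, {})
    return [stored.get(field) for field in fields]


def top(key, n):
    """Return the n fields with the highest counts, highest first."""
    return hashes[key].most_common(n)
```

Grouping all counters for one day under a single key keeps the number of top-level keys small, and a whole day's data can be dropped or expired in one operation.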

2.7 Implementing a Job Queue with Redis

A typical use case for Redis has been a queue, although this is owed mostly to Resque (a project started by GitHub after trying all the other Ruby job queueing solutions). In fact, there are several other implementations (Barbershop, qr, presque) and tutorials ("Creating Processing Queues with Redis"). Nevertheless, it’s interesting in the context of this book to give an example implementation (inspired by existing ones).

This is another typical Redis use case: Redis can be viewed as shared memory, which makes it a natural fit as a message queue between processes...
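A queue built on a Redis list usually pushes at one end and pops at the other. The sketch below models that with a Python deque; `ToyJobQueue` is a hypothetical name, and the comments map each method to the list command it is meant to resemble rather than to any implementation from the book.

```python
from collections import deque


class ToyJobQueue:
    """FIFO job queue modeled on pushing at one end of a list and popping at the other."""

    def __init__(self):
        self._jobs = deque()

    def enqueue(self, job):
        self._jobs.appendleft(job)      # comparable to LPUSH queue job

    def dequeue(self):
        # Comparable to RPOP queue: oldest job first, None when empty.
        return self._jobs.pop() if self._jobs else None

    def __len__(self):
        return len(self._jobs)          # comparable to LLEN queue
```

Because the list lives in the server rather than in any one process, producers and workers on different machines can share it, which is the shared-memory view described above.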

3 Redis Administration and Maintenance

3.1 Configuring Persistence
One of the advantages of Redis over other key/value stores like memcached is its support for persistence.

(1) The default persistence model is snapshotting, which consists of saving the entire database to disk in the RDB format (basically a compressed database dump). This can be done periodically at set times, or every time a configurable number of keys changes.

You can manually trigger snapshotting with the SAVE and BGSAVE commands. SAVE performs the same operation as BGSAVE but does so in the foreground, thereby blocking your Redis server.

If you come to the conclusion that snapshotting is putting too much strain on your Redis servers you might want to consider using slaves for persistence (by commenting out all the save statements in your masters and enabling them only on the slaves), or using AOF instead.

(2) The alternative is using an Append Only File (AOF). This might be a better option if you have a large dataset or your data doesn’t change very frequently.

The Append Only File persistence mode keeps a log of the commands that change your dataset in a separate file. Like most writes on modern operating systems, any data logged to AOF is left in memory buffers and written to disk at intervals of a few seconds using the system’s fsync call. You can configure how often the AOF gets synched to disk by putting appendfsync statements in your configuration file. Valid options are always, everysec, and no.

3.2 Starting a Redis Slave

Database slaves are useful for a number of reasons. You might need them to load-balance your queries, keep hot standby servers, perform maintenance operations, or simply inspect your data.

Redis supports master-slave replication natively: you can have multiple slaves per master and slaves connecting to slaves. You can configure replication in the configuration file before starting a server or by connecting to a running server and using the SLAVEOF command.

3.3 Handling a Dataset Larger Than Memory

Often you might find yourself with a dataset that won’t fit in your available memory.
While you could try to get around that by adding more RAM or sharding (which in addition would allow you to scale horizontally), it might not be feasible or practical to do so.

Redis has supported a feature called Virtual Memory (VM) since version 2.0. This allows you to have a dataset bigger than your available RAM by swapping rarely used values to disk and keeping all the keys and the frequently used values in memory.
However, this has one downside: before Redis reads or performs an operation on swapped values, they must be read into real memory.

Everything is a balance: VM trades time for space, so using VM inevitably brings tradeoffs.

If you decide to use VM, you should be aware of its ideal use cases and the tradeoffs you’re making.
• The keys are always kept in memory. This means that if you have a big number of small keys, VM might not be the best option for you or you might have to change your data structure and use large strings, hashes, lists, or sets instead. Because Redis offers rich data structures, prefer packing data into those structures over storing everything as plain key-value pairs; a large number of keys clearly lowers Redis's efficiency.
• VM is ideal for some patterns of data access, not all. If you regularly query all your data, VM is probably not a good fit because your Redis server might end up blocking clients in order to fetch the values from disk. VM is ideally suited for situations when you have a reasonable amount of frequently accessed values that fit in memory.
• Doing a full dump of your Redis server will be extremely slow. In order to generate a snapshot, Redis needs to read all the values swapped to disk in order to write them to the RDB file (see "Configuring Persistence"). This generates a lot of I/O. Due to this, it might be better for you to use AOF as a persistence mode.
• VM also affects the speed of replication, because Redis masters need to perform a BGSAVE when a new slave connects.

3.4 Upgrading Redis

Our solution will involve starting a new Redis server in slave mode, switching over the clients to the slave and promoting the new server to the master role.

3.5 Backing up Redis

Our proposed solution is heavily dependent on your Redis persistence model:
• With the default persistence model (snapshotting), you’re best off using a snapshot as a backup.
• If you’re using only AOF, you’ll have to back up your log in order to be able to replay it on startup.
• If you’re running your Redis in VM mode, you might want to use an AOF log for the purpose of backups, as the use of snapshotting is not advised with VM.

3.6 Sharding Redis

Sharding is a horizontal partitioning technique often used with databases. It allows you to scale them by distributing your data across several database instances. Not only does this allow you to have a bigger dataset, as you can use more memory, it will also help if CPU usage is the problem, since you can distribute your instances through different servers (or servers with multiple CPUs).
In Redis’s case, sharding can be easily implemented in the client library or application.
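As a rough illustration of client-side sharding, the sketch below hashes each key and uses the result to choose an instance. The function name `shard_for` and the instance addresses are hypothetical, and modulo hashing is just one simple scheme, not necessarily the one the book uses.

```python
import zlib


def shard_for(key, shards):
    """Pick a shard for a key by hashing the key and taking it modulo the shard count."""
    if not shards:
        raise ValueError("at least one shard is required")
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    digest = zlib.crc32(key.encode("utf-8"))
    return shards[digest % len(shards)]


# Hypothetical instance addresses, used only for illustration.
SHARDS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]
print(shard_for("users:jdoe", SHARDS))
```

One tradeoff of this scheme is that changing the number of shards remaps most keys; consistent hashing is a common way to reduce that churn.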

Some Redis resources

Redis, from the Ground Up

http://blog.mjrusso.com/2010/10/17/redis-from-the-ground-up.html

Redis tutorial

http://simonwillison.net/static/2010/redis-tutorial/

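Several misconceptions about Redis

Reposting a blog post that I think is well written, because reaching the original requires getting past the firewall.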

http://timyang.net/data/redis-misunderstanding/

Saturday, Dec 4th, 2010 by Tim | Tags: key value store, redis

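A few days ago Weibo suffered a major system failure that many technical friends were concerned about. The causes fall within the categories James Hamilton outlined in On Designing and Deploying Internet-Scale Services (1). His first lesson, "Design for failure", is key to the success of any Internet architecture. The engineering theory of Internet systems is actually very simple: the content of James's paper is hardly theory at all, but a collection of shared practical lessons, and how well each company understands and executes on them determines whether its architecture succeeds.

Digression over. I have recently been studying Redis again. Last year I ran a MemcacheDB, Tokyo Tyrant, Redis performance test, and that benchmark's results still hold today. Over the past year we have been through the temptation of a dazzling array of key-value storage products, from the fading of Cassandra (Twitter paused its use in its main business) to the rise of HBase (Facebook chose HBase for its new mail service (2)). Looking back at Redis, I found that this program of only a little over 10,000 lines of source code is full of surprises and many untapped features. Redis performance is astonishing; a sub-product of one of the top ten websites in China could probably meet its storage and cache needs with a single Redis instance. Beyond the impression of performance, the industry holds some widespread misconceptions about Redis. This article offers some views for discussion.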

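1. What Redis is

The answer to this question shapes how we use Redis. If you think of Redis as a key-value store, you may use it to replace MySQL; if you think of it as a cache that can persist, you may only keep some frequently accessed temporary data in it. Redis is short for REmote DIctionary Server, and the subtitle on the official Redis site is "A persistent key-value database with built-in net interface written in ANSI-C for Posix systems", a definition that leans toward a key-value store. Others regard Redis as a memory database, because its high performance rests on in-memory operations. Still others consider Redis a data structure server, because it supports complex data types such as List and Set. How you interpret Redis's role determines how you use it.

Internet data is currently stored in basically two ways: relational databases or key-value stores. But the data of these Internet services fits neither model naturally. Take a user's relationships on a social platform: they form a list. Storing that in a relational database means converting it into multiple rows, which carries a lot of redundancy because every row repeats some information. Storing it as a key-value pair makes modification and deletion awkward, since the whole value must be read out and written back. Redis provides a variety of data types in memory, letting applications access these structures atomically and at high speed without worrying about persistent storage, which architecturally removes the detours the other two storage approaches require.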

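2. Redis cannot be faster than Memcached

Many developers believe Redis cannot be faster than Memcached: Memcached is purely memory-based, while Redis offers persistence, so even with asynchronous persistence Redis could not be faster. Yet test results largely show Redis with a clear advantage. I have been thinking about why, and so far have come up with the following reasons.

Libevent. Unlike Memcached, Redis did not choose libevent. For the sake of generality, libevent has a large code base (Redis's code is currently less than a third the size of libevent's) and sacrifices a good deal of performance on specific platforms. Redis implemented its own epoll event loop (4) by modifying two files from libevent. Many developers have also suggested that Redis use libev, a high-performance alternative to libevent, but the author has stuck to the idea that Redis should stay small and free of dependencies. One memorable detail is that you do not need to run ./configure before compiling Redis.
The CAS issue. CAS is a convenient way in Memcached to prevent competing modifications to a resource. Implementing CAS requires a hidden cas token for each cache key; the token acts as a version number for the value and must be incremented on every set. This costs both CPU and memory. The overhead is small, but with 10 GB+ of cache on a single machine and QPS above ten thousand, it produces a slight relative performance difference between the two (5).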

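3. The data stored in a single Redis instance must be smaller than physical memory

Keeping all Redis data in memory delivers high speed, but it also has an unreasonable side. Suppose a mid-sized site has 1 million registered users; to store their data in Redis, memory must be able to hold all 1 million. In practice, though, only 50,000 of those users are active, and only 150,000 visited even once in the past week, so keeping all 1 million users' data in memory is wasteful: RAM ends up paying for cold data.

This closely resembles an operating system. All the data that applications access is in memory, but when physical memory cannot hold new data, the operating system intelligently swaps some long-unaccessed data to disk to make room for new applications. What a modern operating system offers applications is not physical memory but the abstraction of virtual memory.

With the same consideration in mind, Redis 2.0 added the VM feature, which lets Redis's data capacity exceed the limit of physical memory and separates hot data from cold.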

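4. Redis's VM implementation reinvents the wheel

As with its epoll implementation, Redis again built VM itself. But as mentioned in the operating-system discussion above, the OS can also separate hot and cold data for a program automatically: Redis would only need to request a large block of memory from the OS, and the OS would keep hot data in physical memory and swap cold data to disk. Varnish, another well-known project that "understands the modern operating system (3)", is implemented this way and has been very successful.

The author, antirez, gave several reasons for implementing VM himself (6). Mainly, the OS's VM swaps in and out in units of pages. If an OS page is 4 KB, then as long as any element in that 4 KB is accessed, even a single byte, the page will not be swapped out. Swapping in works the same way: reading one byte may bring in 4 KB of otherwise useless memory. With its own implementation, Redis can control the granularity of swapping. In addition, a process blocks while accessing a swapped-out memory region, which is another reason Redis implemented VM itself.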

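5. Using Redis in get/set style

Because Redis is a key-value store, many developers naturally use it through set/get, but this is not actually the optimal way to use it. Especially when VM is not enabled, all Redis data must fit in memory, so saving memory matters a great deal.

Suppose a key-value unit occupies a minimum of 512 bytes, so that even storing a single byte takes 512 bytes. A design pattern applies here: reuse keys by putting several key-value pairs under one key, with the value stored as a set. The same 512 bytes can then hold 10 to 100 times as much.

This is why, to save memory, it is recommended to use Redis with hashset rather than set/get; see reference (7) for the detailed method.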
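One way to read the key-reuse pattern described above is to bucket many small values into hashes, so that a group of logical keys shares one top-level key. The sketch below shows only the key-mapping step; the function name `bucket`, the `prefix:id` naming, and the bucket size of 100 are assumptions for illustration, since reference (7) is not reproduced here.

```python
def bucket(prefix, object_id, bucket_size=100):
    """Map a numeric id to a (hash key, field) pair so many values share one key."""
    hash_key = f"{prefix}:{object_id // bucket_size}"
    field = str(object_id % bucket_size)
    return hash_key, field


print(bucket("user", 1234))
```

Under this mapping, a write that would have been a SET on `user:1234` becomes an HSET on the returned hash key and field, and the matching read becomes an HGET.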

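6. Using AOF instead of snapshot

Redis has two persistence modes. The default is snapshot mode, which periodically persists a snapshot of memory to disk; its drawback is that a crash after a snapshot loses the data written since. So, pushed by perfectionists, the author added AOF mode. AOF stands for append-only file: while writing data to memory, Redis also saves the command to a log file. In a system with tens of thousands of concurrent changes, the command log becomes enormous, costly to manage and maintain, and very slow to replay on recovery, which defeats AOF's original high-availability purpose. More importantly, Redis is an in-memory data structure model whose strengths all rest on efficient atomic operations on complex in-memory data structures, which makes AOF look like a very ill-fitting part.

AOF's main goals are data reliability and high availability, and Redis has another way to achieve them: replication. Thanks to Redis's high performance, replication has essentially no lag, which prevents a single point of failure and provides high availability.

Summary

To use a product successfully, we need to understand its characteristics in depth. Redis performs outstandingly, and mastering it can greatly help many large applications in China. I hope more peers will join in using Redis and studying its code.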

————————————————————————————————————————————————————————————————————————

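Appendix:

1. What is Redis?
2. What is Redis used for?
3. Redis's strengths?
4. Redis's weaknesses?

Reading goal: gain a conceptual understanding of what an in-memory database is.

What is Redis?

Databases are currently classified in several ways, including SQL/NoSQL, relational databases, key-value databases, and so on, and the classification criteria are not uniform. Redis is essentially a key-value database, but while retaining the simplicity and speed of key-value databases it also absorbs some strengths of relational databases, which places it between the two. Redis can store not only Strings but also Lists (ordered) and Sets (unordered), and it supports advanced features such as sorting (SORT). Operations such as INCR and SETNX are guaranteed to be atomic. It also supports master-slave replication, among other features.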

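  A more detailed description is available here:

      http://code.google.com/p/redis/wiki/index

  Redis also officially provides sample project code named Retwis, which you can study alongside the official documentation.

2 What is Redis used for?

      Narrowly speaking, Redis often serves as a message queue, using its built-in List type to meet real-time, high-concurrency needs. In e-commerce data processing, queues for products, best sellers, and recommendation rankings are typically kept in Redis, and Storm also reads and updates those Redis lists along the way.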

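3 Redis's strengths

Extremely high performance: Redis can support more than 100K reads/writes per second.

Rich data types: Redis supports binary-safe Strings, Lists, Hashes, Sets, and Ordered Sets.

Atomic: all Redis operations are atomic, and Redis also supports executing several operations atomically as a group.

Rich features: Redis also supports publish/subscribe, notifications, key expiration, and more.

4 Redis's weaknesses

Database capacity is limited by physical memory, so Redis cannot be used for high-performance reads and writes over massive data. The scenarios it suits are therefore mainly limited to high-performance operations and computation on smaller data sets.

Summary: Redis is limited to particular scenarios and focused on a particular domain, where it is extremely fast; no product that can replace it has been found so far.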

——————————————————————————————————————————————————————————————————————————

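What is Redis? Two sentences sum it up:
1. It is a completely open-source, free key-value in-memory database.
2. It is usually regarded as a data structure server, mainly because of its rich data structures: strings, maps, lists, sets, and sorted sets.

What is Redis not? Again, a comparison from two angles:
1. It is not a relational database such as SQL Server or MySQL, mainly because:
   . For now Redis can only serve as storage for small data volumes (all data must fit in memory); massive data storage is not its strength.
   . The design and implementation are very different. Relational databases store data in tables and query it with SQL, whereas Redis stores data in the five data structures above and queries it with commands.
2. It is not a caching system such as Memcached, mainly for the following reasons:
   . Network I/O model: Memcached is multi-threaded, split into a listener thread and worker threads, and introduces locks, which cost performance. Redis uses a single-threaded I/O multiplexing model, which maximizes its speed advantage, and it also provides some simple computation features.
   . Memory management: Memcached uses pre-allocated memory pools, which wastes some space, and new data may be evicted even when plenty of memory remains. Redis allocates memory on demand to store data and does not evict any non-temporary data, so Redis is better suited to storage than to caching.
   . Data consistency: Memcached provides the cas command for this, while Redis provides transactions, which guarantee that a sequence of commands executes atomically without being interrupted by any other operation.
   . Storage model: Memcached supports only simple key-value storage, with no enumeration and no persistence or replication.

In one sentence: Redis is a high-performance key-value database. Its arrival largely makes up for the shortcomings of key/value stores like memcached, and in some situations it complements relational databases well.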

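What is Redis good for? Only by understanding its features can we play to its strengths and avoid its weaknesses:
1. Fast: written in standard C, with all data handled in memory; read and write speeds reach 100,000 and 200,000 respectively.
2. Persistence: updates use copy-on-write and can be saved to disk asynchronously. There are two main strategies: snapshots based on elapsed time and number of updates (save 300 10), and statement appending (append-only file, AOF).
3. Atomic operations: operations on the various data types are all atomic, which makes them safe.
4. Fast master-slave replication: according to an official figure, a slave completed replication of a 10 GB key set on Amazon in 21 seconds.
5. Sharding: it is easy to distribute data across multiple Redis instances. Database scaling is a perennial topic. In relational databases, vertical scaling, mainly by adding hardware and partitioning, has solved many application scenarios, but with the rise of Web 2.0, mobile Internet, and cloud computing, that model no longer fits well. In recent years, horizontal scaling approaches such as master-slave configuration, database replication, and sharding, which spread load across multiple physical nodes, have seen more and more use.

To summarize what sharding gives a Redis database:
1. Better DB scalability: newly added data only needs to go on newly added servers.
2. Better DB availability: a failure affects only the users whose data lives on the affected shard server.
3. Better DB maintainability: upgrades and configuration changes can be rolled out one shard at a time, with little impact on the service.
4. Smaller databases carry less query load, so queries are faster and performance is better.

At this point some readers may be eager to try it. How? Go straight to the official documentation, which lists clients for each language environment, mainly Ruby, Python, PHP, Perl, Lua, Java, C#.... There are a few languages I have never even seen, so I won't say more....

Finally, a summary of some lessons learned while using it:
1. Set up master-slave replication so you can fail over when a service fault occurs.
2. Disable persistence on the master and configure it only on the slave.
3. When physical memory plus virtual memory runs short, the dump hangs indefinitely and the machine eventually goes down. This is a disaster!
4. Once Redis's physical memory usage exceeds 3/5 of total memory, things start to get risky: swapping begins and memory fragmentation grows.
5. When maximum memory is reached, keys with an expiration time are evicted even if they have not yet expired.
6. On keeping Redis and the DB in sync on writes: write the DB first, then Redis, because the in-memory write rarely fails.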
