Storm存儲結果至Redis

時間 2019-11-19

標籤 storm 存儲結果 redis 欄目 Storm 简体版

原文原文鏈接

原有的事務支持使用MemcachedState來進行，如今須要將其遷移至Redis，而且須要記錄全部key值列表，由於在redis中雖然可使用keys *操做，但不是被推薦的方式，因此把全部結果存在Redis中的一個HASH格式字段中。

關於Redis與Storm集成的相關文檔，能夠參考：

http://storm.apache.org/releases/2.0.0-SNAPSHOT/storm-redis.html

因爲Redis中也有着較多種類型的數據結構，這也爲咱們提供了可能，將全部的key至統一放置到set中，或其餘更爲合適的數據結構中。

搭建啓動Redis

目前，分配過來的4臺服務器，只有135剩餘內存較多，分出1G用來做爲Redis存儲使用，搭建一臺單機Redis服務，用於記錄全部的查詢日誌。

啓動該服務：

sudo bin/redis-server conf/redis.6388.conf

Storm集成Redis

添加maven依賴：

<dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-redis</artifactId>
            <version>${storm.version}</version>
        </dependency>

對於正常的Bolt來講，storm-redis提供了基本的bolt實現，RedisLookupBolt和RedisStoreBolt，

其中使用了策略模式，將實際要查詢/保存相關的key設置以及策略放到了RedisLookup/StoreMapper中，在LookupBolt和StoreBolt中進行實際的查找、保存操做，根據RedisDataType的不一樣，支持Redis的各類數據類型：STRING, HASH, LIST, SET, SORTED_SET, HYPER_LOG_LOG。

從對應傳輸過來的Tuple中查找、保存相應字段的值，在RedisLookupBolt中，根據不一樣的key值，從key值/或者additionalKey中使用不一樣的方法來get獲得對應的值。

@Override
    public void execute(Tuple input) {
        String key = lookupMapper.getKeyFromTuple(input);
        Object lookupValue;

        JedisCommands jedisCommand = null;
        try {
            jedisCommand = getInstance();

            switch (dataType) {
                case STRING:
                    lookupValue = jedisCommand.get(key);
                    break;

                case LIST:
                    lookupValue = jedisCommand.lpop(key);
                    break;

                case HASH:
                    lookupValue = jedisCommand.hget(additionalKey, key);
                    break;

                case SET:
                    lookupValue = jedisCommand.scard(key);
                    break;

                case SORTED_SET:
                    lookupValue = jedisCommand.zscore(additionalKey, key);
                    break;

                case HYPER_LOG_LOG:
                    lookupValue = jedisCommand.pfcount(key);
                    break;

                default:
                    throw new IllegalArgumentException("Cannot process such data type: " + dataType);
            }

            List<Values> values = lookupMapper.toTuple(input, lookupValue);
            for (Values value : values) {
                collector.emit(input, value);
            }

            collector.ack(input);
        } catch (Exception e) {
            this.collector.reportError(e);
            this.collector.fail(input);
        } finally {
            returnInstance(jedisCommand);
        }

Redis TridentState支持

此外，storm-redis中還支持trident state：

RedisState and RedisMapState, which provide Jedis interface just for single redis.

RedisClusterState and RedisClusterMapState, which provide JedisCluster interface, just for redis cluster.

因爲咱們使用的是single redis模式（非集羣），在下面的UML圖中會有所體現：

使用RedisDataTypeDescription來定義保存到Redis的數據類型和額外的key，其中支持兩種數據類型：STRING和HASH。若是使用HASH類型，則須要定義額外的key，由於hash屬於兩層的，咱們定義的additionalKey爲最外層的key類型。

例如咱們須要保存結果至Redis的Hash數據結構中，則須要定義RedisDataTypeDescription.RedisDataType.HASH，定義hash的key："controller:5min」，根據key進行group by操做，當前使用非事務型（對數據正確性敏感度不高）。

            Options<Object> fiveMinitesOptions = new Options<>();
            fiveMinitesOptions.dataTypeDescription = new RedisDataTypeDescription(RedisDataTypeDescription.RedisDataType.HASH,
                    "controller:5min");
            logStream.each(new Fields("logObject"), new Log5MinGroupFunction(), new Fields("key"))
                    .groupBy(new Fields("key"))
                    .persistentAggregate(RedisMapState.nonTransactional(poolConfig, fiveMinitesOptions), new Fields("logObject"),
                            new LogCombinerAggregator(), new Fields("statistic"));

最後在Redis中保存的值爲：

controller:5min
          Log5MinGroupFunction生成的key，LogCombinerAggregator合併完成後的value；

Log5MinGroupFunction生成的key會通過KeyFactory.build(List<Object> key)方法轉換，能夠考慮自定義生成的key；最終的value會經過Serializer的序列化以及反序列化方法轉換成byte[]存放至Redis中，默認是經過JSON的格式。

在AbstractRedisMapState中，對於傳過來的keys進行統一KeyFactory.get操做，而實際獲取值和持久化值是經過 retrieveValuesFromRedis以及updateStatesToRedis兩個方法來實現的

@Override public List<T> multiGet(List<List<Object>> keys) {
        if (keys.size() == 0) {
            return Collections.emptyList();
        }

        List<String> stringKeys = buildKeys(keys);
        List<String> values = retrieveValuesFromRedis(stringKeys);

        return deserializeValues(keys, values);
    }

private List<String> buildKeys(List<List<Object>> keys) {
        List<String> stringKeys = new ArrayList<String>();
        for (List<Object> key : keys) {
            stringKeys.add(getKeyFactory().build(key));
        }
        return stringKeys;
    }

@Override
    public void multiPut(List<List<Object>> keys, List<T> vals) {
        if (keys.size() == 0) {
            return;
        }

        Map<String, String> keyValues = new HashMap<String, String>();
        for (int i = 0; i < keys.size(); i++) {
            String val = new String(getSerializer().serialize(vals.get(i)));
            String redisKey = getKeyFactory().build(keys.get(i));
            keyValues.put(redisKey, val);
        }

        updateStatesToRedis(keyValues);
    }

在RedisMapState中，從Redis中獲取值的方法：

@Override
    protected List<String> retrieveValuesFromRedis(List<String> keys) {
        String[] stringKeys = keys.toArray(new String[keys.size()]);

        Jedis jedis = null;
        try {
            jedis = jedisPool.getResource();

            RedisDataTypeDescription description = this.options.dataTypeDescription;
            switch (description.getDataType()) {
            case STRING:
                return jedis.mget(stringKeys);

            case HASH:
                return jedis.hmget(description.getAdditionalKey(), stringKeys);

能夠看出，支持兩種類型STRING以及HASH，能夠經過批量獲取的API獲取多個keys值，update的過程也比較相似，若是是STRING類型，經過pipeline的方式（分佈式不支持）能夠極大提升查找效率；若是爲hash類型，直接經過hmget便可。

protected void updateStatesToRedis(Map<String, String> keyValues) {
        Jedis jedis = null;

        try {
            jedis = jedisPool.getResource();

            RedisDataTypeDescription description = this.options.dataTypeDescription;
            switch (description.getDataType()) {
            case STRING:
                String[] keyValue = buildKeyValuesList(keyValues);
                jedis.mset(keyValue);
                if(this.options.expireIntervalSec > 0){
                    Pipeline pipe = jedis.pipelined();
                    for(int i = 0; i < keyValue.length; i += 2){
                        pipe.expire(keyValue[i], this.options.expireIntervalSec);
                    }
                    pipe.sync();
                }
                break;

            case HASH:
                jedis.hmset(description.getAdditionalKey(), keyValues);
                if (this.options.expireIntervalSec > 0) {
                    jedis.expire(description.getAdditionalKey(), this.options.expireIntervalSec);
                }
                break;

相關標籤/搜索

每日一句

每一个你不满意的现在，都有一个你没有努力的曾经。