Database/Table Sharding, Part 4: Custom Primary Keys and Distributed Primary Keys


Series index: Sharding-JDBC from Beginner to Expert

Prerequisite 1: build a Linux virtual-machine cluster on Windows
Prerequisite 2: install MySQL on the virtual machines (CentOS MySQL notes, including a Vagrant MySQL image)
Part 1: Sharding-JDBC database and table sharding (hands-on introduction)
Part 2: Sharding-JDBC fundamentals
Part 3: MySQL master-slave replication, theory and practice
Part 4: custom and distributed primary keys, theory and practice (this article)
Part 5: read/write splitting, theory and practice
Part 6: how Sharding-JDBC executes SQL
Source code for the series (git)

1. Overview: the three key generation strategies in sharding-jdbc

In traditional database development, automatically generated primary keys are a basic requirement, and the major databases support them natively, for example MySQL's auto-increment column. Once a MySQL database is sharded, however, generating globally unique ids across the tables behind one logical table becomes genuinely hard: the auto-increment counters of the physical tables cannot see each other, so they hand out duplicate ids. You can of course avoid duplicates by constraining how each table generates its keys (e.g. distinct offsets), but that adds operational burden and leaves the setup hard to extend.
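A tiny simulation makes the collision concrete (the class and method names here are ours, purely for illustration): two physical tables each keep an independent counter, just like per-table AUTO_INCREMENT, and hand out the same values.

```java
public class AutoIncrementCollisionDemo {

    // Each physical table (say t_order_0 and t_order_1) keeps its own
    // counter, unaware of the other -- like per-table AUTO_INCREMENT in MySQL.
    static class ShardTable {
        private long autoIncrement;

        long insert() {
            return ++autoIncrement;
        }
    }

    public static void main(String[] args) {
        ShardTable t0 = new ShardTable();
        ShardTable t1 = new ShardTable();
        long idFromShard0 = t0.insert();
        long idFromShard1 = t1.insert();
        // Both shards issued id 1 for the same logical table: a collision.
        System.out.println(idFromShard0 == idFromShard1); // true
    }
}
```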

The main interface sharding-jdbc provides for distributed primary keys is ShardingKeyGenerator. It specifies how a globally unique, increasing key is generated, along with type lookup and property configuration.

sharding-jdbc ships two key generation strategies, UUID and SNOWFLAKE, with SNOWFLAKE as the default; the corresponding implementation classes are UUIDShardingKeyGenerator and SnowflakeShardingKeyGenerator.

Beyond these two built-in strategies, you can also implement ShardingKeyGenerator to build a custom key generator.

2. A custom auto-increment key generator

sharding-jdbc extracts key generation into the ShardingKeyGenerator interface precisely so that users can plug in their own auto-increment key generators.

2.1 Reference code for a custom key generator

package com.crazymaker.springcloud.sharding.jdbc.demo.strategy;

import lombok.Data;
import org.apache.shardingsphere.spi.keygen.ShardingKeyGenerator;

import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;

// A standalone (single-JVM) ID generator backed by an AtomicLong
@Data
public class AtomicLongShardingKeyGenerator implements ShardingKeyGenerator
{

    private AtomicLong atomicLong = new AtomicLong(0);
    private Properties properties = new Properties();

    @Override
    public Comparable<?> generateKey() {
        return atomicLong.incrementAndGet();
    }

    @Override
    public String getType() {
        // the type name used to select this generator in configuration
        return "AtomicLong";
    }
}

2.2 SPI configuration

Many of the pluggable features in Apache ShardingSphere are loaded via SPI injection. Service Provider Interface (SPI) is a Java mechanism for APIs that are meant to be implemented or extended by third parties; it enables framework extension and component replacement by locating the service implementations for an extensible API.

In practice, SPI is a dynamic loading mechanism built from "programming against an interface + the strategy pattern + a configuration file".

Spring uses SPI heavily as well, for example the ServletContainerInitializer handling required by the Servlet 3.0 specification, and the Type Conversion SPI (Converter SPI, Formatter SPI).

Apache ShardingSphere adopted SPI for overall architectural reasons: advanced users can implement the interfaces ShardingSphere exposes and have their own classes loaded dynamically, meeting scenario-specific needs while the core architecture and feature set stay intact and stable.
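The "interface + strategy + config file" pattern can be sketched in plain Java. This is a self-contained illustration only: the registry is simulated with a map, whereas the real JDK SPI discovers implementations by reading META-INF/services files via java.util.ServiceLoader.

```java
import java.util.HashMap;
import java.util.Map;

public class SpiSketch {

    // The extension point -- the role ShardingKeyGenerator plays in ShardingSphere.
    interface KeyGenerator {
        String getType();
        Comparable<?> generateKey();
    }

    // Two competing strategies behind the same interface.
    static class UuidGenerator implements KeyGenerator {
        public String getType() { return "UUID"; }
        public Comparable<?> generateKey() { return java.util.UUID.randomUUID().toString(); }
    }

    static class CounterGenerator implements KeyGenerator {
        private long counter;
        public String getType() { return "COUNTER"; }
        public Comparable<?> generateKey() { return ++counter; }
    }

    // With real SPI, ServiceLoader.load(KeyGenerator.class) would populate this
    // registry from META-INF/services entries; here we register by hand.
    static final Map<String, KeyGenerator> registry = new HashMap<>();
    static {
        for (KeyGenerator g : new KeyGenerator[] { new UuidGenerator(), new CounterGenerator() }) {
            registry.put(g.getType(), g);
        }
    }

    // Configuration selects a strategy by its type string -- the same way a
    // `type:` entry in YAML selects a key generator.
    static KeyGenerator byType(String type) {
        return registry.get(type);
    }

    public static void main(String[] args) {
        System.out.println(byType("COUNTER").generateKey()); // 1
    }
}
```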

Add the following file: META-INF/services/org.apache.shardingsphere.spi.keygen.ShardingKeyGenerator,

with this single line as its content: com.crazymaker.springcloud.sharding.jdbc.demo.strategy.AtomicLongShardingKeyGenerator.

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

#register our AtomicLongShardingKeyGenerator
com.crazymaker.springcloud.sharding.jdbc.demo.strategy.AtomicLongShardingKeyGenerator


#org.apache.shardingsphere.core.strategy.keygen.SnowflakeShardingKeyGenerator
#org.apache.shardingsphere.core.strategy.keygen.UUIDShardingKeyGenerator

The file above started life as the SPI configuration file copied from META-INF/services inside sharding-core-common-4.1.0.jar.


2.3 Using the custom ID generator

When configuring the sharding strategy, select the custom ID generator by its type name. The relevant configuration:

spring:
  shardingsphere:
    datasource:
      names: ds0,ds1
      ds0:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        filters: com.alibaba.druid.filter.stat.StatFilter,com.alibaba.druid.wall.WallFilter,com.alibaba.druid.filter.logging.Log4j2Filter
        url: jdbc:mysql://cdh1:3306/store?useUnicode=true&characterEncoding=utf8&allowMultiQueries=true&useSSL=true&serverTimezone=UTC
        password: 123456
        username: root
        maxActive: 20
        initialSize: 1
        maxWait: 60000
        minIdle: 1
        timeBetweenEvictionRunsMillis: 60000
        minEvictableIdleTimeMillis: 300000
        validationQuery: select 'x'
        testWhileIdle: true
        testOnBorrow: false
        testOnReturn: false
        poolPreparedStatements: true
        maxOpenPreparedStatements: 20
        connection-properties: druid.stat.mergeSql=true;druid.stat.slowSqlMillis=5000
      ds1:
        type: com.alibaba.druid.pool.DruidDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        filters: com.alibaba.druid.filter.stat.StatFilter,com.alibaba.druid.wall.WallFilter,com.alibaba.druid.filter.logging.Log4j2Filter
        url: jdbc:mysql://cdh2:3306/store?useUnicode=true&characterEncoding=utf8&allowMultiQueries=true&useSSL=true&serverTimezone=UTC
        password: 123456
        username: root
        maxActive: 20
        initialSize: 1
        maxWait: 60000
        minIdle: 1
        timeBetweenEvictionRunsMillis: 60000
        minEvictableIdleTimeMillis: 300000
        validationQuery: select 'x'
        testWhileIdle: true
        testOnBorrow: false
        testOnReturn: false
        poolPreparedStatements: true
        maxOpenPreparedStatements: 20
        connection-properties: druid.stat.mergeSql=true;druid.stat.slowSqlMillis=5000
    sharding:
      tables:
        #the logical-table configuration is critical: it decides whether routing succeeds
        #shardingsphere picks a routing engine based on the SQL statement type; logicTable is the key field used for routing
        # rules for the t_order table
        t_order:
          #real data nodes: data source name + table name, joined by a dot; multiple tables separated by commas; inline expressions supported
          actual-data-nodes: ds$->{0..1}.t_order_$->{0..1}
          table-strategy:
            inline:
              sharding-column: order_id
              algorithm-expression: t_order_$->{order_id % 2}
          database-strategy:
            inline:
              sharding-column: user_id
              algorithm-expression: ds$->{user_id % 2}
          key-generator:
            column: order_id
            type: AtomicLong

2.4 Testing the custom key

Start the application and open its Swagger UI; the link is:

http://localhost:7700/sharding-jdbc-provider/swagger-ui.html#/sharding%20jdbc%20%E6%BC%94%E7%A4%BA/listAllUsingPOST

Create an order with user id = 4, leaving orderId empty so that the backend generates it.


After submitting the order, list all orders through the query endpoint in the Swagger UI.


The query result shows that the new order's id is 1; it is no longer an id produced by the previous snowflake algorithm.

The console log also shows the generated id of 1. The insert log line:

[http-nio-7700-exec-8] INFO  ShardingSphere-SQL - Actual SQL: ds0 ::: insert into t_order_1 (status, user_id, order_id) values (?, ?, ?) ::: [INSERT_TEST, 4, 1]

Insert orders repeatedly and the ids produced by AtomicLongShardingKeyGenerator simply count up: 1, 2, 3, 4, 5, 6, ... Note that, as its comment says, this generator is single-JVM only: the counter is not shared across application instances.
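Stripped of the ShardingSphere interface, the generator's behavior is nothing more than AtomicLong.incrementAndGet(); a minimal standalone check of the 1, 2, 3, ... sequence:

```java
import java.util.concurrent.atomic.AtomicLong;

public class AtomicLongKeyDemo {

    private final AtomicLong atomicLong = new AtomicLong(0);

    // Same core logic as AtomicLongShardingKeyGenerator.generateKey().
    public long generateKey() {
        return atomicLong.incrementAndGet();
    }

    public static void main(String[] args) {
        AtomicLongKeyDemo gen = new AtomicLongKeyDemo();
        // Keys come out as 1, 2, 3, ... within this JVM only; a second
        // application instance would restart from 1 and collide.
        System.out.println(gen.generateKey()); // 1
        System.out.println(gen.generateKey()); // 2
    }
}
```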

3. The UUID generator

sharding-jdbc's built-in ID generator implementations are UUIDShardingKeyGenerator and SnowflakeShardingKeyGenerator. The former relies on the UUID algorithm to produce non-repeating keys; its implementation is trivial:

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.shardingsphere.core.strategy.keygen;

import lombok.Getter;
import lombok.Setter;
import org.apache.shardingsphere.spi.keygen.ShardingKeyGenerator;

import java.util.Properties;
import java.util.UUID;

/**
 * UUID key generator.
 */
@Getter
@Setter
public final class UUIDShardingKeyGenerator implements ShardingKeyGenerator {
    
    private Properties properties = new Properties();
    
    @Override
    public String getType() {
        return "UUID";
    }
    
    @Override
    public synchronized Comparable<?> generateKey() {
        return UUID.randomUUID().toString().replaceAll("-", "");
    }
}

Because InnoDB organizes rows in a B+Tree clustered index, inserting random UUID keys performs poorly (inserts land on random leaf pages and cause page splits), so UUIDs are generally not recommended as primary keys.

4. The snowflake algorithm

4.1 Snowflake in brief

There are many distributed id generation algorithms; Twitter's SnowFlake is a classic one.

The name comes from the saying that no two snowflakes in nature are exactly alike: every snowflake has its own distinctive, one-of-a-kind shape. Likewise, every ID the snowflake algorithm generates is unique.


1. Overview

Snowflake IDs are purely numeric and time-ordered. The original implementation was in Scala; ports to many other languages, such as Java and C++, followed.

2. Structure


大體由:首位無效符、時間戳差值,機器(進程)編碼,序列號四部分組成。

The implementation follows Twitter's Snowflake algorithm; the id is 64 bits long, laid out as:

  • 1 bit: sign bit.

  • 41 bits: timestamp offset from 2016.11.01 (the date the Sharding-JDBC distributed primary key was published) to now.

  • 10 bits: worker process id.

  • 12 bits: auto-increment offset within one millisecond.

Bits  Field       Notes
1     sign bit    always 0; unused
41    timestamp   millisecond precision; 2^41 /365/24/60/60/1000 ≈ 69.7 years
10    worker id   up to 1024 worker processes
12    sequence    restarts from 0 each millisecond; up to 4096 ids per millisecond

Snowflake ids are, overall, ordered by time of generation. The parts add up to exactly 64 bits, one Java long (at most 19 digits as a decimal string). Within one distributed system no ids collide (the datacenter and workerId tell nodes apart), and throughput is high: in tests, snowflake produces around 260,000 ids per second.
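A hypothetical compose/decompose helper makes the 1+41+10+12 layout concrete. The method names are ours, but the shift constants match those in the SnowflakeShardingKeyGenerator source quoted in section 4.2:

```java
public class SnowflakeLayoutDemo {

    static final long SEQUENCE_BITS = 12L;
    static final long WORKER_ID_BITS = 10L;
    static final long WORKER_ID_SHIFT = SEQUENCE_BITS;                  // 12
    static final long TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS; // 22

    // Pack the three payload fields into one long; the sign bit stays 0.
    static long compose(long timestampDelta, long workerId, long sequence) {
        return (timestampDelta << TIMESTAMP_SHIFT) | (workerId << WORKER_ID_SHIFT) | sequence;
    }

    // Unpack each field by masking and shifting.
    static long sequenceOf(long id)  { return id & ((1L << SEQUENCE_BITS) - 1); }
    static long workerIdOf(long id)  { return (id >> WORKER_ID_SHIFT) & ((1L << WORKER_ID_BITS) - 1); }
    static long timestampOf(long id) { return id >> TIMESTAMP_SHIFT; }

    public static void main(String[] args) {
        long id = compose(1_000_000L, 123L, 7L);
        // All three fields survive a round trip through the 64-bit id.
        System.out.println(timestampOf(id)); // 1000000
        System.out.println(workerIdOf(id));  // 123
        System.out.println(sequenceOf(id));  // 7
    }
}
```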

3. 特色(自增、有序、適合分佈式場景)

  • Timestamp bits: ids sort by time, which helps query speed.
  • Machine id bits: identify each node in a distributed deployment; the 10 bits can be subdivided to match the node count and topology, e.g. 5 bits set aside as a process id.
  • Sequence bits: a per-millisecond auto-increment counter, so a single node can issue multiple ids within the same millisecond; 12 bits allow 4096 ids per node per millisecond.

The snowflake algorithm can be adapted as needed to fit a project's circumstances.


4. Drawbacks

  • Strong dependency on the system clock.
  • If the clock is moved backwards, duplicate IDs are generated.

sharding-jdbc's distributed IDs use Twitter's open-source snowflake algorithm and need no third-party component, which keeps extension and maintenance as simple as possible.

For snowflake's inherent weakness (the strong clock dependency and clock-rollback duplicates), however, sharding-jdbc does not ship a complete fix; users who need stronger guarantees must extend it themselves.

4.2 SnowflakeShardingKeyGenerator source

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.shardingsphere.core.strategy.keygen;

import com.google.common.base.Preconditions;
import lombok.Getter;
import lombok.Setter;
import lombok.SneakyThrows;
import org.apache.shardingsphere.spi.keygen.ShardingKeyGenerator;

import java.util.Calendar;
import java.util.Properties;

/**
 * Snowflake distributed primary key generator.
 * 
 * <p>
 * Use snowflake algorithm. Length is 64 bit.
 * </p>
 * 
 * <pre>
 * 1bit sign bit.
 * 41bits timestamp offset from 2016.11.01(ShardingSphere distributed primary key published data) to now.
 * 10bits worker process id.
 * 12bits auto increment offset in one mills
 * </pre>
 * 
 * <p>
 * Call @{@code SnowflakeShardingKeyGenerator.setWorkerId} to set worker id, default value is 0.
 * </p>
 * 
 * <p>
 * Call @{@code SnowflakeShardingKeyGenerator.setMaxTolerateTimeDifferenceMilliseconds} to set max tolerate time difference milliseconds, default value is 0.
 * </p>
 */
public final class SnowflakeShardingKeyGenerator implements ShardingKeyGenerator {
    
    public static final long EPOCH;
    
    private static final long SEQUENCE_BITS = 12L;
    
    private static final long WORKER_ID_BITS = 10L;
    
    private static final long SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1;
    
    private static final long WORKER_ID_LEFT_SHIFT_BITS = SEQUENCE_BITS;
    
    private static final long TIMESTAMP_LEFT_SHIFT_BITS = WORKER_ID_LEFT_SHIFT_BITS + WORKER_ID_BITS;
    
    private static final long WORKER_ID_MAX_VALUE = 1L << WORKER_ID_BITS;
    
    private static final long WORKER_ID = 0;
    
    private static final int DEFAULT_VIBRATION_VALUE = 1;
    
    private static final int MAX_TOLERATE_TIME_DIFFERENCE_MILLISECONDS = 10;
    
    @Setter
    private static TimeService timeService = new TimeService();
    
    @Getter
    @Setter
    private Properties properties = new Properties();
    
    private int sequenceOffset = -1;
    
    private long sequence;
    
    private long lastMilliseconds;
    
    static {
        Calendar calendar = Calendar.getInstance();
        calendar.set(2016, Calendar.NOVEMBER, 1);
        calendar.set(Calendar.HOUR_OF_DAY, 0);
        calendar.set(Calendar.MINUTE, 0);
        calendar.set(Calendar.SECOND, 0);
        calendar.set(Calendar.MILLISECOND, 0);
        EPOCH = calendar.getTimeInMillis();
    }
    
    @Override
    public String getType() {
        return "SNOWFLAKE";
    }
    
    @Override
    public synchronized Comparable<?> generateKey() {
        long currentMilliseconds = timeService.getCurrentMillis();
        if (waitTolerateTimeDifferenceIfNeed(currentMilliseconds)) {
            currentMilliseconds = timeService.getCurrentMillis();
        }
        if (lastMilliseconds == currentMilliseconds) {
            if (0L == (sequence = (sequence + 1) & SEQUENCE_MASK)) {
                currentMilliseconds = waitUntilNextTime(currentMilliseconds);
            }
        } else {
            vibrateSequenceOffset();
            sequence = sequenceOffset;
        }
        lastMilliseconds = currentMilliseconds;
        return ((currentMilliseconds - EPOCH) << TIMESTAMP_LEFT_SHIFT_BITS) | (getWorkerId() << WORKER_ID_LEFT_SHIFT_BITS) | sequence;
    }
    
    @SneakyThrows
    private boolean waitTolerateTimeDifferenceIfNeed(final long currentMilliseconds) {
        if (lastMilliseconds <= currentMilliseconds) {
            return false;
        }
        long timeDifferenceMilliseconds = lastMilliseconds - currentMilliseconds;
        Preconditions.checkState(timeDifferenceMilliseconds < getMaxTolerateTimeDifferenceMilliseconds(), 
                "Clock is moving backwards, last time is %d milliseconds, current time is %d milliseconds", lastMilliseconds, currentMilliseconds);
        Thread.sleep(timeDifferenceMilliseconds);
        return true;
    }
    
    // read the worker id from properties
    private long getWorkerId() {
        long result = Long.valueOf(properties.getProperty("worker.id", String.valueOf(WORKER_ID)));
        Preconditions.checkArgument(result >= 0L && result < WORKER_ID_MAX_VALUE);
        return result;
    }
    
    private int getMaxVibrationOffset() {
        int result = Integer.parseInt(properties.getProperty("max.vibration.offset", String.valueOf(DEFAULT_VIBRATION_VALUE)));
        Preconditions.checkArgument(result >= 0 && result <= SEQUENCE_MASK, "Illegal max vibration offset");
        return result;
    }
    
    private int getMaxTolerateTimeDifferenceMilliseconds() {
        return Integer.valueOf(properties.getProperty("max.tolerate.time.difference.milliseconds", String.valueOf(MAX_TOLERATE_TIME_DIFFERENCE_MILLISECONDS)));
    }
    
    private long waitUntilNextTime(final long lastTime) {
        long result = timeService.getCurrentMillis();
        while (result <= lastTime) {
            result = timeService.getCurrentMillis();
        }
        return result;
    }
    
    private void vibrateSequenceOffset() {
        sequenceOffset = sequenceOffset >= getMaxVibrationOffset() ? 0 : sequenceOffset + 1;
    }
}

EPOCH = calendar.getTimeInMillis() computes the epoch: midnight on 2016/11/01, in milliseconds.

The logic of generateKey():

1. Check that the current time is not earlier than the timestamp of the last generated key; server clock synchronization can move time backwards, which would produce duplicate keys.
2. Obtain the sequence number; if the per-millisecond counter has reached its maximum, call waitUntilNextTime() to spin until the next millisecond.
3. Record this key's timestamp, used by the clock-rollback check above.
4. Assemble the key with bit operations.

One consequence of the code: if only one id were generated per millisecond and the 12-bit sequence stayed at 0, every id would be even. This is what the max.vibration.offset mechanism counteracts: on each new millisecond, vibrateSequenceOffset() alternates the starting sequence between 0 and 1 (with the default offset of 1).
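The parity effect is easy to check with the bit layout alone. A sketch, assuming at most one id per millisecond (constants as in the source above):

```java
public class EvenIdDemo {

    static final long TIMESTAMP_SHIFT = 22L;
    static final long WORKER_ID_SHIFT = 12L;

    // Same bit layout as the snowflake generator.
    static long compose(long timestampDelta, long workerId, long sequence) {
        return (timestampDelta << TIMESTAMP_SHIFT) | (workerId << WORKER_ID_SHIFT) | sequence;
    }

    public static void main(String[] args) {
        // Naive version, one id per millisecond: sequence is always 0,
        // so the lowest bit -- and hence the parity -- is always 0 (even).
        for (long ms = 1; ms <= 5; ms++) {
            long id = compose(ms, 0L, 0L);
            System.out.println(id % 2); // 0 every time
        }
        // With the offset vibrating between 0 and 1, parity alternates too.
        long odd = compose(6L, 0L, 1L);
        System.out.println(odd % 2); // 1
    }
}
```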

4.3 How should workerId be assigned?

Problem: the Snowflake algorithm requires every distributed node to have a unique workerId. How do we assign worker process ids?

The Twitter Snowflake algorithm itself is simple to implement and understand; the awkward part is allocating worker process ids and guaranteeing they are globally unique.

Solutions:
Derive the workerId from information such as the node's IP address or hostname, or allocate it through distributed-configuration middleware such as Zookeeper, Consul or Etcd.

Because sharding-jdbc's snowflake implementation is not especially complete, two blunt but practical strategies are:

  • In production projects, build a custom ID generator on top of Baidu's mature, high-performance snowflake ID library.

  • In learning projects, build a custom ID generator on top of the Crazy Maker Circle educational snowflake ID library.

