Dynamic Monthly Table Sharding with JPA and Sharding-JDBC, from Scratch

Date: 2019-11-08

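Getting started

This post shows, from scratch, how to split a table into dynamic monthly tables with spring-data-jpa and sharding-jdbc. Let's get straight to it.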

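Requirements

Rows are written to the monthly table that matches their shard key (the insertion time), and queries are routed to a specific table by the value of the shard key. Having to include the shard key in every query is not very friendly, however, so a later section also explains how to query the two most recent months when no shard key is given.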

Preparation

Table DDL

-- Logical table; every monthly table is created from it
CREATE TABLE `EXAMPLE` (
  `ID` bigint(36) NOT NULL AUTO_INCREMENT,
  `NAME` varchar(255) NOT NULL,
  `CREATED` datetime(3) DEFAULT NULL,
  `UPDATED` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
-- Monthly tables
CREATE TABLE `EXAMPLE_201909` (
  `ID` bigint(36) NOT NULL AUTO_INCREMENT,
  `NAME` varchar(255) NOT NULL,
  `CREATED` datetime(3) DEFAULT NULL,
  `UPDATED` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `EXAMPLE_201910` (
  `ID` bigint(36) NOT NULL AUTO_INCREMENT,
  `NAME` varchar(255) NOT NULL,
  `CREATED` datetime(3) DEFAULT NULL,
  `UPDATED` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

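Entity class

import java.io.Serializable;
import java.util.Date;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;
import com.fasterxml.jackson.annotation.JsonFormat;
import lombok.Data;

@Entity
@Data
@Table(name = "EXAMPLE")
public class Example implements Serializable {
	private static final long serialVersionUID = 1L;
	@Id
	@GeneratedValue(strategy = GenerationType.IDENTITY)
	@Column(name = "ID")
	// Long matches the BIGINT column and the ID type declared in ExampleRepo
	private Long id;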
	@Column(name = "NAME")
	private String name;
	@JsonFormat(pattern = "yyyy-MM-dd HH:mm:ss.SSS", timezone = "GMT+8")
	@Column(name = "CREATED")
	private Date created;
	@Column(name = "UPDATED", insertable = false, updatable = false)
	private Date updated;
}

Repository

import java.util.Date;
import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.JpaSpecificationExecutor;
import com.test.sharding.entity.Example;

public interface ExampleRepo extends JpaRepository<Example, Long>, JpaSpecificationExecutor<Example> {
	List<Example> findByCreatedBetween(Date start, Date end);
}

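Maven dependencies

Tested with Spring Boot 2.0.x+ and 1.5.x+. Note that the test code further down also uses fastjson (com.alibaba:fastjson), which is not in the list below.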

<dependency>
			<groupId>mysql</groupId>
			<artifactId>mysql-connector-java</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-jpa</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-devtools</artifactId>
			<scope>runtime</scope>
			<optional>true</optional>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-configuration-processor</artifactId>
			<optional>true</optional>
		</dependency>
		<dependency>
			<groupId>org.projectlombok</groupId>
			<artifactId>lombok</artifactId>
			<optional>true</optional>
		</dependency>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-test</artifactId>
			<scope>test</scope>
			<exclusions>
				<exclusion>
					<groupId>org.junit.vintage</groupId>
					<artifactId>junit-vintage-engine</artifactId>
				</exclusion>
			</exclusions>
		</dependency>
		<dependency>
			<groupId>io.shardingsphere</groupId>
			<artifactId>sharding-jdbc-spring-boot-starter</artifactId>
			<version>3.0.0</version>
		</dependency>
		<dependency>
			<groupId>cn.hutool</groupId>
			<artifactId>hutool-all</artifactId>
			<version>4.6.7</version>
		</dependency>
		<dependency>
			<groupId>org.apache.commons</groupId>
			<artifactId>commons-lang3</artifactId>
		</dependency>
		<dependency>
			<groupId>com.alibaba</groupId>
			<artifactId>druid</artifactId>
			<version>1.1.20</version>
		</dependency>


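Sharding algorithm implementation

Because the chosen sharding strategy is StandardShardingStrategy (set in the configuration file shown later), the following two sharding algorithms need to be implemented:

Precise sharding algorithm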

import java.util.Collection;
import java.util.Date;
import cn.hutool.core.date.DateUtil;
import io.shardingsphere.api.algorithm.sharding.PreciseShardingValue;
import io.shardingsphere.api.algorithm.sharding.standard.PreciseShardingAlgorithm;

public class MyPreciseShardingAlgorithm implements PreciseShardingAlgorithm<Date> {
	// Could be extracted into a shared constant
	private static final String yearAndMonth = "yyyyMM";

	@Override
	public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<Date> shardingValue) {
		StringBuffer tableName = new StringBuffer();
		tableName.append(shardingValue.getLogicTableName()).append("_")
				.append(DateUtil.format(shardingValue.getValue(), yearAndMonth));
		return tableName.toString();
	}
}
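
The naming rule behind the precise algorithm can be tried out without Sharding-JDBC on the classpath. The following is a minimal sketch, assuming the same "yyyyMM" suffix as above; the class `MonthlyTableNames` and its method `tableFor` are hypothetical names introduced only for illustration, and `java.time` is used here in place of hutool's `DateUtil`.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper (not part of Sharding-JDBC): derives a monthly table name. */
final class MonthlyTableNames {
    // Same "yyyyMM" suffix pattern as the sharding algorithms in this post
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMM");

    private MonthlyTableNames() {
    }

    /** Joins the logical table name and the month of the given day with an underscore. */
    static String tableFor(String logicTable, LocalDate day) {
        return logicTable + "_" + day.format(SUFFIX);
    }
}
```

For example, `MonthlyTableNames.tableFor("EXAMPLE", LocalDate.of(2019, 9, 15))` yields `EXAMPLE_201909`, one of the monthly tables created earlier.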

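Range sharding algorithm

import java.util.Collection;
import java.util.Date;
import java.util.LinkedHashSet;
import com.google.common.collect.Range;
import cn.hutool.core.date.DateUtil;
import io.shardingsphere.api.algorithm.sharding.RangeShardingValue;
import io.shardingsphere.api.algorithm.sharding.standard.RangeShardingAlgorithm;

public class TimeRangeShardingAlgorithm implements RangeShardingAlgorithm<Date> {
	private static final String yearAndMonth = "yyyyMM";

	/** Queries at most two months of data: the months of the two endpoints of the range. */
	@Override
	public Collection<String> doSharding(Collection<String> availableTargetNames, RangeShardingValue<Date> shardingValue) {
		Collection<String> result = new LinkedHashSet<String>();
		Range<Date> range = shardingValue.getValueRange();
		// Month suffix of the lower bound of the range
		String start = DateUtil.format(range.lowerEndpoint(), yearAndMonth);
		// Month suffix of the upper bound of the range
		String end = DateUtil.format(range.upperEndpoint(), yearAndMonth);
		// Add the upper-bound month first, then the lower-bound month if it differs
		result.add(shardingValue.getLogicTableName() + "_" + end);
		if (!start.equals(end)) {
			result.add(shardingValue.getLogicTableName() + "_" + start);
		}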
		return result;
	}

}
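
The range algorithm above only returns the tables for the two endpoint months, so a range that spans more than two months skips the months in between. If every month in the range should be queried, the table list could be built as in the sketch below. It is an illustration under stated assumptions, not the author's implementation: `MonthRangeTables` and `tablesBetween` are hypothetical names, and the bounds are passed as `java.time.YearMonth` values rather than the `Date` endpoints used above.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.Collection;
import java.util.LinkedHashSet;

/** Hypothetical helper: lists one monthly table per month in an inclusive range. */
final class MonthRangeTables {
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMM");

    private MonthRangeTables() {
    }

    /** Returns the tables from the month {@code from} to the month {@code to}, oldest first. */
    static Collection<String> tablesBetween(String logicTable, YearMonth from, YearMonth to) {
        Collection<String> result = new LinkedHashSet<>();
        // Walk month by month; the loop body never runs when from is after to
        for (YearMonth month = from; !month.isAfter(to); month = month.plusMonths(1)) {
            result.add(logicTable + "_" + month.format(SUFFIX));
        }
        return result;
    }
}
```

A real algorithm would probably also need to drop months for which no table has been created yet, since querying a missing table fails.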

application.yml configuration

spring:
  datasource: # Optional: once sharding is configured, only the sharding data source takes effect by default
    type: com.alibaba.druid.pool.DruidDataSource
    url: jdbc:mysql://localhost:3306/ddssss
    username: root
    password: ppppppp
    tomcat:
      initial-size: 5
    driver-class-name: com.mysql.jdbc.Driver
  jpa:
    database: mysql
sharding:
  jdbc:
    datasource:
      names: month-0 # Data source name
      month-0:
        driver-class-name: com.mysql.jdbc.Driver
        url: jdbc:mysql://localhost:3306/ddssss
        username: root
        password: ppppppp
        type: com.alibaba.druid.pool.DruidDataSource
    config:
      sharding:
        tables:
          EXAMPLE: # Logical table name
            key-generator-column-name: id # Primary key column
            table-strategy:
              standard:
                sharding-column: created # Shard key
                precise-algorithm-class-name: com.example.sharding.config.MyPreciseShardingAlgorithm # Fully qualified name of the implementation class
                range-algorithm-class-name: com.example.sharding.config.TimeRangeShardingAlgorithm # Fully qualified name of the implementation class
        props:
          sql.show: true # Whether to print SQL; defaults to false

Test

import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import javax.persistence.criteria.Predicate;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.data.jpa.domain.Specification;
import org.springframework.stereotype.Component;
import com.alibaba.fastjson.JSONObject;
import com.test.sharding.entity.Example;
import com.test.sharding.repository.ExampleRepo;
import cn.hutool.core.date.DateUtil;
import lombok.extern.slf4j.Slf4j;

@Component
@Slf4j
public class StartRunner implements CommandLineRunner {
	@Autowired
	ExampleRepo exampleRepo;

	@Override
	public void run(String... args) throws Exception {
		log.info("==============init===================");
		Example example = new Example();
		example.setName("my name");
		example.setCreated(new Date());
		exampleRepo.save(example);
		log.info("example:{}", JSONObject.toJSONString(example));
		// Plain query by example
		List<Example> list = exampleRepo.findAll(org.springframework.data.domain.Example.<Example>of(example));
		log.info("normal list :{}", JSONObject.toJSONString(list));
		// Dynamic query built with a Specification
		Example condition = new Example();
		condition.setCreated(example.getCreated());
		list = exampleRepo.findAll(getIdSpecification(condition));
		log.info("dynamic list :{}", JSONObject.toJSONString(list));
		// Range query
		Date end = new Date();
		list = exampleRepo.findByCreatedBetween(DateUtil.lastMonth()
				.toJdkDate(), end);
		log.info("range select list :{}", JSONObject.toJSONString(list));
	}

	protected Specification<Example> getIdSpecification(final Example condition) {
		return (root, query, cb) -> {
			List<Predicate> list = new ArrayList<>();
			list.add(cb.equal(root.<Date>get("created"), condition.getCreated()));
			Predicate[] predicates = new Predicate[list.size()];
			query.where(list.toArray(predicates));
			return query.getRestriction();
		};
	}
}

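After startup you will see the log output, followed by the resulting database, tables, and data.
[Figures: startup log; database; tables; data]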

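Postscript

This implements inserts and queries against time-based monthly tables, but several small problems remain in practice: the save method still performs an INSERT rather than an UPDATE when the primary key is set, queries must include the shard key, and the subsequent monthly tables have to be created by hand.

These three problems call for further optimization.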

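Causes of the problems

Why save still performs an INSERT rather than an UPDATE when the primary key is set

When the primary key is not null, JPA's save first queries the table to check whether that primary key exists. That query filters on the primary key only, without the shard key, and Sharding-JDBC's strategy when no shard key is given is to query all of the shard tables.

There is a misconception here, though: Sharding-JDBC querying all of the shard tables applies to fixed sharding. Suppose there is another table that is sharded by whether the ID is odd or even, giving two tables. All of the data lives in those two tables, and the configuration lists those two tables directly.

That does not fit our requirement, because our tables are split by time and a new table appears every month of every year. For a query without a shard key value, Sharding-JDBC therefore queries only the logical table by default. The result is empty, JPA concludes that no row exists for that primary key, and the resulting SQL is an INSERT rather than an UPDATE.

Why queries must include the shard key

The reason is the same as above: without a shard key, Sharding-JDBC queries only the logical table.

Subsequent monthly tables have to be created by hand

A table must exist for every month; that much is certain. You could of course create several years' worth of tables up front, but I see little point in that. Repetitive work like this should be left to the program, which can create the monthly tables on a schedule.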

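Solutions

For problems 1 and 2, I rewrote Sharding-JDBC's routing rule directly, which solves both, at least for rows in the two most recent months.

Rewriting the routing rule

This requires modifying the routeTables method of class io.shardingsphere.core.routing.type.standard.StandardRoutingEngine and declaring a static variable that records which logical tables are split by month. The code is as follows: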

// Date format for the table suffix
private static String yearAndMonth = "yyyyMM";
// Logical tables that are split by month
private static final Set<String> needRoutTables = new HashSet<>(
			Lists.newArrayList("EXAMPLE"));

private Collection<DataNode> routeTables(final TableRule tableRule, final String routedDataSource, final List<ShardingValue> tableShardingValues) {
		Collection<String> availableTargetTables = tableRule.getActualTableNames(routedDataSource);
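		// Routed tables produced by the sharding algorithm; with dynamic tables, a condition without the shard key yields only the logical table (EXAMPLE in this post)
		Collection<String> routedTables = new LinkedHashSet<>(tableShardingValues.isEmpty() ? availableTargetTables
				: shardingRule.getTableShardingStrategy(tableRule)
						.doSharding(availableTargetTables, tableShardingValues));
		// Only a single routed table needs checking: two or more tables imply that the shard key was specified (joins across sharded tables are discouraged)
		if (routedTables.size() == 1) {
			// Get the routed (logical) table name
			String routeTable = routedTables.iterator()
					.next();
			// Check whether this table is split by month
			if (needRoutTables.contains(routeTable)) {
				// Remove the logical table
				routedTables.remove(routeTable);
				Date now = new Date();
				// Month suffixes; default to the two most recent months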
				String nowSuffix = DateUtil.format(now, yearAndMonth);
				String lastMonthSuffix = DateUtil.format(DateUtil.lastMonth(), yearAndMonth);
				routedTables.add(routeTable + "_" + nowSuffix);
				routedTables.add(routeTable + "_" + lastMonthSuffix);
			}
		}
		Preconditions.checkState(!routedTables.isEmpty(), "no table route info");
		Collection<DataNode> result = new LinkedList<>();
		for (String each : routedTables) {
			result.add(new DataNode(routedDataSource, each));
		}
		return result;
	}
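
The fallback at the heart of this change, replacing the logical table with the tables of the current and the previous month, can be checked in isolation. The sketch below is only an illustration: `RecentMonthFallback` and `recentTables` are hypothetical names, and the current month is passed in as a parameter (instead of calling `new Date()` as the patched method does) so that the year boundary can be tested.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: the two monthly tables used when no shard key is given. */
final class RecentMonthFallback {
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMM");

    private RecentMonthFallback() {
    }

    /** Returns the table of the current month followed by the table of the previous month. */
    static List<String> recentTables(String logicTable, YearMonth current) {
        // minusMonths rolls over the year boundary, e.g. 2020-01 becomes 2019-12
        YearMonth previous = current.minusMonths(1);
        return Arrays.asList(
                logicTable + "_" + current.format(SUFFIX),
                logicTable + "_" + previous.format(SUFFIX));
    }
}
```

Calling `RecentMonthFallback.recentTables("EXAMPLE", YearMonth.of(2020, 1))` gives `EXAMPLE_202001` and `EXAMPLE_201912`. Both tables have to exist, which is what the scheduled table creation discussed next is for.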

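For problem 3, the program creates the tables on a schedule. I did not use a fully spelled-out CREATE TABLE statement like this one:

-- ****** is the date, replaced dynamically by the program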
CREATE TABLE `EXAMPLE_******` (
  `ID` bigint(36) NOT NULL AUTO_INCREMENT,
  `NAME` varchar(255) NOT NULL,
  `CREATED` datetime(3) DEFAULT NULL,
  `UPDATED` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`ID`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

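There are two main reasons:

In a typical project, tables rarely have this few columns, so the CREATE TABLE statement would be very long.
It is also hard to maintain later: every change to the table would have to be mirrored in the program.

Instead, I create the table from a template. The SQL is:

-- ****** is the date, replaced dynamically by the program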
CREATE TABLE IF NOT EXISTS `EXAMPLE_******` LIKE `EXAMPLE`
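
Replacing the ****** placeholder in the program might look like the sketch below. It only builds the statement text; running it on a schedule (for instance from a Spring `@Scheduled` method) and executing it against the data source are left out. The names `MonthlyTableDdl` and `createNextMonthSql` are hypothetical, and creating next month's table ahead of time is an assumption about how the job would be used.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: builds the template-based DDL for next month's table. */
final class MonthlyTableDdl {
    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("yyyyMM");

    private MonthlyTableDdl() {
    }

    /** Builds CREATE TABLE IF NOT EXISTS `T_yyyyMM` LIKE `T` for the month after {@code current}. */
    static String createNextMonthSql(String templateTable, YearMonth current) {
        // The table name is concatenated into the SQL text, so accept plain identifiers only
        if (!templateTable.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unexpected table name: " + templateTable);
        }
        String target = templateTable + "_" + current.plusMonths(1).format(SUFFIX);
        return "CREATE TABLE IF NOT EXISTS `" + target + "` LIKE `" + templateTable + "`";
    }
}
```

Because the statement uses IF NOT EXISTS, running the job more than once in the same month does no harm.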

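Creating the monthly table from the template keeps the statement short, and the program no longer has to know the table structure. It raises a new problem, though: Sharding-JDBC does not support this syntax, so the source code has to be modified once more to relax the parsing rule. Specifically, this is the parse method of class io.shardingsphere.core.parsing.parser.sql.ddl.create.table.AbstractCreateTableParser: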

public final DDLStatement parse() {
		lexerEngine.skipAll(getSkippedKeywordsBetweenCreateIndexAndKeyword());
		lexerEngine.skipAll(getSkippedKeywordsBetweenCreateAndKeyword());
		CreateTableStatement result = new CreateTableStatement();
		if (lexerEngine.skipIfEqual(DefaultKeyword.TABLE)) {
			lexerEngine.skipAll(getSkippedKeywordsBetweenCreateTableAndTableName());
		} else {
			throw new SQLParsingException("Can't support other CREATE grammar unless CREATE TABLE.");
		}
		tableReferencesClauseParser.parseSingleTableWithoutAlias(result);
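		// Commented out so that a statement without a column list (CREATE TABLE ... LIKE) is accepted
		// lexerEngine.accept(Symbol.LEFT_PAREN);
		do {
			parseCreateDefinition(result);
		} while (lexerEngine.skipIfEqual(Symbol.COMMA));
		// Commented out for the same reason
		// lexerEngine.accept(Symbol.RIGHT_PAREN);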
		return result;
	}