How good is MariaDB's flashback really? It can guard against accidental deletes, but it is not true flashback. Then a look at System-Versioned Tables in MariaDB 10.3

MariaDB introduced the flashback feature in 10.2.4. It can flash back DML operations (INSERT, DELETE, UPDATE) but not DDL statements, and it requires binlog_row_image=FULL.

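Unlike Oracle, which relies on undo, MariaDB's flashback works by rewriting binlog events: an INSERT is rewritten as a DELETE, a DELETE as an INSERT, and an UPDATE has its before and after values swapped. This is why binlog_row_image=FULL is required: the complete before and after images of every row must be present in the binlog.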
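The rewriting rule above can be sketched in a few lines of Python. This is only a toy model, not how mysqlbinlog is implemented: the RowEvent structure, the in-memory table keyed by an "id" column, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowEvent:
    """Hypothetical, simplified row-based binlog event with full row images."""
    kind: str                      # "INSERT", "DELETE" or "UPDATE"
    before: Optional[dict] = None  # row image before the change
    after: Optional[dict] = None   # row image after the change


def invert(event: RowEvent) -> RowEvent:
    """Build the event that undoes `event`."""
    if event.kind == "INSERT":
        return RowEvent("DELETE", before=event.after)
    if event.kind == "DELETE":
        return RowEvent("INSERT", after=event.before)
    if event.kind == "UPDATE":
        # Swap the before and after images.
        return RowEvent("UPDATE", before=event.after, after=event.before)
    raise ValueError(f"unsupported event kind: {event.kind}")


def flashback(events: list) -> list:
    """Invert every event and replay them newest first."""
    return [invert(e) for e in reversed(events)]


def apply_event(table: dict, event: RowEvent) -> None:
    """Apply one event to a toy table stored as {id: row}."""
    if event.kind in ("DELETE", "UPDATE"):
        del table[event.before["id"]]
    if event.kind in ("INSERT", "UPDATE"):
        table[event.after["id"]] = dict(event.after)
```

Without the full before image, the inverse of a DELETE or UPDATE could not be built, which is the reason for the binlog_row_image=FULL requirement.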

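By default mysqlbinlog generates redo SQL. With the new "--flashback" option it can instead generate the reverse SQL for everything since a given binlog position or point in time. Compare the two outputs:

[Figure: mysqlbinlog output with and without --flashback]

After that, running mysql < flashback.sql reverts all of those changes.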

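In practice, constrained by the MySQL architecture, this so-called flashback is a crude implementation and, strictly speaking, hardly deserves the name. Any senior developer with some time to spare could build it, and it does not support flashback queries.

It also needs access to the MySQL binlog, which is often difficult because operations policy may not let users reach the MySQL server directly. On Alibaba Cloud RDS this is even more restricted.

Another typical approach to point-in-time recovery relies on a replica with delayed replication. That is acceptable for recovering an OLTP system from a mistaken operation, but as a general undo mechanism it is asking too much, because it needs too much intrusive manual intervention. Unless there is no alternative we would not choose it; it is too fragile.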

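============================================================================================

Now, how do the system-versioned tables of MariaDB 10.3 fare?

First, a look at how various databases implement flashback:

[Figure: comparison of flashback implementations across databases]

In MariaDB, system-versioned tables were introduced in 10.3.4, following the SQL:2011 standard. At the time of writing, the latest release in the 10.3 series is MariaDB 10.3.10 Stable, and 10.3.7 was the first GA release:

[Figure: MariaDB 10.3 release list]

So the feature is fairly new; how well it holds up remains to be seen.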

The CREATE TABLE syntax has been extended to permit creating a system-versioned table. To be system-versioned, according to SQL:2011, a table must have two generated columns, a period, and a special table option clause:

CREATE TABLE t( x INT, start_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW START, end_timestamp TIMESTAMP(6) GENERATED ALWAYS AS ROW END, PERIOD FOR SYSTEM_TIME(start_timestamp, end_timestamp) ) WITH SYSTEM VERSIONING; 

In MariaDB one can also use a simplified syntax:

CREATE TABLE t ( x INT ) WITH SYSTEM VERSIONING; 

In the latter case no extra columns will be created and they won't clutter the output of, say, SELECT * FROM t. The versioning information will still be stored, and it can be accessed via the pseudo-columns ROW_START and ROW_END:

SELECT x, ROW_START, ROW_END FROM t; 

With the simplified syntax, existing SQL does not have to change, which matters a lot.

CREATE TABLE t (
   x INT
) WITH SYSTEM VERSIONING;
insert into t values(1),(2),(3);
insert into t values(4),(5),(6);

select now();
2018-10-23 11:58:54
select * from t;
delete from t;  
select * from t;   -- by default no rows are returned now

SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP '2018-10-23 11:58:54';

The historical versions are found.

As in Oracle, version history queries are also supported:

  • BETWEEN start AND end will show all rows that were visible at any point between two specified points in time. It works inclusively, a row visible exactly at start or exactly at end will be shown too.
SELECT * FROM t FOR SYSTEM_TIME BETWEEN (NOW() - INTERVAL 1 YEAR) AND NOW() 
  • FROM start TO end will also show all rows that were visible at any point between two specified points in time, including start, but excluding end.
SELECT * FROM t FOR SYSTEM_TIME FROM '2016-01-01 00:00:00' TO '2017-01-01 00:00:00' 

Additionally MariaDB implements a non-standard extension:

  • ALL will show all rows, historical and current.
SELECT * FROM t FOR SYSTEM_TIME ALL 

If the FOR SYSTEM_TIME clause is not used, the table will show the current data, as if one had specified FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP.
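To make the boundary rules of these clauses concrete, here is a small Python model. It assumes each row version is valid over the half-open interval [row_start, row_end) and that current rows carry a far-future row_end; the Version class and the function names are made up for this sketch and are not MariaDB APIs.

```python
from dataclasses import dataclass
from datetime import datetime

FOREVER = datetime.max  # assumed stand-in for the ROW_END of a current row


@dataclass
class Version:
    x: int
    row_start: datetime
    row_end: datetime = FOREVER


def as_of(rows, t):
    """FOR SYSTEM_TIME AS OF t: versions visible exactly at t."""
    return [r for r in rows if r.row_start <= t < r.row_end]


def between(rows, start, end):
    """FOR SYSTEM_TIME BETWEEN start AND end: both ends inclusive."""
    return [r for r in rows if r.row_start <= end and r.row_end > start]


def from_to(rows, start, end):
    """FOR SYSTEM_TIME FROM start TO end: start included, end excluded."""
    return [r for r in rows if r.row_start < end and r.row_end > start]


def all_rows(rows):
    """FOR SYSTEM_TIME ALL: historical and current versions."""
    return list(rows)
```

In this model a query without FOR SYSTEM_TIME is simply as_of(rows, now), which is why rows deleted earlier stop showing up while their history stays in the table.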

At first glance the feature looks good, far better than the half-hearted flashback.

-------------Now for performance-----------

Test on a local laptop:
create table big_table like information_schema.columns;
insert into big_table select * from information_schema.columns;
ALTER TABLE big_table ADD SYSTEM VERSIONING;
insert into big_table select * from big_table; -- run repeatedly to build about 240,000 rows
select * from big_table;
update big_table set table_catalog = CONCAT(table_catalog,'-','cata');
Affected rows: 239232
Time: 0.882s
select now();
2018-10-23 12:51:28
delete from big_table where ordinal_position < 3;
[SQL]delete from big_table where ordinal_position < 3;
Affected rows: 40576
Time: 0.176s
select now();
2018-10-23 12:54:30
update big_table set table_catalog = SUBSTR(table_catalog,4);
[SQL]update big_table set table_catalog = SUBSTR(table_catalog,4);
Affected rows: 198656
Time: 0.794s
select now();
2018-10-23 12:55:10
alter table big_table add column my_col int;
[Err] 4119 - Not allowed for system-versioned `test`.`big_table`. Change @@system_versioning_alter_history to proceed with ALTER.
set @@system_versioning_alter_history=KEEP;
alter table big_table add column my_col int;
[SQL]alter table big_table add column my_col int;
Affected rows: 677120
Time: 1.616s
select now();
2018-10-23 12:56:44
select * from big_table; -- the new column shows up in the result

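In this test the versioned table performed about the same as a non-versioned one. According to the official documentation, however, if history is kept in the same table (the default), performance degrades as history accumulates. There are two remedies: keep history in separate partitions, or purge it periodically.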

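Purge method 1 (dropping system versioning discards all history; adding it back starts a fresh history):

ALTER TABLE t DROP SYSTEM VERSIONING;
ALTER TABLE t ADD SYSTEM VERSIONING;

Purge method 2:

DELETE HISTORY FROM t;

Purge method 3:

DELETE HISTORY FROM t BEFORE SYSTEM_TIME '2016-10-09 08:07:06';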

Next, keeping history in separate partitions (purely hands-on; readers can try it themselves):

Storing the History Separately

When the history is stored together with the current data, it increases the size of the table, so current data queries — table scans and index searches — will take more time, because they will need to skip over historical data. If most queries on that table use only current data, it might make sense to store the history separately, to reduce the overhead from versioning.

This is done by partitioning the table by SYSTEM_TIME. Because of partition pruning optimization, all current data queries will only access one partition, the one that stores current data.

This example shows how to create such a partitioned table:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING PARTITION BY SYSTEM_TIME ( PARTITION p_hist HISTORY, PARTITION p_cur CURRENT ); 

In this example all history will be stored in the partition p_hist while all current data will be in the partition p_cur. The table must have exactly one current partition and at least one historical partition.

Partitioning by SYSTEM_TIME also supports automatic partition rotation. One can rotate historical partitions by time or by size. This example shows how to rotate partitions by size:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING PARTITION BY SYSTEM_TIME LIMIT 100000 ( PARTITION p0 HISTORY, PARTITION p1 HISTORY, PARTITION pcur CURRENT ); 

MariaDB will start writing history rows into partition p0, and when it reaches a size of 100000 rows, MariaDB will switch to partition p1. There are only two historical partitions, so when p1 overflows, MariaDB will issue a warning, but will continue writing into it.
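The rotation-by-size behaviour described above can be modelled roughly as follows. This is a sketch of the description only, not of MariaDB internals: the LimitRotation class is hypothetical, and the exact moment at which the server raises its warning is an assumption.

```python
class LimitRotation:
    """Toy model of PARTITION BY SYSTEM_TIME LIMIT n (history partitions only)."""

    def __init__(self, names, limit):
        self.names = list(names)            # e.g. ["p0", "p1"]
        self.limit = limit                  # rows per history partition
        self.counts = {n: 0 for n in self.names}
        self.active = 0                     # index of the partition being written
        self.warnings = 0

    def add_history_row(self):
        """Store one history row and return the partition it landed in."""
        last = len(self.names) - 1
        # Rotate to the next partition once the active one is full.
        while self.counts[self.names[self.active]] >= self.limit and self.active < last:
            self.active += 1
        name = self.names[self.active]
        if self.counts[name] >= self.limit:
            # Last partition is already full: warn, but keep writing into it.
            self.warnings += 1
        self.counts[name] += 1
        return name
```

With a limit of 3 and partitions p0 and p1, eight history rows end up as 3 in p0 and 5 in p1, with a warning counted for each row written past the limit.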

Similarly, one can rotate partitions by time:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING PARTITION BY SYSTEM_TIME INTERVAL 1 WEEK ( PARTITION p0 HISTORY, PARTITION p1 HISTORY, PARTITION p2 HISTORY, PARTITION pcur CURRENT ); 

This means that the history for the first week after the table was created will be stored in p0. The history for the second week — in p1, and all later history will go into p2. One can see the exact rotation time for each partition in the INFORMATION_SCHEMA.PARTITIONS table.
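Rotation by time can be pictured the same way. The sketch below assumes that the partition is chosen from how many whole intervals have passed since the table was created, and that a timestamp falling exactly on a boundary belongs to the later partition; the function name and that boundary rule are assumptions, and the real rotation times should be read from INFORMATION_SCHEMA.PARTITIONS.

```python
from datetime import datetime, timedelta


def interval_partition(created, interval, names, t):
    """Pick the history partition for a row that became history at time t."""
    index = (t - created) // interval   # whole intervals elapsed (timedelta // timedelta -> int)
    index = max(index, 0)               # clamp times before creation to the first partition
    # Everything after the last boundary goes into the last history partition.
    return names[min(index, len(names) - 1)]
```

For example, with a one-week interval and partitions p0, p1 and p2, a row that becomes history ten days after table creation lands in p1, and one a month later lands in p2.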

It is possible to combine partitioning by SYSTEM_TIME and subpartitions:

CREATE TABLE t (x INT) WITH SYSTEM VERSIONING PARTITION BY SYSTEM_TIME SUBPARTITION BY KEY (x) SUBPARTITIONS 4 ( PARTITION ph HISTORY, PARTITION pc CURRENT ); 

 

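Default system versioning has one problem: it does not respect transaction isolation (for example READ COMMITTED), so rows changed in the same transaction can end up with different ROW_START timestamps. The documentation mentions this too; see https://jira.mariadb.org/browse/MDEV-16236 for an example. To get transaction-precise history, define the table like this: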

CREATE TABLE demo_system_versioning (
  id INTEGER NOT NULL,
  data VARCHAR(255),
  start_ts BIGINT UNSIGNED GENERATED ALWAYS AS ROW START INVISIBLE,
  end_ts   BIGINT UNSIGNED GENERATED ALWAYS AS ROW END INVISIBLE,
  PERIOD FOR SYSTEM_TIME (start_ts, end_ts),
  PRIMARY KEY (id)
) WITH SYSTEM VERSIONING;
 
START TRANSACTION;
INSERT INTO demo_system_versioning (id, data) VALUES (1, 'X');
SELECT SLEEP(0.1);
INSERT INTO demo_system_versioning (id, data) VALUES (2, 'Y');
COMMIT;
 
START TRANSACTION;
INSERT INTO demo_system_versioning (id, data) VALUES (3, 'X');
SELECT SLEEP(0.1);
INSERT INTO demo_system_versioning (id, data) VALUES (4, 'Y');
COMMIT;


-- should return one row per transaction, because rows inserted in the same transaction share the same start_ts
SELECT COUNT(*), start_ts
  FROM demo_system_versioning
 GROUP BY start_ts;

START TRANSACTION;
delete from demo_system_versioning where id = 1;
SELECT SLEEP(0.1);
delete from demo_system_versioning where id = 3;
COMMIT;


START TRANSACTION;
INSERT INTO demo_system_versioning (id, data) VALUES (5, 'X');
SELECT SLEEP(0.1);
INSERT INTO demo_system_versioning (id, data) VALUES (6, 'Y');
COMMIT;

SELECT *
  FROM demo_system_versioning
FOR SYSTEM_TIME AS OF TIMESTAMP '2018-10-23 13:58:53' where id >2;

 

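Note: versions 10.3.8 and earlier report that the function trt_begin_ts does not exist.

Transaction-precise versioning also uses the mysql.transaction_registry table, defined as follows: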

Field | Type | Null | Key | Default | Description
transaction_id | bigint(20) unsigned | NO | Primary | NULL |
commit_id | bigint(20) unsigned | NO | Unique | NULL |
begin_timestamp | timestamp(6) | NO | Multiple | 0000-00-00 00:00:00.000000 | Timestamp when the transaction began (BEGIN statement), however see MDEV-16024.
commit_timestamp | timestamp(6) | NO | Multiple | 0000-00-00 00:00:00.000000 | Timestamp when the transaction was committed.
isolation_level | enum('READ-UNCOMMITTED','READ-COMMITTED','REPEATABLE-READ','SERIALIZABLE') | NO | | NULL | Transaction isolation level.
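Why a registry helps can be shown with a simplified Python model. It assumes that each row stores the ids of the transactions that created and ended it, that current rows carry the maximum BIGINT UNSIGNED value as their end id, and that an AS OF TIMESTAMP query is answered by looking up commit timestamps in the registry. The real server logic is more involved; every name below is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

MAX_TRX = 2**64 - 1  # assumed marker: no transaction has ended this row yet


@dataclass
class Row:
    id: int
    start_trx: int
    end_trx: int = MAX_TRX


def committed_by(registry, trx_id, t):
    """True if transaction trx_id is registered with a commit time at or before t."""
    commit_ts = registry.get(trx_id)
    return commit_ts is not None and commit_ts <= t


def as_of(rows, registry, t):
    """Rows whose creating transaction had committed by t and whose ending one had not."""
    return [
        r for r in rows
        if committed_by(registry, r.start_trx, t)
        and not committed_by(registry, r.end_trx, t)
    ]
```

Because visibility hinges on the commit time of the whole transaction, all rows written by one transaction appear or disappear together, no matter how long the statements inside it took.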
 
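Finally, the system variable system_versioning_alter_history controls whether DDL is allowed on versioned tables. By default it is not, and the ALTER fails with an error.

Based on these tests of flashback and system-versioned tables, flashback is essentially suitable only for novice scenarios, that is, databases built without any design.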

References:

https://modern-sql.com/blog/2018-08/whats-new-in-mariadb-10.3

https://jira.mariadb.org/browse/MDEV-16236

https://jira.mariadb.org/browse/MDEV-16024  

https://mariadb.com/kb/en/library/system-versioned-tables/

https://www.slideshare.net/MariaDB/m18-querying-data-at-a-previous-point-in-time
