Competition diary, day 17: TiDB performance challenge, tikv/pd#2950

changelog

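2020/10/27 day17 Moved my notes to Yuque. It works well for plain technical notes: notes on the left, flowcharts on the right.
2020/10/26 day16 Kept translating the YugabyteDB docs. When the task is spread over several days instead of trying to translate everything in one day, each day's understanding builds a little on the previous day's.
2020/10/25 day14 Finished translating the YugabyteDB docs.
2020/10/24 day13 Interrupted.
2020/10/23 day12 Interrupted.
2020/10/22 day11 Got familiar with using redis-py; I had not worked with Redis before.
2020/10/21 day10 Coasted; nothing much done.
2020/10/20 day9 The evening was supposed to go to translating YugabyteDB's Colocated tables page, but I wanted to translate the whole chapter, kept putting it off, and ended up translating nothing. So it is more reliable to translate the small Colocated tables section first, and not to rush into running code. Once I have read Colocated tables, I should be able to write the docs by following the same pattern.
2020/10/19 day8 Overdid it: stayed up late to finish watching the Zoom walkthrough. The notes and flowchart are in the earlier notes. Drawing a flowchart does help somewhat.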

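Background

Translation: Colocated tables | YugabyteDB Docs

Colocated tables: how should the term be translated? Candidates: "shared tables", "co-located (same office) tables", or "host tables"?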

In workloads that need lower throughput and have a small data set, the bottleneck shifts from CPU/disk/network to the number of tablets that should be hosted per node. Since each table by default requires at least one tablet per node, a YugabyteDB cluster with 5000 relations (which includes tables and indexes) will result in 5000 tablets per node. There are practical limitations to the number of tablets that YugabyteDB can handle per node since each tablet adds some CPU, disk, and network overhead. If most or all of the tables in YugabyteDB cluster are small tables, then having separate tablets for each table unnecessarily adds pressure on CPU, network and disk.

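In workloads that need lower throughput and have a small data set, the bottleneck shifts from CPU/disk/network to the number of tablets each node has to host. Because every table by default needs at least one tablet per node, a YugabyteDB cluster with 5000 relations (tables and indexes) ends up with 5000 tablets per node. There is a practical limit on how many tablets a YugabyteDB node can handle, since every additional tablet adds some CPU, disk, and network overhead. If most or all tables in the cluster are small, giving each table its own tablets puts unnecessary pressure on CPU, network, and disk.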

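Question: why does this say every node ends up needing 5000 tablets? Does each node really have to hold 5000 tablets? For example, with 5 nodes, if the first node stores 15000 and the other three nodes are used for backup, would that not also mean every node has to hold 5000 tablets?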
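As a rough sanity check on the numbers in the passage, here is a minimal Python sketch. It encodes only what the quoted text states: by default each relation needs at least one tablet per node, and a colocated database keeps all of its relations in one colocation tablet. The function name and parameters are hypothetical, and the model deliberately ignores the replication factor and any sharding settings, so treat the result as a lower bound rather than what a real cluster reports.

```python
def tablets_per_node(relations: int, colocated: bool = False,
                     tablets_per_relation_per_node: int = 1) -> int:
    """Lower bound on the tablets one node hosts (simplified, hypothetical model)."""
    if relations < 0 or tablets_per_relation_per_node < 1:
        raise ValueError("relations must be >= 0 and tablets per relation >= 1")
    if colocated:
        # All relations of the colocated database share a single colocation tablet,
        # so a node hosts at most one tablet for them.
        return 1 if relations > 0 else 0
    # Default: every relation brings at least one tablet of its own to each node.
    return relations * tablets_per_relation_per_node


if __name__ == "__main__":
    print("without colocation:", tablets_per_node(5000))
    print("with colocation:   ", tablets_per_node(5000, colocated=True))
```

Under these assumptions, the 5000 relations from the docs map to 5000 tablets per node without colocation and to a single colocation tablet with it.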

To help accommodate such relational tables and workloads, YugabyteDB supports colocating SQL tables. Colocating tables puts all of their data into a single tablet, called the colocation tablet. This can dramatically increase the number of relations (tables, indexes, etc) that can be supported per node while keeping the number of tablets per node low. Note that all the data in the colocation tablet is still replicated across three nodes (or whatever the replication factor is). Large tablets can be dynamically split at a future date if there is need to serve more throughput over a larger data set.

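To accommodate such relational tables and workloads, YugabyteDB supports colocating SQL tables. Colocating tables puts all of their data into a single tablet, called the colocation tablet. This can greatly increase the number of relations (tables, indexes, etc.) each node can support while keeping the number of tablets per node low. Note that all data in the colocation tablet is still replicated across three nodes (or whatever the replication factor is). If more throughput over a larger data set is needed later, large tablets can be split dynamically.

In other words, several small tables are put into one tablet.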

Motivation

This feature is desirable in a number of scenarios, some of which are described below.

Small datasets needing HA or geo-distribution

Applications that have a smaller dataset may fall into the following pattern:

They require a large number of tables, indexes and other relations created in a single database.
The size of the entire dataset is small. Typically, this entire database is less than 500 GB in size.
Need high availability and/or geographic data distribution.
Scaling the dataset or the number of IOPS is not an immediate concern.

In this scenario, it is undesirable to have the small dataset spread across multiple nodes because this might affect performance of certain queries due to more network hops (for example, joins).

Example: User identity service for a global application. The user dataset size may not be too large, but is accessed in a relational manner, requires high availability and might need to be geo-distributed for low latency access.

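Motivation: this feature is desirable in a number of scenarios, some of which are described below.

Small datasets that need HA or geographic distribution: applications with a smaller dataset may fit the following pattern.

They need a large number of tables, indexes, and other relations created in a single database. The whole dataset is small; typically the entire database is under 500 GB. They need high availability and/or geographic data distribution. Scaling the dataset or the number of IOPS is not an immediate concern. In this case, spreading the small dataset across multiple nodes is undesirable, because the extra network hops can hurt the performance of some queries (for example, joins).

**Example:** A user identity service for a global application. The user dataset may not be very large, but it is accessed relationally, needs high availability, and may need to be geo-distributed for low-latency access.

This is somewhat like a CDN.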

Large datasets - a few large tables with many small tables

Applications that have a large dataset may fall into the pattern where:

They need a large number of tables and indexes.
A handful of tables are expected to grow large, needing to be scaled out.
The rest of the tables will continue to remain small.

In this scenario, only the few large tables would need to be sharded and scaled out. All other tables would benefit from colocation because queries involving all tables, except the larger ones, would not need network hops.

Example: An IoT use case, where one table records the data from the IoT devices while there are a number of other tables that store data pertaining to user identity, device profiles, privacy, etc.

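Applications with a large dataset may fit this pattern:

They need a large number of tables and indexes. A few tables are expected to grow large and need to be scaled out. The rest of the tables stay small.

In this case only the few large tables need to be sharded and scaled out. All other tables benefit from colocation, because queries over any tables other than the large ones need no network hops.

**Example:** An IoT use case, where one table records the data coming from the IoT devices while a number of other tables store data about user identity, device profiles, privacy, and so on.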

Scaling the number of databases, each database with a small dataset

There may be scenarios where the number of databases grows rapidly, while the dataset of each database is small. This is characteristic of a microservices-oriented architecture, where each microservice needs its own database. These microservices are hosted in dev, test, staging, production and other environments. The net result is a lot of small databases, and the need to be able to scale the number of databases hosted. Colocated tables allow for the entire dataset in each database to be hosted in one tablet, enabling scalability of the number of databases in a cluster by simply adding more nodes.

Example: Multi-tenant SaaS services where one database is created per customer. As new customers are rapidly on-boarded, it becomes necessary to add more databases quickly while maintaining high-availability and fault-tolerance of each database.

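Scaling the number of databases, each with a small dataset: there may be scenarios where the number of databases grows quickly while each database's dataset stays small. This is typical of a microservices-oriented architecture, where every microservice needs its own database. These microservices are hosted in dev, test, staging, production, and other environments. The net result is many small databases and a need to scale the number of databases hosted. Colocated tables allow the whole dataset of each database to be hosted in one tablet, so the number of databases in a cluster can be scaled simply by adding more nodes.

**Example:** Multi-tenant SaaS services that create one database per customer. As new customers are onboarded quickly, more databases have to be added quickly while keeping each database highly available and fault tolerant.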

Tradeoffs

Fundamentally, colocated tables have the following tradeoffs:

Higher performance - no network reads for joins. All of the data across the various colocated tables is local, which means joins no longer have to read data over the network. This improves the speed of joins.
Support higher number of tables - using fewer tablets. Because multiple tables and indexes can share one underlying tablet, a much higher number of tables can be supported using colocated tables.
Lower scalability - until removal from colocation tablet. The assumptions behind tables that are colocated is that their data need not be automatically sharded and distributed across nodes. If it is known a priori that a table will get large, it can be opted out of the colocation tablet at creation time. If a table already present in the colocation tablet gets too large, it can dynamically be removed from the colocation tablet to enable splitting it into multiple tablets, allowing it to scale across nodes.

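Tradeoffs: fundamentally, colocated tables have the following tradeoffs:

**Higher performance - no network reads for joins.** All data in the colocated tables is local, so joins no longer read data over the network, which makes them faster.
**Support for more tables - using fewer tablets.** Because multiple tables and indexes can share one underlying tablet, colocated tables can support a much larger number of tables.
**Lower scalability - until removal from the colocation tablet.** The assumption behind colocated tables is that their data does not need to be automatically sharded and distributed across nodes. If it is known in advance that a table will grow large, it can be opted out of the colocation tablet at creation time. If a table already in the colocation tablet grows too large, it can be removed from the colocation tablet dynamically so that it can be split into multiple tablets and scale across nodes.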

Usage

To learn more about using this feature, see Explore colocated tables.

Usage: to learn more about using this feature, see Explore colocated tables.

What's next?

For more information, see the architecture for colocated tables.

What's next: for more information, see the colocated tables architecture.

Translation: Explore colocated tables on Linux | YugabyteDB Docs

In workloads that do very little IOPS and have a small data set, the bottleneck shifts from CPU/disk/network to the number of tablets one can host per node. Since each table by default requires at least one tablet per node, a YugabyteDB cluster with 5000 relations (tables, indexes) will result in 5000 tablets per node. There are practical limitations to the number of tablets that YugabyteDB can handle per node since each tablet adds some CPU, disk and network overhead. If most or all of the tables in YugabyteDB cluster are small tables, then having separate tablets for each table unnecessarily adds pressure on CPU, network and disk.

To help accommodate such relational tables and workloads, you can colocate SQL tables. Colocating tables puts all of their data into a single tablet, called the colocation tablet. This can dramatically increase the number of relations (tables, indexes, etc.) that can be supported per node while keeping the number of tablets per node low. Note that all the data in the colocation tablet is still replicated across three nodes (or whatever the replication factor is).

This tutorial uses the yb-ctl local cluster management utility.

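In workloads with very little IOPS and a small data set, the bottleneck shifts from CPU/disk/network to the number of tablets each node can host. Because every table by default needs at least one tablet per node, a YugabyteDB cluster with 5000 relations (tables, indexes) ends up with 5000 tablets per node. There is a practical limit on how many tablets a YugabyteDB node can handle, since each tablet adds some CPU, disk, and network overhead. If most or all tables in the cluster are small, giving each table its own tablets puts unnecessary pressure on CPU, network, and disk.

To accommodate such relational tables and workloads, you can colocate SQL tables. Colocating tables puts all of their data into a single tablet, called the colocation tablet. This can greatly increase the number of relations (tables, indexes, etc.) each node can support while keeping the number of tablets per node low. Note that all data in the colocation tablet is still replicated across three nodes (or whatever the replication factor is).

This tutorial uses the yb-ctl local cluster management utility.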

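Create a universe

./bin/yb-ctl create # creates a local universe (a local cluster)

Create a colocated database (why a colocated database rather than a colocated tablet?)

Connect to the cluster using ysqlsh (what is this for?)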

./bin/ysqlsh -h 127.0.0.1

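Create a database with the colocated = true option, that is, append WITH colocated = true to the statement:

yugabyte=# CREATE DATABASE northwind WITH colocated = true;

This creates a database named northwind whose tables all live in a single tablet.

Create tables

Connect to the northwind database and create tables with the standard CREATE TABLE command. Because the database was created with colocated = true, these tables are all colocated on a single tablet.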

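-- \c connects to the northwind database (a database, not a table)
\c northwind
CREATE TABLE customers (
    customer_id bpchar,
    company_name character varying(40) NOT NULL, -- character varying(n): variable-length string of at most n characters
    contact_title character varying(30),
  PRIMARY KEY(customer_id ASC) -- ASC: ascending sort order for the primary key column
);
CREATE TABLE categories (
    category_id smallint, -- each object has many attributes; a table stores many objects of the same kind
    category_name character varying(15) NOT NULL,
    description text,
  PRIMARY KEY(category_id ASC)
);
-- Another table; together these tables model the objects involved in one business domain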
CREATE TABLE suppliers (
    supplier_id smallint,
    company_name character varying(40) NOT NULL,
    contact_name character varying(30),
    contact_title character varying(30),
  PRIMARY KEY(supplier_id ASC)
);
-- Products: this object relates to several other objects at once
CREATE TABLE products (
    product_id smallint,
    product_name character varying(40) NOT NULL,
    supplier_id smallint,
    category_id smallint,
    quantity_per_unit character varying(20),
    unit_price real,
  PRIMARY KEY(product_id ASC),
  FOREIGN KEY (category_id) REFERENCES categories,
  FOREIGN KEY (supplier_id) REFERENCES suppliers
);

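If you open the tables view in the master UI, you will see that all the tables share the same tablet.

Opt out table from colocation. This looks a lot like anti-affinity, so it seems I still need to read the placement rules code in PD.

YugabyteDB lets you opt a table out of colocation. In that case the table uses its own set of tablets instead of the tablet shared by the colocated database. This is useful for scaling out tables that may become large. You do this with the colocated = false option when creating the table.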

CREATE TABLE orders (
    order_id smallint NOT NULL PRIMARY KEY,
    customer_id bpchar,
    order_date date,
    ship_address character varying(60),
    ship_city character varying(15),
    ship_postal_code character varying(10),
    FOREIGN KEY (customer_id) REFERENCES customers
) WITH (colocated = false);
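To make the opt-out decision concrete, here is a small Python sketch that only builds statement text. It assumes a hypothetical row-count threshold for "tables that may become large" and appends the WITH (colocated = false) option shown above when a table is expected to exceed it. The function names and the threshold are made up for illustration, and identifiers are not escaped, so this is not meant for untrusted input.

```python
from typing import Sequence

# Hypothetical cut-off; pick a value that matches your own workload.
LARGE_TABLE_ROWS = 10_000_000


def colocation_clause(expected_rows: int, threshold: int = LARGE_TABLE_ROWS) -> str:
    """Return the table option that opts a large table out of colocation."""
    return " WITH (colocated = false)" if expected_rows >= threshold else ""


def create_table_sql(name: str, column_defs: Sequence[str], expected_rows: int) -> str:
    """Build a CREATE TABLE statement for a table in a colocated database."""
    body = ",\n    ".join(column_defs)
    return f"CREATE TABLE {name} (\n    {body}\n){colocation_clause(expected_rows)};"


if __name__ == "__main__":
    # A table expected to grow large gets its own tablets.
    print(create_table_sql(
        "orders",
        ["order_id smallint NOT NULL PRIMARY KEY", "customer_id bpchar"],
        expected_rows=50_000_000,
    ))
    # A small table stays in the colocation tablet.
    print(create_table_sql("categories", ["category_id smallint"], expected_rows=100))
```

The generated text can be pasted into ysqlsh; whether a row count is the right signal for opting out is a design choice this sketch does not settle.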

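If you open the tables view in the master UI, you will see that the orders table has its own set of tablets.

Translation: yugabyte-db/ysql-colocated-tables.md at master · yugabyte/yugabyte-db

Reading and writing data in colocated tables

You can use standard YSQL DML statements to read and write data in colocated tables. The YSQL query planner and executor take care of routing the data to the correct tablet.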
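The routing idea can be pictured with a toy Python model. This is not how YugabyteDB implements it; the class, the tablet ids, and the method names are all hypothetical. It only shows the bookkeeping implied by the docs: colocated tables map to one shared tablet, opted-out tables map to their own, and a query over colocated tables touches a single tablet.

```python
from dataclasses import dataclass, field
from typing import Dict

COLOCATION_TABLET = "colocation-tablet"  # hypothetical tablet id


@dataclass
class ColocatedDatabase:
    """Toy model of table-to-tablet routing in a colocated database."""
    name: str
    table_to_tablet: Dict[str, str] = field(default_factory=dict)

    def create_table(self, table: str, colocated: bool = True) -> None:
        # Colocated tables share one tablet; opted-out tables get their own.
        tablet = COLOCATION_TABLET if colocated else f"{table}-tablet-0"
        self.table_to_tablet[table] = tablet

    def route(self, table: str) -> str:
        """Return the tablet that stores rows of the given table."""
        return self.table_to_tablet[table]

    def tablets_touched(self, *tables: str) -> int:
        """Count the distinct tablets a query over these tables has to visit."""
        return len({self.route(t) for t in tables})


if __name__ == "__main__":
    db = ColocatedDatabase("northwind")
    for t in ("customers", "categories", "suppliers", "products"):
        db.create_table(t)
    db.create_table("orders", colocated=False)
    print(db.tablets_touched("products", "suppliers"))  # both in the shared tablet
    print(db.tablets_touched("orders", "customers"))    # orders lives elsewhere
```

In this model a join between products and suppliers stays inside one tablet, while a join involving orders spans two, which is the performance versus scalability tradeoff described in the docs.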

What's next? For more information, see the architecture for colocated tables.

Concepts

Key-value records

A record can be represented as (row:string, column:string, time:int64) -> string
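A minimal in-memory Python sketch of that mapping, assuming a plain dictionary keyed by (row, column, time). The class and method names are hypothetical, and a real store would keep keys sorted and persisted rather than scanning a dict.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str, int]  # (row, column, time)


class VersionedStore:
    """Toy map of (row:string, column:string, time:int64) -> string."""

    def __init__(self) -> None:
        self._cells: Dict[Key, str] = {}

    def put(self, row: str, column: str, time: int, value: str) -> None:
        self._cells[(row, column, time)] = value

    def get(self, row: str, column: str, time: int) -> Optional[str]:
        """Exact lookup of one version of one cell."""
        return self._cells.get((row, column, time))

    def latest(self, row: str, column: str,
               at_or_before: Optional[int] = None) -> Optional[str]:
        """Newest value of a cell, optionally as of a given timestamp."""
        versions = [
            (t, v) for (r, c, t), v in self._cells.items()
            if r == row and c == column and (at_or_before is None or t <= at_or_before)
        ]
        if not versions:
            return None
        return max(versions, key=lambda tv: tv[0])[1]


if __name__ == "__main__":
    store = VersionedStore()
    store.put("user:1", "name", 1, "old name")
    store.put("user:1", "name", 2, "new name")
    print(store.latest("user:1", "name"))
    print(store.latest("user:1", "name", at_or_before=1))
```

Keeping the timestamp in the key is what lets the same (row, column) cell hold several versions side by side.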

ACID

The four properties a database management system (DBMS) must have so that transactions are correct and reliable while data is being written or updated: atomicity (indivisibility), consistency, isolation (independence), and durability.
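Atomicity is the easiest of the four to see in code. The sketch below uses Python's standard sqlite3 module rather than TiDB or YugabyteDB, purely as a stand-in; the accounts table and the function names are hypothetical. A transfer credits one account and then debits the other, and if the debit violates a CHECK constraint, the whole transaction is rolled back, including the credit that already ran.

```python
import sqlite3
from typing import Dict


def make_accounts() -> sqlite3.Connection:
    """Create an in-memory database with two accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
    conn.commit()
    return conn


def balances(conn: sqlite3.Connection) -> Dict[str, int]:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move amount from src to dst as one transaction; return False if it was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint, so the earlier credit is undone too.
        return False


if __name__ == "__main__":
    conn = make_accounts()
    print(transfer(conn, "alice", "bob", 30), balances(conn))
    print(transfer(conn, "alice", "bob", 500), balances(conn))
```

The second transfer fails as a unit: neither account changes, which is what "all or nothing" means in practice.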

tablet

region

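TiDB source annotations

To make later source reading easier, I extracted the competition versions into branches and then put them on Gitee, where adding annotations is more convenient.

community/high-performance-tidb-challenge-cn.md at master · pingcap/community

There are three repositories in total. First fork them into your own account (with Gitee the cloning step can be skipped), then switch to the pinned commit and create the new branches.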

tidb:1bfeff96c7439ed672f8362cf67573666a43f781
tikv:dcd2f8f4076d847151fdf58e9c0ba333f242d374
pd:c05ef6f95773941db5c1060174f5a62e8f864e88

cd ~/git && git clone https://github.com/eatcosmos/tidb.git && cd tidb
git checkout -b 1bfeff-dev 1bfeff96c7439ed672f8362cf67573666a43f781 && git push --set-upstream origin 1bfeff-dev
git checkout -b 1bfeff-comment 1bfeff96c7439ed672f8362cf67573666a43f781 && git push --set-upstream origin 1bfeff-comment

cd ~/git && git clone https://github.com/eatcosmos/tikv.git && cd tikv
git checkout -b dcd2f8-dev dcd2f8f4076d847151fdf58e9c0ba333f242d374 && git push --set-upstream origin dcd2f8-dev
git checkout -b dcd2f8-comment dcd2f8f4076d847151fdf58e9c0ba333f242d374 && git push --set-upstream origin dcd2f8-comment

cd ~/git && git clone https://github.com/eatcosmos/pd.git && cd pd
git checkout -b c05ef6-dev c05ef6f95773941db5c1060174f5a62e8f864e88 && git push --set-upstream origin c05ef6-dev
git checkout -b c05ef6-comment c05ef6f95773941db5c1060174f5a62e8f864e88 && git push --set-upstream origin c05ef6-comment
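The branch names above follow one pattern: the first six characters of the pinned commit plus a -dev or -comment suffix. A small Python sketch of that convention, assuming the three commit hashes listed earlier; the helper names are hypothetical. It only prints git commands and does not run them.

```python
from typing import Dict, List, Tuple

# Competition commits as listed in these notes.
PINNED_COMMITS: Dict[str, str] = {
    "tidb": "1bfeff96c7439ed672f8362cf67573666a43f781",
    "tikv": "dcd2f8f4076d847151fdf58e9c0ba333f242d374",
    "pd": "c05ef6f95773941db5c1060174f5a62e8f864e88",
}


def branch_names(commit: str, prefix_len: int = 6) -> Tuple[str, str]:
    """Derive the development and annotation branch names from a commit hash."""
    short = commit[:prefix_len]
    return f"{short}-dev", f"{short}-comment"


def branch_commands(commit: str) -> List[str]:
    """Git commands that create both branches at the pinned commit and push them."""
    commands = []
    for branch in branch_names(commit):
        commands.append(f"git checkout -b {branch} {commit}")
        commands.append(f"git push --set-upstream origin {branch}")
    return commands


if __name__ == "__main__":
    for repo, commit in PINNED_COMMITS.items():
        print(f"# {repo}")
        print("\n".join(branch_commands(commit)))
```

Generating the commands from one table of hashes avoids copy and paste slips between the three repositories.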

# Development branches (GitHub)
git clone --single-branch --branch 1bfeff-dev https://github.com/eatcosmos/tidb.git
git clone --single-branch --branch dcd2f8-dev https://github.com/eatcosmos/tikv.git
git clone --single-branch --branch c05ef6-dev https://github.com/eatcosmos/pd.git

# Annotated branches (Gitee)
https://gitee.com/eatcosmos/tidb/tree/1bfeff-comment/
https://gitee.com/eatcosmos/tikv/tree/dcd2f8-comment/
https://gitee.com/eatcosmos/pd/tree/c05ef6-comment/

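Study method

Learning by comparison works best for me. My picture of TiKV's structure was fuzzy, and rereading did not make me more certain; but after reading about the similar YugabyteDB, comparing the small differences deepened my understanding.
Read in one go and reread, so that the time is not fragmented by messages. The interrupted days mostly happened because I kept leaving midway to look up other material. That is unnecessary: skip what I do not understand, do not search for other material, and fill in the gaps in one batch after finishing.
A channel for discussion is best, so that questions and ideas can be posted there; without one, write them down first.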