Deploying a Distributed MySQL Cluster with Vitess on Kubernetes

Vitess is a database solution for deploying, scaling and managing large clusters of MySQL instances. It’s architected to run as effectively in a public or private cloud architecture as it does on dedicated hardware. It combines and extends many important MySQL features with the scalability of a NoSQL database. Vitess can help you with the following problems:

  1. Scaling a MySQL database by allowing you to shard it, while keeping application changes to a minimum.
  2. Migrating from baremetal to a private or public cloud.
  3. Deploying and managing a large number of MySQL instances.

Vitess includes compliant JDBC and Go database drivers using a native query protocol. Additionally, it implements the MySQL server protocol which is compatible with virtually any other language.

Vitess has been serving all YouTube database traffic since 2011, and has now been adopted by many enterprises for their production needs.

The following example will use a simple commerce database to illustrate how Vitess can take you through the journey of scaling from a single database to a fully distributed and sharded cluster. This is a fairly common story, and it applies to many use cases beyond e-commerce.

It’s 2018 and, no surprise to anyone, people are still buying stuff online. You recently attended the first half of a seminar on disruption in the tech industry and want to create a completely revolutionary e-commerce site. In classic tech postmodern fashion, you call your products widgets instead of a more meaningful identifier and it somehow fits.

Naturally, you realize the need for a reliable transactional datastore. Because of the new generation of hipsters, you’re probably going to pull traffic away from the main industry players just because you’re not them. You’re smart enough to foresee the scalability you need, so you choose Vitess as your best scaling solution.

Prerequisites

Before we get started, let’s get a few things out of the way.

 
The example settings have been tuned to run on Minikube. However, you should be able to try this on your own Kubernetes cluster. If you do, you may also want to remove some of the Minikube-specific resource settings (explained below).
  • Download Vitess
  • Install Minikube
  • Start a Minikube engine: minikube start --cpus=4 --memory=5000. Note the additional resource requirements. In order to go through all the use cases, many vttablet and mysql instances will be launched. These require more resources than the defaults used by Minikube.
  • Install etcd operator
  • Install helm
  • After installing, run helm init

Optional

  • Install mysql client. On Ubuntu: apt-get install mysql-client
  • Install vtctlclient
    • Install go 1.11+
    • go get vitess.io/vitess/go/cmd/vtctlclient
    • vtctlclient will be installed at $GOPATH/bin/
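
After the go get completes, a quick sanity check looks something like this (assuming $GOPATH/bin is not already on your PATH):

# make the freshly built binary visible on the PATH
export PATH=$PATH:$GOPATH/bin
# prints usage information if the client was built and installed correctly
vtctlclient --help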

Starting a single keyspace cluster

So you searched keyspace on Google and got a bunch of stuff about NoSQL… what’s the deal? It took a few hours, but after diving through the ancient Vitess scrolls you figure out that in the NewSQL world, keyspaces and databases are essentially the same thing when unsharded. Finally, it’s time to get started.

Change to the helm example directory:

cd examples/helm

In this directory, you will see a group of yaml files. The first digit of each file name indicates the phase of the example. The next two digits indicate the order in which to execute them. For example, ‘101_initial_cluster.yaml’ is the first file of the first phase. We shall execute that now:

helm install ../../helm/vitess -f 101_initial_cluster.yaml

This will bring up the initial Vitess cluster with a single keyspace.
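
It can take a minute or two for all of the pods to reach the Running state. If you want to follow along while they come up, plain kubectl (nothing Vitess-specific) is enough:

# watch pods transition from Pending/Init to Running; press Ctrl-C to stop watching
kubectl get pods -w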

Verify cluster

Once successful, you should see the following state:

~/...vitess/helm/vitess/templates> kubectl get pods,jobs
NAME                               READY     STATUS    RESTARTS   AGE
po/etcd-global-2cwwqfkf8d          1/1       Running   0          14m
po/etcd-operator-9db58db94-25crx   1/1       Running   0          15m
po/etcd-zone1-btv8p7pxsg           1/1       Running   0          14m
po/vtctld-55c47c8b6c-5v82t         1/1       Running   1          14m
po/vtgate-zone1-569f7b64b4-zkxgp   1/1       Running   2          14m
po/zone1-commerce-0-rdonly-0       6/6       Running   0          14m
po/zone1-commerce-0-replica-0      6/6       Running   0          14m
po/zone1-commerce-0-replica-1      6/6       Running   0          14m

NAME                                      DESIRED   SUCCESSFUL   AGE
jobs/commerce-apply-schema-initial        1         1            14m
jobs/commerce-apply-vschema-initial       1         1            14m
jobs/zone1-commerce-0-init-shard-master   1         1            14m

If you have installed the mysql client, you should now be able to connect to the cluster using the following command:

~/...vitess/examples/helm> ./kmysql.sh
mysql> show tables;
+--------------------+
| Tables_in_commerce |
+--------------------+
| corder             |
| customer           |
| product            |
+--------------------+
3 rows in set (0.01 sec)

You can also browse to the vtctld console using the following command (Ubuntu):

./kvtctld.sh

Minikube Customizations

The helm example is based on the default values.yaml file provided with the Vitess helm chart. The following overrides have been performed in order to run under Minikube:

  • resources: have been nulled out. This instructs the Kubernetes environment to use whatever is available. Note, this is not recommended for a production environment. In such cases, you should start with the baseline values provided in helm/vitess/values.yaml and iterate from those.
  • etcd and vtgate replicas are set to 1. In a production environment, there should be 3-5 etcd replicas. The number of vtgates will need to scale up based on cluster size.
  • mysqlProtocol.authType is set to none. This should be changed to secret and the credentials should be stored as Kubernetes secrets.
  • A serviceType of NodePort is not recommended in production. You may choose not to expose these endpoints to anyone outside Kubernetes at all. Another option is to create Ingress controllers.
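
If you later move from Minikube to a full Kubernetes cluster, the same chart can be installed with your own override file layered on top of helm/vitess/values.yaml. The file name below is only a placeholder for whatever production overrides you choose (resources, replica counts, authType, serviceType and so on):

# hypothetical override file with production-sized settings
helm install ../../helm/vitess -f my_production_values.yaml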

Topology

The helm chart specifies a single unsharded keyspace: commerce. Unsharded keyspaces have a single shard named 0.

NOTE: keyspaces/shards are global entities of a cluster, independent of a cell. Ideally, you should list the keyspaces/shards separately. For a cell, you should only have to specify which of those keyspaces/shards are deployed in that cell. However, for simplicity, the existence of a keyspace/shard is implicitly inferred from the fact that it is mentioned under each cell.

In this deployment, we are requesting two replica-type tablets and one rdonly-type tablet. When deployed, one of the replica tablets will automatically be elected as master. In the vtctld console, you should see one master, one replica and one rdonly vttablet.

The purpose of a replica tablet is to serve OLTP read traffic, whereas rdonly tablets serve analytics queries or perform cluster maintenance operations such as backups and resharding. rdonly tablets are allowed to lag far behind the master because replication needs to be stopped to perform some of these functions.

In our use case, we are provisioning one rdonly replica per shard in order to perform resharding operations.
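
Besides the vtctld console, you can verify the tablet types from the command line with the optional vtctlclient installed earlier. The vtctld address below is a placeholder; how you reach its gRPC endpoint (NodePort, kubectl port-forward, and so on) depends on your setup:

# list every tablet in cell zone1 together with its keyspace, shard and tablet type
vtctlclient -server <vtctld-host>:<vtctld-grpc-port> ListAllTablets zone1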

Schema

create table product(
  sku varbinary(128),
  description varbinary(128),
  price bigint,
  primary key(sku)
);
create table customer(
  customer_id bigint not null auto_increment,
  email varbinary(128),
  primary key(customer_id)
);
create table corder(
  order_id bigint not null auto_increment,
  customer_id bigint,
  sku varbinary(128),
  price bigint,
  primary key(order_id)
);

The schema has been simplified to include only those fields that are significant to the example:

  • The product table contains the product information for all of the products.
  • The customer table has a customer_id that has an auto-increment. A typical customer table would have a lot more columns, and sometimes additional detail tables.
  • The corder table (named so because order is an SQL reserved word) has an order_id auto-increment column. It also has foreign keys into customer(customer_id) and product(sku).
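
To see those relationships in action, you can run an ordinary join through the same kmysql.sh helper used above. At this point the tables are still empty, so the query returns no rows, but it shows how corder ties customer and product together:

./kmysql.sh <<EOF
-- logical foreign keys: corder.customer_id -> customer, corder.sku -> product
select c.customer_id, c.email, o.order_id, p.description, o.price
from customer c
join corder o on o.customer_id = c.customer_id
join product p on p.sku = o.sku;
EOF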

VSchema

Since Vitess is a distributed system, a VSchema (Vitess schema) is usually required to describe how the keyspaces are organized.

{
  "tables": {
    "product": {},
    "customer": {},
    "corder": {}
  }
}

With a single unsharded keyspace, the VSchema is very simple; it just lists all the tables in that keyspace.

NOTE: In the case of a single unsharded keyspace, a VSchema is not strictly necessary because Vitess knows that there are no other keyspaces, and will therefore redirect all queries to the only one present.
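
If you installed vtctlclient, you can also fetch the VSchema that the commerce-apply-vschema-initial job applied, straight from the topology (the vtctld address is again a placeholder):

# print the JSON VSchema stored for the commerce keyspace
vtctlclient -server <vtctld-host>:<vtctld-grpc-port> GetVSchema commerce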

Vertical Split

Due to a massive ingress of free-trade, single-origin yerba mate merchants to your website, hipsters are swarming to buy stuff from you. As more users flock to your website and app, the customer and corder tables start growing at an alarming rate. To keep up, you’ll want to separate those tables by moving customer and corder to their own keyspace. Since you only have as many products as there are types of yerba mate, you won’t need to shard the product table!

Let us add some data into our tables to illustrate how the vertical split works.

./kmysql.sh < ../common/insert_commerce_data.sql
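
The file above loads a small set of sample rows. If you prefer to add a row or two by hand, the same stdin pattern works; the values below are purely illustrative and are not the contents of insert_commerce_data.sql:

./kmysql.sh <<EOF
insert into product(sku, description, price) values ('SKU-EXAMPLE', 'example widget', 100);
insert into customer(email) values ('hipster@example.com');
-- customer_id must reference an existing customer row; 1 is illustrative
insert into corder(customer_id, sku, price) values (1, 'SKU-EXAMPLE', 100);
EOF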

 
