Querying split MySQL tables with multiple threads to get the most out of a table-splitting scheme

Date: 2019-12-05

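I naively assumed that when tables are split under MySQL's MERGE engine, MySQL would automatically use multiple threads to search the underlying sub-tables concurrently. So far I have not found any setting that enables this, and a simple test shows that MySQL still searches the sub-tables one after another, taking about as long as it does without splitting.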

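Once a single MySQL table grows to tens of millions of rows, it is best to divide it into several smaller units, either by partitioning or by splitting it into multiple tables. Here we look at splitting into multiple tables.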

Scenario

Main single table: data 400000w

Sub-tables: data_1 100000w, data_2 100000w, data_3 100000w, data_4 100000w

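module/mysql.py is a MySQL DAO wrapper I wrote myself.

mysql_task.py is the test script for this experiment. It starts four child threads to query the four sub-tables, while the main thread queries the main single table itself.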

module/mysql.py

#! /usr/bin/env python
# -*-coding:utf-8-*-
"""
mysql DAO
A quick sketch; please overlook the dependency-injection issues in the code
"""
import MySQLdb

class Mysql():

    def __init__(self, host, user, passwd, db, port = 3306):
        self.host = host
        self.user = user
        self.passwd = passwd
        self.db = db
        self.port = port
        self.connect()

    def connect(self):
        self.conn = MySQLdb.connect(host = self.host, user = self.user, passwd = self.passwd, db = self.db, port = self.port)
        self.cursor = self.conn.cursor()

    def execute(self, sql):
        result = self.cursor.execute(sql)
        return result

    def query(self, sql):
        self.cursor.execute(sql)
        result = self.cursor.fetchall()
        return result

    def scaler(self, sql):
        self.cursor.execute(sql)
        result = self.cursor.fetchone()
        return result[0]

    def one(self, sql):
        self.cursor.execute(sql)
        result = self.cursor.fetchone()
        return result

    def __del__(self):
        self.cursor.close()
        self.conn.close()

module/__init__.py (if you are unfamiliar with modular programming, read up on it yourself)

#! /usr/bin/env python
# -*-coding:utf-8-*-
"""
Register the Mysql class in the module package
"""

# Expose Mysql from mysql.py at the package level,
# so external code can access it with `from module import Mysql`
from .mysql import Mysql

mysql_task.py

#! /usr/bin/env python
"""
"""
__author__ = 'sallency'

from module import Mysql
from threading import Thread
import time

result = []

class MyThread(Thread):

    def __init__(self, no):
        Thread.__init__(self)
        # index of the sub-table (data_<no>) this thread queries
        self.no = no

    def run(self):
        global result
        # each thread opens its own connection rather than sharing one
        dbCon = Mysql('localhost', 'root', '123456', 'mydb')
        result.append(dbCon.scaler("select sql_no_cache count(`id`) from `data_" + str(self.no) +"` where `title` like '%hello%'"))

#start sub thread
def task():
    # one thread per sub-table: data_1 .. data_4
    threads = [MyThread(no) for no in range(1, 5)]

    for thr in threads:
        thr.start()

    # wait until every sub-table query has finished
    for thr in threads:
        thr.join()

    return True

if __name__ == "__main__":

    # single-argument print(...) calls behave the same on Python 2 and 3
    print("")
    print("...... multi thread query start ......")
    print(time.ctime() + ' / ' + str(time.time()))
    task()
    print(result)
    print(time.ctime() + ' / ' + str(time.time()))
    print("...... multi thread query end ......")

    print("")
    dbCon = Mysql('localhost', 'root', '123456', 'mydb')
    print("...... single thread query start ......")
    print(time.ctime() + ' / ' + str(time.time()))
    print(dbCon.scaler("select sql_no_cache count(`id`) from `data` where `title` like '%hello%'"))
    print(time.ctime() + ' / ' + str(time.time()))
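
The same fan-out over the four sub-tables could also be written with a thread pool. What follows is only a minimal sketch, assuming Python 3 and the standard `concurrent.futures` module. `count_matches` is a hypothetical stand-in that sleeps instead of hitting a database, and the per-table counts are borrowed from the results reported in this post, with their assignment to tables assumed for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Assumed mapping of counts to sub-tables, for illustration only
FAKE_COUNTS = {"data_1": 219, "data_2": 207, "data_3": 156, "data_4": 254}

def count_matches(table):
    """Hypothetical stand-in for the per-table COUNT query.

    A real version would open its own connection and run the
    SELECT COUNT(...) statement against `table`.
    """
    time.sleep(0.1)  # simulate a blocking database call
    return FAKE_COUNTS[table]

def count_all(tables):
    # One worker per sub-table; map() yields results in input order
    with ThreadPoolExecutor(max_workers=len(tables)) as pool:
        return list(pool.map(count_matches, tables))

tables = ["data_%d" % n for n in range(1, 5)]
counts = count_all(tables)
print(counts, sum(counts))
```

Compared with subclassing `Thread` and appending to a global list, the pool returns each result in the order of the input tables, so no shared state is needed.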

Test results

Query results: 219 + 207 + 156 + 254 == 836, so the counts match.

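The multithreaded run took 1.8 seconds and the single-threaded run 6.12 seconds, so query time dropped by 70.59% (roughly a 3.4x speedup).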
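
For reference, the percentage can be reproduced from the two timings. This is a small Python 3 sketch; the variable names are mine, and the two numbers are the ones measured above.

```python
multi_thread_seconds = 1.8
single_thread_seconds = 6.12

# Fraction of the single-threaded time that was saved
time_saved = (single_thread_seconds - multi_thread_seconds) / single_thread_seconds
# How many times faster the multithreaded run was
speedup = single_thread_seconds / multi_thread_seconds

print("time reduction: %.2f%%" % (time_saved * 100))  # time reduction: 70.59%
print("speedup: %.1fx" % speedup)                     # speedup: 3.4x
```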

You could probably reach this conclusion without writing any code, but it still feels good to try it out yourself, haha.