MySQL Database (5): Using pymysql, Indexes

Date: 2019-11-18

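I. Using the pymysql module

1. Installing and using pymysql

Until now we have operated the database through mysql, the command-line client that ships with MySQL. To operate the database from a Python program we need the pymysql module. It is essentially a socket client, and it must be installed before use.

1) Installing pymysql

    pip3 install pymysql

2) Using pymysql

Suppose the database mydb contains a table userinfo with the following data: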

    mysql> select * from userinfo;
    +----+------+-----+
    | id | name | pwd |
    +----+------+-----+
    |  1 | wll  | 123 |
    |  2 | ssx  | 456 |
    +----+------+-----+

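Example: implement a user login in Python. If the user exists the login succeeds; otherwise it fails.

    import pymysql
    username = input('Username: ')
    pwd = input('Password: ')

    # 1. Connect
    conn = pymysql.connect(
        host = '127.0.0.1',
        port = 3306,
        user = 'root',
        password = '123',
        db = 'mydb',
        charset = 'utf8'
    )
    # 2. Create a cursor
    cur = conn.cursor()

    # Building the SQL text by string formatting is unsafe (see the SQL injection section that follows)
    sql = "select * from userinfo where name='%s' and pwd='%s'" %(username,pwd)
    # 3. Execute the SQL statement
    result = cur.execute(sql)
    print(result)  # execute() returns the number of rows matched or affected

    # 4. Close both the cursor and the connection
    cur.close()
    conn.close()

    if result:
        print('Login succeeded')
    else:
        print('Login failed')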

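2. SQL injection with execute()

In SQL, everything after "-- " is a comment (note the space after the two dashes). For example, in the statement

    select * from userinfo where name='wll' -- haha' and pwd=''

everything after "-- " is commented out, so the password condition is never checked.

The login example above therefore breaks when the user enters input such as:

    # SQL injection: the user exists, the password check is bypassed
    wll' -- any characters

    # SQL injection: the user does not exist, both the username and password checks are bypassed
    xxx' or 1=1 -- any characters

The cause is that we build the SQL statement by string formatting with quoted %s placeholders, so the user's input becomes part of the SQL text. The fix is as follows: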

    # Let execute() substitute the parameters. Do not put quotes around %s (pymysql quotes and escapes the values itself)
    sql = "select * from userinfo where name=%s and pwd=%s"
    result = cur.execute(sql,[username,pwd])  # the second argument can be a list
    result = cur.execute(sql,(username,pwd))  # or a tuple


    # When the second argument of execute() is a dict, use named placeholders in the SQL:
    sql = "select * from userinfo where name=%(key1)s and pwd=%(key2)s"
    result = cur.execute(sql,{'key1':username,'key2':pwd})
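The difference between the two approaches can be tried without a MySQL server. The sketch below is only an illustration under stated assumptions: it uses the standard-library sqlite3 module as a stand-in for pymysql (so the placeholder is ? rather than %s), and the helper names login_unsafe and login_safe are made up for this example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL database mydb
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo(id integer primary key, name text, pwd text)")
conn.executemany("insert into userinfo(name, pwd) values (?, ?)",
                 [("wll", "123"), ("ssx", "456")])

def login_unsafe(name, pwd):
    # Vulnerable: the user's input is spliced into the SQL text
    sql = "select * from userinfo where name='%s' and pwd='%s'" % (name, pwd)
    return len(conn.execute(sql).fetchall()) > 0

def login_safe(name, pwd):
    # Parameterized: the values are passed separately from the SQL text
    sql = "select * from userinfo where name=? and pwd=?"
    return len(conn.execute(sql, (name, pwd)).fetchall()) > 0

payload = "xxx' or 1=1 -- x"
print(login_unsafe(payload, ""))  # True: the injected condition matches every row
print(login_safe(payload, ""))    # False: the payload is treated as a plain string
```

With pymysql the safe variant corresponds to passing the values as the second argument of execute(), as in the snippet above.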

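3. Inserting, deleting and updating with pymysql: conn.commit()

commit(): when you insert, delete or update rows through pymysql, you must call conn.commit() to commit the transaction; otherwise the changes are not saved.

Example: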

    import pymysql
    username = input('Username: ')
    pwd = input('Password: ')

    # 1. Connect
    conn = pymysql.connect(
        host = '127.0.0.1',
        port = 3306,
        user = 'root',
        password = '123',
        db = 'mydb',
        charset = 'utf8'
    )
    # 2. Create a cursor object
    cur = conn.cursor()

    # 3. Execute SQL statements
    # Insert
    sql = "insert into userinfo(name,pwd) values (%s,%s)"
    result = cur.execute(sql,[username,pwd])
    print(result)  # prints 1
    # Insert several rows at once
    effect_row = cur.executemany(sql,[('張三','110'),('李四','119')])
    print(effect_row)  # prints 2

    # Delete
    sql = "delete from userinfo where id=1"
    effect_row = cur.execute(sql)
    print(effect_row)  # 1

    # Update
    sql = "update userinfo set name=%s where id=2"
    effect_row = cur.execute(sql,[username])
    print(effect_row)  # 1

    # 4. Always commit after insert, delete or update
    conn.commit()

    # 5. Close both the cursor and the connection
    cur.close()
    conn.close()
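In practice a failed statement should also undo the statements that ran before it. The following is a minimal sketch of that commit-or-rollback pattern, not code from the original post: run_in_transaction is a hypothetical helper, and sqlite3 is used as a stand-in so the example runs without a MySQL server. It relies only on the DB-API methods cursor(), execute(), commit() and rollback(), which pymysql connections also provide (with %s instead of ? as the placeholder).

```python
import sqlite3

def run_in_transaction(conn, statements):
    """Execute (sql, params) pairs; commit if all succeed, roll back otherwise."""
    cur = conn.cursor()
    try:
        total = 0
        for sql, params in statements:
            cur.execute(sql, params)
            total += cur.rowcount  # rows affected by this statement
        conn.commit()
        return total
    except Exception:
        conn.rollback()  # undo everything since the last commit
        raise
    finally:
        cur.close()

# Assumption: in-memory SQLite table standing in for userinfo
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo(id integer primary key, name text unique, pwd text)")
conn.commit()

insert = "insert into userinfo(name, pwd) values (?, ?)"
n = run_in_transaction(conn, [(insert, ("wll", "123")), (insert, ("ssx", "456"))])
print(n)  # 2

try:
    # The second insert violates the unique constraint, so the first is rolled back too
    run_in_transaction(conn, [(insert, ("zs", "110")), (insert, ("wll", "999"))])
except sqlite3.IntegrityError:
    pass

count = conn.execute("select count(*) from userinfo").fetchone()[0]
print(count)  # 2
```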

4. Querying with pymysql: fetchone(), fetchall(), fetchmany(n)

Given the following table contents:

    mysql> select * from userinfo;
    +----+--------+-----+
    | id | name   | pwd |
    +----+--------+-----+
    |  1 | wll    | 123 |
    |  2 | ssx    | 456 |
    |  3 | 張三    | 123 |
    |  4 | 李四    | 456 |
    +----+--------+-----+

Example 1: fetchone() fetches one row; the first call returns the first row

    import pymysql
    # charset must be written 'utf8', not 'utf-8', or an error is raised
    conn = pymysql.connect(host='localhost', port=3306, user='root', password='123', db='mydb', charset='utf8')
    cur = conn.cursor()
    sql = "select * from userinfo"
    effect_row = cur.execute(sql)
    print(effect_row)  # 4

    row = cur.fetchone()  # fetch the first row
    print(row)  # (1, 'wll', '123')
    row = cur.fetchone()  # continue from the previous position, i.e. fetch the second row
    print(row)  # (2, 'ssx', '456')
    cur.close()
    conn.close()

Example 2: fetchall() fetches all rows

    import pymysql
    conn = pymysql.connect(host='localhost', port=3306, user='root', password='123', db='mydb', charset='utf8')
    cur = conn.cursor()
    sql = "select * from userinfo"
    effect_row = cur.execute(sql)
    print(effect_row)  # 4

    rows = cur.fetchall()  # fetch all rows
    print(rows)
    # Result:
    # ((1, 'wll', '123'), (2, 'ssx', '456'), (3, '張三', '123'), (4, '李四', '456'))
    cur.close()
    conn.close()

Summary: as the output above shows, the return value is a tuple and each row is also a tuple, so we cannot tell which field each value belongs to. To fix this, make each row a dict whose keys are the field names and whose values are the field values:

    # When creating the cursor, pass cursor=pymysql.cursors.DictCursor
    cur = conn.cursor(cursor=pymysql.cursors.DictCursor)
    # fetchall() then returns:
    # [
    #   {'id': 1, 'name': 'wll', 'pwd': '123'},
    #   {'id': 2, 'name': 'ssx', 'pwd': '456'},
    #   {'id': 3, 'name': '張三', 'pwd': '123'},
    #   {'id': 4, 'name': '李四', 'pwd': '456'}
    # ]
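To see where the dict keys come from, the same mapping can be built by hand from the DB-API attribute cursor.description, whose entries start with the column name. This is a rough sketch under assumptions, not how pymysql implements DictCursor: it uses sqlite3 as a stand-in so it runs without a server, and only two of the rows from the table above.

```python
import sqlite3

# Assumption: in-memory SQLite table standing in for userinfo
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo(id integer primary key, name text, pwd text)")
conn.executemany("insert into userinfo(name, pwd) values (?, ?)",
                 [("wll", "123"), ("ssx", "456")])

cur = conn.execute("select * from userinfo order by id")
# description holds one entry per column; the first item of each entry is the column name
columns = [entry[0] for entry in cur.description]
rows = [dict(zip(columns, row)) for row in cur.fetchall()]
print(rows)
# [{'id': 1, 'name': 'wll', 'pwd': '123'}, {'id': 2, 'name': 'ssx', 'pwd': '456'}]
```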

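Example 3: moving the row pointer

In the fetchone example, think of a row pointer that starts just above the first row. Each fetch moves it down one row, so once the pointer has passed the last row, fetchone() returns None. The pointer can be moved with scroll():

    cur.scroll(1,mode='relative')  # move relative to the current position
    cur.scroll(1,mode='absolute')  # move to a position counted from the first row

Parameters:

The first argument is the number of rows to move: positive moves down, negative moves up. mode selects whether the move is relative to the current position or counted from the first row (position 0).

Code: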

    import pymysql
    conn = pymysql.connect(host='localhost', port=3306, user='root', password='123', db='mydb', charset='utf8')
    cur = conn.cursor(cursor=pymysql.cursors.DictCursor)
    sql = 'select * from userinfo'
    effect_row = cur.execute(sql)

    row = cur.fetchone()   # fetch the first row
    print(row)  # {'id': 1, 'name': 'wll', 'pwd': '123'}
    row = cur.fetchone()  # fetch the second row
    print(row)  # {'id': 2, 'name': 'ssx', 'pwd': '456'}

    cur.scroll(-1,mode='relative')  # step back one row, so the second row is fetched again
    row = cur.fetchone()
    print(row)   # {'id': 2, 'name': 'ssx', 'pwd': '456'}

    cur.scroll(0,mode='absolute')  # jump back to the first row
    row = cur.fetchone()
    print(row)   # {'id': 1, 'name': 'wll', 'pwd': '123'}

    cur.close()
    conn.close()
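The row pointer described above can be pictured with a small toy model in plain Python. This is an illustration only, not pymysql's implementation: the class RowPointer and its bounds check are assumptions made for this sketch.

```python
class RowPointer:
    """Toy model of a cursor's row pointer (illustrative, not pymysql code)."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.pos = 0  # index of the row the next fetchone() will return

    def fetchone(self):
        if self.pos >= len(self.rows):
            return None  # every row has already been fetched
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def scroll(self, value, mode="relative"):
        if mode == "relative":
            target = self.pos + value   # offset from the current position
        elif mode == "absolute":
            target = value              # position counted from the first row
        else:
            raise ValueError("unknown scroll mode: %s" % mode)
        if not 0 <= target < len(self.rows):
            raise IndexError("out of range")
        self.pos = target

cur = RowPointer(["wll", "ssx", "張三", "李四"])
first = cur.fetchone()
second = cur.fetchone()
cur.scroll(-1, mode="relative")   # step back one row
again = cur.fetchone()
cur.scroll(0, mode="absolute")    # jump to the first row
top = cur.fetchone()
print(first, second, again, top)  # wll ssx ssx wll
```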

Example 4: fetchmany(n) fetches n rows

    import pymysql
    conn = pymysql.connect(host='localhost', port=3306, user='root', password='123', db='mydb', charset='utf8')
    cur = conn.cursor(cursor=pymysql.cursors.DictCursor)
    sql = 'select * from userinfo'
    effect_row = cur.execute(sql)

    rows = cur.fetchmany(2)   # fetch 2 rows
    print(rows)
    # Result:
    # [
    # {'id': 1, 'name': 'wll', 'pwd': '123'},
    # {'id': 2, 'name': 'ssx', 'pwd': '456'}
    # ]
    cur.close()
    conn.close()

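II. Indexes

1. What an index is

An index is a data structure in the database dedicated to helping users find data quickly. It is similar to the index of a dictionary: to look something up, you use the index to find where the entry is stored and then go straight to it.

2. What indexes are for

Enforcing constraints and speeding up lookups.

3. Common kinds of index

1) Ordinary index: speeds up lookups

Example 1: define an ordinary index when creating the table

    create table userinfo(
        id int not null auto_increment primary key,
        name varchar(32) not null,
        email varchar(64) not null,
        index ix_name(name)   # ordinary index
    );

Example 2: create an ordinary index on an existing table (slow on a large table)

    create index index_name on table_name(column_name);

Example 3: drop an ordinary index (fast)

    drop index index_name on table_name;

Example 4: list the indexes

    show index from table_name;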

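2) Unique index: speeds up lookups and enforces uniqueness (NULL is allowed)

Example 1: define a unique index when creating the table

    create table userinfo(
        id int not null auto_increment primary key,
        name varchar(32) not null,
        email varchar(64) not null,
        unique index ix_name(name)  # unique index (name now carries a uniqueness constraint)
    );

Example 2: create a unique index separately

    create unique index index_name on table_name(column_name);

Example 3: drop a unique index

    drop index index_name on table_name;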

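3) Primary key index: speeds up lookups and enforces uniqueness (NULL is not allowed)

Example 1: define the primary key index when creating the table

    create table userinfo(
        id int not null auto_increment primary key,  # a primary key is a primary key index
        name varchar(32) not null,
        email varchar(64) not null
    );
    # or
    create table userinfo(
        id int not null auto_increment,
        name varchar(32) not null,
        email varchar(64) not null,
        primary key(id)      # declaring the primary key creates the primary key index
    );

Example 2: add a primary key index separately

    alter table table_name add primary key(column_name);

Example 3: drop the primary key index

    alter table table_name drop primary key;
    # if the column is auto_increment, remove that attribute in the same statement:
    alter table table_name modify column_name int, drop primary key;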

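4) Composite index (multiple columns)

Composite indexes come in three kinds: composite primary key index, composite unique index, and composite ordinary index.

Use case: queries that frequently filter on the same n columns together,

for example: where name = 'alex' and email = 'alex@qq.com';

Example 1: create a composite ordinary index

    create index index_name on table_name(column1,column2);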

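4. Covering indexes and index merge

Example 1: when the selected columns are the same as the indexed columns, the data is read directly from the index

    select name from userinfo where name = 'alex50000';  # read directly from the index
    select * from userinfo where name = 'alex50000'; # look up the index first, then read the table

Example 2: when several single-column indexes appear together in the conditions, the indexes are merged

    select * from  userinfo where name = 'alex13131' and id = 13131;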

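5. How to use indexes correctly

Adding an index to a table can make queries dramatically faster, but only if the queries actually use it correctly. Used the wrong way, an index has no effect even though it exists.

To benefit from an index we must:

1) create the index;

2) hit the index;

3) use the index correctly.

Prepare a table with about 3 million rows: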

    # 1. Create the table
    create table userinfo(
        id int,
        name varchar(20),
        gender char(6),
        email varchar(50)
    );

    # 2. Create a stored procedure that inserts rows in bulk
    # Use $$ as the statement delimiter while defining the procedure
    delimiter $$
    create procedure auto_insert1()
    BEGIN
        declare i int default 1;
        while(i < 3000000)do
            insert into userinfo values(i,concat('alex',i),'male',concat('egon',i,'@oldboy'));
            set i=i+1;
        end while;
    END$$
    # Restore the semicolon as the statement delimiter
    delimiter ;

    # 3. Show the stored procedure
    show create procedure auto_insert1\G

    # 4. Call the stored procedure
    call auto_insert1();

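Run the following queries to see the cases in which an index is not used, and from them learn how to use an index correctly:

    # Example 1: like with a leading wildcard ('%xx')
    select * from userinfo where name like '%al';

    # Example 2: applying a function to the column
    select * from userinfo where reverse(name) = 'alex333';

    # Example 3: or
    select * from userinfo where id = 1 or email = 'alex122@oldbody';
    # Note: the index is skipped only when the or condition includes a column without an index; the following two still use an index:
    select * from userinfo where id = 1 or name = 'alex1222';
    select * from userinfo where id = 1 or email = 'alex122@oldbody' and name = 'alex112';

    # Example 4: type mismatch
    select * from userinfo where name = 999; # name is a string column
    # Explanation: if a column is a string type, the value in the condition must be quoted; otherwise the query is slow even if the column is indexed

    # Example 5: !=
    select count(*) from userinfo where name != 'alex';
    # Note: on a primary key the index is still used

    # Example 6: >
    select * from userinfo where name > 'alex';
    # Note: on a primary key or an integer column the index is still used, for example
    # (num stands for an indexed integer column; it is not in the table above):
    select * from userinfo where id > 123;
    select * from userinfo where num > 123;

    # Example 7: order by
    select email from userinfo order by name desc;
    # Note: when sorting by an indexed column, the index is not used if the selected columns are not part of that index

    # Example 8: leftmost prefix matching for composite indexes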

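PS: What is leftmost prefix matching?

    create index ix_name_email on userinfo(name,email); # composite index, name on the left

    select * from userinfo where name = 'alex'; # fast
    select * from userinfo where name = 'alex' and email='alex@oldBody'; # fast
    select * from userinfo where email='alex@oldBody'; # slow

Analysis: with a composite index on (name, email) as created above, a query that filters on

(1) name and email: uses the index, fast

(2) name only: uses the index, fast

(3) email only: does not use the index, slow

Note: when searching on n conditions at once, a composite index performs better than merging several single-column indexes.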
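The leftmost prefix rule can be expressed as a few lines of Python. This is a deliberately simplified model for equality conditions only; the function usable_prefix is hypothetical, and a real optimizer also weighs range conditions, statistics and covering indexes.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the equality conditions can use."""
    used = []
    for col in index_columns:
        if col not in equality_columns:
            break  # the first missing column ends the usable prefix
        used.append(col)
    return used

ix_name_email = ["name", "email"]  # column order of the composite index
print(usable_prefix(ix_name_email, {"name"}))           # ['name']
print(usable_prefix(ix_name_email, {"name", "email"}))  # ['name', 'email']
print(usable_prefix(ix_name_email, {"email"}))          # []
```

An empty result corresponds to case (3): the condition on email alone cannot start from the left edge of the index.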

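6. Notes on using indexes

1) Avoid select *;

2) Use count(1) or count(column) instead of count(*) (note: in InnoDB, count(*) and count(1) perform the same, and count(column) does not count NULL values);

3) When creating tables, prefer char over varchar where possible;

4) Put fixed-length columns first in the column order;

5) Use a composite index instead of several single-column indexes (when queries often filter on several conditions);

6) Keep indexes short where possible (a prefix index such as create index ix_title on tb(title(16)); applies only to special data types such as text);

7) Use joins instead of subqueries;

8) When joining tables, make sure the types of the join conditions match;

9) Columns with low cardinality (many duplicates, few distinct values) are poor candidates for an index, for example gender.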

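7. Execution plans

explain followed by a query shows how MySQL will execute that SQL statement; the information can be used to optimize the query. For example: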

　　mysql> explain select * from userinfo;
　　+----+-------------+----------+------+---------------+------+---------+------+---------+-------+
　　| id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows    | Extra |
　　+----+-------------+----------+------+---------------+------+---------+------+---------+-------+
　　|  1 | SIMPLE      | userinfo | ALL  | NULL          | NULL | NULL    | NULL | 2973016 | NULL  |
　　+----+-------------+----------+------+---------------+------+---------+------+---------+-------+


　　mysql> explain select * from (select id,name from userinfo where id <20) as A;
　　+----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
　　| id | select_type | table      | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
　　+----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
　　|  1 | PRIMARY     | <derived2> | ALL   | NULL          | NULL    | NULL    | NULL |   19 | NULL        |
　　|  2 | DERIVED     | userinfo   | range | PRIMARY       | PRIMARY | 4       | NULL |   19 | Using where |
　　+----+-------------+------------+-------+---------------+---------+---------+------+------+-------------+

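Parameter notes:

    select_type (query type):
        SIMPLE      ---     simple query
        PRIMARY    ---      outermost query
        SUBQUERY  ---       subquery in the select list or where clause
        DERIVED    ---      subquery in the from clause (derived table)
        UNION      ---     union
        UNION RESULT  ---  result of a union
    table (the table being accessed)
    type (access method used by the query):
        Performance: all < index < range < index_merge < ref_or_null < ref < eq_ref < system/const
            all --- full table scan: the table is read from beginning to end (with a limit clause, the scan stops once enough rows are found);
            index --- full index scan: the index is read from beginning to end;
            range --- range lookup on an indexed column;
            index_merge --- index merge: several single-column indexes are used together;
            ref_or_null --- like ref, but also searches for rows that contain NULL;
            ref --- looks up one or more values through an index;
            eq_ref --- a join that uses a primary key or unique index;
            system --- the table has only one row (a system table); a special case of the const type;
            const --- constant: the table has at most one matching row, so its column values can be treated as constants by the rest of the optimizer; const tables are very fast because they are read only once;
    possible_keys (indexes that might be used)
    key (the index actually used)
    key_len (number of bytes of the index that MySQL uses)
    rows (number of rows MySQL estimates it must read to find the result; an estimate only)
    extra (additional details on how MySQL resolves the query):
        Using index --- MySQL uses a covering index and avoids accessing the table. Do not confuse a covering index with the index access type;
        Using where --- the MySQL server filters rows after the storage engine has retrieved them. Many where conditions that involve indexed columns can be checked by the storage engine as it reads the index, so not every query with a where clause shows "Using where". Sometimes "Using where" is a hint that the query could benefit from a different index;
        Using temporary --- MySQL uses a temporary table while sorting the query result;
        Using filesort --- MySQL sorts the result with an external sort instead of reading rows from the table in index order. MySQL has two filesort algorithms, and both can run in memory or on disk; explain does not tell you which algorithm is used, nor whether the sort happens in memory or on disk;
        Range checked for each record(index map: N) --- no good index was found, so index usage is re-evaluated for each row of the join. N is a bitmap of the indexes shown in possible_keys and is redundant;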

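8. Slow query log

Enabling the slow query log makes MySQL record every statement that runs longer than a given time. Locating and analysing these statements shows where the bottlenecks are, which is the basis for tuning the database.

1) In MySQL, check whether the slow query log is enabled

    show variables like 'slow_query%';

Parameters:

slow_query_log: whether the slow query log is enabled (OFF = disabled, ON = enabled);

slow_query_log_file: where the slow query log is stored;

2) Check the slow query threshold (10 seconds by default)

    show variables like 'long%';

3) Enabling the slow log, method 1:

    set global slow_query_log=1;   # 1 enables, 0 disables

Note: after changing the setting, exit and log in again for it to take effect.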

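4) Enabling the slow log, method 2 (recommended):

Edit the my.ini configuration file (my.cnf on macOS), find the [mysqld] section, add the following lines under it, and restart the MySQL server:

    slow_query_log = 1
    slow_query_log_file=C:\mysql-5.6.40-winx64\data\localhost-slow.log
    long_query_time = 1

Parameters:

slow_query_log: whether the slow query log is enabled; 1 means enabled

slow_query_log_file: where the slow query log is stored

long_query_time: how many seconds a query must exceed before it is logged; the default is 10 seconds, changed here to 1 second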

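9. Pagination performance

First, a recap of how to read a table ten rows at a time, one page per query:

    # Page 1:
    select * from userinfo limit 0,10;
    # Page 2:
    select * from userinfo limit 10,10;
    # Page 3:
    select * from userinfo limit 20,10;
    # Page 4:
    select * from userinfo limit 30,10;
    ......
    # Page 200001
    select * from userinfo limit 2000000,10;

PS: the further back we page, the longer the query takes. With limit offset,n MySQL has to scan and discard all the rows before the offset, so later pages scan more and more data.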

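Solutions:

1) Only "previous page" and "next page" links

Prerequisite: remember the id of the first row on the current page (min_id) and of the last row (max_id).

    # Next page
    select * from userinfo where id > max_id order by id limit 10;
    # Previous page (rows come back in descending id order)
    select * from userinfo where id < min_id order by id desc limit 10;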
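The previous/next scheme can be tried end to end with a small script. The sketch below is an assumption-laden illustration rather than the author's code: it uses sqlite3 as a stand-in for MySQL, a 50-row table, and the made-up helpers next_page and prev_page; the SQL itself follows the two statements above.

```python
import sqlite3

PAGE_SIZE = 10

# Assumption: in-memory SQLite table standing in for userinfo, with ids 1..50
conn = sqlite3.connect(":memory:")
conn.execute("create table userinfo(id integer primary key, name text)")
conn.executemany("insert into userinfo(id, name) values (?, ?)",
                 [(i, "alex%d" % i) for i in range(1, 51)])

def next_page(max_id):
    # Seek past the last id of the current page instead of using an offset
    sql = "select id, name from userinfo where id > ? order by id limit ?"
    return conn.execute(sql, (max_id, PAGE_SIZE)).fetchall()

def prev_page(min_id):
    # Read backwards from the first id of the current page
    sql = "select id, name from userinfo where id < ? order by id desc limit ?"
    rows = conn.execute(sql, (min_id, PAGE_SIZE)).fetchall()
    return rows[::-1]  # restore ascending order for display

page1 = next_page(0)
page2 = next_page(page1[-1][0])   # max_id of page 1
back = prev_page(page2[0][0])     # min_id of page 2
print([row[0] for row in page2])  # [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
print(back == page1)              # True
```

Because each query starts from an indexed id, the cost of fetching a page does not grow with the page number.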

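2) Page-number links in between

    # Jump forward n pages from the current page (n = target page number - current page number).
    # max_id is the last id on the current page. LIMIT does not accept an expression,
    # so compute n*10 in the application and substitute the number.
    # The extra derived table B is needed because MySQL rejects LIMIT directly inside an IN subquery.
    select * from userinfo where id in(
        select id from (
            select id from (select id from userinfo
                where id > max_id order by id limit n*10) as A
            order by A.id desc limit 10) as B
    ) order by id;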