[Python3] Pitfall Notes: Optimization Tips 1

Date: 2020-06-14


Choosing the right data structure

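Different scenarios call for different data structures.
For example, when lookups outnumber insertions, consider whether a dict is the better fit.
In Python 3, a dict hashes each key to a position (a bucket) in a hash table,
so a lookup is O(1) on average.
A list is really an array, so the same lookup may have to scan the whole list, which is O(n).
Membership tests and similar lookups are therefore faster on a dict than on a list.
A set behaves much like a dict and also offers O(1) average lookup; in CPython it is a hash table that stores only keys.
Both dict and set compare in two steps on lookup and insertion: first the hash from __hash__, and only when the hashes match, the __eq__ check.
On insertion, a key with a new hash is written; a key that matches on both hash and __eq__ is treated as already present (a set discards it, a dict overwrites its value).
Because both containers follow the same path, the small gap between set and dict in the results below is more likely measurement noise than extra work done by set.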

import string
import time
import random

if __name__ == '__main__':
    # Build three containers (list, dict, set) holding the 26 lowercase letters a-z
    array = [i for i in string.ascii_lowercase]  # ['a', 'b', 'c', 'd', 'e', 'f'....
    dictionary = dict.fromkeys(array, 1)  # {'a': 1, 'b': 1, 'c': 1, 'd': 1....
    bag = {i for i in string.ascii_lowercase}  # {'q', 'v', 'u', 'y', 'z'...

    # set random seed
    random.seed(666)

    # With the seed fixed, generate 10,000,000 random characters:
    # some are lowercase letters, the rest are other characters
    test_data = random.choices([chr(i) for i in range(0, 123)], k=10000000)
    count1, count2, count3 = 0, 0, 0
    start = time.time()

    # Increment the counter whenever the value is a lowercase letter
    for val in test_data:
        count1 = count1 + 1 if val in array else count1

    print(count1)
    print("when using List, Execution Time: %.6f s." % (time.time() - start))  # 4.470003 s.
    start = time.time()

    for val in test_data:
        count2 = count2 + 1 if val in dictionary else count2

    print(count2)
    print("when using Dict Execution Time: %.6f s." % (time.time() - start))  # 1.020261 s.
    start = time.time()

    for val in test_data:
        count3 = count3 + 1 if val in bag else count3

    print(count3)
    print("when using Set Execution Time: %.6f s." % (time.time() - start))  # 1.045259 s.
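
The two-step comparison (hash first, then equality) can be observed directly. The following is a minimal sketch, not part of the original benchmark: `Key` and `lookup_trace` are hypothetical names introduced here, and the class simply records which special methods a membership test triggers on CPython.

```python
class Key:
    """Hypothetical key type that records which special methods are called."""

    calls = []

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        Key.calls.append("hash")
        return hash(self.name)

    def __eq__(self, other):
        Key.calls.append("eq")
        return isinstance(other, Key) and self.name == other.name


def lookup_trace(container, probe):
    """Return (found, calls) for the membership test `probe in container`."""
    Key.calls.clear()
    found = probe in container
    return found, list(Key.calls)


stored = Key("a")
as_set = {stored}
as_dict = {stored: 1}

# Equal key, different object: the hash matches, so __eq__ is consulted.
print(lookup_trace(as_set, Key("a")))   # (True, ['hash', 'eq'])
print(lookup_trace(as_dict, Key("a")))  # (True, ['hash', 'eq'])

# Different hash: the lookup fails without ever calling __eq__.
print(lookup_trace(as_set, Key("b")))   # (False, ['hash'])
```

Both containers produce the same trace, which suggests that neither does extra comparison work per lookup.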

Optimizing loops

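The basic principle is to reduce the number of iterations and the amount of work done inside the loop.
Beyond such logic-level optimization, the implementation matters too: prefer list comprehensions, generators,
and map / reduce operations over writing everything as an explicit for loop.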

import time
import random

if __name__ == '__main__':

    # set random seed
    random.seed(666)
    start = time.time()

    length = 1000000
    # With the seed fixed, generate 1,000,000 random characters:
    # some are lowercase letters, the rest are other characters
    list_exp_result = [chr(random.randint(0, 123)) for _ in range(length)]

    print(len(list_exp_result))
    print("when using list comprehension, Execution Time: %.6f s." % (time.time() - start))  # 1.195765 s.
    start = time.time()

    for_exp_result = list()
    for _ in range(length):
        for_exp_result.append(chr(random.randint(0, 123)))

    print(len(for_exp_result))
    print("when using normal for loop, Execution Time: %.6f s." % (time.time() - start))  # 1.306519 s.
    start = time.time()

    # Note: this variant skips chr(), so it does slightly less work than the two versions above
    map_exp_result = list(map(lambda v: random.randint(0, 123), range(length)))
    print(len(map_exp_result))
    print("when using map task, Execution Time: %.6f s." % (time.time() - start))  # 1.153902 s.

For a more detailed exploration, see "[Python3] Why map is faster than a for loop".
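
The timing script covers list comprehensions and map; generators and reduce can be sketched the same way. The example below is an illustration under assumptions, not the author's benchmark: the function names are hypothetical, and the task (summing squares) is chosen only because it is easy to check by hand.

```python
from functools import reduce


def squares_loop(n):
    """Explicit for loop: accumulate the sum of squares step by step."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def squares_generator(n):
    """Generator expression: values are produced lazily, no intermediate list."""
    return sum(i * i for i in range(n))


def squares_reduce(n):
    """reduce: fold the range into a single value, starting from 0."""
    return reduce(lambda acc, i: acc + i * i, range(n), 0)


print(squares_loop(10), squares_generator(10), squares_reduce(10))  # 285 285 285
```

All three return the same value; which one is fastest depends on the workload, so it is worth timing them on real data before committing to one.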

Other miscellaneous tips
- Use local variables; avoid the "global" keyword
- "if done is not None" is faster than "if done != None"
- Use chained comparisons "x < y < z" rather than "x < y and y < z"
- "while 1" was faster than "while True" in Python 2; in Python 3 True is a keyword, so the two perform the same
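- Built-in functions are usually fast, but operator.add(a, b) is generally not faster than a + b, since the function call itself adds overhead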
- To copy a list, use: new_list = list(old_list)
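
A few of the tips above can be checked with a short script. This is a rough sketch under assumptions: `middle` is a hypothetical helper used only to count evaluations, and the timeit figures vary by machine and Python version, so they are printed rather than asserted.

```python
import timeit

calls = []


def middle():
    """Hypothetical helper that records each time it is evaluated."""
    calls.append("y")
    return 5


# Chained comparison: the middle operand is evaluated once.
calls.clear()
chained = 1 < middle() < 10
chained_evals = len(calls)

# Spelled out with `and`: the middle operand is evaluated twice.
calls.clear()
spelled_out = 1 < middle() and middle() < 10
spelled_evals = len(calls)

print(chained, chained_evals, spelled_out, spelled_evals)  # True 1 True 2

# list(old_list) builds a new list object (a shallow copy),
# so appending to the copy leaves the original untouched.
old_list = [1, 2, 3]
new_list = list(old_list)
new_list.append(4)
print(old_list, new_list)  # [1, 2, 3] [1, 2, 3, 4]

# Identity test versus equality test against None; compare the timings locally.
t_is = timeit.timeit("done is not None", setup="done = 1", number=100_000)
t_ne = timeit.timeit("done != None", setup="done = 1", number=100_000)
print("is not None: %.6f s, != None: %.6f s" % (t_is, t_ne))
```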
