[Advanced Python] 14 - "Generator": calculating prime

Date: 2019-11-18


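High-Performance Programming

Key questions

• How do generators save memory?
• When is the best time to use a generator?
• How can I use itertools to build complex generator workflows?
• When is lazy evaluation beneficial, and when is it not?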

From: https://www.dataquest.io/blog/python-generators-tutorial/

• The basic terminology needed to understand generators

• What a generator is

• How to create your own generators

• How to use a generator and generator methods

• When to use a generator

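Finite sequences

Case 1: xrange, saving memory

A custom xrange built with yield computes its values one at a time instead of building the whole list up front.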

In [16]: def xrange(start, stop, step=1): 
    ...:     while start < stop: 
    ...:         yield start 
    ...:         start += step 
    ...:                                                                        
                                                                   
In [17]: for i in xrange(1,100): 
    ...:     print(i)
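To make "one at a time" concrete, here is a minimal sketch that steps the same generator by hand with next(). The variable names (gen, first, second, rest) are only illustrative.

```python
def xrange(start, stop, step=1):
    """Same generator as above: yields start, start + step, ... while below stop."""
    while start < stop:
        yield start
        start += step

gen = xrange(1, 4)      # nothing is computed yet
first = next(gen)       # runs the body up to the first yield
second = next(gen)      # resumes right after the yield
rest = list(gen)        # drains whatever is left
print(first, second, rest)   # 1 2 [3]
print(list(gen))             # [] : an exhausted generator stays empty
```

Note that this simple version only supports a positive step; with a negative step the loop condition is false immediately and nothing is yielded.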

Infinite sequences

Case 2: Fibonacci Sequence

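def fibonacci(n=None):
    """Yield Fibonacci numbers: n terms, or endlessly when n is None."""
    a, b = 0, 1
    while n is None or n > 0:
        yield b
        a, b = b, a + b
        if n is not None:
            n -= 1


def Fibonacci_Yield(n):
    # return [f for i, f in enumerate(Fibonacci_Yield_tool(n))]
    return list(fibonacci(n))

Case 3: how many Fibonacci numbers up to 5000 are odd

A custom sequence used inside a for loop.

def fibonacci_transform():
    count = 0
    for f in fibonacci():   # endless generator; the break below ends the loop
        if f > 5000:
            break
        if f % 2:
            count += 1

    return count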
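The same count can also be written as a small itertools workflow. This is only a sketch: fib and count_odd_fib are hypothetical names introduced here, and takewhile replaces the explicit break.

```python
from itertools import takewhile

def fib():
    """Hypothetical endless Fibonacci generator: 1, 1, 2, 3, 5, ..."""
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a + b

def count_odd_fib(limit=5000):
    # takewhile stops pulling from fib() at the first value above limit
    bounded = takewhile(lambda f: f <= limit, fib())
    return sum(1 for f in bounded if f % 2)

print(count_odd_fib())   # 13
```

Nothing is stored along the way: each Fibonacci number is produced, tested and dropped before the next one is requested.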

Lazy evaluation with generators

The focus here is on how generators handle big data and what advantages they offer.

Ref: Python Generators

Big Data. This is a somewhat nebulous term, and so we won’t delve into the various Big Data definitions here. Suffice to say that any Big Data file is too big to assign to a variable.

This matters most when a list is too large to load into memory in one go.

Load the beer data as if it were a big data file.

beer_data = "recipeData.csv"

lines =  (line for line in open(beer_data, encoding="ISO-8859-1"))

It is advisable to replace the bare open() here with a with ... as block, so that the file is closed reliably.
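One way to follow that advice while staying lazy is to wrap the with block in a generator function. This is a sketch; read_lines is a hypothetical helper name, not something from the original tutorial.

```python
def read_lines(path, encoding="ISO-8859-1"):
    """Yield lines one at a time; the file closes when iteration ends."""
    with open(path, encoding=encoding) as handle:
        for line in handle:
            yield line

# Usage sketch, assuming recipeData.csv is in the working directory:
# lines = read_lines("recipeData.csv")
# header = next(lines)
```

The file stays open only while the generator is being consumed; it is closed once the generator is exhausted or explicitly closed with its close() method.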

Laziness and generators

Once we ask for the next value of a generator, the old value is discarded.

Once we have gone through the entire generator, it is exhausted and can be discarded from memory as well.
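Both points can be checked with a short sketch. The sizes reported by sys.getsizeof are shallow and vary between Python versions, so only the comparison is shown, not specific numbers.

```python
import sys

squares_list = [n * n for n in range(100_000)]   # all values held at once
squares_gen = (n * n for n in range(100_000))    # values produced on demand

print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))   # True

total = sum(squares_gen)        # consumes the generator
leftover = list(squares_gen)    # nothing left to yield
print(leftover)                 # []
```

The second result is the practical cost of laziness: a generator can be walked only once, so it has to be rebuilt if the data is needed again.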

Build pipeline

beer_data = "recipeData.csv"

lines = (line for line in open(beer_data, encoding="ISO-8859-1"))  # (1) yields one line at a time
lists = (l.split(",") for l in lines)  # (2) splits that line into fields

Operation in pipeline

First take the header line: its column names become the keys. The values from the second line onward become the values.

['BeerID', 'Name', 'URL', ..., 'PrimaryTemp', 'PrimingMethod', 'PrimingAmount', 'UserId\n']

zip() pairs up the elements of the two lists, and the pairs are then converted to a dict.

Sample template

beer_data = "recipeData.csv"

lines = (line for line in open(beer_data, encoding="ISO-8859-1"))
lists = (l.split(",") for l in lines)

#-----------------------------------------------------------------------------

# Take the column names out of the generator and store them, leaving only data
columns = next(lists)

# Take these columns and use them to create an informative dictionary
beerdicts = (dict(zip(columns, data)) for data in lists)
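The template can be tried without the real file. The sketch below runs the same steps on a few made-up rows (not real recipe data) and swaps split(",") for the standard csv.reader, which is an assumption on my part: it handles quoted commas and strips the trailing newline that shows up in 'UserId\n' above.

```python
import csv
import io

# Made-up rows standing in for recipeData.csv; not real data.
sample = io.StringIO(
    'BeerID,Name,Style,ABV\n'
    '1,"Hazy, Juicy",American IPA,6.5\n'
    '2,Dark Night,Stout,7.1\n'
)

rows = csv.reader(sample)            # lazy: parses one row per next()
columns = next(rows)                 # header row becomes the dict keys
beerdicts = (dict(zip(columns, row)) for row in rows)

first = next(beerdicts)
print(first["Name"], first["ABV"])   # Hazy, Juicy 6.5
```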

Type statistics

bd["Style"] is the category key of each record and is used for the tally.

# Walk through every record and count the beer styles
beer_counts = {}
for bd in beerdicts:
    if bd["Style"] not in beer_counts:
        beer_counts[bd["Style"]] = 1
    else:
        beer_counts[bd["Style"]] += 1

# Find the most common style in the tally: beer_counts
most_popular = 0
most_popular_type = None
for beer, count in beer_counts.items():
    if count > most_popular:
        most_popular      = count
        most_popular_type = beer

>>> most_popular_type
'American IPA'

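# Use this result to process the related records.
# beerdicts was exhausted by the counting loop above, so rebuild the pipeline first.
lines = (line for line in open(beer_data, encoding="ISO-8859-1"))
lists = (l.split(",") for l in lines)
columns = next(lists)
beerdicts = (dict(zip(columns, data)) for data in lists)
abv = (float(bd["ABV"]) for bd in beerdicts if bd["Style"] == "American IPA")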
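The counting loop and the search for the maximum can be condensed with collections.Counter from the standard library. This is an alternative sketch, not the tutorial's code, and the records below are hypothetical stand-ins for the real rows.

```python
from collections import Counter

# Hypothetical records shaped like the beerdicts rows above.
beerdicts = (
    {"Style": style, "ABV": abv}
    for style, abv in [("American IPA", "6.5"), ("Stout", "7.1"), ("American IPA", "5.9")]
)

beer_counts = Counter(bd["Style"] for bd in beerdicts)   # consumes the generator
most_popular_type, most_popular = beer_counts.most_common(1)[0]
print(most_popular_type, most_popular)   # American IPA 2
```

Counter still reads the records one at a time, so only the tally itself is kept in memory.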

Model examples

1. Prime numbers

Combining next with yield

This defines primes(), a memory-friendly function for computing prime numbers.

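def _odd_iter():
    n = 1
    while True:
        n = n + 2
        yield n

# The generator keeps its position between calls, so each next() continues from there.


def _not_divisible(n):
    return lambda x: x % n > 0

# Each element x is tested against the parameter n.


def primes():
    yield 2
    it = _odd_iter()                        # (1) initial lazy sequence: 3, 5, 7, ...

    while True:
        n = next(it)                        # (2) n has survived every filter built so far
        yield n
        it = filter(_not_divisible(n), it)  # (3) build a new sequence; it is still infinite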

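The neat part is that, logically, it always stands for an infinite sequence, even though no infinite sequence could physically be held in memory.

For example, what happens when n = 9? First of all, n can never equal 9: the filter added for 3 has already removed it from the sequence. Otherwise 9 would be yielded "by accident" as a prime.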
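A runnable sketch of the function above, using itertools.islice to take a finite slice of the endless generator:

```python
from itertools import islice

def _odd_iter():
    n = 1
    while True:
        n += 2
        yield n

def _not_divisible(n):
    # n is fixed per call, so each returned lambda keeps its own divisor
    return lambda x: x % n > 0

def primes():
    yield 2
    it = _odd_iter()
    while True:
        n = next(it)
        yield n
        it = filter(_not_divisible(n), it)   # one more filter layer per prime

first_ten = list(islice(primes(), 10))
print(first_ten)   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Each prime found wraps the sequence in another filter, so every candidate passes through a growing chain of checks. That keeps memory use low but makes this version slow for large primes.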

Problems caused by closures

Stack Overflow: How to explain this "lambda in filter changes the result when calculate primes"

This question touches on how lambda is used and on the risks of closures: [Python] 07 - Statements --> Functions

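# odd_iter = filter(not_divisible(odd), odd_iter)  # <--(1)
odd_iter = filter((lambda x: x%odd>0) , odd_iter)  # <--(2)

When a lazy mechanism such as yield is involved, use lambda with care and make sure the "internal variable" it captures is protected from later changes.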
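The difference between variants (1) and (2) can be reproduced with a short sketch. The function names are hypothetical, and the n=n default argument is one common way to bind the current value; it is not taken from the original post.

```python
from itertools import islice

def _odd_iter():
    n = 1
    while True:
        n += 2
        yield n

def primes_inline_lambda():
    yield 2
    it = _odd_iter()
    while True:
        n = next(it)
        yield n
        # every lambda looks up n when it runs, so all filters share the latest n
        it = filter(lambda x: x % n > 0, it)

def primes_bound_lambda():
    yield 2
    it = _odd_iter()
    while True:
        n = next(it)
        yield n
        # n=n freezes the current value inside this particular lambda
        it = filter(lambda x, n=n: x % n > 0, it)

print(list(islice(primes_inline_lambda(), 8)))   # [2, 3, 5, 7, 9, 11, 13, 15]
print(list(islice(primes_bound_lambda(), 8)))    # [2, 3, 5, 7, 11, 13, 17, 19]
```

Because the filters run lazily, the inline lambda only ever tests a candidate against the most recently yielded value, which lets 9 and 15 slip through.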

An "efficient" approach to prime generation

# Sieve of Eratosthenes
# Code by David Eppstein, UC Irvine, 28 Feb 2002
# http://code.activestate.com/recipes/117119/

def gen_primes():
    """ Generate an infinite sequence of prime numbers.
    """
    # Maps composites to primes witnessing their compositeness.
    # This is memory efficient, as the sieve is not "run forward"
    # indefinitely, but only as long as required by the current
    # number being tested.
    #
    D = {}
    
    # The running integer that's checked for primeness
    q = 2
    
    while True:

        # Debug output: show the sieve state before q is tested
        print()
        print("loop: {}, {}".format(q, D))

        if q not in D:
            # q is a new prime.
            # Yield it and mark its first multiple that isn't
            # already marked in previous iterations
            # 
            yield q
            D[q * q] = [q]
        else:
            # q is composite. D[q] is the list of primes that
            # divide it. Since we've reached q, we no longer
            # need it in the map, but we'll mark the next 
            # multiples of its witnesses to prepare for larger
            # numbers
            # 
            for p in D[q]:
                D.setdefault(p + q, []).append(p)
                print("else: {}, {}".format(q, D))

            del D[q]
        
        q += 1

...

loop: 2, {}
2

loop: 3, {4: [2]}
3

loop: 4, {4: [2], 9: [3]}
else: 4, {4: [2], 9: [3], 6: [2]}

loop: 5, {9: [3], 6: [2]}
5

loop: 6, {9: [3], 6: [2], 25: [5]}
else: 6, {9: [3], 6: [2], 25: [5], 8: [2]}

loop: 7, {9: [3], 25: [5], 8: [2]}
7

loop: 8, {9: [3], 25: [5], 8: [2], 49: [7]}
else: 8, {9: [3], 25: [5], 8: [2], 49: [7], 10: [2]}

loop: 9, {9: [3], 25: [5], 49: [7], 10: [2]}
else: 9, {9: [3], 25: [5], 49: [7], 10: [2], 12: [3]}

loop: 10, {25: [5], 49: [7], 10: [2], 12: [3]}
else: 10, {25: [5], 49: [7], 10: [2], 12: [3, 2]}

loop: 11, {25: [5], 49: [7], 12: [3, 2]}
11

loop: 12, {25: [5], 49: [7], 12: [3, 2], 121: [11]}
else: 12, {25: [5], 49: [7], 12: [3, 2], 121: [11], 15: [3]}
else: 12, {25: [5], 49: [7], 12: [3, 2], 121: [11], 15: [3], 14: [2]}

loop: 13, {25: [5], 49: [7], 121: [11], 15: [3], 14: [2]}
13

loop: 14, {25: [5], 49: [7], 121: [11], 15: [3], 14: [2], 169: [13]}
else: 14, {25: [5], 49: [7], 121: [11], 15: [3], 14: [2], 169: [13], 16: [2]}

loop: 15, {25: [5], 49: [7], 121: [11], 15: [3], 169: [13], 16: [2]}
else: 15, {25: [5], 49: [7], 121: [11], 15: [3], 169: [13], 16: [2], 18: [3]}

loop: 16, {25: [5], 49: [7], 121: [11], 169: [13], 16: [2], 18: [3]}
else: 16, {25: [5], 49: [7], 121: [11], 169: [13], 16: [2], 18: [3, 2]}

loop: 17, {25: [5], 49: [7], 121: [11], 169: [13], 18: [3, 2]}
17

loop: 18, {25: [5], 49: [7], 121: [11], 169: [13], 18: [3, 2], 289: [17]}
else: 18, {25: [5], 49: [7], 121: [11], 169: [13], 18: [3, 2], 289: [17], 21: [3]}
else: 18, {25: [5], 49: [7], 121: [11], 169: [13], 18: [3, 2], 289: [17], 21: [3], 20: [2]}

loop: 19, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 20: [2]}
19

loop: 20, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 20: [2], 361: [19]}
else: 20, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 20: [2], 361: [19], 22: [2]}

loop: 21, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 361: [19], 22: [2]}
else: 21, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 361: [19], 22: [2], 24: [3]}

loop: 22, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 22: [2], 24: [3]}
else: 22, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 22: [2], 24: [3, 2]}

loop: 23, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 24: [3, 2]}
23

loop: 24, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 24: [3, 2], 529: [23]}
else: 24, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 24: [3, 2], 529: [23], 27: [3]}
else: 24, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 24: [3, 2], 529: [23], 27: [3], 26: [2]}

loop: 25, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 26: [2]}
else: 25, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 26: [2], 30: [5]}

loop: 26, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 26: [2], 30: [5]}
else: 26, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 26: [2], 30: [5], 28: [2]}

loop: 27, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 30: [5], 28: [2]}
else: 27, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 30: [5, 3], 28: [2]}

loop: 28, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 30: [5, 3], 28: [2]}
else: 28, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 30: [5, 3, 2], 28: [2]}

loop: 29, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 30: [5, 3, 2]}
29
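For everyday use the debug prints can be dropped. Below is a sketch of the same incremental sieve without them; gen_primes_quiet and composites are names introduced here for illustration.

```python
from itertools import islice

def gen_primes_quiet():
    composites = {}   # maps an upcoming composite to the primes that divide it
    q = 2
    while True:
        if q not in composites:
            yield q                      # q is prime
            composites[q * q] = [q]      # first multiple not already covered
        else:
            for p in composites[q]:
                # slide each witness prime forward to its next multiple
                composites.setdefault(p + q, []).append(p)
            del composites[q]            # q itself is no longer needed
        q += 1

first_ten = list(islice(gen_primes_quiet(), 10))
print(first_ten)   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

As the trace shows, the dictionary holds one pending composite per prime found so far, so memory grows with the number of primes rather than with the range scanned.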

End.
