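• How do generators save memory?
• When is the best time to use a generator?
• How can I use itertools to build complex generator workflows?
• When is lazy evaluation beneficial, and when is it not?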
From: https://www.dataquest.io/blog/python-generators-tutorial/
• The basic terminology needed to understand generators
• What a generator is
• How to create your own generators
• How to use a generator and generator methods
• When to use a generator
A custom xrange written with yield computes its values one at a time, on demand.
In [16]: def xrange(start, stop, step=1):
    ...:     while start < stop:
    ...:         yield start
    ...:         start += step
    ...:

In [17]: for i in xrange(1,100):
    ...:     print(i)
def fibonacci(n):
    a, b = 0, 1
    while n > 0:
        yield b
        a, b = b, a + b
        n -= 1

def Fibonacci_Yield(n):
    # return [f for i, f in enumerate(Fibonacci_Yield_tool(n))]
    return list(fibonacci(n))
A custom sequence consumed in a for loop.
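def fibonacci_transform():
    count = 0
    # fibonacci(n) as defined above requires a term count;
    # 100 terms is far more than needed to pass 5000.
    for f in fibonacci(100):
        if f > 5000:
            break
        if f % 2:
            count += 1
    return count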
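The same count can also be written against an unbounded generator, with itertools.takewhile supplying the cut-off instead of an explicit break. This is a minimal sketch: fibonacci_forever and count_odd_fibonacci are hypothetical names that do not appear in the original notes.

```python
from itertools import takewhile


def fibonacci_forever():
    """Hypothetical unbounded variant: yields 1, 1, 2, 3, 5, ... without end."""
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a + b


def count_odd_fibonacci(limit=5000):
    """Count the odd Fibonacci numbers that do not exceed limit."""
    # takewhile stops pulling values as soon as one exceeds the limit,
    # so the infinite generator is never run further than needed.
    bounded = takewhile(lambda f: f <= limit, fibonacci_forever())
    return sum(1 for f in bounded if f % 2)


odd_count = count_odd_fibonacci()
print(odd_count)
```

Nothing is stored along the way: each value is produced, tested, and dropped before the next one is computed.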
The main focus here is how generators handle big data, and what advantages they offer.
Ref: Python Generators
Big Data. This is a somewhat nebulous term, and so we won’t delve into the various Big Data definitions here. Suffice to say that any Big Data file is too big to assign to a variable.
This matters most when a list is too large to load into memory all at once.
beer_data = "recipeData.csv"
lines = (line for line in open(beer_data, encoding="ISO-8859-1"))
It is better to replace the bare open here with a with ... as block, so that the file is closed reliably.
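One way to combine with ... as and lazy reading is to put the with block inside a generator function. A generator expression created inside a with block would lose its file as soon as the block exits. The sketch below assumes a hypothetical helper named read_lines and a throwaway temporary file standing in for recipeData.csv.

```python
import os
import tempfile


def read_lines(path, encoding="ISO-8859-1"):
    """Yield lines lazily; the file is closed once the generator finishes."""
    with open(path, encoding=encoding) as handle:
        for line in handle:
            yield line


# Hypothetical stand-in for recipeData.csv, so the example can run anywhere.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="ISO-8859-1"
) as tmp:
    tmp.write("BeerID,Name\n1,Sample Ale\n")
    sample_path = tmp.name

lines = read_lines(sample_path)   # nothing has been read yet
collected = list(lines)           # reading happens here, line by line
os.remove(sample_path)
print(collected)
```

The file stays open only while the generator is being consumed, and the with block closes it when iteration ends.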
Once we ask for the next value of a generator, the old value is discarded.
Once we go through the entire generator, it is discarded from memory as well.
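Both points can be checked directly. The following sketch uses made-up variable names and an arbitrary size of 100,000 items. It compares the size of a list object with that of an equivalent generator, then shows that a generator can only be walked once.

```python
import sys

squares_list = [n * n for n in range(100_000)]
squares_gen = (n * n for n in range(100_000))

# getsizeof reports only the container itself (for the list, its array of
# references), yet the gap is already large.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

first = next(squares_gen)    # 0; not kept anywhere by the generator
second = next(squares_gen)   # 1; the generator has moved past 0

remaining = sum(1 for _ in squares_gen)   # walk the rest of the values
leftover = list(squares_gen)              # the generator is now exhausted
print(first, second, remaining, leftover)
```

The exact byte counts depend on the Python build, which is why they are printed rather than stated here.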
beer_data = "recipeData.csv"
lines = (line for line in open(beer_data, encoding="ISO-8859-1"))  # (1) yields one line at a time
lists = (l.split(",") for l in lines)  # (2) splits that line into fields
First take the title row: its column names become the keys. The values from the second row onward become the values.
['BeerID', 'Name', 'URL', ..., 'PrimaryTemp', 'PrimingMethod', 'PrimingAmount', 'UserId\n']
zip() pairs up the elements of the two lists, and the pairs are then converted to a dict.
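A small illustration of that pairing follows. The column names are taken from the header shown above, while the data row is made up for the example.

```python
columns = ["BeerID", "Name", "URL"]
data = ["1", "Sample Ale", "/recipe/1"]   # hypothetical row, not from the real file

# zip yields ("BeerID", "1"), ("Name", "Sample Ale"), ("URL", "/recipe/1");
# dict turns those pairs into key/value entries.
record = dict(zip(columns, data))
print(record)

# zip stops at the shorter input, so a short row silently loses columns.
short_record = dict(zip(columns, ["2"]))
print(short_record)
```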
Sample template
beer_data = "recipeData.csv"
lines = (line for line in open(beer_data, encoding="ISO-8859-1"))
lists = (l.split(",") for l in lines)
#-----------------------------------------------------------------------------
# Take the column names out of the generator and store them, leaving only data
columns = next(lists)
# Take these columns and use them to create an informative dictionary
beerdicts = (dict(zip(columns, data)) for data in lists)
bd["Style"] is the category key of each record and is used for the tally.
# Walk through every record and count the beer styles
beer_counts = {}
for bd in beerdicts:
    if bd["Style"] not in beer_counts:
        beer_counts[bd["Style"]] = 1
    else:
        beer_counts[bd["Style"]] += 1

# The style counts are now in beer_counts; find the most common style
most_popular = 0
most_popular_type = None
for beer, count in beer_counts.items():
    if count > most_popular:
        most_popular = count
        most_popular_type = beer

most_popular_type
>>> "American IPA"
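# Use that result to process the related data.
# Note: the counting loop above has already exhausted beerdicts, so the
# pipeline must be rebuilt first; otherwise abv yields nothing.
abv = (float(bd["ABV"]) for bd in beerdicts if bd["Style"] == "American IPA")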
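The whole pipeline can be tried without the real file. The sketch below makes several substitutions that are assumptions, not part of the original notes: a few invented rows in SAMPLE replace recipeData.csv, csv.reader replaces split(",") so that quoted commas are handled, collections.Counter replaces the hand-written tally, and beer_dicts is a hypothetical helper that builds a fresh pipeline on every call.

```python
import csv
import io
from collections import Counter

# Invented sample rows; the real file has many more columns.
SAMPLE = (
    "BeerID,Name,Style,ABV\n"
    "1,First,American IPA,6.5\n"
    "2,Second,Saison,5.0\n"
    "3,Third,American IPA,7.1\n"
)


def beer_dicts(text):
    """Return a fresh generator of row dictionaries each time it is called."""
    rows = csv.reader(io.StringIO(text))
    columns = next(rows)                      # header row becomes the keys
    return (dict(zip(columns, row)) for row in rows)


# First pass: tally the styles.
style_counts = Counter(bd["Style"] for bd in beer_dicts(SAMPLE))
most_popular_type, most_popular = style_counts.most_common(1)[0]

# Second pass: a new pipeline is required, because the first was consumed.
abv = [
    float(bd["ABV"])
    for bd in beer_dicts(SAMPLE)
    if bd["Style"] == most_popular_type
]

# A generator that has been walked once yields nothing the second time.
once = beer_dicts(SAMPLE)
first_walk = len(list(once))
second_walk = len(list(once))

print(most_popular_type, most_popular, abv, first_walk, second_walk)
```

Wrapping the pipeline in a function is one simple way to get a second pass; for a large file this means reading it twice, which trades time for memory.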
This defines primes(), a memory-friendly function for computing prime numbers.
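def _odd_iter():
    n = 1
    while True:
        n = n + 2
        yield n
# The generator keeps its position (a resume point); the next call continues from there

def _not_divisible(n):
    return lambda x: x % n > 0
# Each element x is tested once, with n as the captured parameter

def primes():
    yield 2
    it = _odd_iter()  # (1) the initial lazy sequence
    while True:
        n = next(it)  # (2) n is computed from the state left by earlier iterations
        yield n
        it = filter(_not_divisible(n), it)  # (3) build a new sequence; the sequence it represents is infinite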
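The neat part is that, logically, it always represents an infinite sequence, even though no infinite sequence could ever physically exist in memory.
For example, what happens when n = 9? First of all, n can never equal 9: the filter installed for 3 removes it before it reaches next(it). If it did get through, it would be yielded by mistake as a prime.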
Stack Overflow: How to explain this "lambda in filter changes the result when calculate primes"
This question concerns how lambda is used and the risks of closures: [Python] 07 - Statements --> Functions
# odd_iter = filter(not_divisible(odd), odd_iter)   # <--(1)
odd_iter = filter((lambda x: x%odd>0) , odd_iter)   # <--(2)
When yield's lazy mechanism is involved, use lambda with care, and make sure to protect the "internal variables" it captures.
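The difference between variants (1) and (2) can be reproduced in a few lines. In the sketch below, odd_numbers, primes_late_binding, and primes_bound are hypothetical names chosen for this illustration. The first generator uses a bare lambda that looks up odd only when the filter actually runs; the second freezes the current value through a default argument, which has the same effect as calling a factory such as not_divisible(odd).

```python
from itertools import islice


def odd_numbers():
    """Yield 3, 5, 7, ... without end."""
    n = 1
    while True:
        n += 2
        yield n


def primes_late_binding():
    """Variant (2): every stacked lambda shares the same, changing 'odd'."""
    yield 2
    it = odd_numbers()
    while True:
        odd = next(it)
        yield odd
        # 'odd' is read when the filter runs, not when the lambda is created.
        it = filter(lambda x: x % odd > 0, it)


def primes_bound():
    """Variant (1) in spirit: each lambda keeps its own divisor."""
    yield 2
    it = odd_numbers()
    while True:
        odd = next(it)
        yield odd
        # The default argument captures the value of 'odd' at creation time.
        it = filter(lambda x, odd=odd: x % odd > 0, it)


wrong = list(islice(primes_late_binding(), 8))
right = list(islice(primes_bound(), 8))
print(wrong)   # 9 and 15 slip through
print(right)
```

In the late-binding version every filter tests a candidate against the most recently yielded number only, so each odd number x is checked against x - 2 and always passes.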
# Sieve of Eratosthenes
# Code by David Eppstein, UC Irvine, 28 Feb 2002
# http://code.activestate.com/recipes/117119/
def gen_primes():
    """ Generate an infinite sequence of prime numbers.
    """
    # Maps composites to primes witnessing their compositeness.
    # This is memory efficient, as the sieve is not "run forward"
    # indefinitely, but only as long as required by the current
    # number being tested.
    #
    D = {}

    # The running integer that's checked for primeness
    q = 2

    while True:
        # Debug trace; this print is inferred from the "loop:" lines in the output below
        print("loop: {}, {}".format(q, D))

        if q not in D:
            # q is a new prime.
            # Yield it and mark its first multiple that isn't
            # already marked in previous iterations
            #
            yield q
            D[q * q] = [q]
        else:
            # q is composite. D[q] is the list of primes that
            # divide it. Since we've reached q, we no longer
            # need it in the map, but we'll mark the next
            # multiples of its witnesses to prepare for larger
            # numbers
            #
            for p in D[q]:
                D.setdefault(p + q, []).append(p)
                print("else: {}, {}".format(q, D))
            del D[q]

        q += 1
...
loop: 2, {}
2
loop: 3, {4: [2]}
3
loop: 4, {4: [2], 9: [3]}
else: 4, {4: [2], 9: [3], 6: [2]}
loop: 5, {9: [3], 6: [2]}
5
loop: 6, {9: [3], 6: [2], 25: [5]}
else: 6, {9: [3], 6: [2], 25: [5], 8: [2]}
loop: 7, {9: [3], 25: [5], 8: [2]}
7
loop: 8, {9: [3], 25: [5], 8: [2], 49: [7]}
else: 8, {9: [3], 25: [5], 8: [2], 49: [7], 10: [2]}
loop: 9, {9: [3], 25: [5], 49: [7], 10: [2]}
else: 9, {9: [3], 25: [5], 49: [7], 10: [2], 12: [3]}
loop: 10, {25: [5], 49: [7], 10: [2], 12: [3]}
else: 10, {25: [5], 49: [7], 10: [2], 12: [3, 2]}
loop: 11, {25: [5], 49: [7], 12: [3, 2]}
11
loop: 12, {25: [5], 49: [7], 12: [3, 2], 121: [11]}
else: 12, {25: [5], 49: [7], 12: [3, 2], 121: [11], 15: [3]}
else: 12, {25: [5], 49: [7], 12: [3, 2], 121: [11], 15: [3], 14: [2]}
loop: 13, {25: [5], 49: [7], 121: [11], 15: [3], 14: [2]}
13
loop: 14, {25: [5], 49: [7], 121: [11], 15: [3], 14: [2], 169: [13]}
else: 14, {25: [5], 49: [7], 121: [11], 15: [3], 14: [2], 169: [13], 16: [2]}
loop: 15, {25: [5], 49: [7], 121: [11], 15: [3], 169: [13], 16: [2]}
else: 15, {25: [5], 49: [7], 121: [11], 15: [3], 169: [13], 16: [2], 18: [3]}
loop: 16, {25: [5], 49: [7], 121: [11], 169: [13], 16: [2], 18: [3]}
else: 16, {25: [5], 49: [7], 121: [11], 169: [13], 16: [2], 18: [3, 2]}
loop: 17, {25: [5], 49: [7], 121: [11], 169: [13], 18: [3, 2]}
17
loop: 18, {25: [5], 49: [7], 121: [11], 169: [13], 18: [3, 2], 289: [17]}
else: 18, {25: [5], 49: [7], 121: [11], 169: [13], 18: [3, 2], 289: [17], 21: [3]}
else: 18, {25: [5], 49: [7], 121: [11], 169: [13], 18: [3, 2], 289: [17], 21: [3], 20: [2]}
loop: 19, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 20: [2]}
19
loop: 20, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 20: [2], 361: [19]}
else: 20, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 20: [2], 361: [19], 22: [2]}
loop: 21, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 361: [19], 22: [2]}
else: 21, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 21: [3], 361: [19], 22: [2], 24: [3]}
loop: 22, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 22: [2], 24: [3]}
else: 22, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 22: [2], 24: [3, 2]}
loop: 23, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 24: [3, 2]}
23
loop: 24, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 24: [3, 2], 529: [23]}
else: 24, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 24: [3, 2], 529: [23], 27: [3]}
else: 24, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 24: [3, 2], 529: [23], 27: [3], 26: [2]}
loop: 25, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 26: [2]}
else: 25, {25: [5], 49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 26: [2], 30: [5]}
loop: 26, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 26: [2], 30: [5]}
else: 26, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 26: [2], 30: [5], 28: [2]}
loop: 27, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 30: [5], 28: [2]}
else: 27, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 27: [3], 30: [5, 3], 28: [2]}
loop: 28, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 30: [5, 3], 28: [2]}
else: 28, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 30: [5, 3, 2], 28: [2]}
loop: 29, {49: [7], 121: [11], 169: [13], 289: [17], 361: [19], 529: [23], 30: [5, 3, 2]}
29
End.