PyTips 0x10 - Heaps and Priority Queues in Python

Project repository: https://git.io/pytipspython

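Python's built-in heapq and queue modules provide heap and priority queue structures respectively. The priority queue queue.PriorityQueue is itself implemented on top of heapq, so this post focuses on heapq.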

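A heap is a special form of complete binary tree in which every parent is ordered relative to its children: in a max-heap the parent is never smaller than its children, and in a min-heap it is never larger. heapq implements a min-heap, so in Python a heap can be represented by a list that satisfies heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] for every k whose children exist (and this is exactly what heapq does). Heaps can be used to implement schedulers (for example, see: Python 3.5 coroutines), and even more commonly priority queues (for example: ImageColorTheme).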
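To make the invariant concrete, here is a minimal sketch of a checker. The function name is_min_heap is made up for illustration and is not part of heapq.

```python
def is_min_heap(items):
    """Return True if the list satisfies the min-heap invariant."""
    n = len(items)
    for k in range(n):
        left, right = 2 * k + 1, 2 * k + 2
        # Only compare against children that actually exist.
        if left < n and items[k] > items[left]:
            return False
        if right < n and items[k] > items[right]:
            return False
    return True

print(is_min_heap([1, 3, 2]))  # True: 1 <= 3 and 1 <= 2
print(is_min_heap([3, 1, 2]))  # False: 3 > 1
```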

heapq provides the following functions:

import heapq
print(heapq.__all__)
['heappush', 'heappop', 'heapify', 'heapreplace', 'merge', 'nlargest', 'nsmallest', 'heappushpop']

Because a heap is implemented as a list, we can build one directly from a list:

from heapq import *
heap = []
heappush(heap, 3)
heappush(heap, 2)
heappush(heap, 1)
print(heap)
[1, 3, 2]

Make sure the list is heapified before popping or sorting

Alternatively, heapify converts an ordinary list into a heap:

heap = list(reversed(range(5)))
print("List: ", heap)
heapify(heap)
print("Heap: ", heap)
List:  [4, 3, 2, 1, 0]
Heap:  [0, 1, 2, 4, 3]

Every element popped from a heap is the smallest one remaining (which is why heapsort can be built on top of it):

heap = [5,4,3,2,1]
heapify(heap)
print(heappop(heap))
print(heappop(heap))
print(heappop(heap))
1
2
3
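Popping until the heap is empty yields the elements in ascending order, which is the idea behind heapsort. A minimal sketch, where heap_sort is a hypothetical helper name and the input is copied so the caller's list is left untouched:

```python
import heapq

def heap_sort(iterable):
    """Return a new ascending list built by repeatedly popping a heap."""
    heap = list(iterable)      # copy, so the input is not modified
    heapq.heapify(heap)        # O(n) conversion into a min-heap
    # Each heappop returns the smallest remaining element.
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heap_sort([5, 4, 3, 2, 1]))  # [1, 2, 3, 4, 5]
```

In everyday code sorted() is the simpler choice; this version only illustrates how the heap property produces a sorted sequence.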

Priority queue

queue.PriorityQueue is really just a thin wrapper around heapq that calls its heappush/heappop functions directly:

from queue import PriorityQueue as PQueue
pq = PQueue()
pq.put((5 * -1, 'Python'))
pq.put((4 * -1, 'C'))
pq.put((3 * -1, 'Js'))
print("Inside PriorityQueue: ", pq.queue)  # internal storage
while not pq.empty():
    print(pq.get()[1])
Inside PriorityQueue:  [(-5, 'Python'), (-4, 'C'), (-3, 'Js')]
Python
C
Js

Because heapq is a min-heap, while a PriorityQueue is usually used so that items with a higher priority come out first, the priority has to be multiplied by -1.
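The same negation trick works with heapq directly. A small sketch, assuming the priorities are plain numbers (the variable names are illustrative):

```python
import heapq

scores = [3, 5, 4]

# Store negated values so the largest score becomes the smallest key.
max_heap = [-s for s in scores]
heapq.heapify(max_heap)

# Negate again on the way out to recover the original values.
largest_first = [-heapq.heappop(max_heap) for _ in range(len(scores))]
print(largest_first)  # [5, 4, 3]
```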

A sorted list is always a heap, but not the other way around

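Note that although a heap is implemented with a list, a list that has not been processed by heapify() is still just an ordinary list, whereas every heappush and heappop operation restores the heap ordering. In addition, a heap list is not necessarily fully sorted, but a list sorted with list.sort() is always a valid heap: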

import random
lst = [random.randrange(1, 100) for _ in range(5)]
lst.sort()
print("List: ", lst)
print("Popped: ", heappop(lst))
heappush(lst, 4)
print("Heap: ", lst)
List:  [24, 55, 81, 83, 87]
Popped:  24
Heap:  [4, 55, 81, 87, 83]
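The flip side is worth seeing once: heappop on a list that is not heap-ordered does not raise an error, it simply returns whatever sits at index 0. A short sketch with illustrative variable names:

```python
import heapq

plain = [3, 1, 2]                  # not a heap: plain[0] is not the minimum

# heappop trusts the invariant and returns the first element.
wrong = heapq.heappop(list(plain))

fixed = list(plain)
heapq.heapify(fixed)               # establish the invariant first
right = heapq.heappop(fixed)

print(wrong, right)  # 3 1
```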

The N largest/smallest items

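heapq also provides nlargest and nsmallest for retrieving the n largest or smallest items. Both accept any iterable, so the heapify call below is not strictly required: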

heap = [random.randrange(1, 1000) for _ in range(1000)]
heapify(heap)
print("N largest: ", nlargest(10, heap))
print("N smallest: ", nsmallest(10, heap))
print(len(heap))  # the list is not modified in place
N largest:  [999, 999, 998, 994, 992, 991, 990, 988, 985, 982]
N smallest:  [1, 1, 1, 2, 4, 5, 5, 6, 6, 9]
1000
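Both functions also take an optional key argument, which is handy when the items are records rather than bare numbers. A small sketch using made-up sample data:

```python
import heapq

# (name, score) pairs; a plain list, no heapify needed.
langs = [("Python", 5), ("C", 4), ("Js", 3), ("Go", 2)]

# key selects the field to compare, as with sorted().
top2 = heapq.nlargest(2, langs, key=lambda pair: pair[1])
bottom2 = heapq.nsmallest(2, langs, key=lambda pair: pair[1])

print(top2)     # [('Python', 5), ('C', 4)]
print(bottom2)  # [('Go', 2), ('Js', 3)]
```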

Merging (sorted)

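merge combines several already-sorted inputs into a single sorted iterator. The inputs must be fully sorted, not merely heap-ordered, which is why the example sorts them first: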

heapA = sorted([random.randrange(1, 100) for _ in range(3)])
heapB = sorted([random.randrange(1, 100) for _ in range(3)])

merged = []
for i in merge(heapA, heapB):
    merged.append(i)
print(merged)
[5, 29, 66, 66, 70, 99]
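merge accepts any number of sorted inputs and returns a lazy iterator, so nothing is consumed until it is iterated. A minimal sketch with three small hand-written lists:

```python
import heapq

a = [1, 4, 7]
b = [2, 5, 8]
c = [3, 6, 9]

merged_iter = heapq.merge(a, b, c)  # lazy: no work is done yet
first = next(merged_iter)           # pulls only the smallest item
rest = list(merged_iter)            # consumes the remainder

print(first, rest)  # 1 [2, 3, 4, 5, 6, 7, 8, 9]
```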

The last two functions, heapreplace and heappushpop, are respectively equivalent to:

lstA = [1,2,3,4,5]
lstB = [1,2,3,4,5]

# heapreplace: pop the smallest item first, then push the new one
popped = heapreplace(lstA, 0)
print("lstA: ", lstA, "popped: ", popped)

# is equivalent to (same popped value and same elements; the list layout may differ)
popped = heappop(lstB)
heappush(lstB, 0)
print("lstB: ", lstB, "popped: ", popped)

print("*"*30)

# heappushpop: push the new item first, then pop the smallest one
popped = heappushpop(lstA, 9)
print("lstA: ", lstA, "popped: ", popped)

# is equivalent to...
heappush(lstB, 9)
popped = heappop(lstB)
print("lstB: ", lstB, "popped: ", popped)
lstA:  [0, 2, 3, 4, 5] popped:  1
lstB:  [0, 2, 3, 5, 4] popped:  1
******************************
lstA:  [2, 4, 3, 9, 5] popped:  0
lstB:  [2, 4, 3, 5, 9] popped:  0

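Both functions run more efficiently than the equivalent separate calls. With heapreplace, though, check whether the replacement value is larger than heap[0]; if it is not, a more efficient approach is available, because a value no larger than the current minimum can be written to the root directly without breaking the heap: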

item = 0
lstA = [1,2,3,4,5]
if item < lstA[0]:
    # replace the root directly; no sifting is needed
    popped = lstA[0]
    lstA[0] = item
    print("lstA: ", lstA, "popped: ", popped)
lstA:  [0, 2, 3, 4, 5] popped:  1
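A common place where heappushpop pays off is keeping only the k largest values from a stream with a fixed-size heap. The sketch below is one way to do it; k_largest is a hypothetical helper name, not a heapq function.

```python
import heapq

def k_largest(stream, k):
    """Return the k largest values from stream, largest first."""
    heap = []  # min-heap holding the best k values seen so far
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)
        else:
            # Push x, then drop the smallest, in a single operation.
            heapq.heappushpop(heap, x)
    return sorted(heap, reverse=True)

print(k_largest([5, 1, 9, 3, 7, 8], 3))  # [9, 8, 7]
```

The heap never grows beyond k items, so memory stays bounded no matter how long the stream is. For data that already fits in memory, nlargest does the same job in one call.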

Feel free to follow the official account PyHub!
