Collections--Counter

Date: 2019-11-06

Tags: collections, counter


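A Counter is a dict subclass for counting hashable objects. It is a collection in which elements are stored as dictionary keys and their counts are stored as dictionary values. Counts may be any integer value, including zero and negative numbers.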

Elements are counted from an iterable or initialized from another mapping (or counter):

>>> from collections import Counter
>>> c = Counter()                           # a new, empty counter
>>> c = Counter('gallahad')                 # a new counter from an iterable
>>> c = Counter({'red': 4, 'blue': 2})      # a new counter from a mapping
>>> c = Counter(cats=4, dogs=8)             # a new counter from keyword args

Counter objects have a dictionary interface, except that looking up a key with no record returns a count of 0 instead of raising a KeyError.
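
As a minimal sketch of that behavior (the item names here are made up for illustration), the snippet below checks that a lookup of a missing key returns 0 without adding the key, and that setting a count to zero is not the same as deleting the entry:

```python
from collections import Counter

c = Counter(['eggs', 'ham'])   # illustrative data, not from the original post

# Looking up a missing key returns 0 and does not insert the key.
missing = c['bacon']
has_bacon = 'bacon' in c

# Setting a count to zero keeps the key in the counter; del removes it.
c['ham'] = 0
ham_kept = 'ham' in c
del c['ham']
ham_removed = 'ham' not in c

print(missing, has_bacon, ham_kept, ham_removed)
```

Use del when an element should disappear from the counter entirely rather than remain with a zero count.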

Beyond the standard dictionary methods, Counter objects provide three additional methods:

1. `elements`()

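Returns an iterator over the elements, repeating each one as many times as its count. On Python 3.7 and later, elements are returned in the order first encountered; on earlier versions the order is arbitrary. If an element's count is less than one, elements() ignores it.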

>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> sorted(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']
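
One way elements() can be useful, sketched here with made-up data, is to expand a counter back into the individual items it represents. The example treats a counter as a map from prime factor to exponent and multiplies the expanded factors together:

```python
from collections import Counter

# Illustrative data: prime factor -> exponent (2**2 * 3**3 * 17**1).
prime_factors = Counter({2: 2, 3: 3, 17: 1})

product = 1
for factor in prime_factors.elements():   # each factor repeated by its count
    product *= factor

print(product)
```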

2. `most_common`([n])

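Returns a list of the n most common elements and their counts, from most common to least common. If n is omitted or None, most_common() returns all elements in the counter. On Python 3.7 and later, elements with equal counts are ordered by first encounter; on earlier versions their relative order is arbitrary: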

>>> Counter('abracadabra').most_common(3)
[('a', 5), ('b', 2), ('r', 2)]
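
The following sketch, which reuses the same string, illustrates the effect of passing or omitting n and checks that the counts in the result never increase from one entry to the next:

```python
from collections import Counter

letters = Counter('abracadabra')

top_one = letters.most_common(1)     # only the single most frequent element
everything = letters.most_common()   # all elements, most frequent first

# Counts are non-increasing along the result list.
counts = [count for _, count in everything]
is_sorted = counts == sorted(counts, reverse=True)

print(top_one)
print(len(everything), is_sorted)
```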

3. `subtract`([iterable-or-mapping])

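Subtracts elements from an iterable or from another mapping (or counter). It works like dict.update(), but subtracts counts instead of replacing them. Both inputs and outputs may be zero or negative.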

>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> d = Counter(a=1, b=2, c=3, d=4)
>>> c.subtract(d)
>>> c
Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})
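
subtract() also accepts a plain iterable, in which case each occurrence lowers the corresponding count by one. The sketch below uses a made-up stock counter to show this, and contrasts it with Counter.update(), which adds counts rather than replacing them:

```python
from collections import Counter

stock = Counter(apples=3, pears=1)   # illustrative data

# Each occurrence in the iterable lowers the matching count by one;
# counts are allowed to go negative.
stock.subtract(['apples', 'pears', 'pears'])
after_subtract = dict(stock)

# update() is the additive counterpart: counts are added, not replaced.
stock.update({'pears': 5})
after_update = dict(stock)

print(after_subtract)
print(after_update)
```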

Common patterns for Counter objects

sum(c.values())                 # total of all counts
c.clear()                       # reset all counts
list(c)                         # list unique elements
set(c)                          # convert to a set
dict(c)                         # convert to a regular dictionary
list(c.items())                 # convert to a list of (elem, cnt) pairs
Counter(dict(list_of_pairs))    # convert from a list of (elem, cnt) pairs
c.most_common()[:-n-1:-1]       # n least common elements
+c                              # remove zero and negative counts
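
To make a few of these patterns concrete, here is a small runnable sketch on a sample counter. The counter contents and the value of n are assumptions chosen for illustration:

```python
from collections import Counter

c = Counter(a=3, b=1, c=0, d=-2)   # sample counter, made up for illustration
n = 2

total = sum(c.values())             # 3 + 1 + 0 - 2
unique = list(c)                    # the distinct elements
least = c.most_common()[:-n-1:-1]   # n least common, least common first
positive = +c                       # drops zero and negative counts
rebuilt = Counter(dict([('x', 2), ('y', 1)]))   # from (elem, cnt) pairs

print(total)
print(least)
print(dict(positive))
```

The slice [:-n-1:-1] walks the most_common() list backwards from its end, so the least common element comes first.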

Typical use cases

>>> # Count how many times each element occurs in a list
>>> cnt = Counter()
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
...     cnt[word] += 1
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})

>>> # Find the ten most common words in a file
>>> import re
>>> words = re.findall(r'\w+', open('hamlet.txt').read().lower())
>>> Counter(words).most_common(10)
[('the', 1143), ('and', 966), ('to', 762), ('of', 669), ('i', 631),
 ('you', 554),  ('a', 546), ('my', 514), ('hamlet', 471), ('in', 451)]
