speech_text = '''I love you,
Not for what you are,
But for what I am
When I am with you.
I love you,
Not only for what
You have made of yourself,
But for what
You are making of me.
I love you
For the part of me
That you bring out;
I love you
For putting your hand
Into my heaped-up heart
And passing over
All the foolish, weak things
That you can’t help
Dimly seeing there,
And for drawing out
Into the light
All the beautiful belongings
That no one else had looked
Quite far enough to find.
I love you because you
Are helping me to make
Of the lumber of my life
Not a tavern
But a temple;
Out of the works
Of my every day
Not a reproach
But a song.
I love you
Because you have done
More than any creed
Could have done
To make me good
And more than any fate
Could have done
To make me happy.
You have done it
Without a touch,
Without a word,
Without a sign.
You have done it
By being yourself.
Perhaps that is what
Being a friend means,
After all.'''
speech = speech_text.split()

# Count how many times each word occurs
dic = {}
for word in speech:
    if word not in dic:
        dic[word] = 1
    else:
        dic[word] = dic[word] + 1
print(dic.items())
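One thing to watch with a plain split(): punctuation stays glued to the words, so "you," and "you." are counted separately from "you". Below is a minimal sketch of one way around that, assuming a word is a run of letters with optional inner apostrophes or hyphens. The helper name count_words and the regular expression are my own choices for illustration, not part of the code above.

```python
import re

def count_words(text):
    """Count lowercase word tokens, ignoring surrounding punctuation."""
    # Assumption: a "word" is a run of letters, optionally joined by
    # apostrophes or hyphens, so "can’t" and "heaped-up" stay whole.
    tokens = re.findall(r"[a-z]+(?:['’-][a-z]+)*", text.lower())
    counts = {}
    for token in tokens:
        # dict.get supplies 0 the first time a token is seen
        counts[token] = counts.get(token, 0) + 1
    return counts

sample = "I love you, Not for what you are, But for what I am When I am with you."
counts = count_words(sample)
print(counts["you"])  # 3: "you,", "you" and "you." are now the same token
```

Lowercasing first also merges "Not" with "not", which is the same case fix applied in the full listing later on.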
When I used nltk it kept raising errors. The following two lines install the nltk data:
import nltk
nltk.download()
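A downloader window pops up; use it to download nltk.app.
If the download completes this way, skip the next step.
I tried many times and the download failed every time, so here is the second method. Download the prepackaged archive directly. Download link 1: cloud drive password znx7. The downloaded package is nltk_data.zip; extract it to the root of drive C, which is the safest option because it keeps nltk from failing to find the data. Download link 2: cloud drive password 4cp3ui.
Thanks to V_can for providing the package in "Python and Natural Language Processing, Part 1: Getting Started with NLTK, Environment Setup".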
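If you unpack the archive somewhere other than the root of drive C, nltk has to be told where to look. Here is a rough sketch of how that could be done. The path in NLTK_DATA_DIR and the two helper names are assumptions of mine, and I have not checked what the archive above actually contains, so the code first tests whether the stopwords corpus is really there. As far as I know, nltk.data.path is the list of folders nltk searches, and nltk.download("stopwords") fetches just that one corpus without opening the window.

```python
import os

# Assumption: the folder that nltk_data.zip was extracted to.
NLTK_DATA_DIR = r"C:\nltk_data"

def has_stopwords(data_dir):
    """True if data_dir holds the stopwords corpus, unpacked or still zipped."""
    corpora = os.path.join(data_dir, "corpora")
    return (os.path.isfile(os.path.join(corpora, "stopwords", "english"))
            or os.path.isfile(os.path.join(corpora, "stopwords.zip")))

def load_english_stopwords(data_dir=NLTK_DATA_DIR):
    """Load NLTK's English stop words, preferring a local copy over a download."""
    import nltk  # imported here so has_stopwords works even without nltk
    if has_stopwords(data_dir) and data_dir not in nltk.data.path:
        nltk.data.path.append(data_dir)  # add our folder to the search path
    try:
        nltk.data.find("corpora/stopwords")
    except LookupError:
        nltk.download("stopwords")  # only this corpus, no downloader window
    from nltk.corpus import stopwords
    return stopwords.words("english")
```

Calling load_english_stopwords() then gives the same list that the stop-word code in this post uses.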
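# Code as follows
from collections import Counter
from nltk.corpus import stopwords

c = Counter(speech)
print(c.most_common(10))  # the ten most frequent words

# Drop the stop words, then look at the top ten again
stop_words = stopwords.words('english')
for sw in stop_words:
    del c[sw]
print(c.most_common(10))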
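A detail that is easy to miss: most stop words never occur in the text, yet del c[sw] does not fail on them, because Counter ignores deletion of a missing key (a plain dict would raise KeyError). The small sketch below shows this, plus a variant that leaves the original counts untouched. The stop_words set here is a tiny handwritten stand-in so the example runs without nltk; it is not NLTK's list.

```python
from collections import Counter

words = "i love you not for what you are but for what i am".split()
# Hypothetical stand-in for NLTK's list, so the sketch needs no download.
stop_words = {"i", "you", "not", "for", "what", "are", "but", "am", "the"}

c = Counter(words)
del c["the"]  # "the" never occurs; Counter silently ignores the missing key
for sw in stop_words:
    del c[sw]
print(c.most_common(3))  # [('love', 1)]

# Non-destructive alternative: build a filtered copy and keep the full counts.
full = Counter(words)
filtered = Counter({w: n for w, n in full.items() if w not in stop_words})
print(filtered == c)  # True
```

The filtered copy is handy when you still want the raw counts afterwards, for example to compare the top ten before and after.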
speech_text = '''I love you,
Not for what you are,
But for what I am
When I am with you.
I love you,
Not only for what
You have made of yourself,
But for what
You are making of me.
I love you
For the part of me
That you bring out;
I love you
For putting your hand
Into my heaped-up heart
And passing over
All the foolish, weak things
That you can’t help
Dimly seeing there,
And for drawing out
Into the light
All the beautiful belongings
That no one else had looked
Quite far enough to find.
I love you because you
Are helping me to make
Of the lumber of my life
Not a tavern
But a temple;
Out of the works
Of my every day
Not a reproach
But a song.
I love you
Because you have done
More than any creed
Could have done
To make me good
And more than any fate
Could have done
To make me happy.
You have done it
Without a touch,
Without a word,
Without a sign.
You have done it
By being yourself.
Perhaps that is what
Being a friend means,
After all.'''
# Normalize case so that "You" and "you" are counted together
speech = speech_text.lower().split()
print(speech)

# Method 1: count with a plain dictionary
dic = {}
for word in speech:
    if word not in dic:
        dic[word] = 1
    else:
        dic[word] = dic[word] + 1

# Sort by count, highest first
import operator
swd = sorted(dic.items(), key=operator.itemgetter(1), reverse=True)
print(swd)

# Stop-word handling
from nltk.corpus import stopwords
stop_words = stopwords.words('english')
for k, v in swd:
    if k not in stop_words:
        print(k, v)

# Method 2: collections.Counter
from collections import Counter
c = Counter(speech)
print(c.most_common(10))  # the ten most frequent words
for sw in stop_words:
    del c[sw]
print(c.most_common(10))  # top ten after removing stop words
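To convince yourself that the dictionary-plus-sorted approach and Counter really do the same job, here is a small check on a toy word list (the list is made up for illustration). Both routes should produce the same ranking, including the order of words that tie on count, since both keep first-seen order for ties.

```python
import operator
from collections import Counter

words = "a b a c b a d".split()

# Route 1: manual dictionary, then sort by count in descending order
dic = {}
for w in words:
    dic[w] = dic.get(w, 0) + 1
by_sorted = sorted(dic.items(), key=operator.itemgetter(1), reverse=True)

# Route 2: Counter does the counting and the ranking
by_counter = Counter(words).most_common()

print(by_sorted)                # [('a', 3), ('b', 2), ('c', 1), ('d', 1)]
print(by_sorted == by_counter)  # True
```

So Counter is not doing anything magical here; it just saves the loop and the sort.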
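With these two methods it is easy to see why Python is used more and more in data analysis and scientific computing: apart from the features of the language itself, it has a large number of very usable third-party libraries.
So what are you waiting for? Life is short, so why not spend it on Python. Come and learn Python with me.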