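Today's summary covers the last part of the Word Cloud series: building a word cloud in Matlab. Matlab R2018b already provides a wordcloud function that generates a word cloud directly.
1) Prepare the text.
Enough said. Being lazy, I'll keep using the Word Cloud History.txt text from last time.
2) Read and clean the text data.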
    % read the txt file as a string
    text = string(fileread('C:\Users\yuki\Desktop\WordCloudHistory.txt'));
    % replace punctuation with spaces
    punctuationCharacters = ["." "?" "!" "," ";" ":"];
    text = replace(text,punctuationCharacters," ");
    % split the string into an array of words
    words = split(join(text));
    % drop words shorter than 5 characters, which are probably stop words
    words(strlength(words)<5) = [];
    % convert all words to lowercase
    words = lower(words);
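To see what the cleaning steps do without touching a file, here is a minimal sketch that runs the same calls on a short inline string. The sample sentence and the names `sampleText`, `cleaned`, and `sampleWords` are made up for illustration; they are not part of the original script.

```matlab
% Hypothetical sample text standing in for the contents of the .txt file
sampleText = "Word clouds, also called tag clouds: simple; popular! Clouds show words.";

% Same cleaning steps as in the script above
punctuationCharacters = ["." "?" "!" "," ";" ":"];
cleaned = replace(sampleText, punctuationCharacters, " ");

% Split on whitespace; join leaves a single string unchanged
sampleWords = split(join(cleaned));

% Remove everything shorter than 5 characters (this also removes empty strings)
sampleWords(strlength(sampleWords) < 5) = [];

% Lowercase so that "Clouds" and "clouds" count as the same word
sampleWords = lower(sampleWords);
disp(sampleWords')
```

Note that the length filter is only a rough stand-in for a real stop-word list: it also throws away short but meaningful words such as "tag".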
3) Count the word frequencies and build the arrays.
    % count how often each unique word occurs
    [numOccurrences,uniqueWords] = histcounts(categorical(words));
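Before plotting, it can help to check the counts. The sketch below sorts the output of `histcounts` and prints the most frequent words as a table. The word list `demoWords` and the cutoff `topN` are hypothetical values chosen for illustration.

```matlab
% Hypothetical word list standing in for the cleaned words array
demoWords = ["cloud";"word";"cloud";"matlab";"word";"cloud"];

% Same call as above: counts per category plus the category names
[numOccurrences, uniqueWords] = histcounts(categorical(demoWords));

% Sort the counts in descending order and reorder the words to match
[sortedCounts, order] = sort(numOccurrences, 'descend');
sortedWords = uniqueWords(order);

% Force column shape so both variables fit into a table
sortedCounts = sortedCounts(:);
sortedWords = sortedWords(:);

% Show the top entries (topN is an illustrative choice)
topN = min(3, numel(sortedWords));
disp(table(sortedWords(1:topN), sortedCounts(1:topN), ...
    'VariableNames', {'Word', 'Count'}))
```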
4) Generate the word cloud.
    figure
    % set the properties of the word cloud
    wordcloud(uniqueWords,numOccurrences,'Shape', "rectangle", 'MaxDisplayWords', 200);
    title("Word Cloud History")
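As a variation, `wordcloud` also accepts a categorical array directly and counts the frequencies itself, so the `histcounts` step can be skipped. The sketch below does that and then saves the figure with `saveas`. The word list `demoWords` and the file name `wordcloud_history.png` are hypothetical, and I am assuming `saveas` handles word cloud charts like any other figure.

```matlab
% Hypothetical cleaned word list; in practice, reuse words from step 2
demoWords = ["cloud";"history";"cloud";"matlab";"history";"cloud";"frequency"];

figure
% A categorical input lets wordcloud compute the frequency counts on its own
wc = wordcloud(categorical(demoWords), 'Shape', "rectangle", 'MaxDisplayWords', 200);
title("Word Cloud History")

% Save the current figure as an image (the file name is an illustrative choice)
saveas(gcf, 'wordcloud_history.png')
```

The returned chart object `wc` can be used afterwards to adjust properties such as `MaxDisplayWords` without redrawing from scratch.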
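1) Matlab also has an add-on that can generate a word cloud directly. It is simple to use and needs no programming, haha.
2) Having covered the various ways to create a word cloud, here is a quick summary of which ones are easy to use, convenient, and free.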
Tool | Ease of use | Free | Script needed
---|---|---|---
Python | Clear documentation, powerful text-mining libraries | Yes | Yes
JavaScript | You have to extract the word array yourself and find a way to save the image | Yes | Yes
R | Clear documentation, powerful text-mining libraries | Yes | Yes
Matlab | Clear documentation, interactive interface | No | Optional
download here