Natural language processing tools: Chinese word2vec open-source projects, tutorials, and datasets

word2vec

word2vec/glove/swivel binary files trained on a Chinese corpus

word2vec: https://code.google.com/p/word2vec/

glove: http://nlp.stanford.edu/projects/glove/

swivel: https://github.com/tensorflow/models/tree/master/swivel
swivel paper: http://arxiv.org/abs/1602.02215

Open-source projects

wordvectors

Pre-trained word vectors of 30+ languages

https://github.com/Kyubyong/wordvectors

chinese-word2vec

word2vec/glove/swivel binary files trained on a Chinese corpus

https://github.com/to-shimo/chinese-word2vec
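
The projects above distribute vectors as binary files. As a rough illustration of what such a file contains, here is a minimal sketch of a reader, assuming the file follows the layout written by the original word2vec C tool: an ASCII header line with the vocabulary size and dimension, then each word followed by a space and its float32 values. Little-endian byte order and UTF-8 words are assumptions, and `read_word2vec_binary`, `write_demo_file`, and the two demo entries are hypothetical names and data invented for this example, not taken from any of the linked repositories.

```python
import os
import struct
import tempfile


def read_word2vec_binary(path):
    """Read vectors stored in the word2vec C tool's binary layout (assumed)."""
    vectors = {}
    with open(path, "rb") as f:
        # Header line: "<vocab_size> <dim>\n" in ASCII.
        vocab_size, dim = map(int, f.readline().split())
        for _ in range(vocab_size):
            # Each word is terminated by a single space byte.
            chars = bytearray()
            while True:
                ch = f.read(1)
                if ch == b"":
                    raise ValueError("unexpected end of file while reading a word")
                if ch == b" ":
                    break
                if ch != b"\n":  # skip the newline left after the previous vector
                    chars.extend(ch)
            word = chars.decode("utf-8")
            # dim float32 values follow the word (little-endian is assumed).
            vectors[word] = struct.unpack("<%df" % dim, f.read(4 * dim))
    return vectors


def write_demo_file(path):
    """Write a tiny made-up file in the same layout, only to exercise the reader."""
    entries = {"中文": (0.5, -1.0, 2.0), "詞向量": (1.5, 0.25, -0.75)}
    with open(path, "wb") as f:
        f.write(b"%d %d\n" % (len(entries), 3))
        for word, vec in entries.items():
            f.write(word.encode("utf-8") + b" ")
            f.write(struct.pack("<3f", *vec))
            f.write(b"\n")


demo_path = os.path.join(tempfile.mkdtemp(), "demo_vectors.bin")
write_demo_file(demo_path)
vectors = read_word2vec_binary(demo_path)
print(sorted(vectors))
```

In practice, gensim's `KeyedVectors.load_word2vec_format(path, binary=True)` is the usual way to load files in this format. The hand-written reader only shows the layout and holds every vector in a plain dictionary, which will not scale to a full vocabulary.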

Tutorials

Exploring word similarity in a Wikipedia corpus

http://www.52nlp.cn/tag/gensim

Clustering keywords with word2vec

http://blog.csdn.net/zhaoxinfan/article/details/11069485

Training Word2Vec Model on English Wikipedia by Gensim

http://textminingonline.com/training-word2vec-model-on-english-wikipedia-by-gensim

Datasets

wiki

https://dumps.wikimedia.org/zhwiki/latest/zhwiki-latest-pages-articles.xml.bz2

sogou

http://www.sogou.com/labs/resource/list_news.php

More machine learning tutorials: http://www.tensorflownews.com/
