Efficient Estimation of Word Representations in Vector Space (2013): Key Points

Date: 2019-11-09


Reference:

A Neural Probabilistic Language Model (2003), key points: http://www.javashuo.com/article/p-entuqhuq-gt.html

- Linear regularities: "king - man = queen - woman"

- Syntactic and semantic regularities
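
The linear regularity above can be illustrated with a minimal sketch. The vectors below are hand-picked 3-dimensional toy values, not trained embeddings, and the `analogy` helper is a hypothetical name introduced only for this example. It rearranges "king - man = queen - woman" into "king - man + woman" and looks up the nearest remaining word by cosine similarity.

```python
import math

# Toy vectors chosen by hand for illustration; real embeddings are learned.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
    "apple": [0.5, 0.5, 0.5],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy("king", "man", "woman")
print(result)
```

Excluding the three query words from the candidates matters in practice, since the nearest vector to the target is often one of the inputs.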

Distributed representations were proposed by Hinton et al. in 1986.

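Typical training setup:

3-50 epochs, on the order of a billion training samples, sliding window width N=10, vector dimension D=50-200, hidden layer width H=500-1000, vocabulary size |V|=10^6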

The complexity is dominated by the hidden-to-output layer, i.e. H*|V|
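
A quick arithmetic check makes the dominance of H*|V| concrete. The sketch below assumes the per-example cost of the feedforward NNLM is Q = N*D + N*D*H + H*|V| (the form given in the paper) and picks D=100 and H=500 from the ranges above as illustrative values.

```python
# Illustrative values taken from the typical ranges in these notes.
N = 10          # sliding window width
D = 100         # vector dimension (assumed, within 50-200)
H = 500         # hidden layer width (assumed, within 500-1000)
V = 10 ** 6     # vocabulary size

projection_cost = N * D      # input -> projection layer
hidden_cost = N * D * H      # projection -> hidden layer
output_cost = H * V          # hidden -> output layer (full softmax)

total_cost = projection_cost + hidden_cost + output_cost
output_share = output_cost / total_cost

print(f"projection: {projection_cost:,}")
print(f"hidden:     {hidden_cost:,}")
print(f"output:     {output_cost:,}")
print(f"output share of total: {output_share:.4f}")
```

With these values the output term is three orders of magnitude larger than the hidden-layer term, which is why the output layer is the first target for optimization.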

Hierarchical softmax: the output layer is encoded as a Huffman tree, which reduces the output cost from |V| to about log|V|
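
To see where the log|V| figure comes from, the following sketch builds Huffman code lengths for a small vocabulary and compares the frequency-weighted average code length with log2|V|. The Zipf-like frequencies and the vocabulary size of 1024 are assumptions made for illustration. The code length of a word equals the number of binary decisions needed to reach it in the tree.

```python
import heapq
import math

def huffman_code_lengths(freqs):
    """Return the Huffman code length (tree depth) for each symbol index."""
    # Heap entries: (subtree weight, unique tie-breaker, symbols in the subtree).
    heap = [(f, i, [i]) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    counter = len(freqs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, counter, s1 + s2))
        counter += 1
    return lengths

V = 1024                                           # toy vocabulary size (assumed)
freqs = [1.0 / rank for rank in range(1, V + 1)]   # Zipf-like frequencies (assumed)
lengths = huffman_code_lengths(freqs)

total = sum(freqs)
avg_length = sum(f * l for f, l in zip(freqs, lengths)) / total

print(f"full softmax evaluates {V} outputs per example")
print(f"log2(V) = {math.log2(V):.1f}")
print(f"average Huffman code length = {avg_length:.2f}")
```

Because frequent words receive short codes, the average number of decisions per training word is no larger than log2|V| for a balanced tree, and usually smaller.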

To cut this cost further, consider removing the non-linear hidden layer.

Two architectures: CBOW and Skip-gram
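
The difference between the two architectures is easiest to see in the training examples they consume: CBOW predicts the center word from its surrounding context, while Skip-gram predicts each context word from the center word. The sketch below only generates the (input, target) pairs. The function names and the fixed window are assumptions for illustration (the paper samples the Skip-gram window size randomly), and no model is trained here.

```python
def cbow_examples(tokens, window):
    """Yield (context_words, center_word): CBOW predicts the center from its context."""
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            yield context, center

def skipgram_examples(tokens, window):
    """Yield (center_word, context_word): Skip-gram predicts each context word."""
    for context, center in cbow_examples(tokens, window):
        for word in context:
            yield center, word

sentence = "the quick brown fox jumps".split()
cbow = list(cbow_examples(sentence, window=1))
skip = list(skipgram_examples(sentence, window=1))

for context, center in cbow:
    print("CBOW     ", context, "->", center)
for center, word in skip:
    print("Skip-gram", center, "->", word)
```

For the same sentence and window, Skip-gram produces more training pairs than CBOW, one per context word rather than one per position.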

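More data, higher-dimensional vectors:

Google News: 6 billion tokens, vocabulary of the 1 million most frequent words, with a 30 thousand most-frequent-word subset

3 training epochs, with a learning rate starting at 0.025 and decaying over time.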
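
One way to read "decaying over time" is a linear schedule that starts at 0.025 and reaches zero at the end of the last epoch, which is how the paper describes it. The sketch below assumes that schedule; the `learning_rate` function and the `tokens_per_epoch` value are hypothetical and chosen only to keep the example small.

```python
def learning_rate(words_seen, total_words, start=0.025):
    """Linearly decay the learning rate from `start` to zero over training."""
    progress = min(words_seen / total_words, 1.0)
    return start * (1.0 - progress)

epochs = 3
tokens_per_epoch = 1_000_000            # illustrative, far smaller than 6 billion
total_words = epochs * tokens_per_epoch

for epoch in range(epochs + 1):
    seen = epoch * tokens_per_epoch
    print(f"after epoch {epoch}: lr = {learning_rate(seen, total_words):.5f}")
```

Real implementations often clamp the rate at a small positive floor so that updates late in training do not vanish entirely; that detail is left out here.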
