Popular word embedding algorithms exhibit stereotypical biases, such as gender bias.
The widespread use of these algorithms in machine learning systems can amplify stereotypes in important contexts.
Although some methods have been developed to mitigate this problem, how word embedding biases arise during training is poorly understood.
In this work we develop a technique to address this question.
Given a word embedding, our method reveals how perturbing the training corpus would affect the resulting embedding bias.
By tracing the origins of word embedding bias back to the original training documents, one can identify subsets of documents whose removal would most reduce bias.
We demonstrate our methodology on Wikipedia and New York Times corpora, and find it to be very accurate.
Paper title: "Understanding the Origins of Bias in Word Embeddings"
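The abstract on embedding bias does not spell out its algorithm, so the following is only a brute-force illustration of the question it poses (how does removing a training document change the bias?), not the authors' technique. Everything in it is an assumption made for illustration: the toy "embedding" (per-document word counts), the hypothetical `gender_bias` score, and the invented five-document corpus. Retraining once per removed document is feasible here only because the corpus is tiny.

```python
import numpy as np

def train_embedding(docs, vocab):
    """Toy stand-in for embedding training: a word's vector is its count in each document."""
    return {w: np.array([doc.count(w) for doc in docs], dtype=float) for w in vocab}

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v) / denom if denom > 0 else 0.0

def gender_bias(emb, target):
    """Hypothetical bias score: similarity to 'he' minus similarity to 'she'."""
    return cosine(emb[target], emb["he"]) - cosine(emb[target], emb["she"])

def leave_one_out_bias_change(docs, vocab, target):
    """Retrain without each document in turn and record the change in bias."""
    baseline = gender_bias(train_embedding(docs, vocab), target)
    changes = []
    for i in range(len(docs)):
        reduced = docs[:i] + docs[i + 1:]
        changes.append(gender_bias(train_embedding(reduced, vocab), target) - baseline)
    return baseline, changes

# Invented corpus, for illustration only.
docs = [
    "he is a doctor".split(),
    "she is a nurse".split(),
    "she is a doctor".split(),
    "he is a doctor".split(),
    "the weather is nice".split(),
]
vocab = ["he", "she", "doctor", "nurse"]

baseline, changes = leave_one_out_bias_change(docs, vocab, "doctor")
print(f"baseline bias: {baseline:+.3f}")
for i, delta in enumerate(changes):
    print(f"remove doc {i}: bias change {delta:+.3f}")
```

Documents with the most negative change are the ones whose removal most reduces the (positive) bias in this toy setting; a document containing none of the vocabulary words leaves the score unchanged.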
Word embeddings generated by neural network methods such as word2vec (W2V) are well known to exhibit seemingly linear behaviour, e.g. the embeddings of the analogy "woman is to queen as man is to king" approximately describe a parallelogram.
This property is particularly intriguing since the embeddings are not trained to achieve it.
Several explanations have been proposed, but each introduces assumptions that do not hold in practice.
We derive a probabilistically grounded definition of paraphrasing that we re-interpret as word transformation, a mathematical description of "\(w_x\) is to \(w_y\)".
From these concepts we prove the existence of linear relationships between W2V-type embeddings that underlie the analogical phenomenon, identifying explicit error terms.
Paper title: "Analogies Explained: Towards Understanding Word Embeddings"
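The parallelogram behaviour described in the analogies abstract can be made concrete with a small sketch. The 3-dimensional vectors below are hand-picked assumptions, not trained W2V embeddings, and the helper names `solve_analogy` and `parallelogram_error` are hypothetical. The sketch shows the usual vector-offset reading of an analogy and measures how far the four words are from an exact parallelogram.

```python
import numpy as np

# Hypothetical 3-d vectors chosen by hand for illustration; real W2V embeddings
# have hundreds of dimensions and are learned from a corpus.
emb = {
    "man":    np.array([0.0, 0.00, 1.0]),
    "woman":  np.array([0.0, 1.00, 1.0]),
    "king":   np.array([1.0, 0.05, 1.0]),
    "queen":  np.array([1.0, 0.95, 1.1]),
    "prince": np.array([0.9, 0.00, 0.6]),
    "apple":  np.array([0.0, 0.10, -1.0]),
}

def cosine(u, v):
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def solve_analogy(emb, a, b, c):
    """Return the word d that best completes 'a is to b as c is to d'."""
    target = emb[b] - emb[a] + emb[c]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, emb[w]))

def parallelogram_error(emb, a, b, c, d):
    """Norm of the gap between the two offsets; zero for an exact parallelogram."""
    return float(np.linalg.norm((emb[b] - emb[a]) - (emb[d] - emb[c])))

answer = solve_analogy(emb, "man", "king", "woman")
error = parallelogram_error(emb, "man", "king", "woman", "queen")
print(answer, round(error, 3))
```

The nonzero `error` is only a stand-in for the idea that the relationship holds approximately; it is not the explicit error term derived in the paper.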
While humor is often thought to be beyond the reach of Natural Language Processing, we show that several aspects of single-word humor correlate with simple linear directions in word embeddings.
In particular:
(a) the word vectors capture multiple aspects discussed in humor theories from various disciplines;
(b) each individual's sense of humor can be represented by a vector, which can predict differences in people's senses of humor on new, unrated, words; and
(c) upon clustering humor ratings of multiple demographic groups, different humor preferences emerge across the different groups.
Humor ratings are taken from the work of Engelthaler and Hills (2017) as well as from an original crowdsourcing study of 120,000 words.
Our dataset further includes annotations for the theoretically-motivated humor features we identify.
Paper title: "Humor in Word Embeddings: Cockamamie Gobbledegook for Nincompoops"
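The idea in the humor abstract that a sense of humor can be represented by a vector, and used to score unrated words, can be sketched as a linear fit. This is a minimal illustration under stated assumptions, not the authors' procedure: the 3-dimensional embeddings and the ratings are invented, and ordinary least squares is simply one plausible way to obtain such a direction.

```python
import numpy as np

# Hypothetical 3-d embeddings and invented ratings, for illustration only;
# none of these numbers come from the dataset described above.
rated = {
    "nincompoop":   ([0.9, 0.1, 0.3], 3.7),
    "gobbledegook": ([0.8, 0.2, 0.5], 3.4),
    "table":        ([0.1, 0.7, 0.2], 1.1),
    "window":       ([0.2, 0.6, 0.4], 1.4),
}
unrated = {
    "cockamamie": [0.85, 0.15, 0.4],
    "chair":      [0.15, 0.65, 0.3],
}

X = np.array([vec for vec, _ in rated.values()])
y = np.array([rating for _, rating in rated.values()])

# Least-squares fit of a single direction so that X @ direction approximates the ratings.
direction, *_ = np.linalg.lstsq(X, y, rcond=None)

# Score unrated words by projecting their vectors onto the fitted direction.
predictions = {w: float(np.array(v) @ direction) for w, v in unrated.items()}
for word, score in predictions.items():
    print(f"{word}: predicted rating {score:.2f}")
```

Fitting one such direction per rater would give a per-person vector. With real embeddings there are far more dimensions than rated words per person, so some regularization would likely be needed.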