http://openaccess.thecvf.com/content_cvpr_2017/papers/Kodirov_Semantic_Autoencoder_for_CVPR_2017_paper.pdf
Semantic Autoencoder for Zero-Shot Learning, Elyor Kodirov, Tao Xiang, Shaogang Gong, Queen Mary University of London, UK, {e.kodirov, t.xiang, s.gong}@qmul.ac.uk
Highlights
- Improves zero-shot learning performance through tied encoder-decoder (self-reconstruction) learning, similar in spirit to CycleGAN
- The model is very simple, admits a direct closed-form solution, and is very fast
- It applies effectively to other related tasks (supervised clustering), demonstrating its generality
Method
Linear autoencoder
Model Formulation
Setting the gradient of the objective to zero gives a well-known Sylvester equation, which can be solved efficiently with the Bartels-Stewart algorithm (MATLAB's `sylvester`).
Zero-shot learning: with the learned W, there are two ways to classify at test time:
- Project an unseen-class test sample x_i into the semantic (attribute) space via W, then label it with the nearest class prototype s_i among the unseen classes (which have no training samples).
- Project the semantic prototypes S of all unseen classes into the feature space via Wᵀ, then label x_i with the nearest projected class centre.
- The two strategies give essentially the same accuracy.
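The two test-time strategies can be sketched as follows; the function names, the Euclidean distance, and the column-per-class prototype layout are my assumptions (cosine distance is also common in practice):

```python
import numpy as np

def classify_in_semantic_space(x, W, S_unseen):
    """Encoder route: map feature x (d,) into the semantic space via
    W (k x d), then return the index of the nearest unseen-class
    prototype among the columns of S_unseen (k x C)."""
    s_hat = W @ x
    dists = np.linalg.norm(S_unseen - s_hat[:, None], axis=0)
    return int(np.argmin(dists))

def classify_in_feature_space(x, W, S_unseen):
    """Decoder route: map all unseen-class prototypes into the
    feature space via W^T, then return the index of the projected
    class centre nearest to x."""
    X_hat = W.T @ S_unseen           # d x C projected class centres
    dists = np.linalg.norm(X_hat - x[:, None], axis=0)
    return int(np.argmin(dists))
```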
Supervised clustering: in this problem the semantic space is the one-hot class-label space. All test data are projected into the training-class label space via W and then clustered with k-means.
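A sketch of this supervised-clustering step, assuming a projection W already learned on labelled training data; `sae_cluster` and the use of SciPy's k-means are my choices, not the paper's implementation:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def sae_cluster(W, X_test, n_clusters):
    """Project test features X_test (d x N) into the one-hot
    class-label space via W (k x d), then run k-means on the
    projected points (one row per sample)."""
    Z = (W @ X_test).T                        # N x k points in label space
    _, labels = kmeans2(Z, n_clusters, minit='++')
    return labels
```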
Relation to existing models: existing zero-shot models typically learn a projection by minimising ||WX − S||² plus a regulariser Ω(W).
Alternatively, [54] projects attributes into the feature space, so the objective becomes ||X − WᵀS||² plus a regulariser.
This paper's method combines the two. Moreover, because the decoder is tied to the encoder (W* = Wᵀ), W cannot have a large norm (otherwise x, multiplied by two large-norm matrices, could not be reconstructed to its original value), so the explicit regularisation term can be dropped.
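The equation images are missing from this copy; reconstructed from the paper's formulation (X ∈ R^{d×N}, S ∈ R^{k×N}, W ∈ R^{k×d}), the three objectives being compared are:

```latex
% existing ZSL: project features into the semantic space
\min_W \; \|WX - S\|_F^2 + \lambda\,\Omega(W)

% [54]: project semantic vectors into the feature space
\min_W \; \|X - W^\top S\|_F^2 + \lambda\,\Omega(W)

% SAE: tie the two directions; the self-reconstruction
% constraint replaces the explicit regulariser
\min_W \; \|X - W^\top S\|_F^2 + \lambda\,\|WX - S\|_F^2
```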
Experiments
Zero-shot learning
Datasets: Semantic word vector representations are used for the large-scale datasets (ImNet-1 and ImNet-2); a skip-gram text model is trained on a corpus of 4.6M Wikipedia documents to obtain the word2vec [38, 37] word vectors.
Features: GoogLeNet features are used for all datasets except ImNet-1, which uses AlexNet.
Results:
- Our SAE model achieves the best results on all 6 datasets.
- On the small-scale datasets, the gap between our model and the strongest competitor ranges from 3.5% to 6.5%.
- On the large-scale datasets, the gaps are even bigger: On the largest ImNet-2, our model improves over the state-of-the-art SS-Voc [22] by 8.8%.
- Both the encoder and decoder projection functions in our SAE model (SAE (W) and SAE (WT) respectively) can be used for effective ZSL.
- The encoder projection function seems to be slightly better overall.
- Generalised ZSL: measures how well a zero-shot learning method can trade off between recognising data from seen classes and from unseen classes.
- Protocol: hold out 20% of the data samples from the seen classes and mix them with the samples from the unseen classes.
- On AwA, our model is slightly worse than the SynCstruct [13].
- However, on the more challenging CUB dataset, our method significantly outperforms the competitors.
Clustering
Datasets: a synthetic dataset and Oxford Flowers-17 (848 images)
Results:
- On computational cost, our model (93s) is more expensive than MLCA (39s) but much cheaper than all the others (hours to days).
- Our model achieves the best clustering accuracy.