Learning deep representations of fine-grained visual descriptions

Date: 2021-01-02


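Abstract: State-of-the-art zero-shot visual recognition treats learning as a joint problem over images and side information. The side information that works best alongside visual features is attributes: manually encoded vectors that describe characteristics shared among classes. Although these algorithms perform well, attributes are still limited: (1) finer-grained recognition requires commensurately more attributes, and (2) attributes do not provide a natural language interface (meaning they cannot be expressed explicitly?).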
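To make the attribute idea concrete, here is a minimal sketch of attribute-based zero-shot classification. It is an illustration only, not the method from the paper: the attribute vocabulary, the class names, the cosine-similarity scoring, and the function names are all assumptions made for this example. It also assumes some upstream model has already predicted attribute scores for the image.

```python
import math

# Hypothetical attribute vocabulary: [has_wings, has_fur, is_striped].
# Each class is described by a manually encoded attribute vector.
CLASS_ATTRIBUTES = {
    "zebra": [0.0, 1.0, 1.0],
    "horse": [0.0, 1.0, 0.0],
    "eagle": [1.0, 0.0, 0.0],
}


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def zero_shot_predict(predicted_attributes, candidate_classes):
    """Pick the candidate class whose attribute vector best matches the image.

    `predicted_attributes` stands in for attribute scores that a visual model
    (not shown here) would predict for one image. The candidate classes may be
    ones with no training images, since only their attribute vectors are needed.
    """
    return max(
        candidate_classes,
        key=lambda name: cosine_similarity(predicted_attributes, CLASS_ATTRIBUTES[name]),
    )


# An image scored as furry and striped matches "zebra" via attributes alone.
print(zero_shot_predict([0.1, 0.9, 0.8], ["zebra", "eagle"]))
```

In this toy setup "zebra" and "horse" differ in a single attribute, which hints at the first limitation above: telling apart finer-grained classes means adding more attribute columns by hand.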
