survey: VQA

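VQA: Given an image and a question in natural language, the task requires reasoning over the visual elements of the image and general knowledge to infer the correct answer.
Differences from object-detection-based tasks:
Object recognition - classifies the main object in an image.
Object detection - through …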