How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

Abstract: This paper investigates the various metrics used to evaluate NLG systems. Recent NLG metrics were adapted from machine translation (MT). The paper finds that these metrics and humans on Twitt