How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

Date: 2020-12-24


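How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation. Abstract: This paper surveys a range of metrics for NLG systems. Recent NLG metrics were developed from machine translation (MT); the paper finds that these metrics with humans on Twitt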
