論文閱讀筆記(四十七):Attention Is All You Need

Abstract The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the en
相關文章
相關標籤/搜索