Paper Reading Notes: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Contents
Abstract
1. Introduction
2. Related Work
2.1 Feature-based Approaches
2.2 Fine-tuning Approaches
3. BERT
3.1 Model Architecture
3.2 Input Representation
3.3 Pre-training Tasks
3.3.1 Task #1: Masked LM
3.3.2 Task #2: Next Sentence Prediction