Machine Reading Comprehension with the BiDAF Model

Project Overview

In a machine reading comprehension (MRC) task, we are given a question (Q) together with one or more paragraphs (P) or documents (D), and the machine must find the correct answer (A) in the given text, i.e. Q + P or D => A. MRC is one of the key tasks in natural language processing (NLP): it requires the machine to understand language deeply in order to find the correct answer. Based on PaddlePaddle, this project implements and upgrades a classic reading comprehension model, BiDAF, for the DuReader reading comprehension dataset. The model's architecture is shown in the BiDAF structure diagram.

Results on the DuReader dataset are shown in the following table:

Model                              Dev ROUGE-L    Test ROUGE-L
BiDAF (original paper baseline)    39.29          45.90
This baseline system               47.68          54.66
 

The DuReader Dataset

DuReader is a large-scale, real-application-oriented, human-generated Chinese reading comprehension dataset. It focuses on open-domain question answering from the real world. Compared with other reading comprehension datasets, DuReader's advantages include:

  • questions come from real search logs
  • passage content comes from real web pages
  • answers are generated by humans
  • oriented toward real application scenarios
  • richer and more fine-grained annotations

More details about the DuReader dataset can be found on the official DuReader website.

 

Advanced Usage

Task Definition and Modeling

The input of the reading comprehension task consists of:

  • a question Q (tokenized), e.g.: ["明天", "的", "天氣", "怎麼樣", "?"];
  • one or more paragraphs P (tokenized), e.g.: [["今天", "的", "天氣", "是", "多雲", "轉", "晴", ",", "溫度", "適中", "。"], ["明天", "氣溫", "較爲", "寒冷", ",", "請", "注意", "添加", "衣物", "。"]].

The model output consists of:

  • for each token in paragraph P, the probability that it is the start of the answer and the probability that it is the end (boundary model), e.g.: start probabilities = [[0.01, 0.02, ...], [0.80, 0.10, ...]], end probabilities = [[0.01, 0.02, ...], [0.01, 0.01, ...]], where the probability arrays have the same shape as the tokenized input paragraphs (see the sketch below for how a span is decoded from them).
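For illustration, here is a minimal sketch of how such start/end probabilities can be decoded into an answer span: pick the pair (i, j) with i <= j that maximizes start_probs[i] * end_probs[j], capped by a maximum answer length (mirroring the max_a_len argument visible in the logs below). This is not the baseline's exact decoding code, just the standard idea for a boundary model.

# A minimal span-decoding sketch (assumption: not the project's exact code).
def best_span(start_probs, end_probs, max_a_len=200):
    """Return (start, end, score) of the most probable answer span."""
    best = (0, 0, 0.0)
    for i, p_start in enumerate(start_probs):
        for j in range(i, min(i + max_a_len, len(end_probs))):
            score = p_start * end_probs[j]
            if score > best[2]:
                best = (i, j, score)
    return best

# With the toy start/end probabilities of the second paragraph above,
# the best span is (0, 1).
print(best_span([0.80, 0.10, 0.05], [0.01, 0.90, 0.05]))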

The model structure consists of:

  • an embedding layer: maps words in one-hot representation to word vectors;
  • an encoding layer: encodes the word vectors, incorporating contextual information;
  • a matching layer: matches the question Q against the paragraphs P (see the sketch after this list);
  • a fusion layer: fuses the matching results;
  • an output layer: predicts the start and end probabilities.
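To make the matching layer concrete, the following self-contained numpy sketch implements bi-directional attention flow in the spirit of the BiDAF paper. It is for illustration only: it replaces BiDAF's trainable similarity function over [h; u; h*u] with a plain dot product, whereas the baseline builds these layers with PaddlePaddle operators.

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_match(H, U):
    """H: encoded paragraph (T, d); U: encoded question (J, d)."""
    T, d = H.shape
    S = H @ U.T                                  # similarity matrix (T, J); dot product for brevity
    c2q = softmax(S, axis=1) @ U                 # context-to-query attention (T, d)
    b = softmax(S.max(axis=1))                   # attention over paragraph words (T,)
    q2c = np.tile(b @ H, (T, 1))                 # query-to-context vector, tiled to (T, d)
    return np.concatenate([H, c2q, H * c2q, H * q2c], axis=1)  # G: (T, 4d)

G = bidaf_match(np.random.rand(11, 8), np.random.rand(5, 8))
print(G.shape)  # (11, 32)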

Model Principles

The BiDAF architecture diagram shows the original model structure. In this baseline system, we removed the char-level embedding, used a pointer network in the prediction layer, and adopted some network structures from R-NET.

Data Format

Each sample in the DuReader dataset contains several documents, and each document in turn contains several paragraphs. For a detailed description of the data, see the official website, the paper, and the README files included with the dataset. Below is a sample from the training set:

{
    "documents": [
        {
            "is_selected": true,
            "title": "板蘭根沖劑_百度百科",
            "most_related_para": 11,
            "segmented_title": ["板蘭根", "沖劑", "_", "百度百科"],
            "segmented_paragraphs": [
                    ["板蘭根", "沖劑", ",", "藥", "名", ":", ... ],
                    ["【", "功效", "與", "主治", "】", ...],
                    ...
            ],
            "paragraphs": [
                "板蘭根沖劑,藥名...",
                "【功效與主治】...",
                ...
            ],
            "bs_rank_pos": 0
        },
        {
            "is_selected": true,
            "title": "長期喝板藍根顆粒有哪些好處和壞處",
            "most_related_para": 0,
            "segmented_title": ["長期", "喝", "板藍根", "顆粒", "有", "哪些", "好處", "和", "壞處"],
            "segmented_paragraphs": [
                ["板藍根", "對", "感冒", "、", "流感","、", ...],
                ...
            ],
            "paragraphs": [
                "板藍根對感冒、流感、流腦、...",
                ...
            ],
            "bs_rank_pos": 1
        },
        ...
    ],
    "answer_spans": [[5, 28]],
    "fake_answers": ["清熱解毒、涼血;用於溫熱發熱、發斑、風熱感冒、咽喉腫爛、流行性乙型腦炎、肝炎、腮腺炎。"],
    "question": "板藍根顆粒的功效與做用",
    "segmented_answers": [
        ["清熱解毒", "、", "涼血", ";", "用於", "溫", "熱", "發熱", ...],
        ["板藍根", "的", "用途", "不只", "是", "治療", "感冒", ...],
        ...
    ],
    "answers": [
        "清熱解毒、涼血;用於溫熱發熱、發斑、風熱感冒、咽喉腫爛、流行性乙型腦炎、肝炎、 腮腺炎 。",
        "板藍根的用途不只是治療感冒,板藍根的功效與做用多,對多種細菌性、病毒性疾病都有較好的預防與治療做用。",
        ...
    ],
    "answer_docs": [0],
    "segmented_question": ["板藍根顆粒", "的", "功效", "與", "做用"],
    "question_type": "DESCRIPTION",
    "question_id": 91161,
    "fact_or_opinion": "FACT",
    "match_scores": [
        0.9583333333333334
    ]
}
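DuReader files store one JSON object per line, so individual samples can be inspected without loading a whole file. A minimal sketch (assuming the demo data has been unpacked into data/demo, as done by the next cell):

import json

with open('data/demo/trainset/search.train.json', encoding='utf-8') as f:
    sample = json.loads(next(f))  # parse the first sample

print(sample['question_id'], sample['question_type'])
print('question:', ''.join(sample['segmented_question']))
for doc in sample['documents']:
    # most_related_para indexes the paragraph that best matches the answer
    print(doc['title'], '-> most related para:', doc['most_related_para'])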
In[1]
# Unpack the code from work and extract the dataset
!unzip -qo data/data9722/demo.zip
!mv home/aistudio/data/demo data
!rm -r home
# !tar -zxf data/data9722/dureader_machine_reading-dataset-2.0.0.tar.gz -C data
 

To improve the model's performance on the DuReader 2.0 dataset, the author adopted a new paragraph extraction strategy, which can be run with the following command:

In[3]
# When using the demo data, this step does not need to be executed
!cd src && sh run.sh --para_extraction
!mv extracted/* data/extracted
Start paragraph extraction, this may take a few hours
Source dir: ../data/preprocessed
Target dir: ../extracted
Processing trainset
Processing devset
Processing testset
Paragraph extraction done!
 

Before model training starts, run the following command to build the vocabulary and create the necessary directories for storing model parameters and other artifacts:

In[3]
!cd src && sh run.sh --prepare
2019-09-06 13:24:48,621 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=False, hidden_size=150, learning_rate=0.001, load_dir='', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=5, predict=False, prepare=True, random_seed=123, result_dir='../data/results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=False, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:24:48,621 - brc - INFO - Checking the data files...
2019-09-06 13:24:48,622 - brc - INFO - Preparing the directories...
2019-09-06 13:24:48,622 - brc - INFO - Building vocabulary...
2019-09-06 13:24:48,744 - brc - INFO - Train set size: 95 questions.
2019-09-06 13:24:48,937 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:24:49,112 - brc - INFO - Test set size: 100 questions.
2019-09-06 13:24:49,166 - brc - INFO - After filter 5225 tokens, the final vocab size is 5007
2019-09-06 13:24:49,166 - brc - INFO - Assigning embeddings...
2019-09-06 13:24:49,180 - brc - INFO - Saving vocab...
2019-09-06 13:24:49,884 - brc - INFO - Done with preparing!
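Conceptually, the vocabulary step counts the tokens in the training data and drops rare ones, which is why the log reports both the number of filtered tokens and the final vocabulary size. A hedged sketch of that idea (the real implementation lives under src/; the min_count threshold and special tokens here are assumptions):

from collections import Counter

def build_vocab(token_lists, min_count=2, specials=('<pad>', '<unk>')):
    # count every token, keep only those seen at least min_count times
    counts = Counter(tok for toks in token_lists for tok in toks)
    kept = sorted(t for t, c in counts.items() if c >= min_count)
    print('After filter %d tokens, the final vocab size is %d'
          % (len(counts) - len(kept), len(kept) + len(specials)))
    return {t: i for i, t in enumerate(list(specials) + kept)}

vocab = build_vocab([['明天', '的', '天氣'], ['今天', '的', '天氣']])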
 

Then run the command below to start training. Additional parameters can be configured in src/args.py; by default, the data in data/demo is used for training.

In[4]
!cd src && sh run.sh --train --pass_num 50
2019-09-06 13:24:58,878 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=False, hidden_size=150, learning_rate=0.001, load_dir='', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=50, predict=False, prepare=False, random_seed=123, result_dir='../data/results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=True, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:24:58,878 - brc - INFO - Load data_set and vocab...
2019-09-06 13:24:59,220 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:24:59,346 - brc - INFO - Train set size: 95 questions.
2019-09-06 13:24:59,542 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:24:59,543 - brc - INFO - Converting text into ids...
2019-09-06 13:24:59,596 - brc - INFO - Initialize the model...
W0906 13:25:00.714032   178 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0906 13:25:00.718631   178 device_context.cc:267] device: 0, cuDNN Version: 7.3.
2019-09-06 13:25:00,743 - brc - INFO - Training the model...
WARNING:root:
     You can try our memory optimize feature to save your memory usage:
         # create a build_strategy variable to set memory optimize option
         build_strategy = compiler.BuildStrategy()
         build_strategy.enable_inplace = True
         build_strategy.memory_optimize = True
         
         # pass the build_strategy to with_data_parallel API
         compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
             loss_name=loss.name, build_strategy=build_strategy)
      
     !!! Memory optimize is our experimental feature !!!
         some variables may be removed/reused internal to save memory usage, 
         in order to fetch the right value of the fetch_list, please set the 
         persistable property to true for each variable in fetch_list

         # Sample
         conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None) 
         # if you need to fetch conv1, then:
         conv1.persistable = True

                 
I0906 13:25:00.767354   178 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0906 13:25:00.788298   178 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
2019-09-06 13:25:13,447 - brc - INFO - epoch: 1, epoch_time_cost: 12.65
2019-09-06 13:25:13,447 - brc - INFO - Average train loss for epoch 1 is 36.6342582703
2019-09-06 13:25:26,456 - brc - INFO - epoch: 2, epoch_time_cost: 12.81
2019-09-06 13:25:26,456 - brc - INFO - Average train loss for epoch 2 is 35.1372330983
2019-09-06 13:25:39,642 - brc - INFO - epoch: 3, epoch_time_cost: 12.98
2019-09-06 13:25:39,642 - brc - INFO - Average train loss for epoch 3 is 34.1238333384
2019-09-06 13:25:52,855 - brc - INFO - epoch: 4, epoch_time_cost: 13.02
2019-09-06 13:25:52,855 - brc - INFO - Average train loss for epoch 4 is 33.5638427734
2019-09-06 13:26:06,531 - brc - INFO - epoch: 5, epoch_time_cost: 13.49
2019-09-06 13:26:06,532 - brc - INFO - Average train loss for epoch 5 is 32.7321383158
2019-09-06 13:26:20,415 - brc - INFO - epoch: 6, epoch_time_cost: 13.68
2019-09-06 13:26:20,416 - brc - INFO - Average train loss for epoch 6 is 31.9549776713
2019-09-06 13:26:33,578 - brc - INFO - epoch: 7, epoch_time_cost: 12.97
2019-09-06 13:26:33,578 - brc - INFO - Average train loss for epoch 7 is 31.1214656830
2019-09-06 13:26:47,658 - brc - INFO - epoch: 8, epoch_time_cost: 13.89
2019-09-06 13:26:47,658 - brc - INFO - Average train loss for epoch 8 is 29.9649470647
2019-09-06 13:27:00,801 - brc - INFO - epoch: 9, epoch_time_cost: 12.95
2019-09-06 13:27:00,801 - brc - INFO - Average train loss for epoch 9 is 28.2852547963
2019-09-06 13:27:14,956 - brc - INFO - epoch: 10, epoch_time_cost: 13.97
2019-09-06 13:27:14,956 - brc - INFO - Average train loss for epoch 10 is 26.4429680506
2019-09-06 13:27:28,226 - brc - INFO - epoch: 11, epoch_time_cost: 13.08
2019-09-06 13:27:28,226 - brc - INFO - Average train loss for epoch 11 is 25.0628827413
2019-09-06 13:27:41,917 - brc - INFO - epoch: 12, epoch_time_cost: 13.48
2019-09-06 13:27:41,917 - brc - INFO - Average train loss for epoch 12 is 23.9484675725
2019-09-06 13:27:55,270 - brc - INFO - epoch: 13, epoch_time_cost: 13.16
2019-09-06 13:27:55,270 - brc - INFO - Average train loss for epoch 13 is 22.9610160192
2019-09-06 13:28:10,131 - brc - INFO - epoch: 14, epoch_time_cost: 14.66
2019-09-06 13:28:10,131 - brc - INFO - Average train loss for epoch 14 is 22.0059817632
2019-09-06 13:28:23,853 - brc - INFO - epoch: 15, epoch_time_cost: 13.52
2019-09-06 13:28:23,853 - brc - INFO - Average train loss for epoch 15 is 21.3352721532
2019-09-06 13:28:37,792 - brc - INFO - epoch: 16, epoch_time_cost: 13.74
2019-09-06 13:28:37,792 - brc - INFO - Average train loss for epoch 16 is 20.9558064143
2019-09-06 13:28:51,719 - brc - INFO - epoch: 17, epoch_time_cost: 13.73
2019-09-06 13:28:51,719 - brc - INFO - Average train loss for epoch 17 is 20.6100145976
2019-09-06 13:29:05,443 - brc - INFO - epoch: 18, epoch_time_cost: 13.53
2019-09-06 13:29:05,444 - brc - INFO - Average train loss for epoch 18 is 20.1902602514
2019-09-06 13:29:18,832 - brc - INFO - epoch: 19, epoch_time_cost: 13.19
2019-09-06 13:29:18,832 - brc - INFO - Average train loss for epoch 19 is 19.8818868001
2019-09-06 13:29:32,366 - brc - INFO - epoch: 20, epoch_time_cost: 13.35
2019-09-06 13:29:32,367 - brc - INFO - Average train loss for epoch 20 is 19.5995518366
2019-09-06 13:29:46,144 - brc - INFO - epoch: 21, epoch_time_cost: 13.59
2019-09-06 13:29:46,144 - brc - INFO - Average train loss for epoch 21 is 19.3210824331
2019-09-06 13:29:59,943 - brc - INFO - epoch: 22, epoch_time_cost: 13.60
2019-09-06 13:29:59,943 - brc - INFO - Average train loss for epoch 22 is 19.0815839767
2019-09-06 13:30:13,998 - brc - INFO - epoch: 23, epoch_time_cost: 13.87
2019-09-06 13:30:13,998 - brc - INFO - Average train loss for epoch 23 is 18.8392149607
2019-09-06 13:30:27,573 - brc - INFO - epoch: 24, epoch_time_cost: 13.38
2019-09-06 13:30:27,573 - brc - INFO - Average train loss for epoch 24 is 18.5984970729
2019-09-06 13:30:41,390 - brc - INFO - epoch: 25, epoch_time_cost: 13.63
2019-09-06 13:30:41,390 - brc - INFO - Average train loss for epoch 25 is 18.3671442668
2019-09-06 13:30:54,598 - brc - INFO - epoch: 26, epoch_time_cost: 13.02
2019-09-06 13:30:54,598 - brc - INFO - Average train loss for epoch 26 is 18.1392790476
2019-09-06 13:31:08,627 - brc - INFO - epoch: 27, epoch_time_cost: 13.84
2019-09-06 13:31:08,628 - brc - INFO - Average train loss for epoch 27 is 17.9132769903
2019-09-06 13:31:21,769 - brc - INFO - epoch: 28, epoch_time_cost: 12.96
2019-09-06 13:31:21,770 - brc - INFO - Average train loss for epoch 28 is 17.6892995834
2019-09-06 13:31:35,381 - brc - INFO - epoch: 29, epoch_time_cost: 13.43
2019-09-06 13:31:35,382 - brc - INFO - Average train loss for epoch 29 is 17.4720538457
2019-09-06 13:31:49,158 - brc - INFO - epoch: 30, epoch_time_cost: 13.59
2019-09-06 13:31:49,159 - brc - INFO - Average train loss for epoch 30 is 17.2570892970
2019-09-06 13:32:03,536 - brc - INFO - epoch: 31, epoch_time_cost: 14.18
2019-09-06 13:32:03,537 - brc - INFO - Average train loss for epoch 31 is 17.0383866628
2019-09-06 13:32:17,849 - brc - INFO - epoch: 32, epoch_time_cost: 14.13
2019-09-06 13:32:17,849 - brc - INFO - Average train loss for epoch 32 is 16.8278697332
2019-09-06 13:32:31,153 - brc - INFO - epoch: 33, epoch_time_cost: 13.08
2019-09-06 13:32:31,153 - brc - INFO - Average train loss for epoch 33 is 16.6187410355
2019-09-06 13:32:45,199 - brc - INFO - epoch: 34, epoch_time_cost: 13.86
2019-09-06 13:32:45,199 - brc - INFO - Average train loss for epoch 34 is 16.4102087021
2019-09-06 13:32:59,989 - brc - INFO - epoch: 35, epoch_time_cost: 14.60
2019-09-06 13:32:59,990 - brc - INFO - Average train loss for epoch 35 is 16.2043565114
2019-09-06 13:33:14,671 - brc - INFO - epoch: 36, epoch_time_cost: 14.49
2019-09-06 13:33:14,672 - brc - INFO - Average train loss for epoch 36 is 16.0016709963
2019-09-06 13:33:28,910 - brc - INFO - epoch: 37, epoch_time_cost: 14.05
2019-09-06 13:33:28,910 - brc - INFO - Average train loss for epoch 37 is 15.8016374906
2019-09-06 13:33:43,245 - brc - INFO - epoch: 38, epoch_time_cost: 14.15
2019-09-06 13:33:43,245 - brc - INFO - Average train loss for epoch 38 is 15.6039719582
2019-09-06 13:33:57,072 - brc - INFO - epoch: 39, epoch_time_cost: 13.64
2019-09-06 13:33:57,072 - brc - INFO - Average train loss for epoch 39 is 15.4081643422
2019-09-06 13:34:11,370 - brc - INFO - epoch: 40, epoch_time_cost: 14.11
2019-09-06 13:34:11,370 - brc - INFO - Average train loss for epoch 40 is 15.2176001867
2019-09-06 13:34:25,765 - brc - INFO - epoch: 41, epoch_time_cost: 14.21
2019-09-06 13:34:25,766 - brc - INFO - Average train loss for epoch 41 is 15.0259342194
2019-09-06 13:34:40,413 - brc - INFO - epoch: 42, epoch_time_cost: 14.46
2019-09-06 13:34:40,414 - brc - INFO - Average train loss for epoch 42 is 14.8374350866
2019-09-06 13:34:54,765 - brc - INFO - epoch: 43, epoch_time_cost: 14.16
2019-09-06 13:34:54,766 - brc - INFO - Average train loss for epoch 43 is 14.6492177645
2019-09-06 13:35:09,096 - brc - INFO - epoch: 44, epoch_time_cost: 14.13
2019-09-06 13:35:09,096 - brc - INFO - Average train loss for epoch 44 is 14.4660690626
2019-09-06 13:35:22,555 - brc - INFO - epoch: 45, epoch_time_cost: 13.27
2019-09-06 13:35:22,555 - brc - INFO - Average train loss for epoch 45 is 14.2831737200
2019-09-06 13:35:36,717 - brc - INFO - epoch: 46, epoch_time_cost: 13.95
2019-09-06 13:35:36,717 - brc - INFO - Average train loss for epoch 46 is 14.1043623288
2019-09-06 13:35:50,628 - brc - INFO - epoch: 47, epoch_time_cost: 13.72
2019-09-06 13:35:50,629 - brc - INFO - Average train loss for epoch 47 is 13.9296310743
2019-09-06 13:36:05,294 - brc - INFO - epoch: 48, epoch_time_cost: 14.48
2019-09-06 13:36:05,294 - brc - INFO - Average train loss for epoch 48 is 13.7521994909
2019-09-06 13:36:19,662 - brc - INFO - epoch: 49, epoch_time_cost: 14.18
2019-09-06 13:36:19,662 - brc - INFO - Average train loss for epoch 49 is 13.5768952370
2019-09-06 13:36:34,102 - brc - INFO - epoch: 50, epoch_time_cost: 14.26
2019-09-06 13:36:34,103 - brc - INFO - Average train loss for epoch 50 is 13.4075390498
 

Run the following command to evaluate the trained model on the dev set. When evaluation finishes, the program automatically computes the ROUGE-L metric and prints the final result. By default, the data in data/demo is used for evaluation.

In[5]
!cd src && sh run.sh --evaluate  --load_dir ../data/models/50
2019-09-06 13:43:34,412 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=True, hidden_size=150, learning_rate=0.001, load_dir='../data/models/50', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=5, predict=False, prepare=False, random_seed=123, result_dir='../data/results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=False, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:43:34,412 - brc - INFO - Load data_set and vocab...
2019-09-06 13:43:34,754 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:43:34,999 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:43:34,999 - brc - INFO - Converting text into ids...
2019-09-06 13:43:35,023 - brc - INFO - Initialize the model...
2019-09-06 13:43:35,112 - brc - INFO - load from ../data/models/50
W0906 13:43:35.880043   229 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0906 13:43:35.883968   229 device_context.cc:267] device: 0, cuDNN Version: 7.3.
WARNING:root:
     You can try our memory optimize feature to save your memory usage:
         # create a build_strategy variable to set memory optimize option
         build_strategy = compiler.BuildStrategy()
         build_strategy.enable_inplace = True
         build_strategy.memory_optimize = True
         
         # pass the build_strategy to with_data_parallel API
         compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
             loss_name=loss.name, build_strategy=build_strategy)
      
     !!! Memory optimize is our experimental feature !!!
         some variables may be removed/reused internal to save memory usage, 
         in order to fetch the right value of the fetch_list, please set the 
         persistable property to true for each variable in fetch_list

         # Sample
         conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None) 
         # if you need to fetch conv1, then:
         conv1.persistable = True

                 
I0906 13:43:35.925932   229 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0906 13:43:35.929263   229 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
2019-09-06 13:43:44,000 - brc - INFO - Saving test_result results to ../data/results/test_result.json
{'reflen': 8741, 'guess': [3746, 3647, 3577, 3522], 'testlen': 3746, 'correct': [1199, 639, 442, 351]}
('ratio:', 0.42855508523047375)
2019-09-06 13:43:44,419 - brc - INFO - Dev eval result: {'Bleu-4': 0.04272788984782911, 'Rouge-L': 0.1452405634370218, 'Bleu-1': 0.08436327983345214, 'Bleu-3': 0.050250685466992295, 'Bleu-2': 0.06241806456038598}
2019-09-06 13:43:44,419 - brc - INFO - Predicted answers are saved to ../data/results/
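ROUGE-L scores a candidate answer against a reference by their longest common subsequence (LCS). The numbers above come from the official DuReader evaluation scripts; the sketch below only illustrates what the metric measures (beta=1.2 is an assumption about the weighting used):

def lcs_len(a, b):
    # classic dynamic-programming LCS over token sequences
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l(candidate, reference, beta=1.2):
    # F-measure over LCS-based precision and recall
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(candidate), lcs / len(reference)
    return (1 + beta ** 2) * prec * rec / (rec + beta ** 2 * prec)

print(rouge_l(['清熱解毒', '涼血'], ['清熱解毒', '、', '涼血']))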
 

Run the following command to generate predictions with the trained model. The results are saved under the data/results/ directory; you can open the json file in a text editor to inspect them. By default, the data in data/demo is used for prediction.

After prediction finishes, the predict script also freezes the model parameters. If you need to split these functions apart, you can modify run.py to separate the freeze and predict steps.

In[6]
!cd src && sh run.sh --predict  --load_dir  ../data/models/50
2019-09-06 13:46:29,280 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=False, hidden_size=150, learning_rate=0.001, load_dir='../data/models/50', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=5, predict=True, prepare=False, random_seed=123, result_dir='../data/results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=False, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:46:29,280 - brc - INFO - Load data_set and vocab...
2019-09-06 13:46:29,625 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:46:29,861 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:46:29,861 - brc - INFO - Converting text into ids...
2019-09-06 13:46:29,885 - brc - INFO - Initialize the model...
2019-09-06 13:46:29,977 - brc - INFO - load from ../data/models/50
W0906 13:46:30.750571   281 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0906 13:46:30.755205   281 device_context.cc:267] device: 0, cuDNN Version: 7.3.
WARNING:root:
     You can try our memory optimize feature to save your memory usage:
         # create a build_strategy variable to set memory optimize option
         build_strategy = compiler.BuildStrategy()
         build_strategy.enable_inplace = True
         build_strategy.memory_optimize = True
         
         # pass the build_strategy to with_data_parallel API
         compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
             loss_name=loss.name, build_strategy=build_strategy)
      
     !!! Memory optimize is our experimental feature !!!
         some variables may be removed/reused internal to save memory usage, 
         in order to fetch the right value of the fetch_list, please set the 
         persistable property to true for each variable in fetch_list

         # Sample
         conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None) 
         # if you need to fetch conv1, then:
         conv1.persistable = True

                 
I0906 13:46:30.797705   281 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0906 13:46:30.801053   281 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
2019-09-06 13:46:39,329 - brc - INFO - Saving test_result results to ../data/results/test_result.json
2019-09-06 13:46:39,340 - brc - INFO - Load data_set and vocab...
2019-09-06 13:46:39,666 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:46:39,845 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:46:39,846 - brc - INFO - Converting text into ids...
2019-09-06 13:46:39,870 - brc - INFO - Initialize the model...
2019-09-06 13:46:39,954 - brc - INFO - load from ../data/models/50
 

The predictions are saved to the file in the following form; the corresponding question can be identified from question_type and question_id:

{"yesno_answers": [], "entity_answers": [[]], "answers": ["在使用路由器上網時,咱們會發如今路由器上,標註得有WAN口(有的路由器是Internet口)和LAN口(有的路由器標註的是一、二、三、4)。不少用戶一看就暈了,根本就不知道WAN口與LAN口的區別,天然不知道應該怎麼鏈接了。"], "question_type": "DESCRIPTION", "question_id": 221576}
In[7]
# Run inference with the frozen model parameters
!cd src && python infer.py --predict --result_dir ../data/infer_results/
2019-09-06 13:46:53,136 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=False, hidden_size=150, learning_rate=0.001, load_dir='', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=5, predict=True, prepare=False, random_seed=123, result_dir='../data/infer_results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=False, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:46:53,137 - brc - INFO - Load data_set and vocab...
2019-09-06 13:46:53,479 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:46:53,711 - brc - INFO - Test set size: 100 questions.
2019-09-06 13:46:53,711 - brc - INFO - Converting text into ids...
2019-09-06 13:46:53,735 - brc - INFO - Initialize the model...
W0906 13:46:54.517477   332 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0906 13:46:54.521802   332 device_context.cc:267] device: 0, cuDNN Version: 7.3.
2019-09-06 13:47:02,587 - brc - INFO - Saving test_result results to ../data/infer_results/test_result.json

Click the link to try this project hands-on in AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/122349

Installation Commands

## Install the CPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle

## Install the GPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu

>> Visit the PaddlePaddle official website to learn more.
