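Project Overview
In a machine reading comprehension (MRC) task, we are given a question (Q) and one or more paragraphs (P) or documents (D), and the machine must find the correct answer (A) within the given text, i.e. Q + P or D => A. MRC is one of the key tasks in natural language processing (NLP): the machine needs a deep understanding of language to find the correct answer. This project is built on PaddlePaddle and implements and upgrades a classic reading comprehension model, BiDAF, for the DuReader reading comprehension dataset. The structure of the model is shown in the figure below:
Results on the DuReader dataset are shown in the following table: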
Model | Dev ROUGE-L | Test ROUGE-L |
---|---|---|
BiDAF (baseline from the original paper) | 39.29 | 45.90 |
This baseline system | 47.68 | 54.66 |
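For readers unfamiliar with the metric in the table, ROUGE-L scores a predicted answer by the longest common subsequence (LCS) it shares with a reference answer. The sketch below is only an illustration of that idea, not the project's evaluation script; the function names and the default `beta` value are assumptions made for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.2):
    """LCS-based F-score; beta > 1 weights recall more than precision.

    The default beta is an assumption for this sketch.
    """
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


# Toy example: the candidate misses one token of the reference.
candidate = ["明天", "氣溫", "寒冷"]
reference = ["明天", "氣溫", "較爲", "寒冷"]
score = rouge_l(candidate, reference)
print(round(score, 4))
```

The table reports scores as percentages, while the function above returns a fraction between 0 and 1.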
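The DuReader Dataset
DuReader is a large-scale, human-generated Chinese reading comprehension dataset built for real-world applications. It focuses on open-domain question answering in the real world. Compared with other reading comprehension datasets, DuReader's advantages include:
- Questions come from real search logs
- Article content comes from real web pages
- Answers are written by humans
- It targets real application scenarios
- Annotations are richer and more fine-grained
More details about the DuReader dataset can be found on the DuReader official website.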
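Advanced Usage
Task Definition and Modeling
The inputs to the reading comprehension task are:
- A question Q (already word-segmented), for example: ["明天", "的", "天氣", "怎麼樣", "?"];
- One or more paragraphs P (already word-segmented), for example: [["今天", "的", "天氣", "是", "多雲", "轉", "晴", ",", "溫度", "適中", "。"], ["明天", "氣溫", "較爲", "寒冷", ",", "請", "注意", "添加", "衣物", "。"]].
The model outputs are:
- For every word in paragraph P, the probability that it is the start position of the answer and the probability that it is the end position of the answer (boundary model), for example: start probabilities=[[0.01, 0.02, ...], [0.80, 0.10, ...]], end probabilities=[[0.01, 0.02, ...], [0.01, 0.01, ...]], where each probability array has the same dimensions as the segmented input paragraphs.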
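The model consists of:
- Embedding layer: takes words in one-hot representation as input and produces word vectors;
- Encoding layer: encodes the word vectors to incorporate contextual information;
- Matching layer: matches the question Q against the paragraph P;
- Fusion layer: fuses the matching results;
- Output layer: predicts the start and end probabilities.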
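As a rough illustration of what the matching layer does, the sketch below computes context-to-query attention: every passage token attends over the question tokens and receives a weighted mix of their vectors. It is a deliberate simplification written in plain Python. It uses a dot product as the similarity function, whereas BiDAF learns its similarity function and also attends in the query-to-context direction. All names and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def context_to_query(passage_vecs, question_vecs):
    """For each passage token, return an attention-weighted mix of question vectors."""
    dim = len(question_vecs[0])
    attended = []
    for p in passage_vecs:
        # Similarity of this passage token to every question token (dot product here).
        weights = softmax([dot(p, q) for q in question_vecs])
        mixed = [sum(w * q[d] for w, q in zip(weights, question_vecs)) for d in range(dim)]
        attended.append(mixed)
    return attended


# Toy encoded vectors: two passage tokens, two question tokens, dimension 2.
passage_vecs = [[1.0, 0.0], [0.0, 1.0]]
question_vecs = [[5.0, 0.0], [0.0, 5.0]]
attended = context_to_query(passage_vecs, question_vecs)
print(attended)
```

Each passage token ends up dominated by the question token it is most similar to, which is the information the fusion layer then combines with the passage encoding.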
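How the Model Works
The figure below shows the original model structure (the BiDAF architecture diagram). In this baseline system, we removed the character-level embedding, used a pointer network in the output layer, and borrowed some network structures from R-NET.
Data Format
Each sample in the DuReader dataset contains several documents (documents), and each document contains several paragraphs (paragraphs). For a detailed description of the data, see the official website, the paper, and the documentation shipped with the dataset. Below is a sample from the training set: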
{
  "documents": [
    {
      "is_selected": true,
      "title": "板蘭根沖劑_百度百科",
      "most_related_para": 11,
      "segmented_title": ["板蘭根", "沖劑", "_", "百度百科"],
      "segmented_paragraphs": [
        ["板蘭根", "沖劑", ",", "藥", "名", ":", ... ],
        ["【", "功效", "與", "主治", "】", ...],
        ...
      ],
      "paragraphs": [
        "板蘭根沖劑,藥名...",
        "【功效與主治】...",
        ...
      ],
      "bs_rank_pos": 0
    },
    {
      "is_selected": true,
      "title": "長期喝板藍根顆粒有哪些好處和壞處",
      "most_related_para": 0,
      "segmented_title": ["長期", "喝", "板藍根", "顆粒", "有", "哪些", "好處", "和", "壞處"],
      "segmented_paragraphs": [
        ["板藍根", "對", "感冒", "、", "流感","、", ...],
        ...
      ],
      "paragraphs": [
        "板藍根對感冒、流感、流腦、...",
        ...
      ],
      "bs_rank_pos": 1
    },
    ...
  ],
  "answer_spans": [[5, 28]],
  "fake_answers": ["清熱解毒、涼血;用於溫熱發熱、發斑、風熱感冒、咽喉腫爛、流行性乙型腦炎、肝炎、腮腺炎。"],
  "question": "板藍根顆粒的功效與做用",
  "segmented_answers": [
    ["清熱解毒", "、", "涼血", ";", "用於", "溫", "熱", "發熱", ...],
    ["板藍根", "的", "用途", "不只", "是", "治療", "感冒", ...],
    ...
  ],
  "answers": [
    "清熱解毒、涼血;用於溫熱發熱、發斑、風熱感冒、咽喉腫爛、流行性乙型腦炎、肝炎、 腮腺炎 。",
    "板藍根的用途不只是治療感冒,板藍根的功效與做用多,對多種細菌性、病毒性疾病都有較好的預防與治療做用。",
    ...
  ],
  "answer_docs": [0],
  "segmented_question": ["板藍根顆粒", "的", "功效", "與", "做用"],
  "question_type": "DESCRIPTION",
  "question_id": 91161,
  "fact_or_opinion": "FACT",
  "match_scores": [
    0.9583333333333334
  ]
}
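The fields in the sample fit together as follows: answer_docs names the document that holds the answer, most_related_para names the paragraph inside that document, and answer_spans gives token positions inside that paragraph. The sketch below rebuilds the answer text from those fields. It assumes the span indices are inclusive and refer to the segmented tokens of the most related paragraph; the function name and the toy sample are hypothetical and much smaller than a real record.

```python
import json


def extract_fake_answer(sample):
    """Rebuild the answer text from answer_docs, most_related_para and answer_spans."""
    if not sample.get("answer_docs") or not sample.get("answer_spans"):
        return None  # some samples may carry no located answer
    doc = sample["documents"][sample["answer_docs"][0]]
    tokens = doc["segmented_paragraphs"][doc["most_related_para"]]
    start, end = sample["answer_spans"][0]
    # Assumption: both span indices are inclusive.
    return "".join(tokens[start:end + 1])


# Toy record (hypothetical), serialized and parsed the way one line of a data file would be.
line = json.dumps({
    "documents": [{
        "most_related_para": 1,
        "segmented_paragraphs": [
            ["板藍根", "沖劑"],
            ["【", "功效", "】", "清熱解毒", "、", "涼血", "。"],
        ],
    }],
    "answer_docs": [0],
    "answer_spans": [[3, 5]],
})
sample = json.loads(line)
fake_answer = extract_fake_answer(sample)
print(fake_answer)
```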
# Unpack the code from work and extract the dataset
!unzip -qo data/data9722/demo.zip
!mv home/aistudio/data/demo data
!rm -r home
# !tar -zxf data/data9722/dureader_machine_reading-dataset-2.0.0.tar.gz -C data
To improve the model's performance on the DuReader 2.0 dataset, the authors adopted a new paragraph extraction strategy. It can be run with the following command:
# Skip this step when using the demo data
!cd src && sh run.sh --para_extraction
!mv extracted/* data/extracted
Start paragraph extraction, this may take a few hours
Source dir: ../data/preprocessed
Target dir: ../extracted
Processing trainset
Processing devset
Processing testset
Paragraph extraction done!
Before training starts, run the following command to build the vocabulary and create the folders needed to store model parameters and other files:
!cd src && sh run.sh --prepare
2019-09-06 13:24:48,621 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=False, hidden_size=150, learning_rate=0.001, load_dir='', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=5, predict=False, prepare=True, random_seed=123, result_dir='../data/results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=False, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:24:48,621 - brc - INFO - Checking the data files...
2019-09-06 13:24:48,622 - brc - INFO - Preparing the directories...
2019-09-06 13:24:48,622 - brc - INFO - Building vocabulary...
2019-09-06 13:24:48,744 - brc - INFO - Train set size: 95 questions.
2019-09-06 13:24:48,937 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:24:49,112 - brc - INFO - Test set size: 100 questions.
2019-09-06 13:24:49,166 - brc - INFO - After filter 5225 tokens, the final vocab size is 5007
2019-09-06 13:24:49,166 - brc - INFO - Assigning embeddings...
2019-09-06 13:24:49,180 - brc - INFO - Saving vocab...
2019-09-06 13:24:49,884 - brc - INFO - Done with preparing!
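The "After filter ... tokens" line in the log reflects a typical vocabulary-building step: count every token, drop the rare ones, and map the rest to integer ids. The sketch below shows that idea in a few lines. It is not the project's vocabulary code; the function names, the `min_count` threshold, and the special tokens `<pad>` and `<unk>` are assumptions chosen for illustration.

```python
from collections import Counter


def build_vocab(token_lists, min_count=2, specials=("<pad>", "<unk>")):
    """Count tokens, drop those seen fewer than min_count times, assign ids."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    kept = sorted(tok for tok, c in counts.items() if c >= min_count)
    vocab = {tok: i for i, tok in enumerate(list(specials) + kept)}
    removed = len(counts) - len(kept)  # number of distinct tokens filtered out
    return vocab, removed


def to_ids(tokens, vocab, unk="<unk>"):
    """Convert tokens to ids, mapping out-of-vocabulary tokens to the unk id."""
    return [vocab.get(tok, vocab[unk]) for tok in tokens]


# Toy corpus: two segmented questions.
corpus = [["明天", "的", "天氣"], ["今天", "的", "天氣"]]
vocab, removed = build_vocab(corpus, min_count=2)
ids = to_ids(["明天", "的", "天氣"], vocab)
print(len(vocab), removed, ids)
```

Filtering rare tokens keeps the embedding table small; anything filtered out is still representable through the unknown-token id.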
Then run the following command to start training. More parameters can be configured in src/args.py. By default, training uses the data in data/demo.
!cd src && sh run.sh --train --pass_num 50
2019-09-06 13:24:58,878 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=False, hidden_size=150, learning_rate=0.001, load_dir='', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=50, predict=False, prepare=False, random_seed=123, result_dir='../data/results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=True, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:24:58,878 - brc - INFO - Load data_set and vocab...
2019-09-06 13:24:59,220 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:24:59,346 - brc - INFO - Train set size: 95 questions.
2019-09-06 13:24:59,542 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:24:59,543 - brc - INFO - Converting text into ids...
2019-09-06 13:24:59,596 - brc - INFO - Initialize the model...
W0906 13:25:00.714032 178 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0906 13:25:00.718631 178 device_context.cc:267] device: 0, cuDNN Version: 7.3.
2019-09-06 13:25:00,743 - brc - INFO - Training the model...
WARNING:root:
    You can try our memory optimize feature to save your memory usage:
        # create a build_strategy variable to set memory optimize option
        build_strategy = compiler.BuildStrategy()
        build_strategy.enable_inplace = True
        build_strategy.memory_optimize = True
        # pass the build_strategy to with_data_parallel API
        compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
            loss_name=loss.name, build_strategy=build_strategy)
    !!! Memory optimize is our experimental feature !!!
        some variables may be removed/reused internal to save memory usage,
        in order to fetch the right value of the fetch_list, please set the
        persistable property to true for each variable in fetch_list
        # Sample
        conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None)
        # if you need to fetch conv1, then:
        conv1.persistable = True
I0906 13:25:00.767354 178 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0906 13:25:00.788298 178 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
2019-09-06 13:25:13,447 - brc - INFO - epoch: 1, epoch_time_cost: 12.65
2019-09-06 13:25:13,447 - brc - INFO - Average train loss for epoch 1 is 36.6342582703
2019-09-06 13:25:26,456 - brc - INFO - epoch: 2, epoch_time_cost: 12.81
2019-09-06 13:25:26,456 - brc - INFO - Average train loss for epoch 2 is 35.1372330983
2019-09-06 13:25:39,642 - brc - INFO - epoch: 3, epoch_time_cost: 12.98
2019-09-06 13:25:39,642 - brc - INFO - Average train loss for epoch 3 is 34.1238333384
2019-09-06 13:25:52,855 - brc - INFO - epoch: 4, epoch_time_cost: 13.02
2019-09-06 13:25:52,855 - brc - INFO - Average train loss for epoch 4 is 33.5638427734
2019-09-06 13:26:06,531 - brc - INFO - epoch: 5, epoch_time_cost: 13.49
2019-09-06 13:26:06,532 - brc - INFO - Average train loss for epoch 5 is 32.7321383158
2019-09-06 13:26:20,415 - brc - INFO - epoch: 6, epoch_time_cost: 13.68
2019-09-06 13:26:20,416 - brc - INFO - Average train loss for epoch 6 is 31.9549776713
2019-09-06 13:26:33,578 - brc - INFO - epoch: 7, epoch_time_cost: 12.97
2019-09-06 13:26:33,578 - brc - INFO - Average train loss for epoch 7 is 31.1214656830
2019-09-06 13:26:47,658 - brc - INFO - epoch: 8, epoch_time_cost: 13.89
2019-09-06 13:26:47,658 - brc - INFO - Average train loss for epoch 8 is 29.9649470647
2019-09-06 13:27:00,801 - brc - INFO - epoch: 9, epoch_time_cost: 12.95
2019-09-06 13:27:00,801 - brc - INFO - Average train loss for epoch 9 is 28.2852547963
2019-09-06 13:27:14,956 - brc - INFO - epoch: 10, epoch_time_cost: 13.97
2019-09-06 13:27:14,956 - brc - INFO - Average train loss for epoch 10 is 26.4429680506
2019-09-06 13:27:28,226 - brc - INFO - epoch: 11, epoch_time_cost: 13.08
2019-09-06 13:27:28,226 - brc - INFO - Average train loss for epoch 11 is 25.0628827413
2019-09-06 13:27:41,917 - brc - INFO - epoch: 12, epoch_time_cost: 13.48
2019-09-06 13:27:41,917 - brc - INFO - Average train loss for epoch 12 is 23.9484675725
2019-09-06 13:27:55,270 - brc - INFO - epoch: 13, epoch_time_cost: 13.16
2019-09-06 13:27:55,270 - brc - INFO - Average train loss for epoch 13 is 22.9610160192
2019-09-06 13:28:10,131 - brc - INFO - epoch: 14, epoch_time_cost: 14.66
2019-09-06 13:28:10,131 - brc - INFO - Average train loss for epoch 14 is 22.0059817632
2019-09-06 13:28:23,853 - brc - INFO - epoch: 15, epoch_time_cost: 13.52
2019-09-06 13:28:23,853 - brc - INFO - Average train loss for epoch 15 is 21.3352721532
2019-09-06 13:28:37,792 - brc - INFO - epoch: 16, epoch_time_cost: 13.74
2019-09-06 13:28:37,792 - brc - INFO - Average train loss for epoch 16 is 20.9558064143
2019-09-06 13:28:51,719 - brc - INFO - epoch: 17, epoch_time_cost: 13.73
2019-09-06 13:28:51,719 - brc - INFO - Average train loss for epoch 17 is 20.6100145976
2019-09-06 13:29:05,443 - brc - INFO - epoch: 18, epoch_time_cost: 13.53
2019-09-06 13:29:05,444 - brc - INFO - Average train loss for epoch 18 is 20.1902602514
2019-09-06 13:29:18,832 - brc - INFO - epoch: 19, epoch_time_cost: 13.19
2019-09-06 13:29:18,832 - brc - INFO - Average train loss for epoch 19 is 19.8818868001
2019-09-06 13:29:32,366 - brc - INFO - epoch: 20, epoch_time_cost: 13.35
2019-09-06 13:29:32,367 - brc - INFO - Average train loss for epoch 20 is 19.5995518366
2019-09-06 13:29:46,144 - brc - INFO - epoch: 21, epoch_time_cost: 13.59
2019-09-06 13:29:46,144 - brc - INFO - Average train loss for epoch 21 is 19.3210824331
2019-09-06 13:29:59,943 - brc - INFO - epoch: 22, epoch_time_cost: 13.60
2019-09-06 13:29:59,943 - brc - INFO - Average train loss for epoch 22 is 19.0815839767
2019-09-06 13:30:13,998 - brc - INFO - epoch: 23, epoch_time_cost: 13.87
2019-09-06 13:30:13,998 - brc - INFO - Average train loss for epoch 23 is 18.8392149607
2019-09-06 13:30:27,573 - brc - INFO - epoch: 24, epoch_time_cost: 13.38
2019-09-06 13:30:27,573 - brc - INFO - Average train loss for epoch 24 is 18.5984970729
2019-09-06 13:30:41,390 - brc - INFO - epoch: 25, epoch_time_cost: 13.63
2019-09-06 13:30:41,390 - brc - INFO - Average train loss for epoch 25 is 18.3671442668
2019-09-06 13:30:54,598 - brc - INFO - epoch: 26, epoch_time_cost: 13.02
2019-09-06 13:30:54,598 - brc - INFO - Average train loss for epoch 26 is 18.1392790476
2019-09-06 13:31:08,627 - brc - INFO - epoch: 27, epoch_time_cost: 13.84
2019-09-06 13:31:08,628 - brc - INFO - Average train loss for epoch 27 is 17.9132769903
2019-09-06 13:31:21,769 - brc - INFO - epoch: 28, epoch_time_cost: 12.96
2019-09-06 13:31:21,770 - brc - INFO - Average train loss for epoch 28 is 17.6892995834
2019-09-06 13:31:35,381 - brc - INFO - epoch: 29, epoch_time_cost: 13.43
2019-09-06 13:31:35,382 - brc - INFO - Average train loss for epoch 29 is 17.4720538457
2019-09-06 13:31:49,158 - brc - INFO - epoch: 30, epoch_time_cost: 13.59
2019-09-06 13:31:49,159 - brc - INFO - Average train loss for epoch 30 is 17.2570892970
2019-09-06 13:32:03,536 - brc - INFO - epoch: 31, epoch_time_cost: 14.18
2019-09-06 13:32:03,537 - brc - INFO - Average train loss for epoch 31 is 17.0383866628
2019-09-06 13:32:17,849 - brc - INFO - epoch: 32, epoch_time_cost: 14.13
2019-09-06 13:32:17,849 - brc - INFO - Average train loss for epoch 32 is 16.8278697332
2019-09-06 13:32:31,153 - brc - INFO - epoch: 33, epoch_time_cost: 13.08
2019-09-06 13:32:31,153 - brc - INFO - Average train loss for epoch 33 is 16.6187410355
2019-09-06 13:32:45,199 - brc - INFO - epoch: 34, epoch_time_cost: 13.86
2019-09-06 13:32:45,199 - brc - INFO - Average train loss for epoch 34 is 16.4102087021
2019-09-06 13:32:59,989 - brc - INFO - epoch: 35, epoch_time_cost: 14.60
2019-09-06 13:32:59,990 - brc - INFO - Average train loss for epoch 35 is 16.2043565114
2019-09-06 13:33:14,671 - brc - INFO - epoch: 36, epoch_time_cost: 14.49
2019-09-06 13:33:14,672 - brc - INFO - Average train loss for epoch 36 is 16.0016709963
2019-09-06 13:33:28,910 - brc - INFO - epoch: 37, epoch_time_cost: 14.05
2019-09-06 13:33:28,910 - brc - INFO - Average train loss for epoch 37 is 15.8016374906
2019-09-06 13:33:43,245 - brc - INFO - epoch: 38, epoch_time_cost: 14.15
2019-09-06 13:33:43,245 - brc - INFO - Average train loss for epoch 38 is 15.6039719582
2019-09-06 13:33:57,072 - brc - INFO - epoch: 39, epoch_time_cost: 13.64
2019-09-06 13:33:57,072 - brc - INFO - Average train loss for epoch 39 is 15.4081643422
2019-09-06 13:34:11,370 - brc - INFO - epoch: 40, epoch_time_cost: 14.11
2019-09-06 13:34:11,370 - brc - INFO - Average train loss for epoch 40 is 15.2176001867
2019-09-06 13:34:25,765 - brc - INFO - epoch: 41, epoch_time_cost: 14.21
2019-09-06 13:34:25,766 - brc - INFO - Average train loss for epoch 41 is 15.0259342194
2019-09-06 13:34:40,413 - brc - INFO - epoch: 42, epoch_time_cost: 14.46
2019-09-06 13:34:40,414 - brc - INFO - Average train loss for epoch 42 is 14.8374350866
2019-09-06 13:34:54,765 - brc - INFO - epoch: 43, epoch_time_cost: 14.16
2019-09-06 13:34:54,766 - brc - INFO - Average train loss for epoch 43 is 14.6492177645
2019-09-06 13:35:09,096 - brc - INFO - epoch: 44, epoch_time_cost: 14.13
2019-09-06 13:35:09,096 - brc - INFO - Average train loss for epoch 44 is 14.4660690626
2019-09-06 13:35:22,555 - brc - INFO - epoch: 45, epoch_time_cost: 13.27
2019-09-06 13:35:22,555 - brc - INFO - Average train loss for epoch 45 is 14.2831737200
2019-09-06 13:35:36,717 - brc - INFO - epoch: 46, epoch_time_cost: 13.95
2019-09-06 13:35:36,717 - brc - INFO - Average train loss for epoch 46 is 14.1043623288
2019-09-06 13:35:50,628 - brc - INFO - epoch: 47, epoch_time_cost: 13.72
2019-09-06 13:35:50,629 - brc - INFO - Average train loss for epoch 47 is 13.9296310743
2019-09-06 13:36:05,294 - brc - INFO - epoch: 48, epoch_time_cost: 14.48
2019-09-06 13:36:05,294 - brc - INFO - Average train loss for epoch 48 is 13.7521994909
2019-09-06 13:36:19,662 - brc - INFO - epoch: 49, epoch_time_cost: 14.18
2019-09-06 13:36:19,662 - brc - INFO - Average train loss for epoch 49 is 13.5768952370
2019-09-06 13:36:34,102 - brc - INFO - epoch: 50, epoch_time_cost: 14.26
2019-09-06 13:36:34,103 - brc - INFO - Average train loss for epoch 50 is 13.4075390498
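To check that the loss is actually decreasing without reading the whole log, the per-epoch averages can be pulled out with a regular expression. The sketch below is a small helper written for this purpose, not part of the project; the function name is hypothetical, and the pattern only assumes the "Average train loss for epoch N is X" wording seen in the log above.

```python
import re

# Matches the message text regardless of the logger prefix in front of it.
LOSS_RE = re.compile(r"Average train loss for epoch (\d+) is (\d+(?:\.\d+)?)")


def parse_losses(log_text):
    """Map epoch number to average training loss.

    If a message appears more than once for an epoch, the last occurrence wins.
    """
    return {int(epoch): float(loss) for epoch, loss in LOSS_RE.findall(log_text)}


# Two lines copied from the training log above.
log_text = (
    "2019-09-06 13:25:13,447 - brc - INFO - Average train loss for epoch 1 is 36.6342582703\n"
    "2019-09-06 13:36:34,103 - brc - INFO - Average train loss for epoch 50 is 13.4075390498\n"
)
losses = parse_losses(log_text)
print(losses[1] - losses[50])  # total drop in average loss between these epochs
```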
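Run the following command to evaluate the trained model on the dev set. When evaluation finishes, the program automatically computes the ROUGE-L metric and prints the final result. By default, evaluation uses the data in data/demo.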
!cd src && sh run.sh --evaluate --load_dir ../data/models/50
2019-09-06 13:43:34,412 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=True, hidden_size=150, learning_rate=0.001, load_dir='../data/models/50', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=5, predict=False, prepare=False, random_seed=123, result_dir='../data/results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=False, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:43:34,412 - brc - INFO - Load data_set and vocab...
2019-09-06 13:43:34,754 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:43:34,999 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:43:34,999 - brc - INFO - Converting text into ids...
2019-09-06 13:43:35,023 - brc - INFO - Initialize the model...
2019-09-06 13:43:35,112 - brc - INFO - load from ../data/models/50
W0906 13:43:35.880043 229 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0906 13:43:35.883968 229 device_context.cc:267] device: 0, cuDNN Version: 7.3.
WARNING:root:
    You can try our memory optimize feature to save your memory usage:
        # create a build_strategy variable to set memory optimize option
        build_strategy = compiler.BuildStrategy()
        build_strategy.enable_inplace = True
        build_strategy.memory_optimize = True
        # pass the build_strategy to with_data_parallel API
        compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
            loss_name=loss.name, build_strategy=build_strategy)
    !!! Memory optimize is our experimental feature !!!
        some variables may be removed/reused internal to save memory usage,
        in order to fetch the right value of the fetch_list, please set the
        persistable property to true for each variable in fetch_list
        # Sample
        conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None)
        # if you need to fetch conv1, then:
        conv1.persistable = True
I0906 13:43:35.925932 229 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0906 13:43:35.929263 229 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
2019-09-06 13:43:44,000 - brc - INFO - Saving test_result results to ../data/results/test_result.json
{'reflen': 8741, 'guess': [3746, 3647, 3577, 3522], 'testlen': 3746, 'correct': [1199, 639, 442, 351]}
('ratio:', 0.42855508523047375)
2019-09-06 13:43:44,419 - brc - INFO - Dev eval result: {'Bleu-4': 0.04272788984782911, 'Rouge-L': 0.1452405634370218, 'Bleu-1': 0.08436327983345214, 'Bleu-3': 0.050250685466992295, 'Bleu-2': 0.06241806456038598}
2019-09-06 13:43:44,419 - brc - INFO - Predicted answers are saved to ../data/results/
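The n-gram counts printed just before the evaluation result are enough to reproduce the BLEU scores by hand, which helps when interpreting them. The sketch below assumes the standard cumulative BLEU definition (geometric mean of n-gram precisions times a brevity penalty); the function name is hypothetical, and the real scorer may add tiny smoothing constants, so agreement is only expected to several decimal places. Here `correct` holds matched n-gram counts, `guess` holds predicted n-gram counts, and the printed ratio is testlen divided by reflen.

```python
import math


def bleu_from_counts(correct, guess, testlen, reflen):
    """Cumulative BLEU-1..n from n-gram counts; assumes every count is positive."""
    # Brevity penalty: predictions shorter than the references are penalized.
    brevity = 1.0 if testlen >= reflen else math.exp(1 - reflen / testlen)
    scores, log_sum = [], 0.0
    for n, (c, g) in enumerate(zip(correct, guess), start=1):
        log_sum += math.log(c / g)  # log of the n-gram precision
        scores.append(brevity * math.exp(log_sum / n))
    return scores


# Counts copied from the evaluation log above.
stats = {'reflen': 8741, 'guess': [3746, 3647, 3577, 3522],
         'testlen': 3746, 'correct': [1199, 639, 442, 351]}
scores = bleu_from_counts(stats['correct'], stats['guess'],
                          stats['testlen'], stats['reflen'])
for n, s in enumerate(scores, start=1):
    print("Bleu-%d: %.6f" % (n, s))
```

With a length ratio of about 0.43, the brevity penalty alone is roughly 0.26, so much of the low BLEU on the demo data comes from predicted answers being far shorter than the references.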
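Run the following command to make predictions with the trained model. The prediction results are saved under the data/results/ directory, and the JSON file can be opened with a text editor to inspect them. By default, prediction uses the data in data/demo.
After prediction finishes, the predict script also freezes the model parameters. If you need to separate the two functions, modify run.py to split freeze and predict apart.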
!cd src && sh run.sh --predict --load_dir ../data/models/50
2019-09-06 13:46:29,280 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=False, hidden_size=150, learning_rate=0.001, load_dir='../data/models/50', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=5, predict=True, prepare=False, random_seed=123, result_dir='../data/results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=False, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:46:29,280 - brc - INFO - Load data_set and vocab...
2019-09-06 13:46:29,625 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:46:29,861 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:46:29,861 - brc - INFO - Converting text into ids...
2019-09-06 13:46:29,885 - brc - INFO - Initialize the model...
2019-09-06 13:46:29,977 - brc - INFO - load from ../data/models/50
W0906 13:46:30.750571 281 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0906 13:46:30.755205 281 device_context.cc:267] device: 0, cuDNN Version: 7.3.
WARNING:root:
    You can try our memory optimize feature to save your memory usage:
        # create a build_strategy variable to set memory optimize option
        build_strategy = compiler.BuildStrategy()
        build_strategy.enable_inplace = True
        build_strategy.memory_optimize = True
        # pass the build_strategy to with_data_parallel API
        compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
            loss_name=loss.name, build_strategy=build_strategy)
    !!! Memory optimize is our experimental feature !!!
        some variables may be removed/reused internal to save memory usage,
        in order to fetch the right value of the fetch_list, please set the
        persistable property to true for each variable in fetch_list
        # Sample
        conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None)
        # if you need to fetch conv1, then:
        conv1.persistable = True
I0906 13:46:30.797705 281 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0906 13:46:30.801053 281 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
2019-09-06 13:46:39,329 - brc - INFO - Saving test_result results to ../data/results/test_result.json
2019-09-06 13:46:39,340 - brc - INFO - Load data_set and vocab...
2019-09-06 13:46:39,666 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:46:39,845 - brc - INFO - Dev set size: 100 questions.
2019-09-06 13:46:39,846 - brc - INFO - Converting text into ids...
2019-09-06 13:46:39,870 - brc - INFO - Initialize the model...
2019-09-06 13:46:39,954 - brc - INFO - load from ../data/models/50
Prediction results are saved to the file in the form shown below; question_type and question_id identify the corresponding question.
{"yesno_answers": [], "entity_answers": [[]], "answers": ["在使用路由器上網時,咱們會發如今路由器上,標註得有WAN口(有的路由器是Internet口)和LAN口(有的路由器標註的是一、二、三、4)。不少用戶一看就暈了,根本就不知道WAN口與LAN口的區別,天然不知道應該怎麼鏈接了。"], "question_type": "DESCRIPTION", "question_id": 221576}
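To look up the prediction for a given question, the records can be indexed by question_id. The sketch below assumes the result file stores one JSON object per line, as the record above suggests; if the file is laid out differently the loader would need adjusting. The function name is hypothetical, and the answer text in the toy line is shortened for illustration.

```python
import json


def load_predictions(lines):
    """Index prediction records by question_id; `lines` is any iterable of JSON strings."""
    preds = {}
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            record = json.loads(line)
            preds[record["question_id"]] = record
    return preds


# Toy input with the same fields as the record above (answer text shortened).
lines = [
    '{"yesno_answers": [], "entity_answers": [[]], "answers": ["WAN口與LAN口的區別"], '
    '"question_type": "DESCRIPTION", "question_id": 221576}\n',
]
preds = load_predictions(lines)
print(preds[221576]["question_type"], preds[221576]["answers"][0])

# With a real result file, an open file handle can be passed directly, e.g.:
# with open("../data/results/test_result.json", encoding="utf-8") as f:
#     preds = load_predictions(f)
```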
# Run inference with the frozen model parameters
!cd src && python infer.py --predict --result_dir ../data/infer_results/
2019-09-06 13:46:53,136 - brc - INFO - Running with args : Namespace(batch_size=16, dev_interval=10, devset=['../data/demo/devset/search.dev.json'], doc_num=5, drop_rate=0.0, embed_size=300, enable_ce=False, evaluate=False, hidden_size=150, learning_rate=0.001, load_dir='', log_interval=50, log_path=None, max_a_len=200, max_p_len=500, max_p_num=5, max_q_len=60, optim='adam', para_print=False, pass_num=5, predict=True, prepare=False, random_seed=123, result_dir='../data/infer_results/', result_name='test_result', save_dir='../data/models', save_interval=1, testset=['../data/demo/testset/search.test.json'], train=False, trainset=['../data/demo/trainset/search.train.json'], use_gpu=True, vocab_dir='../data/vocab', weight_decay=0.0001)
2019-09-06 13:46:53,137 - brc - INFO - Load data_set and vocab...
2019-09-06 13:46:53,479 - brc - INFO - vocab size is 5007 and embed dim is 300
2019-09-06 13:46:53,711 - brc - INFO - Test set size: 100 questions.
2019-09-06 13:46:53,711 - brc - INFO - Converting text into ids...
2019-09-06 13:46:53,735 - brc - INFO - Initialize the model...
W0906 13:46:54.517477 332 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0906 13:46:54.521802 332 device_context.cc:267] device: 0, cuDNN Version: 7.3.
2019-09-06 13:47:02,587 - brc - INFO - Saving test_result results to ../data/infer_results/test_result.json
Click the link to try this project hands-on in AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/122349
Download and Installation Commands
## Install command for the CPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle
## Install command for the GPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu
>> Visit the PaddlePaddle official website to learn more.