# Machine Reading Comprehension on DuReader

Using BiDAF and QANet on DuReader. Written by YanXu, FangYueran, and ZhangTianyang.

## Pretrained embedding

When training the QANet model, we use pretrained word embeddings from Baidu Encyclopedia. You can download them and save them in the `./embedding` folder.
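
As a rough illustration of how such pretrained vectors might be loaded into an embedding matrix, here is a minimal sketch; the file name, vector dimension, and file format below are assumptions for illustration, not the project's actual code:

```python
import numpy as np

def load_pretrained_embedding(path, vocab, dim=300):
    """Build an embedding matrix for `vocab` from a word2vec-style text file.

    Words missing from the pretrained file keep a small random initialization.
    `path`, `dim`, and the one-word-per-line format are assumptions.
    """
    matrix = np.random.uniform(-0.05, 0.05, (len(vocab), dim)).astype(np.float32)
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if word in vocab and len(values) == dim:
                matrix[vocab[word]] = np.asarray(values, dtype=np.float32)
    return matrix

# Hypothetical usage; the real scripts build the vocabulary during --prepare.
# vocab = {"<pad>": 0, "<unk>": 1}
# embedding = load_pretrained_embedding("./embedding/baidu_encyclopedia.vec", vocab)
```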

## Full experimental results

Complete experimental results (including the datasets, experiment logs, TensorBoard files, and predicted output) can be downloaded from Baidu Netdisk: https://pan.baidu.com/s/1qoxnF00wyJ2dqcAPDYTb8w (extraction code: gn5b). You can use them to overwrite the `./data` folder.

## Usage

### BiDAF

- Generate dict and embedding: `python BaiduRun.py --prepare`
- Train: `python BaiduRun.py --train`
- Evaluate on dev: `python BaiduRun.py --evaluate`
- Output the answers: `python BaiduRun.py --predict`

### QANet

- Generate dict and embedding: `python OurRun.py --prepare`
- Train: `python OurRun.py --train`
- Evaluate on dev: `python OurRun.py --evaluate`
- Output the answers: `python OurRun.py --predict`
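
Both run scripts expose the same four modes. The sketch below shows how a command-line entry point with these flags is typically wired up; the printed placeholder stages stand in for the actual pipeline in `BaiduRun.py` / `OurRun.py` and are not the real implementation:

```python
import argparse

def main():
    parser = argparse.ArgumentParser(description="Run a DuReader reading-comprehension model")
    parser.add_argument("--prepare", action="store_true", help="build the dict and embedding matrix")
    parser.add_argument("--train", action="store_true", help="train the model")
    parser.add_argument("--evaluate", action="store_true", help="evaluate on the dev set")
    parser.add_argument("--predict", action="store_true", help="output answers for the test set")
    args = parser.parse_args()

    # Placeholder stages; the real scripts implement these with their own
    # data pipeline and model code.
    if args.prepare:
        print("preparing vocab and embedding ...")
    if args.train:
        print("training ...")
    if args.evaluate:
        print("evaluating on dev ...")
    if args.predict:
        print("writing predicted answers ...")

if __name__ == "__main__":
    main()
```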

## References

- [1] Yu, A. W., Dohan, D., Luong, M. T., Zhao, R., Chen, K., Norouzi, M., & Le, Q. V. (2018). QANet: Combining local convolution with global self-attention for reading comprehension. arXiv preprint arXiv:1804.09541.
- [2] Seo, M., Kembhavi, A., Farhadi, A., & Hajishirzi, H. (2016). Bidirectional attention flow for machine comprehension. arXiv preprint arXiv:1611.01603.
- [3] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998-6008).
- [4] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
- [5] Xiong, C., Zhong, V., & Socher, R. (2016). Dynamic coattention networks for question answering. arXiv preprint arXiv:1611.01604.
- [6] Srivastava, R. K., Greff, K., & Schmidhuber, J. (2015). Highway networks. arXiv preprint arXiv:1505.00387.
- [7] Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016). Layer normalization. arXiv preprint arXiv:1607.06450.
- [8] Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357.
- [9] Weissenborn, D., Wiese, G., & Seiffe, L. (2017). Making neural QA as simple as possible but not simpler. arXiv preprint arXiv:1703.04816.