Materials for ACL2022 tutorial: Knowledge-Augmented Methods for Natural Language Processing
1. Time: 9:30am - 1:00pm, May 22nd, 2022 (GMT).
2. Location: Wicklow Hall 2, Convention Centre Dublin.
3. Zoom: Click "join zoom room" at Underline.
Tutorial abstract [PDF]
Knowledge in NLP has been a rising trend, especially after the advent of large-scale pre-trained models. NLP models with attention to knowledge can i) access an unlimited amount of external information; ii) delegate the task of storing knowledge from their parameter space to knowledge sources; iii) obtain up-to-date information; iv) make prediction results more explainable via selected knowledge. In this tutorial, we will introduce the key steps in integrating knowledge into NLP, including knowledge grounding from text, knowledge representation, and knowledge fusing. We will also introduce recent state-of-the-art applications of fusing knowledge into language understanding, language generation, and commonsense reasoning.
@inproceedings{zhu2022knowledge,
title={Knowledge-Augmented Methods for Natural Language Processing},
author={Zhu, Chenguang and Xu, Yichong and Ren, Xiang and Lin, Bill and Jiang, Meng and Yu, Wenhao},
booktitle={Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts},
pages={12--20},
year={2022}
}
1. Slides [Introduction] [KnowledgeForNLU] [KnowledgeForNLG] [KnowledgeForCommonsense] [Conclusion]
2. Video [AllParts]
3. Survey:
- A Survey of Knowledge-enhanced Text Generation, in ACM Computing Surveys (CSUR) 2022. [PDF]
- A Survey of Knowledge Enhanced Pre-trained Models, on arXiv 2021. [pdf]
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing, on arXiv 2021. [pdf]
- A Survey on Retrieval-Augmented Text Generation, on arXiv 2022. [pdf]
4. Reading list:
- KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning, in EMNLP 2019. [pdf]
- Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-Trained Language Models, in EMNLP 2020. [pdf]
- Differentiable Open-Ended Commonsense Reasoning, in NAACL 2021. [pdf]
- CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning, in Findings of EMNLP 2020. [pdf]
- KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering, in ACL 2022. [pdf]
- Dict-BERT: Enhancing Language Model Pre-training with Dictionary, in ACL 2022. [pdf]
- Fusing Context Into Knowledge Graph for Commonsense Question Answering, in ACL 2021. [pdf]
- Retrieval Enhanced Model for Commonsense Generation, in ACL 2021. [pdf]
- Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts, in ACL 2022. [pdf]
- JAKET: Joint Pre-training of Knowledge Graph and Language Understanding, in AAAI 2022. [pdf]
- Boosting Factual Correctness of Abstractive Summarization with Knowledge Graph, in NAACL 2021. [pdf]
Local time (GMT) | Content | Presenter | Slides |
---|---|---|---|
09:30-09:45 | Motivation and Introduction of Knowledge in NLP | Chenguang Zhu | [Slides] |
09:45-10:35 | Knowledge in Natural Language Understanding | Yichong Xu | [Slides] |
10:35-11:00 | Knowledge in Natural Language Generation | Wenhao Yu | [Slides] |
11:00-11:30 | Coffee Break | - | - |
11:30-11:55 | Knowledge in Natural Language Generation | Wenhao Yu | (cont.) |
11:55-12:45 | Commonsense Knowledge and Reasoning for NLP | Yuchen Lin & Xiang Ren | [Slides] |
12:45-13:00 | Summary and Future Direction | Meng Jiang | [Slides] |
Chenguang Zhu, Yichong Xu, Xiang Ren, Yuchen Lin, Meng Jiang, Wenhao Yu
Chenguang Zhu is a Principal Research Manager in the Microsoft Cognitive Services Research Group, where he leads the Knowledge & Language Team. His research in NLP covers knowledge graphs, text summarization, and task-oriented dialogue. He has led teams to first place in multiple NLP competitions, including CommonsenseQA, CommonGen, FEVER, CoQA, ARC and SQuAD v1.0. He holds a Ph.D. degree in Computer Science from Stanford University. Dr. Zhu has given talks at Stanford University, Carnegie Mellon University and the University of Notre Dame. He was previously a TA for the Coursera online class “Automata”, giving teaching sessions to 100K international students.
Yichong Xu is a Senior Researcher in the Knowledge & Language Team in the Microsoft Cognitive Services Research Group. His research in NLP focuses on using external knowledge to help natural language processing, including question answering, commonsense reasoning, and text summarization. Dr. Xu received his Ph.D. in Machine Learning from Carnegie Mellon University. During his time at CMU, he was a TA for large classes (>200 students) on machine learning and convex optimization. Dr. Xu has given talks at the CMU AI Seminar, as well as at many international conferences including ACL, NAACL, NeurIPS, and ICML.
Xiang Ren is an assistant professor at the USC Computer Science Department, a Research Team Leader at USC ISI, and the PI of the Intelligence and Knowledge Discovery (INK) Lab at USC. Previously, he received his Ph.D. in Computer Science from the University of Illinois Urbana-Champaign. Dr. Ren works on knowledge acquisition and reasoning in natural language processing, with a focus on developing human-centered and label-efficient computational methods for building trustworthy NLP systems. He has published over 100 research papers and delivered over 10 tutorials at the top conferences in natural language processing, data mining, and artificial intelligence. He received the NSF CAREER Award, The Web Conference Best Paper runner-up, the ACM SIGKDD Doctoral Dissertation Award, and several research awards from Google, Amazon, JP Morgan, Sony, and Snapchat. He was named to Forbes' Asia 30 Under 30 in 2019.
Bill Yuchen Lin is a Ph.D. candidate at USC. His research goal is to teach machines to think, talk, and act with commonsense knowledge and commonsense reasoning ability as humans do. Towards this ultimate goal, he has been developing knowledge-augmented reasoning methods (e.g., KagNet, MHGRN, DrFact) and constructing benchmark datasets (e.g., CommonGen, RiddleSense, X-CSR) that require commonsense knowledge and complex reasoning for both NLU and NLG. He initiated an online compendium of commonsense reasoning research, which serves as a portal for the community.
Meng Jiang is currently an assistant professor in the Department of Computer Science and Engineering at the University of Notre Dame. He obtained his B.E. and Ph.D. from Tsinghua University. He spent two years at UIUC as a postdoc and joined ND in 2017. His research interests include data mining, machine learning, and natural language processing. He has published more than 100 peer-reviewed papers on these topics. He is the recipient of the Notre Dame International Faculty Research Award. The honors and awards he has received include best paper finalist in KDD 2014, best paper award in the KDD-DLG workshop 2020, and the ACM SIGSOFT Distinguished Paper Award in ICSE 2021. He received the NSF CRII award in 2019 and the CAREER award in 2022.
Wenhao Yu is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of Notre Dame. His research lies in controllable knowledge-driven natural language processing and generation. His research has been published in top-ranked NLP and data mining conferences such as ACL, EMNLP and KDD.