Merge pull request #1 from WING-NUS/main
Sync downstream
knmnyn authored Jul 13, 2024
2 parents 67ef63d + 4fe4a41 commit ea17e66
Showing 197 changed files with 4,138 additions and 19 deletions.
8 changes: 3 additions & 5 deletions content/_index.md
@@ -8,13 +8,11 @@ sections:
- block: hero
content:
title: |
-Wowchemy
-Research Group
+Web IR / NLP Group (WING) @ NUS
image:
filename: welcome.jpg
text: |
<br>
-The **Web, Information Retrieval / Natural Language Processing Group (WING)** explores the research area of applied language processing and information retrieval to the Web and related technologies. Areas of current interest are question answering, scholarly digital libraries, verb similarity, focused crawling, citation parsing and spidering, web page classification and division, text segmentation, and full text analysis. WING is headed by A/P Min-Yen KAN. We are based in the Computational Linguistics Laboratory of the School of Computing at the National University of Singapore. We often work with the Natural Language Processing Group and the (Lab for Media Search)[http://lms.comp.nus.edu.sg/]. We are part of the Media Technologies research group umbrella.
+The <strong>Web, Information Retrieval / Natural Language Processing Group (WING)</strong> explores the research area of applied language processing and information retrieval to the Web and related technologies. Areas of current interest are question answering, scholarly digital libraries, verb similarity, focused crawling, citation parsing and spidering, web page classification and division, text segmentation, and full text analysis. WING is headed by <A HREF="authors/min-yen-kan">Min</a> (A/P Min-Yen KAN). We are based in the Computational Linguistics Laboratory of the <a href="https://www.comp.nus.edu.sg">School of Computing</a> at the National University of Singapore. We often work with the Natural Language Processing Group and the <a href="http://lms.comp.nus.edu.sg/">Lab for Media Search</a>. We are part of the Media Technologies research group umbrella.
- block: collection
content:
@@ -73,7 +71,7 @@ sections:
title:
subtitle:
text: |
{{% cta cta_link="./people/" cta_text="Meet the team →" %}}
{{% cta cta_link="./people/" %}}
design:
columns: '1'
---
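
The two hunks above show only fragments of `content/_index.md`. For orientation, here is a rough sketch of the landing page's `sections` list after this commit — illustrative, not verbatim: indentation is restored, collapsed regions are elided, and the type of the final block is hidden behind the fold, so it appears as a placeholder.

```yaml
# Sketch of content/_index.md after this commit (assumed reconstruction)
sections:
  - block: hero
    content:
      title: |
        Web IR / NLP Group (WING) @ NUS
      image:
        filename: welcome.jpg
      text: |
        <br>
        The <strong>Web, Information Retrieval / Natural Language Processing
        Group (WING)</strong> explores applied language processing and
        information retrieval on the Web ...
  - block: collection        # contents collapsed in the diff above
  # ... further blocks hidden behind the fold ...
  - block: unknown           # final block; its type is not visible in this diff
    content:
      title:
      subtitle:
      text: |
        {{% cta cta_link="./people/" %}}
    design:
      columns: '1'
```

Note that the `{{% cta %}}` change drops the explicit `cta_text="Meet the team →"` parameter; presumably the shortcode falls back to a default label when it is omitted.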
11 changes: 5 additions & 6 deletions content/authors/alumnus/_index.md
@@ -3,8 +3,8 @@
title: Somebody

# Full Name (for SEO)
-first_name: Min-Yen
-last_name: Kan
+first_name: Somebody
+last_name: Somewhere

# Is this the primary user of the site?
superuser: false
@@ -14,9 +14,9 @@ superuser: false
# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
# form "mailto:[email protected]" or "#contact" for contact widget.
social:
-- icon: house
-  icon_pack: fas
-  link: https://www.comp.nus.edu.sg/~kanmy/
+# - icon: house
+#   icon_pack: fas
+#   link: https://www.comp.nus.edu.sg/~kanmy/

# Highlight the author in author lists? (true/false)
highlight_name: false
@@ -27,5 +27,4 @@ user_groups:
- Alumni

---

An alumnus
62 changes: 62 additions & 0 deletions content/authors/jason/_index.md
@@ -0,0 +1,62 @@
---
# Display name
title: Jason Qiu

# Full Name (for SEO)
first_name: Jason
last_name: Qiu

# Is this the primary user of the site?
superuser: true

# Role/position
role: Undergraduate Student

# Organizations/Affiliations
organizations:
- name: National University of Singapore, School of Computing
url: 'http://www.comp.nus.edu.sg'

# Short bio (displayed in user profile at end of posts)
bio: FYP student

interests:
- Artificial Intelligence
- Information Retrieval




# Social/Academic Networking
# For available icons, see: https://docs.hugoblox.com/getting-started/page-builder/#icons
# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
# form "mailto:[email protected]" or "#contact" for contact widget.
social:
- icon: house
icon_pack: fas
link: https://www.linkedin.com/in/jasonqiu212/
- icon: envelope
icon_pack: fas
link: 'mailto:[email protected]'

# Link to a PDF of your resume/CV from the About widget.
# To enable, copy your resume/CV to `static/files/cv.pdf` and uncomment the lines below.
# - icon: cv
# icon_pack: ai
# link: files/cv.pdf

# Enter email to display Gravatar (if Gravatar enabled in Config)
email: '[email protected]'

# Highlight the author in author lists? (true/false)
highlight_name: false

# Organizational groups that you belong to (for People widget)
# Set this to `[]` or comment out if you are not using People widget.
user_groups:
- Undergraduate Students
# - Researchers
---

Jason Qiu is an FYP student joining our group in 2024.

Binary file added content/authors/jason/avatar.jpg
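
As the two entries above show, an author is a directory under `content/authors/` holding the profile front matter plus an optional photo. A sketch of the layout this commit adds (file roles per the front-matter comments above; the Gravatar fallback applies only if Gravatar is enabled in the site config):

```yaml
# content/authors/jason/
#   _index.md   — profile front matter (shown above) drives the author page
#   avatar.jpg  — profile photo; Gravatar via `email` is the fallback
# The key that places the author on the People page:
user_groups:
  - Undergraduate Students   # must name a group listed by the People widget
```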
8 changes: 4 additions & 4 deletions content/authors/min/_index.md
@@ -18,7 +18,7 @@ organizations:
url: 'http://www.comp.nus.edu.sg'

# Short bio (displayed in user profile at end of posts)
-bio: My research interests include distributed robotics, mobile computing and programmable matter.
+bio: WING lead; interests include Digital Libraries, Information Retrieval and Natural Language Processing.

interests:
- Artificial Intelligence
@@ -76,11 +76,11 @@ highlight_name: false
# Organizational groups that you belong to (for People widget)
# Set this to `[]` or comment out if you are not using People widget.
user_groups:
-- Principal Investigators
-- Researchers
+- Principal Investigator / Staff
+# - Researchers
---

-Min-Yen Kan (BS;MS;PhD Columbia Univ.; SACM, SIEEE) is an Associate Professor and Vice Dean of Undergraduate Studies at the National University of Singapore. Min is an active member of the Association of Computational Linguistics (ACL), currently serving as a co-chair for the ACL Ethics Committee, and previously as the ACL Anthology Director (2008–2018). He is an associate editor for Information Retrieval and the survey editor for the Journal of AI Research (JAIR).
+Min-Yen Kan (BS;MS;PhD Columbia Univ.; SACM, SIEEE) is an Associate Professor and Vice Dean of Undergraduate Studies at the National University of Singapore. Min is an active member of the Association of Computational Linguistics (ACL), currently serving as a co-chair for the ACL Ethics Committee (ACL AEC), and previously as the ACL Anthology Director (2008–2018). He is an associate editor for Information Retrieval and the survey editor for the Journal of AI Research (JAIR).

His research interests include digital libraries, natural language processing and information retrieval. He was recognized as a distinguished speaker by the ACM for natural language processing and digital libraries research. Specific projects include work in the areas of scientific discourse analysis, fact verification, full-text literature mining, lexical semantics and large language models. He leads the Web Information Retrieval / Natural Language Processing Group (WING.NUS) http://wing.comp.nus.edu.sg/

7 changes: 3 additions & 4 deletions content/people/index.md
@@ -7,20 +7,19 @@ type: landing
sections:
- block: people
content:
-title: Meet the Team
+# title: Meet the Team
# Choose which groups/teams of users to display.
# Edit `user_groups` in each user's profile to add them to one or more of these groups.
user_groups:
-- Principal Investigators
-- Staff
+- Principal Investigator / Staff
- Graduate Students
- Undergraduate Students
- Visitors / Interns
- Alumni
sort_by: Params.last_name
sort_ascending: true
design:
-show_interests: false
+show_interests: true
show_role: true
show_social: true
---
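
One coupling worth noting: the people block lists group names, and each author opts into a group by declaring the same string under `user_groups` in their own profile — matching appears to be by exact string, which is why renaming `Principal Investigators` here is mirrored by the edit to `content/authors/min/_index.md` above. A minimal sketch of the pairing:

```yaml
# content/people/index.md — groups rendered on the People page, in this order
user_groups:
  - Principal Investigator / Staff
  - Graduate Students
---
# content/authors/min/_index.md — per-author membership declaration
user_groups:
  - Principal Investigator / Staff   # must match the page's string exactly
```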
11 changes: 11 additions & 0 deletions content/publication/8743365/cite.bib
@@ -0,0 +1,11 @@
@article{8743365,
author = {An, Ya-Hui and Pan, Liangming and Kan, Min-Yen and Dong, Qiang and Fu, Yan},
doi = {10.1109/ACCESS.2019.2924250},
journal = {IEEE Access},
keywords = {Context;Task analysis;Tagging;Message systems;Discussion forums;Context modeling;Semantics;Artificial intelligence;deep learning;hyperlinking;learning resources;MOOC discussion forums;name entity recognition},
number = {},
pages = {87887-87900},
title = {Resource Mention Extraction for MOOC Discussion Forums},
volume = {7},
year = {2019}
}
19 changes: 19 additions & 0 deletions content/publication/8743365/index.md
@@ -0,0 +1,19 @@
---
title: Resource Mention Extraction for MOOC Discussion Forums
authors:
- Ya-Hui An
- Liangming Pan
- min
- Qiang Dong
- Yan Fu
date: '2019-01-01'
publishDate: '2024-07-12T07:37:44.756935Z'
publication_types:
- article-journal
publication: '*IEEE Access*'
doi: 10.1109/ACCESS.2019.2924250
tags:
- Context;Task analysis;Tagging;Message systems;Discussion forums;Context modeling;Semantics;Artificial
intelligence;deep learning;hyperlinking;learning resources;MOOC discussion forums;name
entity recognition
---
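
These publication pages pair a machine-readable `cite.bib` with a generated `index.md`. In the `authors:` list, an entry that matches an author folder name (here `min`, i.e. `content/authors/min/` from earlier in this diff) links to that member's profile, while the other entries render as plain text — a sketch of the convention:

```yaml
authors:
  - Ya-Hui An   # no folder under content/authors/ — rendered as plain text
  - min         # matches content/authors/min/_index.md — linked to the profile
```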
11 changes: 11 additions & 0 deletions content/publication/acl-2017-association-linguistics/cite.bib
@@ -0,0 +1,11 @@
@proceedings{acl-2017-association-linguistics,
address = {Vancouver, Canada},
doi = {10.18653/v1/P17-2},
editor = {Barzilay, Regina and
Kan, Min-Yen},
month = {July},
publisher = {Association for Computational Linguistics},
title = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
url = {https://aclanthology.org/P17-2000},
year = {2017}
}
16 changes: 16 additions & 0 deletions content/publication/acl-2017-association-linguistics/index.md
@@ -0,0 +1,16 @@
---
title: 'Proceedings of the 55th Annual Meeting of the Association for Computational
Linguistics (Volume 2: Short Papers)'
authors:
- Regina Barzilay
- min
date: '2017-07-01'
publishDate: '2024-07-11T07:40:56.389063Z'
publication_types:
- book
publication: '*Association for Computational Linguistics*'
doi: 10.18653/v1/P17-2
links:
- name: URL
url: https://aclanthology.org/P17-2000
---
26 changes: 26 additions & 0 deletions content/publication/aksu-etal-2021-velocidapter/cite.bib
@@ -0,0 +1,26 @@
@inproceedings{aksu-etal-2021-velocidapter,
abstract = {We introduce a synthetic dialogue generation framework, Velocidapter, which addresses the corpus availability problem for dialogue comprehension. Velocidapter augments datasets by simulating synthetic conversations for a task-oriented dialogue domain, requiring a small amount of bootstrapping work for each new domain. We evaluate the efficacy of our framework on a task-oriented dialogue comprehension dataset, MRCWOZ, which we curate by annotating questions for slots in the restaurant, taxi, and hotel domains of the MultiWOZ 2.2 dataset (Zang et al., 2020). We run experiments within a low-resource setting, where we pretrain a model on SQuAD, fine-tuning it on either a small original data or on the synthetic data generated by our framework. Velocidapter shows significant improvements using both the transformer-based BERTBase and BiDAF as base models. We further show that the framework is easy to use by novice users and conclude that Velocidapter can greatly help training over task-oriented dialogues, especially for low-resourced emerging domains.},
address = {Singapore and Online},
author = {Aksu, Ibrahim Taha and
Liu, Zhengyuan and
Kan, Min-Yen and
Chen, Nancy},
booktitle = {Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue},
doi = {10.18653/v1/2021.sigdial-1.14},
editor = {Li, Haizhou and
Levow, Gina-Anne and
Yu, Zhou and
Gupta, Chitralekha and
Sisman, Berrak and
Cai, Siqi and
Vandyke, David and
Dethlefs, Nina and
Wu, Yan and
Li, Junyi Jessy},
month = {July},
pages = {133--143},
publisher = {Association for Computational Linguistics},
title = {Velocidapter: Task-oriented Dialogue Comprehension Modeling Pairing Synthetic Text Generation with Domain Adaptation},
url = {https://aclanthology.org/2021.sigdial-1.14},
year = {2021}
}
32 changes: 32 additions & 0 deletions content/publication/aksu-etal-2021-velocidapter/index.md
@@ -0,0 +1,32 @@
---
title: 'Velocidapter: Task-oriented Dialogue Comprehension Modeling Pairing Synthetic
Text Generation with Domain Adaptation'
authors:
- Ibrahim Taha Aksu
- Zhengyuan Liu
- min
- Nancy Chen
date: '2021-07-01'
publishDate: '2024-07-05T17:09:42.645613Z'
publication_types:
- paper-conference
publication: '*Proceedings of the 22nd Annual Meeting of the Special Interest Group
on Discourse and Dialogue*'
doi: 10.18653/v1/2021.sigdial-1.14
abstract: We introduce a synthetic dialogue generation framework, Velocidapter, which
addresses the corpus availability problem for dialogue comprehension. Velocidapter
augments datasets by simulating synthetic conversations for a task-oriented dialogue
domain, requiring a small amount of bootstrapping work for each new domain. We evaluate
the efficacy of our framework on a task-oriented dialogue comprehension dataset,
MRCWOZ, which we curate by annotating questions for slots in the restaurant, taxi,
and hotel domains of the MultiWOZ 2.2 dataset (Zang et al., 2020). We run experiments
within a low-resource setting, where we pretrain a model on SQuAD, fine-tuning it
on either a small original data or on the synthetic data generated by our framework.
Velocidapter shows significant improvements using both the transformer-based BERTBase
and BiDAF as base models. We further show that the framework is easy to use by novice
users and conclude that Velocidapter can greatly help training over task-oriented
dialogues, especially for low-resourced emerging domains.
links:
- name: URL
url: https://aclanthology.org/2021.sigdial-1.14
---
19 changes: 19 additions & 0 deletions content/publication/aksu-etal-2022-n/cite.bib
@@ -0,0 +1,19 @@
@inproceedings{aksu-etal-2022-n,
abstract = {Augmentation of task-oriented dialogues has followed standard methods used for plain-text such as back-translation, word-level manipulation, and paraphrasing despite its richly annotated structure. In this work, we introduce an augmentation framework that utilizes belief state annotations to match turns from various dialogues and form new synthetic dialogues in a bottom-up manner. Unlike other augmentation strategies, it operates with as few as five examples. Our augmentation strategy yields significant improvements when both adapting a DST model to a new domain, and when adapting a language model to the DST task, on evaluations with TRADE and TOD-BERT models. Further analysis shows that our model performs better on seen values during training, and it is also more robust to unseen values. We conclude that exploiting belief state annotations enhances dialogue augmentation and results in improved models in n-shot training scenarios.},
address = {Dublin, Ireland},
author = {Aksu, Ibrahim and
Liu, Zhengyuan and
Kan, Min-Yen and
Chen, Nancy},
booktitle = {Findings of the Association for Computational Linguistics: ACL 2022},
doi = {10.18653/v1/2022.findings-acl.131},
editor = {Muresan, Smaranda and
Nakov, Preslav and
Villavicencio, Aline},
month = {May},
pages = {1659--1671},
publisher = {Association for Computational Linguistics},
title = {N-Shot Learning for Augmenting Task-Oriented Dialogue State Tracking},
url = {https://aclanthology.org/2022.findings-acl.131},
year = {2022}
}
29 changes: 29 additions & 0 deletions content/publication/aksu-etal-2022-n/index.md
@@ -0,0 +1,29 @@
---
title: N-Shot Learning for Augmenting Task-Oriented Dialogue State Tracking
authors:
- Ibrahim Aksu
- Zhengyuan Liu
- min
- Nancy Chen
date: '2022-05-01'
publishDate: '2024-07-05T17:09:42.588862Z'
publication_types:
- paper-conference
publication: '*Findings of the Association for Computational Linguistics: ACL 2022*'
doi: 10.18653/v1/2022.findings-acl.131
abstract: Augmentation of task-oriented dialogues has followed standard methods used
for plain-text such as back-translation, word-level manipulation, and paraphrasing
despite its richly annotated structure. In this work, we introduce an augmentation
framework that utilizes belief state annotations to match turns from various dialogues
and form new synthetic dialogues in a bottom-up manner. Unlike other augmentation
strategies, it operates with as few as five examples. Our augmentation strategy
yields significant improvements when both adapting a DST model to a new domain,
and when adapting a language model to the DST task, on evaluations with TRADE and
TOD-BERT models. Further analysis shows that our model performs better on seen values
during training, and it is also more robust to unseen values. We conclude that exploiting
belief state annotations enhances dialogue augmentation and results in improved
models in n-shot training scenarios.
links:
- name: URL
url: https://aclanthology.org/2022.findings-acl.131
---
18 changes: 18 additions & 0 deletions content/publication/aksu-etal-2023-prompter/cite.bib
@@ -0,0 +1,18 @@
@inproceedings{aksu-etal-2023-prompter,
abstract = {A challenge in the Dialogue State Tracking (DST) field is adapting models to new domains without using any supervised data --- zero-shot domain adaptation. Parameter-Efficient Transfer Learning (PETL) has the potential to address this problem due to its robustness. However, it has yet to be applied to the zero-shot scenarios, as it is not clear how to apply it unsupervisedly. Our method, Prompter, uses descriptions of target domain slots to generate dynamic prefixes that are concatenated to the key and values at each layer′s self-attention mechanism. This allows for the use of prefix-tuning in zero-shot. Prompter outperforms previous methods on both the MultiWOZ and SGD benchmarks. In generating prefixes, our analyses find that Prompter not only utilizes the semantics of slot descriptions but also how often the slots appear together in conversation. Moreover, Prompter′s gains are due to its improved ability to distinguish ″none″-valued dialogue slots, compared against baselines.},
address = {Toronto, Canada},
author = {Aksu, Ibrahim Taha and
Kan, Min-Yen and
Chen, Nancy},
booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
doi = {10.18653/v1/2023.acl-long.252},
editor = {Rogers, Anna and
Boyd-Graber, Jordan and
Okazaki, Naoaki},
month = {July},
pages = {4588--4603},
publisher = {Association for Computational Linguistics},
title = {Prompter: Zero-shot Adaptive Prefixes for Dialogue State Tracking Domain Adaptation},
url = {https://aclanthology.org/2023.acl-long.252},
year = {2023}
}
29 changes: 29 additions & 0 deletions content/publication/aksu-etal-2023-prompter/index.md
@@ -0,0 +1,29 @@
---
title: 'Prompter: Zero-shot Adaptive Prefixes for Dialogue State Tracking Domain Adaptation'
authors:
- Ibrahim Taha Aksu
- min
- Nancy Chen
date: '2023-07-01'
publishDate: '2024-07-06T02:22:24.632344Z'
publication_types:
- paper-conference
publication: '*Proceedings of the 61st Annual Meeting of the Association for Computational
Linguistics (Volume 1: Long Papers)*'
doi: 10.18653/v1/2023.acl-long.252
abstract: A challenge in the Dialogue State Tracking (DST) field is adapting models
to new domains without using any supervised data --- zero-shot domain adaptation.
Parameter-Efficient Transfer Learning (PETL) has the potential to address this problem
due to its robustness. However, it has yet to be applied to the zero-shot scenarios,
as it is not clear how to apply it unsupervisedly. Our method, Prompter, uses descriptions
of target domain slots to generate dynamic prefixes that are concatenated to the
key and values at each layer′s self-attention mechanism. This allows for the use
of prefix-tuning in zero-shot. Prompter outperforms previous methods on both the
MultiWOZ and SGD benchmarks. In generating prefixes, our analyses find that Prompter
not only utilizes the semantics of slot descriptions but also how often the slots
appear together in conversation. Moreover, Prompter′s gains are due to its improved
ability to distinguish ″none″-valued dialogue slots, compared against baselines.
links:
- name: URL
url: https://aclanthology.org/2023.acl-long.252
---
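
Across the generated publication pages in this diff, the BibTeX entry type maps onto a CSL-style `publication_types` value, and the near-identical `publishDate` stamps (all mid-July 2024) suggest these files came from a batch BibTeX import rather than being hand-written. The mapping observed above, illustrated with values taken from the `aksu-etal-2022-n` entry:

```yaml
# cite.bib entry type → publication_types value (as seen in this commit):
#   @article       → article-journal    (IEEE Access paper)
#   @proceedings   → book               (ACL 2017 proceedings volume)
#   @inproceedings → paper-conference   (SIGDIAL / Findings / ACL papers)
publication_types:
  - paper-conference
publication: '*Findings of the Association for Computational Linguistics: ACL 2022*'  # from booktitle
doi: 10.18653/v1/2022.findings-acl.131   # copied from the bib doi field
```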
