Commit

Merge pull request #4 from WING-NUS/main
Sync from downstream.
knmnyn authored Jul 23, 2024
2 parents f559796 + 928a848 commit f5fbfe3
Showing 14 changed files with 363 additions and 0 deletions.
68 changes: 68 additions & 0 deletions content/authors/thushari/_index.md
@@ -0,0 +1,68 @@
---
# Display name
title: Thushari Pahalage

# Full Name (for SEO)
first_name: Thushari
last_name: Pahalage

# Is this the primary user of the site?
superuser: false

# Role/position
role: Research Intern

# Organizations/Affiliations
organizations:
- name: University of Lorraine
url: 'https://genial.univ-lorraine.fr/'

# Short bio (displayed in user profile at end of posts)
bio: Research Intern

interests:
- Machine Learning
- Natural Language Processing
- Explainable AI
- Cloud Computing

education:
courses:
- course: Erasmus Mundus Joint Master's Degree in Green Networking and Cloud Computing
institution: University of Lorraine, Leeds Beckett University, Luleå University of Technology
year: 2023
- course: B.Tech in Software Engineering
institution: Delhi Technological University
year: 2019

# Social/Academic Networking
# For available icons, see: https://docs.hugoblox.com/getting-started/page-builder/#icons
# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
# form "mailto:[email protected]" or "#contact" for contact widget.
social:
- icon: envelope
icon_pack: fas
link: 'mailto:[email protected]'
- icon: github
icon_pack: fab
link: https://github.com/thushariii
# Link to a PDF of your resume/CV from the About widget.
# To enable, copy your resume/CV to `static/files/cv.pdf` and uncomment the lines below.
# - icon: cv
# icon_pack: ai
# link: files/cv.pdf

# Enter email to display Gravatar (if Gravatar enabled in Config)
email: '[email protected]'

# Highlight the author in author lists? (true/false)
highlight_name: false

# Organizational groups that you belong to (for People widget)
# Set this to `[]` or comment out if you are not using People widget.
user_groups:
- Visitors / Interns
# - Researchers
---

Thushari Pahalage is a Research Intern in the WING Research Group, working for two months starting from July 2024. She is currently enrolled in the Erasmus Mundus Joint Master's Degree program in Green Networking and Cloud Computing.
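
For reference, the commented-out CV block in each author file can be enabled by copying the PDF to `static/files/cv.pdf` and uncommenting the lines; a minimal sketch of the resulting `social` entry (no CV file is actually added in this commit):

social:
  # ... existing envelope / github entries ...
  - icon: cv
    icon_pack: ai
    link: files/cv.pdf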
Binary file added content/authors/thushari/avatar.jpg
73 changes: 73 additions & 0 deletions content/authors/victor/_index.md
@@ -0,0 +1,73 @@
---
# Display name
title: Chuang Li

# Full Name (for SEO)
first_name: Chuang
last_name: Li

# Is this the primary user of the site?
superuser: true

# Role/position
role: Graduate Student

# Organizations/Affiliations
organizations:
- name: National University of Singapore, School of Computing
url: 'http://www.comp.nus.edu.sg'

# Short bio (displayed in user profile at end of posts)
bio: PhD Candidate August 2020 Intake

interests:
- Information Retrieval
- Conversational Recommender System
- Large Language Models

education:
courses:
- course: PhD in Computer Science
institution: National University of Singapore
year: 2020-2024
- course: BSc in Electrical and Computer Engineering
institution: National University of Singapore
year: 2016-2020

# Social/Academic Networking
# For available icons, see: https://docs.hugoblox.com/getting-started/page-builder/#icons
# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
# form "mailto:[email protected]" or "#contact" for contact widget.
social:
- icon: house
icon_pack: fas
link: https://lichuangnus.github.io
- icon: envelope
icon_pack: fas
link: 'mailto:[email protected]'
- icon: google-scholar
icon_pack: ai
link: https://scholar.google.com/citations?user=_9Q_D1MAAAAJ&hl=en
- icon: github
icon_pack: fab
link: https://github.com/lichuangnus
# Link to a PDF of your resume/CV from the About widget.
# To enable, copy your resume/CV to `static/files/cv.pdf` and uncomment the lines below.
# - icon: cv
# icon_pack: ai
# link: files/cv.pdf

# Enter email to display Gravatar (if Gravatar enabled in Config)
email: '[email protected]'

# Highlight the author in author lists? (true/false)
highlight_name: false

# Organizational groups that you belong to (for People widget)
# Set this to `[]` or comment out if you are not using People widget.
user_groups:
- Graduate Students
# - Researchers
---

Victor is a fourth-year PhD candidate jointly supervised by Professor Kan Min-Yen and Professor Li Haizhou. His research interests are in Conversational Recommender Systems and Natural Language Processing.
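
The social-link comments in these author files also allow an email icon that points at the site's contact widget instead of a mail client; a minimal sketch of that `#contact` variant, as described in the comment above:

social:
  - icon: envelope
    icon_pack: fas
    link: '#contact'  # opens the contact widget rather than a mailto: link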
Binary file added content/authors/victor/avatar.jpg
75 changes: 75 additions & 0 deletions content/authors/yisong/_index.md
@@ -0,0 +1,75 @@
---
# Display name
title: Yisong Miao

# Full Name (for SEO)
first_name: Yisong
last_name: Miao

# Is this the primary user of the site?
superuser: true

# Role/position
role: Graduate Student

# Organizations/Affiliations
organizations:
- name: National University of Singapore, School of Computing
url: 'http://www.comp.nus.edu.sg'

# Short bio (displayed in user profile at end of posts)
bio: PhD Candidate January 2021 Intake

interests:
- Discourse Semantics
- Lexical Semantics

education:
courses:
- course: PhD in Computer Science
institution: National University of Singapore
year: 2021-Now
- course: Master of Computing
institution: National University of Singapore
year: 2018-2020
- course: Bachelor of Computing
institution: University of Chinese Academy of Sciences
year: 2014-2018

# Social/Academic Networking
# For available icons, see: https://docs.hugoblox.com/getting-started/page-builder/#icons
# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
# form "mailto:[email protected]" or "#contact" for contact widget.
social:
- icon: house
icon_pack: fas
link: https://yisong.me
- icon: envelope
icon_pack: fas
link: 'mailto:[email protected]'
- icon: google-scholar
icon_pack: ai
link: http://scholar.google.com/citations?user=a-oIKBoAAAAJ&hl=en
- icon: github
icon_pack: fab
link: https://github.com/yisongmiao
# Link to a PDF of your resume/CV from the About widget.
# To enable, copy your resume/CV to `static/files/cv.pdf` and uncomment the lines below.
# - icon: cv
# icon_pack: ai
# link: files/cv.pdf

# Enter email to display Gravatar (if Gravatar enabled in Config)
email: '[email protected]'

# Highlight the author in author lists? (true/false)
highlight_name: false

# Organizational groups that you belong to (for People widget)
# Set this to `[]` or comment out if you are not using People widget.
user_groups:
- Graduate Students
# - Researchers
---

Yisong is a fourth-year PhD student under the supervision of Prof. Min-Yen Kan. His research interests include discourse and lexical semantics. He [composes emojis](https://yisong.me/publications/[email protected]) to represent lexical compositions and [asks questions](https://yisong.me/publications/acl24-DiSQ-CR.pdf) to evaluate language models' faithful understanding of discourse relations.
Binary file added content/authors/yisong/avatar.jpg
Binary file added content/publication/acl24discursive/.DS_Store
Binary file not shown.
7 changes: 7 additions & 0 deletions content/publication/acl24discursive/cite.bib
@@ -0,0 +1,7 @@
@inproceedings{acl24a,
title={Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations},
author={Yisong Miao and Hongfu Liu and Wenqiang Lei and Nancy F. Chen and Min-Yen Kan},
booktitle={Proceedings of the Annual Meeting of the Association for Computational Linguistics},
year={2024},
organization={ACL}
}
Binary file added content/publication/acl24discursive/featured.png
61 changes: 61 additions & 0 deletions content/publication/acl24discursive/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
---
title: 'Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations'
# Yisong Miao, Hongfu Liu, Wenqiang Lei, Nancy F. Chen and Min-Yen Kan (2024) Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL '24).

authors:
- yisong
- Hongfu Liu
- Wenqiang Lei
- Nancy F. Chen
- min

date: '2024-08-11T00:00:00Z'
doi: ''

# Schedule page publish date (NOT publication's date).
publishDate: '2017-01-01T00:00:00Z'
publication_types: ['paper-conference']

# Publication name and optional abbreviated publication name.
publication: In *Proceedings of the Annual Meeting of the Association for Computational Linguistics*
publication_short: In *ACL 2024*

abstract: While large language models have significantly enhanced the effectiveness of discourse relation classifications, it remains unclear whether their comprehension is faithful and reliable. We provide DiSQ, a new method for evaluating the faithfulness of understanding discourse based on question answering. We first employ in-context learning to annotate the reasoning for discourse comprehension, based on the connections among key events within the discourse. Following this, DiSQ interrogates the model with a sequence of questions to assess its grasp of core event relations, its resilience to counterfactual queries, as well as its consistency with its previous responses. We then evaluate language models with different architectural designs using DiSQ, finding: (1) DiSQ presents a significant challenge for all models, with the top-performing GPT model attaining only 41% of the ideal performance in PDTB; (2) DiSQ is robust to domain shifts and paraphrase variations; (3) Open-source models generally lag behind their closed-source GPT counterparts, with notable exceptions being those enhanced with chat and code/math features; (4) Our analysis validates the effectiveness of explicitly signalled discourse connectives, the role of contextual information, and the benefits of using historical QA data.

# Summary. An optional shortened abstract.
summary: We propose Discursive Socratic Questioning (DiSQ), a new evaluation measure for discourse semantics. Inspired by the Socratic method, DiSQ involves asking models about key event relations, testing their robustness to counterfactuals, and ensuring consistency with equivalent questions. Experiments show that GPT-4 achieves only 41% of the ideal DiSQ score. We recommend using context and discourse connectives as essential linguistic features to enhance discourse comprehension.

tags: [nlp]

# Display this page in the Featured widget?
featured: true

# Custom links (uncomment lines below)
# links:
# - name: Custom Link
# url: http://example.org

url_pdf: 'https://yisong.me/publications/acl24-DiSQ-CR.pdf'
url_code: 'https://github.com/YisongMiao/DiSQ-Score'

# Featured image
# To use, add an image named `featured.jpg/png` to your page's folder.
image:
caption: 'DiSQ combines three discourse-relevant scores: (1) Targeted Score, gauging responses to key events; (2) Counterfactual Score, assessing robustness against irrelevant queries; (3) Consistency Score, measuring logical coherence across equivalent questions.'
preview_only: false

# Associated Projects (optional).
# Associate this publication with one or more of your projects.
# Simply enter your project's folder or file name without extension.
# E.g. `internal-project` references `content/project/internal-project/index.md`.
# Otherwise, set `projects: []`.
projects:
- example

# Slides (optional).
# Associate this publication with Markdown slides.
# Simply enter your slide deck's filename without extension.
# E.g. `slides: "example"` references `content/slides/example/index.md`.
# Otherwise, set `slides: ""`.
# slides: example
---
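
The commented-out `links` and `slides` fields above follow the same pattern as `url_pdf` and `url_code`; a minimal sketch with placeholder values (the poster URL and slide-deck name below are hypothetical, not assets added in this commit):

links:
  - name: Poster
    url: https://example.org/acl24-DiSQ-poster.pdf  # hypothetical link
slides: example  # references content/slides/example/index.md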
20 changes: 20 additions & 0 deletions content/publication/lreccoling24elco/cite.bib
@@ -0,0 +1,20 @@
@inproceedings{yang-etal-2024-elco-dataset,
title = "The {ELC}o Dataset: Bridging Emoji and Lexical Composition",
author = "Yang, Zi Yun and
Zhang, Ziqing and
Miao, Yisong",
editor = "Calzolari, Nicoletta and
Kan, Min-Yen and
Hoste, Veronique and
Lenci, Alessandro and
Sakti, Sakriani and
Xue, Nianwen",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
month = may,
year = "2024",
address = "Torino, Italia",
publisher = "ELRA and ICCL",
url = "https://aclanthology.org/2024.lrec-main.1381",
pages = "15899--15909",
abstract = "Can emojis be composed to convey intricate meanings like English phrases? As a pioneering study, we present the Emoji-Lexical Composition (ELCo) dataset, a new resource that offers parallel annotations of emoji sequences corresponding to English phrases. Our dataset contains 1,655 instances, spanning 209 diverse concepts from tangible ones like {``}right man{''} (✔️👨) to abstract ones such as {``}full attention{''} (🧐✍️, illustrating a metaphoric composition of a focusing face and writing hand). ELCo enables the analysis of the patterns shared between emoji and lexical composition. Through a corpus study, we discovered that simple strategies like direct representation and reduplication are sufficient for conveying certain concepts, but a richer, metaphorical strategy is essential for expressing more abstract ideas. We further introduce an evaluative task, Emoji-based Textual Entailment (EmoTE), to assess the proficiency of NLP models in comprehending emoji compositions. Our findings reveals the challenge of understanding emoji composition in a zero-shot setting for current models, including ChatGPT. Our analysis indicates that the intricacy of metaphorical compositions contributes to this challenge. Encouragingly, models show marked improvement when fine-tuned on the ELCo dataset, with larger models excelling in deciphering nuanced metaphorical compositions.",
}
59 changes: 59 additions & 0 deletions content/publication/lreccoling24elco/index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
---
title: 'The ELCo Dataset: Bridging Emoji and Lexical Composition'
# Zi Yun Yang, Ziqing Zhang, and Yisong Miao (2024) The ELCo Dataset: Bridging Emoji and Lexical Composition. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING '24).

authors:
- Zi Yun Yang
- Ziqing Zhang
- yisong

date: '2024-05-20T00:00:00Z'
doi: ''

# Schedule page publish date (NOT publication's date).
publishDate: '2017-01-01T00:00:00Z'
publication_types: ['paper-conference']

# Publication name and optional abbreviated publication name.
publication: In *Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation*
publication_short: In *LREC-COLING 2024*

abstract: Can emojis be composed to convey intricate meanings like English phrases? As a pioneering study, we present the Emoji-Lexical Composition (ELCo) dataset, a new resource that offers parallel annotations of emoji sequences corresponding to English phrases. Our dataset contains 1,655 instances, spanning 209 diverse concepts from tangible ones like “right man” (✔️👨) to abstract ones such as “full attention” (🧐✍️, illustrating a metaphoric composition of a focusing face and writing hand). ELCo enables the analysis of the patterns shared between emoji and lexical composition. Through a corpus study, we discovered that simple strategies like direct representation and reduplication are sufficient for conveying certain concepts, but a richer, metaphorical strategy is essential for expressing more abstract ideas. We further introduce an evaluative task, Emoji-based Textual Entailment (EmoTE), to assess the proficiency of NLP models in comprehending emoji compositions. Our findings reveal the challenge of understanding emoji composition in a zero-shot setting for current models, including ChatGPT. Our analysis indicates that the intricacy of metaphorical compositions contributes to this challenge. Encouragingly, models show marked improvement when fine-tuned on the ELCo dataset, with larger models excelling in deciphering nuanced metaphorical compositions.

# Summary. An optional shortened abstract.
summary: Can emojis convey intricate meanings like English phrases? We present the Emoji-Lexical Composition (ELCo) dataset with 1,655 instances spanning 209 concepts, from tangible to abstract ideas. ELCo enables analysis of emoji and lexical composition. Our Emoji-based Textual Entailment (EmoTE) task reveals challenges for current models, but fine-tuning improves performance significantly.

tags: [nlp]

# Display this page in the Featured widget?
featured: true

# Custom links (uncomment lines below)
# links:
# - name: Custom Link
# url: http://example.org

url_pdf: 'https://aclanthology.org/2024.lrec-main.1381'
url_code: 'https://github.com/WING-NUS/ELCo'

# Featured image
# To use, add an image named `featured.jpg/png` to your page's folder.
image:
caption: 'A summary of the ELCo project. The ELCo dataset comprises 1,655 annotations of 209 EN phrases, 45 adjectives, and 77 attributes. Our corpus study reveals five structures for composing emojis, and we show that metaphorical structures use more diverse emojis. Our new EmoTE task is challenging for all models, but fine-tuning on ELCo helps models learn useful emoji composition skills.'
preview_only: false

# Associated Projects (optional).
# Associate this publication with one or more of your projects.
# Simply enter your project's folder or file name without extension.
# E.g. `internal-project` references `content/project/internal-project/index.md`.
# Otherwise, set `projects: []`.
projects:
- example

# Slides (optional).
# Associate this publication with Markdown slides.
# Simply enter your slide deck's filename without extension.
# E.g. `slides: "example"` references `content/slides/example/index.md`.
# Otherwise, set `slides: ""`.
# slides: example
---
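
As the project-association comment above notes, the placeholder `- example` points at `content/project/example/index.md`; a minimal sketch of the two documented options (the `conversational-ai` folder name is hypothetical):

# Option 1: no project association
projects: []

# Option 2: reference an existing project folder, i.e. content/project/conversational-ai/index.md
projects:
  - conversational-ai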

0 comments on commit f5fbfe3

Please sign in to comment.