2025-01-14-lei25a.md

File metadata and controls

76 lines (76 loc) · 3.17 KB
---
title: Large Vision-Language Models as Emotion Recognizers in Context Awareness
booktitle: Proceedings of the 16th Asian Conference on Machine Learning
year: '2025'
volume: '260'
series: Proceedings of Machine Learning Research
month: 0
publisher: PMLR
pdf:
url:
openreview: Ns5aiol8ZD
abstract: 'Context-aware emotion recognition (CAER) is a complex and significant
  task that requires perceiving emotions from various contextual cues. Previous
  approaches primarily focus on designing sophisticated architectures to extract
  emotional cues from images. However, their knowledge is confined to specific
  training datasets and may reflect the subjective emotional biases of the
  annotators. Furthermore, acquiring large amounts of labeled data is often
  challenging in real-world applications. In this paper, we systematically
  explore the potential of leveraging Large Vision-Language Models (LVLMs) to
  empower the CAER task from three paradigms: 1) We fine-tune LVLMs on two CAER
  datasets, which is the most common way to transfer large models to downstream
  tasks. 2) We design zero-shot and few-shot patterns to evaluate the
  performance of LVLMs in scenarios with limited or even entirely unseen data.
  In this case, a training-free framework is proposed to fully exploit the
  In-Context Learning (ICL) capabilities of LVLMs. Specifically, we develop an
  image similarity-based ranking algorithm to retrieve examples; subsequently,
  the instructions, retrieved examples, and the test example are combined and
  fed to LVLMs to obtain the corresponding sentiment judgment. 3) To leverage
  the rich knowledge base of LVLMs, we incorporate Chain-of-Thought (CoT)
  prompting into our framework to enhance the model’s reasoning ability and
  provide interpretable results. Extensive experiments and analyses demonstrate
  that LVLMs achieve competitive performance on the CAER task across different
  paradigms. Notably, their superior performance in few-shot settings indicates
  the feasibility of applying LVLMs to specific tasks without extensive
  training.'
layout: inproceedings
issn: 2640-3498
id: lei25a
tex_title: Large Vision-Language Models as Emotion Recognizers in Context Awareness
firstpage: '111'
lastpage: '126'
page: 111-126
order: '111'
cycles: 'false'
bibtex_editor: Nguyen, Vu and Lin, Hsuan-Tien
editor:
- given: Vu
  family: Nguyen
- given: Hsuan-Tien
  family: Lin
bibtex_author: Lei, Yuxuan and Yang, Dingkang and Chen, Zhaoyu and Chen, Jiawei
  and Zhai, Peng and Zhang, Lihua
author:
- given: Yuxuan
  family: Lei
- given: Dingkang
  family: Yang
- given: Zhaoyu
  family: Chen
- given: Jiawei
  family: Chen
- given: Peng
  family: Zhai
- given: Lihua
  family: Zhang
date: 2025-01-14
address:
container-title: Proceedings of the 16th Asian Conference on Machine Learning
genre: inproceedings
issued:
  date-parts:
  - 2025
  - 1
  - 14
extras: []
---