Haoliang Zhou, Shucheng Huang, Yuqiao Xu
Micro-Expressions (MEs) are instantaneous and subtle facial movements that convey crucial emotional information. However, traditional neural networks have difficulty accurately capturing the delicate features of MEs due to the limited amount of available data. To address this issue, a dual-branch attention network, called IncepTR, is proposed for ME recognition, which can capture attention-aware local and global representations. The network takes optical flow features as input and performs feature extraction with a dual-branch design. First, an Inception model equipped with the Convolutional Block Attention Module (CBAM) attention mechanism is adopted for multi-scale local feature extraction. Second, the Vision Transformer (ViT) is employed to capture subtle motion features and robustly model global relationships among multiple local patches. Additionally, to enrich the relationships between different local patches in ViT, Multi-head Self-Attention Dropping (MSAD) is introduced to randomly drop an attention map, effectively preventing overfitting to specific regions. Finally, the two types of features are combined to learn ME representations effectively through similarity comparison and feature fusion. With this combination, the model is forced to capture the most discriminative multi-scale local and global features while reducing the influence of affect-irrelevant features. Extensive experiments show that the proposed IncepTR achieves UF1 and UAR of 0.753 and 0.746 on the composite dataset MEGC2019-CD, demonstrating better or competitive performance compared to existing state-of-the-art methods for ME recognition.
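To illustrate the MSAD idea described above, here is a minimal NumPy sketch (not the authors' implementation; the function name, shapes, and drop probability are assumptions): during training, the attention map of one randomly chosen head is zeroed, so no single head can overfit to specific facial regions.

```python
import numpy as np

def msad_attention(q, k, v, num_heads, drop_prob=0.5, training=True, rng=None):
    """Multi-head self-attention with attention-map dropping (MSAD sketch).

    q, k, v: (seq_len, dim) arrays; dim must be divisible by num_heads.
    With probability drop_prob (training only), one randomly chosen head's
    attention map is zeroed before being applied to the values.
    """
    rng = rng if rng is not None else np.random.default_rng()
    seq_len, dim = q.shape
    head_dim = dim // num_heads
    out = np.empty_like(q)
    # Pick a head to drop (or -1 for "drop nothing").
    drop_head = int(rng.integers(num_heads)) if (training and rng.random() < drop_prob) else -1
    for h in range(num_heads):
        sl = slice(h * head_dim, (h + 1) * head_dim)
        # Scaled dot-product attention for this head.
        scores = q[:, sl] @ k[:, sl].T / np.sqrt(head_dim)
        attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
        attn /= attn.sum(axis=-1, keepdims=True)
        if h == drop_head:
            attn = np.zeros_like(attn)  # drop this head's attention map
        out[:, sl] = attn @ v[:, sl]
    return out
```

In the full model, the dropped head's output would then pass through the usual output projection; the key point is only that dropping happens at the attention-map level, not on individual weights.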
Following Dual-ATME and RCN, the data lists are organized as follows:
data/
├─ MEGC2019/
│ ├─ v_cde_flow/
│ │ ├─ 006_test.txt
│ │ ├─ 006_train.txt
│ │ ├─ 007_test.txt
│ │ ├─ ...
│ │ ├─ sub26_train.txt
│ │ ├─ subName.txt
- There are 3 columns in each txt file:
/home/user/data/samm/flow/006_006_1_2_006_05588-006_05562_flow.png 0 1
In this example, the first column is the path of the optical flow image for a particular ME sample, the second column is the label (0-2 for three emotions), and the third column is the database type (1-3 for three databases).
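A minimal stdlib-only parser for this three-column format might look like the following (the function name is an assumption, not part of this repo):

```python
from pathlib import Path

def parse_list_file(txt_path):
    """Parse one data-list txt file where each line is
    `<flow_image_path> <emotion_label> <database_id>`.
    Returns a list of (path, label, db_id) tuples."""
    samples = []
    for line in Path(txt_path).read_text().splitlines():
        if not line.strip():
            continue
        # Split off the last two whitespace-separated fields (label, db id).
        path, label, db = line.rsplit(maxsplit=2)
        samples.append((path, int(label), int(db)))
    return samples
```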
- There are 68 rows in subName.txt, which lists the subject names, e.g.:
006
...
037
s01
...
s20
sub01
...
sub26
These are the subjects of the ME samples partitioned by MEGC2019, as described here and here.
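Since each subject name in subName.txt pairs with a `<name>_train.txt` / `<name>_test.txt` file (per the listing above), the leave-one-subject-out splits can be enumerated with a small helper like this (a sketch; `loso_splits` and the `data_root` layout pointing at `data/MEGC2019/v_cde_flow` are assumptions):

```python
from pathlib import Path

def loso_splits(data_root):
    """Enumerate leave-one-subject-out (LOSO) splits.

    Reads subject names from subName.txt under data_root and returns
    (subject, train_list_path, test_list_path) tuples for each subject.
    """
    root = Path(data_root)
    subjects = [s.strip() for s in (root / "subName.txt").read_text().splitlines() if s.strip()]
    return [(s, root / f"{s}_train.txt", root / f"{s}_test.txt") for s in subjects]
```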
If you find this repo useful for your research, please consider citing the paper:
@article{zhou2023inceptr,
title={{IncepTR}: micro-expression recognition integrating {Inception-CBAM} and vision transformer},
author={Zhou, Haoliang and Huang, Shucheng and Xu, Yuqiao},
journal={Multimedia Systems},
volume={29},
number={6},
pages={3863--3876},
year={2023},
publisher={Springer}
}