A comprehensive roadmap to deliver user-friendly, low-cost, and effective alternatives for extracting drivers’ statistics. The full paper was accepted at HCII'22.
The WIP version is here.
Procedure
Raw video streams (facial expressions contain many noisy pixels)
Pre-processing input images to retain only the facial region and improve the performance of Face2Statistics
Exploring different deep neural network-driven predictors
First Attempt: Convolutional Neural Networks (CNNs)
Second Attempt: Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN)
Third Attempt: Bidirectional Long Short-Term Memory (BiLSTM) Recurrent Neural Network (RNN)
Visualizing predicted results
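The third predictor above runs one LSTM forward in time and one backward, then concatenates the two hidden states per frame. A minimal NumPy sketch of that bidirectional idea (all sizes and array names here are illustrative assumptions, not the paper's actual model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(seq, W, U, b, hidden):
    # Single-layer LSTM over seq of shape (T, input_dim);
    # returns the hidden state at every step, shape (T, hidden).
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    states = []
    for x in seq:
        z = W @ x + U @ h + b              # all four gates in one matmul
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)         # cell update
        h = o * np.tanh(c)                 # hidden state
        states.append(h)
    return np.array(states)

def bilstm(seq, params_fw, params_bw, hidden):
    # Forward pass plus a pass over the reversed sequence,
    # concatenated per frame -> shape (T, 2 * hidden).
    fw = lstm_pass(seq, *params_fw, hidden)
    bw = lstm_pass(seq[::-1], *params_bw, hidden)[::-1]
    return np.concatenate([fw, bw], axis=1)

rng = np.random.default_rng(0)
T, d, H = 8, 16, 32                        # frames, feature dim, hidden size
make = lambda: (rng.normal(size=(4 * H, d)) * 0.1,   # input weights
                rng.normal(size=(4 * H, H)) * 0.1,   # recurrent weights
                np.zeros(4 * H))                     # biases
out = bilstm(rng.normal(size=(T, d)), make(), make(), H)
print(out.shape)                           # (8, 64)
```

In a real pipeline this layer would be a framework primitive (e.g. a bidirectional LSTM layer in Keras or PyTorch); the sketch only shows why each frame's representation sees both past and future frames.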
a) We utilize the HSV color space instead of the RGB color space to reduce the variance of illumination among different pixels.
b) We apply personalized parameters, derived via Pearson correlation coefficients, to a conditional random field (CRF) to realize customization.
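The HSV switch in a) works because HSV isolates illumination in the V (value) channel, leaving hue and saturation comparatively stable under lighting changes. A minimal sketch using only the standard library (a real pipeline would likely convert whole frames with OpenCV's cvtColor):

```python
import colorsys

def rgb_pixels_to_hsv(pixels):
    # pixels: list of (r, g, b) tuples in [0, 255].
    # Returns (h, s, v) tuples in [0, 1]; illumination lands in V,
    # so H and S vary less across differently lit facial pixels.
    return [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            for r, g, b in pixels]

bright_red = (255, 0, 0)
dark_red = (128, 0, 0)      # same color, half the illumination
hsv = rgb_pixels_to_hsv([bright_red, dark_red])
print(hsv[0])   # (0.0, 1.0, 1.0)
print(hsv[1])   # hue and saturation unchanged, only V drops to ~0.5
```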
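For the personalization in b), the Pearson coefficient measures how strongly a feature tracks the target statistic for one particular driver; its magnitude can then scale that feature's weight in the CRF. A sketch of the coefficient itself, with hypothetical per-driver series (the feature names and values below are illustrative, not from the paper):

```python
import math

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length series.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-driver series: one facial feature vs. heart rate.
feature = [0.1, 0.4, 0.35, 0.8, 0.9]
heart_rate = [62, 70, 68, 81, 85]
w = pearson(feature, heart_rate)
# |w| near 1 -> this feature is highly informative for this driver,
# so it receives a larger personalized weight in the CRF.
print(round(w, 3))
```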
Experiment Results
The training and validation accuracy of DenseNet, LSTM, and BiLSTM are displayed as follows.
The training and validation accuracy under the RGB and HSV color spaces are displayed as follows.
The comparative validation accuracy of BiLSTM with and without CRF support for four different drivers is displayed as follows.
Code Contributors
@Zeyu Xiong, @Jiahao Wang, @Junyu Liu, mentored by @Xiangjun Peng.