FME '23: Proceedings of the 3rd Workshop on Facial Micro-Expression: Advanced Techniques for Multi-Modal Facial Expression Analysis

SESSION: FME Workshop Presentations

Nonlinear Deep Subspace Network for Micro-expression Recognition

  • Weijia Feng
  • Manlu Xu
  • Yuanxu Chen
  • Xiaofeng Wang
  • Jia Guo
  • Lei Dai
  • Nan Wang
  • Xinyu Zuo
  • Xiaoxue Li

Deep learning (DL) models have been widely studied for micro-expression recognition (MER). However, micro-expressions (MEs) offer only small numbers of samples, and their subtle, transient features are difficult to extract, which limits recognition performance. In addition, DL models are prone to overfitting and struggle to extract discriminative features of facial actions from ME images or sequences. To address these issues, we propose a MER method that combines a nonlinear deep subspace network with optical flow features. First, facial motion features are captured via optical flow computation; the optical flow features are then fed into a kernel principal component analysis network (KPCANet) to learn deeper spatio-temporal features. Finally, a linear support vector machine (SVM) performs ME classification. Experiments on four public spontaneous ME datasets, SMIC, CASME, CASME II, and SAMM, validate the effectiveness of the proposed method. The experimental results demonstrate that it achieves better recognition performance than existing state-of-the-art MER methods.
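As a rough illustration of the pipeline this abstract describes (optical flow, a kernel subspace projection, then a linear SVM), the sketch below uses scikit-learn's KernelPCA as a stand-in for the paper's KPCANet; the function names, Farneback parameters, and input format are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch, assuming onset/apex grayscale frame pairs as input.
# KernelPCA here only approximates the role of the paper's KPCANet.
import cv2
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.svm import LinearSVC

def flow_feature(onset_gray, apex_gray):
    """Farneback optical flow between onset and apex frames, flattened."""
    flow = cv2.calcOpticalFlowFarneback(
        onset_gray, apex_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    return flow.reshape(-1)  # (H*W*2,): horizontal + vertical components

def train_mer(frame_pairs, labels, n_components=128):
    """frame_pairs: list of (onset, apex) grayscale images; labels: emotions."""
    X = np.stack([flow_feature(o, a) for o, a in frame_pairs])
    kpca = KernelPCA(n_components=n_components, kernel="rbf")
    Z = kpca.fit_transform(X)       # nonlinear subspace features
    clf = LinearSVC().fit(Z, labels)  # linear SVM classifier
    return kpca, clf
```

At test time, a new sample would be mapped through the same `kpca.transform` before classification; the actual KPCANet stacks such kernel projections into a deeper network.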

Simple but Effective In-the-wild Micro-Expression Spotting Based on Head Pose Segmentation

  • Xingpeng Yang
  • Henian Yang
  • Jingting Li
  • Su-Jing Wang

Micro-expressions may occur in high-stakes situations when people attempt to conceal or suppress their true feelings. To date, intelligent micro-expression analysis has focused largely on videos captured under constrained laboratory conditions, owing to the relatively small number of publicly available datasets. Moreover, micro-expression characteristics are subtle and brief, making them highly susceptible to interference from external factors and difficult to capture. In particular, head movement is unavoidable in unconstrained scenarios, which makes micro-expression spotting highly challenging. This paper proposes a simple yet effective method for avoiding the interference of head movement in micro-expression spotting in natural scenarios by reasoning in three-dimensional space. Specifically, based on the head pose, which can be represented by translation and rotation vectors, long and complex videos are divided into short segments that largely exclude head-movement interference. Segmented micro-expression spotting is then performed with an effective short-segment-based spotting algorithm. Experimental results on in-the-wild databases demonstrate that our method effectively avoids head-movement interference. Because the method is simple, it also opens opportunities for spotting micro-expressions in real-world scenarios, possibly even in real time. Furthermore, by boosting spotting performance on massive unlabeled videos, it helps alleviate the small-sample-size problem in micro-expression analysis.
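The core segmentation idea can be sketched as follows: given per-frame head poses (e.g., rotation and translation vectors from `cv2.solvePnP` on facial landmarks), cut the video wherever the pose changes sharply, so each remaining segment is roughly free of head movement. The thresholds and input format below are placeholder assumptions, not values from the paper.

```python
# Hedged sketch of pose-based video segmentation, assuming pose
# vectors have already been estimated for every frame.
import numpy as np

def segment_by_head_pose(rotations, translations,
                         rot_thresh=0.05, trans_thresh=2.0):
    """rotations: (N, 3) rotation vectors; translations: (N, 3).
    Returns a list of (start, end) frame-index ranges."""
    d_rot = np.linalg.norm(np.diff(rotations, axis=0), axis=1)
    d_trans = np.linalg.norm(np.diff(translations, axis=0), axis=1)
    moving = (d_rot > rot_thresh) | (d_trans > trans_thresh)

    segments, start = [], 0
    for i, cut in enumerate(moving, start=1):
        if cut:                      # head moved between frames i-1 and i
            if i - start > 1:
                segments.append((start, i))
            start = i
    if len(rotations) - start > 1:
        segments.append((start, len(rotations)))
    return segments
```

Each returned segment would then be passed to a short-segment spotting algorithm, as the abstract describes.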

GLEFFN: A Global-Local Event Feature Fusion Network for Micro-Expression Recognition

  • Cunhan Guo
  • Heyan Huang

Micro-expressions are facial movements of short duration and low amplitude which, when analyzed, can reveal genuine human emotions. However, the low frame rate of frame-based cameras hinders further advances in micro-expression recognition (MER). Event-based cameras, a novel technology offering high frame rates and low latency, are well suited to the MER task but remain difficult to obtain. In this article, a local event feature, the local count image, is proposed; it is computed from video up-sampled with the SloMo method. Additionally, a global-local event feature fusion network is constructed in which the local count image and the global dense optical flow are merged to learn deeper features and effectively address the MER task. Experimental results demonstrate that the proposed lightweight method outperforms state-of-the-art approaches across multiple datasets. To the best of our knowledge, this work is the first successful attempt to solve the MER task from an event perspective, facilitating the future adoption of event-based camera technology and providing inspiration for future research in related domains.
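To make the event-feature idea concrete, the sketch below shows one common way to derive a per-pixel event count map from an up-sampled frame sequence: threshold log-intensity changes between frames, a crude event-camera simulation. The contrast threshold and the exact accumulation rule are assumptions; the paper's local count image definition may differ.

```python
# Illustrative sketch, assuming frames have already been temporally
# up-sampled (e.g., with SloMo interpolation) to grayscale floats.
import numpy as np

def count_image(frames, contrast_thresh=0.15):
    """frames: (T, H, W) grayscale floats in [0, 1].
    Returns an (H, W) per-pixel event count map."""
    log_f = np.log(frames + 1e-6)
    counts = np.zeros(frames.shape[1:], dtype=np.int32)
    ref = log_f[0]
    for t in range(1, len(frames)):
        diff = log_f[t] - ref
        fired = np.abs(diff) >= contrast_thresh  # pixels emitting an event
        counts += fired                          # accumulate event counts
        ref = np.where(fired, log_f[t], ref)     # reset only fired pixels
    return counts
```

In the fusion network the abstract describes, such a local count map would be combined with global dense optical flow before deeper feature learning.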