Facial action units (AUs) refer to a comprehensive set of atomic facial muscle movements. Recent works have focused on exploiting complementary information by learning the relationships among AUs. Most existing approaches model AU co-occurrence and enhance AU recognition by learning the dependencies among AUs from labels; however, the complementary information among the features of different AUs is ignored. Moreover, ground-truth annotations suffer from large intra-class variance, and the assigned intensity levels may vary depending on the annotators' experience. In this paper, we propose a Region Attentive AU intensity estimation method with Uncertainty Weighted Multi-task Learning (RA-UWML). A RoI-Net is first used to extract features from the pre-defined facial patches where the AUs are located. We then exploit the co-occurrence of AUs through both within-patch and between-patch representation learning. Within a given patch, we propose shared representation learning in a multi-task manner. To achieve complementarity and avoid redundancy between different image patches, we use a multi-head self-attention mechanism to adaptively and attentively encode each patch-specific representation. Moreover, the AU intensity is represented as a Gaussian distribution instead of a single value, where the mean indicates the most likely AU intensity and the variance indicates the uncertainty of the estimate. The estimated variances are leveraged to automatically weight the loss of each AU in the multi-task learning model. Extensive experiments on the DISFA, FERA2015 and FEAFA benchmarks show that the proposed AU intensity estimation model outperforms state-of-the-art models.
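The uncertainty-weighted multi-task loss described in the abstract matches the standard heteroscedastic Gaussian negative log-likelihood form, in which each task's squared error is scaled by its predicted variance. The sketch below is a minimal illustration under that assumption (the function name and input layout are hypothetical, not the paper's API; the paper's exact formulation may differ):

```python
import math

def uncertainty_weighted_loss(preds):
    """Aggregate per-AU Gaussian NLL losses.

    preds: list of (mu, log_var, target) triples, one per AU.
    Each AU's squared error is down-weighted by its predicted
    variance (high uncertainty -> smaller weight), while the
    log-variance term penalizes inflating uncertainty arbitrarily.
    """
    total = 0.0
    for mu, log_var, y in preds:
        precision = math.exp(-log_var)  # 1 / sigma^2
        total += 0.5 * (precision * (y - mu) ** 2 + log_var)
    return total
```

With this form, an AU head that is confident (low variance) and accurate contributes almost nothing to the loss, while a confident but wrong head is penalized heavily, which is what lets the estimated variances act as automatic per-AU loss weights.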
Chen, H, Jiang, D, Zhao, Y, Wei, X, Lu, K & Sahli, H 2023, 'Region Attentive Action Unit Intensity Estimation with Uncertainty Weighted Multi-task Learning', IEEE Transactions on Affective Computing, vol. 14, no. 3, pp. 2033-2047. https://doi.org/10.1109/TAFFC.2021.3139101
@article{cdf9e47aa0c343ab968bdfb3bf21163f,
title = "Region Attentive Action Unit Intensity Estimation with Uncertainty Weighted Multi-task Learning",
abstract = "Facial action units (AUs) refer to a comprehensive set of atomic facial muscle movements. Recent works have focused on exploiting complementary information by learning the relationships among AUs. Most existing approaches model AU co-occurrence and enhance AU recognition by learning the dependencies among AUs from labels; however, the complementary information among the features of different AUs is ignored. Moreover, ground-truth annotations suffer from large intra-class variance, and the assigned intensity levels may vary depending on the annotators' experience. In this paper, we propose a Region Attentive AU intensity estimation method with Uncertainty Weighted Multi-task Learning (RA-UWML). A RoI-Net is first used to extract features from the pre-defined facial patches where the AUs are located. We then exploit the co-occurrence of AUs through both within-patch and between-patch representation learning. Within a given patch, we propose shared representation learning in a multi-task manner. To achieve complementarity and avoid redundancy between different image patches, we use a multi-head self-attention mechanism to adaptively and attentively encode each patch-specific representation. Moreover, the AU intensity is represented as a Gaussian distribution instead of a single value, where the mean indicates the most likely AU intensity and the variance indicates the uncertainty of the estimate. The estimated variances are leveraged to automatically weight the loss of each AU in the multi-task learning model. Extensive experiments on the DISFA, FERA2015 and FEAFA benchmarks show that the proposed AU intensity estimation model outperforms state-of-the-art models.",
author = "Haifeng Chen and Dongmei Jiang and Yong Zhao and Xiaoyong Wei and Ke Lu and Hichem Sahli",
note = "Publisher Copyright: {\textcopyright} 2010-2012 IEEE.",
year = "2023",
month = jul,
day = "1",
doi = "10.1109/TAFFC.2021.3139101",
language = "English",
volume = "14",
pages = "2033--2047",
journal = "IEEE Transactions on Affective Computing",
issn = "1949-3045",
publisher = "IEEE",
number = "3",
}