Publication Details
Overview
 
 
Lanlan Lv, Dongmei Jiang, Fengna Wang, Hichem Sahli, Werner Verhelst
 

Contribution to journal

Abstract 

This paper presents a triple-stream Dynamic Bayesian Network (DBN) model (T_AsyDBN) for audio-visual emotion recognition, in which the two audio streams are synchronous at the state level while remaining asynchronous with the visual stream within controllable constraints. MFCC features and local prosodic features are extracted as audio features, while geometric features as well as facial action unit coefficients are extracted as visual features. Emotion recognition experiments show that, by adjusting the asynchrony constraint, T_AsyDBN outperforms the two-stream audio-visual DBN model (Asy_DBN), improving the average recognition rate from 52.14% to 63.71%.
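The abstract's key idea is a controllable bound on how far the visual stream may drift from the (state-synchronous) audio streams. The sketch below is not the authors' implementation; it only illustrates, under assumed names (`max_async`, `asynchrony_allowed`, `prune_transitions`), how such a state-level asynchrony constraint can be expressed as a simple filter on candidate state pairs during decoding.

```python
# Minimal sketch of a state-level asynchrony constraint, assuming the two
# audio streams share one state counter and the visual stream keeps its own.
# Names and structure are illustrative, not taken from the paper.

def asynchrony_allowed(audio_state: int, visual_state: int, max_async: int) -> bool:
    """True if the audio and visual state indices stay within the allowed window."""
    return abs(audio_state - visual_state) <= max_async


def prune_transitions(candidates, max_async):
    """Keep only (audio_state, visual_state) pairs satisfying the constraint.

    `candidates` is an iterable of (audio_state, visual_state) tuples, e.g.
    successor states considered during decoding.
    """
    return [(a, v) for a, v in candidates if asynchrony_allowed(a, v, max_async)]


if __name__ == "__main__":
    # With max_async = 1 the visual stream may lead or lag the audio streams
    # by at most one state.
    pairs = [(0, 0), (1, 0), (2, 0), (1, 2)]
    print(prune_transitions(pairs, max_async=1))  # [(0, 0), (1, 0), (1, 2)]
```

Tightening `max_async` toward zero would force the streams back to full synchrony, which mirrors the abstract's observation that recognition performance depends on how this constraint is tuned.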
