ETRO VUB

ETRO Publications

Full Details

Journal Publication

Integrating Deep and Shallow Models for Multi-Modal Depression Analysis — Hybrid Architectures

This publication appears in: IEEE Transactions on Affective Computing

Authors: L. Yang, D. Jiang and H. Sahli

Number of Pages: 16

Publication Date: Sep. 2018


Abstract:

At present, although great progress has been made in automatic depression assessment, most recent works concern only the audio and video paralinguistic information, rather than the linguistic information from the spoken content. In this work, we argue that, besides developing good audio and video features, text-based content features are also important for building reliable depression detection systems, because they allow depression-related textual indicators to be analysed. Furthermore, to improve the performance of automatic depression assessment systems, powerful models, capable of modelling the characteristics of depression embedded in the audio, visual and text descriptors, are also required. This paper proposes new text and video features and hybridizes deep and shallow models for depression estimation and classification from audio, video and text descriptors. The proposed hybrid framework consists of three main parts: 1) a Deep Convolutional Neural Network (DCNN) and Deep Neural Network (DNN) based audio-visual multi-modal depression recognition model for estimating the Patient Health Questionnaire depression scale (PHQ-8); 2) a Paragraph Vector (PV) and Support Vector Machine (SVM) based model for inferring the physical and mental conditions of the individual from the transcripts of the interview; 3) a Random Forest (RF) model for depression classification from the estimated PHQ-8 score and the inferred conditions of the individual. In the PV-SVM model, PV embedding is used to obtain fixed-length feature vectors from transcripts of the answers to the questions associated with psychoanalytic aspects of depression, which are subsequently fed into the SVM classifiers for detecting the presence/absence of the considered psychoanalytic symptoms. To the best of our knowledge, this approach is the first attempt to apply PV for depression analysis.
Besides, we propose a new visual descriptor, the Histogram of Displacement Range (HDR), to characterize the displacement and velocity of the facial landmarks in the video segment. Experiments have been carried out on the Audio Visual Emotion Challenge (AVEC2016) depression dataset. They demonstrate that: 1) the proposed hybrid framework effectively improves the accuracies of both depression estimation and depression classification, with an average F1 measure up to 0.746, which is higher than the best result (0.724) of the depression sub-challenge of AVEC2016; 2) HDR obtains better depression recognition performance than Bag-of-Words (BoW) and Motion History Histogram (MHH) features.
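
The abstract does not spell out how the "average F1 measure" is computed. As a minimal sketch, assuming it is the unweighted mean of the F1 scores of the depressed and non-depressed classes, the calculation could look as follows. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], positive: int) -> float:
    """F1 score when `positive` is treated as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Assumed definition: mean of the F1 scores of class 1 and class 0."""
    depressed = f1_for_class(y_true, y_pred, positive=1)
    not_depressed = f1_for_class(y_true, y_pred, positive=0)
    return (depressed + not_depressed) / 2


# Toy example with made-up labels (1 = depressed, 0 = not depressed).
toy_true = [1, 1, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 1]
print(average_f1(toy_true, toy_pred))
```

Averaging the two per-class scores keeps the minority (depressed) class from being swamped by the majority class, which is one reason such a measure is often preferred to plain accuracy on imbalanced data.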

Current ETRO Authors

Prof. Hichem Sahli

+32 (0)02 629 291

hsahli@etrovub.be
