ETRO VUB

Master theses

Current and past ideas and concepts for Master Theses.

Diarizr – an extendable platform for audiovisual meeting diarization

Subject

In a corporate setting, a written report often has to be made of the results and agreements reached during a meeting. This requires one of the attendees (or a secretary) to take notes and do additional work during and after the meeting. In this thesis, we will design a meeting diarization platform that can significantly reduce this workload by automatically recording, analyzing and annotating what was said, and by whom, during the meeting. The platform will work with replaceable plugin modules, which will allow for, e.g.:

  • (Remote) audio(visual) recording of the meeting (optionally streaming the data from a local sensor node to a server for processing and storage)

  • Identifying the participants

  • Recognizing the speakers and their speech (who said what, when?)

  • Visualizing the audio/video and recognizer data.
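The "who said what, when?" output of such modules can be pictured as speaker turns combined with time-stamped words. The following is only a minimal sketch under stated assumptions: the names `SpeakerTurn`, `Word` and `attribute_words` are hypothetical, and it assumes that a speaker recognizer delivers turns and a speech recognizer delivers words with start and end times in seconds. Each word is simply assigned to the turn it overlaps most.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerTurn:
    """Hypothetical speaker recognizer output: who spoke, and when (seconds)."""
    speaker: str
    start: float
    end: float


@dataclass(frozen=True)
class Word:
    """Hypothetical speech recognizer output: one word with its timing (seconds)."""
    text: str
    start: float
    end: float


def attribute_words(turns, words):
    """Assign each word to the speaker turn it overlaps most.

    Returns a list of (speaker, start, end, text) tuples in which
    consecutive words of the same speaker are merged into one entry.
    """
    transcript = []
    for word in words:
        best, best_overlap = None, 0.0
        for turn in turns:
            overlap = min(word.end, turn.end) - max(word.start, turn.start)
            if overlap > best_overlap:
                best, best_overlap = turn, overlap
        speaker = best.speaker if best else "unknown"
        if transcript and transcript[-1][0] == speaker:
            # Same speaker as the previous entry: extend it.
            _, start, _, text = transcript[-1]
            transcript[-1] = (speaker, start, word.end, text + " " + word.text)
        else:
            transcript.append((speaker, word.start, word.end, word.text))
    return transcript
```

A real system would also have to deal with overlapping speech and recognition errors, which this sketch ignores.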

The primary focus of the thesis is the design and implementation of the overall modular platform/framework for meeting diarization, based on a functional and requirements analysis of the typical phases in a meeting. Building on this framework, a demonstrator will be implemented using plugin modules that may be simplified or built on existing (open source) code, such as the (pocket)Sphinx speech recognizer.
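One possible shape for such a framework with replaceable plugins is sketched below. This is an illustration only, not the design the thesis prescribes: the classes `Plugin`, `Pipeline`, `FixedRecorder`, `StubRecognizer` and `TextVisualizer`, the stage names, and the 16 kHz sample rate are all assumptions made for the example. The stub plugins stand in for real recording, recognition and visualization modules.

```python
from abc import ABC, abstractmethod

SAMPLE_RATE = 16000  # assumed sample rate, for illustration only


class Plugin(ABC):
    """One replaceable processing stage of the (hypothetical) platform."""

    @abstractmethod
    def process(self, meeting: dict) -> dict:
        """Read from and add to the shared meeting record, then return it."""


class Pipeline:
    """Runs the registered plugins in the order their stages were first added."""

    def __init__(self):
        self._stages = {}  # stage name -> plugin; dicts keep insertion order

    def register(self, stage, plugin):
        # Re-registering a stage swaps the plugin but keeps its position.
        self._stages[stage] = plugin

    def run(self, meeting=None):
        meeting = dict(meeting or {})
        for plugin in self._stages.values():
            meeting = plugin.process(meeting)
        return meeting


class FixedRecorder(Plugin):
    """Stand-in for a recording plugin: one second of silence."""

    def process(self, meeting):
        meeting["audio"] = [0.0] * SAMPLE_RATE
        return meeting


class StubRecognizer(Plugin):
    """Stand-in for speaker and speech recognition: one fixed segment."""

    def __init__(self, speaker, text):
        self.speaker, self.text = speaker, text

    def process(self, meeting):
        duration = len(meeting["audio"]) / SAMPLE_RATE
        meeting["segments"] = [(self.speaker, 0.0, duration, self.text)]
        return meeting


class TextVisualizer(Plugin):
    """Stand-in for visualization: renders the segments as text lines."""

    def process(self, meeting):
        meeting["report"] = [
            f"[{start:.1f}-{end:.1f}s] {who}: {text}"
            for who, start, end, text in meeting["segments"]
        ]
        return meeting
```

In this sketch, swapping a module amounts to registering a different plugin under the same stage name, for example `pipeline.register("recognize", other_recognizer)`, while the rest of the pipeline stays unchanged.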

Kind of work

In this thesis, the student will design and implement a system for audio(visual) meeting diarization. In general, the work will consist of the following steps:

  • Literature study on meeting diarization, audio(visual) speaker recognition and speech recognition.
  • Design of a general software framework for speaker diarization that allows plugins for, e.g., (remote) audio/video capturing, audio/video analysis (e.g. recognizing speaker identity, speech, emotion, …) and visualization of the results.

  • Implementation of a demonstrator, consisting of the general software framework and the plugins needed to show the operation of the diarization system (the complexity of the plugins depends on the available time and the availability of existing code).

  • Training and testing the diarization system on an available audio(visual) database, or using your own recorded data.

  • Thesis report and presentation
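For the testing step, one simple way to compare the system output with a reference annotation is to sample both at regular intervals and count the disagreements. The sketch below is a simplified illustration, not a standard diarization metric: the function names and the `(speaker, start, end)` tuple format are assumptions, it presumes that reference and system output already use the same speaker names, and it ignores overlapping speech.

```python
def frame_labels(turns, duration, step=0.1):
    """Sample the active speaker at the midpoint of every `step`-second frame.

    `turns` is a list of (speaker, start, end) tuples in seconds.
    A frame in which nobody speaks gets the label None.
    """
    labels = []
    for i in range(round(duration / step)):
        t = (i + 0.5) * step
        label = None
        for speaker, start, end in turns:
            if start <= t < end:
                label = speaker
                break
        labels.append(label)
    return labels


def frame_error_rate(reference, hypothesis, duration, step=0.1):
    """Fraction of non-silent frames in which the two annotations disagree."""
    ref = frame_labels(reference, duration, step)
    hyp = frame_labels(hypothesis, duration, step)
    # Frames that are silent in both annotations are not scored.
    scored = [(r, h) for r, h in zip(ref, hyp) if r is not None or h is not None]
    if not scored:
        return 0.0
    errors = sum(1 for r, h in scored if r != h)
    return errors / len(scored)
```

For example, if the reference has one speaker from 0 to 2 s and another from 2 to 4 s, while the system places the speaker change at 3 s, one quarter of the frames are attributed to the wrong speaker.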

Due to the modular nature of the overall system, multiple students can work together on this thesis, each designing and implementing a subset of modules (framework, audio/video input, analysis, post-processing, visualization,…).

Framework of the Thesis

This thesis is related to the audiovisual speech analysis research track at the ETRO-AVSP group.

Number of Students

1 or 2

Expected Student Profile

Engineering or informatics student with good programming skills. Knowledge of digital signal processing is a plus.

Promotor

Prof. Hichem Sahli

+32 (0)2 629 2916

hsahli@etrovub.be

Supervisors

Ir. Henk Brouckxon

+32 (0)2 629 3955

hbrouckx@etrovub.be

Dr. Ir. Lukas Latacz

+32 (0)2 629 3955

llatacz@etrovub.be

Contact

ETRO Department

info@etro.vub.ac.be

Tel: +32 2 629 29 30

©2019 • Vrije Universiteit Brussel • ETRO Dept. • Pleinlaan 2 • 1050 Brussels • Tel: +32 2 629 2930 (secretariat) • Fax: +32 2 629 2883