“Signal Processing in the AI era” was the tagline of this year’s IEEE International Conference on Acoustics, Speech and Signal Processing, held in Rhodes, Greece.
In this context, Brent de Weerdt, Xiangyu Yang, Boris Joukovsky, Alex Stergiou and Nikos Deligiannis presented ETRO’s research in poster sessions and oral presentations, showcasing novel ways to process and understand graph, video, and audio data. Nikos Deligiannis chaired a session on Graph Deep Learning, attended the IEEE T-IP Editorial Board Meeting, and met with collaborators from the VUB-Duke-UGent-UCL joint lab.
– Non-EEA citizens: before the 1st of April (for the next academic year).
– EEA citizens: before the 1st of September (for the next academic year).
On October 25th 2024 at 16:00, Yuqing Yang will defend their PhD entitled “CRAFTING EFFECTIVE VISUAL EXPLANATIONS BY ATTRIBUTING THE IMPACT OF DATASETS, ARCHITECTURES AND DATA COMPRESSION TECHNIQUES”.
Everybody is invited to attend the presentation in room D.2.01 or online via this link.
Explainable Artificial Intelligence (XAI) plays an important role in modern AI research, motivated by the desire for transparency and interpretability within AI-driven decision-making. As AI systems become more advanced and complicated, it becomes increasingly important to ensure they are reliable, responsible, and ethical. These imperatives are particularly acute in domains where stakes are high, such as medical diagnostics, autonomous driving, and security frameworks.
In computer vision, XAI aims to provide understandable, straightforward explanations for AI model predictions, allowing users to grasp the decision-making processes of these complex systems. Visualizations such as saliency maps are frequently employed to identify input data regions significantly impacting model predictions, thus enhancing user understanding of AI visual data analysis. However, there are still concerns about the effectiveness of visual explanations, especially regarding their robustness, trustworthiness, and human-friendliness.
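To make the idea of a saliency map concrete, here is a minimal sketch of occlusion-based saliency: each image patch is masked in turn, and the resulting drop in the model score is recorded. This is an illustration only, not the method used in the thesis. The function `toy_score` is a hypothetical stand-in for a real classifier, and the patch size and fill value are assumptions.

```python
def occlusion_saliency(image, score_fn, patch=2, fill=0.0):
    """Return a map of score drops: a larger drop marks a more influential region."""
    height, width = len(image), len(image[0])
    base = score_fn(image)
    saliency = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            # Copy the image and mask one patch.
            occluded = [row[:] for row in image]
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = base - score_fn(occluded)
            # Assign the score drop to every pixel of the masked patch.
            for r in rows:
                for c in cols:
                    saliency[r][c] = drop
    return saliency


def toy_score(image):
    """Hypothetical model: only the top-left 2x2 corner affects the score."""
    return sum(image[r][c] for r in range(2) for c in range(2))


image = [[1.0] * 4 for _ in range(4)]
saliency = occlusion_saliency(image, toy_score)
```

In this toy setup only the top-left patch receives a non-zero saliency value, because it is the only region the stand-in model looks at.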
Our research aims to advance this field by evaluating how factors such as dataset diversity, model architecture, and data compression techniques influence the effectiveness of visual explanations in AI applications. Through systematic analysis and refinement, we strive to make these explanations both informative and accessible to users across diverse XAI applications.
We assess the effectiveness of visual explanations with both automatic metrics and subjective evaluation methods. Automatic metrics, such as task performance and localization accuracy, provide quantifiable measures of the effectiveness of these explanations in real-world scenarios. For subjective evaluation, we have developed a framework named SNIPPET, which enables a detailed and user-oriented assessment of visual explanations. Additionally, our research explores how these objective metrics correlate with subjective human judgments, aiming to integrate quantitative data with the more nuanced, qualitative feedback from users. Ultimately, our goal is to provide comprehensive insights into the practical aspects of XAI methodologies, focusing on their implementation in computer vision.
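One common way to check how well an automatic metric tracks human judgments is a rank correlation. The sketch below computes Spearman's coefficient in plain Python. It is an illustration under assumptions: the choice of Spearman's coefficient is ours, and the score lists are made-up placeholder values, not results from the thesis.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder scores for five explanations (not real data).
automatic_metric = [0.62, 0.71, 0.55, 0.80, 0.67]
human_rating = [3.1, 3.8, 2.9, 4.5, 3.0]
rho = spearman(automatic_metric, human_rating)
```

A coefficient close to 1 would indicate that the automatic metric orders the explanations much as the human raters do.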
On April 2nd 2026 at 16:00, Sebastian Amador Sanchez will defend their PhD entitled “Advancing Landmark Localization through Deep Segmentation for Reliable Malalignment Assessment in Lower Limb Radiographs”.
Everybody is invited to attend the presentation in room I.2.01 or online via this link.
Knee osteoarthritis affects millions worldwide and is frequently associated with lower limb malalignment. Clinical assessment of malalignment relies on manual landmark identification in X-ray images, a time-consuming process prone to interobserver variability. While most automated approaches use regression-based deep learning, this thesis investigates image segmentation with circular masks centered on landmark locations as an alternative strategy.
First, we introduce a segmentation-guided coordinate regression framework that integrates a segmentation network with a coordinate regression branch, trained end-to-end. This hybrid approach improves localization accuracy over standard regression and increases robustness compared to standalone segmentation, thereby mitigating false positives and missed detections.
Second, we optimize segmentation-based methods by evaluating architectures, post-processing strategies, and mask sizes. A fully convolutional model trained with radius-15 masks, combined with adaptive threshold-based centroid extraction, outperformed conventional landmark localization approaches, with improved performance in knee phenotype classification.
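The mask-and-centroid idea can be sketched as follows: a landmark is encoded as a disc of radius 15, and its position is recovered as the centroid of the pixels above a threshold. This is a simplified illustration, not the thesis's pipeline. The image size, the landmark position, and the relative threshold (a fraction of the map's maximum, used here as a stand-in for the adaptive rule) are all assumptions.

```python
def circular_mask(height, width, center_row, center_col, radius=15):
    """Binary target mask: 1.0 inside the disc around the landmark, 0.0 elsewhere."""
    return [
        [
            1.0 if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2 else 0.0
            for c in range(width)
        ]
        for r in range(height)
    ]


def extract_centroid(prob_map, rel_threshold=0.5):
    """Intensity-weighted centroid of pixels at or above rel_threshold * max.

    Returns (row, col), or None when the map contains no positive response.
    """
    peak = max(max(row) for row in prob_map)
    if peak <= 0:
        return None
    cutoff = rel_threshold * peak
    total = row_sum = col_sum = 0.0
    for r, row in enumerate(prob_map):
        for c, value in enumerate(row):
            if value >= cutoff:
                total += value
                row_sum += value * r
                col_sum += value * c
    return row_sum / total, col_sum / total


# Hypothetical 64x64 image with a landmark at row 30, column 22.
mask = circular_mask(64, 64, center_row=30, center_col=22)
landmark = extract_centroid(mask)
```

Applied to a network's predicted probability map instead of the ideal mask, the same centroid step would turn a segmentation output back into landmark coordinates.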
Third, we propose a Siamese network trained with a contrastive loss for quality control that detects inaccurate predictions by comparing image patches to reference embeddings. The method reliably identifies errors exceeding 2.0 mm and estimates their magnitude, outperforming baseline methods.
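For readers unfamiliar with contrastive training, the sketch below shows a common pairwise form of the loss and a simple distance-based check against a reference embedding. It is an assumption-laden illustration: the exact loss variant, the margin, the embedding size, and the helper `flag_prediction` with its threshold are hypothetical and not taken from the thesis.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Pairwise contrastive loss.

    Matching pairs are pulled together (squared distance); non-matching
    pairs are pushed until they are at least `margin` apart.
    """
    distance = euclidean(emb_a, emb_b)
    if same:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def flag_prediction(patch_emb, reference_emb, distance_threshold):
    """Hypothetical quality check: flag a patch whose embedding lies far from the reference."""
    return euclidean(patch_emb, reference_emb) > distance_threshold


# Toy 2-D embeddings (placeholders, not learned features).
reference = [0.0, 0.0]
close_patch = [0.1, 0.0]
far_patch = [3.0, 4.0]
flags = [flag_prediction(p, reference, 2.0) for p in (close_patch, far_patch)]
```

Because the embedding distance grows with the mismatch between patch and reference, a distance of this kind can in principle also serve as a rough proxy for error magnitude.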
Overall, this thesis advances the field of landmark localization and demonstrates its clinical relevance for the automated assessment of lower limb malalignment. Beyond landmark localization accuracy, our contributions address robustness and failure identification, two aspects that are often overlooked yet vital for future clinical deployment.
Two of ETRO’s postdocs, Angel for the project Equitable Oxymetry and Abel for the I-Healthy path project, have been selected for the MedTech accelerator: https://lifetech.brussels/en/medtech-accelerator-en/.
On March 31st 2021 at 16:00, Panagiotis Tsinganos will defend his PhD entitled “Multi-channel EMG pattern classification based on deep learning”.
Everybody is invited to attend the presentation online via https://upatras-gr.zoom.us/j/98941099749?pwd=ZmdQZkxRYllIaVRDKzJrVHM2L2krQT09
In recent years, the huge body of data generated by applications in domains such as social networks and healthcare has paved the way for the development of high-performance models. Deep learning has transformed the field of data analysis by dramatically improving the state of the art in various classification and prediction tasks. Combined with advancements in electromyography, it has given rise to new hand gesture recognition applications, such as human-computer interfaces, sign language recognition, robotics control and rehabilitation games.
The purpose of this thesis is to develop novel methods for electromyography signal analysis based on deep learning for the problem of hand gesture recognition. Specifically, we focus on methods for data preparation and on developing accurate models even when few data are available. Electromyography signals are in general one-dimensional time series with rich frequency content. Various feature sets have been proposed in the literature; however, due to the stochastic nature of the signals, the performance of the developed models depends on the combination of features and classifier. The end-to-end training scheme of deep learning models, by contrast, reduces the effort needed to find the best features and classification model, yet suitable preprocessing of the signals is still required. Another problem is that variations in gesture duration, sensor placement and muscle physiology require continuous adaptation of the trained models using newly recorded data.
The implementation is based on surface electromyography sensors, which provide the input to the end-to-end deep learning pipelines that process and classify the electromyography signals. Preprocessing and data preparation techniques for electromyograms are examined, while data augmentation and transfer learning approaches allow developing personalised models even when few data are available. Besides their successful application in other domains, the use of deep learning models allows the development of systems that can easily generalise to new users. The use of electromyography sensors is important because the developed system can detect whether any unwanted compensatory movements are performed, which is impossible with typical vision-based interfaces.
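As a rough illustration of the data preparation and augmentation steps mentioned above, the sketch below cuts a multi-channel recording into overlapping windows and creates noisy copies of each window. The recording is synthetic, and the channel count, window length, stride, and noise level are assumptions chosen for the example. Additive Gaussian noise is only one possible augmentation and is not necessarily the one used in the thesis.

```python
import random


def sliding_windows(signal, window, stride):
    """Split a recording (list of samples, each a list of channel values)
    into overlapping windows of `window` samples, advancing by `stride`."""
    return [
        signal[start:start + window]
        for start in range(0, len(signal) - window + 1, stride)
    ]


def add_gaussian_noise(segment, sigma, rng):
    """Return a copy of the segment with zero-mean Gaussian noise added to every value."""
    return [[value + rng.gauss(0.0, sigma) for value in sample] for sample in segment]


rng = random.Random(0)  # fixed seed so the example is reproducible

# Synthetic placeholder recording: 1000 samples, 8 channels (not real EMG data).
recording = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(1000)]

windows = sliding_windows(recording, window=200, stride=50)
augmented = [add_gaussian_noise(w, sigma=0.05, rng=rng) for w in windows]
```

Each window, together with its augmented copies, could then be fed to a classifier as one training example, which is one way to enlarge a small dataset.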
The advancements proposed in this thesis have been evaluated on publicly available data repositories. However, because the models are trained in an end-to-end fashion, they can be easily adapted to different setups.
