Out-of-Distribution Detection for Medical Image Analysis
Context
Machine learning models are typically trained on data from a specific source or
distribution. When deployed on data coming from a different source, their performance
can drop significantly, a phenomenon known as distribution shift. To anticipate this,
several metrics have been proposed to measure the distance between datasets, helping
practitioners decide whether a model needs to be retrained or fine-tuned before
deployment.
However, this challenge has become considerably harder with the rise of foundation
models: large models pre-trained on massive, often partially undisclosed datasets.
Because the exact composition of their training data is unknown, it is difficult to predict
when such a model will fail to generalize to a new domain, and classical dataset distance
metrics lose much of their practical relevance.
This problem is particularly acute in medical imaging. Unlike natural images, medical
images are highly sensitive to acquisition conditions: the same anatomical structure can
look drastically different depending on the imaging modality (MRI, CT, PET), the scanner
manufacturer, the acquisition protocol, or the clinical site. A model trained on images
from one hospital or one type of scanner may silently underperform on images from
another, with potentially serious clinical consequences. Detecting such failures before
they impact patient care is therefore a critical safety requirement.
Out-of-distribution (OOD) detection offers a principled framework to address this: rather
than assuming a model will generalize, the goal is to automatically identify input samples
that fall outside the distribution the model was trained on, and flag them as unreliable.
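To make the idea concrete, a minimal sketch of one classical OOD baseline from the general machine learning literature (maximum softmax probability, not the method this thesis proposes) is shown below; the logit values are purely illustrative:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def msp_score(logits):
    """Maximum softmax probability: high for confident (likely in-distribution)
    inputs, lower for uncertain (possibly out-of-distribution) inputs."""
    return max(softmax(logits))

def flag_ood(logits, threshold=0.5):
    """Flag a sample as unreliable when top-class confidence falls
    below a chosen threshold."""
    return msp_score(logits) < threshold

# A confident prediction vs. a near-uniform (uncertain) one,
# with hypothetical logits for a 3-class problem.
confident = [8.0, 0.5, 0.2]
uncertain = [1.0, 0.9, 1.1]
```

This baseline requires no extra training, but is known to be overconfident on some shifted inputs, which motivates the feature-based approaches discussed in the work programme.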
Objectives
The main objective of this thesis is to develop and validate an out-of-distribution
detection approach tailored to medical image analysis, capable of identifying
samples on which a model is likely to fail.
A secondary objective is to design a metric to quantify the degree to which a sample
is out-of-distribution, going beyond a binary in/out decision and providing a continuous,
interpretable score that could be used to rank or triage samples in a clinical pipeline.
Framework of the Thesis
Description of Work
Literature review: conduct a comprehensive review of OOD detection methods,
both in the general machine learning literature and in the medical imaging domain
specifically. This includes a survey of existing approaches, available benchmarks,
and evaluation metrics used to assess OOD detection performance.
Development of an OOD detection approach: design and implement an OOD
detection method adapted to the constraints of medical imaging, in particular
its ability to handle heterogeneous acquisition sources and limited labeled data.
The approach should be compatible with modern deep learning architectures,
including foundation models.
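One common design compatible with frozen foundation-model encoders is to score each sample by its distance to the training set in feature space, for example via k-nearest-neighbour distances. The sketch below illustrates this on synthetic embeddings (the shifted cluster stands in for, say, images from an unseen scanner); in practice the vectors would come from a pre-trained encoder:

```python
import numpy as np

def knn_ood_score(train_feats, query_feat, k=5):
    """OOD score = mean Euclidean distance from the query to its k
    nearest training features. Larger distance suggests the sample
    lies outside the distribution the encoder was calibrated on."""
    dists = np.linalg.norm(train_feats - query_feat, axis=1)
    return float(np.sort(dists)[:k].mean())

rng = np.random.default_rng(0)
# Synthetic 16-d embeddings standing in for encoder features.
train_feats = rng.normal(0.0, 1.0, size=(500, 16))  # in-distribution
in_query = rng.normal(0.0, 1.0, size=16)            # same distribution
out_query = rng.normal(6.0, 1.0, size=16)           # shifted source

s_in = knn_ood_score(train_feats, in_query)
s_out = knn_ood_score(train_feats, out_query)
```

Because the encoder stays frozen, this kind of approach needs no labels from the new site, which matches the limited-annotation constraint above.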
Development of an OOD quantification metric: propose a metric to quantify
how far a given sample is from the training distribution, providing a continuous
and interpretable score. This metric should be meaningful across different medical
imaging contexts and robust to common sources of variability (scanner, protocol,
site).
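As a point of comparison for such a continuous score, one widely used construction is the Mahalanobis distance to a Gaussian fitted on training features: it is zero at the training mean and grows smoothly as a sample drifts away, so samples can be ranked rather than merely flagged. A minimal sketch on synthetic features:

```python
import numpy as np

def mahalanobis_score(train_feats, x):
    """Continuous OOD score: Mahalanobis distance from x to a Gaussian
    fitted on the training features. Interpretable as 'how many
    (correlated) standard deviations from the training distribution'."""
    mu = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])  # regularize for invertibility
    inv = np.linalg.inv(cov)
    d = x - mu
    return float(np.sqrt(d @ inv @ d))

rng = np.random.default_rng(1)
feats = rng.normal(0.0, 1.0, size=(1000, 8))  # synthetic 8-d features
near = feats.mean(axis=0)                     # center of the training cloud
far = near + 5.0                              # strongly shifted sample
```

A score of this form is one candidate starting point; making it robust to scanner, protocol, and site variability is precisely the open question this work item targets.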
Evaluation and comparison with the state of the art: evaluate the proposed
approach and benchmark it against existing OOD detection methods.
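A standard evaluation metric in this benchmarking is AUROC: the probability that a randomly chosen OOD sample receives a higher OOD score than a randomly chosen in-distribution sample. A direct pairwise implementation (illustrative scores only):

```python
def auroc(id_scores, ood_scores):
    """AUROC for OOD detection, computed pairwise: fraction of
    (OOD, in-distribution) pairs where the OOD sample scores higher
    (ties count as half). 1.0 means perfect separation, 0.5 chance."""
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

# Hypothetical detector outputs: higher score = more OOD.
id_scores = [0.1, 0.2, 0.3, 0.4]
ood_scores = [0.35, 0.5, 0.6]
```

The quadratic pairwise form is fine for small benchmarks; rank-based implementations are preferable at scale, and metrics such as FPR at 95% TPR are typically reported alongside AUROC.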
Expected Student Profile
Strong knowledge of machine learning, deep learning, and AI, with a specific
focus on computer vision
Solid experience in Python programming and the PyTorch deep learning
framework
Familiarity with medical imaging concepts is a plus, but not required
Ability to work independently, conduct a literature review, and implement
research-level code