Thesis-details
Overview
 
Building Next-Generation V2X Cooperative Perception Datasets and Deep Learning Methods 
 
Subject 
Context
Cooperative perception in connected and autonomous driving, enabled by vehicle-to-everything
(V2X) communication, aims to improve scene understanding by sharing information between
multiple agents such as vehicles, roadside units, and infrastructure sensors. By aggregating complementary viewpoints,
these systems can overcome occlusions and extend perception range, which is critical for
safe autonomous driving.
However, the performance of such systems strongly depends on the availability of high-quality
annotated datasets. Unlike standard single-agent perception, V2X data introduces
additional complexity: multiple synchronized viewpoints, heterogeneous sensors,
communication constraints, and dynamic environments. As a result, curating and
annotating such datasets is significantly more challenging and remains a major
bottleneck for research progress.
While several cooperative perception datasets have been proposed, there is still a lack of
standardized annotation protocols, consistent labeling across agents, and comprehensive
benchmarks to evaluate models in realistic multi-agent settings. Ensuring dataset quality
and defining proper evaluation procedures are therefore essential steps toward reliable
and reproducible research in this domain.
Kind of work 
Objectives
The main objective of this thesis is to curate and annotate a V2X cooperative perception
dataset, ensuring high-quality, consistent, and scalable labeling across multiple agents
and sensor modalities. A secondary objective is to design and implement benchmarking
protocols to evaluate cooperative perception models on the curated dataset, providing
meaningful and reproducible performance metrics.
Framework of the Thesis 
Description of Work
• Literature review: conduct a review of existing V2X and cooperative perception
datasets, annotation strategies, and benchmarking protocols. This includes
identifying current limitations in dataset quality, labeling consistency, and
evaluation methodologies.
• Dataset curation and annotation: contribute to the acquisition, organization,
and annotation of a multi-agent perception dataset. This includes defining labeling
guidelines, ensuring temporal and cross-agent consistency, and possibly
developing or adapting annotation tools and pipelines.
• Quality assessment and validation: design methods to assess annotation
quality and consistency across agents and annotators. This may include
automated checks, visualization tools, or agreement metrics.
• Benchmark design and evaluation: define evaluation protocols and metrics for
cooperative perception tasks (e.g., detection, tracking, fusion). Implement
baseline models and benchmark their performance on the curated dataset.
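To make the evaluation and quality-assessment tasks above concrete, the sketch below shows one common building block: matching predicted bounding boxes to ground-truth boxes by intersection-over-union (IoU). The same matching routine can serve both benchmarking (counting true/false positives for a detector) and annotation agreement checks (treating one annotator's boxes as "predictions" against another's). This is an illustrative example with simplified 2D axis-aligned boxes and a greedy matching strategy, not the protocol the thesis will define; real cooperative perception benchmarks typically use 3D oriented boxes and score-ranked matching.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap area is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(preds, gts, thresh=0.5):
    """Greedily match each prediction to at most one unmatched ground-truth
    box with IoU >= thresh. Returns (true_positives, false_positives,
    false_negatives), the raw counts behind precision/recall or a simple
    cross-annotator agreement rate."""
    unmatched_gt = list(range(len(gts)))
    tp = 0
    for p in preds:
        best_i, best_iou = None, thresh
        for i in unmatched_gt:
            v = iou(p, gts[i])
            if v >= best_iou:
                best_i, best_iou = i, v
        if best_i is not None:
            unmatched_gt.remove(best_i)
            tp += 1
    return tp, len(preds) - tp, len(unmatched_gt)
```

For example, `greedy_match([(0, 0, 2, 2), (10, 10, 12, 12)], [(0, 0, 2, 2)])` yields one true positive, one false positive, and no false negatives; dividing matched pairs by the total number of boxes from either annotator gives a simple agreement score.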
Expected Student Profile 

• Strong knowledge of machine learning, deep learning, and computer vision
• Solid experience in Python programming and deep learning frameworks (e.g.,
PyTorch)
• Interest in autonomous systems, multi-agent perception, or dataset engineering
• Ability to work independently, conduct a literature review, and implement
research-level code