Synthetic traffic datasets provide highly accurate and affordable annotations, which are of crucial importance in complex vision-based perception tasks performed on real-world traffic data. Due to the lack of paired 2-D–3-D data, it remains very challenging when adapting the knowledge of a vehicle's pose in SE(3) with its known 3-D geometry. In this article, we first propose a synthetic dataset, SynthV6D, enabling 6-D pose estimation of vehicles in monocular traffic images. The dataset comprises industrial-grade vehicles in motion evolving in realistic virtual scenery, covering a wide range of viewpoints and distances. Second, we introduce a weakly supervised domain adaptation approach, dubbed W6DNet, to recover the 6-D pose. To this end, by using the synthetic dataset, a novel linked image feature space-based domain adaptation is introduced. Furthermore, an original two-step double-fusion block is proposed to fuse the multi-modal data representations and the cross-space features. Consequently, the proposed method learns the pose-specific embeddings. We evaluate W6DNet on the real-world ApolloCar3D dataset. Extensive experimental results demonstrate that, when a small amount of real-world data is accessible, the proposed approach can significantly advance the performance when adapting knowledge from SynthV6D. Moreover, it achieves competitive performance compared to fully supervised state-of-the-art methods. The code is available at https://github.com/YangLyu-123/TIM-W6DNet.git .
Lyu, Y, Royen, RD & Munteanu, A 2024, 'W6DNet: Weakly Supervised Domain Adaptation for Monocular Vehicle 6-D Pose Estimation With 3-D Priors and Synthetic Data', IEEE Transactions on Instrumentation and Measurement, vol. 73, 5010313, pp. 1-13. https://doi.org/10.1109/TIM.2024.3363789
Lyu, Y., Royen, R. D., & Munteanu, A. (2024). W6DNet: Weakly Supervised Domain Adaptation for Monocular Vehicle 6-D Pose Estimation With 3-D Priors and Synthetic Data. IEEE Transactions on Instrumentation and Measurement, 73, 1-13. Article 5010313. https://doi.org/10.1109/TIM.2024.3363789
@article{0127d187b8c447b2aaab3996a3bcacb0,
title = "W6DNet: Weakly Supervised Domain Adaptation for Monocular Vehicle 6-D Pose Estimation With 3-D Priors and Synthetic Data",
abstract = "Synthetic traffic datasets provide highly accurate and affordable annotations, which are of crucial importance in complex vision-based perception tasks performed on real-world traffic data. Due to the lack of paired 2-D--3-D data, it remains very challenging when adapting the knowledge of a vehicle{\textquoteright}s pose in SE(3) with its known 3-D geometry. In this article, we first propose a synthetic dataset, SynthV6D, enabling 6-D pose estimation of vehicles in monocular traffic images. The dataset comprises industrial-grade vehicles in motion evolving in realistic virtual scenery, covering a wide range of viewpoints and distances. Second, we introduce a weakly supervised domain adaptation approach, dubbed W6DNet, to recover the 6-D pose. To this end, by using the synthetic dataset, a novel linked image feature space-based domain adaptation is introduced. Furthermore, an original two-step double-fusion block is proposed to fuse the multi-modal data representations and the cross-space features. Consequently, the proposed method learns the pose-specific embeddings. We evaluate W6DNet on the real-world ApolloCar3D dataset. Extensive experimental results demonstrate that, when a small amount of real-world data is accessible, the proposed approach can significantly advance the performance when adapting knowledge from SynthV6D. Moreover, it achieves competitive performance compared to fully supervised state-of-the-art methods. The code is available at https://github.com/YangLyu-123/TIM-W6DNet.git .",
author = "Lyu, Yangxintong and Royen, {Remco Donovan} and Munteanu, Adrian",
note = "Funding Information: This work was supported in part by the Innoviris through the Research Projects TORRES and in part by the Fonds Wetenschappelijk Onderzoek (FWO) under Grant 1S89420N. Publisher Copyright: {\textcopyright} 1963-2012 IEEE.",
year = "2024",
month = feb,
day = "21",
doi = "10.1109/TIM.2024.3363789",
language = "English",
volume = "73",
pages = "1--13",
journal = "IEEE Transactions on Instrumentation and Measurement",
issn = "0018-9456",
publisher = "Institute of Electrical and Electronics Engineers",
}