In this Letter, the authors propose a deep-learning-based method to perform semantic segmentation of clothes from RGB-D images of people. First, they present a synthetic dataset containing more than 50,000 RGB-D samples of characters in different clothing styles, featuring various poses and environments for a total of nine semantic classes. The proposed data generation pipeline allows for fast production of RGB images, depth images, and ground-truth label maps. Secondly, a novel multi-modal encoder–decoder convolutional network is proposed which operates on RGB and depth modalities. Multi-modal features are merged using trained fusion modules which use multi-scale atrous convolutions in the fusion process. The method is numerically evaluated on synthetic data and visually assessed on real-world data. The experiments demonstrate the efficiency of the proposed model over existing methods.
Joukovsky, BJ, Hu, P & Munteanu, A 2020, 'Multi-modal deep network for RGB-D segmentation of clothes', Electronics Letters, vol. 56, no. 9, pp. 432-434. https://doi.org/10.1049/el.2019.4150
Joukovsky, B. J., Hu, P., & Munteanu, A. (2020). Multi-modal deep network for RGB-D segmentation of clothes. Electronics Letters, 56(9), 432-434. https://doi.org/10.1049/el.2019.4150
@article{dc91e29158b84e24bc5ec77b26956217,
title = "Multi-modal deep network for RGB-D segmentation of clothes",
abstract = "In this Letter, the authors propose a deep-learning-based method to perform semantic segmentation of clothes from RGB-D images of people. First, they present a synthetic dataset containing more than 50,000 RGB-D samples of characters in different clothing styles, featuring various poses and environments for a total of nine semantic classes. The proposed data generation pipeline allows for fast production of RGB images, depth images, and ground-truth label maps. Secondly, a novel multi-modal encoder–decoder convolutional network is proposed which operates on RGB and depth modalities. Multi-modal features are merged using trained fusion modules which use multi-scale atrous convolutions in the fusion process. The method is numerically evaluated on synthetic data and visually assessed on real-world data. The experiments demonstrate the efficiency of the proposed model over existing methods.",
author = "Joukovsky, {Boris Joseph} and Pengpeng Hu and Adrian Munteanu",
year = "2020",
month = apr,
day = "30",
doi = "10.1049/el.2019.4150",
language = "English",
volume = "56",
number = "9",
pages = "432--434",
journal = "Electronics Letters",
issn = "0013-5194",
publisher = "Institution of Engineering and Technology",
}