Marivani, I, Tsiligianni, E, Cornelis, B & Deligiannis, N 2020, 'Multimodal Deep Unfolding for Guided Image Super-Resolution', IEEE Transactions on Image Processing, vol. 29, pp. 8443-8456. https://doi.org/10.1109/TIP.2020.3014729
@article{a1c1400036f942d6816343dbfdb2d66e,
title = "Multimodal Deep Unfolding for Guided Image Super-Resolution",
abstract = "The reconstruction of a high-resolution image given a low-resolution observation is an ill-posed inverse problem in imaging. Deep learning methods rely on training data to learn an end-to-end mapping from a low-resolution input to a high-resolution output. Unlike existing deep multimodal models that do not incorporate domain knowledge about the problem, we propose a multimodal deep learning design that incorporates sparse priors and allows the effective integration of information from another image modality into the network architecture. Our solution relies on a novel deep unfolding operator, performing steps similar to an iterative algorithm for convolutional sparse coding with side information; therefore, the proposed neural network is interpretable by design. The deep unfolding architecture is used as a core component of a multimodal framework for guided image super-resolution. An alternative multimodal design is investigated by employing residual learning to improve the training efficiency. The presented multimodal approach is applied to super-resolution of near-infrared and multi-spectral images as well as depth upsampling using RGB images as side information. Experimental results show that our model outperforms state-of-the-art methods.",
author = "Iman Marivani and Evangelia Tsiligianni and Bruno Cornelis and Nikolaos Deligiannis",
year = "2020",
month = aug,
day = "12",
doi = "10.1109/TIP.2020.3014729",
language = "English",
volume = "29",
pages = "8443--8456",
journal = "IEEE Transactions on Image Processing",
issn = "1057-7149",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}
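The abstract describes a deep unfolding operator that mimics an iterative algorithm for sparse coding with side information. As a rough illustrative sketch only — not the authors' convolutional architecture — the following NumPy code unfolds a LISTA-style iteration whose nonlinearity is the closed-form proximal operator of the elementwise l1-l1 penalty lam*(|z| + |z - s|), where s is a sparse code derived from the side-information modality. The names `W`, `S`, `lam`, and the layer count are hypothetical stand-ins for parameters that would be learned end-to-end.

```python
import numpy as np

def prox_l1l1(v, s, lam):
    """Closed-form prox of z -> lam*(|z| + |z - s|), applied elementwise.

    Derived piecewise for s >= 0; the sign flip handles s < 0 by symmetry.
    """
    sign = np.where(s < 0, -1.0, 1.0)
    v2, s2 = sign * v, sign * s
    z = np.where(v2 > s2 + 2 * lam, v2 - 2 * lam,        # shrink from above
        np.where(v2 >= s2, s2,                           # clamp to side info
        np.where(v2 > 0, v2,                             # penalties cancel
        np.where(v2 >= -2 * lam, 0.0, v2 + 2 * lam))))   # shrink toward zero
    return sign * z

def unfolded_sparse_code(x, s, W, S, lam, n_layers=5):
    """Unfold n_layers iterations of z <- prox(W x + S z) (LISTA-style).

    x : observed low-resolution input (vector)
    s : sparse code of the guidance modality (side information)
    W, S, lam : stand-ins for parameters a real network would learn.
    """
    z = prox_l1l1(W.dot(x), s, lam)
    for _ in range(n_layers - 1):
        z = prox_l1l1(W.dot(x) + S.dot(z), s, lam)
    return z
```

In the paper's setting the linear maps would be convolutions and the reconstruction would be decoded from the fused sparse codes; this sketch only illustrates how unfolding a fixed number of proximal iterations yields a network that is interpretable by design.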