@phdthesis{8ead706e6e9042528936c3ca6fbc3a06,
title = "Deep unfolding designs for multimodal image restoration and fusion",
abstract = "Big datasets contain correlated heterogeneous data acquired by diverse modalities, e.g., photography, multispectral and infrared imaging, as well as computed tomography (CT), X-radiography, and ultrasound sensors in medical imaging and non-destructive testing. While some modalities can easily be captured in high resolution, in practice others are more susceptible to environmental noise and are mainly available in low resolution due to time constraints as well as the cost per pixel of the corresponding sensors. Hence, multimodal image restoration, which refers to the reconstruction of one modality guided by another, and multimodal image fusion, that is, the fusion of images from different sources into a single, more comprehensive one, are among the important computer vision problems. In this PhD research, we focus on designing deep unfolding networks for multimodal image restoration and fusion. Analytical methods for image restoration and fusion rely on solving complex optimization problems at training and inference, making them computationally expensive. Deep learning methods can learn a nonlinear mapping between the input and the desired output from data, delivering high accuracy at a low computational cost during inference. However, existing deep models, which behave like black boxes, do not incorporate any prior knowledge. Recently, deep unfolding introduced the idea of integrating domain knowledge in the form of signal priors, e.g., sparsity, into single-modal neural network architectures. In this thesis, we present multimodal deep unfolding designs based on coupled convolutional sparse coding for multimodal image restoration and fusion. We propose two formulations for multimodal image restoration in the form of coupled convolutional sparse coding problems. The first formulation assumes that the representations of the guidance modality are provided and fixed, while the second formulation allows intermediate refinements of both modalities to produce a more suitable guidance representation for the reconstruction. We design two categories of multimodal CNNs by adopting two optimization techniques, i.e., proximal algorithms and the method of multipliers, for solving the corresponding sparse coding problems. We also design a multimodal image fusion model based on the second formulation. Our deep unfolding models are extensively evaluated on several benchmark multimodal image datasets for the applications of multimodal image super-resolution and denoising, as well as multi-focus and multi-exposure image fusion.",
author = "Iman Marivani",
year = "2022",
language = "English",
school = "Vrije Universiteit Brussel",
}