Big datasets contain correlated heterogeneous data acquired by diverse modalities, e.g., photography, multispectral and infrared imaging, as well as computed tomography (CT), X-radiography, and ultrasound sensors in medical imaging and non-destructive testing. While some modalities can easily be captured in high resolution, in practice others are more susceptible to environmental noise and are mainly available in low resolution due to time constraints as well as the cost per pixel of the corresponding sensors. Hence, multimodal image restoration, which refers to the reconstruction of one modality guided by another, and multimodal image fusion, that is, the fusion of images from different sources into a single, more comprehensive one, are important computer vision problems. In this PhD research, we focus on designing deep unfolding networks for multimodal image restoration and fusion.

Analytical methods for image restoration and fusion rely on solving complex optimization problems at training and inference, making them computationally expensive. Deep learning methods can learn a nonlinear mapping between the input and the desired output from data, delivering high accuracy at a low computational cost during inference. However, existing deep models behave like black boxes and do not incorporate any prior knowledge. Recently, deep unfolding introduced the idea of integrating domain knowledge in the form of signal priors, e.g., sparsity, into single-modal neural network architectures. In this thesis, we present multimodal deep unfolding designs based on coupled convolutional sparse coding for multimodal image restoration and fusion.

We propose two formulations for multimodal image restoration in the form of coupled convolutional sparse coding problems. The first formulation assumes that the representation of the guidance modality is provided and fixed, while the second formulation allows intermediate refinements of both modalities to produce a more suitable guidance representation for the reconstruction. We design two categories of multimodal CNNs by adopting two optimization techniques, namely, proximal algorithms and the method of multipliers, for solving the corresponding sparse coding problems. We also design a multimodal image fusion model based on the second formulation. Our deep unfolding models are extensively evaluated on several benchmark multimodal image datasets for the applications of multimodal image super-resolution and denoising, as well as multi-focus and multi-exposure image fusion.
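To illustrate the first formulation, the following is a minimal sketch, with notation assumed here rather than taken from the thesis, of a coupled convolutional sparse coding problem and the proximal-gradient step whose unfolded iterations would give the network layers:

\min_{\{u_k\}} \; \tfrac{1}{2}\Big\| x - \sum_{k} d_k \ast u_k \Big\|_2^2 \;+\; \lambda_1 \sum_{k} \| u_k \|_1 \;+\; \lambda_2 \sum_{k} \| u_k - v_k \|_1 ,

where x is the degraded target modality, d_k are convolutional filters, u_k are the sparse feature maps to estimate, and v_k are the (fixed) feature maps of the guidance modality. A proximal-gradient iteration then takes the form

u_k^{(t+1)} = \operatorname{prox}_{\alpha\,(\lambda_1 \|\cdot\|_1 + \lambda_2 \|\cdot - v_k\|_1)}\Big( u_k^{(t)} - \alpha\, \tilde d_k \ast \big( \textstyle\sum_j d_j \ast u_j^{(t)} - x \big) \Big),

with \tilde d_k the spatially flipped filter d_k and \alpha a step size. In a deep unfolding design, a fixed number of such iterations is unrolled into layers whose filters and thresholds are learned from data; under the second formulation, the guidance maps v_k would additionally be refined across layers.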
Marivani, I 2022, 'DEEP UNFOLDING DESIGNS FOR MULTIMODAL IMAGE RESTORATION AND FUSION', Doctor of Engineering Sciences, Vrije Universiteit Brussel.
Marivani, I. (2022). DEEP UNFOLDING DESIGNS FOR MULTIMODAL IMAGE RESTORATION AND FUSION.
@phdthesis{d67fb25370ac4adeae6b9b82aa8309f0,
title = "DEEP UNFOLDING DESIGNS FOR MULTIMODAL IMAGE RESTORATION AND FUSION",
abstract = "Big datasets contain correlated heterogeneous data acquired by diverse modalities, e.g., photography, multispectral and infrared imaging, as well as computed tomography (CT), X-radiography, and ultrasound sensors in medical imaging and non-destructive testing. While some modalities can easily be captured in high resolution, in practice others are more susceptible to environmental noise and are mainly available in low resolution due to time constraints as well as the cost per pixel of the corresponding sensors. Hence, multimodal image restoration, which refers to the reconstruction of one modality guided by another, and multimodal image fusion, that is, the fusion of images from different sources into a single, more comprehensive one, are important computer vision problems. In this PhD research, we focus on designing deep unfolding networks for multimodal image restoration and fusion. Analytical methods for image restoration and fusion rely on solving complex optimization problems at training and inference, making them computationally expensive. Deep learning methods can learn a nonlinear mapping between the input and the desired output from data, delivering high accuracy at a low computational cost during inference. However, existing deep models behave like black boxes and do not incorporate any prior knowledge. Recently, deep unfolding introduced the idea of integrating domain knowledge in the form of signal priors, e.g., sparsity, into single-modal neural network architectures. In this thesis, we present multimodal deep unfolding designs based on coupled convolutional sparse coding for multimodal image restoration and fusion. We propose two formulations for multimodal image restoration in the form of coupled convolutional sparse coding problems. The first formulation assumes that the representation of the guidance modality is provided and fixed, while the second formulation allows intermediate refinements of both modalities to produce a more suitable guidance representation for the reconstruction. We design two categories of multimodal CNNs by adopting two optimization techniques, namely, proximal algorithms and the method of multipliers, for solving the corresponding sparse coding problems. We also design a multimodal image fusion model based on the second formulation. Our deep unfolding models are extensively evaluated on several benchmark multimodal image datasets for the applications of multimodal image super-resolution and denoising, as well as multi-focus and multi-exposure image fusion.",
author = "Iman Marivani",
year = "2022",
month = jun,
day = "14",
language = "English",
school = "Vrije Universiteit Brussel",
}