Deep unfolding methods design deep neural networks as learned variations of optimization algorithms through the unrolling of their iterations. These networks have been shown to achieve faster convergence and higher accuracy than the original optimization methods. In this line of research, this paper presents novel interpretable deep recurrent neural networks (RNNs), designed by unfolding iterative algorithms that solve the task of sequential signal reconstruction (in particular, video reconstruction). The proposed networks are designed by accounting for the fact that video frames' patches have a sparse representation and that the temporal difference between consecutive representations is also sparse. Specifically, we design an interpretable deep RNN (coined reweighted-RNN) by unrolling the iterations of a proximal method that solves a reweighted version of the l1-l1 minimization problem. Due to the underlying minimization model, our reweighted-RNN has a different thresholding function (that is, a different activation function) for each hidden unit in each layer. In this way, it has higher network expressivity than existing deep unfolding RNN models. We also present the derived l1-l1-RNN model, which is obtained by unfolding a proximal method for the l1-l1 minimization problem. We apply the proposed interpretable RNNs to the task of video frame reconstruction from low-dimensional measurements, that is, sequential video frame reconstruction. Experimental results on several datasets demonstrate that the proposed deep RNNs outperform existing RNN models.
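To make the unfolding principle concrete, below is a minimal NumPy sketch of a LISTA-style unrolled reconstruction with per-unit, per-layer soft-thresholding, loosely in the spirit of the reweighted-RNN summarized above. It is illustrative throughout: the function names, weight shapes, warm start, and toy parameters are assumptions and do not reproduce the paper's actual model or training.

import numpy as np

def soft_threshold(x, lam):
    # Proximal operator of the l1 norm: elementwise soft-thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def unfolded_reconstruction(y, h_prev, Ws, Us, lams):
    # Reconstruct a frame's sparse code from measurements y with a fixed
    # number of unrolled proximal-gradient layers (hypothetical sketch).
    # Ws, Us: learned weight matrices, one pair per layer; in classical
    #   ISTA these would be I - (1/L) A^T A and (1/L) A^T for sensing
    #   matrix A and step size 1/L.
    # lams: per-layer threshold *vectors*, so every hidden unit in every
    #   layer gets its own soft-thresholding activation, loosely mimicking
    #   the per-unit activations of the reweighted-RNN.
    # h_prev: code of the previous frame, used as a warm start (the l1-l1
    #   model assumes consecutive codes differ sparsely).
    h = h_prev
    for W, U, lam in zip(Ws, Us, lams):
        h = soft_threshold(W @ h + U @ y, lam)
    return h

# Toy usage with random, untrained weights: n-dim codes, m-dim measurements.
rng = np.random.default_rng(0)
n, m, layers = 64, 16, 5
Ws = [0.1 * rng.standard_normal((n, n)) for _ in range(layers)]
Us = [0.1 * rng.standard_normal((n, m)) for _ in range(layers)]
lams = [np.full(n, 0.05) for _ in range(layers)]
h = unfolded_reconstruction(rng.standard_normal(m), np.zeros(n), Ws, Us, lams)

In deep unfolding, Ws, Us, and lams would be trained end-to-end rather than fixed; learning these parameters is what yields faster convergence than the hand-designed iterations they replace.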
Van Luong, H, Joukovsky, BJ & Deligiannis, N 2021, 'Designing Interpretable Recurrent Neural Networks for Video Reconstruction Via Deep Unfolding', IEEE Transactions on Image Processing, vol. 30, 9394770, pp. 4099-4113.
Van Luong, H., Joukovsky, B. J., & Deligiannis, N. (2021). Designing Interpretable Recurrent Neural Networks for Video Reconstruction Via Deep Unfolding. IEEE Transactions on Image Processing, 30, 4099-4113. [9394770].
@article{b94deaaaa726417daf69dfe2da783a44,
title = "Designing Interpretable Recurrent Neural Networks for Video Reconstruction Via Deep Unfolding",
abstract = "Deep unfolding methods design deep neural networks as learned variations of optimization algorithms through the unrolling of their iterations. These networks have been shown to achieve faster convergence and higher accuracy than the original optimization methods. In this line of research, this paper presents novel interpretable deep recurrent neural networks (RNNs), designed by unfolding iterative algorithms that solve the task of sequential signal reconstruction (in particular, video reconstruction). The proposed networks are designed by accounting for the fact that video frames{\textquoteright} patches have a sparse representation and that the temporal difference between consecutive representations is also sparse. Specifically, we design an interpretable deep RNN (coined reweighted-RNN) by unrolling the iterations of a proximal method that solves a reweighted version of the l1-l1 minimization problem. Due to the underlying minimization model, our reweighted-RNN has a different thresholding function (that is, a different activation function) for each hidden unit in each layer. In this way, it has higher network expressivity than existing deep unfolding RNN models. We also present the derived l1-l1-RNN model, which is obtained by unfolding a proximal method for the l1-l1 minimization problem. We apply the proposed interpretable RNNs to the task of video frame reconstruction from low-dimensional measurements, that is, sequential video frame reconstruction. Experimental results on several datasets demonstrate that the proposed deep RNNs outperform existing RNN models.",
author = "{Van Luong}, Huynh and Joukovsky, {Boris Joseph} and Nikos Deligiannis",
note = "Funding Information: Manuscript received May 1, 2020; revised January 13, 2021 and March 16, 2021; accepted March 17, 2021. Date of publication April 2, 2021; date of current version April 9, 2021. This work was supported in part by the FWO Research Project under Grant G093817N, in part by the Ph.D. Fellowship Strategic Basic Research under Grant 1SB5721N, and in part by the Flemish Government through the Onderzoeksprogramma Artifici{\"e}le Intelligentie (AI) Vlaanderen Programme. This article was presented at the 2019 IEEE International Conference on Image Processing (ICIP) [1]. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Denis Kouame. (Corresponding author: Nikos Deligiannis.) The authors are with the Department of Electronics and Informatics (ETRO), Vrije Universiteit Brussel, B-1050 Brussels, Belgium, and also with imec, B-3001 Leuven, Belgium (e-mail: [email protected]; [email protected]; [email protected]). Digital Object Identifier 10.1109/TIP.2021.3069296 Publisher Copyright: {\textcopyright} 1992-2012 IEEE. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.",
year = "2021",
month = apr,
day = "2",
doi = "10.1109/TIP.2021.3069296",
language = "English",
volume = "30",
pages = "4099--4113",
journal = "IEEE Transactions on Image Processing",
issn = "1057-7149",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}