Deep unfolding models are designed by unrolling an optimization algorithm into a deep learning network. These models have shown faster convergence and higher performance compared to the original optimization algorithms. Additionally, by incorporating domain knowledge from the optimization algorithm, they need much less training data to learn efficient representations. Current deep unfolding networks for sequential sparse recovery consist of recurrent neural networks (RNNs), which leverage the similarity between consecutive signals. We redesign the optimization problem to use correlations across the whole sequence, which unfolds into a Transformer architecture. Our model is used for the task of video frame reconstruction from low-dimensional measurements and is shown to outperform state-of-the-art deep unfolding RNN and Transformer models, as well as a traditional Vision Transformer on several video datasets.
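As a rough illustration of what "unrolling an optimization algorithm" means, the sketch below runs a fixed number of ISTA (iterative soft-thresholding) steps for the generic sparse recovery problem min_x 0.5 * ||y - A x||^2 + lam * ||x||_1, treating each iteration as one network layer. It is not the Transformer model proposed in the paper. The names `soft_threshold`, `unfolded_ista`, `num_layers` and `lam` are assumptions made for this example. In a learned deep unfolding network, the quantities `W`, `S` and `theta` would typically become trainable per-layer parameters rather than being fixed from `A`.

```python
import numpy as np


def soft_threshold(x, theta):
    """Proximal operator of the l1 norm: shrink each entry toward zero by theta."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)


def unfolded_ista(y, A, num_layers=50, lam=0.01):
    """Run num_layers ISTA iterations, each viewed as one layer of a network.

    y: measurement vector of shape (m,)
    A: measurement matrix of shape (m, n)
    Returns a sparse estimate x of shape (n,).
    """
    # Lipschitz constant of the gradient of 0.5 * ||y - A x||^2
    lipschitz = np.linalg.norm(A, 2) ** 2

    # Layer parameters. Here they are fixed by A; a learned unfolded
    # network would train them (hypothetical names, for illustration only).
    W = A.T / lipschitz                              # input weights
    S = np.eye(A.shape[1]) - A.T @ A / lipschitz     # recurrent weights
    theta = lam / lipschitz                          # shrinkage threshold

    x = np.zeros(A.shape[1])
    for _ in range(num_layers):
        # One layer: affine map followed by a soft-thresholding nonlinearity
        x = soft_threshold(W @ y + S @ x, theta)
    return x
```

With the weights fixed as above, every layer is an exact ISTA step, so the objective does not increase from one layer to the next. Training the weights trades this guarantee for the faster convergence that the abstract attributes to deep unfolding models.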
De Weerdt, B, Eldar, YC & Deligiannis, N 2023, Designing Transformer networks for sparse recovery of sequential data using deep unfolding. in 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2023-June, IEEE, pp. 1-5, 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing, Rhodes, Greece, 4/06/23. https://doi.org/10.1109/ICASSP49357.2023.10094712
De Weerdt, B., Eldar, Y. C., & Deligiannis, N. (2023). Designing Transformer networks for sparse recovery of sequential data using deep unfolding. In 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 1-5). (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2023-June). IEEE. https://doi.org/10.1109/ICASSP49357.2023.10094712
@inproceedings{e066b1c46eac483d92e1a7ee64b64e99,
title = "Designing Transformer networks for sparse recovery of sequential data using deep unfolding",
abstract = "Deep unfolding models are designed by unrolling an optimization algorithm into a deep learning network. These models have shown faster convergence and higher performance compared to the original optimization algorithms. Additionally, by incorporating domain knowledge from the optimization algorithm, they need much less training data to learn efficient representations. Current deep unfolding networks for sequential sparse recovery consist of recurrent neural networks (RNNs), which leverage the similarity between consecutive signals. We redesign the optimization problem to use correlations across the whole sequence, which unfolds into a Transformer architecture. Our model is used for the task of video frame reconstruction from low-dimensional measurements and is shown to outperform state-of-the-art deep unfolding RNN and Transformer models, as well as a traditional Vision Transformer on several video datasets.",
keywords = "deep unfolding, Transformer networks, sparse recovery, compressed sensing",
author = "{De Weerdt}, Brent and Eldar, {Yonina C.} and Deligiannis, Nikos",
note = "Funding Information: This work was supported by the FWO Flanders Ph.D. Fellowship Strategic Basic Research under Grant 1S44523N and imec.icon Surv-AI-llance. Publisher Copyright: {\textcopyright} 2023 IEEE.; 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2023 ; Conference date: 04-06-2023 Through 10-06-2023",
year = "2023",
doi = "10.1109/ICASSP49357.2023.10094712",
language = "English",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "IEEE",
pages = "1--5",
booktitle = "2023 IEEE International Conference on Acoustics, Speech and Signal Processing",
url = "https://2023.ieeeicassp.org/",
}