Multi-view 3D content is subject to distortions during depth image-based rendering (DIBR). Studies have shown that well-established image quality assessment (IQA) models perform unreliably when evaluating DIBR-synthesized views, spurring the need for more effective IQA methods. Existing objective methods generally rely on pixel-wise correspondences between the reference and distorted images, while view synthesis can introduce pixel shifts. Moreover, DIBR distortions such as stretching and local hole-filling errors have visual impacts different from those of conventional distortions, challenging existing IQA models. Here, we develop a Full-Reference (FR) objective IQA metric for synthesized views that significantly outperforms 2D IQA and state-of-the-art DIBR IQA approaches. Since pixel misalignment between the reference and synthesized views is a major challenge for quality assessment, we deploy a Convolutional Neural Network (CNN) model to acquire a feature representation that is inherently resilient to imperceptible pixel shifts between the compared images; our model therefore does not require accurate shift compensation. We use a set of quality-aware CNN features representing high-order statistics to measure structural similarity, which is combined with a semantic similarity measure for accurate quality assessment. Prediction accuracy is further improved by incorporating a visual saliency model derived from the activations of higher CNN layers. Experimental results indicate a significant performance gain (14.6% in terms of Spearman's rank-order correlation) over the best existing IQA model. The source code of the proposed metric is available at: https://gitlab.com/saeedmp/sequss.
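The pipeline the abstract describes can be summarized in a short sketch. The following is a minimal illustration, not the authors' implementation (that is available at https://gitlab.com/saeedmp/sequss): it assumes a VGG-16 backbone, and the layer indices, similarity formulas, saliency proxy, and combination weight alpha are all illustrative assumptions rather than values from the paper.

# Minimal sketch of the pipeline described in the abstract; the backbone
# choice (VGG-16), layer indices, similarity formulas, and weighting below
# are assumptions for illustration, NOT the authors' method.
import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained CNN used as a fixed feature extractor.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

MID_LAYERS = (8, 15)   # assumed mid-level ReLU layers: structure-related features
DEEP_LAYER = 29        # assumed deep ReLU layer: semantics and saliency source

def extract(x):
    """Collect feature maps at the chosen layers in one forward pass."""
    mids, deep = [], None
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in MID_LAYERS:
            mids.append(x)
        if i == DEEP_LAYER:
            deep = x
            break
    return mids, deep

def saliency_map(deep):
    """Crude saliency proxy: channel-averaged magnitude of deep-layer
    activations, normalized to [0, 1]."""
    s = deep.abs().mean(dim=1, keepdim=True)
    return (s - s.min()) / (s.max() - s.min() + 1e-6)

def structural_similarity(fa, fb, sal, eps=1e-6):
    """SSIM-like pointwise comparison of two feature maps; small pixel
    shifts perturb CNN features far less than raw pixels. The spatial
    mean is weighted by the (resized) saliency map."""
    sim = ((2 * fa * fb + eps) / (fa ** 2 + fb ** 2 + eps)).mean(dim=1, keepdim=True)
    w = F.interpolate(sal, size=sim.shape[2:], mode="bilinear", align_corners=False)
    return (sim * w).sum() / (w.sum() + eps)

@torch.no_grad()
def quality_score(ref, syn, alpha=0.5):
    """Blend structural (mid-layer) and semantic (deep-layer) similarity.
    alpha is an assumed weight, not a value from the paper."""
    mids_r, deep_r = extract(ref)
    mids_s, deep_s = extract(syn)
    sal = saliency_map(deep_r)
    structural = torch.stack([structural_similarity(a, b, sal)
                              for a, b in zip(mids_r, mids_s)]).mean()
    semantic = F.cosine_similarity(deep_r.flatten(1), deep_s.flatten(1)).mean()
    return alpha * structural + (1 - alpha) * semantic

In practice, ref and syn would be preprocessed (resized, ImageNet-normalized) image tensors of shape (B, 3, H, W), and the predicted scores would be validated against subjective ratings with a rank correlation such as scipy.stats.spearmanr, the measure behind the abstract's 14.6% SROCC figure.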
Mahmoudpour, S & Schelkens, P 2022, 'Unifying Structural and Semantic Similarities for Quality Assessment of DIBR-Synthesized Views', IEEE Access , vol. 10, pp. 59026-59036. https://doi.org/10.1109/ACCESS.2022.3179693
Mahmoudpour, S., & Schelkens, P. (2022). Unifying Structural and Semantic Similarities for Quality Assessment of DIBR-Synthesized Views. IEEE Access , 10, 59026-59036. https://doi.org/10.1109/ACCESS.2022.3179693
@article{668937fb67c3493ca2aef54050a887a8,
title = "Unifying Structural and Semantic Similarities for Quality Assessment of DIBR-Synthesized Views",
abstract = "Multi-view 3D content is subject to distortions during the process of depth image-based rendering (DIBR). Studies have shown the unreliable performance of the well-established image quality assessment (IQA) models for evaluation of DIBR-synthesized views which surge the need for more effective IQA methods. Existing objective methods generally rely on the pixel-wise correspondences between the reference and distorted images, while view synthesis can introduce pixel shifts. DIBR distortions such as stretching and local hole-filling errors have different visual impacts from conventional distortions, challenging the existing IQA models. Here, we developed a Full-Reference (FR) objective IQA metric for synthesized views that significantly outperforms 2D IQA and the state-of-the-art DIBR IQA approaches. While the pixel misalignment between the reference and synthesized views is a big challenge for quality assessment, we deployed a Convolutional Neural Network (CNN) model to acquire a feature representation that inherently offers resilience to the imperceptible pixel shift between the compared images. Therefore, our model does not need accurate shift compensation. We deployed a set of quality-aware CNN features representing high-order statistics, to measure the structural similarity which is combined with a semantic similarity measure for accurate quality assessment. Moreover, prediction accuracy is improved by incorporating a visual saliency model acquired using the activations of the higher CNN layers. Experimental results indicate a significant performance gain (14.6% in terms of Spearman's rank-order correlation) compared to the top existing IQA model. The source code of the proposed metric is available at: https://gitlab.com/saeedmp/sequss. ",
keywords = "Deep neural networks, depth image-based rendering, image semantics, saliency map, visual quality assessment",
author = "Saeed Mahmoudpour and Peter Schelkens",
note = "Funding Information: This work was supported by the Research Foundation-Flanders (FWO) under Grant G0B3521N. Publisher Copyright: {\textcopyright} 2013 IEEE. Copyright: Copyright 2022 Elsevier B.V., All rights reserved.",
year = "2022",
doi = "10.1109/ACCESS.2022.3179693",
language = "English",
volume = "10",
pages = "59026--59036",
journal = "IEEE Access ",
issn = "2169-3536",
publisher = "IEEE",
}