Low-count whole-body PET denoising with deep learning in a multicenter, multi-tracer and externally validated study

Justine Maes, Charles Carron, Simon DeKeyser, Tomas Brants, Vicky De Ridder, Amine Chaouki, Camille Steenhout, Stefaan Vandenberghe, Yves D'Asseler, Laurens Raes, Azzam Abdalla Ibrahim, Gerard Moulin-Romsee, Isaac Kargar Samani, Ludovic D'hulst, Sezgin Ustmert, Maarten Larmuseau

Abstract

BACKGROUND: Positron emission tomography (PET) is a powerful diagnostic tool, but limited availability, high cost, and radiation burden restrict its accessibility. Deep learning-based denoising offers a potential solution by enabling low-count PET scans, reducing tracer dose or scan time without compromising diagnostic utility. However, clinical validation of such approaches across different scanner technologies and radiotracers remains limited.

METHODS: We conducted a multicenter, blinded evaluation of NUCLARITY, a deep learning-based denoising software, using PET data from three European hospitals. Data included 65 scans acquired with [¹⁸F]FDG, [¹⁸F]PSMA, [⁶⁸Ga]PSMA, and [⁶⁸Ga]DOTATATE on GE and Siemens systems not seen during model training. Simulated low-count scans (50% of counts) were denoised and compared to full-count clinical scans. Image quality was assessed using root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and the structural similarity index (SSIM). Six nuclear physicians evaluated diagnostic image quality (DIQ), diagnostic confidence (DC), and lesion detection across six anatomical regions. Lesion quantification was compared using SUVmean, SUVmax, and metabolic tumor volume (MTV).

RESULTS: Low-count enhanced (LCE) scans showed improved quantitative image quality metrics compared to unenhanced low-count scans (higher PSNR and SSIM, lower RMSE). Across 243 lesions, SUVmean and SUVmax showed high concordance between standard-count (SC) and LCE scans (concordance correlation coefficient, CCC = 1.00 and 0.99, respectively). Diagnostic image quality and confidence were slightly lower on LCE than on SC scans, but only one reader indicated a clear preference for SC. Sensitivity and specificity for lesion detection in LCE scans were both 99%, with interscan agreement exceeding inter-reader agreement.

CONCLUSIONS: This is the first blinded, multicenter reader study evaluating a PET denoising algorithm in a European clinical setting across multiple tracers, incorporating unseen scanner technologies. The denoising algorithm demonstrated robust generalizability and preserved diagnostic accuracy on 50% count data. These findings support the clinical adoption of deep learning-based PET denoising to reduce dose or scan time for four commonly used tracers.
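
As an illustration of the quantitative metrics named in the abstract, the following minimal Python sketch computes RMSE, PSNR, and Lin's concordance correlation coefficient. It is not the study's implementation. The function names, the choice of the reference maximum as the PSNR peak value, and the use of population (biased) variances in the CCC are all assumptions made for this example, and the abstract does not specify which CCC variant was used.

```python
import numpy as np


def rmse(reference, test):
    """Root mean square error between a reference and a test image."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    return float(np.sqrt(np.mean((reference - test) ** 2)))


def psnr(reference, test):
    """Peak signal-to-noise ratio in dB.

    Assumption: the peak value is the maximum of the reference image.
    """
    err = rmse(reference, test)
    if err == 0.0:
        return float("inf")
    peak = float(np.max(np.asarray(reference, dtype=float)))
    return float(20.0 * np.log10(peak / err))


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Assumption: population (biased) variances and covariance are used.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    cov_xy = np.mean((x - mean_x) * (y - mean_y))
    return float(2.0 * cov_xy / (x.var() + y.var() + (mean_x - mean_y) ** 2))
```

In this sketch, `reference` would be the standard-count image and `test` the low-count or enhanced image, while `x` and `y` would be paired per-lesion values such as SUVmax on SC and LCE scans. Unlike Pearson correlation, the CCC penalizes a systematic offset between the two measurements, which is why it suits agreement analyses of this kind.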