Publication Details
Overview

Moremada Watte Gedara Charuka Moremada, Nikos Deligiannis

Chapter in Book/Report/Conference proceeding

Abstract

The introduction of diverse text-to-image generation models has sparked significant interest across various sectors. While these models provide the groundbreaking capability to convert textual descriptions into visual data, their widespread usage has ignited concerns over the misuse of realistic synthesized images. Despite the pressing need, research on detecting such synthetic images remains limited. This paper aims to bridge this gap by evaluating the ability of several existing detectors to detect synthesized images produced by text-to-image generation models. Our research includes testing four popular text-to-image generation models: Stable Diffusion (SD), Latent Diffusion (LD), GLIDE, and DALL·E-MINI (DM), and leverages two benchmark prompt-image datasets as sources of real images. Additionally, our research focuses on identifying robust, efficient, lightweight detectors to minimize computational resource usage. Recognizing the limitations of current detection approaches, we propose a novel detector grounded in latent space analysis, tailored for recognizing text-to-image synthesized visuals. Experimental results demonstrate that the proposed detector not only achieves high prediction accuracy but also exhibits enhanced robustness against image perturbations, while maintaining lower computational complexity than existing models for detecting text-to-image generated synthetic images.
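The abstract does not specify how the proposed latent-space detector works, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a common latent-space detection setup: images are projected into a low-dimensional latent space fit on real images only (here a PCA subspace as a stand-in for a learned encoder), and the reconstruction error serves as the detection score, on the intuition that synthesized images deviate from the real-image manifold. All data, dimensions, and thresholds below are invented for the demonstration.

```python
# Hypothetical latent-space detector sketch (NOT the paper's detector).
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for flattened image features (e.g., 8x8 grayscale patches).
# "Real" samples lie near a rank-5 subspace; "synthetic" ones deviate more.
basis = rng.normal(size=(5, 64))
real = rng.normal(size=(200, 5)) @ basis + 0.05 * rng.normal(size=(200, 64))
fake = rng.normal(size=(200, 5)) @ basis + 0.8 * rng.normal(size=(200, 64))

# Fit the latent space on real images only (PCA via SVD of centered data).
mean = real.mean(axis=0)
_, _, vt = np.linalg.svd(real - mean, full_matrices=False)
components = vt[:5]  # top-5 latent directions

def recon_error(x):
    """Distance between x and its projection onto the latent subspace."""
    z = (x - mean) @ components.T   # encode into the latent space
    x_hat = z @ components + mean   # decode back to feature space
    return np.linalg.norm(x - x_hat, axis=1)

# Flag an image as synthetic if its error exceeds a high quantile
# of the real-image errors; this keeps the detector lightweight
# (one matrix factorization, two matrix multiplies per image).
tau = np.quantile(recon_error(real), 0.95)
fake_flagged = (recon_error(fake) > tau).mean()
print(f"fraction of synthetic samples flagged: {fake_flagged:.2f}")
```

The appeal of this family of detectors, and plausibly of the lightweight detector the abstract describes, is that the score is cheap to compute and depends on global image statistics rather than brittle pixel-level artifacts, which tends to help robustness under perturbations.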
