Abstract: Enhanced CAT-DM is a virtual try-on system that combines diffusion models with GAN-based initialization to improve realism, efficiency, and controllability. It builds on the Garment-Conditioned Diffusion Model (GC-DM) and incorporates DINO-V2, a self-supervised vision model, to extract fine-grained, pixel-level garment representations. ControlNet is used to improve conditioning accuracy so that garments fit naturally onto body shapes. To speed up the normally time-consuming sampling process of diffusion models, we propose a truncation-based acceleration method that uses a GAN-synthesized coarse image as the initial guess. This substantially reduces the number of sampling steps without compromising high-fidelity garment details. In addition, Poisson blending is employed to merge the synthesized garment into the target person's image with seamless transitions and realistic texture preservation. Extensive evaluations on benchmark datasets show that Enhanced CAT-DM outperforms current virtual try-on techniques on LPIPS (where lower is better), SSIM, and CLIP-I, indicating that it better retains fine details, structural features, and garment semantics. Together, these innovations make Enhanced CAT-DM well suited to real-time, high-fidelity virtual try-on, narrowing the gap between AI-based garment synthesis and real-world usability in the fashion and e-commerce sectors.
Keywords: Virtual Try-On, Diffusion Models, Generative Adversarial Networks (GANs), Garment-Conditioned Diffusion Model (GC-DM), ControlNet, DINO-V2, Truncation-Based Acceleration, Poisson Blending.
DOI: 10.17148/IJARCCE.2025.14246
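
The truncation-based acceleration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear noise schedule, the DDIM-style deterministic update, and the oracle noise predictor (a stand-in for a trained, garment-conditioned denoiser) are all assumptions made for illustration. The idea shown is that a coarse image is noised only up to an intermediate step `t_start`, so the reverse process runs `t_start + 1` steps instead of the full schedule.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def truncated_sample(coarse, eps_model, alpha_bar, t_start, rng):
    """Start the reverse process at t_start from a noised coarse image.

    coarse    : coarse initial guess (for example, a GAN output), any array shape
    eps_model : callable (x, t) -> predicted noise; hypothetical stand-in
    Returns the final sample and the number of reverse steps executed.
    """
    # Forward-diffuse the coarse image to the truncation step.
    ab = alpha_bar[t_start]
    noise = rng.standard_normal(coarse.shape)
    x = np.sqrt(ab) * coarse + np.sqrt(1.0 - ab) * noise

    steps = 0
    for t in range(t_start, -1, -1):
        ab_t = alpha_bar[t]
        ab_prev = alpha_bar[t - 1] if t > 0 else 1.0
        eps = eps_model(x, t)
        # Estimate of the clean image implied by the predicted noise.
        x0_pred = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        # Deterministic (DDIM-style) move to the previous noise level.
        x = np.sqrt(ab_prev) * x0_pred + np.sqrt(1.0 - ab_prev) * eps
        steps += 1
    return x, steps


def make_oracle_eps(target, alpha_bar):
    """Toy noise predictor that already knows the clean image (illustration only)."""
    def eps_model(x, t):
        ab_t = alpha_bar[t]
        return (x - np.sqrt(ab_t) * target) / np.sqrt(1.0 - ab_t)
    return eps_model


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = make_alpha_bar()
    target = rng.random((8, 8))                               # toy "true" try-on image
    coarse = target + 0.1 * rng.standard_normal(target.shape)  # toy coarse guess
    result, steps = truncated_sample(
        coarse, make_oracle_eps(target, alpha_bar), alpha_bar, t_start=199, rng=rng
    )
    print("reverse steps:", steps, "of", len(alpha_bar))
    print("max abs error:", float(np.abs(result - target).max()))
```

In this toy setup the oracle predictor makes the result trivially converge to the target; with a real denoiser, the choice of `t_start` trades sampling cost against how much of the coarse image's error the remaining steps can correct.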