Abstract: Face-swap deepfakes present significant challenges to digital media authenticity and have emerged as critical threats to information integrity in contemporary society. This paper proposes an AI/ML-based detection framework that combines Vision Transformer (ViT) feature extraction with facial symmetry analysis through an early fusion architecture. Our approach leverages the Data-Efficient Image Transformer (DeiT-small) backbone to extract high-level visual features, which are concatenated with 50-dimensional facial symmetry metrics computed from 68-point facial landmarks detected using dlib. The fused features (434 dimensions) are classified through a lightweight fully connected layer optimized with cross-entropy loss. Extensive evaluation on a dataset of 140,002 training samples demonstrates robust detection performance, with confidence scores exceeding 94% on test samples. The proposed architecture significantly reduces computational overhead compared to multi-stream approaches while maintaining discriminative power through complementary feature modalities. Furthermore, we present a user-friendly Gradio-based web interface enabling practical deployment and batch analysis. Our results indicate that the synergistic combination of transformer-based visual perception and geometric facial constraints provides an effective solution for face-swap detection in real-world deployment scenarios.

Keywords: Deepfake detection, Face-swap, Vision Transformers, Feature fusion, Facial symmetry, Digital forensics, Web deployment.
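The early-fusion step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the 384-dimensional ViT feature size is inferred from the DeiT-small embedding width (so that 384 + 50 = 434, the fused dimension stated in the abstract), the left/right landmark pairing and the weights are random placeholders, and the abstract does not specify which 50 symmetry metrics are used.

```python
import numpy as np

rng = np.random.default_rng(0)

VIT_DIM, SYM_DIM, NUM_CLASSES = 384, 50, 2
FUSED_DIM = VIT_DIM + SYM_DIM  # 434, matching the abstract


def symmetry_features(landmarks):
    """Toy 50-d symmetry vector from 68 (x, y) landmarks.

    Reflects a hypothetical right-side landmark about the vertical
    midline and records the (dx, dy) residual against its left-side
    partner. The pairing below is illustrative, NOT dlib's actual
    68-point left/right correspondence.
    """
    mid_x = landmarks[:, 0].mean()
    pairs = [(i, 67 - i) for i in range(25)]  # 25 hypothetical pairs
    feats = []
    for left, right in pairs:
        mirrored = np.array([2 * mid_x - landmarks[right, 0],
                             landmarks[right, 1]])
        feats.extend(landmarks[left] - mirrored)  # dx, dy per pair
    return np.array(feats)  # shape (50,)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def classify(vit_feat, sym_feat, W, b):
    """Early fusion: concatenate, then apply the lightweight FC head."""
    fused = np.concatenate([vit_feat, sym_feat])  # shape (434,)
    return softmax(W @ fused + b)                 # shape (2,): real/fake

# Placeholder inputs and untrained weights, for illustration only.
landmarks = rng.standard_normal((68, 2)) * 10
vit_feat = rng.standard_normal(VIT_DIM)
sym_feat = symmetry_features(landmarks)
W = rng.standard_normal((NUM_CLASSES, FUSED_DIM)) * 0.01
b = np.zeros(NUM_CLASSES)

probs = classify(vit_feat, sym_feat, W, b)
print(FUSED_DIM, sym_feat.shape, probs.shape)
```

Early fusion keeps the classifier head small (a single 434-to-2 linear map here), which is the computational advantage the abstract claims over multi-stream designs that run separate networks per modality.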


DOI: 10.17148/IJARCCE.2025.1412102

How to Cite:

[1] Shriya Arunkumar, Aaradhana R, Sadiya Noor, Sanskriti Raghav, Dr. Kushal Kumar B N, "Geometry Meets Transformers: Facial Asymmetry as a Forensic Signal for Deepfake Detection," International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2025.1412102
