Volume 15, Issue 4, April 2026
This work is licensed under a Creative Commons Attribution 4.0 International License.
A Deterministic Multi-Metric Framework for Automated Image Dataset Validation in Computer Vision
Abstract: Deep learning model performance in computer vision is fundamentally limited by the quality of training data, yet augmented datasets frequently contain corrupted features such as extreme blur, noise, and lighting anomalies. This paper presents the AI-Based Image Dataset Quality Validator, a data-centric framework for automated dataset sanitization. The system employs a deterministic multi-metric validation pipeline that integrates Laplacian variance for sharpness auditing and ITU-R BT.601 luma weighting for exposure control, enabling fine-grained defect identification that traditional global-threshold filters miss. A core innovation of the architecture is the Parallel Structural Label Synchronization module, which guarantees a strict 1:1 correspondence between images and their annotations, stored in either TXT or CSV format, and automatically eliminates orphan labels during export. To handle large-scale batches on standard hardware, the system implements Active Memory Recovery through controlled garbage collection. Experimental evaluation on a 500-image benchmark demonstrates 96.8% rejection accuracy with an average processing time of 42.5 ms per image. The proposed framework reduces manual data-cleaning effort by an estimated 98%, delivering a scalable, Green AI solution for high-integrity computer vision pipelines.
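The checks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold values (100.0, 40.0, 220.0), and the stem-based label matching rule are all assumptions made for illustration. Only the Laplacian-variance sharpness measure, the BT.601 luma weights (0.299, 0.587, 0.114), and the 1:1 image/label requirement come from the text. Images are plain nested lists of (R, G, B) tuples so the example needs only the standard library.

```python
import os
from statistics import pvariance

def to_luma(rgb_image):
    """Convert rows of (R, G, B) pixels to ITU-R BT.601 luma values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]

def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Low values indicate few edges, i.e. a likely blurred image.
    Requires an image of at least 3x3 pixels.
    """
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)

def mean_luma(gray):
    """Average luma, used here as a simple exposure indicator."""
    return sum(sum(row) for row in gray) / (len(gray) * len(gray[0]))

def validate(rgb_image, blur_min=100.0, dark_max=40.0, bright_min=220.0):
    """Return a list of rejection reasons; an empty list means 'keep'.

    All three thresholds are hypothetical placeholders, not paper values.
    """
    gray = to_luma(rgb_image)
    reasons = []
    if laplacian_variance(gray) < blur_min:
        reasons.append("blur")
    brightness = mean_luma(gray)
    if brightness < dark_max:
        reasons.append("underexposed")
    elif brightness > bright_min:
        reasons.append("overexposed")
    return reasons

def synchronize(image_names, label_names):
    """Pair images and labels by file stem (an assumed matching rule).

    Returns (pairs, orphan_labels, unlabeled_images) so that only
    strictly 1:1 matches are exported.
    """
    labels_by_stem = {os.path.splitext(n)[0]: n for n in label_names}
    pairs, unlabeled = [], []
    for name in image_names:
        stem = os.path.splitext(name)[0]
        if stem in labels_by_stem:
            pairs.append((name, labels_by_stem.pop(stem)))
        else:
            unlabeled.append(name)
    orphans = sorted(labels_by_stem.values())
    return pairs, orphans, unlabeled
```

In this sketch, a high-contrast checkerboard passes every check, while a uniformly dark frame is rejected for both blur and underexposure, since a flat image has zero Laplacian response everywhere. Real pipelines would typically compute the same quantities with an array library for speed; the pure-Python form is used here only to keep the example self-contained.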
Keywords: Image Dataset Validation, Computer Vision, Laplacian Variance, Label Synchronization, Data-Centric AI, Image Quality Assessment (IQA), YOLO Framework, Green AI, Automated Data Sanitization, TXT/CSV Annotation Management.
How to Cite:
[1] M. Balavignesh, Dr. C. Karpagavalli, Dr. M. Kaliappan, “A Deterministic Multi-Metric Framework for Automated Image Dataset Validation in Computer Vision,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2026.154161
