Abstract: The evolution of Human-Computer Interaction (HCI) is increasingly focused on interfaces that offer more organic and intuitive methods of system control, moving beyond the limitations of conventional hardware. This paper presents a novel dual-modality framework that functions as a virtual mouse and system controller by integrating real-time hand-gesture analysis with voice-command interpretation. The primary goal is to provide a fully contactless interaction method, improving accessibility for users with motor impairments and offering greater convenience in hands-free scenarios such as academic presentations or sterile work environments. The system is developed in Python, employing OpenCV for video-stream capture and the MediaPipe framework for high-fidelity hand and finger landmark detection, enabling precise cursor manipulation and control over system parameters such as audio volume and screen brightness. Complementing this, a voice interface built with the Eel library, which uses Google Text-to-Speech (gTTS) for spoken feedback, interprets verbal instructions to perform tasks such as launching software, initiating web searches, and querying system status. The fusion of these two modalities yields a responsive, user-centric interface that advances the state of touch-free computing.
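To make the gesture pipeline concrete, the following is a minimal sketch of how OpenCV frames can be fed to MediaPipe's hand-landmark model and the index-fingertip position mapped to the cursor, as the abstract describes. The use of pyautogui for cursor control, the choice of landmark 8 (the index fingertip), and the SMOOTHING value are illustrative assumptions, not details confirmed by the paper.

```python
# Illustrative sketch: webcam -> MediaPipe hand landmarks -> cursor position.
import cv2
import mediapipe as mp
import pyautogui  # assumed cursor-control library; not named in the paper

hands = mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
screen_w, screen_h = pyautogui.size()
cap = cv2.VideoCapture(0)

prev_x, prev_y = 0.0, 0.0
SMOOTHING = 0.3  # low-pass factor to damp landmark jitter (illustrative value)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)  # mirror so cursor motion matches hand motion
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        tip = results.multi_hand_landmarks[0].landmark[8]  # index fingertip
        # Map the normalized [0, 1] landmark coordinates to screen pixels,
        # then apply exponential smoothing to stabilize the cursor.
        x = prev_x + (tip.x * screen_w - prev_x) * SMOOTHING
        y = prev_y + (tip.y * screen_h - prev_y) * SMOOTHING
        pyautogui.moveTo(x, y)
        prev_x, prev_y = x, y
    cv2.imshow("Virtual Mouse (Esc to quit)", frame)
    if cv2.waitKey(1) & 0xFF == 27:
        break

cap.release()
cv2.destroyAllWindows()
```

Click, volume, and brightness gestures would extend the same loop, for example by testing distances between additional landmark pairs such as the thumb and index fingertips.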
Keywords: Human-Computer Interaction (HCI), Gesture Recognition, Voice Command, Computer Vision, MediaPipe, Accessibility.
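On the voice side, the abstract names Eel for the interface and gTTS for speech synthesis; below is a minimal sketch of the command-dispatch and spoken-feedback step. It assumes commands already arrive as recognized text; the speak() helper, the command phrases, and the target applications are hypothetical, and the Eel GUI layer and the speech-recognition front end are omitted.

```python
# Hypothetical command dispatcher with gTTS spoken feedback (assumptions noted above).
import os
import webbrowser

from gtts import gTTS

def speak(text: str) -> None:
    """Synthesize text to an MP3 with gTTS and play it via the OS shell."""
    gTTS(text=text, lang="en").save("reply.mp3")
    os.system("start reply.mp3")  # Windows; use e.g. "mpg123 reply.mp3" on Linux

def handle_command(command: str) -> None:
    """Route a recognized utterance to a system action with spoken feedback."""
    command = command.lower().strip()
    if command.startswith("search "):
        query = command.removeprefix("search ")
        webbrowser.open(f"https://www.google.com/search?q={query}")
        speak(f"Searching for {query}")
    elif command == "open notepad":
        os.startfile("notepad.exe")  # Windows-only launcher; illustrative target
        speak("Opening Notepad")
    else:
        speak("Sorry, I did not understand that command.")

handle_command("search virtual mouse with gestures")
```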
DOI: 10.17148/IJARCCE.2025.141016