Abstract: VisionSpeak is an Android-based mobile application that enhances real-world awareness through intelligent object detection and text recognition. Using the smartphone camera, the app identifies objects and extracts printed or handwritten text in real time. The recognized information is immediately converted into speech by a Text-to-Speech (TTS) engine, so users receive voice feedback without needing to look at the screen. The app integrates deep learning models such as MobileNet and YOLO for efficient object detection and uses the Tesseract OCR engine for text recognition. Designed with accessibility in mind, it supports voice commands, offline functionality, and a user-friendly interface. VisionSpeak is particularly useful for individuals with visual impairments, travelers, and anyone seeking hands-free interaction. It is designed to perform consistently across diverse environments, making it a versatile tool for daily assistance.
Keywords: Object Detection, Text Recognition, Android Application, Text-to-Speech (TTS), Assistive Technology.
DOI: 10.17148/IJARCCE.2025.14479