Abstract: Emotion recognition from speech signals has gained significant attention in human-computer interaction, offering applications in entertainment, mental health monitoring, and personalized user experiences. This paper presents a web-based Speech Emotion Recognition and Music Recommendation System that utilizes deep learning for emotion classification and integrates music streaming services for personalized recommendations. The system records speech input, extracts Mel-Frequency Cepstral Coefficients (MFCC) as features, and classifies emotions using a pre-trained Convolutional Neural Network (CNN) model. Based on the detected emotion, the system retrieves genre-specific music recommendations from Spotify. Implemented using Flask, TensorFlow, and Librosa, the proposed approach achieves efficient real-time emotion classification and enhances user engagement through tailored music selection. Experimental results demonstrate the model’s accuracy and the effectiveness of the recommendation system.
Keywords: Speech Emotion Recognition (SER), Deep Learning, Convolutional Neural Networks (CNN), Mel-Frequency Cepstral Coefficients (MFCC), Natural Language Processing (NLP), Audio Signal Processing, Flask Web Application, Music Recommendation System, Spotify API, Human-Computer Interaction (HCI).
DOI: 10.17148/IJARCCE.2025.14351