Cyberbullying Detection in Social Media Contents using Machine Learning Techniques

Amey Gujar; Akhilesh Ghorpade; Indrajeet Chougule; Vedant Gawas; Paras Gurjar; Himanshu Baboria; Vinod Khetade

doi:10.17148/IJARCCE.2026.15613

Published in: Volume 15, Issue 6, June 2026

Abstract: Cyberbullying is a serious problem in the Information Age: abusive messages and cruel language harm people's emotional wellbeing. The volume of content posted on social media is so large that manual review is slow and does not scale. This work therefore proposes a machine learning framework that detects cyberbullying automatically. It uses NLP methods to clean the text, including word normalization, tokenization, and emoji interpretation, and it handles English, Hindi, Marathi, and Hinglish text.
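The preprocessing steps named above (normalization, tokenization, emoji interpretation) could look roughly like the following minimal sketch. It is not the authors' implementation: the `EMOJI_WORDS` table, the `preprocess` function, and the specific normalization rules are assumptions made for illustration, and a real system would need a far larger emoji table and language-specific handling.

```python
import re
import string
import unicodedata
from typing import List

# Hypothetical emoji-to-word table (illustrative only, not from the paper).
EMOJI_WORDS = {
    "😡": "angry",
    "😂": "laughing",
    "🤡": "clown",
    "👍": "thumbsup",
}

# Map every ASCII punctuation character to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text: str) -> List[str]:
    """Normalize a message and return a list of tokens."""
    # Unicode normalization and lowercasing (a no-op for Devanagari letters).
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace each known emoji with a word token padded by spaces.
    for emoji, word in EMOJI_WORDS.items():
        text = text.replace(emoji, f" {word} ")
    # Collapse any character repeated three or more times down to two.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Drop ASCII punctuation, then split on whitespace.
    return text.translate(PUNCT_TO_SPACE).split()


print(preprocess("You are SOOOO dumb!!! 😡"))
print(preprocess("tu pagal hai kya 😂"))
```

Whitespace splitting is used here instead of a `\w+` pattern so that Devanagari words containing combining vowel signs stay in one piece.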

After preprocessing, the system converts the text into numerical feature vectors using TF-IDF weighting. A linear Support Vector Machine then classifies each message, implemented with scikit-learn's svm.SVC using a linear kernel (kernel="linear"). Several SVM configurations were considered during development, and the linear SVM offered the best combination of accuracy and computational cost.
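As a rough illustration of this TF-IDF plus linear SVM stage, the sketch below chains scikit-learn's `TfidfVectorizer` and `SVC(kernel="linear")` in a pipeline. The six training sentences and their labels are invented toy data, not the paper's dataset, and the default vectorizer settings are an assumption, since the abstract does not state the configuration actually used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy, made-up examples purely for illustration (1 = bullying, 0 = safe).
texts = [
    "you are stupid and ugly",
    "nobody likes you loser",
    "shut up you idiot",
    "have a great day friend",
    "thanks for the lovely photo",
    "see you at the game tomorrow",
]
labels = [1, 1, 1, 0, 0, 0]

# TF-IDF turns each message into a weighted term vector;
# the linear-kernel SVM then learns a separating hyperplane.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
model.fit(texts, labels)

predictions = model.predict(["you are an idiot", "have a lovely day"])
print(predictions)
```

Wrapping both steps in one pipeline means the vocabulary learned during `fit` is reused automatically at prediction time.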

Our experiments show that the TF-IDF and linear SVM model classifies effectively while requiring few computational resources. We applied it to a sample of 31,183 text messages from social media, of which 23,820 were classified as bullying and 7,363 as safe. What distinguishes our system is its multilingual processing and its ability to interpret emojis, which let it handle the many ways people communicate on social media. We also deployed it as a Flask-based API so that it can be integrated easily with web applications, making it a practical tool for real-world content moderation and for improving online safety.
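A Flask-based prediction API of the kind described might be shaped like the sketch below. The `/predict` route, the JSON field names, and the keyword-based `classify` stub are all hypothetical; the stub only stands in for the trained TF-IDF and SVM model so that the example stays self-contained.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for the trained model: a tiny keyword check.
# A real deployment would load the fitted TF-IDF + SVM pipeline instead.
FLAGGED_WORDS = {"idiot", "loser", "stupid"}


def classify(text: str) -> str:
    """Return 'bullying' or 'safe' for a single message (stub logic)."""
    tokens = text.lower().split()
    return "bullying" if any(t in FLAGGED_WORDS for t in tokens) else "safe"


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True yields None instead of raising on a missing or invalid body.
    payload = request.get_json(silent=True)
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return jsonify({"error": "field 'text' is required"}), 400
    return jsonify({"text": text, "label": classify(text)})
```

A web application would send a POST request with a JSON body such as `{"text": "..."}` and read the `label` field from the response.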

Keywords: Cyberbullying Detection, Machine Learning (ML), Natural Language Processing (NLP), Text Classification, Support Vector Machine (SVM), TF-IDF, Sentiment Analysis, Multilingual Text Processing, Social Media Analysis, Flask API, Emoji Processing, Online Safety.

How to Cite:

[1] Amey Gujar, Akhilesh Ghorpade, Indrajeet Chougule, Vedant Gawas, Paras Gurjar, Himanshu Baboria, Prof. Vinod Khetade, “Cyberbullying Detection in Social Media Contents using Machine Learning Techniques,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2026.15613

This work is licensed under a Creative Commons Attribution 4.0 International License.