Abstract: Social media platforms are globally connected networks through which users can interact with one another, and Twitter is one such platform. Because of this global reach, anything a user posts may spread around the world, which is both an advantage and a disadvantage. Legitimate users rely on this reach to promote their products and needs, whereas spam users exploit it to spread unwanted and illegitimate content. This project aims to detect and classify legitimate and illegitimate accounts on Twitter based on user activity, such as a user's tweets, friends count, followers count, and list count. To build the model, user data is collected through the Twitter API. To label the collected dataset, an unsupervised machine learning technique, k-means clustering, is used. A machine learning model is then used to detect and classify spam users; it performs better than other machine learning algorithms such as Random Forest, Decision Tree, and Multinomial Naive Bayes. The system can also operate on real-time Twitter data, so that users can readily identify spam accounts in real time.
Keywords: Machine Learning, Twitter API, Labelling, Classification
DOI: 10.17148/IJARCCE.2020.9653
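
The labelling step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact feature set, the number of clusters, or any preprocessing, so everything below is an assumption made for illustration: two clusters, a log transform of the raw counts, a plain Lloyd's k-means with farthest-first initialisation, and six invented accounts. The names `kmeans`, `to_features`, and `accounts` are hypothetical and are not taken from the paper.

```python
import math


def kmeans(points, k=2, max_iter=100):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels


def to_features(account):
    # Assumption: log1p damps the heavy-tailed count distributions.
    return tuple(math.log1p(value) for value in account)


# Invented toy accounts: (friends_count, followers_count, listed_count).
accounts = [
    (150, 300, 5),
    (200, 250, 3),
    (180, 400, 8),
    (4800, 20, 0),
    (5000, 35, 0),
    (4500, 15, 0),
]

features = [to_features(a) for a in accounts]
centroids, cluster_ids = kmeans(features, k=2)
print(cluster_ids)
```

The cluster indices are only group identifiers. Deciding which cluster corresponds to spam and which to legitimate accounts is a separate step that the abstract does not describe; once that mapping is made, the indices can serve as training labels for a supervised classifier.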