Abstract: Breast cancer is one of the most common cancers in women worldwide, and early detection is crucial to successful treatment. Accurate diagnosis remains challenging, however, and more effective detection methods are needed. This project generates rules from the Breast Cancer Wisconsin dataset using a combination of data exploration, feature reduction, and machine learning algorithms. The dataset is first explored, and an initial set of rules is extracted with a random forest. Feature reduction is then performed using the Extra Tree Classifier, recursive feature elimination (RFE), and the correlation between features, and the top 10 features are selected. The SelectFromModel method then reduces these further to 5 features, and rules are extracted again with a random forest trained on the selected features. Finally, the generated rules are evaluated against the original dataset using several machine learning algorithms: SVM, MLP, Gradient Boosting, AdaBoost, CNN, Extra Tree, and Logistic Regression. By identifying the most important features for predicting breast cancer, we aim to provide clinicians and researchers with insights and tools for more accurate diagnosis and treatment of the disease.
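
The feature-reduction and rule-extraction steps described above might look roughly like the following scikit-learn sketch. It is illustrative only and is not the authors' code: it assumes scikit-learn's bundled copy of the Wisconsin diagnostic data (`load_breast_cancer`), uses `ExtraTreesClassifier` as the estimator inside both RFE and SelectFromModel, omits the correlation-based filtering and the final classifier comparison, and all hyperparameters (tree counts, depth, random seeds) are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.tree import export_text

# Assumption: scikit-learn's bundled Wisconsin diagnostic dataset (569 rows, 30 features).
data = load_breast_cancer()
X, y = data.data, data.target
feature_names = np.array(data.feature_names)

# Step 1 (assumed setup): recursive feature elimination down to 10 features,
# ranking features by extra-trees importances at each elimination round.
rfe = RFE(ExtraTreesClassifier(n_estimators=100, random_state=0),
          n_features_to_select=10)
rfe.fit(X, y)
X10 = X[:, rfe.support_]
names10 = feature_names[rfe.support_]

# Step 2: SelectFromModel keeps the 5 most important of those 10.
# threshold=-np.inf makes max_features the only selection criterion.
sfm = SelectFromModel(ExtraTreesClassifier(n_estimators=100, random_state=0),
                      max_features=5, threshold=-np.inf)
sfm.fit(X10, y)
mask5 = sfm.get_support()
X5 = X10[:, mask5]
names5 = [str(n) for n in names10[mask5]]

# Step 3: fit a small random forest on the 5 features and print the
# if/else rules of one of its trees as text.
rf = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
rf.fit(X5, y)
rules = export_text(rf.estimators_[0], feature_names=names5)

print("Selected features:", names5)
print(rules)
```

Each root-to-leaf path in the printed tree reads as one rule (a conjunction of threshold tests ending in a class label). A full forest yields one such rule set per tree, so any real pipeline would also need a step for merging or pruning them, which the abstract does not specify.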

Keywords: Breast cancer, Rule generation, Random forest, Feature selection, ExtraTreeClassifier, RFE, Correlation, SVM, MLP, Gradient boosting, AdaBoost, CNN, Logistic regression

DOI: 10.17148/IJARCCE.2023.124113
