Abstract: Breast cancer remains a major global health concern, and survival prediction is a key element in improving treatment outcomes and clinical decision-making. This study applies machine learning (ML) techniques to the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) dataset to classify each patient's Overall Survival Status as either Living or Deceased. Five ML algorithms, namely Logistic Regression, Naïve Bayes, Support Vector Machine (SVM), Random Forest, and K-Nearest Neighbours (KNN), are implemented after comprehensive preprocessing that includes missing-value handling, categorical encoding, and feature scaling. Model performance is evaluated using accuracy, precision, recall, F1-score, and the confusion matrix. Logistic Regression achieved the highest accuracy (97.3%), closely followed by Random Forest and Naïve Bayes. The findings demonstrate the potential of ML techniques to assist oncologists with survival prediction and offer a foundation for future integration into personalized medicine.
Keywords: Breast Cancer, Machine Learning, METABRIC, Survival Prediction, Logistic Regression, Random Forest.
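The workflow summarized in the abstract (preprocessing, five classifiers, and the listed metrics) could be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic stand-ins rather than METABRIC records, and the column layout, the 0/1 label coding, the 80/20 split, and all hyperparameters are assumptions made for the example.

```python
import numpy as np
from sklearn.base import clone
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for a clinical table (NOT METABRIC data).
# Assumed label coding: 0 = Living, 1 = Deceased.
X_num, y = make_classification(n_samples=600, n_features=6,
                               n_informative=4, random_state=0)
X_cat = rng.integers(0, 3, size=(600, 1)).astype(float)  # hypothetical 3-level category code
X = np.hstack([X_num, X_cat])
X[rng.random(X.shape) < 0.05] = np.nan  # inject roughly 5% missing values

num_cols, cat_cols = list(range(6)), [6]

# Missing-value handling, categorical encoding, and feature scaling.
preprocess = ColumnTransformer(
    [
        ("num", make_pipeline(SimpleImputer(strategy="median"),
                              StandardScaler()), num_cols),
        ("cat", make_pipeline(SimpleImputer(strategy="most_frequent"),
                              OneHotEncoder(handle_unknown="ignore")), cat_cols),
    ],
    sparse_threshold=0.0,  # always return a dense array (GaussianNB needs one)
)

# The five model families named in the abstract; settings are illustrative.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

results = {}
for name, model in models.items():
    # Fit preprocessing on the training split only to avoid leakage.
    pipe = Pipeline([("prep", clone(preprocess)), ("clf", model)])
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, pred, average="binary", zero_division=0)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion_matrix": confusion_matrix(y_test, pred).tolist(),
    }

for name, metrics in results.items():
    print(f"{name}: accuracy={metrics['accuracy']:.3f}, f1={metrics['f1']:.3f}")
```

Wrapping the preprocessing and the classifier in one pipeline means the imputation and scaling statistics are learned from the training split alone. Scores on this synthetic data say nothing about the accuracies reported in the study.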
DOI: 10.17148/IJARCCE.2025.141049
[1] Manasvi Manohar Phadtare, Dr. Deepak Singh, "Breast Cancer Survival Prediction Using Machine Learning," International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2025.141049