Abstract: Breast cancer, one of the most common diseases affecting the female population, is strongly associated with mutations in two genes, BRCA1 and BRCA2. These mutations can lead to the formation of cysts or lumps in the breast, which can later develop into a tumor. The tumor can be either malignant (cancerous) or benign (harmless), depending on the composition of the nuclei that form it. This case study examines several characteristics of these lumps and applies classification algorithms to attempt early prediction of cancer based on those characteristics.

Keywords: Re-index, Correlation Analysis, Relativity Analysis, 10-fold Cross Validation, Logistic Regression, Naïve Bayes, Gradient Boosted Trees, Random Forest Trees, ROC Curves, Precision Recall Curves.