Abstract: Coronary Artery Disease (CAD) is one of the leading causes of death in humans. The heart muscle, like every other part of the body, needs its own oxygen-rich blood supply, which is delivered by arteries that branch off the aorta and spread over the outer surface of the heart. The Right Coronary Artery (RCA) supplies the bottom part of the heart. The short Left Main (LM) artery branches into the Left Anterior Descending (LAD) artery, which supplies the front of the heart, and the Circumflex (Cx) artery, which supplies the back of the heart. In this paper we begin with data acquisition. Acquired data usually exists in a raw state; in our work it consists of many physiological parameters that directly or indirectly contribute to this disease. Data pre-processing brings the data into a usable form by removing redundancy and noise, eliminating missing and non-numerical values, and normalizing the remaining values. Data analysis and visualization are then carried out to support the statistical analysis of the data. Logistic regression is applied because the data set contains many columns with categorical values, and the accuracy, precision, and F1 score of the model are measured. We then apply other machine learning (ML) algorithms, namely the Random Forest classifier, Support Vector Machine (SVM), and k-Nearest Neighbors (KNN), and compare these models with logistic regression. The conclusions drawn from this interdependent data set can be stored as historical data for future analysis.
Keywords: Coronary Artery Disease, Machine Learning, Data Pre-processing, Logistic Regression, Accuracy, Precision, F1 Score, Data Analysis and Visualization, Random Forest Classifier, SVM, KNN
DOI: 10.17148/IJARCCE.2019.81212
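The pre-processing and evaluation steps summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation used in this paper. It assumes a small numeric table in which missing values are marked as None, and binary labels where 1 denotes the presence of disease. The function names (drop_incomplete_rows, min_max_normalize, binary_metrics) are hypothetical and chosen only for illustration.

```python
def drop_incomplete_rows(rows):
    """Keep only the rows that contain no missing (None) values."""
    return [list(r) for r in rows if all(v is not None for v in r)]


def min_max_normalize(rows):
    """Rescale every column to the range [0, 1].

    A constant column has no spread, so it is mapped to 0.0
    (an assumption made here to avoid division by zero).
    """
    scaled_columns = []
    for column in zip(*rows):
        low, high = min(column), max(column)
        span = high - low
        scaled_columns.append(
            [(v - low) / span if span else 0.0 for v in column]
        )
    # Transpose back from columns to rows.
    return [list(r) for r in zip(*scaled_columns)]


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

The same metrics function can be applied to the predictions of each classifier (logistic regression, Random Forest, SVM, KNN), so that the models are compared on identical terms.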