Abstract: Data mining is the process of extracting valuable information from large databases. Classification, a supervised learning technique, assigns data samples to target classes. This paper discusses two classification algorithms, namely decision trees and random forests. Decision trees are powerful and popular tools for classification and prediction. They represent rules that humans can understand and that can be used in knowledge systems such as databases. A random forest constructs multiple decision trees from the given training data and classifies test data by matching it against these trees. Rattle, an open-source GUI for R, is used to analyze weather data and predict rainfall using 256 data samples. Based on the results obtained, a comparative analysis of the two algorithms is presented.

 

Keywords: Classification, Decision Trees, Random Forest, Supervised Learning, Confusion Matrix, Entropy, Information Gain.