Abstract: We have entered the era of Big Data, in which data is characterized by the 5Vs (Velocity, Variety, Volume, Value, Veracity), making it both complex and useful for predictive and descriptive analysis. Data-driven decision making is an important concern in modern agriculture. One such decision-making process is forecasting crop yield under various environmental and soil conditions. This work is based on that data mining process. It deals with two subtasks: first, it implements and compares different clustering methods for grouping districts that have similar productivity factors for crops; second, it forecasts the yield of the major crops for different districts.

Keywords: Batchelor & Wilkins', DBSCAN, AGNES, Multiple Linear Regression.