Abstract: In this paper we present a big data infrastructure for handling the large amounts of data processed in the investment industry. Various big data tools can be used for this purpose; here we use Hadoop. Hadoop is a framework written in the Java programming language that applies concepts from parallel and distributed computing to speed up computation. Programs can therefore run much faster on a cluster of a few ordinary computers, which makes the approach both affordable and efficient. Hadoop uses its own file system, the Hadoop Distributed File System (HDFS), which is known for its reliability and fault tolerance. Because Hadoop runs on a cluster, no supercomputer is needed to process data quickly. Hadoop is a widely used big data processing engine with a simple master-slave setup. One of the most common application areas for big data is the share market industry. Big data is used in this field for various reasons, the most common being to increase profits by understanding previous data. Such analysis is only possible if the large amounts of data are handled properly. Suggesting shares to users is one of the main concepts of this paper. However, rather than focusing on the analytical part of the framework, our main aim is to make big data easier for the admin to use, so that large data sets are easier to process. This application can offer many advantages in algorithmic trading, a type of trading in which algorithms are used to buy and sell shares. It also has use cases in stock brokerage firms that need to process large amounts of data quickly. The main contribution of this paper is the integration of cloud computing with the Hadoop framework.
The cloud computing component of this project is a web-based application that is directly connected to the Hadoop system. A parallel pipeline is developed so that the admin can handle data easily. We have also developed a communication protocol on top of TCP/IP for the pipeline. The share market industry involves a huge amount of data, and vast volumes are processed every minute for various purposes. Investors usually go through all the data relevant to their research and try to analyse it, taking various factors and attributes into account. Analysing such large amounts of data requires a well-built platform to support it. In this paper, we have also built a web-based platform that helps users analyse the data in a much simpler, graphical form. The attributes include financial transaction patterns of the investor, market conditions and sentiments, macroeconomic variables, scheme-level features, and demographic factors. Predicting redemption behavior requires a sophisticated platform that can capture the multiple factors that affect it. Big data infrastructure offers various use cases, and tools like Hadoop and Spark make it easier to find use cases at the macro level. This platform can investigate these factors on near real-time data and can provide highly accurate predictions of future redemptions at the investor level. Our results show that, by combining cloud computing, big data analytics, and sophisticated algorithmic trading, data-driven results can be used to generate reasonable profits, and the data can be processed in a much simpler and faster way.
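The parallel processing model that Hadoop applies to share data can be illustrated with a minimal map/shuffle/reduce sketch in plain Python. This is not the implementation used in this paper: the record layout, the ticker names, and the function names (`split`, `map_phase`, `shuffle`, `reduce_phase`, `average_close`) are assumptions made purely for illustration, and the chunks are processed sequentially here, whereas on a cluster each chunk would be handled by a separate worker node.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records of the form (ticker, closing_price); not data from the paper.
RECORDS = [
    ("AAA", 100.0),
    ("BBB", 50.0),
    ("AAA", 110.0),
    ("BBB", 70.0),
    ("AAA", 120.0),
]


def split(records, n_chunks):
    """Split the input into chunks, loosely mirroring how input is divided among nodes."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def map_phase(chunk):
    """Emit (key, value) pairs for one chunk; each worker would run this independently."""
    return [(ticker, price) for ticker, price in chunk]


def shuffle(mapped_chunks):
    """Group the emitted values by key across all chunks."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's values into one result, here the average closing price."""
    return {key: mean(values) for key, values in grouped.items()}


def average_close(records, n_chunks=2):
    """Run the three phases in order (sequentially in this sketch)."""
    mapped = [map_phase(chunk) for chunk in split(records, n_chunks)]
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    print(average_close(RECORDS))
```

The result does not depend on how many chunks the input is split into, which is the property that lets the same job scale from one machine to a cluster.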
Keywords: Big data, Hadoop framework, BSE, NSE, FMCG
DOI: 10.17148/IJARCCE.2018.71211