Abstract: Organizations increasingly need rapid insights from continuously generated data streams, and real-time data analytics has become essential for decision-making in sectors such as finance, healthcare, IoT, and social media. Apache Spark, an open-source distributed data processing framework, provides in-memory computation and supports both batch and stream processing. This paper explores the use of Apache Spark for real-time data analytics, focusing on its architecture, components, and advantages over traditional frameworks such as Hadoop MapReduce. Integrated with tools such as Apache Kafka and HDFS, Spark enables scalable, fault-tolerant, low-latency processing. Experimental analysis shows that Spark can handle large-scale, high-velocity data with minimal delay and offers significant improvements in throughput and processing speed. The results confirm that Apache Spark is an efficient and scalable platform for real-time big data analytics.
Keywords: Real-Time Analytics, Apache Spark, Stream Processing, Kafka, Hadoop
DOI: 10.17148/IJARCCE.2025.141118
[1] Mr. Jaybhay. D.S, Miss. Aakanksha. B. Rasure, Miss. Radha. R. Alapure, "REAL TIME BIG DATA ANALYTICS WITH APACHE SPARK," International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2025.141118