Abstract: Reinforcement Learning (RL) has evolved from foundational experiments in classic control tasks to sophisticated systems that tackle contemporary challenges. While early tasks featured discrete action spaces and fully observable states, modern problems often involve continuous control and complex dynamics. This progression has created a significant algorithmic gap, with different problem classes demanding different approaches for optimal performance. This paper presents a comparative analysis that characterizes this gap by benchmarking two influential algorithms on representative tasks: the value-based Deep Q-Network (DQN) on a discrete control problem, and the policy-gradient Proximal Policy Optimization (PPO) on a continuous control problem. The analysis reveals the specialized strengths of each method, demonstrating that DQN achieves high performance in its intended domain, while PPO's architecture is well suited to the stability requirements of more complex, continuous environments. These findings provide an empirical basis for understanding the distinct capabilities of these algorithmic classes, clarifying their respective domains of application and highlighting the importance of matching algorithmic design to problem complexity.
Keywords: Reinforcement Learning, Comparative Analysis, Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), Algorithmic Gap.
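Illustrative setup: the abstract does not specify the environments, library, or hyperparameters used in the study; the following is a minimal sketch of the kind of benchmarking it describes, assuming Gymnasium tasks (CartPole-v1 as the discrete control problem, Pendulum-v1 as the continuous one) and the Stable-Baselines3 implementations of DQN and PPO. All task and parameter choices here are illustrative assumptions, not the paper's reported configuration.

import gymnasium as gym
from stable_baselines3 import DQN, PPO
from stable_baselines3.common.evaluation import evaluate_policy

# Discrete control benchmark: value-based DQN on CartPole-v1 (assumed task choice).
dqn_model = DQN("MlpPolicy", gym.make("CartPole-v1"), learning_rate=1e-3, verbose=0)
dqn_model.learn(total_timesteps=50_000)
dqn_mean, dqn_std = evaluate_policy(dqn_model, gym.make("CartPole-v1"), n_eval_episodes=20)

# Continuous control benchmark: policy-gradient PPO on Pendulum-v1 (assumed task choice).
ppo_model = PPO("MlpPolicy", gym.make("Pendulum-v1"), learning_rate=3e-4, verbose=0)
ppo_model.learn(total_timesteps=100_000)
ppo_mean, ppo_std = evaluate_policy(ppo_model, gym.make("Pendulum-v1"), n_eval_episodes=20)

# Report mean episodic return with standard deviation for each algorithm/task pairing.
print(f"DQN / CartPole-v1: {dqn_mean:.1f} +/- {dqn_std:.1f}")
print(f"PPO / Pendulum-v1: {ppo_mean:.1f} +/- {ppo_std:.1f}")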
DOI: 10.17148/IJARCCE.2025.14940
[1] Priyanka Mohan, Sanju Stephen, Parvez B, "Bridging the Decades: A Comparative Analysis of Reinforcement Learning in Retro and Modern Control Tasks," International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2025.14940