HYPERPARAMETER IMPACT ON LEARNING EFFICIENCY IN Q-LEARNING AND DQN USING OPENAI GYMNASIUM ENVIRONMENTS
- Abstract
This paper examines the impact of hyperparameters on the learning efficiency of Q-Learning and Deep Q-Network (DQN) agents in the CartPole-v1 environment of OpenAI Gymnasium. Both methods were implemented, trained, and evaluated on test outcomes and training improvements, and the paper reports statistical summaries and visualizations of performance and learning. The results indicate that DQN substantially outperforms Q-Learning: DQN achieved a peak training reward of 500, whereas Q-Learning managed only an average test reward of 23. The hyperparameters that most improved the DQN network's performance were a learning rate of 0.001 and a batch size of 64. Q-Learning's efficacy was hindered by inadequate hyperparameters and imprecise state discretization. These findings underscore the necessity of hyperparameter tuning for efficient reinforcement learning.
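To illustrate the tabular setup whose discretization the abstract identifies as a limiting factor, the sketch below shows Q-Learning with state discretization on CartPole-v1 using the Gymnasium API. The bin counts, clipping bounds, and hyperparameters (alpha, gamma, epsilon) are illustrative assumptions, not values taken from the paper.

# A minimal sketch of tabular Q-Learning with state discretization on
# CartPole-v1. Bin counts, bounds, and hyperparameters below are
# illustrative assumptions, not the settings used in the paper.
import numpy as np
import gymnasium as gym

env = gym.make("CartPole-v1")
n_bins = (6, 6, 12, 12)                      # assumed bins per observation dim
lower = np.array([-2.4, -3.0, -0.21, -3.0])  # velocities clipped to assumed bounds
upper = np.array([2.4, 3.0, 0.21, 3.0])
q_table = np.zeros(n_bins + (env.action_space.n,))

alpha, gamma, epsilon = 0.1, 0.99, 0.1       # assumed Q-Learning settings

def discretize(obs):
    # Map a continuous observation to a tuple of bin indices.
    clipped = np.clip(obs, lower, upper)
    ratios = (clipped - lower) / (upper - lower)
    return tuple((ratios * (np.array(n_bins) - 1)).round().astype(int))

for episode in range(500):
    obs, _ = env.reset()
    state = discretize(obs)
    done = False
    while not done:
        # Epsilon-greedy action selection over the discretized state.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        obs, reward, terminated, truncated, _ = env.step(action)
        next_state = discretize(obs)
        # Standard Q-Learning update; the bootstrap term is dropped on termination.
        target = reward + gamma * np.max(q_table[next_state]) * (not terminated)
        q_table[state + (action,)] += alpha * (target - q_table[state + (action,)])
        state = next_state
        done = terminated or truncated

On the DQN side, the best settings reported in the abstract (learning rate 0.001, batch size 64) would correspond to the optimizer's step size and the number of replay transitions sampled per gradient update.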
- Cite This Article as
[Ali Raza, Asfand Ali, Alaptageen Qayyum, Ghulam Shabir, Zahid Hussain and Ghulam Murtaza (2025); HYPERPARAMETER IMPACT ON LEARNING EFFICIENCY IN Q-LEARNING AND DQN USING OPENAI GYMNASIUM ENVIRONMENTS. Int. J. of Adv. Res. (May). 1164-1176] (ISSN 2320-5407). www.journalijar.com
- Corresponding Author
Sapienza University of Rome, Piazzale Aldo Moro 5, 00185 Rome, Italy