Knowledge sharing in Independent Deep Q-Network

Viacheslav Bochok

vooron3@gmail.com

https://orcid.org/0009-0000-3929-2758
Nataliia Fedorova

fedorova_natalia@lll.kpi.ua

Abstract

This paper investigates knowledge-sharing mechanisms in weakly coupled multi-agent reinforcement learning systems based on Independent Deep Q-Networks (IDQN). Although parallel agents can accelerate data collection, their learning processes typically remain isolated, resulting in suboptimal use of collective experience. To address this limitation, the study proposes two complementary methods: (1) a teacher-selection mechanism that identifies the most efficient agent based on episodic performance, and (2) a dynamic control mechanism that adjusts the intensity of knowledge transfer according to the performance gap between teacher and student. The experiments were conducted in the OpenAI Gym CartPole-v1 and LunarLander-v3 environments using three independent agents, in order to validate the effectiveness of the approach across tasks with different reward structures, dynamics, and difficulty levels. All agents were trained with Batch TD(0) at the end of each episode, using a replay buffer. Knowledge transfer was implemented through policy distillation on pseudo-labeled transitions sampled from the teacher's experience buffer. The number of distillation epochs was dynamically determined using a nonlinear scaling function bounded by predefined minimum and maximum values. Results demonstrate that the proposed mechanisms consistently accelerate learning and improve stability compared to baseline DQN configurations without knowledge sharing. Systems employing teacher selection outperform both random teacher choice and all-to-all sharing, and dynamic intensity adjustment proves more effective than constant-intensity distillation. Normalized AUC analysis further confirms statistically significant improvements in both maximum and average episodic returns, indicating faster convergence of the best agent as well as more uniform progress across all agents.
The findings show that knowledge sharing with informed teacher selection and adaptive transfer strength provides a robust and scalable approach for improving the efficiency of independent agents in stationary environments. These mechanisms are compatible with common DQN extensions and can serve as a foundation for future research on adaptive multi-agent knowledge exchange strategies.
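The two mechanisms summarized above can be sketched in a few lines. Note that the abstract does not specify the exact scaling function or hyperparameters, so the tanh-based schedule, the `scale` constant, and the epoch bounds below are illustrative assumptions, not the authors' published formulation.

```python
import math

def select_teacher(episodic_returns):
    """Teacher selection: pick the agent with the highest episodic return."""
    return max(range(len(episodic_returns)), key=lambda i: episodic_returns[i])

def distillation_epochs(teacher_return, student_return,
                        min_epochs=1, max_epochs=10, scale=100.0):
    """Dynamic transfer intensity: map the teacher-student performance gap
    to a number of distillation epochs via a bounded nonlinear schedule.
    The tanh squashing and `scale` constant are illustrative choices."""
    gap = max(0.0, teacher_return - student_return)
    frac = math.tanh(gap / scale)  # nonlinear, saturates for large gaps
    return min_epochs + round(frac * (max_epochs - min_epochs))

# Example with three agents: the strongest agent becomes teacher, and
# weaker students receive more distillation epochs than strong ones.
returns = [210.0, 150.0, 95.0]
teacher = select_teacher(returns)
epochs = [distillation_epochs(returns[teacher], r) for r in returns]
```

In this sketch a student matching the teacher receives only the minimum number of epochs, while the schedule saturates at the maximum for very large gaps, which keeps transfer intensity bounded on both sides as described in the abstract.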

Keywords:

Deep Q-Network (DQN), knowledge sharing, multi-agent systems, policy distillation, reinforcement learning

Article Details

Bochok, V., & Fedorova, N. (2026). Knowledge sharing in Independent Deep Q-Network. Informatyka, Automatyka, Pomiary W Gospodarce I Ochronie Środowiska, 16(1), 104–108. https://doi.org/10.35784/iapgos.7545