IJMERR 2026 Vol.15(3):331-342
doi: 10.18178/ijmerr.15.3.331-342

Heterogeneous Robots Cooperation via Multi-Agent Reinforcement Learning

Rehab Uddin Shawon *, Siming Liu *, Randall Cox, Joe Davis, Mason Walters, and Robert Powers
Department of Computer Science, Missouri State University, Springfield, MO, USA
Email: rs424s@missouristate.edu (R.U.S.); simingliu@missouristate.edu (S.L.);
rc284s@missouristate.edu (R.C.); jddavis@spsmail.org (J.D.); waltersmason@gmail.com (M.W.); rpowers@thesummitprep.org (R.P.)
*Corresponding author

Manuscript received January 08, 2026; revised February 11, 2026; accepted March 31, 2026; published June 18, 2026

Abstract—Coordinating heterogeneous robots to perform complex tasks in dynamic environments is challenging due to differences in robot capabilities, task requirements, and coordination constraints. Multi-Agent Reinforcement Learning (MARL) provides a promising framework for enabling decentralized robots to learn cooperative behaviors through interaction with the environment. However, most existing MARL approaches focus on homogeneous robots or single-task scenarios, limiting their applicability to heterogeneous multi-robot systems that must perform multi-step sequential and collaborative tasks. In this study, we propose a MARL framework that enables heterogeneous robots to learn individual, sequential, and collaborative behaviors using a shared policy network. To distinguish robot roles while maintaining knowledge sharing, we introduce Signed Type Encoding (STE), which embeds robot identity in the observation space, enabling a shared policy to learn type-specific behaviors without separate models. Sequential and collaborative tasks are first decomposed into atomic individual tasks that are trained separately, and curriculum learning is then applied to progressively transfer knowledge to multi-step sequential and collaborative tasks, improving training efficiency and stability. Experimental results show that STE significantly improves task specialization and learning stability compared with baselines. Furthermore, task decomposition and curriculum learning enable effective transfer of learned behaviors to more complex coordination tasks, reducing training time while maintaining consistent task completion and coordinated robot behavior.

Keywords—reinforcement learning, multi-agent systems, transfer learning, multi-robot coordination

Cite: Rehab Uddin Shawon, Siming Liu, Randall Cox, Joe Davis, Mason Walters, and Robert Powers, "Heterogeneous Robots Cooperation via Multi-Agent Reinforcement Learning," International Journal of Mechanical Engineering and Robotics Research, Vol. 15, No. 3, pp. 331-342, 2026. doi: 10.18178/ijmerr.15.3.331-342

Copyright © 2026 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
