Multi-Agent Reinforcement Learning for Autonomous Drone Swarm Coordination in Dynamic Environments
Abstract
Coordinated drone swarms enable applications ranging from disaster response to precision agriculture, but scaling coordination to hundreds of agents in dynamic, GPS-denied environments remains an open problem. We propose SwarmMARL, a multi-agent reinforcement learning (MARL) framework that combines graph attention networks with centralized training and decentralized execution (CTDE). SwarmMARL agents learn emergent flocking, obstacle avoidance, and task allocation behaviors through a curriculum of progressively more complex scenarios. Evaluated in high-fidelity AirSim simulations with up to 128 drones and validated on a 32-drone physical testbed, SwarmMARL achieves a 98.7% mission completion rate in search-and-rescue scenarios, with 40% fewer collisions than the MAPPO and QMIX baselines. Real-world outdoor tests demonstrate robust formation maintenance under 12 m/s wind gusts and communication dropout rates of up to 30%.
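To make the graph attention component concrete, the following is a minimal sketch of how a single agent-to-agent attention step could look when links are range-limited and lossy. It is a generic single-head, GAT-style layer written in NumPy, not SwarmMARL's actual architecture, which the abstract does not specify. The function names `comm_graph` and `graph_attention`, the communication radius, the feature sizes, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def comm_graph(pos, radius, drop_prob, rng):
    """Hypothetical link model: agents within `radius` can talk,
    and each directed link is dropped independently with `drop_prob`."""
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    in_range = dist <= radius
    delivered = rng.random(in_range.shape) >= drop_prob
    adj = in_range & delivered
    np.fill_diagonal(adj, False)  # self-loops are added by the layer itself
    return adj

def graph_attention(h, adj, W, a, slope=0.2):
    """Single-head GAT-style aggregation (illustrative only).

    h:   (N, F) per-agent features
    adj: (N, N) bool, adj[i, j] is True if agent i received j's message
    W:   (F, D) shared projection
    a:   (2 * D,) attention vector
    """
    z = h @ W                                   # (N, D) projected features
    d = z.shape[1]
    # e[i, j] = LeakyReLU(a^T [z_i || z_j]), split into two dot products
    e = (z @ a[:d])[:, None] + (z @ a[d:])[None, :]
    e = np.where(e > 0, e, slope * e)
    # Each agent always attends to itself, so every row has a valid entry
    mask = adj | np.eye(len(h), dtype=bool)
    e = np.where(mask, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)        # numerically stable softmax
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)    # attention over neighbors
    return alpha @ z, alpha

# Usage with made-up sizes: 8 agents, 6 input features, 4 output features
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 20.0, size=(8, 3))
adj = comm_graph(pos, radius=10.0, drop_prob=0.3, rng=rng)
h = rng.normal(size=(8, 6))
W = rng.normal(size=(6, 4))
a = rng.normal(size=8)
out, alpha = graph_attention(h, adj, W, a)
```

Because the attention weights are renormalized over whichever messages actually arrive, an agent that loses every link simply falls back to its own projected features. That is one plausible reason an attention-based aggregator would tolerate communication dropout, though the abstract does not state that this is the mechanism SwarmMARL relies on.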