Autonomous soaring using reinforcement learning for trajectory generation
Autonomous soaring increases the endurance of unmanned aircraft by exploiting wind updrafts. Recent research has explored traditional feedback control methods for autonomously navigating vehicles to thermal updrafts. This paper develops an approach for planar lateral/directional guidance of a gliding aircraft with linear dynamics to a known thermal location. Reinforcement learning is used to generate reference bank angle commands that direct the aircraft into close proximity of the updraft; from there, the aircraft follows a circling trajectory centered on the thermal to gain energy. A Lyapunov-based feedback control law generates the bank angle commands while the aircraft circles the thermal. With reinforcement learning, the problem of online trajectory generation is reduced to a simple search of a static state-action value table, which keeps the computational overhead low in practice. Furthermore, the need for a precise aircraft model for learning and simulation is reduced. Monte Carlo results presented in the paper demonstrate that the reinforcement learning guidance agent can consistently navigate the aircraft to the thermal, and that reliable navigation is achieved after a relatively small number of learning episodes. An analysis of typical energy gains from circling a thermal of constant shape and size is also presented. These results indicate that the approach is a suitable candidate for autonomous soaring.
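To illustrate what a search in a static state-action value table could look like at run time, the following is a minimal Python sketch, not the paper's implementation. The state variables (distance and bearing error to the thermal), the bin counts, the candidate bank angles, and the names `discretize` and `greedy_bank_command` are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math

# Hypothetical discretization: the paper's actual state and action sets
# are not given in the abstract, so these values are placeholders.
BANK_ANGLES_DEG = [-30.0, -15.0, 0.0, 15.0, 30.0]  # candidate bank angle commands
N_DIST_BINS = 10       # bins for distance to the thermal center
N_BEARING_BINS = 12    # bins for bearing error to the thermal
MAX_DIST_M = 1000.0    # distances at or beyond this share the last bin


def discretize(distance_m, bearing_err_rad):
    """Map a continuous (distance, bearing error) pair to table indices."""
    d = int(distance_m / MAX_DIST_M * N_DIST_BINS)
    d = max(0, min(d, N_DIST_BINS - 1))
    # Wrap the bearing error into [0, 2*pi) before binning.
    wrapped = bearing_err_rad % (2.0 * math.pi)
    b = int(wrapped / (2.0 * math.pi) * N_BEARING_BINS)
    b = min(b, N_BEARING_BINS - 1)
    return d, b


def greedy_bank_command(q_table, distance_m, bearing_err_rad):
    """Return the bank angle with the highest stored value for this state."""
    d, b = discretize(distance_m, bearing_err_rad)
    values = q_table[d][b]
    best = max(range(len(BANK_ANGLES_DEG)), key=lambda a: values[a])
    return BANK_ANGLES_DEG[best]


# Placeholder table of zeros; in practice it would hold values learned offline.
q_table = [
    [[0.0] * len(BANK_ANGLES_DEG) for _ in range(N_BEARING_BINS)]
    for _ in range(N_DIST_BINS)
]
```

Under these assumptions, each guidance update costs one discretization and one maximum over a handful of stored values, which is consistent with the low online computational overhead claimed above. Because the table is static after learning, no model of the aircraft is evaluated at run time.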