
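Coordination control method for fixed-wing UAV formation through deep reinforcement learning
Cite this article:XIANG Xiaojia, YAN Chao, WANG Chang, YIN Dong. Coordination control method for fixed-wing UAV formation through deep reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(4): 524009-524009.
Authors:XIANG Xiaojia  YAN Chao  WANG Chang  YIN Dong
Affiliation:College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Funding:National Natural Science Foundation of China (61906203); Fund of the Key Laboratory of Special UAV Technology, Northwestern Polytechnical University (614230110080817)
Abstract:Owing to the complexity of the kinematics and the dynamic nature of the environment, controlling a group of UAVs to carry out missions still poses considerable challenges. First, taking fixed-wing UAVs as the research object and considering the randomness and uncertainty of complex dynamic environments, a UAV formation coordination control method based on model-free deep reinforcement learning is proposed. Then, to balance exploration and exploitation, the ε-greedy strategy is combined with an imitation strategy to give the ε-imitation action selection strategy; the DQN (Deep Q-Network) algorithm is improved with double Q-learning and the dueling architecture, yielding the ID3QN (Imitative Dueling Double Deep Q-Network) algorithm, which improves learning efficiency. Finally, a high-fidelity semi-physical simulation system is built for hardware-in-the-loop flight simulation experiments, which verify the adaptability and practicality of the proposed algorithm.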

Keywords:fixed-wing UAVs  UAV formation  coordination control  deep reinforcement learning  neural networks
Received:2020-03-24
Revised:2020-05-18
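
The abstract names the ε-imitation strategy as a combination of ε-greedy and an imitation strategy but does not spell out the rule. The following is a minimal Python sketch under one plausible reading: the random-exploration branch of ε-greedy is replaced by an action proposed by an imitation rule. That reading, the function `epsilon_imitation`, the placeholder rule `follow_leader`, and the three-action example are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import Callable, Sequence


def epsilon_imitation(
    q_values: Sequence[float],
    imitation_action: Callable[[], int],
    epsilon: float,
    rng: random.Random,
) -> int:
    """Return an action index.

    ASSUMPTION: with probability epsilon the agent follows the imitation
    rule (instead of a uniformly random action, as plain epsilon-greedy
    would); otherwise it acts greedily on the current Q-values.
    """
    if rng.random() < epsilon:
        return imitation_action()
    # Greedy branch: index of the largest Q-value (first index on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


def follow_leader() -> int:
    """Hypothetical expert rule, e.g. 'bank toward the leader'."""
    return 2


# Hypothetical discrete action set: 0 = bank left, 1 = hold, 2 = bank right.
rng = random.Random(0)
q = [0.1, 0.7, 0.3]
greedy = epsilon_imitation(q, follow_leader, epsilon=0.0, rng=rng)
imitated = epsilon_imitation(q, follow_leader, epsilon=1.0, rng=rng)
```

With `epsilon=0.0` the call always takes the greedy branch, and with `epsilon=1.0` it always defers to the imitation rule; in training, ε would typically be decayed between those extremes.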

Coordination control method for fixed-wing UAV formation through deep reinforcement learning
XIANG Xiaojia, YAN Chao, WANG Chang, YIN Dong. Coordination control method for fixed-wing UAV formation through deep reinforcement learning[J]. Acta Aeronautica et Astronautica Sinica, 2021, 42(4): 524009-524009.
Authors:XIANG Xiaojia  YAN Chao  WANG Chang  YIN Dong
Institution:College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Abstract:Due to the complexity of the kinematics and the dynamics of the environment, controlling a squad of fixed-wing Unmanned Aerial Vehicles (UAVs) remains a challenging problem. Considering the randomness and uncertainty of complex, dynamic environments, this paper addresses the coordination control problem of UAV formations with a model-free deep reinforcement learning algorithm. A new action selection strategy, the ε-imitation strategy, is proposed by combining the ε-greedy strategy with an imitation strategy to balance exploration and exploitation. Building on this strategy, the double Q-learning technique, and the dueling architecture, the ID3QN (Imitative Dueling Double Deep Q-Network) algorithm is developed to improve learning efficiency. Hardware-in-the-loop experiments conducted in a high-fidelity semi-physical simulation system demonstrate the adaptability and practicality of the proposed ID3QN coordination control algorithm.
Keywords:fixed-wing UAVs  UAV formation  coordination control  deep reinforcement learning  neural networks  
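
The abstract states that ID3QN builds on double Q-learning and the dueling architecture. As a reminder of what those two standard ingredients compute, here is a small pure-Python sketch of the usual dueling aggregation, Q(s, a) = V(s) + A(s, a) - mean_a A(s, a), and the usual double Q-learning target, in which the online network selects the next action and the target network evaluates it. The function names and the toy numbers are illustrative assumptions; the paper's network layout and hyperparameters are not given in the abstract.

```python
from typing import List, Sequence


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Combine a state value and per-action advantages into Q-values.

    Subtracting the mean advantage keeps V and A identifiable.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_q_target(
    reward: float,
    gamma: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Double Q-learning target for one transition.

    The online estimates pick the next action; the target estimates
    score it, which reduces the overestimation of plain max-Q targets.
    """
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


# Toy numbers, chosen only to make the arithmetic easy to follow.
q = dueling_q(value=1.0, advantages=[0.0, 2.0, 4.0])
y = double_q_target(
    reward=1.0,
    gamma=0.5,
    done=False,
    q_online_next=[0.2, 0.9, 0.1],
    q_target_next=[5.0, 2.0, 7.0],
)
```

In the toy call the online estimates select action 1, so the target uses the target network's value for action 1 (2.0) rather than its maximum (7.0); that decoupling is the point of double Q-learning.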