
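A Computational Guidance Method for Multi-UAV Collision Avoidance Based on Reinforcement Learning
Cite this article: ZHAO Yu, GUO Ji-feng, ZHENG Hong-xing, BAI Cheng-chao. A Computational Guidance Method for Multi-UAV Collision Avoidance Based on Reinforcement Learning[J]. Navigation Positioning & Timing, 2021, 8(1): 31-40.
Authors: ZHAO Yu  GUO Ji-feng  ZHENG Hong-xing  BAI Cheng-chao
Affiliation: School of Astronautics, Harbin Institute of Technology, Harbin 150001
Funding: National Natural Science Foundation of China (61973101); Aeronautical Science Foundation of China (20180577005)
Abstract: To address cooperative collision avoidance among a large number of fixed-wing UAVs in limited airspace, a computational guidance method based on multi-agent deep reinforcement learning is proposed. First, the collision avoidance guidance process is abstracted as a sequential decision problem and described mathematically using Markov game theory. Then, an autonomous collision avoidance guidance decision method based on deep neural networks is proposed. The network is trained with an improved Actor-Critic model, a machine learning architecture implementing the method is designed, and the corresponding neural network structure and inter-UAV coordination mechanism are given. Finally, a flight scenario simulator with a variable number of entities is built, in which "centralized training" and "distributed execution" are carried out. To verify the performance of the algorithm, simulation experiments are conducted in scenarios with high route density. The simulation results show that the proposed online computational guidance method effectively reduces the collision probability of multiple UAVs in flight and adapts well to scenarios with high route density.

Keywords: Multi-agent  Reinforcement learning  Computational guidance  Fixed wing  Collision avoidance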

A Reinforcement Learning Based Computational Guidance Approach for UAVs Collision Avoidance
ZHAO Yu,GUO Ji-feng,ZHENG Hong-xing,BAI Cheng-chao.A Reinforcement Learning Based Computational Guidance Approach for UAVs Collision Avoidance[J].Navigation Positioning & Timing,2021,8(1):31-40.
Authors:ZHAO Yu  GUO Ji-feng  ZHENG Hong-xing  BAI Cheng-chao
Institution:School of Astronautics, Harbin Institute of Technology, Harbin 150001, China
Abstract: To address cooperative collision avoidance among a large number of fixed-wing UAVs in limited airspace, a computational guidance method based on multi-agent deep reinforcement learning is proposed. First, the collision avoidance guidance process is formulated as a sequential decision problem and described mathematically using Markov game theory. Then, an autonomous collision avoidance guidance decision method based on multilayer neural networks is proposed. The network is trained with an improved Actor-Critic model. A machine learning architecture is designed to implement the method, and the corresponding neural network structure and coordination mechanisms among UAVs are given. Finally, a flight simulator with a variable number of entities is established, in which centralized training and distributed execution are performed. To verify the performance of the algorithm, several simulation experiments are carried out in scenarios with high traffic density. The simulation results show that the proposed onboard computational guidance method effectively reduces the collision probability of multiple UAVs in flight and adapts well to scenarios with high route density.
Keywords:Multi-agent  Reinforcement learning  Computational guidance  Fixed wing  Collision avoidance
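
The abstract mentions a flight simulator with a variable number of entities in which every UAV executes its own guidance decision. The following is a minimal sketch of that idea only, not the authors' simulator or network: the constant-speed, turn-rate-controlled kinematic model, the `Uav` class, the `fly_straight` placeholder policy, and all numeric values (time step, 50 m separation threshold) are assumptions made for illustration. In the paper a trained Actor network would take the place of the placeholder policy.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Uav:
    """Hypothetical planar fixed-wing state: position (m), heading (rad), speed (m/s)."""
    x: float
    y: float
    heading: float
    speed: float


def step(uav, turn_rate, dt):
    """Advance one UAV by dt seconds at constant speed with the commanded turn rate (rad/s)."""
    heading = (uav.heading + turn_rate * dt) % (2 * math.pi)
    return Uav(
        uav.x + uav.speed * math.cos(heading) * dt,
        uav.y + uav.speed * math.sin(heading) * dt,
        heading,
        uav.speed,
    )


def conflicts(fleet, min_sep):
    """Return index pairs of UAVs that are closer than min_sep metres."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(fleet), 2)
        if math.hypot(a.x - b.x, a.y - b.y) < min_sep
    ]


def fly_straight(index, fleet):
    """Placeholder policy (assumption): never turn. A learned policy would go here."""
    return 0.0


def simulate(fleet, policy, steps, dt=1.0, min_sep=50.0):
    """Run the scenario for any fleet size; each UAV queries the same policy on its own.

    Returns the final fleet and the number of (pair, step) separation violations.
    """
    violations = 0
    for _ in range(steps):
        actions = [policy(i, fleet) for i in range(len(fleet))]
        fleet = [step(u, a, dt) for u, a in zip(fleet, actions)]
        violations += len(conflicts(fleet, min_sep))
    return fleet, violations


if __name__ == "__main__":
    # Two UAVs flying head-on along the x-axis, 200 m apart, 10 m/s each.
    head_on = [Uav(0.0, 0.0, 0.0, 10.0), Uav(200.0, 0.0, math.pi, 10.0)]
    _, count = simulate(head_on, fly_straight, steps=20)
    print("separation violations without avoidance:", count)
```

Counting separation violations per step gives a simple proxy for the collision probability that the abstract reports as the evaluation metric; the paper's actual metric and thresholds are not stated in this record.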
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.