
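Multi-UAV Cooperative Offensive Combat Intelligent Planning Based on Deep Reinforcement Learning
Cite this article: LI Junsheng, YUE Longfei, ZUO Jialiang, YU Lixin, ZHAO Jiale. Multi-UAV Cooperative Offensive Combat Intelligent Planning Based on Deep Reinforcement Learning[J]. Advances in Aeronautical Science and Engineering (航空工程进展), 2022, 13(6): 40-49, 96.
Authors: LI Junsheng  YUE Longfei  ZUO Jialiang  YU Lixin  ZHAO Jiale
Affiliations: Air Force Engineering University; Air Force Engineering University; Air Force Engineering University; Air Force Engineering University; Air Traffic Control and Navigation College, Air Force Engineering University
Funding: National Natural Science Foundation of China (62106284); Natural Science Foundation of Shaanxi Province (2021JQ-370); Military Scientific Research Project (KJ20191A030153)
Abstract: Owing to advantages such as high combat cost-effectiveness, flexibility, and autonomy, UAVs have gradually replaced manned forces in combat, and mission planning for multi-UAV cooperative operations has become a research hotspot. The intelligent optimization algorithms used in traditional mission planning depend on static, low-dimensional, simple scenarios and compute slowly on board. To address these shortcomings, an end-to-end intelligent planning method for multi-UAV cooperative attack based on deep reinforcement learning (DRL) is proposed. The suppression of enemy air defenses (SEAD) mission planning process is modeled as a Markov decision process, a SEAD intelligent planning model based on the proximal policy optimization (PPO) algorithm is established, and two sets of experiments verify the effectiveness and robustness of the model. The results show that the DRL-based intelligent planning method achieves fast, fine-grained planning and adapts to unknown, continuous, high-dimensional environment situations, and that the intelligent planning model is capable of tactical cooperative planning.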

Keywords: multi-UAV  deep learning  deep reinforcement learning  PPO algorithm  generalization
Received: 2022/1/13
Revised: 2022/4/25

Multi-UAV Cooperative Offensive Combat Intelligent Planning Based on Deep Reinforcement Learning
LI Junsheng, YUE Longfei, ZUO Jialiang, YU Lixin and ZHAO Jiale. Multi-UAV Cooperative Offensive Combat Intelligent Planning Based on Deep Reinforcement Learning[J]. Advances in Aeronautical Science and Engineering, 2022, 13(6): 40-49, 96.
Authors: LI Junsheng  YUE Longfei  ZUO Jialiang  YU Lixin and ZHAO Jiale
Abstract: UAVs have gradually replaced manned aircraft in combat owing to advantages such as high cost-effectiveness and flexible autonomy, and multi-UAV cooperative combat mission planning has attracted widespread attention. An end-to-end intelligent planning method for multi-UAV cooperative attack based on deep reinforcement learning (DRL) is presented to overcome the shortcomings of traditional mission planning algorithms, such as dependence on static, low-dimensional, simple scenarios and slow on-board computation. Suppression of enemy air defenses (SEAD) mission planning is modeled as a Markov decision process. A SEAD intelligent planning model based on the proximal policy optimization (PPO) algorithm is established, and a general intelligent planning architecture is proposed. Domain randomization, policy entropy maximization, and lower-layer network parameter sharing are introduced as training techniques to improve the effectiveness and generalization performance of PPO. Simulation results show that the DRL-based model achieves fast and fine planning through offline training and online planning and adapts to unknown, continuous, and high-dimensional environment situations, which provides a new idea for intelligent planning research.
Keywords: multi-UAV  deep learning  deep reinforcement learning  PPO algorithm  generalization
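
Illustrative note: the abstract names PPO with policy-entropy maximization but gives no formulas. The following is a minimal sketch of the standard clipped PPO surrogate loss with an entropy bonus, not the authors' implementation. The function name `ppo_loss` and the defaults `clip_eps=0.2` and `ent_coef=0.01` are assumptions chosen for illustration and do not come from the paper.

```python
import math

def ppo_loss(new_logp, old_logp, advantages, entropies,
             clip_eps=0.2, ent_coef=0.01):
    """Clipped PPO surrogate loss (to be minimized) with an entropy bonus.

    All arguments are equal-length lists of per-sample values.
    clip_eps and ent_coef are illustrative defaults, not values from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv, ent in zip(new_logp, old_logp, advantages, entropies):
        # Probability ratio between the new policy and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (lower) bound of the unclipped and clipped objectives.
        surrogate = min(ratio * adv, clipped * adv)
        # The entropy bonus rewards more exploratory policies.
        total += surrogate + ent_coef * ent
    # Negate because optimizers minimize, while PPO maximizes the objective.
    return -total / len(advantages)
```

With a positive advantage, the clipping caps how much a single update can gain from raising an action's probability; with a negative advantage, the unclipped term is kept so that harmful probability increases are still fully penalized.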