
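Research on Fighter Evasion of Medium-Range Air-to-Air Missiles Based on the Deep Deterministic Policy Gradient Algorithm
Cite this article: Song Hongchuan, Zhan Hao, Xia Lu, Li Xiangyang, Liu Yan. Research on Fighter Evasion of Medium-Range Air-to-Air Missiles Based on the Deep Deterministic Policy Gradient Algorithm[J]. Advances in Aeronautical Science and Engineering (航空工程进展), 2021, 12(3): 85-94.
Authors: Song Hongchuan (宋宏川), Zhan Hao (詹浩), Xia Lu (夏露), Li Xiangyang (李向阳), Liu Yan (刘艳)
Affiliations (in author order): Northwestern Polytechnical University; Northwestern Polytechnical University; Northwestern Polytechnical University; Xi'an Horizon Electronic Technology Co., Ltd. (西安地平线电子科技有限公司); Northwestern Polytechnical University
Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)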
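Abstract (translated from the Chinese): Escape maneuver strategies for evading medium-range air-to-air missiles are critical to improving fighter survivability. This study investigates training an agent with the deep deterministic policy gradient algorithm to learn such an escape maneuver strategy. Aircraft-missile relative situation parameters and related quantities serve as the agent's input state, aircraft control commands serve as its output actions, and a missile-aircraft pursuit-evasion model serves as its learning environment. A shaping reward built from relative situation and flight parameters and a sparse reward based on the engagement outcome are designed, yielding an end-to-end escape maneuver strategy that maps state parameters to control inputs. A simulated comparison against the attack zones of four typical escape maneuvers based on expert prior knowledge shows that the attack zone of the agent's escape strategy is second only to that of the tail dive maneuver, and that the strategy has the lowest dependence on prior knowledge of missile evasion.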

Keywords: missile evasion; escape maneuver strategy; deep deterministic policy gradient; deep reinforcement learning
Received: 2021/2/8
Revised: 2021/3/15

The Study for a Fighter against a Medium-range Air-to-air Missile Based on Deep Deterministic Policy Gradient Algorithm
Song Hongchuan, Zhan Hao, Xia Lu, Li Xiangyang, Liu Yan. The Study for a Fighter against a Medium-range Air-to-air Missile Based on Deep Deterministic Policy Gradient Algorithm[J]. Advances in Aeronautical Science and Engineering, 2021, 12(3): 85-94.
Authors: Song Hongchuan, Zhan Hao, Xia Lu, Li Xiangyang, Liu Yan
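Institutions: Northwestern Polytechnical University (Song Hongchuan, Zhan Hao, Xia Lu, Liu Yan); Xi'an Horizon Electronic Technology Co., Ltd. (Li Xiangyang)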
Abstract: The evasive maneuver strategy of a fighter against a medium-range air-to-air missile is crucial to aircraft survivability. This paper studies the use of the deep deterministic policy gradient (DDPG) algorithm to train an agent to learn such a strategy. The parameters of the missile-aircraft relative situation are the agent's input states, the aircraft control commands are its output actions, and the missile-aircraft pursuit-evasion model is its learning environment. Two rewards are designed: a shaping reward built from relative situation parameters and flight parameters, and a sparse reward determined by the engagement result. With these, the agent learns an end-to-end evasive maneuver strategy that maps state parameters directly to aircraft control variables. Simulations compare the attack zone under the agent's strategy with the attack zones under four classic evasive maneuvers based on expert prior knowledge. The results show that the attack zone under the agent's strategy is second only to that of the tail dive maneuver, while the strategy has the lowest dependence on specialized domain knowledge of missile evasion.
Keywords:missile evasion  evasive maneuver strategy  deep deterministic policy gradient  deep reinforcement learning
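
The reward structure described in the abstract (a dense shaping term plus a sparse term for the engagement result) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the field names in `StepInfo`, the weights, the scale constants, the altitude floor, and the terminal reward magnitude are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step quantities; the paper's actual state set is not given here."""
    range_m: float            # missile-to-aircraft distance
    closing_speed_mps: float  # positive when the missile is closing on the aircraft
    altitude_m: float         # aircraft altitude
    outcome: str = "ongoing"  # "ongoing", "hit", or "evaded"


def shaping_reward(info: StepInfo,
                   w_range: float = 1.0,
                   w_closing: float = 1.0,
                   w_altitude: float = 1.0,
                   range_scale_m: float = 20_000.0,
                   speed_scale_mps: float = 500.0,
                   min_altitude_m: float = 1_000.0) -> float:
    """Dense reward from relative situation and flight parameters (assumed form)."""
    # Larger separation is better; tanh keeps the term bounded in [0, 1).
    range_term = math.tanh(info.range_m / range_scale_m)
    # Penalize only a closing missile, again bounded by tanh.
    closing_term = math.tanh(max(info.closing_speed_mps, 0.0) / speed_scale_mps)
    # Flat penalty for flying below an assumed safe altitude floor.
    altitude_penalty = 1.0 if info.altitude_m < min_altitude_m else 0.0
    return (w_range * range_term
            - w_closing * closing_term
            - w_altitude * altitude_penalty)


def sparse_reward(info: StepInfo, terminal_magnitude: float = 100.0) -> float:
    """Sparse reward from the engagement result; zero while the engagement continues."""
    if info.outcome == "evaded":
        return terminal_magnitude
    if info.outcome == "hit":
        return -terminal_magnitude
    return 0.0


def total_reward(info: StepInfo) -> float:
    """Sum of the shaping and sparse terms, as a DDPG agent would receive each step."""
    return shaping_reward(info) + sparse_reward(info)
```

In this sketch the sparse term dominates at the end of an engagement, while the shaping term gives the agent a learning signal during the many steps where the outcome is still undecided. The relative weighting between the two is a design choice that would need tuning in any real setup.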