
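Research on air combat decision algorithm based on Proximal Policy Optimization
Cite this article: Zhang Bochao, Wen Xiaoling, Liu Lu, Zhang Yaqian, Wang Hongguang. Research on air combat decision algorithm based on Proximal Policy Optimization[J]. Advances in Aeronautical Science and Engineering (航空工程进展), 2023, 14(2): 145-151.
Authors: Zhang Bochao, Wen Xiaoling, Liu Lu, Zhang Yaqian, Wang Hongguang
Affiliation: Shenyang Aircraft Design and Research Institute (all five authors)
Abstract: In future manned/unmanned aerial vehicle cooperative combat scenarios, real-time and accurate air combat decision-making is the key to victory. The complex aerial environment, rapidly changing situation data, and numerous intricate combat tasks mean that manned/unmanned cooperative combat will replace single-aircraft combat as the development trend of future air combat. Multi-agent modeling and training, however, suffer from difficult reward allocation and poor network convergence. For a 5v5 manned/unmanned cooperative air combat scenario, this paper abstracts feature models of the manned and unmanned aircraft agents and proposes an intelligent air combat decision algorithm based on proximal policy optimization. Situation-assessment rewards are set to guide the decision behavior of the manned and unmanned agents toward favorable situations during combat, so that an air combat decision sequence is output through real-time interaction with the environment. The proposed algorithm is verified by simulation experiments. The results show that, after training, the algorithm adapts to complex battlefield situations and obtains a stable and reasonable decision policy in a continuous action space.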

Keywords: air combat decision; intelligent decision; reinforcement learning; manned/unmanned cooperation
Received: 2022/6/11
Revised: 2022/9/18

Research on air combat decision algorithm based on Proximal Policy Optimization
Zhang Bochao, Wen Xiaoling, Liu Lu, Zhang Yaqian and Wang Hongguang. Research on air combat decision algorithm based on Proximal Policy Optimization[J]. Advances in Aeronautical Science and Engineering, 2023, 14(2): 145-151.
Authors: Zhang Bochao, Wen Xiaoling, Liu Lu, Zhang Yaqian and Wang Hongguang
Institution: Shenyang Aircraft Design and Research Institute
Abstract: In future combat scenarios involving cooperation between manned and unmanned aerial vehicles, real-time and accurate air combat decision-making is the key to winning. For such scenarios, this paper abstracts a characteristic model of a single agent and proposes an algorithm based on proximal policy optimization that obtains the air combat decision sequence through reward and punishment incentives during real-time interaction with the environment. The simulation results show that, after training, the proposed algorithm can adapt to complex battlefield situations and produce a reasonable decision-making strategy.
Keywords: air combat decision; intelligent decision; reinforcement learning; manned and unmanned aerial vehicle cooperation
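
For readers unfamiliar with proximal policy optimization, the algorithm named in the abstract, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It is not taken from the paper: the function name `ppo_clip_objective`, the clipping parameter `clip_eps=0.2`, and the use of plain Python lists are all assumptions made for illustration, and the paper's network structure, reward design, and hyperparameters are not reproduced here.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    All names and the default clip_eps are illustrative assumptions,
    not values reported in the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Identical policies give a ratio of 1, so the objective is the mean advantage.
same_policy = ppo_clip_objective([0.0, 0.0], [0.0, 0.0], [1.0, -1.0])

# A large ratio with a positive advantage is capped at (1 + clip_eps) * advantage.
capped = ppo_clip_objective([0.5], [0.0], [1.0])
```

The clipping removes the incentive to move the policy far from the one that collected the data, which is one common reason PPO is chosen when training is otherwise hard to stabilize.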