One-on-One Air Combat Control Method Based on Situation Assessment and DDPG Algorithm


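Cite this article:	贺宝记, 白林亭, 文鹏程. One-on-One Air Combat Control Method Based on Situation Assessment and DDPG Algorithm[J]. 航空工程进展 (Advances in Aeronautical Science and Engineering), 2024, 15(2): 179-187.

Authors:	贺宝记 (HE Baoji), 白林亭 (BAI Linting), 文鹏程 (WEN Pengcheng)

Affiliations:	AVIC Xi'an Computing Technique Research Institute; AVIC Xi'an Aeronautics Computing Technique Research Institute; AVIC Xi'an Aeronautics Computing Technique Research Institute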

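Abstract (translated from the Chinese):	Existing air combat control methods do not jointly consider situation assessment based on expert knowledge and control of the engagement through continuous changes in speed. Based on the deep deterministic policy gradient (DDPG) reinforcement learning algorithm, and using the situation assessment function as the reinforcement learning reward function, a learning environment is designed that jointly accounts for upper and lower flight altitude limits, flight load factor, and upper and lower flight speed limits. The DDPG algorithm interacts with the learning environment through a fully connected aircraft speed control network and an environment reward network, and the conditions that end an engagement are defined from out-of-range altitude and speed, the time spent locked by a missile, and the engagement duration. Simulated one-on-one air combat is used to validate the control method in terms of learning the environmental limits, situation assessment score, and learning of combat patterns. The results show that the proposed air combat control method is effective and can provide guidance for the further development of autonomous air combat.
Keywords:	reinforcement learning; situation assessment; deep deterministic policy gradient; air combat
Received:	2023/6/19
Revised:	2023/9/19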
One-on-One Air Combat Control Method Based on Situation Assessment and DDPG Algorithm

HE Baoji, BAI Linting and WEN Pengcheng. One-on-One Air Combat Control Method Based on Situation Assessment and DDPG Algorithm[J]. Advances in Aeronautical Science and Engineering, 2024, 15(2): 179-187.

Authors:	HE Baoji, BAI Linting and WEN Pengcheng

Abstract:	Autonomous air combat by unmanned aerial vehicles (UAVs) has attracted increasing attention because of its low cost and the absence of pilot casualties. This paper builds on the deep deterministic policy gradient (DDPG) reinforcement learning method. Using the situation assessment function as the reward function, a reinforcement learning environment is designed that accounts for flight altitude limits, flight load factor, and flight speed limits. The DDPG algorithm interacts with the learning environment through a fully connected aircraft speed control network and an environment reward network. The conditions that end an engagement are defined from out-of-range altitude and speed, missile lock time, and engagement duration. One-on-one air combat simulations validate the control method in terms of learning under environmental constraints, situation assessment scores, and learning of combat patterns. This research can provide guidance for the further development of autonomous air combat.

Keywords:	reinforcement learning; situation assessment; deep deterministic policy gradient (DDPG); air combat
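
The abstract names four conditions that end an engagement (out-of-range altitude, out-of-range speed, missile lock time, and engagement duration) and states that the situation assessment function serves as the reward. The sketch below illustrates how such a check might be wired into one environment step. It is not the authors' implementation: the `Limits`, `State`, `termination_reason`, and `step_outcome` names are hypothetical, every threshold is a placeholder rather than a value from the paper, and the situation assessment function is passed in as an opaque callable because the abstract does not define it. The load factor constraint and the flight dynamics are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Limits:
    # Placeholder thresholds for illustration only; none come from the paper.
    alt_min: float = 1_000.0         # m
    alt_max: float = 12_000.0        # m
    speed_min: float = 100.0         # m/s
    speed_max: float = 400.0         # m/s
    max_lock_time: float = 5.0       # s under enemy missile lock
    max_combat_time: float = 300.0   # s per engagement


@dataclass
class State:
    altitude: float     # m
    speed: float        # m/s
    locked_time: float  # s spent locked by the opponent's missile
    elapsed: float      # s since the engagement started


def termination_reason(s: State, lim: Limits) -> Optional[str]:
    """Return why the engagement ends, or None if it continues."""
    if not lim.alt_min <= s.altitude <= lim.alt_max:
        return "altitude"
    if not lim.speed_min <= s.speed <= lim.speed_max:
        return "speed"
    if s.locked_time >= lim.max_lock_time:
        return "locked"
    if s.elapsed >= lim.max_combat_time:
        return "timeout"
    return None


def step_outcome(
    s: State,
    lim: Limits,
    situation_score: Callable[[State], float],
) -> Tuple[float, bool, Optional[str]]:
    """Reward is the situation assessment score; done follows the end conditions."""
    reason = termination_reason(s, lim)
    return situation_score(s), reason is not None, reason
```

In a DDPG training loop, the tuple returned by `step_outcome` would supply the reward and the episode-end flag stored in the replay buffer alongside the state and the continuous speed action. Whether a terminal penalty is added on top of the situation score is a design choice the abstract does not specify, so none is applied here.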
