
A Distributed Reinforcement Learning Guidance Method under Impact Angle Constraints
LI Bohao,AN Xuman,YANG Xiaofei,WU Yunjie,LI Guofei.A Distributed Reinforcement Learning Guidance Method under Impact Angle Constraints[J].Journal of Astronautics (宇航学报),2022,43(8):1061-1069.
Authors:LI Bohao  AN Xuman  YANG Xiaofei  WU Yunjie  LI Guofei
Institution:1. State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China;2. School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China;3. Science and Technology on Aircraft Control Laboratory, Beijing 100191, China;4. School of Astronautics, Northwestern Polytechnical University, Xi’an 710072, China
Funding:National Natural Science Foundation of China (62003021); Fundamental Research Funds for the Central Universities (D5000210830)
Received:2021-12-31
Abstract:To improve a missile's strike effectiveness against a target under an impact angle constraint, a distributed reinforcement learning guidance strategy based on the deep deterministic policy gradient (DDPG) algorithm is proposed. To minimize the impact angle error, a new reward function is designed so that the line-of-sight angle converges to its desired value while the field-of-view angle constraint is satisfied. In addition, to enhance the generalization ability of the reinforcement learning model, a distributed exploration strategy is proposed that improves the efficiency of environment exploration during training. Simulation results verify that the proposed distributed reinforcement learning guidance method achieves precise strikes on the target under a fixed impact angle constraint. Compared with traditional guidance laws, the proposed method yields a smaller impact angle error and faster convergence.
Keywords:Missile guidance  Reinforcement learning  Impact angle  Gradient algorithm
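
The abstract describes a reward that drives the line-of-sight angle toward its desired value while respecting the field-of-view constraint, but it does not give the formula. The following is only a minimal illustrative sketch of that general idea, not the reward function from the paper: the function name `shaped_reward`, its parameters, the linear error penalty, and the fixed violation penalty are all assumptions made for illustration.

```python
import math


def shaped_reward(los_angle, desired_los_angle, look_angle, fov_limit,
                  angle_weight=1.0, fov_penalty=10.0):
    """Hypothetical per-step reward (NOT the paper's formula).

    All angles are in radians. The weights are illustrative assumptions.
    """
    # Wrap the line-of-sight angle error into [-pi, pi) so that
    # angles differing by a full turn count as equal.
    error = (los_angle - desired_los_angle + math.pi) % (2 * math.pi) - math.pi

    # Penalize the magnitude of the line-of-sight angle error.
    reward = -angle_weight * abs(error)

    # Apply an extra fixed penalty when the look angle leaves the field of view.
    if abs(look_angle) > fov_limit:
        reward -= fov_penalty

    return reward
```

Under these assumptions, a step with zero line-of-sight error and a look angle inside the field of view gets the maximum reward of zero, and any field-of-view violation lowers the reward by the fixed penalty regardless of the angle error.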