Cooperative Interception Guidance Law Based on Q-learning Reinforcement Learning
Cite this article: WANG Jin-qiang, SU Ri-xin, LIU Li, LIU Yu-xiang, LONG Yong-song. Cooperative Interception Guidance Law Based on Q-learning Reinforcement Learning[J]. Navigation Positioning & Timing, 2022(5): 84-90.
Authors: WANG Jin-qiang  SU Ri-xin  LIU Li  LIU Yu-xiang  LONG Yong-song
Institution: Jiangnan Institute of Mechanical and Electrical Design, Guiyang 550025, China
Funding: Key Basic Research Project of the Commission of Science, Technology and Industry for National Defense (2019-JCJQ-ZD-049)
Abstract: To enable multiple missiles to cooperatively intercept a maneuvering target and to improve interception effectiveness, a cooperative interception guidance law based on Q-learning reinforcement learning is proposed. First, a nonlinear multi-missile cooperative interception model is established on the basis of escape-domain coverage theory. Second, taking the line-of-sight rate as the state and constructing the reward function from the miss distance, a reinforcement learning agent is generated through offline training; combined with conventional proportional navigation, a variable-coefficient guidance law based on reinforcement learning is built, which generates the guidance commands for cooperative interception in real time. Finally, numerical simulations verify the effectiveness and superiority of the proposed algorithm.
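As a rough illustration of the offline training step described in the abstract, the following Python sketch runs tabular Q-learning on a planar pursuit simulation, with the discretized line-of-sight (LOS) rate as the state, a set of candidate navigation ratios as the actions, and a reward driven by the miss distance. The engagement geometry, discretization bins, hyper-parameters, and reward shaping are illustrative assumptions, not the authors' implementation.

import numpy as np

# --- Assumed values for an illustrative planar engagement (not from the paper) ---
DT = 0.01                                          # integration step [s]
N_CHOICES = np.array([2.0, 3.0, 4.0, 5.0, 6.0])    # candidate navigation ratios (actions)
LOS_RATE_BINS = np.linspace(-0.2, 0.2, 21)         # discretized LOS-rate states [rad/s]
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1                 # Q-learning hyper-parameters

Q = np.zeros((len(LOS_RATE_BINS) + 1, len(N_CHOICES)))   # tabular Q-function

def los_rate_state(q_dot):
    """Map a continuous LOS rate to a discrete state index."""
    return int(np.digitize(q_dot, LOS_RATE_BINS))

def run_episode(train=True):
    """One missile-vs-maneuvering-target engagement under variable-N proportional navigation."""
    pm, vm = np.array([0.0, 0.0]), np.array([600.0, 0.0])         # missile position/velocity
    pt, vt = np.array([8000.0, 2000.0]), np.array([-300.0, 0.0])  # target position/velocity
    miss, s, a = np.inf, None, None
    while True:
        r = pt - pm
        rng = np.linalg.norm(r)
        miss = min(miss, rng)
        vr = vt - vm
        closing = -np.dot(r, vr) / rng
        if closing <= 0.0 or rng < 1.0:            # target passed or hit: episode ends
            break
        q_dot = (r[0] * vr[1] - r[1] * vr[0]) / rng**2   # planar LOS rate
        s_next = los_rate_state(q_dot)
        if train and s is not None:
            # shaping reward penalizes LOS-rate growth; the terminal reward uses miss distance
            Q[s, a] += ALPHA * (-abs(q_dot) + GAMMA * Q[s_next].max() - Q[s, a])
        s = s_next
        a = (np.random.randint(len(N_CHOICES)) if train and np.random.rand() < EPS
             else int(Q[s].argmax()))
        acc = N_CHOICES[a] * closing * q_dot       # PN command with the selected coefficient
        normal = np.array([-vm[1], vm[0]]) / np.linalg.norm(vm)
        vm = vm + acc * normal * DT
        pm, pt = pm + vm * DT, pt + vt * DT
        vt = vt + np.array([0.0, 30.0]) * DT       # simple constant-acceleration target maneuver (assumed)
    if train and s is not None:
        Q[s, a] += ALPHA * (-miss - Q[s, a])       # terminal update driven by miss distance
    return miss

for episode in range(500):                         # offline training loop
    run_episode(train=True)
print("final miss distance:", run_episode(train=False))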

Keywords: Cooperative interception  Reinforcement learning  Maneuvering target  Escape domain  Guidance law

Cooperative Interception Guidance Law Based on Q-learning Reinforcement Learning
WANG Jin-qiang, SU Ri-xin, LIU Li, LIU Yu-xiang, LONG Yong-song. Cooperative Interception Guidance Law Based on Q-learning Reinforcement Learning[J]. Navigation Positioning & Timing, 2022(5): 84-90.
Authors:WANG Jin-qiang  SU Ri-xin  LIU Li  LIU Yu-xiang  LONG Yong-song
Institution:Jiangnan Institute of Mechanical and Electrical Design, Guiyang 550025, China
Abstract: To achieve cooperative interception of a maneuvering target by multiple missiles and to improve interception effectiveness, a cooperative interception guidance law based on Q-learning is proposed. Firstly, a nonlinear multi-missile cooperative interception model is established on the basis of escape-domain coverage theory. Then, with the line-of-sight rate taken as the state, a reward function is constructed from the miss distance, and a reinforcement learning agent is generated by offline training. Combined with the proportional navigation guidance law, a variable-coefficient guidance law based on the reinforcement learning agent is designed to generate guidance commands in real time. Finally, the effectiveness and superiority of the proposed algorithm are verified by numerical simulation.
Keywords: Cooperative interception  Reinforcement learning  Maneuvering target  Escape domain  Guidance law
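As a companion to the training sketch above, the following hypothetical Python snippet illustrates the online side of such a scheme: each interceptor in the salvo queries an offline-trained Q-table with its measured line-of-sight rate, picks a navigation ratio greedily, and forms a proportional-navigation acceleration command. The table shape, discretization, and numerical values are placeholders rather than the paper's design.

import numpy as np

# Online use of an offline-trained Q-table (hypothetical sketch, not the authors' code).
LOS_RATE_BINS = np.linspace(-0.2, 0.2, 21)       # LOS-rate discretization assumed to match training
N_CHOICES = np.array([2.0, 3.0, 4.0, 5.0, 6.0])  # candidate navigation ratios
# In practice Q_TABLE would come from the offline training phase (e.g. np.load("q_table.npy"));
# zeros are used here only so the snippet runs stand-alone.
Q_TABLE = np.zeros((len(LOS_RATE_BINS) + 1, len(N_CHOICES)))

def guidance_command(q_dot, closing_speed, q_table=Q_TABLE):
    """Variable-coefficient PN command a = N(q_dot) * Vc * q_dot, with N chosen greedily."""
    s = int(np.digitize(q_dot, LOS_RATE_BINS))   # discretize the measured LOS rate
    N = N_CHOICES[int(q_table[s].argmax())]      # greedy action from the trained table
    return N * closing_speed * q_dot

# Each missile in the cooperative salvo computes its own command from its own seeker measurement.
for i, (q_dot, vc) in enumerate([(0.03, 850.0), (-0.05, 820.0), (0.01, 900.0)]):
    print(f"interceptor {i}: a_cmd = {guidance_command(q_dot, vc):.1f} m/s^2")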