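A Trajectory Prediction Method Considering Uncertain Behavior Patterns of Moving Targets
Cite this article: YAN Peng, GUO Jifeng, BAI Chengchao. A Trajectory Prediction Method Considering Uncertain Behavior Patterns of Moving Targets[J]. Journal of Astronautics, 2022, 43(8): 1040-1051.
Authors: YAN Peng  GUO Jifeng  BAI Chengchao
Institution: School of Astronautics, Harbin Institute of Technology, Harbin 150001, China
Funding: National Natural Science Foundation of China (61973101)
Abstract: Existing methods struggle to predict trajectories that reflect the uncertain behavior patterns of flying moving targets. To address this, a trajectory prediction method for flying moving targets based on inverse reinforcement learning is proposed, which predicts a target's trajectory by learning its behavior preferences and simulating its behavior decision process. First, a behavior decision model and a behavior preference model of the target are built on deep neural networks, and their parameters are then learned alternately by maximum entropy inverse reinforcement learning. To learn the target's uncertain behavior characteristics effectively, a probability distribution model of the target's demonstration trajectories is learned by supervised learning; this model guides the training of the behavior preference model and initializes the behavior decision model. The behavior preference model is also pre-trained to improve its training quality. Simulation results show that the proposed method can simulate the target's behavior patterns fairly accurately through the learned behavior decision model, and that the similarity between the predicted and real target trajectory distributions, measured by Kullback-Leibler (KL) divergence, can reach 0.24.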

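Keywords: trajectory prediction; flying moving targets; uncertain behavior patterns; inverse reinforcement learning; deep neural networks
Received: 2022-03-12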

A Trajectory Prediction Method Considering Uncertain Behavior Patterns of Moving Targets
YAN Peng, GUO Jifeng, BAI Chengchao. A Trajectory Prediction Method Considering Uncertain Behavior Patterns of Moving Targets[J]. Journal of Astronautics, 2022, 43(8): 1040-1051.
Authors: YAN Peng  GUO Jifeng  BAI Chengchao
Institution: School of Astronautics, Harbin Institute of Technology, Harbin 150001, China
Abstract: Existing methods have difficulty predicting trajectories of flying moving targets that are consistent with the targets' uncertain behavior patterns. To address this problem, a trajectory prediction method for flying moving targets based on inverse reinforcement learning is proposed, which predicts the moving trajectory by learning the behavior preference of the target and simulating the target's behavior decision process. First, the behavior decision model and the behavior preference model of the target are established based on deep neural networks, and the model parameters are then learned alternately by a maximum entropy inverse reinforcement learning method. To learn the uncertain behavior characteristics of the target effectively, supervised learning is used to learn a probability distribution model of the target's sample trajectories, which is then used to guide the training of the behavior preference model and to initialize the behavior decision model. In addition, the training quality of the behavior preference model is improved by pre-training. The simulation results show that the proposed method can accurately simulate the behavior patterns of the target through the learned behavior decision model, and that the similarity between the predicted and real target trajectory distributions, measured by Kullback-Leibler (KL) divergence, can reach 0.24.
Keywords: trajectory prediction; flying moving targets; uncertain behavior patterns; inverse reinforcement learning; deep neural networks
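
The abstract evaluates prediction quality by the KL divergence between the predicted and real trajectory distributions. The following minimal sketch illustrates how such a comparison could be computed. It is not the authors' implementation: the representation of each trajectory as a tuple of grid-cell indices, the direction KL(real || predicted), the smoothing constant `eps`, and the toy data are all assumptions made for illustration only.

```python
import math
from collections import Counter


def empirical_distribution(trajectories):
    """Estimate the probability of each distinct trajectory from samples."""
    counts = Counter(tuple(t) for t in trajectories)
    total = sum(counts.values())
    return {traj: n / total for traj, n in counts.items()}


def kl_divergence(p, q, eps=1e-9):
    """Return KL(p || q) over the union of both supports.

    eps is added to every outcome and both distributions are renormalized,
    so trajectories missing from one side do not cause log(0).
    """
    support = set(p) | set(q)
    p_s = {k: p.get(k, 0.0) + eps for k in support}
    q_s = {k: q.get(k, 0.0) + eps for k in support}
    zp, zq = sum(p_s.values()), sum(q_s.values())
    return sum(
        (p_s[k] / zp) * math.log((p_s[k] / zp) / (q_s[k] / zq))
        for k in support
    )


# Hypothetical toy data: each trajectory is a sequence of grid-cell indices.
real = [(0, 1, 2), (0, 1, 2), (0, 3, 2), (0, 3, 2)]
predicted = [(0, 1, 2), (0, 1, 2), (0, 1, 2), (0, 3, 2)]

p_real = empirical_distribution(real)
p_pred = empirical_distribution(predicted)
kl = kl_divergence(p_real, p_pred)
print(f"KL(real || predicted) = {kl:.4f}")
```

A smaller value means the predicted distribution is closer to the real one, and the value is zero when the two distributions coincide. KL divergence is asymmetric, so swapping the arguments generally gives a different number.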