Allocation of composite mode on-orbit service resource based on improved DQN
Cite this article: LIU Bingyan, YE Xiongbing, ZHOU Chifei, LIU Biliu. Allocation of composite mode on-orbit service resource based on improved DQN[J]. Acta Aeronautica et Astronautica Sinica, 2020, 41(5): 323630-323630.
Authors: LIU Bingyan (刘冰雁)  YE Xiongbing (叶雄兵)  ZHOU Chifei (周赤非)  LIU Biliu (刘必鎏)
Institution: 1. Academy of Military Sciences, Beijing 100091, China; 2. Unit 32032, Chinese People's Liberation Army, Beijing 100094, China
Abstract: To address the nonlinear multi-objective optimization problem of allocating resources before on-orbit servicing begins, an on-orbit resource allocation model for the composite service mode is constructed, and an on-orbit service resource allocation method is proposed that builds on convergence and stability improvements to the Deep Q-Network (DQN) method. The method handles a composite service mode that contains both "one-to-many" and "many-to-one" servicing. Provided the expected success rate is met, it gives priority to important service objects while balancing the overall benefit of the allocation against overall energy-consumption efficiency, so that the task is completed as quickly as possible, at the expected success rate, and with less resource input. Simulation experiments show that the improved DQN method can autonomously allocate spacecraft resources before task execution according to the importance of the service objects. It converges quickly, has low training error, and shows a clear comparative advantage in optimizing allocation benefit and overall energy consumption.

Keywords: on-orbit servicing  integer programming  resource allocation  deep reinforcement learning  neural network
Received: 2019-11-04
Revised: 2019-11-28
This article is indexed in Wanfang Data (万方数据) and other databases.