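Satellite attitude control method based on deep reinforcement learning
Cite this article: WANG Yue-Jiao, MA Zhong, YANG Yi-Dai, WANG Zhu-Ping, TANG Lei. Satellite attitude control method based on deep reinforcement learning[J]. Chinese Space Science and Technology, 2019, 39(4): 36.
Authors: WANG Yue-Jiao  MA Zhong  YANG Yi-Dai  WANG Zhu-Ping  TANG Lei
Institution: Xi'an Microelectronics Technology Institute, Xi'an 710065
Funding: National Natural Science Foundation of China (61702413); Technology Innovation Fund of the Ninth Academy of CASC (2016JY06)
Abstract: To address the sudden attitude changes that a satellite experiences when performing complex tasks such as discarding a payload or capturing a target, a deep reinforcement learning method is used to control the satellite attitude and return the satellite to a stable state. Specifically, the attitude dynamics environment of the vehicle is first built and the continuous control torque output is discretized; the Deep Q Network algorithm is then used to train autonomous satellite attitude control, with the attitude angular velocity tending toward stability as the reward, to obtain the optimal intelligent output over the discrete actions. Simulation tests show that the deep reinforcement learning algorithm for space satellite attitude control can stabilize the satellite attitude after a sudden random disturbance and can effectively resolve the traditional PD controller's dependence on the mass parameters of the controlled object. The proposed method controls the satellite attitude through autonomous learning, is highly intelligent and has a degree of generality, and has good application potential for the intelligent control of satellites performing complex space tasks in the future.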

Keywords: deep reinforcement learning  satellite attitude control  dynamics environment  autonomous attitude control  mass parameters
Received: 2018-11-01

Satellite attitude control method based on deep reinforcement learning
WANG Yue-Jiao, MA Zhong, YANG Yi-Dai, WANG Zhu-Ping, TANG Lei. Satellite attitude control method based on deep reinforcement learning[J]. Chinese Space Science and Technology, 2019, 39(4): 36.
Authors:WANG Yue-Jiao  MA Zhong  YANG Yi-Dai  WANG Zhu-Ping  TANG Lei
Institution: Xi'an Microelectronics Technology Institute, Xi'an 710065, China
Abstract: To address the sudden attitude changes that satellites encounter while performing complex tasks such as discarding a payload or capturing a target, a satellite attitude control method based on deep reinforcement learning is proposed to restore the satellite to a stable state. Concretely, the attitude dynamics environment of the vehicle is first established, and the continuous control torque output is discretized. The Deep Q Network algorithm is then used to train autonomous attitude control of the satellite, with stabilization of the attitude angular velocity serving as the reward, to obtain the optimal output over the discrete actions. Finally, the method is validated in simulation. The results show that the deep reinforcement learning algorithm can stabilize the satellite attitude after a sudden random disturbance, and that it effectively removes the traditional PD controller's dependence on the mass parameters of the controlled object. The proposed method controls the satellite attitude through self-learning, is highly intelligent and has a degree of generality, and shows good application potential for the future intelligent control of satellites performing complex space tasks.
Keywords:deep reinforcement learning  satellite attitude control  dynamic environment  autonomous attitude control  mass parameters  
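
The pipeline summarized in the abstract (a dynamics environment, a discretized torque output, and a reward tied to angular velocity settling) can be illustrated with a minimal sketch. It is not the authors' implementation: the inertia values, torque levels, time step, reward shape, and the names `step`, `td_target`, and `epsilon_greedy` are all assumptions chosen for illustration, and the Q-network itself is omitted. The dynamics are Euler's rigid-body equations about principal axes, integrated with a simple explicit Euler step.

```python
import itertools
import math
import random

# All numeric values below are hypothetical; none are taken from the paper.
INERTIA = (10.0, 12.0, 8.0)       # principal moments of inertia, kg*m^2
TORQUE_LEVELS = (-0.5, 0.0, 0.5)  # allowed torque per axis, N*m
DT = 0.1                          # integration step, s

# Discretized action set: every combination of per-axis levels (3^3 = 27).
ACTIONS = list(itertools.product(TORQUE_LEVELS, repeat=3))


def step(omega, action_index, inertia=INERTIA, dt=DT):
    """Advance the body angular velocity by one explicit Euler step.

    Returns the new angular velocity and a reward that grows as the
    angular rate shrinks (assumed reward: negative rate magnitude).
    """
    wx, wy, wz = omega
    tx, ty, tz = ACTIONS[action_index]
    ix, iy, iz = inertia
    # Euler's equations: I * dw/dt = torque - w x (I * w)
    ax = (tx - (iz - iy) * wy * wz) / ix
    ay = (ty - (ix - iz) * wz * wx) / iy
    az = (tz - (iy - ix) * wx * wy) / iz
    new_omega = (wx + ax * dt, wy + ay * dt, wz + az * dt)
    reward = -math.sqrt(sum(w * w for w in new_omega))
    return new_omega, reward


def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target used to train a DQN-style network."""
    return reward if done else reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])
```

In a training loop, a network would map the state to 27 Q-values, `epsilon_greedy` would choose an action index, `step` would return the next state and reward, and `td_target` would supply the regression target for the chosen action. Note that the agent only sees states and rewards, so the inertia values never appear in the policy itself, which is one way to read the abstract's point about not depending on mass parameters.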
This article is indexed in CNKI and other databases.