
Imitation Reinforcement Learning Based Attitude Controller for Fixed-Wing Aircraft
Citation: FU Yupeng, DENG Xiangyang, ZHU Ziqiang, FANG Jun, YU Yingfu, YAN Wenjun, ZHANG Limin. Imitation Reinforcement Learning Based Attitude Controller for Fixed-Wing Aircraft[J]. Journal of Naval Aeronautical Engineering Institute (海军航空工程学院学报), 2022, 37(5): 393-399.
Authors: FU Yupeng (付宇鹏), DENG Xiangyang (邓向阳), ZHU Ziqiang (朱子强), FANG Jun (方君), YU Yingfu (余应福), YAN Wenjun (闫文君), ZHANG Limin (张立民)
Institution: Naval Aviation University, Yantai, Shandong 264001, China
Abstract: An attitude controller for fixed-wing aircraft based on imitation reinforcement learning (IRL) is presented. First, an expert-experience data set is built, and the control network parameters are initialized by behavior cloning. The control network is then trained in a hybrid mode that combines reinforcement learning with supervised learning: reward shaping and supervised learning on the empirical data set guide the reinforcement learning algorithm to converge quickly, so that the controller's attitude response is optimized while remaining consistent with expert experience. The control network takes state variables such as the aircraft attitude angle errors and angular velocities as input, and outputs commands to the inner control augmentation system (CAS). Simulation results show that the IRL controller achieves fast attitude-angle response under different initial conditions and matches the empirical data.

Keywords: behavior cloning; reinforcement learning; attitude control
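
The two-stage scheme in the abstract (behavior cloning for initialization, then training on a combined reinforcement and supervised objective) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the "expert" is a hand-tuned PD law (`expert_action`), the control network is reduced to two linear weights, the aircraft is a toy single-axis double integrator, the shaped reward penalizes squared error and rate, and a simple finite-difference hill climb stands in for whatever reinforcement learning algorithm the authors actually used.

```python
import random

def expert_action(err, rate):
    # Hypothetical expert: a PD law standing in for recorded expert data.
    return 2.0 * err - 0.8 * rate

def policy(w, err, rate):
    # Linear stand-in for the control network: command from (error, rate).
    return w[0] * err + w[1] * rate

def bc_loss(w, data):
    # Supervised (behavior cloning) loss: mean squared error to expert actions.
    return sum((policy(w, e, q) - a) ** 2 for e, q, a in data) / len(data)

def bc_grad(w, data):
    # Analytic gradient of bc_loss with respect to the two weights.
    g = [0.0, 0.0]
    for e, q, a in data:
        d = 2.0 * (policy(w, e, q) - a) / len(data)
        g[0] += d * e
        g[1] += d * q
    return g

def episode_return(w, err0, steps=100, dt=0.05):
    # Assumed toy dynamics: rate' = u, err' = -rate, command clipped to [-1, 1].
    err, rate, total = err0, 0.0, 0.0
    for _ in range(steps):
        u = max(-1.0, min(1.0, policy(w, err, rate)))
        rate += u * dt
        err -= rate * dt
        # Assumed shaped reward: penalize attitude error and angular rate.
        total -= (err ** 2 + 0.1 * rate ** 2) * dt
    return total

def hybrid_objective(w, data, starts, lam):
    # Average return over initial errors minus a weighted imitation penalty.
    rl = sum(episode_return(w, s) for s in starts) / len(starts)
    return rl - lam * bc_loss(w, data)

def finite_diff_grad(f, w, eps=1e-4):
    # Central-difference gradient estimate of f at w.
    grad = []
    for i in range(len(w)):
        up, down = list(w), list(w)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def fine_tune(w, data, starts, lam=0.1, iters=30, lr=1.0):
    # Hill climb on the hybrid objective; a step is kept only if it improves.
    f = lambda v: hybrid_objective(v, data, starts, lam)
    best = f(w)
    for _ in range(iters):
        g = finite_diff_grad(f, w)
        cand = [wi + lr * gi for wi, gi in zip(w, g)]
        val = f(cand)
        if val > best:
            w, best = cand, val
        else:
            lr *= 0.5  # shrink the step when no improvement is found
    return w

# Stage 1: build the expert data set and initialize by behavior cloning.
rng = random.Random(0)
data = []
for _ in range(200):
    e, q = rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)
    data.append((e, q, expert_action(e, q)))

w_bc = [0.0, 0.0]
for _ in range(300):
    g = bc_grad(w_bc, data)
    w_bc = [wi - 1.0 * gi for wi, gi in zip(w_bc, g)]

# Stage 2: hybrid training from several initial attitude errors (radians).
starts = [-0.4, 0.2, 0.5]
w_final = fine_tune(w_bc, data, starts)
print("behavior-cloned weights:", w_bc)
print("fine-tuned weights:", w_final)
```

The weight `lam` controls the trade-off the abstract describes: a larger value keeps the fine-tuned controller closer to the expert data, while a smaller value lets the shaped reward dominate.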