Obstacle Avoidance Algorithm for Monocular UAV Based on DDPG
Authors: WEI Yao  LIU Xiaomao  ZHANG Han  XI Yonghui  CAO Shuang  LI Sijia
Affiliations: School of Astronautics, Northwestern Polytechnical University; Shanghai Aerospace Control Technology Institute; Infrared Detection Technology R&D Center, China Aerospace Science and Technology Corporation; Unmanned System Research Institute, Northwestern Polytechnical University
Funding: Shanghai Aerospace Science and Technology Innovation Fund (SAST2020-070)
Abstract: A continuous obstacle avoidance strategy is proposed for small multi-rotor UAVs equipped with a monocular camera. The method consists of two modules: depth estimation and navigation decision. The depth estimation module uses a conditional adversarial network to train on and preprocess the RGB images captured by the UAV, and the navigation decision module uses the deep deterministic policy gradient (DDPG) algorithm to achieve continuous obstacle avoidance. On this basis, the Actor network in DDPG is modified: a multi-modal network replaces the original policy network, which suppresses flight vibration and improves obstacle avoidance. Finally, tests in the AirSim simulation environment show that, after training, the proposed model enables the UAV to avoid obstacles and reach the designated target point, and that the avoidance trajectory is clearly improved compared with the original algorithm.

Keywords: deep reinforcement learning  obstacle avoidance decision  depth estimation  UAV  deep deterministic policy gradient (DDPG) algorithm
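
The abstract names DDPG with a multi-modal Actor but gives no architecture or hyperparameters, so the following is only a minimal, dependency-free sketch of the general idea, not the authors' implementation. It assumes two input modalities (a depth-feature vector and a low-dimensional state vector such as the relative goal position), fuses them by concatenation, and bounds the continuous action with tanh. The function names, parameter layout, and the values of `gamma` and `tau` are hypothetical choices made for illustration; `td_target` and `soft_update` are the standard DDPG critic target and target-network update.

```python
import math
from typing import Dict, List, Sequence


def linear(x: Sequence[float], weights: Sequence[Sequence[float]],
           bias: Sequence[float]) -> List[float]:
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def multimodal_actor(depth_feat: Sequence[float], state_vec: Sequence[float],
                     params: Dict[str, list], max_action: float = 1.0) -> List[float]:
    """Deterministic policy mu(s) with one branch per modality (assumed layout)."""
    # Each modality is encoded by its own branch before fusion.
    h_depth = [math.tanh(v) for v in
               linear(depth_feat, params["w_depth"], params["b_depth"])]
    h_state = [math.tanh(v) for v in
               linear(state_vec, params["w_state"], params["b_state"])]
    # Fusion by concatenation (an assumption; the abstract does not say how).
    fused = h_depth + h_state
    out = linear(fused, params["w_out"], params["b_out"])
    # tanh keeps every action component inside [-max_action, max_action].
    return [max_action * math.tanh(v) for v in out]


def td_target(reward: float, done: bool, next_q: float,
              gamma: float = 0.99) -> float:
    """Critic target y = r + gamma * (1 - done) * Q'(s', mu'(s'))."""
    return reward + gamma * (0.0 if done else 1.0) * next_q


def soft_update(target: Sequence[float], online: Sequence[float],
                tau: float = 0.005) -> List[float]:
    """Polyak averaging of target-network parameters toward the online ones."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target, online)]
```

In a full training loop the Actor output would be perturbed with exploration noise, stored in a replay buffer, and the Critic would regress toward `td_target`; those parts are omitted here because the abstract does not describe them.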
