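Collision Avoidance for Indoor Monocular UAV Using Cross Sensor Asynchronous Transfer Learning
Cite this article: LI Zhan, XUE Xi-di, YANG Xue-bo, SUN Wei-chao, YU Xing-hu, GAO Hui-jun. Collision Avoidance for Indoor Monocular UAV Using Cross Sensor Asynchronous Transfer Learning[J]. Journal of Astronautics (宇航学报), 2020, 41(6): 811-819. DOI: 10.3873/j.issn.1000-1328.2020.06.018
Authors: LI Zhan, XUE Xi-di, YANG Xue-bo, SUN Wei-chao, YU Xing-hu, GAO Hui-jun
Affiliations: 1. Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001; 2. Ningbo Institute of Intelligent Equipment Technology, Harbin Institute of Technology, Ningbo 315201; 3. State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001
Funding: Joint Funds of the National Natural Science Foundation of China (U1964201); Natural Science Foundation of Heilongjiang Province (LH2019F020)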
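Abstract: To address the difficulty of transferring reinforcement learning policies from simulation to the real world, and with the goal of improving pedestrian avoidance for a UAV that relies on monocular vision without depth information, a cross-sensor transfer learning method based on an asynchronous deep neural network structure is proposed. First, using only a virtual single-line lidar as the sensor in the simulation environment, a stable preliminary obstacle avoidance policy is trained with a deep reinforcement learning method based on the deep deterministic policy gradient (DDPG). Second, a monocular camera and a lidar synchronously collect visual and depth data sets in a real environment, and the two are bound frame by frame. The preliminary policy automatically labels this real-world data set, which is then used to train a monocular vision obstacle avoidance policy that needs no lidar data, achieving cross-sensor transfer learning from virtual lidar to real-world monocular vision. Finally, a YOLO v3-tiny network and a Resnet18 network are combined into an asynchronous deep neural network structure, which effectively improves obstacle avoidance performance in scenes containing pedestrians.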

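Keywords: Monocular vision; Deep reinforcement learning; Deterministic policy gradient; Cross-sensor transfer learning; Asynchronous deep neural network
Received: 2020-03-11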

Collision Avoidance for Indoor Monocular UAV Using Cross Sensor Asynchronous Transfer Learning
LI Zhan, XUE Xi-di, YANG Xue-bo, SUN Wei-chao, YU Xing-hu, GAO Hui-jun. Collision Avoidance for Indoor Monocular UAV Using Cross Sensor Asynchronous Transfer Learning[J]. Journal of Astronautics, 2020, 41(6): 811-819. DOI: 10.3873/j.issn.1000-1328.2020.06.018
Authors: LI Zhan, XUE Xi-di, YANG Xue-bo, SUN Wei-chao, YU Xing-hu, GAO Hui-jun
Affiliations: 1. Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, China; 2. Ningbo Institute of Intelligent Equipment Technology, Harbin Institute of Technology, Ningbo 315201, China; 3. State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, China
Abstract: To address the difficulty of transferring reinforcement learning policies from a simulated environment to real scenarios, and to improve the pedestrian avoidance ability of a drone that uses monocular vision without depth information, this paper proposes a cross-sensor transfer learning method based on an asynchronous deep neural network. First, a stable preliminary collision avoidance policy that uses only a virtual Lidar sensor is trained in the simulation environment with deep deterministic policy gradient (DDPG) reinforcement learning. Second, a monocular camera and a Lidar collect real-world vision and depth data sets simultaneously, and the two are bound frame by frame. The preliminary collision avoidance policy then labels this data automatically, and the labels are used to train a new monocular vision collision avoidance policy that needs no Lidar data, thus achieving cross-sensor transfer learning from the simulated Lidar to real-world monocular vision. Finally, the YOLO v3-tiny network and the Resnet18 network are combined into an asynchronous deep neural network structure, which effectively improves policy performance in pedestrian-involved scenarios.
Keywords: Monocular vision; Deep reinforcement learning; Deterministic policy gradient; Cross-sensor transfer learning; Asynchronous deep neural network
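
The automatic labelling step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `Frame`, `toy_lidar_policy`, and `auto_label` are hypothetical names, the yaw-rate action is an assumed action space (the abstract does not specify one), and the hand-written steering rule only stands in for the DDPG policy trained in simulation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]  # hypothetical placeholder for a camera frame


@dataclass
class Frame:
    """One synchronized sample: a camera image bound to its lidar scan."""
    image: Image
    scan: Sequence[float]  # single-line lidar ranges in metres, left to right


def toy_lidar_policy(scan: Sequence[float],
                     max_yaw_rate: float = 1.0,
                     safe_dist: float = 1.5) -> float:
    """Stand-in for the lidar-based policy (assumption, not the DDPG actor).

    Returns a yaw rate in [-max_yaw_rate, max_yaw_rate]; positive turns left.
    Expects a scan with at least two range readings.
    """
    mid = len(scan) // 2
    left, right = min(scan[:mid]), min(scan[mid:])
    nearest = min(left, right)
    if nearest >= safe_dist:
        return 0.0  # nothing close: keep heading
    urgency = 1.0 - nearest / safe_dist
    # Steer away from the side holding the closer obstacle.
    direction = -1.0 if left < right else 1.0
    return direction * urgency * max_yaw_rate


def auto_label(frames: Sequence[Frame],
               policy: Callable[[Sequence[float]], float]
               ) -> List[Tuple[Image, float]]:
    """Label each image with the action the lidar policy picks for its scan.

    The resulting (image, action) pairs form a supervised data set for a
    vision-only policy; the lidar scan is no longer needed afterwards.
    """
    return [(f.image, policy(f.scan)) for f in frames]


frames = [
    Frame(image=[[0.0]], scan=[3.0, 3.0, 3.0, 3.0]),   # free space
    Frame(image=[[1.0]], scan=[0.75, 3.0, 3.0, 3.0]),  # obstacle on the left
    Frame(image=[[2.0]], scan=[3.0, 3.0, 3.0, 0.0]),   # obstacle touching right
]
dataset = auto_label(frames, toy_lidar_policy)
print([action for _, action in dataset])  # [0.0, -0.5, 1.0]
```

In this sketch the labelled pairs would then feed ordinary supervised training of an image network such as the Resnet18 named in the abstract; that training loop is omitted here.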