Collision Avoidance for Indoor Monocular UAV Using Cross Sensor Asynchronous Transfer Learning

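Cite this article:	LI Zhan, XUE Xidi, YANG Xuebo, SUN Weichao, YU Xinghu, GAO Huijun. Collision Avoidance for Indoor Monocular UAV Using Cross Sensor Asynchronous Transfer Learning[J]. Journal of Astronautics, 2020, 41(6): 811-819.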

Author names:	LI Zhan, XUE Xidi, YANG Xuebo, SUN Weichao, YU Xinghu, GAO Huijun

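Affiliations:	1. Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, China; 2. Ningbo Institute of Intelligent Equipment Technology, Harbin Institute of Technology, Ningbo 315201, China; 3. State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, China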

Funding:	Joint Funds of the National Natural Science Foundation of China (U1964201); Natural Science Foundation of Heilongjiang Province (LH2019F020)

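Abstract (translated from the Chinese):	To address the difficulty of transferring reinforcement learning policies from simulation to the real world, and with the goal of improving the pedestrian avoidance ability of a UAV that relies on monocular vision without depth information, a cross-sensor transfer learning method based on an asynchronous deep neural network structure is proposed. First, using only a virtual single-line LiDAR as the sensor in a simulation environment, a stable preliminary collision avoidance policy is trained with a deep reinforcement learning method based on deep deterministic policy gradient (DDPG). Second, a monocular camera and a LiDAR synchronously collect visual and depth datasets in a real environment, and the two are bound frame by frame. The preliminary policy is then used to label the real-world dataset automatically, and the labeled data are used to train a monocular vision collision avoidance policy that needs no LiDAR data, achieving cross-sensor transfer learning from virtual LiDAR to real-world monocular vision. Finally, a YOLO v3-tiny network and a ResNet18 network are combined into an asynchronous deep neural network structure, which effectively improves collision avoidance performance in scenes with pedestrians.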
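Keywords (translated from the Chinese):	Monocular vision; Deep reinforcement learning; Deterministic policy gradient; Cross-sensor transfer learning; Asynchronous deep neural network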
Received:	2020-03-11
Collision Avoidance for Indoor Monocular UAV Using Cross Sensor Asynchronous Transfer Learning

LI Zhan, XUE Xidi, YANG Xuebo, SUN Weichao, YU Xinghu, GAO Huijun. Collision Avoidance for Indoor Monocular UAV Using Cross Sensor Asynchronous Transfer Learning[J]. Journal of Astronautics, 2020, 41(6): 811-819.

Authors:	LI Zhan, XUE Xidi, YANG Xuebo, SUN Weichao, YU Xinghu, GAO Huijun

Institution:	1. Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, China; 2. Ningbo Institute of Intelligent Equipment Technology, Harbin Institute of Technology, Ningbo 315201, China; 3. State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, Harbin 150001, China

Abstract:	To address the difficulty of transferring reinforcement learning policies from a simulated environment to real scenarios, and to improve the pedestrian avoidance ability of a drone that uses monocular vision without depth information, this paper proposes a cross-sensor transfer learning method based on an asynchronous deep neural network. First, a stable preliminary collision avoidance policy that uses only a virtual LiDAR sensor is trained in the simulation environment with deep deterministic policy gradient (DDPG) deep reinforcement learning. Second, a monocular camera and a LiDAR collect real-world vision and depth data sets simultaneously, and the two are bound frame by frame. The preliminary collision avoidance policy is then used to generate labels automatically, and these labels are used to train a new monocular vision collision avoidance policy that needs no LiDAR data, achieving cross-sensor transfer learning from the simulated LiDAR to real-world monocular vision. Finally, the YOLO v3-tiny network and the ResNet18 network are combined into an asynchronous deep neural network structure, which effectively improves policy performance in scenarios that involve pedestrians.

Keywords:	Monocular vision; Deep reinforcement learning; Deterministic policy gradient; Cross-sensor transfer learning; Asynchronous deep neural network
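
The automatic labeling step described in the abstract (a LiDAR-based policy labels camera frames that were recorded together with LiDAR scans) can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in chosen for illustration: `Frame`, `toy_lidar_policy`, `auto_label`, the (speed, yaw) action format, and the left-to-right scan ordering are assumptions, and the toy rule only takes the place of the DDPG policy trained in simulation. It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration; none of these names come from the paper.
Image = List[List[int]]        # stand-in for a monocular camera frame
Scan = Sequence[float]         # LiDAR ranges in metres, assumed ordered left to right
Action = Tuple[float, float]   # assumed action format: (forward speed, yaw command)


@dataclass
class Frame:
    """One synchronized sample: a camera image bound to the LiDAR scan taken with it."""
    image: Image
    scan: Scan


def toy_lidar_policy(scan: Scan) -> Action:
    """Toy stand-in for the preliminary LiDAR policy (assumption, not DDPG).

    Slows down as the nearest obstacle gets closer and steers toward the
    half of the scan with more free space.
    """
    n = len(scan)
    half = n // 2
    nearest = min(scan)
    left = sum(scan[:half]) / half
    right = sum(scan[n - half:]) / half
    speed = max(0.0, min(1.0, nearest / 2.0))
    yaw = max(-1.0, min(1.0, (left - right) / 2.0))
    return (speed, yaw)


def auto_label(
    frames: Sequence[Frame],
    policy: Callable[[Scan], Action],
) -> List[Tuple[Image, Action]]:
    """Label each image with the action the LiDAR policy picks for the paired scan.

    The returned (image, action) pairs contain no LiDAR data, so they can
    serve as supervised training data for a vision-only policy.
    """
    return [(frame.image, policy(frame.scan)) for frame in frames]


frames = [
    Frame(image=[[0, 0], [0, 0]], scan=[2.0, 2.0, 0.5, 0.5]),
    Frame(image=[[1, 1], [1, 1]], scan=[4.0, 4.0, 4.0, 4.0]),
]
dataset = auto_label(frames, toy_lidar_policy)
```

In this sketch the scan is consumed only while labels are generated. A vision network trained on `dataset` would see images and target actions alone, which is the sense in which the transfer crosses sensors.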
