
Design and Experiment of Complex Terrain Adaptive Robot Based on Deep Reinforcement Learning

Citation:	YANG Dun, YANG Shuai, YU Yang, WANG Qi. Design and Experiment of Complex Terrain Adaptive Robot Based on Deep Reinforcement Learning[J]. Journal of Astronautics (宇航学报), 2022, 43(9): 1176-1185.

Authors:	YANG Dun, YANG Shuai, YU Yang, WANG Qi

Affiliation:	School of Aeronautic Science and Engineering, Beihang University, Beijing 100191, China

Funding:	National Excellent Young Scientists Fund (12022212)

Abstract:	For lightweight autonomous exploration of planetary surfaces, a sea-urchin-inspired twelve-legged spherical robot is designed following a bionic approach. The robot can autonomously change its configuration to conform to complex terrain, and it can move omnidirectionally without overturning and with high fault tolerance. Using a data-driven method, a data-efficient, model-free reinforcement learning locomotion strategy is designed for the robot; it supports gait training from scratch ("zero to one") without prior knowledge and rapid deployment of the learned gaits on a physical prototype. Simulation experiments on flat and unstructured terrain verify that the trained robot can move autonomously and adapt to unstructured terrain. Comparison with commonly used baseline strategies confirms that the proposed strategy trains efficiently and is robust. Finally, a proof-of-principle prototype is developed, and physical experiments verify that gaits generated in simulation are dynamically feasible in the real physical environment.

Keywords:	Bionic robots; Reinforcement learning; Complex terrain; Autonomous movement strategies; Planetary exploration

Received:	2022-02-12
