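Video-based 3D human pose estimation
Cite this article: Yang Bin, Li Heping, Zeng Hui. Three-dimensional human pose estimation based on video[J]. Journal of Beijing University of Aeronautics and Astronautics, 2019, 45(12): 2463-2469.
Authors: Yang Bin, Li Heping, Zeng Hui
Affiliations: School of Automation, University of Science and Technology Beijing, Beijing 100083; Beijing Engineering Research Center of Industrial Spectrum Imaging, Beijing 100083; Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Funding: National Natural Science Foundation of China (61973029); Fundamental Research Funds for the Central Universities (FRF-BD-17-002A)
Abstract: Existing 3D human pose estimation methods focus on estimating the 3D pose from a single image and ignore the correlation between consecutive video frames; exploiting the temporal information in video can therefore further improve estimation accuracy. On this basis, a convolutional neural network architecture is designed that fully extracts the temporal information in video. It achieves high accuracy while consuming few computational resources, and it recovers the complete 3D human pose using only 2D joint coordinates as input. A new loss function is then proposed that uses the continuity of human pose between adjacent frames to improve the smoothness of 3D pose estimation over a video sequence, and that also addresses the drop in accuracy caused by missing inter-frame information. Tests on the public Human3.6M dataset show that the proposed method reduces the average test error by 1.2 mm compared with the current baseline 3D pose estimation algorithm, achieving high accuracy for 3D human pose estimation on video sequences.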

Keywords: 3D human pose; convolutional neural network; video sequence; loss function; smoothness
Received: 2019-07-09

Three-dimensional human pose estimation based on video
Institution: 1. School of Automation & Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China; 2. Beijing Engineering Research Center of Industrial Spectrum Imaging, Beijing 100083, China; 3. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Existing 3D human pose estimation methods estimate the 3D pose of the human body from a single frame and ignore the correlation between preceding and following frames in a video. Exploiting the temporal information in the video can therefore further improve the accuracy of 3D human pose estimation. On this basis, a convolutional neural network structure that fully extracts the temporal information in the video is designed. It combines high precision with low computational cost, and it recovers the complete 3D human pose using only the coordinates of the 2D joints as input. Furthermore, a new loss function is proposed that uses the continuity of human pose between adjacent frames to improve the smoothness of 3D pose estimation in video sequences; it also addresses the accuracy degradation caused by the lack of inter-frame information. Tests on the public Human3.6M dataset indicate that the average test error of the proposed method is 1.2 mm lower than that of the current baseline 3D pose estimation algorithm, and that the method achieves high accuracy for 3D human pose estimation on video sequences.
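
The abstract does not give the form of the loss function, so the following is only a minimal sketch of one common way to combine a per-joint position error with a temporal smoothness term. It assumes that the error is the mean Euclidean joint distance in mm (the usual metric on Human3.6M), that smoothness is measured as the mean squared displacement of each predicted joint between adjacent frames, and that the two are mixed with a hypothetical weight `smooth_weight`. The function names are illustrative and are not taken from the paper.

```python
import math

# A pose sequence is a list of frames; each frame is a list of (x, y, z)
# joint positions in millimetres. All names below are illustrative.

def mean_joint_error(pred, target):
    """Average Euclidean distance between predicted and true joints."""
    dists = [math.dist(p, t)
             for pred_frame, target_frame in zip(pred, target)
             for p, t in zip(pred_frame, target_frame)]
    return sum(dists) / len(dists)

def temporal_smoothness(pred):
    """Mean squared displacement of each joint between adjacent frames."""
    if len(pred) < 2:
        return 0.0  # a single frame has no adjacent-frame term
    sq = [math.dist(a, b) ** 2
          for prev_frame, cur_frame in zip(pred, pred[1:])
          for a, b in zip(prev_frame, cur_frame)]
    return sum(sq) / len(sq)

def total_loss(pred, target, smooth_weight=0.1):
    """Position error plus a weighted smoothness penalty (assumed form)."""
    return mean_joint_error(pred, target) + smooth_weight * temporal_smoothness(pred)

if __name__ == "__main__":
    target = [[(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]  # one joint, two frames
    pred = [[(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)]]    # joint jumps 5 mm in frame 2
    print(mean_joint_error(pred, target))
    print(temporal_smoothness(pred))
    print(total_loss(pred, target))
```

Penalizing only the predicted inter-frame displacement favors static output if weighted too heavily; a variant that compares predicted and ground-truth displacements avoids this, and which form the authors use is not stated here.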
This article is indexed in CNKI, Wanfang Data, and other databases.