Structure-aware fusion network for 3D scene understanding
Affiliation: 1. School of Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China; 2. Interdisciplinary Graduate School, Nanyang Technological University, Singapore 639798, Singapore
Abstract: In this paper, we propose a Structure-Aware Fusion Network (SAFNet) for 3D scene understanding. Since 2D images provide richer appearance detail while 3D point clouds convey more geometric information, fusing these complementary modalities can improve the discriminative ability of a model. Fusion is challenging, however, because 2D and 3D data differ fundamentally in nature and format. Existing methods first extract 2D multi-view image features, aggregate them onto sparse 3D point clouds, and achieve strong performance; yet they ignore the structural relations between pixels and points and fuse the two modalities directly, without adaptation. To address this, we propose a structural deep metric learning method on pixels and points that explores these relations and uses them to adaptively map images and point clouds into a common canonical space for prediction. Extensive experiments on the widely used ScanNetV2 and S3DIS datasets verify the performance of the proposed SAFNet.
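The pipeline the abstract describes — per-point 2D and 3D features mapped into a common canonical space under a metric-learning objective, then fused — can be sketched roughly as follows. All names, dimensions, and the specific margin-based loss are illustrative assumptions for exposition, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature sizes — SAFNet's real dimensions are not given here.
N, D2, D3, DC = 128, 64, 32, 16

pix_feat = rng.standard_normal((N, D2))  # per-point 2D (image) features
pt_feat = rng.standard_normal((N, D3))   # per-point 3D (geometric) features

# Learned linear maps into a common canonical space (random stand-ins here).
W2d = rng.standard_normal((D2, DC)) * 0.1
W3d = rng.standard_normal((D3, DC)) * 0.1

pix_emb = pix_feat @ W2d  # 2D features mapped to the common space
pt_emb = pt_feat @ W3d    # 3D features mapped to the common space

def structural_metric_loss(a, b, margin=0.5):
    """Toy metric-learning objective: pull corresponding pixel/point
    embeddings together, push mismatched (shifted) pairs at least
    `margin` apart."""
    pos = np.linalg.norm(a - b, axis=1)                     # matched pairs
    neg = np.linalg.norm(a - np.roll(b, 1, axis=0), axis=1) # shifted negatives
    return float(pos.mean() + np.maximum(0.0, margin - neg).mean())

# Fusion in the common space (simple concatenation for illustration).
fused = np.concatenate([pix_emb, pt_emb], axis=1)
loss = structural_metric_loss(pix_emb, pt_emb)
```

In a real model `W2d`/`W3d` would be trained network heads and the loss would drive them so that corresponding pixels and points land close together before fusion; the concatenation stands in for whatever adaptive fusion the network performs.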
Keywords: 3D point clouds; Data fusion; Structure-aware; 3D scene understanding; Deep metric learning
Indexed in ScienceDirect and other databases.