IEEE Access (Jan 2021)
PointMTL: Multi-Transform Learning for Effective 3D Point Cloud Representations
Abstract
Effectively learning and extracting feature representations of 3D point clouds is an important yet challenging task. Most existing works achieve reasonable performance on 3D vision tasks by appropriately modeling the relationships among points. However, these methods learn feature representations through only a single transform, so the learned features tend to overlap and thus limit the representation ability of the model. To address this issue, we propose a novel Multi-Transform Learning framework for point clouds (PointMTL), which extracts diverse features from multiple mapping transforms to obtain richer representations. Specifically, we build a module named Multi-Transform Encoder (MTE), which encodes and aggregates local features from multiple non-linear transforms. To further explore global context, we propose a module named Global Spatial Fusion (GSF) that captures global information and selectively fuses it with the local representations. Moreover, to guarantee the richness and diversity of the learned representations, we propose a Spatial Independence Criterion (SIC) strategy that enlarges the differences between the transforms and reduces information redundancy. In contrast to previous works, our approach fully exploits representations from multiple transforms and therefore offers strong expressiveness and good robustness on point cloud tasks. Experiments on three typical tasks (i.e., semantic segmentation on S3DIS and ScanNet, part segmentation on ShapeNet, and shape classification on ModelNet40) demonstrate the effectiveness of our method.
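For concreteness, the following is a minimal PyTorch sketch of the multi-transform idea summarized above; it is not the authors' implementation. The class MultiTransformEncoderSketch and the cosine-similarity redundancy_penalty are hypothetical stand-ins for the MTE module and the SIC strategy, whose actual formulations are not given in this abstract, and all dimensions are illustrative.

    # Minimal sketch of multi-transform encoding with a redundancy penalty.
    # Module names, dimensions, and the penalty are illustrative assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiTransformEncoderSketch(nn.Module):
        """Applies K parallel non-linear transforms to per-point features
        and concatenates their outputs (a stand-in for the paper's MTE)."""
        def __init__(self, in_dim=3, out_dim=64, num_transforms=4):
            super().__init__()
            self.transforms = nn.ModuleList([
                nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(),
                              nn.Linear(out_dim, out_dim))
                for _ in range(num_transforms)
            ])

        def forward(self, points):  # points: (B, N, in_dim)
            feats = [t(points) for t in self.transforms]  # K x (B, N, out_dim)
            return feats, torch.cat(feats, dim=-1)        # (B, N, K*out_dim)

    def redundancy_penalty(feats):
        """Toy stand-in for the Spatial Independence Criterion: penalize
        cosine similarity between outputs of different transforms so that
        they learn non-overlapping features."""
        loss = 0.0
        for i in range(len(feats)):
            for j in range(i + 1, len(feats)):
                loss = loss + F.cosine_similarity(
                    feats[i], feats[j], dim=-1).abs().mean()
        return loss

    # Usage: encode a batch of 2 clouds with 1024 points each.
    enc = MultiTransformEncoderSketch()
    pts = torch.randn(2, 1024, 3)
    per_transform, fused = enc(pts)
    aux_loss = redundancy_penalty(per_transform)
    print(fused.shape, aux_loss.item())  # torch.Size([2, 1024, 256]) ...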
Keywords