Algorithms (Mar 2022)
A Contrastive Learning Method for the Visual Representation of 3D Point Clouds
Abstract
At present, the unsupervised visual representation learning of the point cloud model is mainly based on generative methods, but the generative methods pay too much attention to the details of each point, thus ignoring the learning of semantic information. Therefore, this paper proposes a discriminative method for the contrastive learning of three-dimensional point cloud visual representations, which can effectively learn the visual representation of point cloud models. The self-attention point cloud capsule network is designed as the backbone network, which can effectively extract the features of point cloud data. By compressing the digital capsule layer, the class dependence of features is eliminated, and the generalization ability of the model and the ability of feature queues to store features are improved. Aiming at the equivariance of the capsule network, the Jaccard loss function is constructed, which is conducive to the network distinguishing the characteristics of positive and negative samples, thereby improving the performance of the contrastive learning. The model is pre-trained on the ShapeNetCore data set, and the pre-trained model is used for classification and segmentation tasks. The classification accuracy on the ModelNet40 data set is 0.1% higher than that of the best unsupervised method, PointCapsNet, and when only 10% of the label data is used, the classification accuracy exceeds 80%. The mIoU of part segmentation on the ShapeNet data set is 1.2% higher than the best comparison method, MulUnsupervised. The experimental results of classification and segmentation show that the proposed method has good performance in accuracy. The alignment and uniformity of features are better than the generative method of PointCapsNet, which proves that this method can learn the visual representation of the three-dimensional point cloud model more effectively.
Keywords