Physical Review Research (Mar 2020)
Unsupervised learning using topological data augmentation
Abstract
Unsupervised machine learning is a cornerstone of artificial intelligence, as it provides algorithms capable of learning tasks, such as classification of data, without explicit human assistance. We present an unsupervised deep learning protocol for finding topological indices of quantum systems. The core of the proposed scheme is a “topological data augmentation” procedure that uses seed objects to generate ensembles of topologically equivalent data. Such data, assigned dummy labels, can then be used to train a neural network classifier that sorts arbitrary objects into topological equivalence classes. Importantly, we also show how to retrieve the local quantities corresponding to the learned topological indices from the intermediate outputs of the trained network. Our protocol is explicitly illustrated on two-band insulators in one and two dimensions, characterized by a winding number and a Chern number, respectively. By also using the augmentation technique in the classification step, so that a family of topologically equivalent objects is classified instead of a single object, we can achieve accuracy arbitrarily close to 100%, even for indices outside the training regime. Beyond its applicability to topological classification, the method also provides a new perspective on data augmentation in supervised machine learning: given sufficient mathematical structure, the set of category-preserving deformations can be rigorously defined.
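The augmentation idea can be illustrated with a minimal sketch for the one-dimensional case. It is not the paper's implementation: the function names (`seed_loop`, `augment`, `winding_number`), the grid size, and the choice of deformation (a smooth, periodic random rotation built from a few low-frequency Fourier modes) are all assumptions made for illustration. A seed object is taken to be a loop of unit vectors (h_x, h_y) over the Brillouin zone, each deformed copy inherits the seed's dummy label, and a direct winding-number count confirms that the deformation preserves the class.

```python
import math
import random


def seed_loop(winding, n_points=64):
    """Seed object: unit vectors (h_x, h_y) on a k-grid winding `winding` times."""
    loop = []
    for j in range(n_points):
        theta = winding * 2.0 * math.pi * j / n_points
        loop.append((math.cos(theta), math.sin(theta)))
    return loop


def augment(loop, rng, n_modes=3, amplitude=0.5):
    """Rotate each vector by a smooth, periodic random angle delta(k).

    Assumed deformation family (not taken from the paper): a few
    low-frequency Fourier modes, small enough that neighbouring grid
    points never differ by an angle close to pi.
    """
    n = len(loop)
    modes = [(rng.uniform(-amplitude, amplitude), rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(n_modes)]
    deformed = []
    for j, (x, y) in enumerate(loop):
        k = 2.0 * math.pi * j / n
        delta = sum(a * math.sin((m + 1) * k + phi)
                    for m, (a, phi) in enumerate(modes))
        c, s = math.cos(delta), math.sin(delta)
        deformed.append((x * c - y * s, x * s + y * c))
    return deformed


def winding_number(loop):
    """Sum wrapped angle increments around the closed loop, in units of 2*pi."""
    angles = [math.atan2(y, x) for x, y in loop]
    total = 0.0
    for j in range(len(angles)):
        step = angles[(j + 1) % len(angles)] - angles[j]
        total += (step + math.pi) % (2.0 * math.pi) - math.pi  # wrap to [-pi, pi)
    return round(total / (2.0 * math.pi))


rng = random.Random(0)
# Dummy-labelled training set: each sample carries the index of its seed.
dataset = [(augment(seed_loop(w), rng), w) for w in (-1, 0, 1) for _ in range(5)]
checks = [winding_number(sample) == label for sample, label in dataset]
print(all(checks))
```

In this sketch the dummy label happens to equal the winding number only because the seeds were constructed that way; in an unsupervised setting the labels would merely distinguish seeds, and `dataset` would be passed to a classifier rather than checked against a known formula.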