Scientific Reports (Jul 2024)
Benchmarking quantum versions of the kNN algorithm with a metric based on amplitude-encoded features
Abstract
This work introduces a quantum subroutine for computing the distance between two patterns and integrates it into two quantum versions of the kNN classifier algorithm: one proposed by Schuld et al. and the other proposed by Quezada et al. The proposed subroutine is designed to be memory-efficient, requiring fewer qubits for data encoding while preserving the overall complexity of both QkNN versions. We compare the performance of the two quantum kNN algorithms under two configurations: the original Hamming distance with qubit-encoded features, and our proposed subroutine, which computes the distance using amplitude-encoded features. Results from thirteen datasets (Iris, Seeds, Raisin, Mine, Cryotherapy, Data Bank Authentication, Caesarian, Wine, Haberman, Transfusion, Immunotherapy, Balance Scale, and Glass) show that both algorithms benefit from the proposed subroutine, achieving at least a 50% reduction in the number of required qubits while maintaining similar overall performance. For Schuld's algorithm, performance improved in Cryotherapy (68.89% accuracy compared to 64.44%) and Balance Scale (85.33% F1 score compared to 78.89%), was worse in Iris (86.0% accuracy compared to 95.33%) and Raisin (77.67% accuracy compared to 81.56%), and remained similar in the remaining nine datasets. For Quezada's algorithm, performance improved in Caesarian (68.89% F1 score compared to 58.22%), Haberman (69.94% F1 score compared to 62.31%) and Immunotherapy (76.88% F1 score compared to 69.67%), was worse in Iris (82.67% accuracy compared to 95.33%), Balance Scale (77.97% F1 score compared to 69.21%) and Glass (40.04% F1 score compared to 28.79%), and remained similar in the remaining seven datasets.
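To make the qubit saving concrete, the following is a minimal classical sketch of amplitude encoding and of a distance derived from the overlap of two encoded states. It is not the subroutine proposed in the paper: the function names (`amplitude_qubits`, `amplitude_encode`, `state_distance`) are hypothetical, zero-padding to a power-of-two length is an assumption, and the distance used here is the standard identity for unit vectors, sqrt(2 - 2<a|b>), which a quantum routine would estimate rather than compute exactly.

```python
import math


def amplitude_qubits(n_features):
    """Qubits needed to hold n_features values as amplitudes (assumed zero-padded)."""
    return max(1, (n_features - 1).bit_length())


def amplitude_encode(features):
    """Return the L2-normalised, zero-padded amplitude vector for a feature list."""
    norm = math.sqrt(sum(x * x for x in features))
    if norm == 0:
        raise ValueError("the zero vector cannot be amplitude-encoded")
    dim = 1 << amplitude_qubits(len(features))
    padded = list(features) + [0.0] * (dim - len(features))
    return [x / norm for x in padded]


def state_distance(a, b):
    """Euclidean distance between two encoded states, computed from their overlap."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same number of features")
    overlap = sum(x * y for x, y in zip(amplitude_encode(a), amplitude_encode(b)))
    # Clamp at zero to absorb floating-point rounding when the states coincide.
    return math.sqrt(max(0.0, 2.0 - 2.0 * overlap))


if __name__ == "__main__":
    # Illustrative values only, not taken from any of the paper's datasets.
    print(amplitude_qubits(4))
    print(state_distance([5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]))
```

Under these assumptions the qubit count grows logarithmically with the number of features, whereas an encoding that spends one qubit per binary feature bit grows linearly. Normalisation also discards the overall magnitude, so patterns that differ only by a scale factor map to the same state.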
Keywords