Scientific Reports (Feb 2022)
Distance-based clustering using QUBO formulations
Abstract
Abstract In computer science, clustering is a technique for grouping data. Ising machines can solve distance-based clustering problems described by quadratic unconstrained binary optimization (QUBO) formulations. A typical simple method using an Ising machine makes each cluster size equal and is not suitable for clustering unevenly distributed data. We propose a new clustering method that provides better performance than the simple method, especially for unevenly distributed data. The proposed method is a hybrid algorithm including an iterative process that comprises solving a discrete optimization problem with an Ising machine and calculating parameters with a general-purpose computer. To minimize the communication overhead between the Ising machine and the general-purpose computer, we employed a low-latency Ising machine implementing the simulated bifurcation algorithm with a field-programmable gate array attached to a local server. The proposed method results in clustering 200 unevenly distributed data points with a clustering score 18% higher than that of the simple method. The discrete optimization with 2000 variables is performed 100 times per iteration, and the overhead time is reduced to approximately 20% of the total execution time. These results suggest that hybrid algorithms using Ising machines can efficiently solve practical optimization problems.