IEEE Access (Jan 2025)

Multi-Kernel Learning for Heterogeneous Data

  • Chunlan Liao,
  • Shili Peng

DOI
https://doi.org/10.1109/access.2025.3530396
Journal volume & issue
Vol. 13
pp. 45340 – 45349

Abstract

Multi-kernel learning is a machine learning approach widely used for tasks such as classification and regression. Traditional kernel methods focus mainly on numerical data, and categorical and mixed data have received comparatively little study. However, mixed data is common in practical applications, and many kinds of unstructured data can be converted into mixed data through appropriate preprocessing. In this work, we propose a new Heterogeneous Multi-Kernel Learning (HMKL) algorithm for processing mixed data containing both categorical and numerical attributes. In HMKL, categorical attributes and numerical attributes are processed separately. Different similarity measures are used to obtain different kernel matrices for the categorical attributes, which are then fused with numerical kernel matrices to improve the classification performance of multi-kernel learning. We propose a new ratio Gaussian kernel function for categorical attributes, which maintains a balance between the AND and OR operations of the matching kernel matrix. In addition, to address the curse of dimensionality caused by one-of-N encoding, we use the summation of matching kernel matrices, which simplifies the preprocessing of categorical attributes. Experiments show that the proposed HMKL algorithm handles mixed data effectively and performs well.
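
The general idea of summing matching kernels over categorical attributes and fusing the result with a numerical kernel can be illustrated with a minimal sketch. This is not the HMKL algorithm itself: the abstract does not define the ratio Gaussian kernel or the fusion rule, so the sketch assumes a plain per-attribute matching (overlap) kernel, a standard Gaussian (RBF) kernel for the numerical part, and fixed, hand-picked fusion weights. The function names, the toy data, and the parameter `gamma` are hypothetical.

```python
import numpy as np

def matching_kernel_sum(X_cat):
    """K[i, j] = number of categorical attributes on which rows i and j agree.

    Summing one matching kernel per attribute avoids one-of-N encoding.
    """
    X_cat = np.asarray(X_cat)
    # Compare every pair of rows attribute by attribute, then count matches.
    return (X_cat[:, None, :] == X_cat[None, :, :]).sum(axis=2).astype(float)

def rbf_kernel(X_num, gamma=1.0):
    """Standard Gaussian kernel on numerical attributes (assumed choice)."""
    X_num = np.asarray(X_num, dtype=float)
    sq_dist = ((X_num[:, None, :] - X_num[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

def fuse_kernels(kernels, weights):
    """Convex combination of kernel matrices (assumed fusion rule)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

# Toy mixed data: two categorical attributes and one numerical attribute.
X_cat = np.array([["red", "S"], ["red", "M"], ["blue", "M"]])
X_num = np.array([[0.0], [1.0], [3.0]])

# Scale the summed matching kernel to [0, 1] so both kernels are comparable.
K_cat = matching_kernel_sum(X_cat) / X_cat.shape[1]
K_num = rbf_kernel(X_num, gamma=0.5)
K = fuse_kernels([K_cat, K_num], weights=[0.5, 0.5])
```

The fused matrix `K` could then be passed to any kernel classifier that accepts a precomputed Gram matrix. In a multi-kernel learning setting the weights would be learned rather than fixed as they are here.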

Keywords