Scientific Reports (Dec 2022)

A fast kernel independence test for cluster-correlated data

  • Hoseung Song,
  • Hongjiao Liu,
  • Michael C. Wu

DOI
https://doi.org/10.1038/s41598-022-26278-9
Journal volume & issue
Vol. 12, no. 1
pp. 1 – 10

Abstract

Read online

Abstract Cluster-correlated data receives a lot of attention in biomedical and longitudinal studies and it is of interest to assess the generalized dependence between two multivariate variables under the cluster-correlated structure. The Hilbert–Schmidt independence criterion (HSIC) is a powerful kernel-based test statistic that captures various dependence between two random vectors and can be applied to an arbitrary non-Euclidean domain. However, the existing HSIC is not directly applicable to cluster-correlated data. Therefore, we propose a HSIC-based test of independence for cluster-correlated data. The new test statistic combines kernel information so that the dependence structure in each cluster is fully considered and exhibits good performance under high dimensions. Moreover, a rapid p value approximation makes the new test fast applicable to large datasets. Numerical studies show that the new approach performs well in both synthetic and real world data.