Scientific Reports (Dec 2022)
A fast kernel independence test for cluster-correlated data
Abstract
Cluster-correlated data have received considerable attention in biomedical and longitudinal studies, and it is of interest to assess general dependence between two multivariate variables under a cluster-correlated structure. The Hilbert–Schmidt independence criterion (HSIC) is a powerful kernel-based test statistic that captures many forms of dependence between two random vectors and can be applied to arbitrary non-Euclidean domains. However, the existing HSIC is not directly applicable to cluster-correlated data. We therefore propose an HSIC-based test of independence for cluster-correlated data. The new test statistic combines kernel information so that the dependence structure within each cluster is fully taken into account, and it performs well in high dimensions. Moreover, a fast p-value approximation makes the new test applicable to large datasets. Numerical studies show that the new approach performs well on both synthetic and real-world data.
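For readers unfamiliar with HSIC, the following is a minimal sketch of the standard (biased) empirical HSIC statistic with a permutation p-value, assuming independent and identically distributed samples. It is not the cluster-correlated statistic or the fast p-value approximation proposed in this paper; the function names, the Gaussian kernel, and the median-heuristic bandwidth are illustrative assumptions. The plain permutation step is valid only when observations are exchangeable, which is the assumption that cluster-correlated data violate.

```python
import numpy as np


def gaussian_kernel(x, bandwidth=None):
    """Gaussian kernel matrix (hypothetical helper).

    If no bandwidth is given, one is chosen from the median of the
    nonzero squared pairwise distances (a median-heuristic variant).
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    # Squared Euclidean distances, clipped to guard against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    if bandwidth is None:
        pos = d2[d2 > 0]
        bandwidth = np.sqrt(0.5 * np.median(pos)) if pos.size else 1.0
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def hsic(K, L):
    """Biased empirical HSIC: trace(K H L H) / n^2, with H the centering matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n ** 2


def hsic_permutation_test(x, y, n_perm=200, seed=0):
    """Permutation p-value for standard HSIC, assuming i.i.d. samples."""
    rng = np.random.default_rng(seed)
    K = gaussian_kernel(x)
    L = gaussian_kernel(y)
    observed = hsic(K, L)
    n = K.shape[0]
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permuting rows and columns of L together relabels the y samples.
        if hsic(K, L[np.ix_(p, p)]) >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (count + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=(100, 1))
    y = x ** 2 + 0.1 * rng.normal(size=(100, 1))  # nonlinear dependence
    stat, pval = hsic_permutation_test(x, y)
    print(f"HSIC = {stat:.4f}, permutation p-value = {pval:.4f}")
```

Each permutation in this sketch recomputes the statistic from n-by-n kernel matrices, so the cost grows quickly with sample size; this is the kind of expense that an analytic p-value approximation is meant to avoid.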