Biomolecules (Nov 2024)
Population-Level Cell Trajectory Inference Based on Gaussian Distributions
Abstract
Over the past decade, inferring developmental trajectories from single-cell data has become a major challenge in bioinformatics. RNA velocity, which adds directional dynamics, has significantly advanced the study of single-cell trajectories. However, as single-cell RNA sequencing technology evolves, it produces increasingly complex, high-dimensional, and noisy data. Existing trajectory inference methods overlook the distributional characteristics of cells and may therefore perform inadequately under such conditions. To address this, we introduce CPvGTI, a Gaussian distribution-based trajectory inference method. CPvGTI uses a Gaussian mixture model, fitted with the Expectation–Maximization algorithm, to construct new cell populations in the original data space. It then integrates RNA velocity and employs Gaussian Process Regression to infer the differentiation trajectories of these cell populations. We evaluate CPvGTI against several state-of-the-art methods on four structurally diverse simulated datasets and four real datasets. The simulation studies indicate that CPvGTI outperforms existing methods in pseudo-time prediction and structural reconstruction. Furthermore, CPvGTI identifies new branch trajectories in the human forebrain and mouse hematopoiesis datasets, which further supports its performance.
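To make the population-construction step concrete, the following is a minimal sketch of fitting a Gaussian mixture model with Expectation–Maximization and assigning each cell to its most probable component. It is an illustration under stated assumptions, not the CPvGTI implementation: the data are synthetic and one-dimensional, the number of components is fixed at two, and the names `normal_pdf` and `fit_gmm_1d` are hypothetical. The RNA velocity and Gaussian Process Regression steps are not covered.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=100):
    """Fit a k-component 1D Gaussian mixture with EM (hypothetical helper)."""
    n = len(xs)
    xs_sorted = sorted(xs)
    # Deterministic initialisation: means at evenly spaced quantiles,
    # every component starts with the overall variance and equal weight.
    means = [xs_sorted[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(xs) / n
    overall_var = sum((x - overall_mean) ** 2 for x in xs) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = max(sum(p), 1e-300)  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # keep variances strictly positive
    return weights, means, variances, resp


# Synthetic stand-in for cells along one expression axis (assumption):
# two populations centred at 0 and 6 with unit variance.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(6.0, 1.0) for _ in range(200)]

weights, means, variances, resp = fit_gmm_1d(data, k=2)

# Hard assignment: each cell joins the component with the highest responsibility.
labels = [max(range(len(r)), key=lambda j: r[j]) for r in resp]

print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

In practice, single-cell data are high-dimensional, so a multivariate mixture with full or diagonal covariance matrices and a data-driven choice of the number of components would be needed; the one-dimensional case is used here only to keep the E-step and M-step readable.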
Keywords