Visualizing single-cell data supports understanding cellular heterogeneity and dynamics. Uniform manifold approximation and projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE) reveal clustering structures but often fail to preserve underlying gene-level information. Here we introduce FeatureMAP (feature-preserving manifold approximation and projection), a framework that enhances manifold learning through pairwise tangent space embedding. By integrating UMAP with principal component analysis, FeatureMAP retains clustering structures and feature variation in a low-dimensional representation. It presents three key analytic concepts: gene contribution, gene variation trajectory, and core versus transition states. Gene contribution and gene variation trajectory are derived by estimating and projecting feature loadings or variation, whereas core and transition states are computationally defined using FeatureMAP's topological properties, including density, curvature and betweenness centrality. These concepts enable differential gene variation (DGV) analysis that highlights regulatory genes driving transitions between cell states. Demonstrated on synthetic and real single-cell RNA sequencing data from pancreatic development and T cell exhaustion, FeatureMAP improves analyses of trajectories and crucial regulatory genes.
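To make the idea of estimating feature loadings in a local tangent space more concrete, the following is a minimal sketch that assumes a plain local PCA over each cell's nearest neighbours. It is not FeatureMAP's implementation: the function names `local_tangent_loadings` and `feature_contribution`, the neighbourhood size `k`, the tangent dimension `d`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def local_tangent_loadings(X, i, k=15, d=2):
    """Estimate a d-dimensional tangent basis at cell i (illustrative only).

    Runs PCA (via SVD) on the k nearest neighbours of cell i and returns
    the gene loadings of the top d components with their singular values.
    """
    dists = np.linalg.norm(X - X[i], axis=1)
    nbrs = np.argsort(dists)[:k + 1]          # cell i itself plus k neighbours
    local = X[nbrs] - X[nbrs].mean(axis=0)    # centre the neighbourhood
    _, s, vt = np.linalg.svd(local, full_matrices=False)
    return vt[:d].T, s[:d]                    # (genes x d) loadings, (d,) values

def feature_contribution(loadings, singular_values):
    """Score each gene by its variance-weighted loading, normalised to sum to 1."""
    weighted = loadings * singular_values     # scale each component by its spread
    scores = np.linalg.norm(weighted, axis=1)
    return scores / scores.sum()

# Synthetic data (an assumption for this sketch): 200 cells, 5 genes,
# where only genes 0 and 1 vary meaningfully.
rng = np.random.default_rng(0)
n_cells, n_genes = 200, 5
X = np.zeros((n_cells, n_genes))
X[:, :2] = rng.normal(size=(n_cells, 2))
X[:, 2:] = 1e-3 * rng.normal(size=(n_cells, 3))

loadings, s = local_tangent_loadings(X, i=0, k=15, d=2)
contrib = feature_contribution(loadings, s)
print(np.round(contrib, 3))
```

In this toy setting the contribution scores should concentrate on the two genes that actually vary, which is the kind of gene-level signal a purely distance-based embedding does not expose directly.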