Self-assembling manifolds in single-cell RNA sequencing data

eLife. 2019 Sep 16;8:e48994. doi: 10.7554/eLife.48994.

Abstract

Single-cell RNA sequencing has spurred the development of computational methods that enable researchers to classify cell types, delineate developmental trajectories, and measure molecular responses to external perturbations. Many of these methods rely on detecting genes whose cell-to-cell variations arise from the biological processes of interest rather than from transcriptional or technical noise. However, for datasets in which the biologically relevant differences between cells are subtle, identifying these genes is challenging. We present the self-assembling manifold (SAM) algorithm, an iterative soft feature selection strategy to quantify gene relevance and improve dimensionality reduction. We demonstrate its advantages over other state-of-the-art methods with experimental validation in identifying novel stem cell populations of Schistosoma mansoni, a prevalent parasite that infects hundreds of millions of people. Extending our analysis to a total of 56 datasets, we show that SAM is generalizable and consistently outperforms other methods in a variety of biological and quantitative benchmarks.
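
The abstract describes SAM only as an iterative soft feature selection strategy, so the following is an illustrative sketch of that general idea and not the authors' implementation. It assumes one plausible scheme: genes are rescaled by their current weights, a k-nearest-neighbor graph is built on the rescaled data, and each gene is then reweighted by the dispersion (variance over mean) of its neighborhood-averaged expression. The function name, the parameters `k` and `n_iter`, the dispersion statistic, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def soft_feature_weights(X, k=10, n_iter=5, eps=1e-12):
    """Hypothetical iterative soft feature weighting.

    X is a (cells x genes) array of non-negative expression values.
    Returns one weight per gene, scaled so the largest weight is ~1.
    """
    n_cells, n_genes = X.shape
    w = np.ones(n_genes)  # start with every gene equally relevant
    for _ in range(n_iter):
        # Center the data and rescale each gene by its current weight.
        Xw = (X - X.mean(axis=0)) * w
        # Pairwise squared Euclidean distances between cells.
        sq = (Xw ** 2).sum(axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * (Xw @ Xw.T)
        # Indices of the k smallest distances in each row.
        nn = np.argsort(d2, axis=1)[:, :k]
        # Average the original expression over each neighborhood.
        X_avg = X[nn].mean(axis=1)
        # Genes that still vary after local averaging vary *between*
        # neighborhoods, so they receive larger weights.
        mu = X_avg.mean(axis=0)
        disp = X_avg.var(axis=0) / (mu + eps)
        w = disp / (disp.max() + eps)
    return w

# Toy data (an assumption): 200 cells, 50 genes, two groups of cells
# that differ only in the first 5 genes; the rest is Poisson noise.
rng = np.random.default_rng(0)
X = rng.poisson(5.0, size=(200, 50)).astype(float)
X[:100, :5] = rng.poisson(2.0, size=(100, 5))
X[100:, :5] = rng.poisson(10.0, size=(100, 5))

weights = soft_feature_weights(X)
```

The weighting is "soft" in the sense that no gene is discarded: uninformative genes are down-weighted rather than removed, so later iterations can still recover them if the neighborhood structure changes.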

Keywords: computational biology; feature selection; manifold reconstruction; regenerative medicine; schistosome; single-cell analysis; stem cells; systems biology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Computational Biology / methods*
  • RNA, Helminth / genetics*
  • RNA, Helminth / metabolism
  • Schistosoma mansoni / genetics
  • Schistosoma mansoni / growth & development*
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis / methods

Substances

  • RNA, Helminth

Associated data

  • GEO/GSE116920
  • GEO/GSE36552
  • GEO/GSE71585-GPL17021
  • GEO/GSE63818
  • GEO/GSE80032
  • GEO/GSE45719
  • GEO/GSE57872
  • GEO/GSE94883
  • GEO/GSE100911
  • GEO/GSE74596
  • GEO/GSE71982
  • GEO/GSE94383
  • GEO/GSE57249
  • GEO/GSE52529-GPL16791
  • GEO/GSE48968-GPL13112
  • GEO/GSE64016
  • GEO/GSE77847
  • GEO/GSE52583-GPL13112
  • GEO/GSE70245
  • GEO/GSE71485
  • GEO/GSE84465
  • GEO/GSE81547
  • GEO/GSE97519
  • GEO/GSE98556
  • GEO/GSE99235
  • GEO/GSE99933
  • GEO/GSE99989
  • GEO/GSE100471
  • GEO/GSE100597
  • GEO/GSE103334
  • GEO/GSE107632
  • GEO/GSE108020
  • GEO/GSE109796
  • GEO/GSE110496
  • GEO/GSE70630
  • GEO/GSE72857
  • GEO/GSE95430
  • GEO/GSE103840
  • GEO/GSE67835
  • GEO/GSE83139
  • GEO/GSE84133
  • GEO/GSE111764
  • GEO/GSE85241
  • SRA/SRP073808
  • SRA/SRP041736