RegSNPs-intron: a computational framework for predicting pathogenic impact of intronic single nucleotide variants

Genome Biol. 2019 Nov 28;20(1):254. doi: 10.1186/s13059-019-1847-4.

Abstract

Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm, based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron shows excellent performance in evaluating the pathogenic impact of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.
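To make the classification idea concrete, the following is a minimal sketch of a random forest (bootstrap sampling plus a random feature subset at each split) trained on synthetic data. It is not the RegSNPs-intron implementation: the three feature names, the labeling rule, the tree depth, and the number of trees are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random

# Hypothetical feature names, one per feature family named in the abstract.
FEATURES = ["splice_score_change", "conservation", "protein_disorder"]

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, rng, depth=0, max_depth=4):
    """Grow one tree; each split sees only a random subset of the features."""
    split = None
    if depth < max_depth and len(set(labels)) > 1:
        n_features = len(rows[0])
        k = max(1, round(n_features ** 0.5))
        split = best_split(rows, labels, rng.sample(range(n_features), k))
    if split is None:
        return sum(labels) / len(labels)  # leaf: fraction labeled pathogenic
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in left], [y for _, y in left], rng, depth + 1, max_depth),
            grow_tree([r for r, _ in right], [y for _, y in right], rng, depth + 1, max_depth))

def tree_predict(node, row):
    """Walk from the root to a leaf and return the leaf's pathogenic fraction."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=15, seed=0):
    """Train each tree on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def predict_proba(forest, row):
    """Average the per-tree leaf fractions into one score in [0, 1]."""
    return sum(tree_predict(tree, row) for tree in forest) / len(forest)

def make_synthetic(n, seed=1):
    """Synthetic iSNVs with an invented labeling rule (illustration only)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = [rng.random() for _ in FEATURES]
        labels.append(1 if row[0] + row[1] > 1.0 else 0)
        rows.append(row)
    return rows, labels

rows, labels = make_synthetic(120)
forest = train_forest(rows, labels)
print(predict_proba(forest, [0.95, 0.95, 0.5]))  # score for a variant with high feature values
```

In this sketch the forest score is simply the mean of the per-tree leaf fractions, so it can be used to rank variants; a real application would rely on curated pathogenic and neutral iSNVs and on held-out evaluation rather than a synthetic rule.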

Keywords: Bioinformatics; Computational biology; Disease pathogenesis; High-throughput screening assay; Intron; Prediction model; RNA splicing; Random forest; Single nucleotide polymorphism.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms
  • Alternative Splicing
  • Disease / genetics*
  • Exons
  • Gene Frequency
  • Genetic Techniques*
  • Humans
  • Introns*
  • Models, Genetic*
  • Polymorphism, Single Nucleotide*
  • Software