RIsearch: fast RNA-RNA interaction search using a simplified nearest-neighbor energy model

Bioinformatics. 2012 Nov 1;28(21):2738-46. doi: 10.1093/bioinformatics/bts519. Epub 2012 Aug 24.

Abstract

Motivation: Regulatory, non-coding RNAs often function by forming a duplex with other RNAs. It is therefore of interest to predict putative RNA-RNA duplexes in silico on a genome-wide scale. Current computational methods for predicting these interactions range from fast complementarity-based searches to those that take intramolecular binding into account. Together these methods constitute a trade-off between speed and accuracy, while leaving room for improvement within the context of genome-wide screens. A fast pre-filtering of putative duplexes would therefore be desirable.

Results: We present RIsearch, an implementation of a simplified Turner energy model for fast computation of hybridization, which significantly reduces runtime while maintaining accuracy. Its time complexity for sequences of lengths m and n is O(m·n), with a much smaller pre-factor than other tools. We show that this energy model is an accurate approximation of the full energy model for near-complementary RNA-RNA duplexes. RIsearch uses a Smith-Waterman-like algorithm with a dinucleotide scoring matrix that approximates the Turner nearest-neighbor energies. We show in benchmarks that we achieve a speed improvement of at least 2.4× compared with RNAplex, the currently fastest method for searching near-complementary regions. RIsearch shows a prediction accuracy similar to RNAplex on two datasets of known bacterial short RNA (sRNA)-messenger RNA (mRNA) and eukaryotic microRNA (miRNA)-mRNA interactions. Using RIsearch as a pre-filter in genome-wide screens reduces the number of binding site candidates reported by miRNA target prediction programs, such as TargetScanS and miRanda, by up to 70%. Likewise, substantial filtering was performed on bacterial RNA-RNA interaction data.
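
To make the idea of a Smith-Waterman-like search over dinucleotide (stacking) scores concrete, the following is a minimal sketch and not the RIsearch implementation. Everything in it is an assumption made for illustration: the names PAIR_STRENGTH, BULGE_PENALTY, stack_score and best_duplex_score are hypothetical, the numeric scores are made-up placeholders rather than Turner parameters, and only single-nucleotide bulges are modelled. Higher scores stand in for more stable duplexes (that is, more negative energies).

```python
# Toy scores only: hypothetical placeholders, NOT Turner parameters
# and NOT the scoring matrix used by RIsearch.
PAIR_STRENGTH = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}
BULGE_PENALTY = 4  # hypothetical flat cost for one unpaired nucleotide


def stack_score(prev_pair, pair):
    """Toy dinucleotide score for stacking `pair` directly after `prev_pair`."""
    return PAIR_STRENGTH[prev_pair] + PAIR_STRENGTH[pair]


def best_duplex_score(query, target):
    """Best local duplex score between two RNA strings, both given 5'->3'.

    Runs in O(m*n) time: each cell looks at a constant number of predecessors.
    """
    rev = target[::-1]  # reverse the target so antiparallel pairs lie on diagonals
    m, n = len(query), len(rev)
    neg = float("-inf")
    # S[i][j]: best score of a duplex whose last base pair is query[i-1] : rev[j-1]
    S = [[neg] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = (query[i - 1], rev[j - 1])
            if pair not in PAIR_STRENGTH:
                continue  # these two bases cannot pair
            score = 0  # local alignment: a duplex may start at any pair
            # Predecessors: direct stack, or a one-nucleotide bulge on either strand.
            for di, dj, penalty in ((1, 1, 0), (2, 1, BULGE_PENALTY), (1, 2, BULGE_PENALTY)):
                pi, pj = i - di, j - dj
                if pi < 1 or pj < 1 or S[pi][pj] == neg:
                    continue
                prev = (query[pi - 1], rev[pj - 1])
                score = max(score, S[pi][pj] + stack_score(prev, pair) - penalty)
            S[i][j] = score
            best = max(best, score)
    return best
```

Under these toy scores, best_duplex_score("GCAU", "AUGC") evaluates to 15, the sum of the three stacks in the fully complementary duplex. A real scoring matrix would replace stack_score with tabulated nearest-neighbor values.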

Availability: The source code for RIsearch is available at: http://rth.dk/resources/risearch.

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Base Pairing
  • Base Sequence
  • Binding Sites
  • Cluster Analysis
  • Computer Simulation*
  • Genes, Duplicate
  • Information Storage and Retrieval / methods*
  • MicroRNAs / chemistry
  • MicroRNAs / genetics
  • MicroRNAs / metabolism
  • Models, Molecular*
  • Position-Specific Scoring Matrices
  • RNA / chemistry
  • RNA / genetics
  • RNA / metabolism*
  • RNA, Bacterial / chemistry
  • RNA, Messenger / chemistry
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • RNA, Untranslated / genetics*
  • Sequence Alignment

Substances

  • MicroRNAs
  • RNA, Bacterial
  • RNA, Messenger
  • RNA, Untranslated
  • RNA