QGRS-Conserve: a computational method for discovering evolutionarily conserved G-quadruplex motifs

Hum Genomics. 2014 May 1;8(1):8. doi: 10.1186/1479-7364-8-8.

Abstract

Background: Nucleic acids containing guanine tracts can form quadruplex structures via non-Watson-Crick base pairing. Formation of G-quadruplexes is associated with the regulation of important biological functions such as transcription, genetic instability, DNA repair, DNA replication, epigenetic mechanisms, regulation of translation, and alternative splicing. G-quadruplexes play important roles in human diseases and are being considered as targets for a variety of therapies. Identifying functional G-quadruplexes and studying their overall distribution in genomes and transcriptomes are therefore important pursuits. Traditional computational methods map sequence motifs capable of forming G-quadruplexes but have difficulty distinguishing motifs that occur by chance from those that actually fold into G-quadruplexes.
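As an illustration of the traditional motif mapping described in the Background, the following is a minimal sketch, not the method of this paper. It assumes a commonly used folding rule (four runs of at least three guanines separated by loops of 1 to 7 nucleotides); the pattern, the function name find_g4_motifs, and the loop limits are assumptions chosen for illustration. A scan like this reports every sequence that fits the rule, which is why it cannot by itself separate chance matches from motifs that fold.

```python
import re

# Assumed folding rule (illustrative only): four G-runs of >= 3 guanines,
# separated by three loops of 1-7 nucleotides each.
# The lookahead wrapper lets candidates that overlap each other be reported.
G4_PATTERN = re.compile(
    r"(?=(G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}))"
)


def find_g4_motifs(sequence):
    """Return (start, motif) pairs for every candidate motif in the sequence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Four GGG runs separated by TTA loops, flanked by AA on each side.
    for start, motif in find_g4_motifs("AAGGGTTAGGGTTAGGGTTAGGGAA"):
        print(start, motif)
```

Because the scan is purely sequence-based, its output is a list of candidates; a conservation filter of the kind the abstract describes would be applied to that list afterwards.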

Results: We present Quadruplex forming 'G'-rich sequences (QGRS)-Conserve, a computational method that calculates motif conservation across exomes and supports filtering, giving researchers more precise ways to study G-quadruplex distribution patterns. Our method quantitatively evaluates conservation between quadruplexes found in homologous nucleotide sequences based on several structural characteristics of the motifs. QGRS-Conserve also efficiently manages overlapping G-quadruplex sequences so that the resulting datasets can be analyzed effectively.
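The abstract does not state which structural characteristics are compared or how they are combined, so the following is only a hypothetical sketch of what a quantitative comparison of two motifs from homologous sequences could look like. The chosen features (aligned start position, number of tetrads, loop lengths, total length), the equal weights, the max_shift parameter, and the names Motif and similarity are all assumptions and should not be read as the QGRS-Conserve scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class Motif:
    """Hypothetical motif record; the fields are illustrative assumptions."""
    start: int      # start position in alignment coordinates
    tetrads: int    # number of stacked G-tetrads
    loops: tuple    # lengths of the three loops

    @property
    def length(self):
        # Four G-runs of `tetrads` guanines each, plus the three loops.
        return 4 * self.tetrads + sum(self.loops)


def _ratio(a, b):
    """Similarity of two non-negative numbers in [0, 1]; 1.0 when equal."""
    if a == b:
        return 1.0
    return min(a, b) / max(a, b)


def similarity(m1, m2, max_shift=20):
    """Equal-weight average of four feature similarities (assumed scheme)."""
    position = max(0.0, 1.0 - abs(m1.start - m2.start) / max_shift)
    tetrad = _ratio(m1.tetrads, m2.tetrads)
    loop = sum(_ratio(a, b) for a, b in zip(m1.loops, m2.loops)) / 3
    length = _ratio(m1.length, m2.length)
    return 0.25 * (position + tetrad + loop + length)


if __name__ == "__main__":
    human = Motif(start=10, tetrads=3, loops=(3, 3, 3))
    mouse = Motif(start=12, tetrads=3, loops=(3, 4, 3))
    print(round(similarity(human, mouse), 3))
```

A score of this kind could then be thresholded to decide which motif pairs count as conserved; the threshold itself would be another assumption to tune.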

Conclusions: We have applied QGRS-Conserve to identify a large number of G-quadruplex motifs in the human exome that are conserved across several mammalian and non-mammalian species. We have successfully identified multiple homologs of many previously published G-quadruplexes that play post-transcriptional regulatory roles in human genes. Preliminary large-scale analysis identified many homologous G-quadruplexes in the 5'- and 3'-untranslated regions of mammalian species. As expected, a smaller set of G-quadruplex motifs was found to be conserved across larger phylogenetic distances. QGRS-Conserve provides a means to build datasets that can be filtered and categorized along a variety of biological dimensions, enabling more targeted studies of the roles that G-quadruplexes play.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing / genetics*
  • Animals
  • Base Sequence
  • Conserved Sequence / genetics
  • Evolution, Molecular*
  • G-Quadruplexes*
  • Humans
  • Nucleotide Motifs / genetics*
  • Phylogeny
  • Sequence Analysis, DNA