Computational identification of novel biochemical systems involved in oxidation, glycosylation and other complex modifications of bases in DNA

Nucleic Acids Res. 2013 Sep;41(16):7635-55. doi: 10.1093/nar/gkt573. Epub 2013 Jun 28.

Abstract

Discovery of the TET/JBP family of dioxygenases that modify bases in DNA has sparked considerable interest in novel DNA base modifications and their biological roles. Using sensitive sequence and structure analyses combined with contextual information from comparative genomics, we computationally characterize over 12 novel biochemical systems for DNA modifications. We predict previously unidentified enzymes, such as the kinetoplastid J-base generating glycosyltransferase (and its homolog GREB1), the catalytic specificity of bacteriophage TET/JBP proteins and their role in complex DNA base modifications. We also predict the enzymes involved in synthesis of hypermodified bases such as alpha-glutamylthymine and alpha-putrescinylthymine that have remained enigmatic for several decades. Moreover, the current analysis suggests that bacteriophages and certain nucleo-cytoplasmic large DNA viruses contain an unexpectedly diverse range of DNA modification systems, in addition to those using previously characterized enzymes such as Dam, Dcm, TET/JBP, pyrimidine hydroxymethylases, Mom and glycosyltransferases. These include enzymes generating modified bases such as deazaguanines related to queuine and archaeosine, pyrimidines comparable with lysidine, those derived using modified S-adenosyl methionine derivatives and those using TET/JBP-generated hydroxymethyl pyrimidines as biosynthetic starting points. We present evidence that some of these modification systems are also widely dispersed across prokaryotes and certain eukaryotes such as basidiomycetes, chlorophyte and stramenopile algae, where they could serve as novel epigenetic marks for regulation or discrimination of self from non-self DNA. Our study extends the role of the PUA-like fold domains in recognition of modified nucleic acids and predicts versions of the ASCH and EVE domains to be novel 'readers' of modified bases in DNA. These results open opportunities for the investigation of the biology of these systems and their use in biotechnology.

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adenosine Triphosphatases / genetics
  • Amino Acid Sequence
  • Bacteriophages / enzymology
  • Bacteriophages / genetics
  • Computational Biology
  • DNA / chemistry
  • DNA / metabolism*
  • Dioxygenases / classification
  • Dioxygenases / genetics
  • Endodeoxyribonucleases / genetics
  • Evolution, Molecular
  • Genome, Bacterial
  • Glycosylation
  • Glycosyltransferases / genetics
  • Molecular Sequence Data
  • Oxidation-Reduction
  • Phylogeny
  • Protein Structure, Tertiary
  • S-Adenosylmethionine / analogs & derivatives
  • Sequence Alignment
  • Thymine / metabolism

Substances

  • S-Adenosylmethionine
  • DNA
  • Dioxygenases
  • Glycosyltransferases
  • Endodeoxyribonucleases
  • terminase
  • Adenosine Triphosphatases
  • Thymine