Diverse Eukaryotic CGG-Binding Proteins Produced by Independent Domestications of hAT Transposons

Mol Biol Evol. 2021 May 4;38(5):2070-2075. doi: 10.1093/molbev/msab007.

Abstract

The human transcription factor (TF) CGGBP1 (CGG-binding protein) is conserved only in amniotes and is believed to derive from the zf-BED and Hermes transposase DNA-binding domains (DBDs) of a hAT DNA transposon. Here, we show that sequence-specific DNA-binding proteins with this bipartite domain structure have resulted from dozens of independent hAT domestications in different eukaryotic lineages. CGGBPs display a wide range of sequence specificity, usually including preferences for CGG or CGC trinucleotides, whereas some bind AT-rich motifs. The CGGBPs are almost entirely nonsyntenic, and their protein sequences, DNA-binding motifs, and patterns of presence or absence in genomes are uncharacteristic of ancestry via speciation. At least eight CGGBPs in the coelacanth Latimeria chalumnae bind distinct motifs, and the expression of the corresponding genes varies considerably across tissues, suggesting tissue-restricted function.

Keywords: coelacanth; exaptation; horizontal transfer; transcription factors; transposons.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • DNA Transposable Elements*
  • DNA-Binding Proteins / genetics*
  • DNA-Binding Proteins / metabolism
  • Fishes / genetics*
  • Fishes / metabolism
  • Humans

Substances

  • CGGBP1 protein, human
  • DNA Transposable Elements
  • DNA-Binding Proteins
