Quantitative relationship between synonymous codon usage bias and GC composition across unicellular genomes

BMC Evol Biol. 2004 Jun 28:4:19. doi: 10.1186/1471-2148-4-19.

Abstract

Background: Codon usage bias has been widely reported to correlate with GC composition. However, the quantitative relationship between codon usage bias and GC composition across species has not been reported.

Results: Using SCUO, an informatics method we developed previously from Shannon information theory and maximum entropy theory, we investigated the quantitative relationship between codon usage bias and GC composition. Regression on 70 bacterial and 16 archaeal genomes showed that in bacteria, SCUO = -2.06 * GC3 + 2.05 * (GC3)^2 + 0.65, r = 0.91, and that in archaea, SCUO = -1.79 * GC3 + 1.85 * (GC3)^2 + 0.56, r = 0.89. We developed an analytical model to quantify synonymous codon usage bias by GC compositions based on SCUO. The parameters within this model were inferred by inspecting the relationship between codon usage bias and GC composition across the same 70 bacterial and 16 archaeal genomes. We further simplified this relationship using only GC3. This simple model was supported by computational simulation.
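
As an illustration, the two regression equations above can be evaluated directly. The following is a minimal sketch, not the authors' code: the function name scuo_regression and the "bacteria"/"archaea" keys are hypothetical, GC3 is assumed to be given as a fraction between 0 and 1, and only the coefficients come from the abstract.

```python
# Coefficients (a, b, c) of SCUO = a*GC3 + b*GC3^2 + c, as quoted in the abstract.
REGRESSION_COEFFS = {
    "bacteria": (-2.06, 2.05, 0.65),
    "archaea": (-1.79, 1.85, 0.56),
}


def scuo_regression(gc3, domain="bacteria"):
    """Predict SCUO from GC3 (a fraction in [0, 1]) with the fitted quadratic.

    The name and the domain keys are illustrative assumptions.
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    a, b, c = REGRESSION_COEFFS[domain]
    return a * gc3 + b * gc3 ** 2 + c


if __name__ == "__main__":
    # Print the predicted bias for a few GC3 values in each domain.
    for domain in ("bacteria", "archaea"):
        for gc3 in (0.2, 0.5, 0.8):
            print(domain, gc3, round(scuo_regression(gc3, domain), 4))
```

Both fitted curves are upward-opening parabolas with a minimum near GC3 = 0.5, so the predicted bias is lowest when GC and AT are balanced at the third codon position and rises toward either extreme.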

Conclusions: The synonymous codon usage bias could be simply expressed as 1 + (p/2)log2(p/2) + ((1-p)/2)log2((1-p)/2), where p = GC3. The software we developed for measuring SCUO (codonO) is available at http://digbio.missouri.edu/~wanx/cu/codonO.
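
The closed-form expression in the Conclusions can be checked numerically. The sketch below is an illustration under stated assumptions, not the codonO software: the function name scuo_from_gc3 is hypothetical, p is GC3 as a fraction between 0 and 1, and the term x*log2(x) is taken as 0 at x = 0 (the usual limit convention, assumed here so that the endpoints p = 0 and p = 1 are defined).

```python
import math


def _xlog2x(x):
    """Return x * log2(x), using the limit value 0 at x = 0 (an assumption)."""
    return 0.0 if x == 0 else x * math.log2(x)


def scuo_from_gc3(p):
    """Evaluate 1 + (p/2)log2(p/2) + ((1-p)/2)log2((1-p)/2) for p = GC3.

    The function name is illustrative; p must be a fraction in [0, 1].
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    return 1.0 + _xlog2x(p / 2) + _xlog2x((1 - p) / 2)


if __name__ == "__main__":
    # Tabulate the model across the GC3 range.
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, round(scuo_from_gc3(p), 4))
```

Under this expression the bias is 0 at p = 0.5, is symmetric about that point, and reaches 0.5 at p = 0 and p = 1, which matches the U-shaped trend of the fitted regressions.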

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Composition / genetics*
  • Codon / genetics*
  • Computational Biology / methods
  • Computer Simulation / statistics & numerical data
  • DNA, Archaeal / genetics
  • DNA, Bacterial / genetics
  • GC Rich Sequence / genetics
  • Genetic Code / genetics
  • Genome, Archaeal*
  • Genome, Bacterial*
  • Models, Genetic

Substances

  • Codon
  • DNA, Archaeal
  • DNA, Bacterial