Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis

Mol Biol Evol. 1998 Jul;15(7):871-9. doi: 10.1093/oxfordjournals.molbev.a025991.

Abstract

A nonhomogeneous, nonstationary stochastic model of DNA sequence evolution allowing varying equilibrium G + C contents among lineages is devised in order to deal with sequences of unequal base compositions. A maximum-likelihood implementation of this model for phylogenetic analyses allows handling of a reasonable number of sequences. The relevance of the model and the accuracy of parameter estimates are theoretically and empirically assessed, using real or simulated data sets. Overall, a significant amount of information about past evolutionary modes can be extracted from DNA sequences, suggesting that process (rates of distinct kinds of nucleotide substitutions) and pattern (the evolutionary tree) can be simultaneously inferred. G + C contents at ancestral nodes are quite accurately estimated. The new method appears to be useful for phylogenetic reconstruction when base composition varies among compared sequences. It may also be suitable for molecular evolution studies.
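To make the idea of lineage-specific equilibrium G + C content concrete, here is a minimal sketch of one way such a process might be parameterized. The abstract does not give the substitution matrix, so the sketch assumes a Tamura (1992)-style parameterization: each branch has its own equilibrium G + C content `theta`, and all branches share a transition/transversion ratio `kappa`. The function names, the unnormalized rate scale, and the example values are all illustrative assumptions, not the authors' implementation.

```python
# Sketch only: an assumed Tamura-1992-style rate matrix with branch-specific
# equilibrium G+C content. Rates are NOT normalized to substitutions per site.
BASES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def equilibrium_freqs(theta):
    """Base frequencies (A, C, G, T) implied by equilibrium G+C content theta."""
    at, gc = (1.0 - theta) / 2.0, theta / 2.0
    return [at, gc, gc, at]


def rate_matrix(theta, kappa):
    """Instantaneous rate matrix Q: rate i->j is pi_j, times kappa for transitions."""
    pi = equilibrium_freqs(theta)
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                q[i][j] = pi[j] * (kappa if (x, y) in TRANSITIONS else 1.0)
        q[i][i] = -sum(q[i])  # diagonal is still 0 here, so rows sum to zero
    return q


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transition_probs(q, t, squarings=10, terms=12):
    """P(t) = exp(Q t), via a truncated Taylor series with scaling and squaring."""
    scale = t / (2 ** squarings)
    step = [[q[i][j] * scale for j in range(4)] for i in range(4)]
    p = [[float(i == j) for j in range(4)] for i in range(4)]
    term = [row[:] for row in p]
    for n in range(1, terms + 1):
        term = [[v / n for v in row] for row in matmul(term, step)]
        p = [[p[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    for _ in range(squarings):
        p = matmul(p, p)
    return p


def evolve(freqs, theta, kappa, t):
    """Expected base frequencies after time t on a branch with its own theta."""
    p = transition_probs(rate_matrix(theta, kappa), t)
    return [sum(freqs[i] * p[i][j] for i in range(4)) for j in range(4)]


def gc_content(freqs):
    """G+C content of a frequency vector ordered A, C, G, T."""
    return freqs[1] + freqs[2]


if __name__ == "__main__":
    root = equilibrium_freqs(0.5)  # hypothetical ancestral composition
    # Two daughter lineages drifting toward different equilibria.
    for theta in (0.3, 0.7):
        child = evolve(root, theta, kappa=2.0, t=0.5)
        print(f"branch theta={theta}: expected G+C = {gc_content(child):.3f}")
```

Because each branch applies its own `theta`, base composition is not stationary over the tree: two lineages that share an ancestor can end up with different G + C contents, which is the situation the abstract says the model is designed to handle. A likelihood implementation would combine such per-branch transition matrices in a pruning-style calculation over the tree, which this sketch does not attempt.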

MeSH terms

  • Algorithms*
  • Bacteria / genetics
  • Base Composition
  • Base Sequence
  • Computer Simulation
  • DNA / chemistry
  • DNA / genetics*
  • Evolution, Molecular*
  • Likelihood Functions
  • Models, Genetic*
  • Phylogeny*

Substances

  • DNA