Phylogenetic incongruence in E. coli O104: understanding the evolutionary relationships of emerging pathogens in the face of homologous recombination

PLoS One. 2012;7(4):e33971. doi: 10.1371/journal.pone.0033971. Epub 2012 Apr 6.

Abstract

Escherichia coli O104:H4 was identified as an emerging pathogen during the spring and summer of 2011 and was responsible for a widespread outbreak that killed 50 people and sickened more than 4075. Traditional phenotypic and genotypic assays, such as serotyping, pulsed-field gel electrophoresis (PFGE), and multilocus sequence typing (MLST), permit identification and classification of bacterial pathogens, but cannot accurately resolve relationships among genotypically similar but pathotypically different isolates. To understand the evolutionary origins of E. coli O104:H4, we sequenced two strains isolated in Ontario, Canada. One (ON2011) was epidemiologically linked to the 2011 outbreak; the second (ON2010), an unrelated isolate, was obtained in 2010. MLST analysis indicated that both isolates are of the same sequence type (ST678), but whole-genome sequencing revealed differences in chromosomal and plasmid content. Through comprehensive phylogenetic analysis of five O104:H4 ST678 genomes, we identified 167 genes in three gene clusters that have undergone homologous recombination with distantly related E. coli strains. These recombination events have resulted in unexpectedly high sequence diversity within the same sequence type. Failure to recognize or adjust for homologous recombination can result in phylogenetic incongruence. Understanding the extent of homologous recombination among different strains of the same sequence type may explain the pathotypic differences between the ON2010 and ON2011 strains and help shed light on the emergence of this pathogen.
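The abstract does not describe how recombinant regions were detected, so the following is only an illustrative sketch of one common heuristic, not the authors' method: count mismatches between two aligned sequences in fixed windows and flag windows whose SNP density is unusually high, since a region imported from a distantly related strain tends to differ far more than the clonal background. The function names, window size, threshold, and toy sequences are all hypothetical.

```python
from typing import List, Tuple


def snp_density_windows(seq_a: str, seq_b: str, window: int) -> List[Tuple[int, int]]:
    """Count mismatches per non-overlapping window of two aligned sequences.

    Returns a list of (window_start, mismatch_count) pairs.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = []
    for start in range(0, len(seq_a), window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        mismatches = sum(1 for x, y in zip(a, b) if x != y)
        counts.append((start, mismatches))
    return counts


def flag_candidate_regions(counts: List[Tuple[int, int]], threshold: int) -> List[int]:
    """Return start positions of windows with at least `threshold` mismatches."""
    return [start for start, n in counts if n >= threshold]


# Toy data (made up for illustration): the middle 10 bases of `qry`
# stand in for a segment acquired from a divergent donor.
ref = "ACGTACGTAC" * 3
qry = ref[:10] + "TTTTACGAAC" + ref[20:]

counts = snp_density_windows(ref, qry, window=10)
candidates = flag_candidate_regions(counts, threshold=3)
print(counts)      # per-window mismatch counts
print(candidates)  # window starts flagged as candidate recombinant regions
```

Windows flagged this way would typically be excluded or modeled separately before tree building, because their elevated divergence can otherwise pull closely related isolates apart in the resulting phylogeny.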

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Biological Evolution
  • Chromosomes, Bacterial / genetics
  • Disease Outbreaks / prevention & control*
  • Electrophoresis, Gel, Pulsed-Field
  • Escherichia coli / classification
  • Escherichia coli / genetics*
  • Escherichia coli / pathogenicity
  • Escherichia coli Infections / epidemiology*
  • Escherichia coli Infections / microbiology
  • Genome, Bacterial*
  • Homologous Recombination / genetics*
  • Humans
  • Molecular Sequence Data
  • Multigene Family
  • Multilocus Sequence Typing
  • Ontario / epidemiology
  • Phylogeny
  • Plasmids
  • Sequence Analysis, DNA