Nanopore sequencing enables near-complete de novo assembly of Saccharomyces cerevisiae reference strain CEN.PK113-7D

FEMS Yeast Res. 2017 Nov 1;17(7):fox074. doi: 10.1093/femsyr/fox074.

Abstract

The haploid Saccharomyces cerevisiae strain CEN.PK113-7D is a popular model system for metabolic engineering and systems biology research. Current genome assemblies are built from short-read sequencing data scaffolded by homology to strain S288C. However, these assemblies contain large sequence gaps, particularly in subtelomeric regions, and the assumption of perfect homology to S288C for scaffolding introduces bias. In this study, we obtained a near-complete genome assembly of CEN.PK113-7D using only Oxford Nanopore Technologies' MinION sequencing platform. Fifteen of the 16 chromosomes, the mitochondrial genome and the 2-μm plasmid are assembled in single contigs, and all but one chromosome starts or ends in a telomere repeat. Compared with the previous assembly of CEN.PK113-7D, this improved genome assembly contains 770 kbp of added sequence carrying 248 gene annotations. Many of these genes encode functions that determine fitness in specific growth conditions and are therefore highly relevant for various industrial applications. Furthermore, we discovered a translocation between chromosomes III and VIII that caused misidentification of a MAL locus in the previous CEN.PK113-7D assembly. This study demonstrates the power of long-read sequencing by providing a high-quality reference assembly and annotation of CEN.PK113-7D, and it cautions against assuming genome stability in microorganisms.

Keywords: Saccharomyces cerevisiae; genome assembly; long-read sequencing; nanopore sequencing; yeast.

MeSH terms

  • Chromosomes, Fungal
  • Computational Biology / methods
  • Genetic Heterogeneity
  • Genome, Fungal*
  • Genomics* / methods
  • High-Throughput Nucleotide Sequencing*
  • Nanopores*
  • Saccharomyces cerevisiae / genetics*
  • Sequence Analysis, DNA*
  • Translocation, Genetic