Thirty complete Streptomyces genome sequences for mining novel secondary metabolite biosynthetic gene clusters

Sci Data. 2020 Feb 13;7(1):55. doi: 10.1038/s41597-020-0395-9.

Abstract

Streptomyces are Gram-positive bacteria of significant industrial importance due to their ability to produce a wide range of antibiotics and other bioactive secondary metabolites. Recent advances in genome mining have revealed that Streptomyces genomes harbor a large number of unexplored, silent secondary metabolite biosynthetic gene clusters (smBGCs), indicating that they remain an invaluable source for new drug discovery. Here, we present high-quality genome sequences of 22 Streptomyces species and eight different Streptomyces venezuelae strains, assembled by a hybrid strategy that exploits both long-read and short-read genome sequencing methods. The assembled genomes have more than 97.4% gene space completeness and total lengths ranging from 6.7 to 10.1 Mbp. On average, annotation identified 7,000 protein-coding genes, 20 rRNAs, and 68 tRNAs per genome. In silico prediction of smBGCs identified a total of 922 clusters across the 30 genomes, including many clusters whose products are unknown. We anticipate that the availability of these genomes will accelerate the discovery of novel secondary metabolites from Streptomyces and help elucidate complex smBGC regulation.
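
As a rough sanity check on the figures quoted above, the following minimal sketch derives the genome count and the mean number of predicted smBGCs per genome. The variable names are illustrative only, and the per-genome mean is simple arithmetic on the abstract's totals, not a value reported by the authors.

```python
# Figures taken directly from the abstract.
n_species_genomes = 22      # distinct Streptomyces species
n_venezuelae_strains = 8    # Streptomyces venezuelae strains
total_smbgcs = 922          # smBGCs predicted in silico across all genomes

# Derived values (not reported in the abstract; plain arithmetic).
n_genomes = n_species_genomes + n_venezuelae_strains
mean_smbgcs_per_genome = total_smbgcs / n_genomes

print(f"genomes: {n_genomes}")
print(f"mean smBGCs per genome: {mean_smbgcs_per_genome:.1f}")
```

The genome count matches the thirty genomes named in the title, and the mean works out to roughly 31 predicted clusters per genome.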

Publication types

  • Dataset
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Drug Discovery
  • Genes, Bacterial
  • Genome, Bacterial*
  • Multigene Family*
  • Secondary Metabolism*
  • Streptomyces / genetics*
  • Streptomyces / metabolism